
Question and Answer

Connor McDonald

Thanks for the question, Rich.

Asked: March 24, 2002 - 11:28 pm UTC

Last updated: January 15, 2024 - 5:14 am UTC

Version: 8.1.7

Viewed 100K+ times!

You Asked

Tom,

Thanks for providing this forum for answering questions. I'm trying to get the hang of analytic functions and I'm having a problem with the following code. I have to comment out "where dr <= 3)" to get it to work. "dr" shows up as the column heading, but when I add my conditional statement I get "invalid column name". Help!

Ps. I have your book and I hope you sign on the dotted line for the next one.

select *
  FROM (select candidate_id,
               deptid,
               actual_start_date,
               annual_sal_amt,
               dense_rank()
                 over (partition by deptid
                       order by annual_sal_amt desc) dr
          from obapp1.offer
         where actual_start_date >= '01-JAN-2002'
           and annual_sal_amt > 0)
-- where dr <= 3)
 order by deptid,
          annual_sal_amt desc


and Tom said...

scott@ORA817DEV.US.ORACLE.COM> select *
  FROM (select empno,
               deptno,
               hiredate,
               sal,
               dense_rank()
                 over (partition by deptno
                       order by sal desc) dr
          from emp
         where hiredate >= to_date( '01-JAN-1980', 'dd-mon-yyyy' )
           and sal > 0)
 where dr <= 3
 order by deptno, sal desc
/

EMPNO DEPTNO HIREDATE SAL DR
---------- ---------- --------- ---------- ----------
7839 10 17-NOV-81 5000 1
7782 10 09-JUN-81 2450 2
7934 10 23-JAN-82 1300 3
7788 20 09-DEC-82 3000 1
7902 20 03-DEC-81 3000 1
7566 20 02-APR-81 2975 2
7876 20 12-JAN-83 1100 3
7698 30 01-MAY-81 2850 1
7499 30 20-FEB-81 1600 2
7844 30 08-SEP-81 1500 3

10 rows selected.

You have to apply the constraint against the analytic function (what you call DR) outside of the inline view. The analytic functions have to be evaluated AND THEN you can "where" on them. Simply put the where dr <= 3 where I have it.


Comments

Analytic Functions

Rich Sekunda, March 25, 2002 - 2:07 pm UTC

Thanks for pointing that out. These new (for me and my application) functions are full of possibilities but will take a little getting used to.

Again, thanks for your help!

grouping analytic functions

A reader, September 11, 2002 - 12:19 pm UTC

hi tom

I have this query

select deptno,job, sum(sal), sum(sal) over ()
from emp
group by deptno, job, sal

it works, but I don't understand why. Shouldn't it be

select deptno,job, sum(sal), sum(sal) over ()
from emp
group by deptno, job, sum(sal) over ()


???

Tom Kyte
September 11, 2002 - 12:32 pm UTC

because analytics are done last. You probably mean to do:

select a.*, sum(sal) over ()
  from (
        select deptno, job, sum(sal) sal
          from emp
         group by deptno, job
       ) a
scott@ORA920.US.ORACLE.COM> /

DEPTNO JOB SAL SUM(SAL)OVER()
------ --------- ---------- --------------
10 CLERK 1300 29025
10 MANAGER 2450 29025
10 PRESIDENT 5000 29025
20 ANALYST 6000 29025
20 CLERK 1900 29025
20 MANAGER 2975 29025
30 CLERK 950 29025
30 MANAGER 2850 29025
30 SALESMAN 5600 29025

9 rows selected.


I think

what is the difference between

A reader, September 11, 2002 - 3:23 pm UTC

select a.*, sum(sal) over ()
from (
select deptno, sum(sal) sal
from emp
where deptno in (10, 20)
group by deptno, sal) a

and

select deptno, sum(sal), sum(sal) over ()
from emp
where deptno in (10, 20)
group by deptno, sal

?

Tom Kyte
September 11, 2002 - 3:43 pm UTC

besides that they give different results in general?


ops$tkyte@ORA920.US.ORACLE.COM> select * from emp
  2  /

DEPTNO        SAL
------ ----------
    10        100
    10        100
    10        100
    10        100
    10        100
    20        200
    20        200
    20        200
    20        200

9 rows selected.

ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> select a.*, sum(sal) over ()
  2  from (
  3  select deptno, sum(sal) sal
  4  from emp
  5  where deptno in (10, 20)
  6  group by deptno, sal) a
  7  /

DEPTNO        SAL SUM(SAL)OVER()
------ ---------- --------------
    10        500           1300
    20        800           1300

ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> select deptno, sum(sal), sum(sal) over ()
  2  from emp
  3  where deptno in (10, 20)
  4  group by deptno, sal
  5  /

DEPTNO   SUM(SAL) SUM(SAL)OVER()
------ ---------- --------------
    10        500            300
    20        800            300

ops$tkyte@ORA920.US.ORACLE.COM>


The query:

select deptno, sum(sal), sum(sal) over ()
from emp
where deptno in (10, 20)
group by deptno, sal

gets the result set:

select deptno, sum(sal), SAL
  from emp
 where deptno in ( 10, 20 )
 group by DEPTNO, SAL

and then applies the analytic function (they are dead last) to the SAL column -- which, since you grouped by it, rolls all of the same values into one row.  Doing that -- you get the sum of the DISTINCT sal's


I got the sum of the summed sals.
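
For completeness: nesting the aggregate inside the analytic gives the sum of the summed sals in a single query. A minimal sketch against the same demo table as above:

select deptno, sum(sal),
       -- the analytic runs over the grouped result rows, dead last
       sum( sum(sal) ) over () total_of_sums
  from emp
 where deptno in (10, 20)
 group by deptno;

which returns 1300 (not 300) in the total column.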


 

conditional analytic functions

raju, October 30, 2002 - 11:26 am UTC

can the analytic function be evaluated based on a condition for that function only?

we have surveys; i need to display the survey status (live or not)


SQL> 
SQL>    SELECT
  2     sur.ppl_survey_question_start_date  surve_start_date,
  3     sur.ppl_survey_question_desc survey_desc,
  4     decode(SUBSTR(dr,9),ROWID,'Live',NULL) survey_status
  5     FROM
  6     (
  7     SELECT
  8     ppl_survey_question_start_date,
  9     ppl_survey_question_desc ,
 10     MAX(to_char(ppl_survey_question_start_date,'YYYYMMDD') || ROWID) over(PARTITION BY ppl_division_id) dr
 11     FROM
 12     ppl_survey_question_ref
 13     ) sur
 14  /

SURVE_START_DATE SURVEY_DESC                                                                      SURVEY_STATUS
---------------- -------------------------------------------------------------------------------- -------------
10/25/2002       What do you think?                                                               
10/26/2002       What do you think?                                                               
10/27/2002       What do you think?                                                                               
11/1/2002        How many questions are in this survey?                                           Live


i need to display the survey status as live only when the survey start date <= sysdate. surveys whose start dates are greater than sysdate
shouldn't be live

in the above query, the survey with start date 10/27/2002 should be live

i wrote the following query:
SQL>    SELECT
  2     sur.ppl_survey_question_start_date  surve_start_date,
  3     sur.ppl_survey_question_desc survey_desc,
  4     decode(SUBSTR(dr,9),ROWID,'Live',NULL) survey_status
  5     FROM
  6     (
  7     SELECT
  8     ppl_survey_question_start_date,
  9     ppl_survey_question_desc ,
 10     MAX(to_char(decode(sign(ppl_survey_question_start_date-trunc(SYSDATE)),1,to_date('01010001','ddmmyyyy'),ppl_survey_question_start_date),'YYYYMMDD') || ROWID) over(PARTITION BY ppl_division_id) dr
 11     FROM
 12     ppl_survey_question_ref
 13     ) sur
 14  /

SURVE_START_DATE SURVEY_DESC                                                                      SURVEY_STATUS
---------------- -------------------------------------------------------------------------------- -------------
10/25/2002       What do you think?                                                               
10/26/2002       What do you think?                                                               
10/27/2002       What do you think?                                                               Live
11/1/2002        How many questions are in this survey? 


i want to know, is there a better way to do this?

and also, would dense_rank be better than the max function?

Tom Kyte
October 31, 2002 - 8:13 pm UTC

I'm not sure why you are using analytics here at all???


select ppl_survey_question_start_date, ppl_survey_question_desc,
case when ppl_survey_question_start_date <= sysdate then 'LIVE'
else 'NOT LIVE'
end status
from ppl_survey_question_ref
/



conditional analytic functions

raju, October 31, 2002 - 9:22 pm UTC

tom,

i don't want all the surveys to be live; i want a survey to be live when its start_date is the
max start date that is less than today for its division

we can only have one live survey at a time for a division

for example

if today is : 10/31/2002

survey id  start date  status  Division
---------  ----------  ------  --------
        1  10/27/2002  Live    abc
        2  11/2/2002           abc
        3  10/26/2002  Live    cde
        4  12/1/2002           abc
        5  10/24/2002          cde
        6  11/21/2002          cde
        7  10/26/2002          abc

thanks



Tom Kyte
October 31, 2002 - 9:41 pm UTC

then your selecting of

max( start_date || rowid ) over ( partition by division) max_start_rowid



was good and you just need to:


select ...
decode( start_date || rowid, max_start_rowid, 'LIVE', null ) status
from ( .....


is appropriate
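
spelled out, a hedged sketch of the whole query (the table name SURVEYS and columns division, start_date are assumptions based on the example above):

select division, start_date,
       -- future-dated rows yield NULL inside the MAX, so they can never be
       -- the per-division maximum; only the latest already-started survey
       -- in each division gets marked 'Live'
       case when to_char(start_date,'YYYYMMDD') || rowid =
                 max( case when start_date <= trunc(sysdate)
                           then to_char(start_date,'YYYYMMDD') || rowid
                      end ) over (partition by division)
            then 'Live'
       end status
  from surveys;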

unable to allocate memory error with analytic function

raju, November 18, 2002 - 3:31 pm UTC

the following ref cursor with an analytic function gives an error:

ORA-04031: unable to allocate 4200 bytes of shared memory ("shared pool","SELECT ppl_content_typ_c ...","library cache","kkslpkp - literal info.")
ORA-06512: at "PCSFB.PPL_HOME_PKG", line 78 ORA-06512: at line 1


OPEN o_content_details FOR
SELECT
ppl_content_typ_c content_type,
ppl_content_hdng content_heading,
ppl_content_t content_desc,
ppl_content_url content_url,
ppl_video_url video_url,
ppl_audio_url audio_url
FROM
(
SELECT
a.ppl_content_typ_c,
a.ppl_content_hdng,
a.ppl_content_t,
a.ppl_content_url,
a.ppl_video_url,
a.ppl_audio_url,
dense_rank() over (PARTITION BY a.ppl_content_typ_c ORDER BY decode(ppl_divsn_id_c ,'ALL',1,0) ASC,a.ppl_content_eff_d DESC NULLS LAST) dr
FROM ppl_tcontent a WHERE a.ppl_content_eff_d <= v_curr_date AND a.ppl_divsn_id_c IN ('ALL',v_divsn_id)) WHERE dr = 1;



i don't understand why it is giving that error, as this is static SQL.
can you guide me on where i can use bind variables to avoid this error?

the purpose of the query is: we have contents for each division; we need to pick the content that has the max effective date which is less than sysdate



Tom Kyte
November 18, 2002 - 8:52 pm UTC

you've trashed the shared pool -- it is full. flush it or restart (so it's "clean") and this query won't do that. It's in the parse phase -- trying to allocate library cache memory and you've plain run out.


Most likely you aren't using bind variables elsewhere in your system, or your shared pool really is too small (that's very rare; the bind variable problem, for whatever reason, unfortunately is not)

What are the side-effects of Flushing the shared Pool ?

pasko, November 19, 2002 - 2:47 am UTC

Hi Tom,
Thanks for a Great Website + Great Answers!

Besides the fact that flushing the shared pool would remove all cached pl/sql and cursors, what are the other side-effects of flushing the shared pool?


Thanks and best regards.




Tom Kyte
November 19, 2002 - 7:25 am UTC

that is the side effect.

if you find you NEED to flush the shared pool on a recurring basis - you are killing performance, limiting scalability and need to find the app that isn't using bind variables and fix it.
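
for reference, the flush itself is a single command (it requires the ALTER SYSTEM privilege):

alter system flush shared_pool;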

unable to allocate memory error with analytic function

raju, November 19, 2002 - 10:06 am UTC

Tom,

thanks for the response. Even after we flush the shared pool, after some time we keep getting this error.

earlier (for around 1 month) it was working well; all of a sudden we started getting this error.

this is the rest of the stored proc code,

can you please tell me, am i not using bind variables anywhere?

BEGIN
--
SELECT
ppl_survey_question_id,
ppl_survey_question_desc
INTO
v_survey_id_c,
v_survey_t
FROM
(
SELECT
ppl_survey_question_id,
ppl_survey_question_desc,
ppl_division_id,
ppl_survey_question_start_date
FROM
ppl_survey_question_ref
WHERE ppl_division_id IN ('ALL',v_divsn_id)
AND ppl_survey_question_start_date <= TRUNC(SYSDATE)
ORDER BY decode(ppl_division_id,'ALL',1,0) ASC, ppl_survey_question_start_date DESC NULLS LAST
)
WHERE
/*NOT EXISTS (SELECT 1 FROM ppl_tem_survey_choice emsr
WHERE emsr.ppl_survey_id_c = srv.ppl_survey_id_c
AND emsr.sid_em_id_c = i_emp_id)
AND*/
ROWNUM = 1;
BEGIN
SELECT
'Y'
INTO
v_survey_answered_ind
FROM
ppl_employee_survey_answer res
WHERE res.sid_employee_id = i_emp_id
AND res.ppl_survey_question_id = v_survey_id_c
AND rownum = 1;
EXCEPTION
WHEN no_data_found THEN
v_survey_answered_ind := 'N';
END;
--
OPEN o_survey FOR
SELECT
v_survey_id_c survey_id,
v_survey_t survey_desc,
v_survey_answered_ind question_answered
FROM dual;
--
OPEN o_survey_choice FOR
SELECT
chc.ppl_survey_answer_id choice_seq,
chc.ppl_survey_answer_desc choice_desc
FROM
ppl_survey_answer_ref chc
WHERE chc.ppl_survey_question_id = v_survey_id_c;
--
EXCEPTION
WHEN no_data_found THEN
OPEN o_survey FOR
SELECT 1 FROM dual WHERE 1 = 0;
OPEN o_survey_choice FOR
SELECT 1 FROM dual WHERE 1 = 0;
END;


thanks


Tom Kyte
November 19, 2002 - 5:22 pm UTC

see
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1163635055580 <code>

to see if you are using binds.

STATIC sql in plsql is always 100% bind friendly.
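
as an illustration, a minimal sketch of the difference (EMP is used purely for illustration): static SQL binds PL/SQL variables automatically, while string-concatenated dynamic SQL does not:

declare
    v_id  number := 7788;
    v_cnt number;
begin
    -- static SQL: v_id is automatically a bind variable
    select count(*) into v_cnt from emp where empno = v_id;

    -- dynamic SQL with a concatenated literal: a new, unshareable
    -- cursor for every value -- this is what floods the shared pool
    execute immediate
        'select count(*) from emp where empno = ' || v_id into v_cnt;

    -- dynamic SQL with an explicit bind: one shareable cursor
    execute immediate
        'select count(*) from emp where empno = :x' into v_cnt using v_id;
end;
/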

I don't get the same results as in your book, p 546

Jon, February 11, 2003 - 10:32 am UTC

At first I thought I was typing it in wrong, but I downloaded the .sql file and did a cut and paste.  Why am I not getting my running totals as shown in the book?

SQL> break on deptno skip 1
SQL> select ename, deptno, sal,
  2         sum(sal) over
  3                (order by deptno, ename) running_total,
  4         sum(sal) over
  5                (partition by deptno
  6                 order by ename) department_total,
  7         row_number() over
  8                (partition by deptno
  9                 order by ename  ) seq
 10  from emp
 11  order by deptno, ename
 12  /

ENAME                   DEPTNO       SAL RUNNING_TOTAL DEPARTMENT_TOTAL       SEQ
-------------------- --------- --------- ------------- ---------------- ---------
Smith                        1        70        106000           106000         1
Smith                                 90        106000           106000         2
Smith                                110        106000           106000         3
Smith                                130        106000           106000         4
Smith                                150        106000           106000         5
Smith                                170        106000           106000         6
Smith                                190        106000           106000         7
Smith                                210        106000           106000         8
Smith                                230        106000           106000         9
Smith                                250        106000           106000        10  

Tom Kyte
February 11, 2003 - 4:50 pm UTC

and so what does select * from EMP look like to you?


looks like you have a whole lotta SMITHs in there...

Answer to your question

Jon, February 12, 2003 - 8:11 am UTC

Yes, I just generated some test data using something like

begin
for i in 1..10 loop
for j in 1..100 loop
  insert into emp values ('Smith',i, j*70);
end loop;
end loop;
end;
/

I never committed the data.  My purpose was to get timing statistics in our environment similar to what you reflected in your book.  But when I got the questionable results, I wondered why.

Using the above,
SQL> select * from emp where rownum < 6;

ENAME                   DEPTNO       SAL
-------------------- --------- ---------
Smith                        1        70
Smith                                140
Smith                                210
Smith                                280
Smith                                350

and the first few rows from the query are:
SQL> select ename, deptno, sal,
  2         sum(sal) over
  3                (order by deptno, ename) running_total,
  4         sum(sal) over
  5                (partition by deptno
  6                 order by ename) department_total,
  7         row_number() over
  8                (partition by deptno
  9                 order by ename  ) seq
 10  from emp
 11  order by deptno, ename
 12  /

ENAME                   DEPTNO       SAL RUNNING_TOTAL DEPARTMENT_TOTAL       SEQ
-------------------- --------- --------- ------------- ---------------- ---------
Smith                        1        70        353500           353500         1
Smith                                140        353500           353500         2
Smith                                210        353500           353500         3
Smith                                280        353500           353500         4
Smith                                350        353500           353500         5
 

Tom Kyte
February 12, 2003 - 8:38 am UTC

they all have the same ENAMES -- when you order by them -- they are all the SAME. Mine worked simply because the ENAMES are unique in the scott/tiger EMP table.

add

, rowid


to the query:

select ename, deptno, sal,
       sum(sal) over
           (order by deptno, ename, ROWID) running_total,
       sum(sal) over
           (partition by deptno
            order by ename, ROWID ) department_total,
       row_number() over
           (partition by deptno
            order by ename, ROWID ) seq
  from emp
 order by deptno, ename
/


or more simply, just run demobld.sql to get the real tables loaded up if you want to follow the examples blow by blow

Thanks

Jon, February 12, 2003 - 8:57 am UTC

I knew there had to be a reason. That's a really important thing to know, and it hadn't occurred to me. In the real world, we might end up with duplicate last names (although I don't live in "Smith-ville") and then you really won't get a running total unless you use the technique you just explained.

As always, you had the answer - you're the best, Tom!

Analytic Function in PL/SQL ref_cursor

Dale Ware, May 08, 2003 - 5:24 pm UTC

Tom,

I run the following anonymous PL/SQL block in my 8i instances.

set term off serverout off feed off timing off head off pause off arraysize 100
col owner form a16
col index_owner form a16
col blevel form 99,999
col leaf_blocks head 'Leaf|Blocks'
col blocks head 'Table|Blocks'
ttitle skip2 "-- Top 10 Indexes which are potential Candidates for Rebuilding" skip 2
variable x refcursor;
set term on serverout on head on pagesize 44 linesize 132 autoprint on
declare
pi_version varchar2(20);
pi_compatibility varchar2(20);
begin
dbms_utility.db_version(pi_version,pi_compatibility);
IF pi_version > '8.1.0.0.0' THEN
OPEN :x FOR 'select owner, index_name, leaf_blocks , blevel, tablespace_name, blocks
from (select idx.owner, index_name, leaf_blocks,
blevel, idx.tablespace_name, blocks,
rank () over (order by leaf_blocks desc nulls last) rank
from dba_indexes idx,
dba_tables tbl
where idx.table_owner = tbl.owner
and idx.table_name = tbl.table_name
and idx.tablespace_name is not null
and idx.leaf_blocks > (tbl.blocks * .75))
where rank <= 10';
END IF;
end;
/
set term off serverout off head off


However, when I run this in 8.0 instances it throws the following error:


OPEN :x FOR 'select owner, index_name, leaf_blocks , blevel, tablespace_name, blocks
*
ERROR at line 7:
ORA-06550: line 7, column 19:
PLS-00103: Encountered the symbol "select owner, index_name, leaf_blocks , blevel, tablespace_name," when expecting one of the
following:
select

Is it possible to run this in 8.0 and below? If so, how?

Thanks,
Dale

does this require analytic function or can be done without them

Asim Naveed, May 21, 2003 - 2:43 pm UTC

I am trying to write a SQL query for a question, but i
failed to make it work. I thought maybe analytic functions
could do something.
here are the create and insert statements.

Create table T1
(
QUARTER VARCHAR2(100),
DIVISION VARCHAR2(100),
SALES NUMBER(10)
);

INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div1', 3);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div1', 3);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div2', 13);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div2', 13);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div3', 43);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div3', 43);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div4', 23);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1998', 'Div1', 43);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1998', 'Div2', 33);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1998', 'Div3', 33);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1998', 'Div4', 4);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1998', 'Div1', 4);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1998', 'Div2', 23);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1998', 'Div3', 2);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1998', 'Div4', 1);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1998', 'Div1', 1);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1998', 'Div2', 23);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1998', 'Div3', 2);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1998', 'Div4', 21);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1999', 'Div1', 123);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1999', 'Div2', 3);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1999', 'Div3', 4);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1999', 'Div4', 6);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1999', 'Div1', 7);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1999', 'Div2', 8);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1999', 'Div3', 6);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1999', 'Div4', 5);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1999', 'Div1', 5);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1999', 'Div2', 4);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1999', 'Div3', 6);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1999', 'Div4', 7);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1999', 'Div1', 5);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1999', 'Div2', 55);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1999', 'Div3', 36);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1999', 'Div4', 43);


as you can see there are four divisions in each quarter.
the question is to show all those quarters in which
the sales of div1 are greater than the sales of div2. The
output should be like this:

QUARTER DIVISION SUM(SALES)
---------------------------------------------
Qtr1-1999 Div1 123
Qtr1-1999 Div2 3
Qtr2-1998 Div1 43
Qtr2-1998 Div2 33
Qtr3-1999 Div1 5
Qtr3-1999 Div2 4


to achieve the above result i wrote the following query:


Select quarter, division, sum(sales) from T1 where division in ('Div1', 'Div2') GROUP BY quarter, division having quarter
IN
(
SELECT quarter
FROM T1
GROUP BY QUARTER
HAVING SUM(DECODE(division, 'Div1', sales, 0))
> SUM(decode(division, 'Div2', sales, 0))
)

this query is working fine, but for some reason
I want a query without DECODE/CASE
which returns the same result. Could analytic functions help?

Thanks in advance



Tom Kyte
May 21, 2003 - 3:14 pm UTC

well, the right way to do this is in one pass.  You get more pleasing output as well.  we could unpivot the data using DUAL if necessary.  But DECODE is the best answer.  Just to make you happy, here is a non-decode (but slower) version.

ops$tkyte@ORA920> select quarter,
  2         sum(decode(division,'Div1',sales,null)) Div1,
  3         sum(decode(division,'Div2',sales,null)) div2
  4    from t1
  5   group by quarter
  6  having sum(decode(division,'Div1',sales,null)) > sum(decode(division,'Div2',sales,null))
  7  /

QUARTER          DIV1       DIV2
---------- ---------- ----------
Qtr1-1999         123          3
Qtr2-1998          43         33
Qtr3-1999           5          4

ops$tkyte@ORA920>
ops$tkyte@ORA920> select quarter,
  2         sum(instr(division,'Div1')*sales) Div1,
  3         sum(instr(division,'Div2')*sales) Div2
  4    from t1
  5  group by quarter
  6  having sum(instr(division,'Div1')*sales) > sum(instr(division,'Div2')*sales)
  7  /

QUARTER          DIV1       DIV2
---------- ---------- ----------
Qtr1-1999         123          3
Qtr2-1998          43         33
Qtr3-1999           5          4

ops$tkyte@ORA920>


 

reason of not using decode

Asim Naveed, May 22, 2003 - 3:46 am UTC

Thanks for your reply.
I am using Oracle Version 9i Release 2

Although your answer produces the information
without using decode, my problem is still not solved.
The reason I avoid decode/case (and now also instr)
is that I cannot break the expression SUM(sales)
into SUM(DECODE(division, 'Div1', sales, null)), because
i am writing dynamic SQL and the expression "sum(sales)"
is coming from a table at run time; this expression
could be anything, e.g.
SUM(sales), SUM(price*volume), COUNT(DISTINCT salesid),
COUNT(DISTINCT customerid), etc.

see

SELECT
quarter,
division,
"here goes the full expression which is coming from table"

Also my output requirement is to get divisions on rows
instead of columns, (how can we do this with dual)

Today I have written the following query which does
exactly what I want, but if possible i need a better
query.

SELECT quarter, division, SUM(sales)
from t1 c
where division IN ('Div1', 'Div2')
and quarter IN (
select quarter from t1 a where division = 'Div1'
GROUP BY quarter
HAVING SUM(sales) >
(select NVL(SUM(sales), 0)
from t1 b
where b.quarter = a.quarter AND division = 'Div2')
)
GROUP BY quarter, division


See, in the above query I just replace "SUM(sales)"
with whatever expression I get from
the table (e.g. COUNT(salesid), SUM(price*volume)) and it's done.

The main point is that a solution is required in which
the aggregate expression can be plugged in
as it comes from the table, without breaking that
expression.


Tom Kyte
May 23, 2003 - 7:55 am UTC

you can do this with decode, I see no issues with that at all. why do you believe otherwise?

anyway, guess you are back to your solution.

remember this:

generic is great
generic is slow
specific solutions will massively outperform generic solutions
it is up to you to make the tradeoff and decide which is more important.
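
as an aside, analytics can keep the plugged-in aggregate expression intact as a unit (it gets wrapped, never broken apart). A sketch against the t1 table above -- every occurrence of SUM(sales) below could be replaced wholesale by whatever expression comes from the table:

select quarter, division, agg
  from ( select quarter, division,
                SUM(sales) agg,
                -- the aggregate is wrapped whole inside decode and the analytic max
                max( decode( division, 'Div1', SUM(sales) ) ) over (partition by quarter) div1_agg,
                max( decode( division, 'Div2', SUM(sales) ) ) over (partition by quarter) div2_agg
           from t1
          where division in ('Div1', 'Div2')
          group by quarter, division )
 where div1_agg > nvl( div2_agg, 0 )
 order by quarter, division;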

One more thing to mention about instr

Asim Naveed, May 22, 2003 - 3:54 am UTC

your solution with instr can produce wrong results
if the input divisions are like this:
first division = 'Div1'
second division = 'Div1Div2'


Tom Kyte
May 23, 2003 - 7:55 am UTC

i never really expected anyone to use it, it would be "wrong"

Question

PRS, June 10, 2003 - 2:21 pm UTC

Hi Tom,
I have a table named comments. It has the following columns.
opp_id number(10);
comm_id number(10);
commoent varchar2(4000);

Data is shown as below.

opp_id comm_id comment
1 101 test1
1 202 test2
1 300 test3
1 550 test4
2 220 test1
3 301 test1
3 501 test2

Here opp_id and comm_id are both generated by an Oracle sequence.

I need to transfer this data into the following table.

RSF_COMMENT
opp_id number(10);
comm_id number(10);
commoent varchar2(4000);

But here comm_id is a sequential number. Data is shown below.

opp_id comm_id comment
1 1 test1
1 2 test2
1 3 test3
1 4 test4
2 1 test1
3 1 test1
3 2 test2

I cannot use a straight INSERT INTO ... SELECT as I need to
derive a sequential comment id; that could be achieved by a PL/SQL procedure.

Is there any way we can derive the sequential id in the
SELECT statement using an analytic function?

Thanks for your help,
PRS

Tom Kyte
June 10, 2003 - 2:26 pm UTC

ops$tkyte@ORA920> insert into t2
  2  select opp_id, row_number() over (partition by opp_id order by comm_id), com
  3    from t1
  4  /

7 rows created.

ops$tkyte@ORA920> select * from t2;

    OPP_ID    COMM_ID COM
---------- ---------- -----
         1          1 test1
         1          2 test2
         1          3 test3
         1          4 test4
         2          1 test1
         3          1 test1
         3          2 test2

7 rows selected. 

Analytics can solve this one i suppose -- for Asim Naveed

A reader, June 11, 2003 - 3:43 pm UTC

select quarter, division, sum(sales) over (partition by quarter, division) sum_sales
  from t1
 where quarter in ('Qtr1-1999', 'Qtr2-1998', 'Qtr3-1999')
   and division in ('Div1', 'Div2')
RKPA01> /

QUARTER DIVISION SUM_SALES
---------- ---------- ----------
Qtr1-1999 Div1 123
Qtr1-1999 Div2 3
Qtr2-1998 Div1 43
Qtr2-1998 Div2 33
Qtr3-1999 Div1 5
Qtr3-1999 Div2 4

6 rows selected.


for MN from USA

Asim Naveed, June 13, 2003 - 11:59 am UTC

MN,

Please read the question carefully; i asked for those
records in which sum(sales) of div1 is greater than
sum(sales) of div2. the output has to show both records
(i.e. of div1 and div2)

i.e. qtr1-1998 div1 200000
     qtr1-1998 div2 1000

see above: div1 sales are greater than div2 in quarter 1 of
1998.

Is this a candidate for Analytic function

Arun, July 16, 2003 - 9:54 am UTC

Tom,
I have a table with this kind of data

REQID C_NODE P_NODE
3838 1 0
3838 2 9
3838 2 10
3838 3 11
3838 7 3
3838 7 8
3838 8 11
3838 9 3
3838 9 8
3838 10 7
3838 11 1
Now i need a result set such that, for a criterion of 10
i will get
7
3
8
11
1

ie
3838 10 7
3838 7 3
3838 7 8
3838 3 11
3838 8 11
3838 11 1

How could i go about this? I thought i could use analytic functions, but found they only give running totals and the like. Is there a possibility of doing this without procedures?

Thanks for all your help,
Regards,
Arun


Tom Kyte
July 16, 2003 - 10:43 am UTC

proof that what is obvious to one person is just a meaningless jumble to another.

"Now i need a result set such that, for a criterion of 10
i will get "

a criterion of 10 what? against what?

I cannot fathom how you got from input data to that result set

Sorry

A reader, July 16, 2003 - 11:04 am UTC

Tom,
Sorry for wasting your time. It never occurred to me... I had been wrestling with this
problem for the past 4 hours :-(...

For the table

REQID C_NODE P_NODE
3838 1 0
3838 2 9
3838 2 10
3838 3 11
3838 7 3
3838 7 8
3838 8 11
3838 9 3
3838 9 8
3838 10 7
3838 11 1

i need to print
7
3
8
11
1
when the where clause is this

select ...... <tablename > where cnode = 10

which comes by a look up like this
C_NODE P_NODE
10 ---->>>>> 7
---->>> 7 ---->>>>> 3
7 ---->>>>> 8
3 ---->>>>> 11
Regards,
Arun

Tom Kyte
July 16, 2003 - 11:28 am UTC

Much clearer -- that is a hierarchical query -- connect by is the answer, analytics not so useful.  consider:

ops$tkyte@ORA920LAP> column graph format a10
ops$tkyte@ORA920LAP> select rpad( '*', 2*level, '*' ) graph, level, t.*
  2    from t
  3   start with c_node = 10
  4   connect by prior p_node = c_node;

GRAPH           LEVEL      REQID     C_NODE     P_NODE
---------- ---------- ---------- ---------- ----------
**                  1       3838         10          7
****                2       3838          7          3
******              3       3838          3         11
********            4       3838         11          1
**********          5       3838          1          0
****                2       3838          7          8
******              3       3838          8         11
********            4       3838         11          1
**********          5       3838          1          0

9 rows selected.

<b>that is your entire tree -- looking at this:</b>


ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> select rpad( '*', 2*level, '*' ) graph, level, t.*
  2    from t
  3   start with c_node = 10
  4   connect by prior p_node = c_node and level <= 4;

GRAPH           LEVEL      REQID     C_NODE     P_NODE
---------- ---------- ---------- ---------- ----------
**                  1       3838         10          7
****                2       3838          7          3
******              3       3838          3         11
********            4       3838         11          1
****                2       3838          7          8
******              3       3838          8         11
********            4       3838         11          1

7 rows selected.

<b>it looks like you want everything at level 4 and below in the tree... so</b>


ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> select distinct p_node
  2    from t
  3   start with c_node = 10
  4   connect by prior p_node = c_node and level <= 4;

    P_NODE
----------
         1
         3
         7
         8
        11

<b>there you go</b> 

Thanks

Arun, July 18, 2003 - 1:11 am UTC

Thank you very much Tom.
Regards,
Arun

can I use analytic here?

steve, July 25, 2003 - 4:24 pm UTC

Hi Tom,

I have 2 tables:
table1: dept#, itemId, ...
table2: item_id, item_type,...

table2 contains all the items. And I just want to insert
into table1 the items of type 'A' or 'B' which exist in table2 but are not yet in each dept of table1.
e.g.
dept# itemid
----------------
1 100
1 101
2 102
2 103
3 100
4 103
5 101

T2:
item_id type
------- ----
100 A
101 A
102 B
103 C
104 C
105 A

and result I expect:
dept# itemid
----------------
1 100
1 101
1 102
1 105
2 102
2 103
2 100
2 101
2 105
.... ...
5 100
5 101
5 102
5 105

I can get the result with the following sql.
Can I use an analytic function to reach the same goal?
because I am thinking that if a new dept is added, I will have to add new code.

insert into t1 (dept_no,item_id)
select 1,item_id
FROM t2
WHERE item_type in ('B', 'A')
and item_id not in ( select item_id from t1
where t1.dept_no = 1
)
union all
insert into t1 (dept_no,item_id)
select 2,item_id
FROM t2
WHERE item_type in ('B', 'A')
and item_id not in ( select item_id from t1
where t1.dept_no = 2
)
union all
...
union all
insert into t1 (dept_no,item_id)
select 5,item_id
FROM t2
WHERE item_type in ('B', 'A')
and item_id not in ( select item_id from t1
where t1.dept_no = 5
)



Thanks!

Steve


Tom Kyte
July 25, 2003 - 5:10 pm UTC

insert into t1 ( dept_no, item_id )
select *
from (select distinct dept_no from t1 ),
(select item_id from t2 )
where (dept_no, item_id ) NOT IN ( select dept_no, item_id from t1 );

cartesian product all of the deptnos with all of the item ids and insert the ones that do not exist.

A minor correction

A reader, July 25, 2003 - 8:10 pm UTC

A minor correction: Item_type needs to be added.

insert into t1 ( dept_no, item_id )
select *
from (select distinct dept_no from t1 ),
(select item_id from t2 where item_type in ('A', 'B'))
where (dept_no, item_id ) NOT IN ( select dept_no, item_id from t1 );


rank() and plsql

rd, August 25, 2003 - 10:07 am UTC

Hi,

Please tell me why the following code fails.

-----------------
declare
type cur_type is ref cursor;
c1 cur_type;
vSql varchar2(500):=
'select acc.rowid accrow
,dense_rank() over (partition by acc.social_security_number
order by limit_increase_amount desc
,account_utilisation desc
,account_number desc
) dr
from cm_accounts acc
,cm_customer cus
where acc.social_security_number=cus.social_security_number
and acc.social_security_number=''3507249658'')';

type rid_tab is table of rowid index by binary_integer;
type rank_tab is table of number(2) index by binary_integer;
rids rid_tab;
drs rank_tab;
begin
open c1 for vSql;
loop
fetch c1 bulk collect into rids,drs limit 1000;
exit when c1%notfound;
forall i in 1 .. rids.count
update cm_accounts
set acc_rank=drs(i)
where rowid=rids(i);
end loop;
commit;
end ;
-----------------

Thanks

Tom Kyte
August 25, 2003 - 1:41 pm UTC

well, when I generated some tables to go with this (it would be nice if tables came with the test case), i got this first:

declare
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at line 4


so for me at least, 500 wasn't big enough there... upping vSQL to 5000, i got:

declare
*
ERROR at line 1:
ORA-00933: SQL command not properly ended
ORA-06512: at line 21

meaning your sql was bad...


running just your SQL, i see it was:

ops$tkyte@ORA920>             select acc.rowid accrow
  2                     ,dense_rank() over (partition by acc.social_security_number
  3                                                    order by limit_increase_amount desc
  4                                                       ,account_utilisation desc
  5                                                       ,account_number desc
  6                                                         ) dr
  7              from   cm_accounts acc
  8                     ,cm_customer cus
  9              where  acc.social_security_number=cus.social_security_number
 10              and acc.social_security_number='3507249658')
 11  /
            and acc.social_security_number='3507249658')
                                                       *
ERROR at line 10:
ORA-00933: SQL command not properly ended


that's a paren that didn't need to be there. taking that out, I ran it and got:

ops$tkyte@ORA920> declare
  2  type cur_type is ref cursor;
  3  c1  cur_type;
  4  vSql varchar2(5000):=
  5              'select acc.rowid accrow
  6                     ,dense_rank() over (partition by acc.social_security_number
  7                                                    order by limit_increase_amount desc
  8                                                       ,account_utilisation desc
  9                                                       ,account_number desc
 10                                                         ) dr
 11              from   cm_accounts acc
 12                     ,cm_customer cus
 13              where  acc.social_security_number=cus.social_security_number
 14              and acc.social_security_number=''3507249658''';
 15
 16  type rid_tab is table of rowid index by binary_integer;
 17  type rank_tab is table of number(2) index by binary_integer;
 18  rids rid_tab;
 19  drs        rank_tab;
 20  begin
 21     open c1 for vSql;
 22     loop
 23            fetch c1 bulk collect into rids,drs limit 1000;
 24         exit when c1%notfound;
 25         forall i in 1 .. rids.count
 26                   update cm_accounts
 27                set     acc_rank=drs(i)
 28                where     rowid=rids(i);
 29     end loop;
 30     commit;
 31  end ;
 32  /
 
PL/SQL procedure successfully completed.

Now, if you are in 8i (you never said), you would find that this will raise:

declare
*
ERROR at line 1:
ORA-01001: invalid cursor
ORA-06512: at line 23
 

and that is because BULK FETCH on a dynamically opened refcursor was simply "not supported" then.  The solution would be:

ops$tkyte@ORA817DEV> create or replace view v
  2  as
  3  select acc.rowid accrow
  4         ,dense_rank() over (partition by acc.social_security_number
  5                             order by limit_increase_amount desc
  6                                     ,account_utilisation desc
  7                                     ,account_number desc) dr
  8     from  cm_accounts acc
  9          ,cm_customer cus
 10    where  acc.social_security_number=cus.social_security_number
 11           and acc.social_security_number=userenv('client_info');
 
View created.
 
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV> declare
  2          type rid_tab is table of rowid index by binary_integer;
  3          type rank_tab is table of number(2) index by binary_integer;
  4          rids rid_tab;
  5          drs  rank_tab;
  6
  7          cursor c1 is select * from v;
  8  begin
  9          dbms_application_info.set_client_info( '3507249658' );
 10          open c1;
 11          loop
 12                  fetch c1 bulk collect into rids,drs limit 1000;
 13          forall i in 1 .. rids.count
 14            update cm_accounts set acc_rank=drs(i) where rowid=rids(i);
 15           exit when c1%notfound;
 16     end loop;
 17     commit;
 18  end ;
 19  /
 
PL/SQL procedure successfully completed.


<b>NOTE the important MOVING Of the exit when c1%notfound!!!!! it would be a bug to have it before the update -- else your last fetch of say 999 rows, which raises the c1%notfound, would never get processed!!!!!</b>







 

A reader, August 25, 2003 - 7:15 pm UTC

Hi,
Here I have something about analytic functions; I hope it helps somebody.
</code> http://www.geocities.com/juancarlosreyesp/OracleAskTom.pdf <code>

:)

Helena Marková, August 26, 2003 - 5:19 am UTC


Analytics?

Steve, September 18, 2003 - 10:16 am UTC

I have a table with the following data:
location cpyear period prod_code, cases, cups
99951 2002 12 EU17 1 72
99951 2003 5 EU17 3 216
99951 2002 13 EU17 1 72

and I need output as:

99951 2003 5 EU17 5 360

As you can see cases and cups are summed and I'm looking for the latest year and period.

Thanks

Steve

Tom Kyte
September 18, 2003 - 11:13 am UTC

ops$tkyte@ORA920> select location, max( cpyear * 100 + period ), prod_code, sum(cases), sum(cups)
  2    from t
  3   group by location, prod_code;
 
  LOCATION MAX(CPYEAR*100+PERIOD) PROD_CODE  SUM(CASES)  SUM(CUPS)
---------- ---------------------- ---------- ---------- ----------
     99951                 200305 EU17                5        360
 
<b>or...</b>


ops$tkyte@ORA920>
ops$tkyte@ORA920> select location,
  2         trunc(max( cpyear * 100 + period )/100) cpyear,
  3         mod(max( cpyear * 100 + period ),100) period,
  4         prod_code, sum(cases), sum(cups)
  5    from t
  6   group by location, prod_code;
 
  LOCATION     CPYEAR     PERIOD PROD_CODE  SUM(CASES)  SUM(CUPS)
---------- ---------- ---------- ---------- ---------- ----------
     99951       2003          5 EU17                5        360
 

Steve, September 18, 2003 - 11:40 am UTC

Excellent thanks! I was convinced that I needed to use analytics and so couldn't see the wood for the trees!

Steve

Going further

bobspa, October 08, 2003 - 8:25 pm UTC

Tom,

is it possible to convert this code fragment using analytics => for performance improvement(?) [ i'm not yet too familiar with this 8i feature -- been sleeping for years :( ]. i'm using 8.1.6 EE

select bank_id, txn_date,acct_type,dep_cd,dep_term, interest_rate,currency_code,sum(amt_in_dollars),sum(amt_in_org_curr),count(distinct dep_name), tax_disc
from deposit_txns
group by bank_id, txn_date,acct_type,dep_cd,dep_term, interest_rate,currency_code,tax_disc

the table deposit_txns has millions of rows and is taking a lot of time. best time is 8 minutes elapsed. (sort area is already 8KB)

thanks and best regards

Tom Kyte
October 09, 2003 - 4:04 pm UTC



there is no need for analytics there at all, they do not apply at all in this case. this query only does normal aggregation.


sort area is already '8kb'?? is that a typo?



Calculating count Using analytic functions

Sridhar Reddy, October 27, 2003 - 5:25 pm UTC

Hi Tom,
I have a small issue for which i could not find a solution on this site, so i am asking you.

I have a table with the following values:
date varchar2 number number
1/12/2002 first 1 2
1/11/2002 first 3 4
1/10/2002 first 5 6
1/13/2002 second 7 8
1/14/2002 second 9 10

I would like to select something like this.
date varchar2 number number count
1/12/2002 first 1 2 3
1/11/2002 first 3 4 3
1/10/2002 first 5 6 3
1/13/2002 second 7 8 2
1/14/2002 second 9 10 2

do a count of the values on the varchar2 column and assign it. is this possible in one single statement? please help

Thanks,
Sridhar






Tom Kyte
October 27, 2003 - 6:49 pm UTC

ops$tkyte@ORA920> select d, v, n1, n2, count(*) over (partition by v)
  2    from t;
 
D          V                  N1         N2 COUNT(*)OVER(PARTITIONBYV)
---------- ---------- ---------- ---------- --------------------------
01/12/2002 first               1          2                          3
01/11/2002 first               3          4                          3
01/10/2002 first               5          6                          3
01/13/2002 second              7          8                          2
01/14/2002 second              9         10                          2
 

Thanks a lot Tom

Sridhar Reddy, October 28, 2003 - 9:07 am UTC

Tom ,
It helped me. thanks a lot for your quick response.
Thanks,
Sridhar.

Taking it a step ahead

Sridhar Reddy, October 28, 2003 - 9:23 am UTC

Tom,
Your response helped me a lot, but i was curious how we can do the same thing taking into account another column.
date v2 n1 n2
1/12/2002 first 10 2
1/11/2002 first 10 4
1/10/2002 first 10 6
1/13/2002 first 20 8
1/14/2002 first 20 10
1/12/2002 two 10 2
1/11/2002 two 10 4
1/10/2002 two 10 6
1/13/2002 two 20 8
1/14/2002 two 20 10

I would like to select something like this.
date v2 n1 n2 count
1/12/2002 first 10 2 3
1/11/2002 first 10 4 3
1/10/2002 first 10 6 3
1/13/2002 first 20 8 2
1/14/2002 first 20 10 2
1/12/2002 two 10 2 3
1/11/2002 two 10 4 3
1/10/2002 two 10 6 3
1/13/2002 two 20 8 2
1/14/2002 two 20 10 2

What should we do if we need to select the count based on N1?
even though "first" is repeated 5 times, i want to get 3 and 2 as the count, based on the column N1. Is this possible in one single statement? Please help.
Thanks,
Sridhar



Tom Kyte
October 28, 2003 - 10:21 am UTC

count(*) over (partition by v2, n1)
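
i.e., a minimal sketch against the sample data above (assuming the columns are named d, v2, n1 and n2, as in the earlier example):

select d, v2, n1, n2,
       count(*) over (partition by v2, n1) cnt
  from t;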



Another Example

BK, October 28, 2003 - 11:22 am UTC

Thought of contributing my 2 cents. I don't know how this could be achieved without an analytic function.

select description,
       sales_account,
       count(*) over (partition by sales_account) Count_SA,
       cost_of_sales_account,
       count(*) over (partition by cost_of_sales_account) Count_COSA
  from mtl_system_items

Hope this helps.

can I use analytical function in having clause

A reader, February 06, 2004 - 8:53 pm UTC

I need the order_ids which were packed by different people.
Let's say:
ID packed_by
123 james
123 james
122 james
122 james
122 jeff
140 jeff
140 jeff
128 jeff
128 james
I need orders with id 122 and 128.
Can I use a select something like this:
select id, packed_by
from t1,t1
where t1.ID=t2.ID
having count(*) over (partition by id) <>
count(*) over (partition by packed_by)


Tom Kyte
February 07, 2004 - 2:24 pm UTC

select id, packed_by
from t
where id in (select id from t group by id having count(distinct packed_by)>1)
/

or

select id, packed_by, cnt
from ( select id, packed_by, count(distinct packed_by) over (partition by id)cnt
from t )
where cnt > 1;



Tom...

Denise, February 20, 2004 - 3:26 pm UTC

please bear with me...I know I've been a real pest recently.
I'm working on several different projects...the primary
one being getting the required tables loaded into the 9i
db so I can start doing analysis reports using Analytic
Functions.

I mentioned in a previous thread about this matrix report.

I found a pl/sql block that another programmer was developing to run this report 3 years ago. It never got completed; however, I'm looking at it from the standpoint of a flowchart.

I've read somewhere that Analytic Functions can now be
used with PL/SQL in 9i as long as you compile the cursor
SELECT as a view and select from the VIEW instead of
the actual TABLE.

I'm reviewing Chapter 12 of your book.

Perhaps, given the power of analytic functions in SQL, I should
just scrap PL/SQL altogether.

what is your take on this?

thanks...and I promise to stay away from your website(
for the rest of the day!)





Tom Kyte
February 20, 2004 - 3:31 pm UTC

In 8i, you had to "hide" analytics with dynamic sql or a view.

In 9i, just "code away", you don't have to hide them anymore.

analytics work great in plsql in 9i -- no issues.
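
for example, a minimal sketch that compiles as-is in 9i (using the stock EMP table):

begin
    -- an analytic function directly inside a PL/SQL cursor -- no view,
    -- no dynamic sql needed in 9i
    for r in ( select ename,
                      rank() over (order by sal desc) sal_rank
                 from emp )
    loop
        dbms_output.put_line( r.ename || ' is #' || r.sal_rank || ' by salary' );
    end loop;
end;
/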


Is this doable in a single statement

Kishan, April 15, 2004 - 11:02 am UTC

The following is part of a big table, where data is queried from different tables and populated. We are converting the table into a view. Is it possible to populate these columns in one or two statements using analytics or some other better way?

create view p_flat as
select ...
...
(select count(*) over(partition by p_link_id) from p_flat where ast_cls ='1' and prop_sts = '1') as tot_l_inv,
(select count(*) over(partition by p_link_id from p_flat where ast_cls ='1' and prop_sts = '1') as tot_e_inv,
(select count(*) over(partition by p_link_id) from p_flat where ast_cls ='1' and prop_sts = '1') as tot_b_inv,
(select tot_l_inv + tot_e_inv + tot_b_inv from dual) as tot_r_inv
...);


create table p_temp as
select distinct(p_link_id) PLINK_ID, count(*) p_count
from p_flat
where ast_cls = '1' and prop_sts = '1'
group by p_link_id;

update p_flat t0 set (tot_l_inv)=
(select p_count from p_temp t1 where t0.p_link_id = t1.PLINK_ID);

create table p_temp as
select distinct(p_link_id) PLINK_ID, count(*) p_count
from p_flat
where ast_cls = '3' and prop_sts = '1'
group by p_link_id;

update p_flat t0 set (tot_e_inv)=
(select p_count from p_temp t1 where t0.p_link_id = t1.PLINK_ID);

create table p_temp as
select distinct(p_link_id) PLINK_ID, count(*) p_count
from p_flat
where ast_cls = '2' and prop_sts = '1'
group by p_link_id;

update p_flat t0 set (tot_b_inv)=
(select p_count from p_temp t1 where t0.p_link_id = t1.PLINK_ID);

update p_flat t0 set tot_r_inv=tot_l_inv + tot_e_inv + tot_b_inv;

Tom Kyte
April 15, 2004 - 11:22 am UTC

sorry -- i don't understand that first view (didn't read further).



Analytical aggregate based on data sets

Lee, April 15, 2004 - 11:56 am UTC

Tom,

I have a situation that I need to tackle in a query. I tried using analytic functions but I can't figure out exactly how to do this:

I have a table with 10 columns, 2 of which are date_effective and date_termination; 1 other column is person_id.

I need to find the min date_effective and max date_termination for each person_id, for records where everything but the dates is identical.

This is my problem, if I have the following record set:

PERSON_ID DATE_EFFECTIVE DATE_TERMINATION COL_ZZZ
1 01-JAN-01 01-FEB-01 A
1 02-FEB-01 01-MAR-01 A
1 02-APR-01 01-MAY-01 B
1 02-MAY-01 01-JUN-01 A
1 02-JUN-01 01-JUL-01 A
1 02-JUL-01 01-AUG-01 A

If I use the regular min and max analytic functions partitioned by the columns, I'll get the following for the set of records with all identical values:

MIN_DATE_EFFECTIVE MAX_DATE_TERMINATION
01-JAN-01 01-AUG-01
01-JAN-01 01-AUG-01
02-APR-01 01-MAY-01
01-JAN-01 01-AUG-01
01-JAN-01 01-AUG-01
01-JAN-01 01-AUG-01

However, I want each group to have its own min and max dates based on the date order, like the following:

MIN_DATE_EFFECTIVE MAX_DATE_TERMINATION
01-JAN-01 01-MAR-01
01-JAN-01 01-MAR-01
02-APR-01 01-MAY-01
01-MAY-01 01-AUG-01
01-MAY-01 01-AUG-01
01-MAY-01 01-AUG-01

I want to say something like:
Give me the min and max dates where all the other values are identical, from a group that is ordered by date and unbroken by non-identical values. Every time that a group is "broken", a new group starts.

PERSON_ID DATE_EFFECTIVE DATE_TERMINATION COL_ZZZ
1 01-JAN-01 01-FEB-01 A group 1
1 02-FEB-01 01-MAR-01 A group 1
1 02-APR-01 01-MAY-01 B group 2
1 02-MAY-01 01-JUN-01 A group 3
1 02-JUN-01 01-JUL-01 A group 3
1 02-JUL-01 01-AUG-01 A group 3




I know I can probably do it with PL/SQL, but this is part of an INSERT FIRST query and I'd really like to do it in SQL; I tried different variations of analytic functions but couldn't get it.

Your help is appreciated.

Thanks.








Tom Kyte
April 15, 2004 - 12:51 pm UTC

ops$tkyte@ORA9IR2> select person_id, edate, tdate, col_zzz, last_col_zzz,
  2         max(
  3         case when rn = 1
  4                   OR (last_col_zzz <> col_zzz)
  5                   OR (last_col_zzz is null and col_zzz is not null)
  6                   OR (last_col_zzz is not null and col_zzz is null )
  7              then rn
  8          end) over (partition by person_id order by edate) grp
  9    from (
 10  select person_id, edate, tdate, col_zzz,
 11         row_number() over (partition by person_id order by edate) rn,
 12         lag(col_zzz) over (partition by person_id order by edate) last_col_zzz
 13    from t
 14         )
 15  /
 
 PERSON_ID EDATE     TDATE     C L        GRP
---------- --------- --------- - - ----------
         1 01-JAN-01 01-FEB-01 A            1
         1 02-FEB-01 01-MAR-01 A A          1
         1 02-APR-01 01-MAY-01 B A          3
         1 02-MAY-01 01-JUN-01 A B          4
         1 02-JUN-01 01-JUL-01 A A          4
         1 02-JUL-01 01-AUG-01 A A          4
 
6 rows selected.
 

<b>If zzz is NOT NULL, you do not need the other checks (is null/is not null), to support "more columns", it would look like this:</b>


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> alter table t add col_aaa char(1);
 
Table altered.
 
ops$tkyte@ORA9IR2> update t set col_aaa = rownum;
 
6 rows updated.
 
ops$tkyte@ORA9IR2> update t set col_aaa = 1 where col_aaa = 2;
 
1 row updated.
 
ops$tkyte@ORA9IR2> update t set col_aaa = 5 where col_aaa = 4;
 
1 row updated.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select person_id, edate, tdate,
  2         col_zzz,
  3         col_aaa,
  4         last_col_aaa,
  5         last_col_zzz,
  6         max(
  7         case when rn = 1
  8                   OR (last_col_zzz <> col_zzz)
  9                   OR (last_col_zzz is null and col_zzz is not null)
 10                   OR (last_col_zzz is not null and col_zzz is null )
 11                   OR (last_col_aaa <> col_aaa)
 12                   OR (last_col_aaa is null and col_aaa is not null)
 13                   OR (last_col_aaa is not null and col_aaa is null )
 14              then rn
 15          end) over (partition by person_id order by edate) grp
 16    from (
 17  select person_id, edate, tdate, col_zzz, col_aaa,
 18         row_number() over (partition by person_id order by edate) rn,
 19         lag(col_zzz) over (partition by person_id order by edate) last_col_zzz,
 20         lag(col_aaa) over (partition by person_id order by edate) last_col_aaa
 21    from t
 22         )
 23  /
 
 PERSON_ID EDATE     TDATE     C C L L        GRP
---------- --------- --------- - - - - ----------
         1 01-JAN-01 01-FEB-01 A 1              1
         1 02-FEB-01 01-MAR-01 A 1 1 A          1
         1 02-APR-01 01-MAY-01 B 3 1 A          3
         1 02-MAY-01 01-JUN-01 A 5 3 B          4
         1 02-JUN-01 01-JUL-01 A 5 5 A          4
         1 02-JUL-01 01-AUG-01 A 6 5 A          6
 
6 rows selected.
 
 

Query

PRS, April 15, 2004 - 11:58 am UTC

I have table named PS_RSF_OPP_NOTE as shown below.
ps_rsf_opp_note
(opportunity_id VARCHAR2(15) NOT NULL,
note_seq_nbr NUMBER(*,0) NOT NULL,
rc_summary VARCHAR2(80) NOT NULL,
rc_note_type VARCHAR2(5) NOT NULL,
rc_visibility VARCHAR2(4) NOT NULL,
row_added_dttm DATE,
row_added_oprid VARCHAR2(30) NOT NULL,
row_lastmant_dttm DATE,
row_lastmant_oprid VARCHAR2(30) NOT NULL,
syncid NUMBER(*,0),
syncdttm DATE,
descrlong LONG)

I want to find out the last descrlong column for a given
opportunity_id and max(note_seq_nbr). I wrote the following query.
select descrlong from ps_rsf_opp_note h where h.opportunity_id = '498624973' and
note_seq_nbr = (select count(*) from ps_rsf_opp_note g where g.opportunity_id = '498624973')
This works, but I am making two lookups to PS_RSF_OPP_NOTE: the 1st to find out the max sequence number and
the 2nd to get the actual record with the last description.

This is a huge table with 35 million rows. Is there any analytic function available so I can do this in one lookup?
I have descrlong column datatype LONG.

Any idea is appreciated.

Thanks for your help.
PRS

Tom Kyte
April 15, 2004 - 12:53 pm UTC

select ...
  from ( select ....
           from t
          where ....
          order by note_seq_nbr DESC )
 where rownum = 1;

is but one approach.


select ...
  from ( select ...., MAX(note_seq_nbr) over () max_note_seq_nbr
           from t
          where .... )
 where note_seq_nbr = max_note_seq_nbr
/

is another.

Kishan, April 15, 2004 - 12:16 pm UTC

Let me try and explain...

The update statements are part of a table.
CREATE table p_flat as
(ast_id number(10) primary key,
plink_id varchar2(10),
ast_cls varchar2(10),
prop_sts varchar2(10),
tot_l_inv number(10),
tot_e_inv number(10),
tot_b_inv number(10),
tot_r_inv number(10)
...
...
...)
Records are inserted/updated to this table from different tables using several sql scripts. This process takes 30-45 minutes daily. We are re-engineering this table by converting it to a view/materialized view and hoping to cut down the process time. We are creating the view by reviewing each script and building the appropriate columns by selecting them from corresponding tables given in the script or computing it on the fly.

The statements that I had posted earlier is part of one such script. I wanted to know if we could compute those columns on the fly and hence have that information computed by the database rather than saved in a table.

So instead of the table we want to have this view as follows:

create view p_flat as
SELECT x.*,
t2.p_link_id AS P_LINK_ID,
(select count(*) over(partition by p_link_id) from p_flat where ast_cls ='1' and prop_sts = '1') as tot_l_inv,
(select count(*) over(partition by p_link_id from p_flat where ast_cls ='1' and prop_sts = '1') as tot_e_inv,
(select count(*) over(partition by p_link_id) from p_flat where ast_cls ='1' and prop_sts = '1') as tot_b_inv,
(select tot_l_inv + tot_e_inv + tot_b_inv from dual) as tot_r_inv
...
...
FROM (select ast_id,c1,c2,c3....
FROM from T1) x
FROM T2,t3,t4....
WHERE x.ast_id = t2.ast_id and <other conditions> ...

Hope I am clear.

Thanks for all the help in advance.

Regards,
Kishan.

Tom Kyte
April 15, 2004 - 12:59 pm UTC

i don't understand why "partition by" would be in a scalar subquery -- that is what is confusing me

partition by would imply "more than one row" and a scalar subquery cannot return more than one row.


and all of those columns look "the same" to me.




Inline view gives me error

PRS, April 15, 2004 - 1:40 pm UTC

The following view gives me ORA-00904 invalid identifier. As soon as I added the inline view join to the FROM clause, it stopped seeing the outer tables.
SELECT B.opportunity_id
,B.BO_ID_CUST
,A.WSI_ACCOUNT_ID
,E.WSI_ACCOUNT_NO
,B.FIRST_NAME
,B.LAST_NAME
,B.OPPORTUNITY_STATUS
,f.xlatlongname status_desc
,A.TD_OPP_FLW_UP_ACT
,g.xlatlongname acat_desc
,B.PERSON_ID
,D.NAME
,A.RA_CAMPAIGN_ID
,C.RA_CMPGN_NAME
,B.ROW_ADDED_DTTM
,B.ROW_LASTMANT_DTTM
,B.NEXT_ACTIVITY_DT
,B.ACT_CLOSE_DT
,B.ACT_REVENUE
,B.EST_REVENUE
,B.PHONE
,'' dummy

,h.descrlong
,', Created By ' || h.row_added_oprid || ', on ' || h.row_added_dttm note
FROM PS_WSI_RSF_OPP_FLD A
, PS_RSF_OPPORTUNITY B
, PS_RA_CAMPAIGN C
, PS_RD_PERSON_NAME D
, PS_WSI_ACCOUNT E
,PSXLATITEM F
,PSXLATITEM G
,(select descrlong,row_added_oprid,row_added_dttm
from ( select opportunity_id,descrlong,row_added_oprid,row_added_dttm
from ps_rsf_opp_note k
where k.opportunity_id = b.opportunity_id
order by note_seq_nbr desc)
where rownum = 1) h
WHERE A.OPPORTUNITY_ID = B.OPPORTUNITY_ID
AND A.RA_CAMPAIGN_ID = C.RA_CAMPAIGN_ID(+)
AND B.PERSON_ID = D.PERSON_ID(+)
AND A.WSI_ACCOUNT_ID = E.WSI_ACCOUNT_ID(+)
AND f.fieldvalue = b.opportunity_status
AND f.fieldname = 'OPPORTUNITY_STATUS'
AND e.wsi_account_no = '33312345'
AND g.fieldvalue = A.TD_OPP_FLW_UP_ACT
AND g.fieldname = 'TD_OPP_FLW_UP_ACT'


Tom Kyte
April 15, 2004 - 2:19 pm UTC

same answer as the last page you put this on.


please -- posting it just once will do.

Sorry for confusing you....

Kishan, April 15, 2004 - 2:39 pm UTC

Man...I am so muddled up that I end up confusing even "THE TOM"....:)

What we want to do is to compute the columns tot_l_inv, tot_e_inv, tot_b_inv as the count of unique p_link_id values where the ast_cls and prop_sts conditions are met.
My SQL skills are not great, and I guess the statements below do the job.

What is the best way to do this?

create view p_flat as
SELECT x.*,
T2.p_link_id AS P_LINK_ID,
(select count(*) from p_flat p1 where ast_cls ='1' and prop_sts = '1' and p1.p_link_id = T2.p_link_id group by p1.p_link_id) as tot_l_inv,
(select count(*) from p_flat p1 where ast_cls ='2' and prop_sts = '1' and p1.p_link_id = T2.p_link_id group by p1.p_link_id) as tot_e_inv,
(select count(*) from p_flat p1 where ast_cls ='3' and prop_sts = '1' and p1.p_link_id = T2.p_link_id group by p1.p_link_id) as tot_b_inv
...
...
FROM (select ast_id FROM from T1) x
FROM T2,T3,T4....
WHERE x.ast_id = T2.ast_id;


Tom Kyte
April 15, 2004 - 2:43 pm UTC

are you trying to transpose rows into columns?


i would not use a scalar subquery for that -- i would use decode/case and group by.

but i'm not following the example at all -- can you start from the beginning, WITH THE SOURCE table. this view with a self referencing thing is just confusing. you have a view p_flat that references p_flat -- not getting it.

This is what we require...

Kishan, April 16, 2004 - 9:45 am UTC

Dear Tom:
The tables and the required view information with sample data are given below. Is it possible to accomplish this in a single SQL statement? If not, what would be the best approach, as the number of records in the asset and prop tables are around 100,000 and 2,500,000 respectively and growing?

CREATE TABLE ASSET (
AST_ID NUMBER(10) PRIMARY KEY,
AST_CLS VARCHAR2(10),
PROP_STS VARCHAR2(10));

CREATE TABLE PROP(
AST_ID NUMBER(10) PRIMARY KEY REFERENCES ASSET(AST_ID),
P_LINK_ID NUMBER(10));

insert into asset (ast_id, ast_cls, prop_sts) values(100, '1', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(101, '1', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(102, '1', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(103, '2', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(104, '2', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(105, '3', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(106, '1', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(107, '1', '2');
insert into asset (ast_id, ast_cls, prop_sts) values(108, '1', '2');
insert into asset (ast_id, ast_cls, prop_sts) values(109, '4', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(110, '7', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(111, '7', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(112, '7', '1');

insert into prop(ast_id, p_link_id) values(100,'100');
insert into prop(ast_id, p_link_id) values(101,'100');
insert into prop(ast_id, p_link_id) values(102,'100');
insert into prop(ast_id, p_link_id) values(103,'200');
insert into prop(ast_id, p_link_id) values(104,'200');
insert into prop(ast_id, p_link_id) values(105,'300');
insert into prop(ast_id, p_link_id) values(106,'400');
insert into prop(ast_id, p_link_id) values(107,'400');
insert into prop(ast_id, p_link_id) values(108,'500');
insert into prop(ast_id, p_link_id) values(109,'500');

asset:
AST_ID AST_CLS PROP_STS
---------- ---------- ----------
100 1 1
101 1 1
102 1 1
103 2 1
104 2 1
105 3 1
106 1 1
107 1 2
108 1 2
109 4 1
110 7 1
111 7 1
112 7 1

prop:
AST_ID P_LINK_ID
---------- ----------
100 100
101 100
102 100
103 200
104 200
105 300
106 400
107 400
108 500
109 500

This is the expected result:
AST_ID AST_CLS PROP_STS P_LINK_ID TOT_L_INV TOT_E_INV TOT_B_INV TOT_R_INV
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
100 1 1 100 3 0 0 3
101 1 1 100 3 0 0 3
102 1 1 100 3 0 0 3
103 2 1 200 0 0 2 2
104 2 1 200 0 0 2 2
105 3 1 300 0 1 0 1
106 1 1 400 1 0 0 1
107 1 2 400 1 0 0 1
108 1 2 500 0 0 0 0
109 4 1 500 0 0 0 0
110 7 1 0 0 0 0
111 7 1 0 0 0 0
112 7 1 0 0 0 0

AST_ID NUMBER(10) - from asset
AST_CLS VARCHAR2(10) - from asset
PROP_STS VARCHAR2(10) - from asset
P_LINK_ID NUMBER(10) - from prop
TOT_L_INV NUMBER - computed value
TOT_E_INV NUMBER - computed value
TOT_B_INV NUMBER - computed value
TOT_R_INV NUMBER - computed value

Conditions:
1. asset.ast_id = prop.ast_id
2. tot_l_inv = for this p_link_id count all unique p_link_id where ast_cls = '1' and prop_sts = '1'
3. tot_e_inv = for this p_link_id count all unique p_link_id where ast_cls = '2' and prop_sts = '1'
4. tot_b_inv = for this p_link_id count all unique p_link_id where ast_cls = '3' and prop_sts = '1'
5. tot_r_inv = tot_l_inv + tot_e_inv + tot_b_inv

Thankful as always...
Kishan.

Tom Kyte
April 16, 2004 - 10:15 am UTC

much better to get the question and not an attempted answer :)


ops$tkyte@ORA9IR2> select x.*, tot_l_inv+tot_e_inv+tot_b_inv tot_r_inv
  2    from (
  3  select a.ast_id, a.ast_cls, a.prop_sts, b.p_link_id,
  4    count( case when ast_cls = '1' and prop_sts = '1' then b.p_link_id end )
  5                  over ( partition by p_link_id ) tot_l_inv,
  6    count( case when ast_cls = '2' and prop_sts = '1' then b.p_link_id end )
  7                  over ( partition by p_link_id ) tot_e_inv,
  8    count( case when ast_cls = '3' and prop_sts = '1' then b.p_link_id end )
  9                  over ( partition by p_link_id ) tot_b_inv
 10    from asset a, prop b
 11   where a.ast_id = b.ast_id(+)
 12         ) x
 13  /
 
AST_ID A P P_LINK_ID  TOT_L_INV  TOT_E_INV  TOT_B_INV  TOT_R_INV
------ - - --------- ---------- ---------- ---------- ----------
   100 1 1       100          3          0          0          3
   101 1 1       100          3          0          0          3
   102 1 1       100          3          0          0          3
   103 2 1       200          0          2          0          2
   104 2 1       200          0          2          0          2
   105 3 1       300          0          0          1          1
   106 1 1       400          1          0          0          1
   107 1 2       400          1          0          0          1
   108 1 2       500          0          0          0          0
   109 4 1       500          0          0          0          0
   110 7 1                    0          0          0          0
   111 7 1                    0          0          0          0
   112 7 1                    0          0          0          0
 
13 rows selected.
 

Please help

Ram, April 21, 2004 - 3:29 am UTC

Hello Tom,
Please see the code below
SQL> create table sales(monthly_trans_date date,total_sales number);

Table created.

SQL> insert into sales values('31-Jan-04',5000);
SQL> insert into sales values('29-Feb-04',15000);
SQL> insert into sales values('31-Mar-04',1500);
SQL> insert into sales values('30-Apr-04',6000);
SQL> insert into sales values('31-May-04',9000);
SQL> insert into sales values('30-Jun-04',7800);
SQL> insert into sales values('31-Jul-04',6700);
SQL> insert into sales values('30-Aug-04',5000);
SQL> insert into sales values('30-Sep-04',4000);
SQL> insert into sales values('31-Oct-04',2000);
SQL> insert into sales values('30-Nov-04',9000);
SQL> insert into sales values('31-Dec-04',6900);

SQL> select * from sales;

MONTHLY_T TOTAL_SALES
--------- -----------
31-JAN-04        5000
29-FEB-04       15000
31-MAR-04        1500
30-APR-04        6000
31-MAY-04        9000
30-JUN-04        7800
31-JUL-04        6700
30-AUG-04        5000
30-SEP-04        4000
31-OCT-04        2000
30-NOV-04        9000
31-DEC-04        6900

12 rows selected.

My requirement is:
I would like to have a quarter-wise total of sales as

 SQL> SELECT ....
    Quarter     sum_sales
    -------     ----------
    JAN-MAR      21500
    APR-JUN      22800
    JUL-SEP      15700
    OCT-DEC      17900

I am in need of the query -- could you please provide it?
I think we can make use of analytic functions with the "range interval"
option, but I don't know how to proceed with the query.

 

Tom Kyte
April 21, 2004 - 8:03 pm UTC

no analytics here, simple "aggregate"

ops$tkyte@ORA9IR2> select to_char(dt,'MON')||'-'||
  2         to_char(add_months(dt,3),'MON') dt,
  3         s
  4    from (
  5  select trunc(monthly_trans_date,'q') dt, sum(total_sales) s
  6    from sales
  7   group by trunc(monthly_trans_date,'q')
  8         )
  9  /
 
DT               S
------- ----------
JAN-APR      21500
APR-JUL      22800
JUL-OCT      15700
OCT-JAN      17900
 
ops$tkyte@ORA9IR2>
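
A side note: if the JAN-MAR style labels from the question are wanted, add two months instead of three, so the label shows the last month of the quarter rather than the first month of the next one. A sketch:

select to_char(dt,'MON')||'-'||to_char(add_months(dt,2),'MON') quarter,
       s
  from (select trunc(monthly_trans_date,'q') dt, sum(total_sales) s
          from sales
         group by trunc(monthly_trans_date,'q'))
/

which would label the rows JAN-MAR, APR-JUN, JUL-SEP, OCT-DEC.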
 

Please help

Raju, May 06, 2004 - 5:44 am UTC

Dear Tom,
I would like to get the average of the amount sold in a sales table for a
"period gap of three days each". The table I use is as follows:

SQL> create table sales(trans_date date, amount_sold number);

SQL> insert into sales
     select to_char(sysdate+rownum,'dd-mon-yyyy'), (rownum*100)+20
       from obj where rownum <= 9;

I think we can use "avg(amount_sold) over (interval '03' range between rows
preceding and current row)" or something similar to this, but
I don't know how to group this.

Am I correct and could you please help with the query?
 

Tom Kyte
May 06, 2004 - 7:47 am UTC

you are mostly interested in the first two columns, the last three I use for "verification".  It is many times useful, with sliding windows, to "see" the top of the window, the bottom of the window and the count of observations in the window itself.  first_value/last_value and count are great for that:

ops$tkyte@ORA9IR2> select trans_date,
  2         avg(amount_sold) over (order by trans_date range 2 preceding) avg_amt,
  3         first_value(trans_date) over (order by trans_date range 2 preceding) start_of_window,
  4         last_value(trans_date) over (order by trans_date range 2 preceding) end_of_window,
  5         count(*) over (order by trans_date range 2 preceding) cnt
  6    from sales
  7  /
 
TRANS_DAT    AVG_AMT START_OF_ END_OF_WI        CNT
--------- ---------- --------- --------- ----------
07-MAY-04        120 07-MAY-04 07-MAY-04          1
08-MAY-04        170 07-MAY-04 08-MAY-04          2
09-MAY-04        220 07-MAY-04 09-MAY-04          3
10-MAY-04        320 08-MAY-04 10-MAY-04          3
11-MAY-04        420 09-MAY-04 11-MAY-04          3
12-MAY-04        520 10-MAY-04 12-MAY-04          3
13-MAY-04        620 11-MAY-04 13-MAY-04          3
14-MAY-04        720 12-MAY-04 14-MAY-04          3
15-MAY-04        820 13-MAY-04 15-MAY-04          3
 
9 rows selected.
 

Thanks

Raju, May 06, 2004 - 10:37 am UTC

Dear Tom,
Thanks for your response. But I expected the result set in the format of:

Trans_date Amount_sold
---------- -----------
07-May - 09-May ...
10-May - 12-May ...
13-May - 15-May ...
How to get that?
Please do reply.

Tom Kyte
May 06, 2004 - 1:27 pm UTC

hmm, that's even easier -- you can do that in straight SQL, it is just a little 'math' (subtract and divide, then group).

you need a start date (07-may-2004) to be passed to the query


ops$tkyte@ORA9IR2> variable x varchar2(20)
ops$tkyte@ORA9IR2> exec :x := '07-may-2004'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select min(trans_date), max(trans_date), sum(amount_sold)
  2    from sales
  3   group by floor((trans_date-to_date(:x,'dd-mon-yyyy'))/3);
 
MIN(TRANS MAX(TRANS SUM(AMOUNT_SOLD)
--------- --------- ----------------
07-MAY-04 09-MAY-04              660
10-MAY-04 12-MAY-04             1560
13-MAY-04 15-MAY-04             2460
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
 

Thanks

Raju, May 07, 2004 - 1:58 am UTC

Dear Tom,
Thanks for your reply. But the clause
"group by floor((trans_date-to_date(....))/3)"
*frightens* me like anything.
I can't understand it. So I used an easy approach:

SQL> select 'First Three Days' as Three_days,avg(amount_sold) from sales
   2 where to_char(trans_date,'dd') between '7' and '9'
   3 union all
   4 select 'Next Three Days' as Three_days,avg(amount_sold) from sales
   5 where to_char(trans_date,'dd') between '10' and '12'
   6 union all
   7 select 'Last Three Days' as Three_days,avg(amount_sold) from sales
   8 where to_char(trans_date,'dd') between '13' and '15'
  
  

Tom Kyte
May 07, 2004 - 8:04 am UTC


osh kosh bgosh -- we use computers right?  we are technical people right? this is *simple math* the type of which we learn in grade school (subtraction, division, no eigen vectors, no fast fourier transforms).

your query -- 3 full scans, sorts and aggregates.

Mine 1 full scan.


Look at the raw data:


ops$tkyte@ORA9IR2> variable x varchar2(20)
ops$tkyte@ORA9IR2> exec :x := '08-may-2004'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select trans_date-to_date(:x,'dd-mon-yyyy') daysbtwn,
  2         (trans_date-to_date(:x,'dd-mon-yyyy'))/3 grp,
  3         floor((trans_date-to_date(:x,'dd-mon-yyyy'))/3) grp_trunc,
  4         trans_date, amount_sold
  5    from sales
  6  /
 
  DAYSBTWN        GRP  GRP_TRUNC TRANS_DAT AMOUNT_SOLD
---------- ---------- ---------- --------- -----------
         0          0          0 08-MAY-04         120
         1 .333333333          0 09-MAY-04         220
         2 .666666667          0 10-MAY-04         320
         3          1          1 11-MAY-04         420
         4 1.33333333          1 12-MAY-04         520
         5 1.66666667          1 13-MAY-04         620
         6          2          2 14-MAY-04         720
         7 2.33333333          2 15-MAY-04         820
         8 2.66666667          2 16-MAY-04         920
 
9 rows selected.

This is a common, very simple technique that you should be applying in lots of cases.  it is *math*.  you subtract.  you divide.  you floor.  you do this in your sleep after a while you get to do it so often.


If you want to use more than 3 times the resources (n times as your 3 days might be 5 days or something in the future) go for it.  hardware vendors love that.


Me, I'm still in search of fast=true. 

Thanks

Raju, May 07, 2004 - 8:13 am UTC

Dear Tom,
Thanks for your elaborate effort. I really appreciate that.
And now, if we have a requirement of generating a report for the sales table
(i) semi-annually
(ii) bi-monthly
how do we do that?

We can assume a sales table with a *whole year's data*. Could you
please provide some code snippet for that?

I mean
1)sum of sales semi-annually
2)sum of sales bi-monthly

Please do reply.
Bye!


Tom Kyte
May 07, 2004 - 8:58 am UTC

what is "semi-annually"

In any case -- use the same sort of technique.  Here is one that rolls

jan+feb -> jan
mar+apr -> mar
may+jun -> may

and so on:

ops$tkyte@ORA9IR2> select trans_date,
  2         trunc(decode(length(to_char(trans_date,'mm')/2), 1, add_months(trans_date,-1), trans_date),'mm') trans_date2
  3    from (select add_months(to_date('12-jan-2004'),rownum-1) trans_date
  4            from all_objects
  5                   where rownum <= 14 )
  6  /
 
TRANS_DAT TRANS_DAT
--------- ---------
12-JAN-04 01-JAN-04
12-FEB-04 01-JAN-04
12-MAR-04 01-MAR-04
12-APR-04 01-MAR-04
12-MAY-04 01-MAY-04
12-JUN-04 01-MAY-04
12-JUL-04 01-JUL-04
12-AUG-04 01-JUL-04
12-SEP-04 01-SEP-04
12-OCT-04 01-SEP-04
12-NOV-04 01-NOV-04
12-DEC-04 01-NOV-04
12-JAN-05 01-JAN-05
12-FEB-05 01-JAN-05
 
14 rows selected.


The trunc(decode....) takes any date and returns the first day of jan, mar, may, jul, ....  we simply map the INPUT date to our target date.  So, a query against "sales" that groups by month pairs could look like this:

ops$tkyte@ORA9IR2> create table t
  2  as
  3  select created trans_date, object_id sales
  4    from all_objects
  5  /
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select trunc(decode(length(to_char(trans_date,'mm')/2), 1, add_months(trans_date,-1), trans_date),'mm') trans_date,
  2         min(trans_date),
  3             max(trans_date),
  4             count(*),
  5         sum(sales) sales
  6    from t
  7   group by trunc(decode(length(to_char(trans_date,'mm')/2), 1, add_months(trans_date,-1), trans_date),'mm')
  8  /
 
TRANS_DAT MIN(TRANS MAX(TRANS   COUNT(*)      SALES
--------- --------- --------- ---------- ----------
01-MAY-02 12-MAY-02 12-MAY-02      29234  434296846
01-NOV-03 25-NOV-03 16-DEC-03         52    1594079
01-JAN-04 02-JAN-04 11-FEB-04       1307   42788901
01-MAR-04 17-MAR-04 15-APR-04         17     617068
01-MAY-04 03-MAY-04 07-MAY-04          4     150141



Now, you do the same for "semi-annual"

Look at the DATE FORMATS, there are some for qtrs and other stuff as well -- fully documented.
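
A sketch of that "semi-annual" exercise (my wording of the same map-the-input-date technique, not Tom's -- each date is mapped to the first day of its half-year):

select add_months(trunc(trans_date,'yyyy'),
                  case when to_char(trans_date,'mm') <= '06' then 0 else 6 end) half,
       min(trans_date),
       max(trans_date),
       count(*),
       sum(sales) sales
  from t
 group by add_months(trunc(trans_date,'yyyy'),
                  case when to_char(trans_date,'mm') <= '06' then 0 else 6 end)
/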
 

But still confused

Jon Waterhouse, June 23, 2004 - 11:37 am UTC

I thought I could use analytic functions to do the following. I have information on a client with a start and end date. Where the periods are contiguous, i.e. I have a row for January and a row for February, I wanted to combine them, i.e. into one row with a start date of January 1 and an end date of the end of February.

But I didn't even manage to get the first step done: marking rows which followed on from a previous period or were immediately followed by another period.

My table looked like this:

CREATE TABLE testMEMBERS (
CLIENT NUMBER (4) NOT NULL,
DATESTART DATE NOT NULL,
DATEEND DATE,
PERSTART NUMBER (1),
PEREND NUMBER (1)) ;

Five rows of data

begin
insert into testmembers (client,datestart,dateend) values (1,to_date('20040101','yyyymmdd'),to_date('20040131','yyyymmdd'));
insert into testmembers (client,datestart,dateend) values (1,to_date('20040201','yyyymmdd'),to_date('20040331','yyyymmdd'));
insert into testmembers (client,datestart,dateend) values (1,to_date('20040501','yyyymmdd'),to_date('20040531','yyyymmdd'));
insert into testmembers (client,datestart,dateend) values (1,to_date('20040601','yyyymmdd'),to_date('20040630','yyyymmdd'));
insert into testmembers (client,datestart,dateend) values (1,to_date('20040701','yyyymmdd'),to_date('20040731','yyyymmdd'));
end;


This query properly identifies the rows that should be marked:

select C.client,C.DATESTART,Q.client,Q.DATESTART,
DECODE(lag(Q.dateend) OVER (PARTITION BY Q.client ORDER BY Q.datestart),Q.datestart-numtodsinterval(1,'DAY'),0,1) contigb4,
DECODE(lead(Q.datestart) OVER (PARTITION BY Q.client ORDER BY Q.datestart),Q.dateend+numtodsinterval(1,'DAY'),0,1) contigafter
from testmembers c,testMEMBERS Q
where C.client = 1
AND c.datestart = q.datestart
AND c.client = q.client

but in this form of the query

SELECT C.client,C.DATESTART FROM testMEMBERS C
WHERE c.client = 1 AND (0,1) IN
(SELECT DECODE(lag(dateend) OVER (PARTITION BY client
ORDER BY datestart),datestart-numtodsinterval(1,'DAY'),0,1) contigb4,
DECODE(lead(datestart) OVER (PARTITION BY client
ORDER BY datestart),dateend+numtodsinterval(1,'DAY'),0,1) contigafter
FROM testmembers q
where c.datestart = q.datestart AND c.client = q.client)
the window function doesn't get called at all; I guess it is because it is always the last part of the query to be done. This form would return no rows: the subquery always returns (1,1).

What I was actually trying to do was to update the values in perstart and perend, like so:

update testMEMBERS C set (perstart,perend) =
(SELECT DECODE(lag(dateend) OVER (PARTITION BY client
ORDER BY datestart),datestart-numtodsinterval(1,'DAY'),0,1) contigb4,
DECODE(lead(datestart) OVER (PARTITION BY client ORDER BY datestart),dateend+numtodsinterval(1,'DAY'),0,1) contigafter
FROM testmembers q
WHERE c.datestart = q.datestart AND c.client = q.client)
where client = 1
I figure there should be a way to use analytic functions to do it, but how?

Thanks for all the help! (I've read the analytic functions chapter in Expert 1on1 and I still don't really understand why it doesn't work. I'm using 9.0.1)


Tom Kyte
June 23, 2004 - 11:58 am UTC

it is much simplier than that - no self joins (that is what the analytics do for us)

ops$tkyte@ORA9IR2> select client,
  2         min(datestart),
  3         max(dateend)
  4    from (
  5  select client,
  6         datestart,
  7         dateend,
  8         max(grp) over (partition by client order by dateend) grp
  9    from (
 10  select client, datestart, dateend,
 11         case when lag(dateend)
 12                 over (partition by client order by dateend) != datestart-1
 13                   or
 14                   row_number()
 15                     over (partition by client order by dateend) = 1
 16              then row_number()
 17                     over (partition by client order by dateend)
 18          end grp
 19    from testmembers
 20         )
 21         )
 22   group by client, grp
 23   order by 1, 2
 24  /
 
    CLIENT MIN(DATES MAX(DATEE
---------- --------- ---------
         1 01-JAN-04 31-MAR-04
         1 01-MAY-04 31-JUL-04
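
As for the original goal of writing the perstart/perend flags back: the correlated update in the question fails because the correlation reduces the set to a single row before lag/lead ever see a second row. Computing the analytics over the whole table in an inline view first and then correlating to that sidesteps the problem. A sketch (untested, keeping Jon's 0/1 semantics):

update testmembers c
   set (perstart, perend) =
       ( select q.perstart, q.perend
           from ( select client, datestart,
                         decode( lag(dateend) over
                                   (partition by client order by datestart),
                                 datestart-1, 0, 1 ) perstart,
                         decode( lead(datestart) over
                                   (partition by client order by datestart),
                                 dateend+1, 0, 1 ) perend
                    from testmembers ) q
          where q.client = c.client
            and q.datestart = c.datestart )
 where client = 1;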


 

Nice

A reader, June 23, 2004 - 7:13 pm UTC


Can I use analytic functions here?

Nina, July 20, 2004 - 8:46 am UTC

Tom --

I was wondering if I can use analytic functions in the following code:
I have the following query:

SELECT
SUBSTR(spriden_last_name, 1, 20) last_name,
nbbposn_posn posn,
spriden_id id,

FROM spriden, perjtot, pebempl, nbbposn, ptrecls

WHERE spriden_change_ind is NULL
AND spriden_pidm = perjtot_pidm
AND spriden_pidm = pebempl_pidm
AND perjtot_year = year_parm
AND perjtot_posn = nbbposn_posn

This query brings back the following data:

id posn last_name
------------------------------------------------------
545454 A0345 Smith
545454 A9845 Smith
969696 S3456 Williams
568974 A8745 Sherry
568974 H1234 Sherry

As you can see per each id there exist multiple positions (posn). Each position in turn can
have multiple salaries attached to it. To get the salaries I have the following query:

SELECT
nbrbjob_id,
nbrbjob_posn,
nbrbjob_ann_salary,
nbrjobs_contract_type
FROM nbrbjob, nbrjobs X
WHERE nbrbjob_id = v_id (an id from the query above)
AND nbrbjob_posn = v_posn (posn from the query above)
AND nbrbjob_id = nbrjobs_id
AND nbrbjob_posn = nbrjobs_posn
AND nbrbjob_suff = nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date =
(SELECT MAX(nbrjobs_effective_date)
FROM nbrjobs
WHERE nbrjobs_id = X.nbrjobs_id
AND nbrjobs_posn = X.nbrjobs_posn
AND nbrjobs_suff = X.nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date <= TO_DATE(date_parm,'DD-MON-YY'))
ORDER BY nbrbjob_contract_type;

So if we take the first record from above for example with id 545454, the results might be as follows:

id posn salary contract_type
---------------------------------------------------------
545454 A0345 30,000 P
545454 A0345 25,000 O
545454 A0345 15,000 S


After passing to the above query the next posn (545454 id) - A9845, the results are as follows:

id posn salary contract_type
------------------------------------------------------------
545454 A9845 16,000 P


In short, what I need to get for id 545454 is the total sum of all the salaries across all of his
positions (posn) that were selected in the query #1. The thing is that query #1 has a year parm. So if say the year
parm is 2003, id 545454 might have only 1 position A0345 and if parm was 2002 for example, the id 545454 will have
2 positions A0345 and A9845. Hence, I did not see any other way but to have 2 separate queries. Moreover, since
each of the positions can have multiple salaries, to get the sum of the salaries per position, I would have to do an inline
view (no problem there..). The final requirement is that each position should have one contract_type of P. So,
what I actually need to get out of the second query is the sum of the salaries per position and only THE FIRST record with P (ordering by and selecting rownum < 2 should do it).
I am only interested in contract_type P. Of course, if a record has an S or something with it, I take that if there is no
P. So now the query # 2 looks as follows:

SELECT total_amount, nbrbjob_contract_type
FROM nbrbjob, nbrjobs X,
(SELECT SUM(nbrjobs_ann_salary) total_amount, nbrjobs_id ID
FROM nbrbjob, nbrjobs X
WHERE nbrbjob_id = '3041913'
AND nbrbjob_posn = '102238'
AND nbrbjob_id = nbrjobs_pidm
AND nbrbjob_posn = nbrjobs_posn
AND nbrbjob_suff = nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date =
(SELECT MAX(nbrjobs_effective_date)
FROM nbrjobs
WHERE nbrjobs_id = X.nbrjobs_id
AND nbrjobs_posn = X.nbrjobs_posn
AND nbrjobs_suff = X.nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date <= TO_DATE(date_parm,'DD-MON-YY')
ORDER BY nbrbjob_contract_type
GROUP BY nbrjobs_pidm ) A
WHERE nbrbjob_id = v_id
AND nbrbjob_posn = v_posn
AND nbrbjob_id = nbrjobs_id
AND nbrbjob_posn = nbrjobs_posn
AND nbrbjob_suff = nbrjobs_suff
AND nbrjobs_id=A.ID
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date =
(SELECT MAX(nbrjobs_effective_date)
FROM nbrjobs
WHERE nbrjobs_id = X.nbrjobs_id
AND nbrjobs_posn = X.nbrjobs_posn
AND nbrjobs_suff = X.nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date <= TO_DATE('31-DEC-03','DD-MON-YY'))
AND rownum<2
ORDER BY nbrbjob_contract_type;

I am getting rownum here and ordering by just to get the P contract type first. Anyway, sorry for this long explanation.
In short, I created 2 PL/SQL tables, I get the data from the first query into the first table, then from the second into the second,
then compare the results and print id, name and sum of salary (that is the end goal here - just to print the sum of salaries per
id - no position needed). I was wondering though if I could simplify all of this and use analytic functions? Is it
possible here?

thank you in advance.



Tom Kyte
July 20, 2004 - 7:54 pm UTC

lots of tables there -- got lost when you added the year?

but don't see why you cannot just join your first query to the second query (use an inline view and join to it) by the same two fields you are passing to it and then just "sum"?

analytics in plsql

kit, July 20, 2004 - 1:59 pm UTC

is there a restriction on analytics in PL/SQL? e.g.
cursor c_xx is
SELECT 'FINAL', xxx_tid, princ
FROM (
SELECT xxx_tid , ABS (principal_amt) princ
, row_number () over (PARTITION BY wss_tid ORDER BY version DESC) rn
FROM tempa
WHERE xxx_tid = xxx_tid)
WHERE rn = 1;

fails to compile in pl/sql but ok in sql. it complains about the "(" after over in the row_number

Tom Kyte
July 20, 2004 - 8:42 pm UTC

ops$tkyte@ORA9IR2> create table tempa ( xxx_tid number, principal_amt number, wss_tid number , version number );
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>    SELECT 'FINAL', xxx_tid, princ
  2  FROM (
  3  SELECT xxx_tid , ABS (principal_amt) princ
  4   , row_number () over (PARTITION BY wss_tid ORDER BY version DESC) rn
  5  FROM tempa
  6  WHERE xxx_tid = xxx_tid)
  7  WHERE  rn = 1;
 
no rows selected
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> declare
  2  cursor c_xx is
  3     SELECT 'FINAL', xxx_tid, princ
  4  FROM (
  5  SELECT xxx_tid , ABS (principal_amt) princ
  6   , row_number () over (PARTITION BY wss_tid ORDER BY version DESC) rn
  7  FROM tempa
  8  WHERE xxx_tid = xxx_tid)
  9  WHERE  rn = 1;
 10
 11  begin
 12  null;
 13  end;
 14  /
 
PL/SQL procedure successfully completed.
 


worked for me -- need more details on exactly how to reproduce.

Analytic functions

kit, July 21, 2004 - 5:19 am UTC

I'm working on an 8.1.7.4 instance.
I still get the same error; I also reproduced your scenario and hit the same problem.

I presume it's like CASE: you cannot use it in PL/SQL but it is OK in normal SQL.

The thing is I have used other analytics before in 8.1.7.

Tom Kyte
July 21, 2004 - 8:29 am UTC

in 817, analytics, group by cube, group by rollup, order by in a subquery, scalar subqueries and other SQL constructs are not supported.

you have never used analytics in 817 in plsql unless you

a) used a view to hide the construct
b) used dynamic sql to do the same.
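
For instance, workaround (b) against the tempa table above might look like this (a sketch; the 8.1.7 PL/SQL compiler never sees the analytic because it is hidden inside a dynamically opened ref cursor):

declare
    type rc is ref cursor;
    c_xx    rc;
    l_tag   varchar2(5);
    l_tid   tempa.xxx_tid%type;
    l_princ number;
begin
    open c_xx for
      'SELECT ''FINAL'', xxx_tid, princ
         FROM (SELECT xxx_tid, ABS(principal_amt) princ,
                      row_number() over
                        (PARTITION BY wss_tid ORDER BY version DESC) rn
                 FROM tempa)
        WHERE rn = 1';
    loop
        fetch c_xx into l_tag, l_tid, l_princ;
        exit when c_xx%notfound;
        -- process each "final" row here
        null;
    end loop;
    close c_xx;
end;
/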

analytics

Isi, September 01, 2004 - 8:54 am UTC

I looked and looked but you do not mention how to run an average within a query for all values between the 5th and 95th percentiles. I believe this is a challenge.

Tom Kyte
September 01, 2004 - 9:06 am UTC

do you mean something like this:

ops$tkyte@ORA9IR2> select emp.deptno, avg(sal)
  2    from emp,
  3         (select deptno,
  4                 percentile_cont(0.5) within group (order by sal) mins,
  5                 percentile_cont(0.95) within group (order by sal) maxs
  6            from emp
  7           group by deptno ) x
  8   where emp.sal between x.mins and x.maxs
  9     and emp.deptno = x.deptno
 10   group by emp.deptno;
 
    DEPTNO   AVG(SAL)
---------- ----------
        10   12403.13
        20 15145.3167
        30   7846.875
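
One nit: percentile_cont(0.5) is the median, i.e. the 50th percentile. For the 5th-to-95th percentile band the question describes, the first call would presumably be percentile_cont(0.05):

select emp.deptno, avg(sal)
  from emp,
       (select deptno,
               percentile_cont(0.05) within group (order by sal) mins,
               percentile_cont(0.95) within group (order by sal) maxs
          from emp
         group by deptno ) x
 where emp.sal between x.mins and x.maxs
   and emp.deptno = x.deptno
 group by emp.deptno;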
 

Call ranging different windows!

Prasad, September 05, 2004 - 10:48 am UTC

Hi tom,

Hope you are doing fine.
Here is my requirement
There are 3 time Windows
(MON,TUE,WED,THU,FRI) AND TIME BETWEEN (08:00:00 TO 18:00:00) => WEEKDAY
(MON,TUE,WED,THU,FRI) and (TIME < 08:00:00 OR TIME > 18:00:00) => EVENING
(SAT,SUN) => WEEKEND

Now, my inputs are
1: Time of call
2: Duration of call (in seconds)

Now with these two inputs I have to find out which boundaries my call crossed.

e.g. If the call originated on Friday at 17:55:00 and lasted for 22200 seconds (6 Hrs 10 mins) then it crossed:
5 mins => WEEKDAY CALL
6 Hrs => EVENING CALL
5 mins => WEEKEND CALL

Please do let me know if I'm not clear here.

Thanks as always !

Tom Kyte
September 05, 2004 - 12:18 pm UTC

I guess I would write a stored procedure or some other function in whatever language I was best at coding in to procedurally figure this one out.

Any suggestion how to go about it?

A reader, September 06, 2004 - 4:33 am UTC


Tom Kyte
September 06, 2004 - 10:24 am UTC

it seems to me to be rather straightforward procedural code, solved easily in an almost "brute force" fashion? coding "101" stuff. anyway.

a loop, 3 counters, a little addition, an exit when start_date >= end_date.....

you have:

(MON,TUE,WED,THU,FRI) AND TIME BETWEEN (08:00:00 TO 18:00:00) => WEEKDAY
(MON,TUE,WED,THU,FRI) and (TIME < 08:00:00 OR TIME > 18:00:00) => EVENING
(SAT,SUN) => WEEKEND

Now, my inputs are
1: Time of call
2: Duration of call (in seconds)
............

this is purely "pseudo code"

l_start_date := input;
l_end_date   := input + 1/24/60/60 * input_seconds;
while l_start_date <= l_end_date
loop
    if (to_char(l_start_date,'dy') in ( 'SAT', 'SUN' ))
    then
        -- if weekend, advance to midnight monday morning.....
        l_next_time    := least( trunc(next_day(l_start_date,'MON')), l_end_date );
        l_weekend_time := l_weekend_time + (l_next_time-l_start_date);
    elsif (to_char(l_start_date,'hh24') between '08' and '17')
    then
        -- else, not a weekend, is it between 8am and 6pm? if so, advance to
        -- 6pm on the same day....
        l_next_time := least( trunc(l_start_date)+18/24, l_end_date );
        l_weekday   := l_weekday + (l_next_time-l_start_date);
    elsif (to_char(l_start_date,'hh24') < '08')
    then
        -- else, not a weekend, not between 8am and 6pm, but it is before 8am
        -- advance to 8am then...
        l_next_time := least( trunc(l_start_date)+8/24, l_end_date );
        l_evening   := l_evening + (l_next_time-l_start_date);
    else
        -- else, not a weekend, after 6pm. advance to midnight of the
        -- next day
        l_next_time := least( trunc(l_start_date)+1, l_end_date );
        l_evening   := l_evening + (l_next_time-l_start_date);
    end if;
    l_start_date := l_next_time;
end loop;





U R A GENIUS!!

A reader, September 06, 2004 - 1:23 pm UTC


How to merge overlapping records?

Tony, September 07, 2004 - 4:08 am UTC

I've data as below and want to merge the records for overlapping dates. MIN and MAX of the start and end dates for overlapping records should be the start and end dates of the merged record. Is it possible to do this in a single query using analytic or any other features in Oracle9i (9.2)?

Before merge:

Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 20-Jun-2004
111 22-May-2004 07-Jun-2004
111 20-Jun-2004 13-Aug-2004
111 27-May-2004 30-Aug-2004
111 02-Sep-2004 23-Dec-2004
222 21-May-2004 19-Aug-2004
...
...

After the merge:

Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 30-Aug-2004
111 02-Sep-2004 23-Dec-2004
222 21-May-2004 19-Aug-2004
...
...

Scripts:

create table item(item_code number, start_date date, end_date date);

insert into item values (111,'15-May-2004','20-Jun-2004');
insert into item values (111,'22-May-2004','07-Jun-2004');
insert into item values (111,'20-Jun-2004','13-Aug-2004');
insert into item values (111,'27-May-2004','30-Aug-2004');
insert into item values (111,'02-Sep-2004','23-Dec-2004');
insert into item values (222,'21-May-2004','19-Aug-2004');


Thanks in advance.


Tom Kyte
September 07, 2004 - 7:43 am UTC

variation on a theme:
https://asktom.oracle.com/Misc/oramag/on-format-negation-and-sliding.html

"analytics to the rescue"

ops$tkyte@ORA9IR2> select item_code,
  2         start_date,
  3         end_date,
  4         case when lag(end_date) over (partition by item_code order by end_date) < start_date
  5                   or
  6                   lag(end_date) over (partition by item_code order by end_date) is null
  7              then row_number() over (partition by item_code order by end_date)
  8          end rn
  9    from item
 10   order by 1, 3
 11  /
 
 ITEM_CODE START_DAT END_DATE          RN
---------- --------- --------- ----------
       111 22-MAY-04 07-JUN-04          1
       111 15-MAY-04 20-JUN-04
       111 20-JUN-04 13-AUG-04
       111 27-MAY-04 30-AUG-04
       111 02-SEP-04 23-DEC-04          5
       222 21-MAY-04 19-AUG-04          1
 
6 rows selected.
 
ops$tkyte@ORA9IR2> select item_code,
  2         start_date,
  3         end_date,
  4         max(rn) over (partition by item_code order by end_date) max_rn
  5    from (
  6  select item_code,
  7         start_date,
  8         end_date,
  9         case when lag(end_date) over (partition by item_code order by end_date) < start_date
 10                   or
 11                   lag(end_date) over (partition by item_code order by end_date) is null
 12              then row_number() over (partition by item_code order by end_date)
 13          end rn
 14    from item
 15         )
 16   order by 1, 3
 17  /
 
 ITEM_CODE START_DAT END_DATE      MAX_RN
---------- --------- --------- ----------
       111 22-MAY-04 07-JUN-04          1
       111 15-MAY-04 20-JUN-04          1
       111 20-JUN-04 13-AUG-04          1
       111 27-MAY-04 30-AUG-04          1
       111 02-SEP-04 23-DEC-04          5
       222 21-MAY-04 19-AUG-04          1
 
6 rows selected.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select item_code, min(start_date), max(end_date)
  2    from (
  3  select item_code,
  4         start_date,
  5         end_date,
  6         max(rn) over (partition by item_code order by end_date) max_rn
  7    from (
  8  select item_code,
  9         start_date,
 10         end_date,
 11         case when lag(end_date) over (partition by item_code order by end_date) < start_date
 12                   or
 13                   lag(end_date) over (partition by item_code order by end_date) is null
 14              then row_number() over (partition by item_code order by end_date)
 15          end rn
 16    from item
 17         )
 18         )
 19   group by item_code, max_rn
 20   order by 1, 2
 21  /
 
 ITEM_CODE MIN(START MAX(END_D
---------- --------- ---------
       111 15-MAY-04 30-AUG-04
       111 02-SEP-04 23-DEC-04
       222 21-MAY-04 19-AUG-04
 
 

Any function available

P.Karthick, September 07, 2004 - 7:11 am UTC

Hai,

I have created two tables as shown below.

create table temp_trans (dept_id char(4),emp_id char(4),emp_name char(20))

create table temp_trans_1 (emp_id char(4))


I have inserted the following data in temp_trans.

insert into temp_trans values('D001','E001','KARTHICK')

insert into temp_trans values('D001','E002','KAUSHIK')

insert into temp_trans values('D001','E003','RAVI')

insert into temp_trans values('D001','E004','SESHU')

insert into temp_trans values('D002','E001','VIMAL')

insert into temp_trans values('D002','E002','VIJAY')

insert into temp_trans values('D002','E003','SURESH')


I have inserted the following data in temp_trans_1.

insert into temp_trans_1 values('E001')

insert into temp_trans_1 values('E002')

insert into temp_trans_1 values('E003')

insert into temp_trans_1 values('E004')

insert into temp_trans_1 values('E005')



Now iam trying to get a output of this format.

------------------------------------------------------
DEPT E001 E002 E003 E004 E005
------------------------------------------------------

D001 KARTHICK KAUSHIK RAVI SESHU

D002 VIMAL VIJAY SURESH

------------------------------------------------------


Where E001,E002... are got from temp_trans_1.

I dont want to use dynamic sql. is it possible with simple sql statement. is there
any function available to take a transpose of a table i.e. interchanging row and column.


Thank you.



Tom Kyte
September 07, 2004 - 9:26 am UTC

go look on the other pages you posted this. and please try to just sort of do it "once" in the future....


How to merge overlapping records?

Tony, September 08, 2004 - 1:34 am UTC

Your answer to my previous question on "merging overlapping records" is wonderful. I forgot to mention one more scenario: the merge should also include records whose starting date is one day after the ending date of any record.

Before merge:

Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 20-Jun-2004
111 21-Jun-2004 07-Aug-2004
222 21-May-2004 19-Aug-2004

After the merge:

Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 07-Aug-2004
222 21-May-2004 19-Aug-2004


Is it possible to tweak your above query to achieve this in a single query? Sorry for troubling you.

Though I have worked with Oracle for several years, analytic features are very difficult to understand and use appropriately. I think most developers will agree with me. Are you planning to write a book on analytic functions? If so, it'll be of great help to Oracle lovers. Can you point to any URL or book on analytic functions?

Thanks in advance.








Tom Kyte
September 08, 2004 - 9:24 am UTC

do you have my book "expert one on one Oracle"?

there is an entire chapter on it in there.

there is an entire chapter on it in the "Data warehousing guide" as well (otn.oracle.com)

as for your followup -- give it a try -- it is fairly obvious how "records that overlap" are assigned here, the case statement does that. tweak it. also read the related article; it goes step by step through what is happening here.
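
For the record, a sketch of that tweak: a row should start a new group only when the previous end_date is more than one day before its start_date, so the comparison becomes "< start_date - 1":

select item_code, min(start_date), max(end_date)
  from (
select item_code, start_date, end_date,
       max(rn) over (partition by item_code order by end_date) max_rn
  from (
select item_code, start_date, end_date,
       case when lag(end_date) over (partition by item_code order by end_date) < start_date-1
                 or
                 lag(end_date) over (partition by item_code order by end_date) is null
            then row_number() over (partition by item_code order by end_date)
        end rn
  from item
       )
       )
 group by item_code, max_rn
 order by 1, 2
/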

ora7.3 and ora8i

ygp, September 09, 2004 - 10:13 am UTC

I have two oracle instances 'ora8i" and "ora73".

I wanted to generate a report by using analytic functions on a table "ora73tab" in ora73.

I created a DB-link "ora8i_to_ora73" in ora8i and then created a synonym "ora8isyn" for the table "ora73tab@ora8i_to_ora73".

Now, using analytic functions on the synonym ora8isyn ( in instance ora8i, of course )is allowed; it works.

Does it mean that the query given to ora8i is not supplied to ora73 as-is?


Tom Kyte
September 09, 2004 - 11:39 am UTC

ora73 didn't have analytics, therefore the data is getting pulled into Oracle8i and processed.

You can verify that with explain plan or autotrace -- it'll show you the sql that is being sent to 7.3

It's fun to play with Analytics using Tony's question

Winston, September 09, 2004 - 1:59 pm UTC

Using 'Analytics to the Rescue' techniques, I found the following query will serve the purpose.

select item_code,
       min(start_date) start_date,
       max(end_date) end_date
  from (
        select item_code, start_date, end_date,
               max(rn) over (partition by item_code order by end_date) max_rn
          from (
                select item_code, start_date, end_date,
                       case
                         when start_date > pr_enddate
                           or pr_enddate is null
                         then row_num
                       end rn
                  from (
                        select item_code, start_date, end_date,
                               lag(end_date)
                                 over (partition by item_code order by end_date) pr_enddate,
                               row_number()
                                 over (partition by item_code order by end_date) row_num
                          from item)
               )
       )
 group by item_code, max_rn
 order by 1

A reader, September 10, 2004 - 6:03 pm UTC

SQL> select * from t;

         X D
---------- ---------
       100 10-SEP-04
       100 10-SEP-04
       200 09-SEP-04
       200 09-SEP-04
       300 09-SEP-04
       300 09-SEP-04

I need the following

select x, (the summary of x for the whole month of september)
from t where d='10-SEP-04'

What is the best way to do this? Can we do this using analytic functions?

Tom Kyte
September 10, 2004 - 7:44 pm UTC

yup and if you would have given create table/insert into (as the page you used to add this said in bold...) i would even have shown you how!

A reader, September 10, 2004 - 10:17 pm UTC

Here we go Tom, sorry for not giving you the ddl and inserts in the first place. In addition can you also show me how I can populate this kind of data with one single insert.

create table t ( x number, d date)
/
insert into t ( 100, sysdate);
/
insert into t ( 200, sysdate-1);
/
insert into t ( 300, sysdate-2);
/
insert into t ( 400, sysdate-3);
/
insert into t ( 500, sysdate-4);
/
insert into t ( 600, sysdate-5);
/



Tom Kyte
September 11, 2004 - 8:14 am UTC

be nice if you tested them (they are not sql as typed)...  And if they did work -- you insert each twice.  

anyway, to make it more interesting, I added another month:

ops$tkyte@ORA9IR2> select x, d, sum(x) over (partition by trunc(d,'mm'))
  2    from t;
 
         X D         SUM(X)OVER(PARTITIONBYTRUNC(D,'MM'))
---------- --------- ------------------------------------
       100 11-SEP-04                                 2100
       200 10-SEP-04                                 2100
       300 09-SEP-04                                 2100
       400 08-SEP-04                                 2100
       500 07-SEP-04                                 2100
       600 06-SEP-04                                 2100
        50 11-OCT-04                                 1050
       100 10-OCT-04                                 1050
       150 09-OCT-04                                 1050
       200 08-OCT-04                                 1050
       250 07-OCT-04                                 1050
       300 06-OCT-04                                 1050
 
12 rows selected.
 
 

you missed the where clause

A reader, September 11, 2004 - 9:38 am UTC

1.
select x, (the summary of x for the whole month of september) from t where d='10-SEP-04'

Tom, you missed the where clause in your solution, my requirement is with the where clause included.

2.
alter table t add ( z varchar2(10) default 'A')
/
update t set z = 'B' where x < 300
/

( my sqlplus is hanging when I'm trying to connect to my database, I could not test the above, please forgive me)

Now I want the totals for the month grouped by z for the whole month in the last column.

Thanks in advance.


Tom Kyte
September 11, 2004 - 10:50 am UTC


1)

select *
from (
select x, d, sum(x) over (partition by trunc(d,'mm'))
from t )
where d = TO_DATE('.....')



2) give it a go! try it out! really -- the best way to learn analytics is to take the concepts I've shown here on this site over and over and over and apply them.

(hint: partition by is like group by)

Analytics vs Scalar Subquery

Reader.., September 11, 2004 - 1:03 pm UTC

What is the gain (if any) from using analytics here vs. anything else (say, a scalar subquery)?

Tom Kyte
September 11, 2004 - 1:07 pm UTC

analytics offer a one pass flexibility on the data that scalar subqueries don't

they are both useful in their own right.

A reader, September 11, 2004 - 2:18 pm UTC

Tom, just curious to know how you can solve the above using a scalar subquery. Also, what does "scalar" mean in terms of SQL?

Tom Kyte
September 11, 2004 - 2:59 pm UTC

ops$tkyte@ORA9IR2> select x, d,
  2         (select sum(x)
  3            from t t2
  4           where d between trunc(t.d,'mm') and add_months(trunc(t.d,'mm'),1)
  5             and d <> add_months(trunc(t.d,'mm'),1) ) sum_x
  6    from t
  7  /
 
         X D              SUM_X
---------- --------- ----------
       100 11-SEP-04       2100
       200 10-SEP-04       2100
       300 09-SEP-04       2100
       400 08-SEP-04       2100
       500 07-SEP-04       2100
       600 06-SEP-04       2100
        50 11-OCT-04       1050
       100 10-OCT-04       1050
       150 09-OCT-04       1050
       200 08-OCT-04       1050
       250 07-OCT-04       1050
       300 06-OCT-04       1050
 
12 rows selected.


a scalar subquery is a subquery used where you could use the string 'FOOBAR' (as an expression).  it returns a single column and 0 or 1 rows. 

A reader, September 11, 2004 - 3:14 pm UTC

ops$tkyte@ORA9IR2> select x, d,
2 (select sum(x)
3 from t t2
4 where d between trunc(t.d,'mm') and add_months(trunc(t.d,'mm'),1)
5 and d <> add_months(trunc(t.d,'mm'),1) ) sum_x
6 from t
7 /


Why do we need to do a BETWEEN in the where clause -- can't we just do a
where to_char(d,'mm/yyyy') = to_char(date_suppliedasparam,'mm/yyyy')?

Isn't the below more appropriate:

select x, d,
       (select sum(x)
          from t t2
         where to_char(d, 'mm/yyyy') = to_char(date_param, 'mm/yyyy')) monthly_sum_of_x
  from t
 where d = date_param


or


select a.x,
       a.d,
       a.monthly_sum_of_x
  from ( select x, d,
                sum(x) over (partition by trunc(d,'mm')) monthly_sum_of_x
           from t
          where to_char(d, 'mm/yyyy') = to_char(date_param, 'mm/yyyy')) a
 where a.d = date_param



1. Please comment on both of them.

2. What do you mean by "one pass flexibility on the data"?

Tom Kyte
September 11, 2004 - 3:34 pm UTC

why the between? it is probable we want to do an index range scan there, not a full table scan for every execution of the scalar subquery.

where column between 'value' and 'value' --> index range scan
where f(column) = 'value' --> no index (assuming no function based indexes that is)


so, avoid performing f(column) -- period.


the analytics can get the answer in a single pass on the table. the scalar subquery would

a) read the table to find rows
b) for each row found, re-read the table to get the summation



A reader, September 11, 2004 - 3:25 pm UTC

2.
alter table t add ( z varchar2(10) default 'A')
/
update t set z = 'B' where x < 300
/

( my sqlplus is hanging when I'm trying to connect to my database, I could not
test the above, please forgive me)

Now I want the totals for the month grouped by z for the whole month in the last
column.

Tom, as I have said above, my Oracle is hanging when I'm trying to connect. I need to have this solution ready by Monday next week. Please help.

Would the below work?

select a.x,
       a.d,
       a.monthly_sum_of_x_for_z
  from ( select x, d, z,
                sum(x) over (partition by trunc(d,'mm'), z) monthly_sum_of_x_for_z
           from t
          where to_char(d, 'mm/yyyy') = to_char(date_param, 'mm/yyyy')) a
 where a.d = date_param



Tom Kyte
September 11, 2004 - 3:35 pm UTC

if you cannot log into your database, you don't need any queries no?

really -- i'm not "program everything for me please". sounds like you might need to go into work -- if you have work due on monday morning?
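
For what it is worth, the attempt above is the right shape once the syntax is tidied up -- a sketch, with a literal standing in for date_param:

select *
  from ( select x, d, z,
                sum(x) over (partition by trunc(d,'mm'), z) monthly_sum_of_x_for_z
           from t )
 where d = to_date('10-sep-2004','dd-mon-yyyy');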

Analytic Functions

sam, October 12, 2004 - 7:02 pm UTC

I just want to add one more condition to the first followup.

You gave a example

scott@ORA817DEV.US.ORACLE.COM> select *
2 FROM (select empno,
3 deptno,
4 hiredate,
5 sal,
6 dense_rank()
7 over (partition by deptno
8 order by sal desc) dr
9 from emp
10 where hiredate >= to_date( '01-JAN-1980', 'dd-mon-yyyy' )
11 and sal > 0)
12 where dr <= 3
13 order by deptno, sal desc
14 /

EMPNO DEPTNO HIREDATE SAL DR
---------- ---------- --------- ---------- ----------
7839 10 17-NOV-81 5000 1
7782 10 09-JUN-81 2450 2
7934 10 23-JAN-82 1300 3
7788 20 09-DEC-82 3000 1
7902 20 03-DEC-81 3000 1
7566 20 02-APR-81 2975 2
7876 20 12-JAN-83 1100 3
7698 30 01-MAY-81 2850 1
7499 30 20-FEB-81 1600 2
7844 30 08-SEP-81 1500 3

10 rows selected.

Can you help me modify above query to have
if sal > 3000 then only show remaining rows. Else show just the first row.

So the result should look like

EMPNO DEPTNO HIREDATE SAL DR
---------- ---------- --------- ---------- ----------
7839 10 17-NOV-81 5000 1
7782 10 09-JUN-81 2450 2
7934 10 23-JAN-82 1300 3
7788 20 09-DEC-82 3000 1
7902 20 03-DEC-81 3000 1
7698 30 01-MAY-81 2850 1

Thanks for your help as always.


Tom Kyte
October 12, 2004 - 7:56 pm UTC

ops$tkyte@ORA9IR2> select *
  2  FROM (select empno,
  3               deptno,
  4               hiredate,
  5               sal,
  6               dense_rank()
  7               over (partition by deptno
  8                     order by sal desc) dr,
  9                       max(sal)
 10                           over (partition by deptno) max_sal
 11               from emp
 12               where hiredate >= to_date( '01-JAN-1980', 'dd-mon-yyyy' )
 13               and sal > 0)
 14           where dr <= 3
 15  order by deptno, sal desc
 16  /
 
     EMPNO     DEPTNO HIREDATE         SAL         DR    MAX_SAL
---------- ---------- --------- ---------- ---------- ----------
      7839         10 17-NOV-81       5000          1       5000
      7782         10 09-JUN-81       2450          2       5000
      7934         10 23-JAN-82       1300          3       5000
      7788         20 09-DEC-82       3000          1       3000
      7902         20 03-DEC-81       3000          1       3000
      7566         20 02-APR-81       2975          2       3000
      7876         20 12-JAN-83       1100          3       3000
      7698         30 01-MAY-81       2850          1       2850
      7499         30 20-FEB-81       1600          2       2850
      7844         30 08-SEP-81       1500          3       2850
 
10 rows selected.
 


That should be sufficient to get you going........... now what should the where clause be -- that should be easy. 
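
Presumably the finishing touch -- matching the sample output above, where departments whose top salary exceeds 3000 show all ranked rows and the others show only rank 1 -- is a where clause like this sketch:

select empno, deptno, hiredate, sal, dr
  from (select empno, deptno, hiredate, sal,
               dense_rank() over (partition by deptno order by sal desc) dr,
               max(sal) over (partition by deptno) max_sal
          from emp
         where hiredate >= to_date('01-JAN-1980','dd-mon-yyyy')
           and sal > 0)
 where dr <= 3
   and (max_sal > 3000 or dr = 1)
 order by deptno, sal desc;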

An observation...

Yogee, October 12, 2004 - 11:18 pm UTC

Is it always like

select ....
from ( ...query with analytical function)
where ....

Does it always have to be a sub-query?
Would that add to performance problems?

Tom Kyte
October 13, 2004 - 8:04 am UTC

if you want to "where" on the analytics, you have to use inline views

it is not always like that, there are analytics without inline views, but when you want to where on the analytic, you need the inline view.


it would not be "adding to a performance problem". consider:


select *
from ( select a.*, max(dt) over (partition by empno) max_dt_by_empno
from emp_history a )
where dt = max_dt_by_empno
/

to get the "most current emp record from a history". Compare that to:

select *
from emp_history X
where dt = ( select max(dt) from emp_history Y where x.empno = y.empno );


or

select *
from emp_history
where (empno,dt) in (select empno,max(dt) from emp_history group by empno);


I see the last two as performance issues, the first one as the solution to them.
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:5280776862234#5767134024470
for example....


Analytic Functions

sam, October 13, 2004 - 9:49 am UTC

Thanks for replying back.

Here is another scenario

The query looks like

select * from
( select AUCTION_ID,
AUCTION_VER,
AUCTION_SUB_CAT_ID,
REC_STAT,
dense_rank() over (partition by auction_id order by
auction_ver desc) dr
from SECLEND_AUCTION )
where dr <= 2

The result is
AUCTION_ID AUCTION_VER AUCTION_SUB REC_STAT DR
1 2 5 4 1
1 1 5 3 2
2 1 5 3 1
42 2 5 2 1
42 1 5 4 2
43 2 5 2 1
43 1 5 6 2

What I want is: if rec_stat = 2 then show 2 rows.
If rec_stat <> 2 (we cannot do min/max on this column; it's random) then get only the first row.
So in the above case AUCTION_ID = 1 & AUCTION_VER = 2 should not be shown.

Thanks for your help.



Tom Kyte
October 13, 2004 - 10:06 am UTC

so, use

max( decode( rec_stat, 2, rec_stat ) ) over (partition by whatever) max_rs


or

count( decode(rec_stat,2,2) ) over (partition by whatever) cnt_rs


-- and that'll tell you if you have at least ONE rec_stat in your partition by clause. Now, you can add the predicate on that, like "and cnt_rs > 0" for example.


you know the technique, now it is just a matter of "application"

Analytic Functions

sam, October 13, 2004 - 10:41 am UTC

Tom,

Sorry not to be crystal clear in my requirement.

select * from
( select AUCTION_ID,
AUCTION_VER,
AUCTION_SUB_CAT_ID,
REC_STAT,
dense_rank() over (partition by auction_id order by
auction_ver desc) dr
from SECLEND_AUCTION )
where dr <= 2

The result is
AUCTION_ID AUCTION_VER AUCTION_SUB REC_STAT DR
1 2 5 4 1
1 1 5 2 2
2 1 5 2 1
42 2 5 2 1
42 1 5 4 2
43 12 5 2 1
43 11 5 6 2

It's not just whether rec_stat 2 is present in the partition or not.
It's the position of REC_STAT = 2.
As you see in the above example, auction_id 1 has 2 rows,
AUCTION_VER 1 and 2 (they are not necessarily 1 & 2).
So the logic is:
for the max(AUCTION_VER) row, if REC_STAT = 2 then show 2 rows,
else show only 1 row.

So in above case for AUCTION_ID 1 only 1 row is displayed.
For AUCTION_ID 2 only one row is displayed since there is no second row.





Tom Kyte
October 13, 2004 - 12:00 pm UTC

so, first_value(rec_stat) over (partition by key order by whatever desc)

use the analytics -- you know the technique now -- just a matter of seeing what is available and applying them!!!
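
One reading of that application (a sketch, showing the second row only when the newest version's rec_stat is 2):

select auction_id, auction_ver, auction_sub_cat_id, rec_stat, dr
  from ( select auction_id, auction_ver, auction_sub_cat_id, rec_stat,
                dense_rank() over (partition by auction_id
                                   order by auction_ver desc) dr,
                first_value(rec_stat) over (partition by auction_id
                                            order by auction_ver desc) top_rec_stat
           from seclend_auction )
 where dr = 1
    or (dr = 2 and top_rec_stat = 2);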




how to get this output

A reader, October 18, 2004 - 8:11 am UTC

Hi

I have this table


select * from test01;

ID_TRANSACCION ID_TIPO_TRANS
-------------- -------------
1 1
1 1
1 3
2 2
2 2
3 1
4 1

I would like to get those rows which have more than 1 combination of both columns, so I should get this output:

ID_TRANSACCION ID_TIPO_TRANS
-------------- -------------
1 1
1 3
2 2
3 1
4 1

Can we do this with row_number() ?

Tom Kyte
October 18, 2004 - 9:07 am UTC

that looks simply like "distinct" -- but distinct conflicts with "more than 1 combination" -- but then again, your output conflicts with that as well....

try select distinct.

I beg your pardon

A reader, October 18, 2004 - 9:29 am UTC

Hi

I am sorry, posted the wrong output

ID_TRANSACCION ID_TIPO_TRANS
-------------- -------------
             1             1
             1             1
             1             3
             2             3
             2             2
             3             1
             4             1

I should get this


ID_TRANSACCION
--------------
             1
             2

Those IDs which have more than one combination.

Sorry :(

Tom Kyte
October 18, 2004 - 9:50 am UTC

<quote>
I would like to get those rows which have more than 1 combination of both
columns so should get this output
</quote>

well, that conflicts with everything. (being able to clearly state the goal is *vital* in our profession)

See -- here I have to guess. I have to guess "is the text right" or "is the picture right" or "is my interpretation of the picture right" -- or actually "which one of my interpretations is correct"

I guess you want to see distinct values for id_transaccion such that there is more than one value for that id_transaccion in that table (eg: id_tipo_trans is quite simply not relevant to the problem at all -- it has no bearing on the outcome).

that could be:

select id_transaccion from t group by id_transaccion having count(*) > 1;

Or, maybe it could be "you want to see the distinct values for id_transaccion such that there are more than one distinct id_tipo_trans values for it" -- that is, id_tipo_trans is very relevant.

that could be:

select id_transaccion from t group by id_transaccion
having count(distinct id_tipo_trans) > 1;


or there could be some other thing you wanted -- but I didn't guess it :)





the second query

A reader, October 18, 2004 - 10:01 am UTC

Hi

The second query is what I need. I need to see those id_transaccion which belongs to more than one id_tipo_trans. I have a further question, if I want to output both id_transaccion and id_tipo_trans the group by query will not work. I guess I can do that with analytic function correct?


Tom Kyte
October 18, 2004 - 10:17 am UTC

select *
from ( select a, b, count(distinct b) over (partition by a) cnt
from t )
where cnt > 1;



Analytics in 817

Al, October 20, 2004 - 10:30 am UTC

Tom,
I can run the following query in 817 without any problems. I want to use the results of the query to update a separate table based on the curr and prev columns, but when I create an anonymous block and embed the SQL in a cursor I get an error. Is there another way to do this?
SELECT MAX( DECODE(
a.ccy_hedge_event_type_id,
13, a.present_value
)) AS curr_ffp,
MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
)) AS curr_ffr,
MAX( DECODE(
a.ccy_hedge_event_type_id,
17, a.present_value
)) AS curr_ffs,
LAG( MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
))) OVER ( PARTITION BY a.deal_reference ORDER BY date_from ) prev_ffr,
a.deal_reference,
date_from,
date_to
FROM ccy_hedge_daily_value a, ccy_hedge_deal b
WHERE b.ccy_hedge_deal_type_id IN( 2, 3 )
AND a.deal_reference = b.deal_reference
GROUP BY a.deal_reference, date_from, date_to;
No problem; it returns 4000 records in < 1 sec.

DECLARE
CURSOR c_data IS
SELECT MAX( DECODE(
a.ccy_hedge_event_type_id,
13, a.present_value
)) AS curr_ffp,
MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
)) AS curr_ffr,
MAX( DECODE(
a.ccy_hedge_event_type_id,
17, a.present_value
)) AS curr_ffs,
LAG( MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
))) OVER ( PARTITION BY a.deal_reference ORDER BY date_from ) prev_ffr,
a.deal_reference,
date_from,
date_to
FROM ccy_hedge_daily_value a, ccy_hedge_deal b
WHERE b.ccy_hedge_deal_type_id IN( 2, 3 )
AND a.deal_reference = b.deal_reference
GROUP BY a.deal_reference, date_from, date_to;
BEGIN
FOR c_rec IN c_data LOOP
DBMS_OUTPUT.put_line('reference ' || c_rec.deal_reference );
END LOOP;
END;

ORA-06550: line 19, column 30:
PLS-00103: Encountered the symbol "(" when expecting one of the following:

Tom Kyte
October 20, 2004 - 11:45 am UTC

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:3027089372477


in 8i, you must use native dynamic sql or a view.
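The view approach, sketched against the query above (the view name here is made up; static PL/SQL against the view compiles fine in 8i even though the view text uses analytics):

create or replace view hedge_values_v   -- hypothetical name
as
select max( decode( a.ccy_hedge_event_type_id, 13, a.present_value ) ) curr_ffp,
       max( decode( a.ccy_hedge_event_type_id, 15, a.present_value ) ) curr_ffr,
       max( decode( a.ccy_hedge_event_type_id, 17, a.present_value ) ) curr_ffs,
       lag( max( decode( a.ccy_hedge_event_type_id, 15, a.present_value ) ) )
           over ( partition by a.deal_reference order by date_from ) prev_ffr,
       a.deal_reference, date_from, date_to
  from ccy_hedge_daily_value a, ccy_hedge_deal b
 where b.ccy_hedge_deal_type_id in ( 2, 3 )
   and a.deal_reference = b.deal_reference
 group by a.deal_reference, date_from, date_to;

begin
    -- plain static SQL against the view, no dynamic SQL needed
    for c_rec in ( select * from hedge_values_v ) loop
        dbms_output.put_line( 'reference ' || c_rec.deal_reference );
    end loop;
end;
/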

Can you use analytics to compare 'snapshot' data?

Robert, October 20, 2004 - 4:45 pm UTC

Tom,

I have searched your website for this answer, and spent several hours working on this (I finally came up with an effective 'brute force' method).

Is there a simple, slick method of using analytics to compare 'snapshot' records within the same table?

Here is a sample "segment growth" table...

create table t1
( snaptime date
, seg_name varchar2(10)
, seg_size number
);

insert into t1 values (trunc(sysdate) - 20, 'X1', 3000);
insert into t1 values (trunc(sysdate) - 20, 'X2', 10000);
insert into t1 values (trunc(sysdate) - 5,  'X1', 4000);
insert into t1 values (trunc(sysdate) - 5,  'X2', 12000);
insert into t1 values (trunc(sysdate) - 5,  'X3', 600);
insert into t1 values (trunc(sysdate) - 0,  'X1', 5000);
insert into t1 values (trunc(sysdate) - 0,  'X3', 500);

SQL> select * from t1;

SNAPTIME  SEG_NAME    SEG_SIZE
--------- ---------- ---------
30-SEP-04 X1              3000
30-SEP-04 X2             10000
15-OCT-04 X1              4000
15-OCT-04 X2             12000
15-OCT-04 X3               600
20-OCT-04 X1              5000
20-OCT-04 X3               500

Here is a simple report comparing the latest 2 snapshots...

BEGIN       END         SEGMENT GROWTH
----------- ----------- ------- -------
15-OCT-2004 20-OCT-2004 X1         1000
15-OCT-2004 20-OCT-2004 X2       -12000
15-OCT-2004 20-OCT-2004 X3         -100

Can one use analytics to easily write a query to generate a report like the one above... and even more sophisticated reports?

Thanks,

Robert. 

Tom Kyte
October 20, 2004 - 5:16 pm UTC

sure, lag and lead.

select * from (
select snaptime st, lead(snaptime)
over (partition by seg_name order by snaptime) end,
seg_name,
lead(seg_size)
over (partition by seg_name order by snaptime) - seg_size growth
from t1 )
where end is not null;


just take the current row and use lead to look forward a row.

That's it! Thanks, Tom!

Robert, October 20, 2004 - 5:20 pm UTC


One enhancement....

Robert, October 20, 2004 - 6:34 pm UTC

Tom,

How would I return only one row per segment. The row with the MAX endtime?

Here is how the query works now....

SQL> select * from (
  2  select snaptime st
  3       , lead(snaptime) over (partition by seg_name order by snaptime) end
  4       , seg_name,
  5         lead(seg_size) over (partition by seg_name order by snaptime) - seg_size growth
  6  from t1 )
  7  where end is not null
  8  /

ST        END       SEG_NAME      GROWTH
--------- --------- ---------- ---------
30-SEP-04 15-OCT-04 X1              1000
15-OCT-04 20-OCT-04 X1              1000
30-SEP-04 15-OCT-04 X2              2000
15-OCT-04 20-OCT-04 X3              -100

4 rows selected.

How would I get the following...only the 'most recent' row for each segment?

ST        END       SEG_NAME      GROWTH
--------- --------- ---------- ---------
15-OCT-04 20-OCT-04 X1              1000
30-SEP-04 15-OCT-04 X2              2000
15-OCT-04 20-OCT-04 X3              -100

Thanks,

Robert. 

Never mind! ..... I've got it.......But thanks for your help!

Robert, October 20, 2004 - 7:02 pm UTC

Tom,
Never mind.... here is the answer to my own question.
Thanks,
Robert.

"analytics rock!" :)

  1  select * from (
  2  select snaptime st
  3       , lead(snaptime) over (partition by seg_name order by snaptime) end
  4       , seg_name
  5       , lead(seg_size) over (partition by seg_name order by snaptime) - seg_size growth
  6       , max(snaptime) over (partition by seg_name) max_snap
  7  from t1 )
  8  where end is not null
  9* and end = max_snap
SQL> /

ST        END       SEG_NAME      GROWTH MAX_SNAP
--------- --------- ---------- --------- ---------
15-OCT-04 20-OCT-04 X1              1000 20-OCT-04
30-SEP-04 15-OCT-04 X2              2000 15-OCT-04
15-OCT-04 20-OCT-04 X3              -100 20-OCT-04
 

Dynamic SQL implementation restriction

Al, October 21, 2004 - 6:17 am UTC

hi Tom,
the view worked fine however the dynamic sql gave me a error
when I tried to run it, I changed the block to

BEGIN
FOR c_rec IN 'SELECT MAX( DECODE(
a.ccy_hedge_event_type_id,
13, a.present_value
)) AS curr_ffp,
MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
)) AS curr_ffr,
MAX( DECODE(
a.ccy_hedge_event_type_id,
17, a.present_value
)) AS curr_ffs,
LAG( MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
))) OVER ( PARTITION BY a.deal_reference ORDER BY date_from ) prev_ffr,
a.deal_reference,
date_from,
date_to
FROM ccy_hedge_daily_value a, ccy_hedge_deal b
WHERE b.ccy_hedge_deal_type_id IN( 2, 3 )
AND a.deal_reference = b.deal_reference'
LOOP
DBMS_OUTPUT.put_line('reference ' || c_rec.deal_reference );
END LOOP;
END;

The following error has occurred:

ORA-06550: line 3, column 17:
PLS-00999: implementation restriction (may be temporary)
ORA-06550: line 3, column 4:
PL/SQL: Statement ignored

Is there some parameter that needs to be set before allowing this type of statement?




Tom Kyte
October 21, 2004 - 7:03 am UTC

that is not the way to do dynamic sql.


declare
    type rc is ref cursor;
    l_cursor rc;
begin
    open l_cursor for '......';
    loop
        fetch l_cursor into ....;
        exit when l_cursor%notfound;
        -- process row
    end loop;
    close l_cursor;
end;

re-dynamic sql implementation restriction

Al, October 21, 2004 - 7:11 am UTC

Thanks Tom,
I should have checked your book -- the answer was on page 700. That was a quick response; I thought you would be sleeping, as I posted it at 10:00 UK time.



Analytic-enabled regular aggregate functions

A reader, November 05, 2004 - 11:36 am UTC

Oracle has analytic-enabled the regular aggregate built-in functions like sum, min, max, etc. I am having trouble understanding how that is useful.

I mean, isn't

sum(col1) over (partition by col2)

the same as

sum(col1) ... group by col2

Yes, I realize that sum(col1) over () is something that cannot be done without analytic functions, but can't everything else be done using the traditional group-by?

Thanks

Tom Kyte
November 05, 2004 - 5:16 pm UTC

1 select empno, sal,
2 avg(sal) over (partition by deptno) avg_sal_by_deptno,
3 avg(sal) over (partition by job) avg_sal_by_job,
4 sum(sal) over (partition by mgr) sum_sal_by_mgr
5* from emp
scott@ORA9IR2> /

EMPNO SAL AVG_SAL_BY_DEPTNO AVG_SAL_BY_JOB SUM_SAL_BY_MGR
---------- ---------- ----------------- -------------- --------------
7934 1300 2916.66667 1037.5 1300
7782 2450 2916.66667 2758.33333 8275
7839 5000 2916.66667 5000 5000
7788 3000 2175 3000 6000
7902 3000 2175 3000 6000
7876 1100 2175 1037.5 1100
7566 2975 2175 2758.33333 8275
7369 800 2175 1037.5 800
7900 950 1566.66667 1037.5 6550
7521 1250 1566.66667 1400 6550
7844 1500 1566.66667 1400 6550
7654 1250 1566.66667 1400 6550
7499 1600 1566.66667 1400 6550
7698 2850 1566.66667 2758.33333 8275

14 rows selected.


Try doing that with "group by" -- group by squishes out rows (highly technical term there -- squish).

You would have to have 3 inline views and join to get this without analytics.

You want to compare the salary of each individual to the various averages and compute the pct of salary made by mgr.



Analytics give you the "aggregates" with the "details"
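For comparison, one way the non-analytic equivalent might look (a sketch: three aggregate inline views joined back to EMP, with extra care needed for KING's NULL mgr):

select e.empno, e.sal,
       d.avg_sal_by_deptno, j.avg_sal_by_job, m.sum_sal_by_mgr
  from emp e,
       ( select deptno, avg(sal) avg_sal_by_deptno from emp group by deptno ) d,
       ( select job, avg(sal) avg_sal_by_job from emp group by job ) j,
       ( select mgr, sum(sal) sum_sal_by_mgr from emp group by mgr ) m
 where e.deptno = d.deptno
   and e.job = j.job
   -- the NULL mgr group must be matched explicitly
   and ( e.mgr = m.mgr or ( e.mgr is null and m.mgr is null ) );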

Code change ?

A reader, November 05, 2004 - 5:54 pm UTC

So Tom, that means that if we had something like this..

SQL@9iR2> select ename, ip_address
2 from hostnames
3 where ip_address in ( select ip_address from hostnames group by ip_address having count ( * ) > 1 );

.. that finds how many ip addresses are repeated and then lists the addresses and hostnames (please ignore this, it's just what I am trying to express -- I didn't find anything more adequate), should be changed to something like ..

SQL@9iR2> select ename,
2 ip_address
3 from (
4 select ename,
5 ip_address,
6 case when count( * ) over( partition by ip_address ) > 1
7 then rownum
8 end rn
9 from hostnames
10 )
11 where rn is not null
12 /

.. because of the performance gain using analytics? Is that how developers nowadays should be instructed?

Tom Kyte
November 05, 2004 - 6:25 pm UTC

If I was processing all of the rows or most of the rows in the table, yes I would code:

select *
from ( select x.*, count(*) over (partition by ip_address) cnt
from hostnames x
)
where cnt > 1;




there are times, when you have predicates on the other columns (since analytics are applied AFTER the where clause), that we would not use that technique.
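To see why the ordering matters, a small sketch (the extra predicate is hypothetical): the count is computed over only the rows that survive the WHERE clause, so filtering first gives per-partition counts of the filtered rows, not of the whole table.

select *
  from ( select x.*, count(*) over (partition by ip_address) cnt
           from hostnames x
          where ename like 'A%'   -- cnt now counts only the A% rows
       )
 where cnt > 1;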


Helena Marková, November 18, 2004 - 4:49 am UTC


Repeating values and Runing total

Sanjeev Sapre, November 19, 2004 - 12:56 am UTC

Tom,
We were trying to run following query

SELECT ename "Ename", deptno "Deptno", sal "Sal",
SUM(sal)
OVER (ORDER BY deptno,ename ) "Running Total",
SUM(SAL)
OVER (PARTITION BY deptno
ORDER BY ename) "Dept Total",
ROW_NUMBER()
OVER (PARTITION BY deptno
ORDER BY ENAME) "Seq"
FROM emp
ORDER BY deptno, ename

the result is like this

scott@VIS> /
Press any key to continue..

Ename Deptno Sal Running Total Dept Total Seq
---------- ---------- ---------- ------------- ---------- ----------
CLARK 10 2450 2450 2450 1
KING 10 5000 7450 7450 2
Kairaj 10 10000 17450 17450 3
MILLER 10 1300 21350 21350 4
MILLER 10 1300 21350 21350 5
MILLER 10 1300 21350 21350 6
ADAMS 20 1100 22450 1100 1
FORD 20 3000 25450 4100 2
JONES 20 2975 28425 7075 3
SCOTT 20 2975 34375 13025 4
SCOTT 20 2975 34375 13025 5
SMITH 20 1100 35475 14125 6
ALLEN 30 1600 37075 1600 1
BLAKE 30 2850 39925 4450 2
JAMES 30 950 40875 5400 3
MARTIN 30 1250 42125 6650 4
TURNER 30 1500 43625 8150 5
WARD 30 1250 44875 9400 6
Jim 50 25000 69875 25000 1

19 rows selected.

there are two cases where sal is repeated (the same): 1300 and 2975. Within the same salary there is also a case of the same ename. For these employees the running sum is lumped together rather than increasing row by row.

We are unable to understand how this works. Is this correct or is it a bug? Please explain.

regards


Tom Kyte
November 19, 2004 - 10:22 am UTC

when you order your data by deptno, ename -- which comes first

MILLER 10

or

MILLER 10

or

MILLER 10

? none do, they are "the same".


If you order by something that makes them all "unique" ultimately -- then they do have an order. consider:

scott@ORA9IR2> edit
Wrote file afiedt.buf

1 select ename, sal,
2 sum(sal) over (order by sal) sum1,
3 sum(sal) over (order by sal, EMPNO) sum2
4 from emp
5* order by sal
scott@ORA9IR2> /

ENAME SAL SUM1 SUM2
---------- ---------- ---------- ----------
SMITH 800 800 800
JAMES 950 1750 1750
ADAMS 1100 2850 2850
WARD 1250 5350 4100
MARTIN 1250 5350 5350

MILLER 1300 6650 6650
TURNER 1500 8150 8150
ALLEN 1600 9750 9750
CLARK 2450 12200 12200
BLAKE 2850 15050 15050
JONES 2975 18025 18025
SCOTT 3000 24025 21025
FORD 3000 24025 24025
KING 5000 29025 29025

14 rows selected.

In the above, when we order by sal -- 1250 does not come before or after 1250, it comes at the same time. So the analytic, looking at the data ordered by SAL, considers those two to be "the same". But order by SAL and then EMPNO and they become "different".


This is the expected behavior.



Using Group By clause with AF

denni50, November 19, 2004 - 9:57 am UTC

Hi Tom

I'm trying to re-write and expand the following query using AF:

Select idnumber,sum(giftamount)
from gift where giftdate>=to_date('01-JAN-2004', 'dd-mon-yyyy')
group by idnumber
having sum(giftamount)>=2500
order by idnumber;

To:

select idnumber from (select idnumber, sum(giftamount) SumGifts,
dense_rank()
over (partition by sum(giftamount)
order by idnumber) dr
from gift
where giftdate >= to_date('01-JAN-2004', 'dd-mon-yyyy')
having sum(giftamount) >= 2500)
where dr <= 100
order by idnumber;

and keep getting the below error:
ERROR at line 1:
ORA-00937: not a single-group group function

I know the idnumber needs to be grouped to get the sum(giftamount) but I'm not quite sure where or how to insert the group by clause within the partition clause.

What I would like to achieve is a listing like the below example:

sum(giftamount)      idnumber
between
10000 - 5000         1111
                     2222
                     3333

2500 - 4999.99       4444
                     5555
                     6666
                     7777 ..... and so on

thanks a bunch!







Tom Kyte
November 19, 2004 - 11:33 am UTC

add more layers of inline views there....

do the dense_rank on an inline view that computed the sum(giftamount) by idnumber.
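That is, something along these lines (a sketch against the GIFT table from the question: aggregate first in the innermost query, then rank the aggregated rows):

select idnumber, sumgifts
  from ( select idnumber, sumgifts,
                dense_rank() over (order by idnumber) dr
           from ( select idnumber, sum(giftamount) sumgifts
                    from gift
                   where giftdate >= to_date( '01-JAN-2004', 'dd-mon-yyyy' )
                   group by idnumber
                  having sum(giftamount) >= 2500 )
       )
 where dr <= 100
 order by idnumber;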

simple query

Menon, November 25, 2004 - 8:32 pm UTC

I have been struggling with a seemingly simple requirement.
I have a list of rows of a number as:
1
2
3
1
2
3
1
2
...

I want to assign a rank of 0 to first set of numbers
till 3, 1 to the second set of (1-3), 2 to the third
set and so on:
num rank
---- ----
1 0
2 0
3 0
1 1
2 1
3 1
1 2
2 2
3 2

The problem is that I can also have a set such as:
num rank
3 0
1 1
2 1
3 1
1 2
2 2
3 2

Also the upper limit 3 should not be hard coded in the
solution - it should work for sets of numbers
with limit of max 10.

Any ideas?

Thanx.

Tom Kyte
November 25, 2004 - 8:38 pm UTC

problem is -- rows have no order....

so, where is the missing column here -- rows don't come out in any order -- so give us the create table, insert into's and the select that makes them come out in 1,2,3,1,2,3,1,2,3 order -- and then we can talk about how to assign 0, 1, 2 groups...

thanx!

Menon, November 25, 2004 - 8:55 pm UTC

here you go...
---
scott@ORA10G> drop table t;

Table dropped.

scott@ORA10G> create table t ( x number, y number );

Table created.

scott@ORA10G> insert into t values( 1, 2 );

1 row created.

scott@ORA10G> insert into t values( 2, 3 );

1 row created.

scott@ORA10G> insert into t values( 3, 1 );

1 row created.

scott@ORA10G> insert into t values( 4, 2 );

1 row created.

scott@ORA10G> insert into t values( 5, 3 );

1 row created.

scott@ORA10G> insert into t values( 6, 1 );

1 row created.

scott@ORA10G> insert into t values( 7, 2 );

1 row created.

scott@ORA10G> insert into t values( 8, 3 );

1 row created.

scott@ORA10G> insert into t values( 9, 1 );

1 row created.

scott@ORA10G> insert into t values( 10, 2 );

1 row created.

scott@ORA10G> insert into t values( 11, 3 );

1 row created.

scott@ORA10G> insert into t values( 12, 1 );

1 row created.

scott@ORA10G> insert into t values( 13, 2 );

1 row created.

scott@ORA10G> insert into t values( 14, 3 );

1 row created.

scott@ORA10G> insert into t values( 15, 1 );

1 row created.

scott@ORA10G> insert into t values( 16, 2 );

1 row created.

scott@ORA10G> insert into t values( 17, 3 );

1 row created.

scott@ORA10G> insert into t values( 18, 1 );

1 row created.

scott@ORA10G> insert into t values( 19, 2 );

1 row created.

scott@ORA10G> insert into t values( 20, 3 );

1 row created.

scott@ORA10G> select * from t order by x, y;

X Y
---------- ----------
1 2
2 3
3 1
4 2
5 3
6 1
7 2
8 3
9 1
10 2
11 3
12 1
13 2
14 3
15 1
16 2
17 3
18 1
19 2
20 3

20 rows selected.
---
The values in the second row have started with 2.
It could have started with 1 or 3 as well.
Again, the same set of numbers could have been
from 1 to 4 or from 1 to 5 and so on.
Btw, that was a quick response on thanxgiving -
Bet you are working on your book now:)


Tom Kyte
November 26, 2004 - 8:58 am UTC

ops$tkyte@ORA9IR2> select y, dense_rank() over ( order by grp )-1 rnk
  2    from (
  3  select y, max(grp) over (order by x) grp
  4    from (
  5  select x,
  6         y,
  7         case when nvl(lag(y) over (order by x),y+1) > y then row_number() over (order by x) end grp
  8    from t
  9         )
 10         )
 11  /
 
         Y        RNK
---------- ----------
         2          0
         3          0
         1          1
         2          1
         3          1
         1          2
         2          2
         3          2
         1          3
         2          3
         3          3
         1          4
         2          4
         3          4
         1          5
         2          5
         3          5
         1          6
         2          6
         3          6
 
20 rows selected.
 

excellent!

Menon, November 26, 2004 - 10:35 am UTC

Thanx. It took me a few minutes to understand it:)
Had to rerun the query with the inner ones only.
The technique that you used is interesting. You
first created a query to "mark" the points where
the number changed to something less than the
previous number. Then you used dense_rank to
get at the answer. This seems like a very useful
technique in general. I believe I have seen something
like this in an article written by you?
And you have some name to it as well ("mark and..")?
Or maybe I am thinking of something else altogether.
Anyway, hope you had a great thanxgiving.
I will see you in the OOW.

Tom Kyte
November 26, 2004 - 10:49 am UTC

you remember correctly, it is this technique:

https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html

in "Analytics to the Rescue".

thanx.

Menon, November 26, 2004 - 10:52 am UTC

Yes. I found that article in some other reply of yours just now.
I swear you had given a name to it (was it mark and forward)?
Anyway, thanx a lot.

ok - here is the original problem

Menon, November 26, 2004 - 11:51 am UTC

The problem where I wanted to apply the above technique
was to generate a calendar for a given month and year
using SQL.
I am sure you would have a different and most likely
a more elegant solution than the one I have:)
--------
scott@ORA10G> -- generate a calendar for a given month
scott@ORA10G> variable month number;
scott@ORA10G> variable year number;
scott@ORA10G> exec :month := &1

PL/SQL procedure successfully completed.

scott@ORA10G> exec :year := &2

PL/SQL procedure successfully completed.

scott@ORA10G> column sun format 999
scott@ORA10G> column mon format 999
scott@ORA10G> column tue format 999
scott@ORA10G> column wed format 999
scott@ORA10G> column thu format 999
scott@ORA10G> column fri format 999
scott@ORA10G> column sat format 999
scott@ORA10G> prompt calendar for the month &1 year &2
calendar for the month 12 year 2004
scott@ORA10G> select max( decode( day_of_week, 1, day_num, null) ) sun,
2 max( decode( day_of_week, 2, day_num, null) ) mon,
3 max( decode( day_of_week, 3, day_num, null) ) tue,
4 max( decode( day_of_week, 4, day_num, null) ) wed,
5 max( decode( day_of_week, 5, day_num, null) ) thu,
6 max( decode( day_of_week, 6, day_num, null) ) fri,
7 max( decode( day_of_week, 7, day_num, null) ) sat
8 from
9 (
10 select day_num, day_of_week,
11 max( grp ) over ( order by day_num ) grp
12 from
13 (
14 select day_num, day_of_week,
15 case when day_of_week < lag_day_of_week then rn end grp
16 from
17 (
18 select day_num, day_of_week,
19 nvl(lag( day_of_week) over( order by day_num ), day_of_week+1) lag_day_of_week,
20 row_number() over( order by day_num) rn
21 from
22 (
23 select day_num, decode( to_char( curr_date, 'dy'), 'sun', 1,
24 'mon', 2,
25 'tue', 3,
26 'wed', 4,
27 'thu', 5,
28 'fri', 6,
29 'sat', 7) day_of_week
30 from
31 (
32 select rownum day_num, to_date( :month || '-' ||rownum || '-' || :year, 'MM-DD-YYYY') curr_date
33 from all_objects
34 where rownum <= to_number(to_char(last_day(to_date(:month ||'-01-'||:year, 'MM-DD-YYYY')), 'DD'))
35 )
36 )
37 )
38 )
39 )
40 group by grp;

SUN MON TUE WED THU FRI SAT
---- ---- ---- ---- ---- ---- ----
1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31




thanx

Menon, November 26, 2004 - 1:09 pm UTC

well, that solution is definitely much better:)
I did not think of looking for a format that gives
me the week number. In the above solution that I
posted, for what it is worth, we can substitute the
select day_num, decode( to_char( curr_date, 'dy'), 'sun', 1,
'mon', 2,
'tue', 3,
'wed', 4,
'thu', 5,
'fri', 6,
'sat', 7) day_of_week

with
select day_num, to_number(to_char( curr_date, 'D')) day_of_week

so the select becomes:
select max( decode( day_of_week, 1, day_num, null) ) sun,
max( decode( day_of_week, 2, day_num, null) ) mon,
max( decode( day_of_week, 3, day_num, null) ) tue,
max( decode( day_of_week, 4, day_num, null) ) wed,
max( decode( day_of_week, 5, day_num, null) ) thu,
max( decode( day_of_week, 6, day_num, null) ) fri,
max( decode( day_of_week, 7, day_num, null) ) sat
from
(
select day_num, day_of_week,
max( grp ) over ( order by day_num ) grp
from
(
select day_num, day_of_week,
case when day_of_week < lag_day_of_week then rn end grp
from
(
select day_num, day_of_week,
nvl(lag( day_of_week) over( order by day_num ), day_of_week+1) lag_day_of_week,
row_number() over( order by day_num) rn
from
(
select day_num, to_number(to_char( curr_date, 'D')) day_of_week
from
(
select rownum day_num, to_date( :month || '-' ||rownum || '-' || :year, 'MM-DD-YYYY') curr_date
from all_objects
where rownum <= to_number(to_char(last_day(to_date(:month ||'-01-'||:year, 'MM-DD-YYYY')), 'DD'))
)
)
)
)
)
group by grp;

I am glad I tried my solution before looking at yours
because:
1. I reinforced the idea of your "marking" trick.
2. I learnt about some new date formats after I looked
at your solution.
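For reference, a week-based grouping avoids the lag/marking machinery entirely. A sketch for the current month (it assumes an NLS territory where to_char(dt,'d') = 1 means Sunday, and a release where CONNECT BY LEVEL against DUAL generates rows):

select max( decode( to_char(dt,'d'), '1', to_char(dt,'dd') ) ) sun,
       max( decode( to_char(dt,'d'), '2', to_char(dt,'dd') ) ) mon,
       max( decode( to_char(dt,'d'), '3', to_char(dt,'dd') ) ) tue,
       max( decode( to_char(dt,'d'), '4', to_char(dt,'dd') ) ) wed,
       max( decode( to_char(dt,'d'), '5', to_char(dt,'dd') ) ) thu,
       max( decode( to_char(dt,'d'), '6', to_char(dt,'dd') ) ) fri,
       max( decode( to_char(dt,'d'), '7', to_char(dt,'dd') ) ) sat
  from ( select trunc(sysdate,'mm') + level - 1 dt
           from dual
        connect by level <= to_number(to_char(last_day(sysdate),'dd')) )
 group by dt - to_number(to_char(dt,'d'))   -- constant within each week
 order by min(dt);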




RE: simple query

Joaquin Martinez, November 28, 2004 - 9:05 pm UTC

Was actually playing around over the weekend with analytics (since I cannot practice them at work -- not much of a need there and besides, version 8.1.7 SE, so no dice :( ), so I don't lose the 'habit' of managing them, and used this thread to *try* to reproduce Tom's answers.. happened to wonder about a way to do this 'without' the need for them. Surprisingly for me, I came up with an answer.

SQL> create table t as
  2  select x, decode( y, 0, 3, y ) y
  3    from (
  4  select rownum x, mod( ( rownum + 1 ),3 ) y
  5    from all_objects
  6   where rownum <= 20
  7         )
  8  /

Table created.

SQL> select x,
  2         y,
  3         trunc( ( rownum + ( trunc( z / 100 ) - abs( ( mod( z,100 ) - ( trunc( z / 100 ) + 2 ) ) ) ) )
  4                / trunc( z / 100 ) ) rnk
  5    from (
  6  select x,
  7         y,
  8         ( select max( y ) * 100 + mod( min( rownum * 100 + y ),100 )
  9             from t ) z
 10    from t
 11         )
 12  /

         X          Y        RNK
---------- ---------- ----------
         1          2          0
         2          3          0
         3          1          1
         4          2          1
         5          3          1
         6          1          2
         7          2          2
         8          3          2
         9          1          3
        10          2          3
        11          3          3
        12          1          4
        13          2          4
        14          3          4
        15          1          5
        16          2          5
        17          3          5
        18          1          6
        19          2          6
        20          3          6

20 rows selected.

.. of course, not as efficient as using analytics (two table accesses, for example), but I wanted to share it with all of you.

analytic function

karthick, November 30, 2004 - 5:19 am UTC

I have a table ,

create table temp_rollup(dept1 char(2),no number(2))

and data are as follows,

insert into temp_rollup values('a',2)
insert into temp_rollup values('a',2)
insert into temp_rollup values('a',2)
insert into temp_rollup values('b',2)
insert into temp_rollup values('b',2)
insert into temp_rollup values('b',2)


I am trying to get the output as follows:

Dept       No
---------  ---
a            2
a            2
a            2
sub total    6
b            2
b            2
b            2
sub total    6
total       12

Can analytic functions help me in achieving this? I tried rollup but didn't get the exact output.

I don't want to go with UNION ALL.

Tom Kyte
November 30, 2004 - 7:57 am UTC

You know -- don't like it when people say "don't want to go with feature X".  what if feature X was the best, only, superior, correct way to do it? why rule out feature "X"

anyway, feature "X" isn't needed this time.

in pure SQL:

ops$tkyte@ORA9IR2> select decode( grouping(dept1)+grouping(rowid), 1, 'Subtotal', 2, 'Total', dept1 ), sum(no)
  2    from temp_rollup
  3   group by rollup(dept1,rowid)
  4  /
 
DECODE(G    SUM(NO)
-------- ----------
a                 2
a                 2
a                 2
Subtotal          6
b                 2
b                 2
b                 2
Subtotal          6
Total            12
 
9 rows selected.
 

and if this is a sqlplus report, this will do it:


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> break on dept1       on report
ops$tkyte@ORA9IR2> compute sum of no on report
ops$tkyte@ORA9IR2> compute sum of no on dept1
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select dept1, no from temp_rollup order by dept1, no;
 
DEPT1              NO
---------- ----------
a                   2
                    2
                    2
********** ----------
sum                 6
b                   2
                    2
                    2
********** ----------
sum                 6
           ----------
sum                12
 
6 rows selected.
 
 

Here's a 3-month calendar

A reader, November 30, 2004 - 4:01 pm UTC

select * from (
select SUN_MON_TUE_WED_THU_FRI_SAT from (
select '--------'||rpad(rtrim(TO_CHAR(add_months(sysdate,-1),'Month')),9,'-')||'-----------' ||' '||
'--------'||rpad(rtrim(TO_CHAR(sysdate,'Month')),9,'-')||'-----------' ||' '||
'--------'||rpad(rtrim(TO_CHAR(add_months(sysdate,+1),'Month')),9,'-')||'-----------' SUN_MON_TUE_WED_THU_FRI_SAT, 1 rowno from dual
union --
select 'SUN MON TUE WED THU FRI SAT' ||' '||
'SUN MON TUE WED THU FRI SAT' ||' '||
'SUN MON TUE WED THU FRI SAT' SUN_MON_TUE_WED_THU_FRI_SAT, 2 rowno from dual
union --
select '----------------------------' ||' '||
'----------------------------' ||' '||
'----------------------------' SUN_MON_TUE_WED_THU_FRI_SAT , 3 rowno from dual
union --
select '----------'||TO_CHAR(add_months(sysdate,-1),'YYYY')||'--------------'||' '||
'----------'||TO_CHAR(sysdate,'YYYY')||'--------------'||' '||
'----------'||TO_CHAR(add_months(sysdate,+1),'YYYY')||'--------------' SUN_MON_TUE_WED_THU_FRI_SAT, 10 rowno from dual
union --
SELECT (case when day1 <= add_months(last_day(l_date),-1) OR day1 > last_day(l_date) then ' ' ELSE ' '||to_char(day1,'DD') ||' ' end)
||(case when day2 <= add_months(last_day(l_date),-1) OR day2 > last_day(l_date) then ' ' ELSE ' '||to_char(day2,'DD') ||' ' end)
||(case when day3 <= add_months(last_day(l_date),-1) OR day3 > last_day(l_date) then ' ' ELSE ' '||to_char(day3,'DD') ||' ' end)
||(case when day4 <= add_months(last_day(l_date),-1) OR day4 > last_day(l_date) then ' ' ELSE ' '||to_char(day4,'DD') ||' ' end)
||(case when day5 <= add_months(last_day(l_date),-1) OR day5 > last_day(l_date) then ' ' ELSE ' '||to_char(day5,'DD') ||' ' end)
||(case when day6 <= add_months(last_day(l_date),-1) OR day6 > last_day(l_date) then ' ' ELSE ' '||to_char(day6,'DD') ||' ' end)
||(case when day7 <= add_months(last_day(l_date),-1) OR day7 > last_day(l_date) then ' ' ELSE ' '||to_char(day7,'DD') ||' ' end)
||' '||(case when day8 <= add_months(last_day(m_date),-1) OR day8 > last_day(m_date) then ' ' ELSE (case when trunc(day8) = trunc(m_date) then '-'||to_char(day8,'DD')||'-' ELSE ' '||to_char(day8,'DD') ||' ' end) end)
||(case when day9 <= add_months(last_day(m_date),-1) OR day9 > last_day(m_date) then ' ' ELSE (case when trunc(day9) = trunc(m_date) then '-'||to_char(day9,'DD')||'-' ELSE ' '||to_char(day9,'DD') ||' ' end) end)
||(case when day10 <= add_months(last_day(m_date),-1) OR day10 > last_day(m_date) then ' ' ELSE (case when trunc(day10) = trunc(m_date) then '-'||to_char(day10,'DD')||'-' ELSE ' '||to_char(day10,'DD') ||' ' end) end)
||(case when day11 <= add_months(last_day(m_date),-1) OR day11 > last_day(m_date) then ' ' ELSE (case when trunc(day11) = trunc(m_date) then '-'||to_char(day11,'DD')||'-' ELSE ' '||to_char(day11,'DD') ||' ' end) end)
||(case when day12 <= add_months(last_day(m_date),-1) OR day12 > last_day(m_date) then ' ' ELSE (case when trunc(day12) = trunc(m_date) then '-'||to_char(day12,'DD')||'-' ELSE ' '||to_char(day12,'DD') ||' ' end) end)
||(case when day13 <= add_months(last_day(m_date),-1) OR day13 > last_day(m_date) then ' ' ELSE (case when trunc(day13) = trunc(m_date) then '-'||to_char(day13,'DD')||'-' ELSE ' '||to_char(day13,'DD') ||' ' end) end)
||(case when day14 <= add_months(last_day(m_date),-1) OR day14 > last_day(m_date) then ' ' ELSE (case when trunc(day14) = trunc(m_date) then '-'||to_char(day14,'DD')||'-' ELSE ' '||to_char(day14,'DD') ||' ' end) end)
||' '||(case when day15 <= add_months(last_day(r_date),-1) OR day15 > last_day(r_date) then ' ' ELSE ' '||to_char(day15,'DD') ||' ' end)
||(case when day16 <= add_months(last_day(r_date),-1) OR day16 > last_day(r_date) then ' ' ELSE ' '||to_char(day16,'DD') ||' ' end)
||(case when day17 <= add_months(last_day(r_date),-1) OR day17 > last_day(r_date) then ' ' ELSE ' '||to_char(day17,'DD') ||' ' end)
||(case when day18 <= add_months(last_day(r_date),-1) OR day18 > last_day(r_date) then ' ' ELSE ' '||to_char(day18,'DD') ||' ' end)
||(case when day19 <= add_months(last_day(r_date),-1) OR day19 > last_day(r_date) then ' ' ELSE ' '||to_char(day19,'DD') ||' ' end)
||(case when day20 <= add_months(last_day(r_date),-1) OR day20 > last_day(r_date) then ' ' ELSE ' '||to_char(day20,'DD') ||' ' end)
||(case when day21 <= add_months(last_day(r_date),-1) OR day21 > last_day(r_date) then ' ' ELSE ' '||to_char(day21,'DD') ||' ' end)
SUN_MON_TUE_WED_THU_FRI_SAT, rownum+3 rowno
FROM (select (add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 1) + (7*(rownum-1)) day1,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 2) + (7*(rownum-1))day2,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 3) + (7*(rownum-1))day3,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 4) + (7*(rownum-1))day4,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 5) + (7*(rownum-1))day5,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 6) + (7*(rownum-1))day6,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 7) + (7*(rownum-1))day7,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 1) + (7*(rownum-1)) day8,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 2) + (7*(rownum-1))day9,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 3) + (7*(rownum-1))day10,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 4) + (7*(rownum-1))day11,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 5) + (7*(rownum-1))day12,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 6) + (7*(rownum-1))day13,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 7) + (7*(rownum-1))day14,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 1) + (7*(rownum-1)) day15,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 2) + (7*(rownum-1))day16,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 3) + (7*(rownum-1))day17,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 4) + (7*(rownum-1))day18,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 5) + (7*(rownum-1))day19,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 6) + (7*(rownum-1))day20,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 7) + (7*(rownum-1))day21,
a.l_date ,a.m_date,a.r_date
from all_objects, (select add_months(s_date,-1) l_date, s_date m_date, add_months(s_date,+1) r_date
FROM (select sysdate s_date from dual)) a where rownum <= 6)
) --to_date('1/26/2002','mm/dd/yyyy')
ORDER BY rowno )
/


Squish

A reader, December 08, 2004 - 10:46 pm UTC

"Try doing that with "group by" -- group by squishes out rows (highly technical term there -- squish)"
...
Analytics give you the "aggregates" with the "details" "

Regarding using analytic-enabled aggregate functions...

Often, I find that I need to add a DISTINCT to my queries to remove dupes when I use analytic functions! In your example above, it was not needed because your query used 3 analytic functions and the whole row was not duplicated.

So, if I find that I need to DISTINCT, am I using the analytic functions when a simple GROUP BY would suffice?

As your example above showed, doing 3 (sum and 2 averages) without using AF would be hairy indeed, but would you say that if my query involves only ONE analytic function, it can ALWAYS be done using regular GROUP BY?

Thanks

Tom Kyte
December 09, 2004 - 1:08 pm UTC

maybe -- maybe not.

You say "i have to use distinct" -- I personally see nothing wrong in general with getting two rows that have the same "values" -- it is not something you HAVE to remove. Your application might need it, but it is not mandatory.


select ename, deptno, count(*) over (partition by deptno) cnt_by_dept
from emp;

that cannot be done simply with a group by

And that could contain "duplicates"

but I would say removing the duplicates would be an ERROR. If two SMITH's work in deptno = 10 and my report was to show the above data, removing extra smiths would be an error.

Analytic Functions

krishna padala, December 11, 2004 - 12:28 am UTC

Hi Tom ,

We use dense_rank in our application and it is very useful code. Your example helped me a lot.
Could you please help in the following case?

I have

dept_no  sevirity_code
-------  -------------
    101  s1
    101  s1
    101  s2
    101  s2
    101  s2
    102  s1
    103  s2
    103  s2
    103  s3
    103  s4

Now we would like to have the output as

D.id  S1  S2  S3  S4
 101   2   3   0   0
 102   1   0   0   0
 103   0   2   1   1

Thanks in advance...

Tom Kyte
December 11, 2004 - 7:55 am UTC

search for

pivot
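For the data above, the classic decode-based pivot would be along these lines (a sketch; table and column names as posted, assuming the severity codes are stored as lowercase strings):

select dept_no,
       count( decode( sevirity_code, 's1', 1 ) ) s1,
       count( decode( sevirity_code, 's2', 1 ) ) s2,
       count( decode( sevirity_code, 's3', 1 ) ) s3,
       count( decode( sevirity_code, 's4', 1 ) ) s4
  from t
 group by dept_no
 order by dept_no;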

rewrite w/o PARTITION BY clause

Shalu, December 13, 2004 - 1:55 pm UTC

Tom, could you please help me rewrite this query without using an analytic function? (I am writing this query in an After Parameter Form trigger in Reports designer, which supports DENSE_RANK() but doesn't support the PARTITION BY clause.)

SELECT portfolio_id, name
FROM (SELECT p.portfolio_id, p.name,
DENSE_RANK () OVER
(PARTITION BY p.name
ORDER BY TRUNC (p.asof_date) DESC, p.portfolio_id DESC) AS dr
FROM vn_portfolio p, vn_project pr
WHERE p.project_id = pr.project_id
AND pr.name = 'FAS13'
AND TRUNC (p.asof_date) <=
TO_DATE ('12/02/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS')
AND p.name IN ('A', 'C', 'B'))
WHERE dr = 1
/

It returns:
PORTFOLIO_ID NAME
------------ -----
10 A
45 B
50 C

This SQL gives the ids of the latest portfolios in the FAS13 project for 12/02, because a project can have duplicate portfolios (multiple As or Bs or Cs) for a day, each with a unique portfolio_id (portfolio_id is a running sequence). Now A, B, C are loaded every day. Let's say for 12/02 we don't have 'A' available; then I want the latest A from prior dates.

Here are the underlying ddls:

Create table My_project(project_id number, name varchar2(5))
/

Insert into My_project (project_id, name) values (100, 'FAS13')
/
Insert into my_project (project_id, name) values (200, 'XYZ')
/

Create table my_portfolio(portfolio_id number, name varchar2(5), asof_date date, project_id number)
/

Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (10, 'A', to_date('12/01/2004 06:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (20, 'B', to_date('12/01/2004 14:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (30, 'C', to_date('12/01/2004 12:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (40, 'B', to_date('12/02/2004 16:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (50, 'C', to_date('12/02/2004 20:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (45, 'B', to_date('12/02/2004 16:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/

Thanks!


Tom Kyte
December 13, 2004 - 2:23 pm UTC

can you hide the construct in a view?
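A sketch of what that could look like (using the MY_PORTFOLIO/MY_PROJECT tables posted below; note the date filter must sit INSIDE the view, before the dense_rank, so the report parameter would have to arrive via something like an application context -- all names here are made up):

create or replace view latest_portfolio_v as
select portfolio_id, name
  from ( select p.portfolio_id, p.name,
                dense_rank() over
                    (partition by p.name
                     order by trunc(p.asof_date) desc,
                              p.portfolio_id desc) dr
           from my_portfolio p, my_project pr
          where p.project_id = pr.project_id
            and pr.name = 'FAS13'
            -- the cutoff comes from a context the report sets beforehand
            and trunc(p.asof_date) <=
                to_date( sys_context('report_ctx','asof'),
                         'MM/DD/YYYY HH24:MI:SS' )
            and p.name in ('A','B','C') )
 where dr = 1;

The report side then needs no PARTITION BY at all -- just "select portfolio_id, name from latest_portfolio_v".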

Dupes

A reader, December 13, 2004 - 3:09 pm UTC

"You say "i have to use distinct" -- I personally see nothing wrong in general with getting two rows that have the same "values" -- it is not something you HAVE to remove"

Um..how would you like it if your bank statement shows you the same transaction repeated twice?

"select ename, deptno, count(*) over (partition by deptno) cnt_by_dept from emp;

If two SMITH's work in deptno = 10 and my report was to show the above data, removing extra smiths would be an error"

Yes, but your query above is too simplistic. A more real-world query might be like

select emp_pk, ename, deptno, count(*) over (partition by deptno) cnt_by_dept from emp;

The app might choose to hide the emp_pk from the user depending on the requirements.

Even after adding the emp_pk in there, I still get "dupes" because I am NOT doing a group by.

Yes, analytics are great for showing the aggregate and detail in the same row, but to avoid these dupes, one needs to use DISTINCT and that interferes with view merging, predicate pushing and all that fun stuff?

Thanks

Tom Kyte
December 13, 2004 - 3:14 pm UTC

Um, why would my bank statement show the same transaction twice?

the only way for the same transaction to appear twice would have been for the SOURCE DATA TO HAVE IT TWICE.

For you see -- analytics don't "add rows".

What dups could possibly come from:

select emp_pk, ename, deptno, count(*) over (partition by deptno)
cnt_by_dept from emp;

????? (assuming that emp_pk is the primary key).


Show me a dupe with that -- please?

ANALYTICS DO NOT, CANNOT, WILL NOT produce "new records". Show me a duplicate.... And I'll show you source data that has duplicates in it already.

A reader, December 13, 2004 - 6:32 pm UTC

Yes, analytics don't generate new records; it's that they don't "squish" source records like a GROUP BY does

select deptno,count(*) over (partition by deptno) from emp;

would produce dupes

select deptno,count(*) from emp group by deptno;

will not.

Thats what I meant.

See, I frequently use analytics to *avoid* the need to group by a whole bunch of columns when all I need is a little summary level data along with a bunch of details

col1,col2,col3,count(*) over (partition by col4,col5,col6)

instead of

...,count(*)... group by col4,col5,col6

Well, there is really no equivalent without using analytics.

In this case, there are lots of dupes so I need to DISTINCT the whole thing which kind of defeats the purpose?

Or maybe I am missing some basic concept here?

Thanks

Tom Kyte
December 13, 2004 - 7:08 pm UTC

but now you are going totally against your prior example -- of "would you like to see your banking transaction twice"

there are times to use analytics.

There are times to use group by.

they are utterly DIFFERENT BEASTS, totally.

I would not be using analytics to avoid group bys of lots of columns (how CAN you -- you'd be partitioning by them!)

It would be wasteful to use analytics and then distinct, vs just doing what you meant to do in the first place -- GROUP BY.

why would you want to *avoid* the right way to do it??

Analytics are not creating dups here, analytics are just simply doing what you said to do.


Show me a real world case where:

col1,col2,col3,count(*) over (partition by col4,col5,col6)

makes "sense"? (dups or not)

If that returns dups -- you have really big problems -- what count(*) was 5 for?


I'm not getting it at all.

That's just the same situation i'm in

Stefan, January 06, 2005 - 10:09 am UTC

Hi Tom,

I just got your two books, Expert One-on-One and Effective Oracle by Design, and they are a very interesting read so far. I especially like the depth of information to be found in 1 on 1; it's just great :), amazing work!

Anyhow, I was trying to get a simple result set, as in "show me the average number of servers in use at any given hour during office hours".

An example:

auditing@PURPOSE> create table server (id number, when date );

Table created.

insert into server values (1,to_date('01012005080000','DDMMYYYYHH24MISS'));
insert into server values (2,to_date('01012005080500','DDMMYYYYHH24MISS'));
insert into server values (3,to_date('01012005081000','DDMMYYYYHH24MISS'));
insert into server values (4,to_date('01012005082000','DDMMYYYYHH24MISS'));
insert into server values (5,to_date('01012005090000','DDMMYYYYHH24MISS'));
insert into server values (6,to_date('01012005090500','DDMMYYYYHH24MISS'));
insert into server values (7,to_date('01012005090900','DDMMYYYYHH24MISS'));

auditing@PURPOSE> select * from server;

ID WHEN
---------- ---------
1 01-JAN-05
2 01-JAN-05
3 01-JAN-05
4 01-JAN-05
5 01-JAN-05
6 01-JAN-05
7 01-JAN-05

7 rows selected.

The GROUP BY approach:
auditing@PURPOSE> select to_char(when,'DD HH24'), count(*) from server group by to_char(when,'DD HH24');

TO_CH COUNT(*)
----- ----------
01 08 4
01 09 3

Elapsed: 00:00:00.00

and the Analytics approach:

auditing@PURPOSE> select to_char(when,'DD HH24'), count(*) over (partition by to_char(when,'DD HH24')) from server;

TO_CH COUNT(*)OVER(PARTITIONBYTO_CHAR(WHEN,'DDHH24'))
----- -----------------------------------------------
01 08 4
01 08 4
01 08 4
01 08 4
01 09 3
01 09 3
01 09 3

7 rows selected.

Elapsed: 00:00:00.00

Not the same results, so I have to add that evil DISTINCT:

auditing@PURPOSE> select distinct to_char(when,'DD HH24'), count(*) over (partition by to_char(when,'DD HH24')) from server;

TO_CH COUNT(*)OVER(PARTITIONBYTO_CHAR(WHEN,'DDHH24'))
----- -----------------------------------------------
01 08 4
01 09 3

Elapsed: 00:00:00.00

which does give me the right result, but when the GROUP BY query and the DISTINCT + analytics query are compared in, e.g., runstats:

auditing@PURPOSE> select to_char(when,'DD HH24'), count(*) from server group by to_char(when,'DD HH24');

TO_CH COUNT(*)
----- ----------
01 08 4
01 09 3

auditing@PURPOSE> exec rs.rs_middle;

PL/SQL procedure successfully completed.

auditing@PURPOSE> select distinct to_char(when,'DD HH24'), count(*) over (partition by to_char(when,'DD HH24')) from server;

TO_CH COUNT(*)OVER(PARTITIONBYTO_CHAR(WHEN,'DDHH24'))
----- -----------------------------------------------
01 08 4
01 09 3

auditing@PURPOSE> exec rs.rs_stop(500);
Run1 ran in 1914 hsecs
Run2 ran in 1122 hsecs
Run1 ran in 170.59% of the time
Name Run1 Run2 Diff
LATCH.cache buffers chains 461 2,026 1,565

Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
3,966 6,041 2,075 65.65%

PL/SQL procedure successfully completed.


We can see that the group by in a case like this appears to be the far better choice (and when I ran this on a) more data and b) in a for 1..1000 loop, the difference got enormous).

Am I correct that this kind of query just won't be doable in a performant way using analytics?

PS Thanks a lot for providing this forum to the community, this site just rocks! :)




Tom Kyte
January 06, 2005 - 11:13 am UTC

this would be inappropriate for analytics.

analytics do not mean "group by is dead", far from it.

analytics are used when you WANT the detail records -- group by is used when you want to aggregate. You clearly wanted to aggregate here, group by would be the correct approach.

Help on SQL using analytics

Vikas Khanna, January 12, 2005 - 7:42 am UTC

Hi Tom,

Please help me in writing a SQL query which outputs the results as a concatenated string in the form of

Ename1,Ename2,Ename3,Ename4,Ename5,Ename6 and so on, using the EMP table under the SCOTT schema.

Is it possible to do that?

What other methods are there to do it if the output cannot be returned using plain SQL?

Regards,


Tom Kyte
January 12, 2005 - 9:49 am UTC

search this site for stragg
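As an aside, stragg is a user-defined aggregate published on this site; on 11g Release 2 and later the built-in LISTAGG covers this directly:

select listagg( ename, ',' ) within group (order by ename) enames
  from scott.emp;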

Query

A reader, April 26, 2005 - 11:21 pm UTC

I have one table like this:

PERSON_ID  PRODUCT_CODE  ACCOUNT_ID
     1000             1  ...
     1000             2  ...
     1000             3  ...
     1001             1  ...
     1001             1  ...
     1002             2  ...
     1002             3  ...
     1003             1  ...
     1003             1  ...
     1003             2  ...

I want to get a list of all PERSON_IDs excluding those who have just accounts with PRODUCT_CODE=1.

So in above example, I want to exclude 1001 person_id since he has only those accounts for which product_code=1. 1003 will be included in the result set because although he has two accounts with product_code=1 he also has some other account for some different product_code.

How can I write the SQL to do this?


Tom Kyte
April 27, 2005 - 7:41 am UTC

do you need all of the rows for that person_id or just a list of person_ids?

select distinct person_id from t where product_code <> 1;

gets a list of person_ids


select *
from ( select t.*, count( case when product_code <> 1 then 1 end )
over (partition by person_id) cnt
from t )
where cnt > 0;

if you need all of the details (even product_code = 1 records) for person_ids that have some row with product_code <> 1;


count using analytic functions

A reader, May 05, 2005 - 2:05 pm UTC

Hi Tom,

I have a table t as

create table t
(
d_id number,
status char(1),
crdate date
);

insert into t values (1, 'Y', to_date('04-05-2005 00:00:00','MM-DD-YYYY HH24:MI:SS'));
insert into t values (2, 'Y', to_date('04-05-2005 00:00:00','MM-DD-YYYY HH24:MI:SS'));
insert into t values (3, 'N', to_date('04-05-2005 00:00:00','MM-DD-YYYY HH24:MI:SS'));
insert into t values (4, 'Y', to_date('04-09-2005 00:00:00','MM-DD-YYYY HH24:MI:SS'));
insert into t values (5, 'Y', to_date('04-09-2005 00:00:00','MM-DD-YYYY HH24:MI:SS'));
insert into t values (6, 'N', to_date('04-09-2005 00:00:00','MM-DD-YYYY HH24:MI:SS'));
insert into t values (7, 'N', to_date('04-09-2005 00:00:00','MM-DD-YYYY HH24:MI:SS'));
insert into t values (8, 'N', to_date('04-09-2005 00:00:00','MM-DD-YYYY HH24:MI:SS'));
commit;

I need result of this table to look like the following:

crdate               s1                s2
-------------------  ----------------  ----------------
04-05-2005 00:00:00  2 (status = 'Y')  1 (status = 'N')
04-09-2005 00:00:00  2 (status = 'Y')  3 (status = 'N')


Thanks.



Tom Kyte
May 05, 2005 - 2:07 pm UTC

ops$tkyte@ORA10G> select crdate, count(case when status = 'Y' then 1 end) s1, count(case when status='N' then 1 end) s2
  2    from t
  3   group by crdate;
 
CRDATE                      S1         S2
------------------- ---------- ----------
04-05-2005 00:00:00          2          1
04-09-2005 00:00:00          2          3
 

You rock Tom!!!! Thanks a lot!!!!

A reader, May 05, 2005 - 2:31 pm UTC


Error or WAD (Works as Designed)?

A reader, May 31, 2005 - 5:01 pm UTC

CREATE TABLE DEPT (
DEPT_NO NUMBER,
PT_DEPT_NO NUMBER,
CONSTRAINT dept_pk PRIMARY KEY ( DEPT_NO ),
CONSTRAINT dept_fk FOREIGN KEY ( PT_DEPT_NO ) REFERENCES DEPT( DEPT_NO )
);

INSERT INTO DEPT VALUES( 1, NULL );
INSERT INTO DEPT VALUES( 2, 1 );

SELECT DEPT_NO, PT_DEPT_NO, PRIOR DEPT_NO PRIOR_DEPT_NO
FROM DEPT
START WITH PT_DEPT_NO IS NULL
CONNECT BY PRIOR DEPT_NO = PT_DEPT_NO;

SELECT DEPT_NO, PT_DEPT_NO, PRIOR DEPT_NO PRIOR_DEPT_NO, COUNT(*) OVER ()
FROM DEPT
START WITH PT_DEPT_NO IS NULL
CONNECT BY PRIOR DEPT_NO = PT_DEPT_NO;

Adding in the analytic kills the PRIOR data. I would think you would either get the prior column or get a syntax error.

Tom Kyte
June 01, 2005 - 8:44 am UTC

well, the analytics will destroy the sort order in general anyway, so the way I would do this would be:

ops$tkyte@ORA9IR2> select x.*, count(*) over ()
  2    from (
  3          SELECT DEPT_NO, PT_DEPT_NO, PRIOR DEPT_NO PRIOR_DEPT_NO, rownum r
  4          FROM DEPT
  5          START WITH PT_DEPT_NO IS NULL
  6          CONNECT BY PRIOR DEPT_NO = PT_DEPT_NO
  7         ) x
  8   order by r
  9                  ;
 
   DEPT_NO PT_DEPT_NO PRIOR_DEPT_NO          R COUNT(*)OVER()
---------- ---------- ------------- ---------- --------------
         1                                   1              2
         2          1             1          2              2

 

Prior vs Analytics

A reader, June 01, 2005 - 1:25 pm UTC

I didn't actually care about the order of the result set. It just seemed odd that PRIOR was syntactically correct but returned the wrong answer based on the analytic being there.

Why does the addition of the analytic kill the PRIOR from working?

Tom Kyte
June 01, 2005 - 5:17 pm UTC

Don't know, I did file a bug to have it looked at.

Wondering if this can be done using analytics

dharma, June 20, 2005 - 5:35 pm UTC

Hi Tom,

This query can be done using tables, But out of curiosity I want to know if this can be done using analytics.

I have amounts that form a bit of a sequence:

10 13 17 21

and a tabular chart like this:

Month paid |  1   2   3   4
---------------------------
Month born
     1     | 10  21  17  13
     2     | 13  10  21  17
     3     | 17  13  10  21
     4     | 21  17  13  10


If I give the "month born" and "month paid" I have to get the amount. (Actually the months would be from 1..12)

Thanks



Tom Kyte
June 20, 2005 - 5:46 pm UTC

Does not compute. Where does a "chart" come into play in the database, and what is a "bit of sequence" in database speak?

dharma, June 20, 2005 - 6:15 pm UTC

The sequence: I meant that the differences between the amounts resemble a sequence (for 10 13 17 21 the differences would be 3, 4, 4), and this is seen in the chart. If month born = 1 and month paid = 1 then the amount is 10; and so the data would look like (1,1)=10 (1,4)=13 (1,3)=17 (1,2)=21.

I have to get the amount to be paid, based on when a person is born and when he pays, from the chart.


Tom Kyte
June 20, 2005 - 6:21 pm UTC

give me tables. wait, first read this:

http://asktom.oracle.com/Misc/how-to-ask-questions.html

and understand that to you, this all makes obvious sense. To us, this is just text on the screen.

10 13 17 21 would be 3,4,4???? huh?

phrase the question more precisely, eg: what are the inputs and outputs here.


is it as easy as:

select decode( :MONTH_PAID, 1, c1, 2, c2, 3, c3, ...., 12, c12 )
from t
where month_born = :MONTH_BORN;


(turn your chart into a table, with columns month_born and c1..c12 for month paid?)

If so, I would not model it as a chart - but pretend you have no inside knowledge, now -- describe the problem to someone.

dharma, June 20, 2005 - 6:37 pm UTC

In my first post I already mentioned that I have done this using tables. I wanted to know whether, given the chart, it was possible to write a query to get the desired result by passing the "month paid" and "month born" parameters. I am a regular user of this site and I have read "How to ask a question" a couple of times. It's just that I thought the question was clear enough. My apologies if it's not.
Thanks

Tom Kyte
June 21, 2005 - 7:50 am UTC

ummm, well, do you care to share your tables. I'm not understanding what a "chart" is.

If the "chart" is a table with rows and columns, the query is above, the one with decode.

If you've already done this using tables, then you must have already answered your own question? I mean, you have the data in some tables, you are running queries to get the answer? So -- yes, I am still somewhat confused as to the question here.

Look at your question, do you see any create tables? insert intos? I see questions about "what is a chart in relation to a database" for example?

Need Top N query with sum of the rest.

Shrikant, June 21, 2005 - 6:01 am UTC

Dear Tom,

Thanks a lot for the valuable information/suggestions/links you are providing. And also thanks a lot for your book "Expert One-on-One".


I have a question regarding the TOP-N query. I have the following fact table, "Invoice_Fact":

Country  Customer  Sale_Amount  Quantity
USA      USA1             5000        50
AUS      AUS1              500        30
USA      USA1             2000        30
USA      USA2              700        30
USA      USA3              250        40
AUS      AUS2              300        70
USA      USA4             1700        30
AUS      AUS3              600        20
USA      USA4             1700        30
USA      USA5             1200        40
USA      USA6              300        30
USA      USA7              500        70
AUS      AUS4             2600        20
AUS      AUS4             1600        20
AUS      AUS5             1800        10
AUS      AUS6             1400        15


The end user wants a report which will list the top 5 customers based on the sales amount and also give the total for the rest of the customers. I mean the output should be like this.

Country  Rank  Customer  Sum_Sales  Quantity  %on_Total_Sale
------------------------------------------------------------
USA         1  USA1           7000        80          52.43%
USA         2  USA4           3400        60          25.47%
USA         3  USA5           1200        40           8.99%
USA         4  USA2            700        30           5.24%
USA         5  USA7            500        70           3.75%
USA         6  others          550        70           4.12%

Here "others" means sum of USA3 and USA6. And the percentage is taken based on the grand sales total for USA which is 13350.

I wanted to achieve this in SQL; please guide me on this.

P.S.:
Tom, I know I should not ask a fresh question in a review, but believe me I have searched for a similar question and couldn't find it. I found the base question and related answers closely match my requirement; that's why I put the question here.


Thanks a lot in advance.

Tom Kyte
June 21, 2005 - 4:44 pm UTC

not a create table or insert into in sight....

I Will Help with inserts and create table

AS, June 21, 2005 - 6:59 pm UTC

Hi Tom,

I will add the required scripts, as I was trying to do this myself.
I have also worked out a solution as per my understanding, and I am anxiously waiting for your comments. I am trying to get a grasp of analytic and aggregate functions.

Thanks

create table t ( country varchar2(10) ,
customer varchar2 (10),
sale_amount number (5),
quantity number(5) ) ;

insert into t
select 'USA', 'USA1', 5000, 50 from dual union
select 'AUS', 'AUS1', 500 , 30 from dual union
select 'USA', 'USA1', 2000, 30 from dual union
select 'USA', 'USA2', 700, 30 from dual union
select 'USA', 'USA3', 250, 40 from dual union
select 'AUS', 'AUS2', 300, 70 from dual union
select 'USA', 'USA4', 1700, 30 from dual union
select 'AUS', 'AUS3', 600, 20 from dual union
select 'USA', 'USA4' , 1700, 30 from dual union
select 'USA', 'USA5', 1200, 40 from dual union
select 'USA', 'USA6', 300, 30 from dual union
select 'USA', 'USA7', 500, 70 from dual union
select 'AUS', 'AUS4', 2600, 20 from dual union
select 'AUS', 'AUS4', 1600, 20 from dual union
select 'AUS', 'AUS5', 1800, 10 from dual union
select 'AUS', 'AUS6', 1400, 15 from dual
;
commit ;


I came up with this solution .

select mytab.*, ratio_to_report( sale_amount ) over ( ) * 100 as on_total_sale
from
(select country,
rank,
customer,
sale_amount,
quantity
from
(select country,
sale_amount,
rank( ) over ( order by sale_amount desc) rank,
-- dense_rank ( ) over( order by sale_amount desc ) drank,
customer ,
quantity
from (select country,
customer,
sum(sale_amount) as sale_amount,
sum(quantity) as quantity
from t
group by country, customer)
order by sale_amount desc)
where rank <=5
UNION ALL
select 'USA',
6,
'Others',
sum(sale_amount) as sale_amount,
sum(quantity)
from
(select country,
sale_amount,
rank( ) over ( order by sale_amount desc) rank,
-- dense_rank ( ) over( order by sale_amount desc ) drank,
customer ,
quantity
from (select country,
customer,
sum(sale_amount) as sale_amount,
sum(quantity) as quantity
from t
group by country, customer)
order by sale_amount desc)
where rank > 5
group by 'USA', 6,'Others') mytab ;


COUNTRY     RANK  CUSTOMER  SALE_AMOUNT  QUANTITY  ON_TOTAL_SALE
----------  ----  --------  -----------  --------  -------------
USA            1  USA1             7000        80          34.23
AUS            2  AUS4             4200        40          20.54
AUS            3  AUS5             1800        10           8.80
USA            4  USA4             1700        30           8.31
AUS            5  AUS6             1400        15           6.85
USA            6  Others           4350       330          21.27

6 rows selected.


Tom Kyte
June 21, 2005 - 11:53 pm UTC

ops$tkyte@ORA10GR1> select cty, cust, amt, qty, dr, ratio_to_report(amt) over ()*100 tot_pct
  2    from (
  3  select case when dr <= 5 then country else 'oth' end cty,
  4         case when dr <= 5 then customer else 'others' end cust,
  5             sum(amt) amt, sum(qty) qty, dr
  6    from (
  7  select country, customer, amt, qty,
  8         least( dense_rank() over (order by amt desc nulls last), 6 ) dr
  9    from (
 10  select country, customer, sum(sale_amount) amt, sum(quantity) qty
 11    from t
 12   group by country, customer
 13         )
 14             )
 15   group by case when dr <= 5 then country else 'oth' end ,
 16            case when dr <= 5 then customer else 'others' end ,
 17                    dr
 18             )
 19   order by dr
 20  /
 
CTY        CUST              AMT        QTY         DR    TOT_PCT
---------- ---------- ---------- ---------- ---------- ----------
USA        USA1             7000         80          1 34.2298289
AUS        AUS4             4200         40          2 20.5378973
AUS        AUS5             1800         10          3 8.80195599
USA        USA4             1700         30          4 8.31295844
AUS        AUS6             1400         15          5 6.84596577
oth        others           4350        330          6 21.2713936
 
6 rows selected.
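
As an aside: the sample report asked for the top 5 per country, while the query above ranks across all countries. A sketch of a per-country variant - partition both the dense_rank and the ratio_to_report by country:

select country, cust, amt, qty, dr,
       ratio_to_report(amt) over (partition by country)*100 tot_pct
  from (
select country,
       case when dr <= 5 then customer else 'others' end cust,
       sum(amt) amt, sum(qty) qty, dr
  from (
select country, customer, amt, qty,
       least( dense_rank() over (partition by country
                                 order by amt desc nulls last), 6 ) dr
  from (
select country, customer, sum(sale_amount) amt, sum(quantity) qty
  from t
 group by country, customer
       )
       )
 group by country,
          case when dr <= 5 then customer else 'others' end,
          dr
       )
 order by country, dr
/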
 
 

Thank You Very Much, Tom.

A reader, June 22, 2005 - 3:52 am UTC

You are great.

A reader, June 22, 2005 - 7:22 am UTC

Hi Tom,

I have the foll requirement:

create table t1
(
datecol date,
idcol1 number,
idcol2 number,
col1 varchar2(20),
charcol char(1),
opcol varchar2(10)
);

I am doing the following queries in a single procedure all based on t1.

select max(datecol), max(idcol1), max(idcol2)
into tdatecol, tidcol1, tidcol2
from t1
where col1 = '1234'
and charcol = 'A';

select max(datecol), max(idcol2)
into tdatecol1, tidcol21
from t1
where idcol1 = tidcol1
and charcol = 'B';

select opcol into topcol
from t1
where idcol2 = tidcol2;

if (tidcol21 is not null) then
select opcol into topcol1
from t1
where idcol2 = tidcol21;
end if;

Is it possible for me to put this all in a single query? Can I use analytic functions for this? Please help.

Thanks.


Tom Kyte
June 22, 2005 - 7:31 am UTC

please, ask things once and in one place. I'll play with this when I get a chance later.

A reader, June 22, 2005 - 10:05 am UTC

Hi Tom,

I am sorry about that. Though I already posted the same question in another topic, I think two people asked two questions at the same time and I didn't know if you got my question. I am sorry about that.

Thanks a lot for your help.



Possible answer for "A Reader"

Sean, June 22, 2005 - 11:45 am UTC

I wasn't sure about constraints and so on, so I took a guess, based on the queries, that idcol2 must be unique.

I don't consider this optimal, but it's a start.

Hope it helps!


SELECT tdatecol, tidcol1, tidcol2,
tdatecol1, tidcol21, topcol,
(CASE
WHEN tidcol21 IS NOT NULL
THEN (SELECT opcol
FROM t1
WHERE idcol2 = tidcol21)
ELSE 'UNKNOWN'
-- no "else" in original question,
--so I didn't know what to do here
END
) topcol1
FROM (SELECT DISTINCT tdatecol, tidcol1, tidcol2,
MAX(datecol) OVER () tdatecol1,
MAX(b.idcol2) OVER () tidcol21,
(SELECT opcol FROM t1
WHERE idcol2 = tidcol2) topcol
FROM (SELECT MAX(datecol) tdatecol,
MAX(idcol1) tidcol1,
MAX(idcol2) tidcol2
FROM t1
WHERE col1 = '1234'
AND charcol = 'A'
) a,
t1 b
WHERE b.idcol1(+) = a.tidcol1
AND b.charcol(+) = 'B')

Shameful cross posting

Mike, June 23, 2005 - 2:35 am UTC

This is a shameful cross-posting, as I've not generated a lot of discussion on Google Groups. There is, however, an interesting (I think) analytics challenge there.

See

</code> http://groups-beta.google.com/group/comp.databases.oracle.server/browse_frm/thread/d132a82f68dc7862/a8f467ef122f7f02?hl=en#a8f467ef122f7f02 <code>

or if that doesn't work, search for:

SQL Challenge DEGREES

I've got a C++ Developer breathing down my neck in terms of performance (as I'm using a bit of PL/SQL) so I could do with some help to save the name of Oracle and SQL.

Big thanks to those who follow the link

Cheers,

Mike.

Little debugging ...

Gabe, June 24, 2005 - 4:12 pm UTC

Mike:

[I only read the google groups ... don't like to leave my email ... so here it is].

Jonathan missed a little thing in there ... should use row_number() and not rank().

Consider:

create table test_data
( id number(9) not null
,degrees number(3) not null check (degrees between 0 and 359)
,sequence_no number(9)
);

insert into test_data values ( 1, 358, null);
insert into test_data values ( 1, 10, null);
insert into test_data values ( 1, 180, null);
insert into test_data values ( 1, 181, null);

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 rank() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 rank() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

        ID SEQUENCE_NO    DEGREES    NEW_SEQ
---------- ----------- ---------- ----------
         1                    358          1
         1                     10          2
         1                    180          3
         1                    181          4

Seems OK ... but ...

flip@FLOP> insert into test_data values ( 1, 2, null);

1 row created.

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 rank() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 rank() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

        ID SEQUENCE_NO    DEGREES    NEW_SEQ
---------- ----------- ---------- ----------
         1                    358          1
         1                      2          1  <== wrong sequence number
         1                     10          3
         1                    180          4
         1                    181          5

One row_number() seems to fix it ...

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 row_number() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 rank() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

        ID SEQUENCE_NO    DEGREES    NEW_SEQ
---------- ----------- ---------- ----------
         1                    358          1
         1                      2          2
         1                     10          3
         1                    180          4
         1                    181          5

But [while proving the results are non-deterministic] we also see:

flip@FLOP> delete from test_data where id=1 and degrees=358;

1 row deleted.

flip@FLOP> insert into test_data values ( 1, 358, null);

1 row created.

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 row_number() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 rank() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

        ID SEQUENCE_NO    DEGREES    NEW_SEQ
---------- ----------- ---------- ----------
         1                      2          1
         1                    358          2  <== wrong sequence number
         1                     10          3
         1                    180          4
         1                    181          5

So, both calls to rank() should be replaced with row_number() ...

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 row_number() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 row_number() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

        ID SEQUENCE_NO    DEGREES    NEW_SEQ
---------- ----------- ---------- ----------
         1                      2          1
         1                     10          2
         1                    180          3
         1                    181          4
         1                    358          5

Which still leaves the result non-deterministic ... if that concerns you.
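
If determinism matters, one minimal fix (assuming the desired tie-break rule is, say, "smaller DEGREES first") is to add a tie-breaker column to the inner order by:

row_number() over (
    partition by id
    order by least(degrees, abs(360-degrees)), degrees
) first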

Cheers.



payments distribution problem, could you help.

Alexander Semenov, August 09, 2005 - 5:49 am UTC

Hi, Tom!

Is there any way to implement the task:

There are two tables, “Debt_tab” and “Pmnt_tab”, with the following structure:

create table debt_tab(
debt_id number,
ag_id number,
debt_sum number
);

create table pmnt_tab(
pm_id number,
ag_id number,
pmnt_sum number
);

insert into debt_tab
values(1, 25, 101.0);
insert into debt_tab
values(2, 25, 200.0);
insert into debt_tab
values(3, 35, 213.0);
insert into debt_tab
values(4, 35, 15.0);
insert into debt_tab
values(5, 35, 132.0);
insert into debt_tab
values(6, 35, 39.0);


insert into pmnt_tab
values(1, 25, 25.0);
insert into pmnt_tab
values(2, 25, 80.0);
insert into pmnt_tab
values(3, 25, 250.0);
insert into pmnt_tab
values(4, 35, 100.0);
insert into pmnt_tab
values(5, 35, 120.0);
insert into pmnt_tab
values(6, 35, 108.0);

Is it possible to write a SQL statement that will return the payments distribution across debts, according to their order:

debt_id  ag_id  debt_sum  pm_id  pmnt_sum  rest_topay
      1     25     101.0      1      25.0        76.0
      1     25     101.0      2      76.0         0.0
      2     25     200.0      2       4.0       196.0
      2     25     200.0      3     196.0         0.0
            25                3      54.0       -54.0
      3     35     213.0      4     100.0       113.0
      3     35     213.0      5     113.0         0.0
      4     35      15.0      5       7.0         8.0
      4     35      15.0      6       8.0         0.0
      5     35     132.0      6     100.0        32.0
Thank you.

Tom Kyte
August 09, 2005 - 10:11 am UTC

it would be super tricky to get this part out:

1 25 101.0 2 76.0 0.0
2 25 200.0 2 4.0 196.0 <<=== that one
2 25 200.0 3 196.0 0.0

nothing pops to mind for this.

Solution for above?

Bob B, August 09, 2005 - 2:13 pm UTC

I'm not sure where the 5th record came from (the one with no debt id or debt sum), but this query returns all the other records in the same order and seems to have the logic correct.

SELECT *
FROM (
SELECT
d.DEBT_ID,
d.AG_ID,
d.DEBT_SUM,
p.PM_ID,
LEAST( p.PMNT_SUM, d.RUNNING_TOTAL - p.RUNNING_TOTAL + p.PMNT_SUM, d.DEBT_SUM - d.RUNNING_TOTAL + p.RUNNING_TOTAL ) PMNT_SUM,
CASE
WHEN d.RUNNING_TOTAL - p.RUNNING_TOTAL < -1 * p.PMNT_SUM THEN NULL
WHEN d.RUNNING_TOTAL - p.RUNNING_TOTAL > d.DEBT_SUM THEN NULL
ELSE GREATEST( d.RUNNING_TOTAL - p.RUNNING_TOTAL, 0 )
END REST_TO_PAY
FROM (
SELECT DEBT_ID, AG_ID, DEBT_SUM, SUM( DEBT_SUM ) OVER ( PARTITION BY AG_ID ORDER BY DEBT_ID ) RUNNING_TOTAL
FROM DEBT_TAB
) d, (
SELECT PM_ID, AG_ID, PMNT_SUM, SUM( PMNT_SUM ) OVER ( PARTITION BY AG_ID ORDER BY PM_ID ) RUNNING_TOTAL
FROM PMNT_TAB pt
)p
WHERE d.AG_ID = p.AG_ID
)
WHERE REST_TO_PAY IS NOT NULL
ORDER BY DEBT_ID, PM_ID

DEBT_ID  AG_ID  DEBT_SUM  PM_ID  PMNT_SUM  REST_TO_PAY
      1     25       101      1        25           76
      1     25       101      2        76            0
      2     25       200      2         4          196
      2     25       200      3       196            0
      3     35       213      4       100          113
      3     35       213      5       113            0
      4     35        15      5         7            8
      4     35        15      6         8            0
      5     35       132      6       100           32


Tom Kyte
August 09, 2005 - 3:22 pm UTC

thanks -- these are great "puzzles" aren't they sometimes.

Puzzles are always fun

Bob B, August 09, 2005 - 4:01 pm UTC

It was simpler than my original idea of using the MODEL clause. The problem is similar to the merge portion of a sort merge. Take two sequenced lists and use the elements from each list in order, according to the "sort" rules. In this case the rules were:

1) Net owed (REST_TO_PAY) must be less than or equal to the current debt record
2) Prior net owed (REST_TO_PAY + current payment) must be greater than 0

Puzzles are always fun

Alexander Semenov, August 10, 2005 - 10:02 am UTC

Bob, thanks a lot!


Found a bug in the result view of the puzzle but not in distribution result.

Alexander Semenov, August 17, 2005 - 2:19 pm UTC

Hi,
The bug resides at PMNT_SUM:

LEAST( p.PMNT_SUM,
d.RUNNING_TOTAL - p.RUNNING_TOTAL + p.PMNT_SUM,
d.DEBT_SUM - d.RUNNING_TOTAL + p.RUNNING_TOTAL ) as PMNT_SUM

The bug isn't seen in the example because of the "ideal conditions" shown there. It arises when a payment shifts and is spread across a few debts.
Try this:

create table debt_tab(
debt_id number,
ag_id number,
debt_sum number
);

create table pmnt_tab(
pm_id number,
ag_id number,
pmnt_sum number
);

insert into debt_tab
values(3, 35, 1650.0);
insert into debt_tab
values(4, 35, 804.0);
insert into debt_tab
values(5, 35, 525.0);
insert into debt_tab
values(6, 35, 765.0);


insert into pmnt_tab
values(4, 35, 520.0);
insert into pmnt_tab
values(5, 35, 19.0);
insert into pmnt_tab
values(6, 35, 2800.0);

and after executing the distribution query we get, instead of the expected result:

DEBT_ID  AG_ID  DEBT_SUM  PM_ID  p_sum  PMNT_SUM  REST_TO_PAY
      3     35      1650      4    520       520         1130
      3     35      1650      5     19        19         1111
      3     35      1650      6   2800      1111            0
      4     35       804      6   2800      1689            0   <---here
      5     35       525      6   2800       885            0
      6     35       765      6   2800       360          405

1689=2800-1111 = d.Debt_Sum-d.Running_Total+p.Running_Total

It seems that some other condition is missing, or something else...
Could it be fixed?

Thank you.

Bug is fixed

Alexander Semenov, August 18, 2005 - 6:12 am UTC

Yesterday I was too tired...
The solution:

LEAST(
p.PMNT_SUM,
d.RUNNING_TOTAL - p.RUNNING_TOTAL + p.PMNT_SUM,
d.DEBT_SUM - d.RUNNING_TOTAL + p.RUNNING_TOTAL,
d.DEBT_SUM <--
) as PMNT_SUM



A query that will fill in the missing rows for the payments distribution problem posted on August 09

Frank Zhou, August 24, 2005 - 11:39 am UTC

Here is the SQL, built on top of Bob's nice query.
This SQL seems to return a result set that fills in the missing rows:

(     25               3       54.0       -54.0 )  <==========missing rows

Thanks,

Frank



SQL> SELECT debt_id , ag_id , debt_sum, pm_id, PMNT_SUM, rest_to_pay
  2   FROM
  3  (    SELECT debt_id   ,
  4              ag_id     ,
  5              DECODE( r, 1, debt_sum, NULL ) debt_sum,
  6              pm_id,
  7              CASE WHEN r = 1 THEN PMNT_SUM
  8                   ELSE CASE WHEN diff > 0
  9                     THEN diff
 10                     ELSE NULL END
 11                   END AS PMNT_SUM,
 12              CASE WHEN r = 2
 13                   THEN 0 - diff
 14               ELSE  REST_TO_PAY END AS rest_to_pay,
 15            flag, r , diff
 16      FROM ( SELECT *
 17          FROM ( SELECT  d.DEBT_ID,
 18                         d.AG_ID,
 19                  d.DEBT_SUM,
 20                         p.PM_ID,
 21                    LEAST(p.PMNT_SUM,
 22                          d.RUNNING_TOTAL - p.RUNNING_TOTAL + p.PMNT_SUM,
 23                          d.DEBT_SUM - d.RUNNING_TOTAL + p.RUNNING_TOTAL,
 24                          d.DEBT_SUM ) AS PMNT_SUM,
 25               CASE
 26               WHEN d.RUNNING_TOTAL - p.RUNNING_TOTAL < -1 * p.PMNT_SUM 
                  THEN NULL
 27               WHEN d.RUNNING_TOTAL - p.RUNNING_TOTAL > d.DEBT_SUM 
                  THEN NULL
 28               ELSE GREATEST( d.RUNNING_TOTAL - p.RUNNING_TOTAL, 0 )
 29               END REST_TO_PAY,
 30               pm_sum - de_sum  AS diff,
 31                CASE WHEN d.ag_id != lead(d.ag_id ) over (ORDER BY d.ag_id)
 32                 THEN '1' END AS flag
 33                 FROM (
 34                      SELECT DEBT_ID, AG_ID, DEBT_SUM,
 35                      SUM( DEBT_SUM ) OVER ( PARTITION BY AG_ID 
                            ORDER BY DEBT_ID ) RUNNING_TOTAL ,
 36                      SUM( DEBT_SUM ) OVER ( PARTITION BY AG_ID) de_sum
 37                      FROM DEBT_TAB) d,
 38                     ( SELECT PM_ID, AG_ID, PMNT_SUM,
 39                       SUM( PMNT_SUM ) OVER ( PARTITION BY AG_ID 
                              ORDER BY PM_ID ) RUNNING_TOTAL,
 40                       SUM( PMNT_SUM ) OVER ( PARTITION BY AG_ID) pm_sum
 41                           FROM PMNT_TAB pt ) p
 42                  WHERE d.AG_ID = p.AG_ID)
 43                WHERE REST_TO_PAY IS NOT NULL),
 44                (SELECT 1 r FROM dual UNION ALL SELECT 2 r FROM dual)
 45            )
 46           WHERE r = 1 OR ( flag = 1 AND diff > 0 )
 47       ORDER BY DEBT_ID, PM_ID
 48  /

   DEBT_ID      AG_ID   DEBT_SUM      PM_ID   PMNT_SUM REST_TO_PAY              
---------- ---------- ---------- ---------- ---------- -----------              
         1         25        101          1         25          76              
         1         25        101          2         76           0              
         2         25        200          2          4         196              
         2         25        200          3        196           0              
         2         25                     3         54         -54 <==      
         3         35        213          4        100         113              
         3         35        213          5        113           0              
         4         35         15          5          7           8              
         4         35         15          6          8           0              
         5         35        132          6        100          32              

10 rows selected.

SQL> spool off
 

Overlapping data sets

Russell, September 09, 2005 - 4:11 am UTC

Hi Tom,

I have a table structure as follows, and relevant data.

CREATE TABLE TABLE1 (
REFDATE DATE,
AGS VARCHAR2(10),
CONFIRMATIONDATE DATE,
STARTTIME DATE,
COMPLETIONDATE DATE,
SORTORDER NUMBER );

Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:59:23', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 07:59:24', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:23:58', 'MM/DD/YYYY HH24:MI:SS'), 24);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:24:25', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:24:26', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:24:28', 'MM/DD/YYYY HH24:MI:SS'), 23);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:24:47', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:24:48', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:24:50', 'MM/DD/YYYY HH24:MI:SS'), 22);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:25:09', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:10', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:13', 'MM/DD/YYYY HH24:MI:SS'), 21);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:25:34', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:35', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:37', 'MM/DD/YYYY HH24:MI:SS'), 20);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:25:58', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:01', 'MM/DD/YYYY HH24:MI:SS'), 19);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:26:21', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:22', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:24', 'MM/DD/YYYY HH24:MI:SS'), 18);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:26:43', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:44', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:46', 'MM/DD/YYYY HH24:MI:SS'), 17);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:27:06', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:27:07', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:27:09', 'MM/DD/YYYY HH24:MI:SS'), 16);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:57:08', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:57:09', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 13:18:06', 'MM/DD/YYYY HH24:MI:SS'), 15);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 13:19:20', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 13:19:21', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:56:53', 'MM/DD/YYYY HH24:MI:SS'), 14);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 14:57:29', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:57:30', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:57:32', 'MM/DD/YYYY HH24:MI:SS'), 13);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 14:57:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:58:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:58:05', 'MM/DD/YYYY HH24:MI:SS'), 12);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:54:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 11:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 15:00:00', 'MM/DD/YYYY HH24:MI:SS'), 11);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:55:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 10:30:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 15:02:00', 'MM/DD/YYYY HH24:MI:SS'), 10);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:58:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 11:30:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 15:04:00', 'MM/DD/YYYY HH24:MI:SS'), 9);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:57:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 09:30:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 15:06:00', 'MM/DD/YYYY HH24:MI:SS'), 8);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:32:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:32:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:00:00', 'MM/DD/YYYY HH24:MI:SS'), 7);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:30:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:32:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:02:00', 'MM/DD/YYYY HH24:MI:SS'), 6);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 14:58:26', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:58:27', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:03:27', 'MM/DD/YYYY HH24:MI:SS'), 5);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 16:03:50', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:03:51', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:03:53', 'MM/DD/YYYY HH24:MI:SS'), 4);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:31:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:31:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:14:00', 'MM/DD/YYYY HH24:MI:SS'), 3);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 16:05:27', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:05:28', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:31:00', 'MM/DD/YYYY HH24:MI:SS'), 2);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 16:31:24', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:31:25', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:34:34', 'MM/DD/YYYY HH24:MI:SS'), 1);

The records indicate tasks completed by our field workforce. The CONFIRMATIONDATE is the time when a technician selects a task. They are meant to travel to the customer's premises and, once onsite, they start working (i.e. STARTTIME). Once finished, they COMPLETE the task. It is possible for a technician to have more than one task at a time (how they work on more than one is a mystery to me). The SORTORDER field is something I have added to the task, and is basically based on the COMPLETIONDATE in descending order.

I am trying to work out work and travel times. The travel time should be the time between the earliest CONFIRMATIONDATE and the earliest STARTTIME (displayed on the first task only). The work time would be the time between the minimum STARTTIME and the maximum COMPLETIONDATE, divided by the number of jobs in the set. I am trying to ascertain the complete overlapping set. I have attempted, using analytic functions displayed on your site, to calculate this result. I would be fine if I could just put the tasks in order based on one column (i.e. COMPLETIONDATE), but having two columns with conflicting sorts (CONFIRMATIONDATE and COMPLETIONDATE), and then establishing the total overlapping records for both of these sets, is really testing me.

Looking at the dataset, records with SORTORDER between 12 and 24 are separate tasks without overlapping times, but tasks 1 to 11 overlap each other and, ultimately, tasks 12 to 24. Therefore for this example I am looking for a result like:
Travel time between 7:54 am and 7:59:24, and work time between 7:59:24 and 16:34:34 (divided by 24)

I am using Oracle 9i. Is it possible to do this in a single SQL statement?

I hope this is clear enough

Thanks
Russell

Tom Kyte
September 09, 2005 - 7:16 am UTC

I don't see one easily -- no.

the problem is, we'd need to know the number of tasks in the group (in order to find the maximum completion time) but in order to know the number of jobs we need to know the ultimate completion times....



can we treat this data at the minute level instead of the second level? or even better - say 5 minutes of granularity - we can "explode the data out" to have an observation per minute/5 minutes/whatever and just aggregate.
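
A sketch of that "explode" idea at 5-minute granularity (the bucket math is an assumption, and the CONNECT BY row generator can be swapped for any row source with enough rows in 9i): cross join the tasks to a set of 5-minute slots anchored at REFDATE and count how many tasks are active in each slot:

select t.ags,
       t.refdate + (r.n-1)*5/1440 bucket_start,
       count(*) tasks_active
  from table1 t,
       (select level n from dual connect by level <= 24*12) r
 where t.refdate + (r.n-1)*5/1440 < t.completiondate
   and t.refdate + r.n*5/1440     > t.starttime
 group by t.ags, t.refdate + (r.n-1)*5/1440
 order by 2;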

Calculate month-wise

Kumar, November 02, 2005 - 1:48 am UTC

Hi Tom,

I have a table like this:

Action-type  startdt     enddt
ADD          10/01/2005  11/01/2005
ADD          09/01/2005  10/01/2005
UPDATE       10/01/2005  10/01/2005
DELETE       10/01/2005  10/01/2005

Now, I have to generate a report (based on the enddate) as follows,

ActionType  currentmonthcount  month-1_count  month-2_count
Add         2                  2              1
update      1                  1              1
Delete      1                  0              0

Can you please help me in this regard?

Thanks

Tom Kyte
November 02, 2005 - 5:03 am UTC

I'd really like to, but there wasn't any create table or insert into :(

Calculate month-wise

Kumar, November 02, 2005 - 6:44 am UTC

Hi Tom,

Thanks for your reply.

Please find below the create table statement and my select query. I would like to know if there is a better way to frame this query.

CREATE TABLE WORKFLOW_HISTORY (
ACTION_TYPE VARCHAR2 (255),
WORK_START_DTM DATE,
WORK_END_DTM DATE
) ;

select a. action_type,
decode(b.cnt, null, 0, b.cnt) December,
decode(c.cnt, null, 0, c.cnt) January,
decode(d.cnt, null, 0, d.cnt) Feburary,
decode(e.cnt, null, 0, e.cnt) March,
decode(f.cnt, null, 0, f.cnt) April,
decode(g.cnt, null, 0, g.cnt) May,
decode(h.cnt, null, 0, h.cnt) June,
decode(j.cnt, null, 0, j.cnt) July,
decode(k.cnt, null, 0, k.cnt) August,
decode(m.cnt, null, 0, m.cnt) September,
decode(n.cnt, null, 0, n.cnt) October,
decode(p.cnt, null, 0, p.cnt) November
from workflow_history a,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-11), 'mon')
and work_end_dtm is not null
group by action_type) b,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-10), 'mon')
and work_end_dtm is not null
group by action_type) c,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-9), 'mon')
and work_end_dtm is not null
group by action_type) d,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-8), 'mon')
and work_end_dtm is not null
group by action_type) e,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-7), 'mon')
and work_end_dtm is not null
group by action_type) f,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-6), 'mon')
and work_end_dtm is not null
group by action_type) g,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-5), 'mon')
and work_end_dtm is not null
group by action_type) h,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-4), 'mon')
and work_end_dtm is not null
group by action_type) j,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-3), 'mon')
and work_end_dtm is not null
group by action_type) k,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-2), 'mon')
and work_end_dtm is not null
group by action_type) m,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-1), 'mon')
and work_end_dtm is not null
group by action_type) n,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(sysdate, 'mon')
and work_end_dtm is not null
group by action_type) p
where a.action_type = b.action_type (+)
and a.action_type = c.action_type (+)
and a.action_type = d.action_type (+)
and a.action_type = e.action_type (+)
and a.action_type = f.action_type (+)
and a.action_type = g.action_type (+)
and a.action_type = h.action_type (+)
and a.action_type = j.action_type (+)
and a.action_type = k.action_type (+)
and a.action_type = m.action_type (+)
and a.action_type = n.action_type (+)
and a.action_type = p.action_type (+)
group by a.action_type, b.cnt, c.cnt, d.cnt, e.cnt, f.cnt, g.cnt, h.cnt, j.cnt,
k.cnt, m.cnt, n.cnt, p.cnt

Tom Kyte
November 03, 2005 - 5:03 am UTC

yes there is.


select action_type,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-11),'mm') then action_type end ) dec,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-10),'mm') then action_type end ) jan,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-9 ),'mm') then action_type end ) feb
/* .... */
from workflow_history
where work_end_dtm >= add_months(sysdate,-11)
group by action_type;



when dealing with dates, try not to use TO_CHAR except to format output (an exception would be when looking for 'saturdays', for example - but in general, use TRUNC, not TO_CHAR)
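
A quick illustration of the difference: a month-name comparison matches that month in every year, whereas TRUNC(date,'mm') keeps the year in play:

select to_char( date '2005-01-10', 'mon' ) jan_2005,   -- 'jan'
       to_char( date '2004-01-10', 'mon' ) jan_2004,   -- also 'jan' - the year is lost
       trunc( date '2005-01-10', 'mm' )    mon_start   -- 01-JAN-2005 - the year is kept
  from dual;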



Re: Month-wise

Jagjeet Singh, November 02, 2005 - 9:49 am UTC

SQL> select * from a;

ACTION_TYP WORK_STAR WORK_END_
---------- --------- ---------
ADD        10-JAN-05 11-JAN-05
ADD        09-JAN-05 10-JAN-05
UPDATE     10-JAN-05 10-JAN-05
DELETE     10-JAN-05 10-JAN-05

SQL> select
  2  action_type,
  3  sum(decode(mm,1,1,0)) a1,
  4  sum(decode(mm,2,1,0)) a2,
  5  sum(decode(mm,3,1,0)) a3,
  6  sum(decode(mm,5,1,0)) a4,
  7  sum(decode(mm,6,1,0)) a5,
  8  sum(decode(mm,7,1,0)) a6,
  9  sum(decode(mm,8,1,0)) a7,
 10  sum(decode(mm,9,1,0)) a8,
 11  sum(decode(mm,10,1,0)) a9,
 12  sum(decode(mm,11,1,0)) a10,
 13  sum(decode(mm,12,1,0)) a11,
 14  sum(decode(mm,12,1,0)) a12
 15  from
 16  (select action_type,to_number(to_char(trunc(work_end_dtm),'mm')) mm from a)
 17  group by action_type
SQL> /

ACTION_TYP  A1  A2  A3  A4  A5  A6  A7  A8  A9 A10 A11 A12
---------- --- --- --- --- --- --- --- --- --- --- --- ---
ADD          2   0   0   0   0   0   0   0   0   0   0   0
DELETE       1   0   0   0   0   0   0   0   0   0   0   0
UPDATE       1   0   0   0   0   0   0   0   0   0   0   0


A1 = Jan
A2 = Feb 
 ..
 

Re: Month-wise - TYPO

Jagjeet Singh, November 02, 2005 - 9:59 am UTC

Typo: instead of using 4, I repeated 12 two times in the month list.

RE: Calculate month-wise

Duke Ganote, November 02, 2005 - 10:06 am UTC

Kumar -- I'm curious to see what analytic-function oriented query Tom might propose. I'm still wrestling with the analytic functions, so I've only formulated a non-analytic alternative. For comparison, here's what I came up with.

First the set up. I assume there are no records with work_end_dtm older than 365 days. Then I generated some sample data:

create table workflow_history
as select case mod(level,3)
when 0 then 'Delete'
when 1 then 'Add'
when 2 then 'Update'
end AS ACTION_TYPE,
sysdate-mod(level,365)-5 AS work_start_dtm,
sysdate-mod(level,365) AS work_end_dtm
from dual connect by level < 50000
/
analyze table workflow_history compute statistics
/

Next I ran the following non-analytic-function query:

select action_type
, sum(back11) December
, sum(back10) January
, sum(back9) February
, sum(back8) March
, sum(back7) April
, sum(back6) May
, sum(back5) June
, sum(back4) July
, sum(back3) August
, sum(back2) September
, sum(back1) October
, sum(current_month) November
from (
select action_type,
case when to_char(sysdate, 'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as CURRENT_month,
case when to_char(add_months(sysdate,-1),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK1,
case when to_char(add_months(sysdate,-2),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK2,
case when to_char(add_months(sysdate,-3),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK3,
case when to_char(add_months(sysdate,-4),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK4,
case when to_char(add_months(sysdate,-5),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK5,
case when to_char(add_months(sysdate,-6),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK6,
case when to_char(add_months(sysdate,-7),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK7,
case when to_char(add_months(sysdate,-8),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK8,
case when to_char(add_months(sysdate,-9),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK9,
case when to_char(add_months(sysdate,-10),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK10,
case when to_char(add_months(sysdate,-11),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK11,
case when to_char(add_months(sysdate,-12),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK12
from workflow_history
where work_end_dtm IS NOT NULL
)
group by action_type
/

The statistics are:

Statistics
---------------------------------------------------
0 recursive calls
0 db block gets
192 consistent gets
0 physical reads
0 redo size
1224 bytes sent via SQL*Net to client
784 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3 rows processed

That's 192 LIOs; I got 2496 when I ran the query you provided previously:

Statistics
---------------------------------------------------
1 recursive calls
0 db block gets
2496 consistent gets
0 physical reads
0 redo size
1224 bytes sent via SQL*Net to client
2254 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
13 sorts (memory)
0 sorts (disk)
3 rows processed

We're looking forward to Tom's return from UKOUG...

Re: Month-wise

Kumar, November 02, 2005 - 11:08 pm UTC

Duke Ganote:

Thanks for your reply. This looks much better than what I wrote.

Curious to see tom's reply also.

Regards
Kumar

Re:

Reader, November 02, 2005 - 11:23 pm UTC

Hi Kumar,

have you tried Jagjeet's query?
Is there anything wrong with it?

Re: Month-wise

Kumar, November 03, 2005 - 2:56 am UTC

Hi,

Jagjeet's query is also correct, and much faster than my query.

Kumar

code compare

Duke Ganote, November 03, 2005 - 5:46 am UTC

Jagjeet used absolute dates, specifically
to_number(to_char(trunc(work_end_dtm),'mm'))
Tom and I used relative dating like Kumar did in his example. Tom's formulation:
trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-11),'mm')

Tom's example (you have to look back right after Kumar's query -- Tom can jump back and answer prior entries) didn't use subqueries, which is cleaner code:

select action_type,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-11),'mm') then action_type end )
dec,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-10),'mm') then action_type end )
jan,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-9 ),'mm') then action_type end )
feb
/* .... */
from workflow_history
where work_end_dtm >= add_months(sysdate,-11)
group by action_type;


LIO compare & "the great disappearing ROWNUM"

Duke Ganote, November 03, 2005 - 9:38 am UTC

The LIOs for Tom's query were comparable to the query I formulated, but his code is cleaner.

But we've strayed from analytic functions... I noticed this behavior, which I call "the great disappearing ROWNUM" when ordering by a constant:

select 'by rownum desc' ordering, rownum, level l
, row_number() over (order by rownum desc) "row_nbr"
from dual connect by level < 5
UNION
select 'by null or constant', rownum, level l
, row_number() over (order by null desc)
from dual connect by level < 5
/
ORDERING                ROWNUM          L    row_nbr
-------------------- ---------- ---------- ----------
by null or constant                      1          1
by null or constant                      2          2
by null or constant                      3          3
by null or constant                      4          4
by rownum desc                1          1          4
by rownum desc                2          2          3
by rownum desc                3          3          2
by rownum desc                4          4          1

9i and 10g show the same results, and I'm trying to understand the usefulness and intentionality of the behavior.

Tom Kyte
November 04, 2005 - 2:32 am UTC

that looks like a bug, I'll file one.

Re: Month Wise

Jagjeet Singh, November 03, 2005 - 10:36 pm UTC

> Jagjeet used absolute dates, specifically
> to_number(to_char(trunc(work_end_dtm),'mm'))
> Tom and I used relative dating like Kumar did in his
> example.

Will it have any effect on output ?

Tom Kyte
November 04, 2005 - 3:36 am UTC

to use relative versus absolute dates? Sure, wait a couple of weeks and see what happens...



"the great disappearing ROWNUM"

Matthias Rogel, November 04, 2005 - 2:42 am UTC

Re: Month Wise (relative vs absolute dates); ROWNUM

Duke Ganote, November 04, 2005 - 6:32 am UTC

There's a bit of requirements confusion because Kumar's original example asked for results like this:

ActionType currentmonthcount month-1_count month-2_count
Add 2 2 1
update 1 1 1
Delete 1 0 0

but in his query, the column labels are absolute ("December", etc) though the calculations are for relative dates.
******************
Matthias Rogel indeed found the same issue I did for ROWNUM vs ROW_NUMBER. However, I can't imagine a real-world query where I'd use both ROWNUM and a ROW_NUMBER ordered by a constant in the same query: ROWNUM would be simpler.

Tom Kyte
November 04, 2005 - 8:50 am UTC

if you can use rownum, you should (top-n query, more efficient).

use row_number only when you need to partition the data and do a top-n by some set of columns
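
A minimal side-by-side against the familiar EMP table:

-- top 3 salaries overall: ROWNUM on an ordered inline view
select *
  from (select * from emp order by sal desc)
 where rownum <= 3;

-- top 3 salaries per department: ROW_NUMBER partitioned by deptno
select *
  from (select emp.*,
               row_number() over (partition by deptno order by sal desc) rn
          from emp)
 where rn <= 3;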

Re: Month-wise

Jagjeet Singh, November 04, 2005 - 8:35 am UTC

> to use relative versus absolute dates? Sure, wait a couple of weeks and see
> what happens...

I was talking about Kumar's query's output.



Tom Kyte
November 04, 2005 - 8:55 am UTC

today - nothing.

but if you use absolute dates versus relative dates - problems come in the future.

Calculate month-wise

Kumar, November 07, 2005 - 1:12 am UTC

Hello Tom,

Many thanks for your reply and suggestion.

Using Analytic Functions within Pro*C code

Prat, November 09, 2005 - 1:20 pm UTC

Hi Tom,

Please could you let us know if we can use analytic functions within Pro*C code. I tried using the DENSE_RANK function in my Pro*C code, and compiled it with my Pro*C precompiler (9.2.0.4), but it appears that it does not recognise the DENSE_RANK syntax.

Thanks,
Prat

Tom Kyte
November 11, 2005 - 10:24 am UTC

you would need to use dynamic SQL for that, unfortunately.

A reader, November 14, 2005 - 3:07 pm UTC

Dear Tom,

My attempts to get the following did not make much head way. Can I seek your help, please?
I want to get the time spent on scattered read for a given active session in the past 60 minutes in buckets of 5 minutes using v$active_session_history.

Thanks in advance.
Shivaswamy


Tom Kyte
November 14, 2005 - 4:24 pm UTC

you would just need to "normalize" the sample time down to 5-minute intervals - no analytics or anything fancy.



sample_time,

to_char(sample_time,'dd-mon-yyyy hh24:')||
to_char(5*trunc(to_number(to_char(sample_time,'mi'))/5),'fm00')


just fix up the "5 minute" part.
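
A fuller sketch (assuming access to v$active_session_history, and that counting its one-per-second samples approximates the time waited; the bind :sid is the target session):

select to_char(sample_time,'dd-mon-yyyy hh24:') ||
       to_char(5*trunc(to_number(to_char(sample_time,'mi'))/5),'fm00') bucket,
       count(*) approx_seconds_waited
  from v$active_session_history
 where session_id = :sid
   and event = 'db file scattered read'
   and sample_time >= systimestamp - interval '60' minute
 group by to_char(sample_time,'dd-mon-yyyy hh24:') ||
          to_char(5*trunc(to_number(to_char(sample_time,'mi'))/5),'fm00')
 order by 1;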

Restriction on CLOB?

Yuan, December 02, 2005 - 11:12 am UTC

Is there a restriction on using CLOB as an argument for LAG() and LEAD()? When I tried, I get:

ORA-00932: inconsistent datatypes: expected - got CLOB

I was unable to find any such restriction in the documentation.

Tom Kyte
December 02, 2005 - 11:39 am UTC

analytics take "expressions", here is a potential workaround:

ops$tkyte@ORA10GR2> select id, x, (select x from t where id = last_id) last_x
  2    from (
  3  select id, x, lag(id) over (order by id) last_id
  4    from t
  5         )
  6  /

        ID X          LAST_X
---------- ---------- ----------
         1 hello
         2 world      hello
 

Restriction on CLOB? (continued)

Yuan, December 02, 2005 - 1:14 pm UTC

I don't have a PK field on the table in question. Can I use ROWID?

Tom Kyte
December 03, 2005 - 9:55 am UTC

you don't have a primary key on a table?????? what?

yes, you can use rowid, but hey - you are missing something here.

No PK

Yuan, December 05, 2005 - 7:36 am UTC

It's a history/audit table of another table. Every iteration of every record on the other table is on this table. This table will only have records added and never updated. Do you still think I need a PK? If so, why?

Tom Kyte
December 06, 2005 - 4:18 am UTC

rowid will be fine.
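
A sketch of the ROWID flavor of that workaround (the ordering column ORD is hypothetical - use whatever defines "previous" in the history table; ROWIDTOCHAR/CHARTOROWID just keep the datatypes explicit):

select rid, x,
       (select x from t where t.rowid = chartorowid(v.last_rid)) last_x
  from (select rowid rid, x,
               lag(rowidtochar(rowid)) over (order by ord) last_rid
          from t) v;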



What is wrong with this ?

A reader, December 12, 2005 - 6:39 pm UTC

I am using
db: oracle 8.1.7.4
os: sun sol v8, 64 bit

dev-->
dev--> CREATE OR REPLACE PROCEDURE select_rec (
2 my_cur OUT credit.cursortype,
3 num_days IN NUMBER
4 )
5 AS
6 BEGIN
7 OPEN my_cur FOR
8 SELECT *
9 FROM (SELECT oh.*, DENSE_RANK() OVER(PARTITION BY ID_GROUP) rn
10 FROM ora_history oh
11 WHERE status = 2)
12 WHERE rn = 1;
13
14 END select_rec;
15 /

Warning: Procedure created with compilation errors.

Elapsed: 00:00:00.00
dev--> show err
Errors for PROCEDURE SELECT_REC:

LINE/COL ERROR
-------- -----------------------------------------------------------------
9/48 PLS-00103: Encountered the symbol "(" when expecting one of the
following:
, from


Do you know why this error occurred?

Missing a table alias???

Mike, December 13, 2005 - 9:07 am UTC

Is it possible that the inner table expression is missing the required "alias" after the parentheses? My formulation of the query seems to work just fine.

SELECT MISSING_TABLE_ALIAS.* FROM
(SELECT oh.*, DENSE_RANK() OVER(PARTITION BY ID_GROUP) rn
FROM ora_history oh
WHERE oh.status = 2) MISSING_TABLE_ALIAS
WHERE MISSING_TABLE_ALIAS.rn = 1;



Tom Kyte
December 13, 2005 - 9:50 am UTC

no, read the above link - 8i PLSQL did not understand analytics or <list of stuff>, the workaround is documented in full there.

Thanks Tom.

A reader, December 13, 2005 - 10:41 am UTC

I have another question. Please let me know if this is possible in a single SQL statement in Oracle 8.1.7.4.

I have following query..

SELECT *
FROM (SELECT oh.*, DENSE_RANK() OVER(PARTITION BY ID , oh_side order by oh_date desc) rn
FROM ora_history oh
WHERE status = 2)
WHERE rn = 1;

now, the dense_rank is created based on partitioning by
id and oh_side. oh_side can assume only 2 values, 1 or 0.
my problem is: when oh_side is 0 I want to sort "order by oh_date desc",
and if oh_side = 1 I want to sort "order by oh_date asc".

is that possible somehow? maybe using a decode inside the over()?
-- over( PARTITION BY ID, oh_side order by oh_date decode(oh_side,0,desc,1,asc) )

possible? good idea?

Tom Kyte
December 13, 2005 - 1:07 pm UTC

partition by id, oh_side
order by decode( oh_side, 0, oh_date) DESC,
         decode( oh_side, 1, oh_date) ASC
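
In context, a sketch against the ORA_HISTORY query above:

select *
  from (select oh.*,
               dense_rank() over (
                   partition by id, oh_side
                   order by decode(oh_side, 0, oh_date) desc,
                            decode(oh_side, 1, oh_date) asc ) rn
          from ora_history oh
         where status = 2)
 where rn = 1;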

Thanks tom.

A reader, December 13, 2005 - 2:32 pm UTC


UKOUG-2005

Reader, December 17, 2005 - 6:11 am UTC

I attended your session at UKOUG 2005 -- SQL Techniques.
I am now going through the slides.
You showed an excellent use of fmt ...
I am unable to find that SQL. Can you help?

Thanks

Ashok


Tom Kyte
December 17, 2005 - 11:17 am UTC

fmt?

or do you mean "fm" the format modifier in to_char?

select owner,
to_number(substr(data,1,10)) cnt,
to_date(substr(data,11,14),'yyyymmddhh24miss') min_created,
to_date(substr(data,25),'yyyymmddhh24miss') max_created
from (
select owner,
(select to_char( count(*), 'fm0000000000') ||
to_char( min(created),'yyyymmddhh24miss') ||
to_char( max(created),'yyyymmddhh24miss')
from big_table.big_table b
where b.owner = a.owner and b.object_type = 'TABLE' ) data
from (
select a.owner
from big_table.big_table_owners a
order by a.owner
) a
where rownum <= 2
)
/


I used fm there to remove the leading whitespace on a fixed-width, positive integer.

First time where .....

Yuan, December 20, 2005 - 3:38 pm UTC

Suppose I have a Sale_History table as such:

CREATE TABLE Sale_History (Dt DATE,
Sale_Id INTEGER,
Status VARCHAR2(1),
Other VARCHAR2(1))
/
INSERT INTO Sale_History VALUES ('1-DEC-05', 1, 'A', '1')
/
INSERT INTO Sale_History VALUES ('2-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('3-DEC-05', 1, 'B', '2')
/
INSERT INTO Sale_History VALUES ('4-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('5-DEC-05', 1, 'A', '1')
/
COMMIT
/

How do I find the earliest time Sale_Id 1 became its current status, this time around? In this example, I'd want to see 12/4, not 12/1.

Tom Kyte
December 20, 2005 - 4:42 pm UTC

added another sale_id to make it "more interesting"

ops$tkyte@ORA10GR2> select * from sale_history order by sale_id, dt;

DT           SALE_ID S O
--------- ---------- - -
01-DEC-05          1 A 1
02-DEC-05          1 A 2
03-DEC-05          1 B 2
04-DEC-05          1 A 2
05-DEC-05          1 A 1
02-DEC-05          2 A 1
03-DEC-05          2 A 2
04-DEC-05          2 B 2
05-DEC-05          2 A 2
06-DEC-05          2 A 1

10 rows selected.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select sale_id, max(min_dt)
  2    from (
  3  select sale_id, min(dt) over (partition by sale_id, maxgrp) min_dt
  4    from (
  5  select sale_id, dt, status,
  6         max(grp) over (partition by sale_id order by dt) maxgrp
  7    from (
  8  select dt, sale_id, status,
  9         case when lag(status) over (partition by sale_id order by dt) <> status
 10              then row_number() over (partition by sale_id order by dt)
 11          end grp
 12    from sale_history
 13         )
 14   where status = 'A'
 15         )
 16         )
 17   group by sale_id
 18  /

   SALE_ID MAX(MIN_D
---------- ---------
         1 04-DEC-05
         2 05-DEC-05


8..12 - start to set up for grouping by marking the beginning of a group.

5..14 - carry the group "down", keep only 'A' groups; contiguous 'A' records now have the same maxgrp id within each sale_id...

3..15 - for each group, assign the min_dt

1..17 - by sale_id, find the maximum date - which is the minimum date of the last group...

 

First time where ..... (continued)

Yuan, December 20, 2005 - 4:00 pm UTC

I think I've found an answer to my question, but please feel free to post a better one:

SELECT A.Status, D.*
  FROM (SELECT Sale_Id, MAX(Dt) Dt
          FROM (SELECT Sale_Id, Dt, Status,
                       LEAD(Status) OVER (PARTITION BY Sale_Id ORDER BY Dt DESC) Next_Status
                  FROM Sale_History)
         WHERE Status != Next_Status
         GROUP BY Sale_Id) D,
       Sale_History A
 WHERE A.Sale_Id = D.Sale_Id
   AND A.Dt = D.Dt

My solution assumes the table is unique by Sale_Id, Dt, which I believe to be an acceptable assumption in my real table, which actually has Timestamp instead of date.

Analytic Functions

Reader123, December 20, 2005 - 4:52 pm UTC

This site is very helpful ,as always.

First time where ..... (continued)

Yuan, December 21, 2005 - 10:33 am UTC

You gave me a breakthrough on how to get around not being able to use analytic functions in the where clause, but I'm very confused about why so many steps were necessary. Wouldn't you be able to do the same thing with this?

select sale_id, max(dt)
  from (
        select dt, sale_id, status,
               case when lag(status) over (partition by sale_id order by dt) <> status
                    then row_number() over (partition by sale_id order by dt)
                end grp
          from sale_history
       )
 where grp is not null
 group by sale_id

Another thing, your solution assumes the final status is 'A', which is not a valid assumption. I've added a couple of rows to your Sale_Id 2 to illustrate this point.

CREATE TABLE Sale_History (Dt DATE,
Sale_Id INTEGER,
Status VARCHAR2(1),
Other VARCHAR2(1))
/
INSERT INTO Sale_History VALUES ('1-DEC-05', 1, 'A', '1')
/
INSERT INTO Sale_History VALUES ('2-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('3-DEC-05', 1, 'B', '2')
/
INSERT INTO Sale_History VALUES ('4-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('5-DEC-05', 1, 'A', '1')
/
INSERT INTO Sale_History VALUES ('02-DEC-05', 2, 'A', '1')
/
INSERT INTO Sale_History VALUES ('03-DEC-05', 2, 'A', '2')
/
INSERT INTO Sale_History VALUES ('04-DEC-05', 2, 'B', '2')
/
INSERT INTO Sale_History VALUES ('05-DEC-05', 2, 'A', '2')
/
INSERT INTO Sale_History VALUES ('06-DEC-05', 2, 'A', '1')
/
INSERT INTO Sale_History VALUES ('07-DEC-05', 2, 'D', '1')
/
INSERT INTO Sale_History VALUES ('08-DEC-05', 2, 'D', '0')
/
COMMIT
/

If you simply remove your line 14 from your query, that seems to work just fine, but wouldn't mine be even simpler? Am I missing something?

One other requirement I forgot to mention. I need to know the current status for each sale. For 1, it'd be 'A' and for 2 'D'. My original solution does that, but I'll try to work with what you've taught me to see if I can use that too.

Tom Kyte
December 21, 2005 - 7:43 pm UTC

good points all - and a good show of why boundary conditions are important...

I can break your example too 


   WHERE grp IS NOT null

(think of something with ALL A's) - but yes, you are right, we can optimize this...

if you added this:

INSERT INTO Sale_History VALUES ('06-DEC-05', 3, 'A', '1')
/


you would not find this record BUT, modify just a bit:

ops$tkyte@ORA10GR2> select sale_id, max(dt)
  2      from (
  3    select dt, sale_id, status,
  4           case when nvl(lag(status) over (partition by sale_id order by dt),'xx') <> status
  5                then row_number() over (partition by sale_id order by dt)
  6            end grp
  7      from sale_history
  8           )
  9      WHERE grp IS NOT null
 10     group by sale_id
 11  /

   SALE_ID MAX(DT)
---------- ---------
         1 04-DEC-05
         2 07-DEC-05
         3 06-DEC-05



and we do - I was thinking "too procedurally" :) 

First time where ..... (continued)

Yuan, December 21, 2005 - 10:45 am UTC

This seems to work:
select *
  from sale_history
 where rowid in (
       select first_value(rid) over (partition by sale_id order by dt desc) rid
         from (
               select dt, sale_id, status,
                      case when lag(status) over (partition by sale_id order by dt) <> status
                           then row_number() over (partition by sale_id order by dt)
                       end grp,
                      rowid rid
                 from sale_history
              )
        where grp is not null
       )

Would I be better off using EXISTS rather than IN on a big table?

Tom Kyte
December 21, 2005 - 7:43 pm UTC

see above, but I'd rather do a single pass, with analytics.

First time where ..... (continued)

Yuan, December 21, 2005 - 11:37 am UTC

Simpler yet:

select *
  from sale_history
 where rowid in (
       select first_value(rid) over (partition by sale_id order by dt desc) rid
         from (
               select dt, sale_id, status,
                      case when lag(status) over (partition by sale_id order by dt) <> status
                           then 1
                       end grp,
                      rowid rid
                 from sale_history
              )
        where grp = 1
       )

First time where ..... (continued)

Yuan, December 22, 2005 - 10:13 am UTC

You're absolutely right! I totally forgot about those that never changed statuses. I thank you and my users would too... if they ever learned of my oversight. 8^D

How about this?

select *
  from sale_history
 where rowid in (
       select first_value(rid) over (partition by sale_id order by dt desc) rid
         from (
               select dt, sale_id, status,
                      case when lag(status) over (partition by sale_id order by dt) <> status
                           then 1
                           when lag(status) over (partition by sale_id order by dt) is null
                           then 1
                       end grp,
                      rowid rid
                 from sale_history
              )
        where grp = 1
       )

I prefer the second when statement to handle this scenario so I never have to worry about a possible new 'xx' status getting added to the list of possible statuses.

Also, I think you missed this part:
<quote>One other requirement I forgot to mention. I need to know the current status for each sale. For 1, it'd be 'A' and for 2 'D'. My original solution does that, but I'll try to work with what you've taught me to see if I can use that
too. </quote>

Your final solution did not provide the status for each sale. That's why my final solution does another pass.

Tom Kyte
December 22, 2005 - 11:05 am UTC

status was a varchar2(1) :)

but yes, this would work just fine - or

decode( lag(status) over (partit.....), status, null, 1 )


(decode is nice for a "compare A to B and include NULLs as being a 'value'")
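
A quick illustration of that NULL behavior:

select decode( null, null, 'match', 'no match' ) x
  from dual;
-- returns 'match'; a "where a = b" or CASE comparison would not treat two NULLs as equal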



First time where ..... (continued)

Yuan, December 22, 2005 - 2:56 pm UTC

Right again. I'll use DECODE. It also saves having to evaluate the lag twice, which my way was doing.

How about getting the status? Do you see a better way than what I'm doing?

Tom Kyte
December 22, 2005 - 5:22 pm UTC

but don't we know the status ? it is "a"

First time where ..... (continued)

Yuan, December 22, 2005 - 8:25 pm UTC

No, A for Sales 1 and 3 in our examples, but D for Sale 2. The status for each Sale would be whatever it was changed to last.

Tom Kyte
December 23, 2005 - 12:16 pm UTC

Ahh, then my first query totally misinterpreted your question - I thought "A" was "Active" and that is what you were looking for.

So, it is the minimum timestamp for the last "group" in SALE_ID (grouping by status)


ops$tkyte@ORA10GR2> select sale_id,
  2         to_date( substr(data,6,14), 'yyyymmddhh24miss') dt,
  3             substr(data,20) status
  4    from (
  5  select sale_id, min( to_char(rn,'fm00000') || to_char(dt,'yyyymmddhh24miss') || status ) data
  6    from (
  7  select sale_id, dt, status,
  8         decode(lead(status) over (partition by sale_id order by dt DESC),
  9                status, to_number(null),
 10                row_number() over (partition by sale_id order by dt DESC)) rn
 11    from sale_history
 12         )
 13   where rn is not null
 14   group by sale_id
 15         )
 16  /

   SALE_ID DT        ST
---------- --------- --
         1 04-DEC-05 A
         2 07-DEC-05 D

 

First time where ..... (continued)

Yuan, December 23, 2005 - 2:28 pm UTC

Very interesting way of doing it. Thanks. I think I've improved upon mine, though. No 2nd pass.

select distinct sale_id,
first_value(dt) OVER (PARTITION BY sale_id ORDER BY dt desc) dt,
FIRST_VALUE(status) OVER (PARTITION BY sale_id ORDER BY dt desc) status
from (select dt, sale_id, status,
decode(lag(status) over (partition by sale_id order by dt), status, NULL, 1) grp
from sale_history
)
WHERE grp = 1

Tom Kyte
December 23, 2005 - 2:51 pm UTC

we do about the same - i grouped by - you distinct-ed. (but you have two buffer sorts, I'll have one)

First time where ..... (continued)

Yuan, December 24, 2005 - 8:04 am UTC

Yes, one point for you there, but mine wouldn't need modification if date turned into timestamp or timestamp precision changed. One point for me?

Tom Kyte
December 24, 2005 - 9:29 am UTC

I didn't know we were racing?

If the precision changed, that would be a datatype change, all queries become suspect and must be looked at - but yes, yours might not have to change, mine would.

First time where ..... (continued)

Yuan, December 27, 2005 - 6:59 am UTC

I'm always racing. Just ask my wife.

Thanks again for everything!

DISTINCT and analytics

VA, December 27, 2005 - 2:02 pm UTC

I have

select col1,col2,col3
from ...
where ...

This gives 20 rows

select distinct col1,col2,col3
from ...
where ...

gives 10 rows but

select distinct col1,col2,col3,count(*) over () cnt
from ...
where ...

Here 'cnt' gives 20 even though only 10 rows are returned.

How can I get the 'cnt' column to give me the real count after the distinct?

Thanks




Tom Kyte
December 27, 2005 - 2:25 pm UTC

select a.*, count(*) over ()
from (select distinct c1, c2, c3 from t ) a




A reader, January 12, 2006 - 10:44 am UTC

hi tom,

could you please give me a hand with the following case:

drop table MyTab ;

create table MyTab(
MyGroup varchar2( 10 ),
MyVersion number,
MyMember varchar2( 10 ),
constraint XPKMyTab primary key( MyGroup, MyVersion, MyMember )
) ;

insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 1, 'Member 1' ) ;
insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 2, 'Member 1' ) ;
insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 2, 'Member 2' ) ;
insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 3, 'Member 1' ) ;
insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 3, 'Member 3' ) ;
commit ;

the challenge (from my current point of view) is to write a SQL statement that doesn't just generate the output of:

select MyGroup, MyVersion, count(*) Cnt
from MyTab
group by MyGroup, MyVersion
order by MyGroup, MyVersion ;

... but also includes additional information about whether CONSECUTIVE 'versions' contain the SAME 'members' or not. For example: 'versions' 2 and 3 of 'group' A contain the same number of 'members' but do not include the same 'members'; therefore 'version' 3 of 'group' A should be marked as "members did change" ...

is this something that can be achieved with analytics such as "lag()" and the like at all (at least pure SQL)?

Tom Kyte
January 12, 2006 - 11:12 am UTC

how big is this table in real life.

A reader, January 12, 2006 - 4:14 pm UTC

best guess: at most 500,000 rows (the set is generated by a query)

Tom Kyte
January 13, 2006 - 10:54 am UTC

You can use a technique like this

a) create a set that has your columns PLUS a count of rows by your group key (data in my example)

b) outer join this result with itself by group/version/member and group by your group by columns

c) compare the counts to decide if the next version has the same members

I expanded your test case to include some examples where the next version was actually the "same":

ops$tkyte@ORA9IR2> select * from myTab;

MYGROUP     MYVERSION MYMEMBER
---------- ---------- ----------
Group A             1 Member 1
Group A             2 Member 1
Group A             2 Member 2
Group A             3 Member 1
Group A             3 Member 3
Group A             4 Member 1
Group A             4 Member 3
Group b             1 Member 1
Group b             2 Member 1
Group b             2 Member 2
Group b             3 Member 1
Group b             3 Member 3

12 rows selected.

ops$tkyte@ORA9IR2> with data
  2  as
  3  (
  4  select mygroup, myversion, mymember, count(*) over (partition by mygroup, myversion) cnt
  5    from mytab
  6  )
  7  select a.mygroup, a.myversion,
  8         case when a.cnt = max(b.cnt) and a.cnt = count(b.mymember)
  9                  then 'Next Group Same'
 10                          else 'Next Group Different'
 11                  end what
 12    from data a, data b
 13   where a.mygroup = b.mygroup(+)
 14     and a.myversion = (b.myversion(+)-1)
 15     and a.mymember = b.mymember(+)
 16   group by a.mygroup, a.myversion, a.cnt
 17  /

MYGROUP     MYVERSION WHAT
---------- ---------- ------------------------------
Group A             1 Next Group Different
Group A             2 Next Group Different
Group A             3 Next Group Same
Group A             4 Next Group Different
Group b             1 Next Group Different
Group b             2 Next Group Different
Group b             3 Next Group Different

7 rows selected.
 

group by composite primary key

apparao, January 13, 2006 - 1:58 am UTC

I have a table A which I am copying into another table B, which has a composite primary key defined on the first two columns. Table A has duplicate rows, so I cannot insert it as-is into table B due to the primary key constraint. What can I do to merge the offending rows of table A into one row and then insert into table B? Any chance of using GROUP BY?

Tom Kyte
January 13, 2006 - 11:12 am UTC

describe how you would "merge them". that bit is missing here - what is your "algorithm" for turning multiple rows into 1.

can we use group by on composite primary key

apparao, January 15, 2006 - 8:41 pm UTC

hi tom, here is my merge logic

Table A
        col1 col2 col3 col4 col5
row one    1    2   10   20 www
row two    1    2   11   10 www

On inserting into
Table B (condition: col1 and col2 form a composite primary key in table B)
col1 col2 col3 col4 col5
   1    2   21   30 www

merge logic: I would like to add the remaining columns for duplicate primary keys (composite primary key composed of col1 and col2).

Assume that for non-numeric columns like col5, table A contains the same values for row 1 and row 2.

regards
apparao

Tom Kyte
January 16, 2006 - 9:04 am UTC

I can see how to get col3 and col4 - but col5 - what if it was

www
yyy

in the two rows, what would col5 be ?

apparao, January 16, 2006 - 8:50 pm UTC

Assume that for non-numeric columns like col5, table A contains the same values for row 1 and row 2.

the business rule is: I add the numeric columns, and for non-numeric columns the values are assumed to be the same, i.e. there will be no case where it will be
www
yyy

it will always be
www
www

or

yyy
yyy

Just in case the previous assumption did not hold and the business rule said "for non-numeric columns select any one arbitrarily" - how would I do that?

I thought about the solution to my original question and came up with the following answer - is there a better way to do it?

select col1,col2 ,sum(col3),sum(col4),max(col5) group by col1,col2


Tom Kyte
January 16, 2006 - 9:07 pm UTC

then you can either do the

select c1, c2, sum(c3), sum(c4), c5
group by c1, c2, c5;


or max(c5) and group by c1, c2


IF your assumption ever proves wrong, that there is a www yyy - then the first one will let you "see it".

apparao, January 16, 2006 - 8:53 pm UTC

sorry, a typo - I meant to say:

select col1,col2 ,sum(col3),sum(col4),max(col5) from tableA group by col1,col2


Thank you for the response

apparao, January 17, 2006 - 11:16 pm UTC

As always Thank you for the timely and quick response.

regards
apparao

analytic function

omer, January 26, 2006 - 8:28 am UTC

Hi Tom!
I have a table lot_temp

 SQL> select * 
  2  from lot_temp
  3  /

    LOT_NO    AVAIL_BALES
---------- --------------
         9              2
         1              3
        12              4
         4              5
         2              8
         3             13
         8              7

7 rows selected.

I have to give 30 bales from all lots, and they should be given equally from every lot (must be an integer amount),
so the result will look like:

SQL> select lot_no, avail_bales, given_bales --calculated
  2  from lot_temp
  3* order by lot_no
SQL> /

    LOT_NO    AVAIL_BALES GIVEN_BALES
---------- -------------- -----------
         1              3           5
         2              8           5
         3             13           4
         4              5           4
         8              7           4
         9              2           4
        12              4           4

7 rows selected.

I am giving 5 bales from lots 1 and 2 because they are older (I have the date in another table, but lot_no can be used to sort).

But given bales should not be greater than available bales; here you can see given bales are greater than available bales.
Lot no 1 has only 3 available bales, so 3 bales should be given from lot 1, and the remaining 27 should be divided equally among the remaining 6 lots.

please help me to solve this problem.
I am using SQL.

Regards 

Tom Kyte
January 26, 2006 - 10:23 am UTC

bin fitting is not something you will be doing easily in SQL - you might be able to cook up something with the model clause (i'm not going to try myself though), but it requires an "algorithm" that can procedurally process the data.

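For illustration, a minimal PL/SQL sketch of one such procedural algorithm (assumptions: a GIVEN_BALES column has been added to LOT_TEMP, and the lots together hold at least the 30 requested bales): lots that cannot cover an equal share give everything they have, smallest first; what remains is split evenly, with the oldest (lowest lot_no) lots absorbing the remainder:

declare
    l_remaining number := 30;   -- total bales to hand out
    l_left      number;         -- lots not yet assigned
    l_share     number;
    l_base      number;
    l_extra     number;
begin
    update lot_temp set given_bales = null;

    select count(*) into l_left from lot_temp;

    -- pass 1: lots too small to cover an equal share give all they have;
    -- process them smallest first, recomputing the share as we go
    for r in ( select lot_no, avail_bales from lot_temp order by avail_bales )
    loop
        l_share := trunc( l_remaining / l_left );
        if r.avail_bales <= l_share then
            update lot_temp set given_bales = avail_bales where lot_no = r.lot_no;
            l_remaining := l_remaining - r.avail_bales;
            l_left      := l_left - 1;
        end if;
    end loop;

    -- pass 2: split what is left evenly; the oldest (lowest lot_no) lots
    -- each take one extra bale so the total comes out exactly right
    if l_left > 0 then
        l_base  := trunc( l_remaining / l_left );
        l_extra := l_remaining - l_base * l_left;

        for r in ( select lot_no from lot_temp
                    where given_bales is null order by lot_no )
        loop
            update lot_temp
               set given_bales = l_base + case when l_extra > 0 then 1 else 0 end
             where lot_no = r.lot_no;
            l_extra := l_extra - 1;
        end loop;
    end if;
end;
/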


try this

Jaromir D.B. Nemec, January 26, 2006 - 1:10 pm UTC

Hi Omer,
There are probably more elegant solutions, but I'll start with this simple approach:
Lots under the average give all their available bales.
The rest of the bales is divided uniformly, with a possible correction of +1 to get the sum right.
See the select below.

select lot_no, avail_bales,
       case when avail_bales <= avg_bal then avail_bales  -- below avg give all
            else div + case when rn <= mod then 1 else 0 end
       end as given_bales
  from (
select lot_no, avail_bales, avg_bal, rn,
       trunc((30 - sum_le_avg) / cnt_gt_avg) div,
       mod((30 - sum_le_avg), cnt_gt_avg) mod
  from (
select lot_no, avail_bales, avg_bal,
       sum(case when avail_bales <= avg_bal then avail_bales else 0 end) over () sum_le_avg,
       sum(case when avail_bales > avg_bal then 1 else 0 end) over () cnt_gt_avg,
       row_number() over (partition by case when avail_bales > avg_bal then 1 else 0 end
                          order by case when avail_bales > avg_bal then avail_bales else 0 end) rn
  from (
select lot_no, avail_bales,
       trunc(30 / count(avail_bales) over ()) avg_bal
  from lot_temp
       )))
order by lot_no;

    LOT_NO AVAIL_BALES GIVEN_BALES
---------- ----------- -----------
         1           3           3
         2           8           5
         3          13           5
         4           5           6
         8           7           5
         9           2           2
        12           4           4

7 rows selected.


Regards,

Jaromir D.B. Nemec


analytic question

omer, January 27, 2006 - 1:17 am UTC

Hi!
OK Tom, would you then solve this in PL/SQL?

Thanks Jaromir for your answer, but as you can see the result is not correct.
At the last level, when you add the average, it can exceed the available bales (lot 4 is given 6 but has only 5 available).

Regards

A SQL solution

Michel Cadot, January 30, 2006 - 9:25 am UTC

Hi,

Here's a query with model and analytic functions.
If you worked with floating point numbers it'd be easy (I put some comments and additional columns in to show how it works):

SQL> select lot_no, avail_bales from lot_temp order by lot_no;
    LOT_NO AVAIL_BALES
---------- -----------
         1           3
         2           8
         3          13
         4           5
         8           7
         9           2
        12           4

7 rows selected.

SQL> def tot_bales=30
SQL> break on report
SQL> compute sum label Total of given_bales on report
SQL> def max_iter=10
SQL> select ln lot_no, ab avail_bales, gb given_bales, rb, rl, pr, ni
  2  from lot_temp
  3  model
  4    dimension by (lot_no ln)
  5    measures (avail_bales ab, cast(null as number) gb,
  6              cast(null as number) rb, cast(null as number) rl,
  7              cast(null as number) pr, 0 as ni)
  8    rules update sequential order 
  9      iterate(&max_iter) 
 10      until (rl[1] = 0)
 11      ( -- rb = remaining bales at the beginning
 12        rb[ANY] = &tot_bales - nvl(sum(gb)[ANY],0),
 13        -- rl = remaining lots at the beginning
 14        rl[ANY] = count(*)[ANY] - count(gb)[ANY],
 15        -- gb = given bales
 16        gb[ANY] = case
 17                  when gb[cv()] is null and rb[cv()]/rl[cv()] >= ab[cv()]
 18                    -- if share does not fit take only available
 19                    then ab[cv()]
 20                  when gb[cv()] is not null
 21                    -- don't change what's already set
 22                    then gb[cv()]
 23                  when pr[cv()] = rl[cv()] and gb[cv()] is null 
 24                    -- no change in the previous iteration -> fill the rest
 25                    then rb[cv()]/rl[cv()]
 26                  end,
 27        -- memorize previous remaining lots
 28        pr[ANY] = rl[cv()],
 29        -- rb = remaining bales at the end
 30        rb[ANY] = &tot_bales - nvl(sum(gb)[ANY],0),
 31        -- rl = remaining lots at the end
 32        rl[ANY] = count(*)[ANY] - count(gb)[ANY],
 33        -- number of iterations (just curious)
 34        ni[ANY] = ni[cv()] + 1
 35      )
 36  order by 1
 37  /
    LOT_NO AVAIL_BALES GIVEN_BALES         RB         RL         PR         NI
---------- ----------- ----------- ---------- ---------- ---------- ----------
         1           3           3 1.0000E-38          0          3          4
         2           8  5.33333333 1.0000E-38          0          3          4
         3          13  5.33333333 1.0000E-38          0          3          4
         4           5           5 1.0000E-38          0          3          4
         8           7  5.33333333 1.0000E-38          0          3          4
         9           2           2 1.0000E-38          0          3          4
        12           4           4 1.0000E-38          0          3          4
                       -----------
Total                           30

7 rows selected.

Now we can enclose this query to get the integer behaviour you want:

SQL> select ln lot_no, ab avail_bales, -- gb, 
  2         case
  3         when gb = trunc(gb) then gb
  4         when sum(ceil(gb)-gb) over(order by ln) 
  5              < sum(gb-trunc(gb)) over()
  6           then ceil(gb)
  7         else trunc(gb)
  8         end given_bales
  9  from ( select ln, ab, gb
 10         from lot_temp
 11         model
 12           dimension by (lot_no ln)
 13           measures (avail_bales ab, cast(null as number) gb,
 14                     cast(null as number) rb, cast(null as number) rl,
 15                     cast(null as number) pr, 0 as ni)
 16           rules update sequential order 
 17             iterate(&max_iter) 
 18             until (rl[1] = 0)
 19             ( -- rb = remaining bales at the beginning
 20               rb[ANY] = &tot_bales - nvl(sum(gb)[ANY],0),
 21               -- rl = remaining lots at the beginning
 22               rl[ANY] = count(*)[ANY] - count(gb)[ANY],
 23               -- gb = given bales
 24               gb[ANY] = case
 25                         when    gb[cv()] is null
 26                             and rb[cv()]/rl[cv()] >= ab[cv()]
 27                           -- if share does not fit take only available
 28                           then ab[cv()]
 29                         when gb[cv()] is not null
 30                           -- don't change what's already set
 31                           then gb[cv()]
 32                         when pr[cv()] = rl[cv()] and gb[cv()] is null 
 33                           -- no change in the previous iteration
 34                           -- -> fill the rest
 35                           then rb[cv()]/rl[cv()]
 36                         end,
 37               -- memorize previous remaining lots
 38               pr[ANY] = rl[cv()],
 39               -- rb = remaining bales at the end
 40               rb[ANY] = &tot_bales - nvl(sum(gb)[ANY],0),
 41               -- rl = remaining lots at the end
 42               rl[ANY] = count(*)[ANY] - count(gb)[ANY],
 43               -- number of iterations (just curious)
 44               ni[ANY] = ni[cv()] + 1
 45             )
 46       )
 47  order by 1
 48  /
    LOT_NO AVAIL_BALES GIVEN_BALES
---------- ----------- -----------
         1           3           3
         2           8           6
         3          13           5
         4           5           5
         8           7           5
         9           2           2
        12           4           4
                       -----------
Total                           30

7 rows selected.

Regards
Michel

 

model clause

Omer, February 01, 2006 - 1:20 am UTC

Hi!
Superb query. I was not aware of this new model clause.
Thanks for this excellent piece of code.

Regards

ROW_NUMBER() ordering

Zahir M, February 03, 2006 - 4:59 pm UTC

I have two identical SQLs (except for the WHERE clause);
the order of the result sets is different, even though the ORDER BY clause in the ROW_NUMBER function is the same in both.
Please explain.

SQL> Select * from ( 
  2    Select ename ,deptno , row_number() over( order by deptno) from emp) ;

ENAME       DEPTNO ROW_NUMBER()OVER(ORDERBYDEPTNO)
------- ---------- -------------------------------
CLARK           10                               1
Zahir           10                               2
Zahir           10                               3
MILLER          10                               4
KING            10                               5
JONES           20                               6
SMITH           20                               7
FORD            20                               8
ADAMS           20                               9
SCOTT           20                              10
WARD            30                              11

ENAME       DEPTNO ROW_NUMBER()OVER(ORDERBYDEPTNO)
------- ---------- -------------------------------
TURNER          30                              12
ALLEN           30                              13
JAMES           30                              14
MARTIN          30                              15
BLAKE           30                              16

16 rows selected.


  1  Select * from (
  2    Select ename ,deptno , row_number() over( order by deptno) rn from emp )
  3*   Where rn < 17
SQL> /

ENAME       DEPTNO         RN
------- ---------- ----------
Zahir           10          1
Zahir           10          2
CLARK           10          3
KING            10          4
MILLER          10          5
SMITH           20          6
JONES           20          7
SCOTT           20          8
ADAMS           20          9
FORD            20         10
ALLEN           30         11

ENAME       DEPTNO         RN
------- ---------- ----------
WARD            30         12
TURNER          30         13
JAMES           30         14
MARTIN          30         15
BLAKE           30         16

SQL> Select version from v$instance;

VERSION
-----------------
10.2.0.1.0 

Follow up to ROW_NUMBER() ordering

Michel Cadot, February 07, 2006 - 4:06 pm UTC

Hi Zahir,

In your row_number you said to order by deptno, and your row numbers are indeed sorted by deptno.

If you want a finer sort then give more columns in your "order by" clause.
If you don't specify an order for the other columns, there is no reason to get a specific order.

Finally, the reason you get a different order is that the query itself has no ORDER BY.
Moreover, you should not count on getting the final result in row number order (rn column) either; that can be yet another order.

Regards
Michel

Rank() Order By

Arjun, March 29, 2006 - 1:47 pm UTC

Still not clear why the order is different.

Tom Kyte
March 29, 2006 - 2:23 pm UTC

because the query doesn't have an order by on it, without the query having an order by - you cannot assume anything about the order of rows in the result set.

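That is, to get the rows back in row-number order, ask for it explicitly - e.g. against the query above:

Select *
  from ( Select ename, deptno, row_number() over( order by deptno ) rn from emp )
 Where rn < 17
 Order by rn;
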
Finding rows between 2 alphanumeric values

TS, March 31, 2006 - 4:55 pm UTC

Tom,

I have a table in which the accounts are alphanumeric,
something like this:-

table_a
=======
accountid
---------
A1234
B123
B235
B9000
C6754

select * from table_a where accountid between 'B100'
and 'B300';

gets me all the rows with accountid's starting with B%
I only want 2 records retrieved:
B123 and B235

Could you please let me know of a way?





Tom Kyte
March 31, 2006 - 5:19 pm UTC

are you sure about that

  1* select * from t where accountid between 'B100' and 'B300'
ops$tkyte@ORA9IR2> /

ACCOUNTID
----------
B123
B235

ops$tkyte@ORA9IR2> select * from t;

ACCOUNTID
----------
A1234
B123
B235
B9000
C6754
 

Finding rows between 2 alphanumeric values

TS, March 31, 2006 - 5:48 pm UTC

OK, my bad. Actually the problem is like this:

Acctid
------
B100
B235
B23456
B11234

select * from table_a where acctid between 'B100' and 'B300';

gets me all four rows. I think the reason is that it compares the strings
character by character, and since the first couple of characters of the 3rd and 4th rows
fall between '100' and '300', it retrieves those rows as well.

Is there a work around for this?


Thanks
TS


Tom Kyte
March 31, 2006 - 7:00 pm UTC

well, I don't know what workaround there could be - you seem to be thinking this is two fields

<string> <number>


if they are in fact in this format, you'd have to do something with substr and to_number()

otherwise, this is working exactly as you would expect. B2<ANYTHING> is by definition between B1.. and B3.. - regardless of what the .. is.

this is how strings "by their very definition" compare.
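
A minimal sketch of that substr/to_number idea, assuming every acctid really is a single letter followed only by digits (if not, the TO_NUMBER will raise an error):

select *
  from table_a
 where substr(acctid,1,1) = 'B'
   and to_number(substr(acctid,2)) between 100 and 300;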

select query from metadata

Tony, April 04, 2006 - 7:02 am UTC

Tom

How are you?

I have a question.

I am creating one select statement from the table metadata.

the metadata table contains

V_COL_NAME | V_SRC_TAB | V_FILTER_CLAUSE | V_GRP_BY_CLAUSE
------------------------------------------------------------
empno | emp | emp.deptno = dept.deptno | empno
empname | dept | | empname
dname | | | dname
sum(sal) | | |

The table creation script is : CREATE TABLE REP_QUERY_DATA(V_COL_NAME VARCHAR2(500 BYTE),
V_SRC_TAB VARCHAR2(500 BYTE),
V_FILTER_CLAUSE VARCHAR2(500 BYTE),
V_GRP_BY_CLAUSE VARCHAR2(500 BYTE)
)

I want to create the select query using this table.

The query will be :

select
empno,
empname,
dname,
sum(sal)
from
emp,
dept
where emp.deptno = dept.deptno
group by
empno,
empname,
dname

can you please guide me?

Tom Kyte
April 04, 2006 - 9:58 am UTC

it is your data - you are storing it, you designed it.


but you cannot put it all together yourself???


I won't even comment on how bad an idea I think this is. The data model you are using is wholly inappropriate. You might want to give a bit more thought to that - I see the need for at least 5 tables to do this right.

Help on Query using analytics

Jonty, April 27, 2006 - 5:24 pm UTC

Hello Tom,

I have following data in my table.

Row# Item Order_Date Order_No
---- ----- ---------- ---------
1 101 1-Jan-2006 9001
2 101 2-Jan-2006 9002
3 101 8-Jan-2006 9003
4 101 16-Jan-2006 9004
5 101 18-Jan-2006 9005

6 102 1-Jan-2006 9006
7 102 5-Jan-2006 9007
8 102 7-Jan-2006 9008
9 102 20-Jan-2006 9009

10 103 5-Jan-2006 9010
11 103 7-Jan-2006 9011
12 103 9-Jan-2006 9012


I need to find the latest group of Orders for each Item.

Group of Orders - any 2 orders whose order_dates are within a 1-week time gap.

In above example,
For
Item 101 - Row # 4,5 will be group # 1
Row # 1,2,3 will be group # 2

Item 102 - Row # 9 will be group # 1
Row # 6,7,8 will be group # 2

Item 103 - Row # 10, 11, 12 will be group # 1

I am only interested in group # 1. Query results should be like this

Item # Order No
-------- ------------
101 9004
101 9005

102 9009

103 9010
103 9011
103 9012

I really appreciate your help on this.

Thanks

Jonty
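
(No answer was posted for this one, but a sketch of one analytic approach, assuming a table jonty_orders(item, order_date, order_no): walking backwards in time, flag each row whose next-newer order is more than 7 days ahead of it - such a row starts an older group - then keep only the rows before the first flag, i.e. the latest group.)

select item, order_no
  from (
select item, order_no,
       sum(new_grp) over (partition by item order by order_date desc) grp
  from (
select item, order_no, order_date,
       case when lag(order_date) over
                 (partition by item order by order_date desc) - order_date > 7
            then 1 else 0 end new_grp
  from jonty_orders
       )
       )
 where grp = 0
 order by item, order_no;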

Summation over week's data

A reader, June 01, 2006 - 6:02 pm UTC

Hi Tom,

Can you please assist me with SQL using analytics to compute the sum of the weekly data for a report? I will then extend it to cover monthly and yearly data.

Here is the sql:

SELECT TRUNC (a_date), COUNT (DISTINCT p_id)
FROM a_table
WHERE (a_str = 'Login')
AND (a_date >= TRUNC (SYSDATE - (TO_CHAR (SYSDATE, 'D') - 1))
AND (a_date < TRUNC (SYSDATE) + 1)
)
GROUP BY (TRUNC (a_date))

Row# TRUNC(A_DATE) COUNT(DISTINCT P_ID)

1 05/28/2006 143
2 05/29/2006 472
3 05/30/2006 3959
4 05/31/2006 4065
5 06/01/2006 3720

On the sixth row, I'd like to have the sum of the count values.

Thanks for your help.

Ma$e

Tom Kyte
June 01, 2006 - 6:24 pm UTC

give me a table and inserts to work with!

(you want a group by rollup apparently.... that might be enough for you to get it...)


scott@ORA10GR2> select trunc(hiredate,'y'), count(empno)
  2    from emp
  3   group by rollup(trunc(hiredate,'y'))
  4  /

TRUNC(HIR COUNT(EMPNO)
--------- ------------
01-JAN-80            1
01-JAN-81           10
01-JAN-82            2
01-JAN-83            1
                    14


The distinct in the select is the culprit....

A reader, June 01, 2006 - 11:41 pm UTC

Hi Tom:

I'm working on getting the DDL and DML for you, but in the meantime here is what I think the problem is.

It is the DISTINCT in the SELECT that is the key.

SQL> SELECT   TRUNC (d_date), COUNT (DISTINCT p_id)
  2      FROM d_table
  3     WHERE (    d_date >=
  4                                   TRUNC (to_date('06-JAN-2006','DD-MON-YYYY') - (TO_CHAR 
  5  (to_date('06-JAN-2006','DD-MON-YYYY'), 'D') - 1))
  6            AND (D_DATE < TRUNC (to_date('06-JAN-2006','DD-MON-YYYY')) + 1)
  7           )
  8  GROUP BY rollup (TRUNC (D_DATE));

TRUNC(D_D COUNT(DISTINCTP_ID)
--------- -------------------
01-JAN-06                   1
02-JAN-06                   1
03-JAN-06                   2
04-JAN-06                   3
05-JAN-06                   2
06-JAN-06                   1
                            6

7 rows selected.

I want the total to be 10. The rollup should perform the sum of the count(...) values.

Basically the SQL is to answer: how many unique user logins have occurred daily, with the total of those at the bottom (i.e. the total of all the daily unique login counts).

However:

SQL> SELECT COUNT(DISTINCT P_ID) FROM D_TABLE WHERE D_DATE BETWEEN '01-JAN-2006'  AND '06-JAN-2006';


COUNT(DISTINCTP_ID)
-------------------
                  5


As usual your assistance is greatly appreciated.

Ma$e 

here is the dml, ddl....

A reader, June 02, 2006 - 12:04 am UTC

SQL> desc d_table;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 P_ID                                      NOT NULL NUMBER(10)
 D_DATE                                    NOT NULL DATE


insert into d_table values (7052,'01-JAN-06');
insert into d_table values (7052,'02-JAN-06');
insert into d_table values (7052,'02-JAN-06');
insert into d_table values (7052,'02-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7058,'04-JAN-06');
insert into d_table values (7040,'04-JAN-06');
insert into d_table values (1000,'04-JAN-06');
insert into d_table values (1000,'04-JAN-06');
insert into d_table values (7040,'04-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7204,'05-JAN-06');
insert into d_table values (7022,'06-JAN-06');

thanks
 

Tom Kyte
June 02, 2006 - 10:23 am UTC

ops$tkyte@ORA10GR2> select dt, sum(cnt)
  2    from (
  3  SELECT   TRUNC (d_date) dt, COUNT (DISTINCT p_id) cnt
  4      FROM d_table
  5     WHERE (    d_date >=
  6           TRUNC (to_date('06-JAN-2006','DD-MON-YYYY') - (TO_CHAR
  7  (to_date('06-JAN-2006','DD-MON-YYYY'), 'D') - 1))
  8            AND (D_DATE < TRUNC (to_date('06-JAN-2006','DD-MON-YYYY')) + 1)
  9           )
 10  GROUP BY (TRUNC (D_DATE))
 11         )
 12  group by rollup(dt);
 
DT          SUM(CNT)
--------- ----------
01-JAN-06          1
02-JAN-06          1
03-JAN-06          2
04-JAN-06          3
05-JAN-06          2
06-JAN-06          1
                  10
 
7 rows selected.
 

Most problems do have simple solutions....if you can think out of the box...

A reader, June 02, 2006 - 11:35 am UTC

Thanks Tom,

I should have expected the solution would be that simple, but I was stuck on a single query and could not think out of the box...

Ma$e


Ami

Ami, June 09, 2006 - 10:37 am UTC

Hi Tom,

I need to create table B from table A. Below is sample data from table A.

col_1 col_2 col_3 col_4
489915 89831 76592641 415082
492207 89831 76592641 415082
638185 89831 76592641 415082
638813 89831 76592641 415082
638172 89831 76592641 415082
492433 89831 76592641 415082

526170 89834 45897659 167876
533538 89834 45897659 167876
531911 89834 45897659 167876

838980 89840 67731620 561277
926029 89840 67731620 561277
925192 89840 67731620 561277
923034 89840 67731620 561277

900096 89844 26091723 2854
863387 89844 794581462 16122
678308 89844 99188161 68802
884397 89844 6952568 599327
882958 89844 6952568 599327
881272 89844 6952568 599327
879516 89844 6952568 599327

487415 89905 1414739 82226
638480 89905 1414739 82226
638135 89905 1414739 82226
636633 89905 1414739 82226
636631 89905 1414739 82226
491705 89905 931873913 158128

The table is unique by col_1.
For table B, I need one record from each group of col_2 based on max(col_1), but if a group has more than 1 value in col_4 then the record I need has to be the record with the max(col_4).

Any ideas on how to do this in sql?
As usual, your advice is greatly appreciated.

Thank you,
Ami



Tom Kyte
June 09, 2006 - 1:22 pm UTC

sure, we could do this...

but without a create table and insert into to demonstrate with, I won't.

You can. It involves analytics - lots of examples of analytics on the site.
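
(For what it's worth, a sketch of that analytic approach, assuming table A's columns are named col_1..col_4 as shown and table B already exists. A single ORDER BY covers both rules: col_4 desc picks max(col_4) when a group has several col_4 values, and col_1 desc breaks the remaining ties by max(col_1).)

insert into b
select col_1, col_2, col_3, col_4
  from (
select a.*,
       row_number() over (partition by col_2
                          order by col_4 desc, col_1 desc) rn
  from a
       )
 where rn = 1;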

Your recommendation

Mark, August 01, 2006 - 11:00 am UTC

Hi Tom,

I'd like your *economic* recommendation of how to go about learning the analytic functions in Oracle 9i R2. Can you recommend a good hands-on book or anything with clear examples? I'm browsing the Oracle docs but am looking for something a bit better and more hands-on.

Regards,
Mark

Tom Kyte
August 01, 2006 - 6:47 pm UTC

They hid the best bits in the data warehousing guide:

http://docs.oracle.com/docs/cd/B10501_01/server.920/a96520/analysis.htm#1020

and I've been told my chapter in "Expert one on one Oracle" (first book, shipped as a CD with the latest book "Expert Oracle Database Architecture") has a pretty readable chapter on it.



row_number()

Su Baba, August 03, 2006 - 12:37 pm UTC

CREATE TABLE x (
name VARCHAR2(10),
theDate DATE
);

INSERT INTO x VALUES ('John', TO_DATE('2006/01/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('John', TO_DATE('2006/04/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('John', TO_DATE('2006/07/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('Mary', TO_DATE('2006/04/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('Mary', TO_DATE('2006/07/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('Ken', TO_DATE('2006/04/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('Ken', TO_DATE('2006/07/01', 'YYYY/MM/DD'));
commit;


SELECT name,
row_number() OVER (PARTITION BY name ORDER BY theDate) rn
FROM x;

NAME RN
---------- ----------
John 1
John 2
John 3
Ken 1
Ken 2
Mary 1
Mary 2

Now, is there any way to use analytic functions to produce rn2 (see below), which numbers across names but stays the same within the same name?

NAME RN RN2
---------- ---------- ---
John 1 1
John 2 1
John 3 1
Ken 1 2
Ken 2 2
Mary 1 3
Mary 2 3


Tom Kyte
August 03, 2006 - 4:25 pm UTC

dense_rank() over (order by name)
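
That is, plugging it into the test case above:

SELECT name,
       row_number() OVER (PARTITION BY name ORDER BY theDate) rn,
       dense_rank() OVER (ORDER BY name) rn2
  FROM x;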

select

Robert Hanrahan, August 10, 2006 - 5:07 am UTC

hi Tom,

I have a question for you:

I have a select like this:
SELECT a.id_contract, a.credit_card, b.num_of_contracts
  FROM big_table a,
       (select credit_card, count(credit_card) num_of_contracts
          from big_table
         group by credit_card
        having count(credit_card) > 1) b
 where a.credit_card = b.credit_card
 order by credit_card, id_contract;

and the output is this:
id_contract credit_card     n
----------- --------------- --
    7349873 36009899812733   2
    7349922 36009899812733   2
    7762917 371232759765007  4
    7766297 371232759765007  4
    7771404 371232759765007  4
    7771426 371232759765007  4

Now, I have to insert this data inside another table in this way:

id_contract1 id_contract2 id_contract3 id_contract4 credit_card
     7349873      7349922         NULL         NULL 36009899812733 (2)
     7762917      7766297      7771404      7771426 371232759765007 (4)

What do I have to do to accomplish this?

let me know

Robert Hanrahan

Tom Kyte
August 10, 2006 - 9:20 am UTC

you'll use row_number() over (partition by credit_card order by id_contract) to assign a number 1, 2, 3, 4... to the rows - so take your query above, call it q1

select q1.*, row_number() over (... )rn
from (Q1) q1

call that q2. then pivot it.

select credit_card,
max( decode( rn, 1, id_contract ) ),
max( decode( rn, 2, id_contract ) ),
...
max( decode( rn, N, id_contract ) ) /* n is max number of id_contracts
per credit_card */
from (Q2)
group by credit_card
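
Put together - a sketch, assuming four contracts is the maximum per credit card (per the sample data), and using an analytic COUNT in place of the original self-join/HAVING:

select credit_card,
       max( decode( rn, 1, id_contract ) ) id_contract1,
       max( decode( rn, 2, id_contract ) ) id_contract2,
       max( decode( rn, 3, id_contract ) ) id_contract3,
       max( decode( rn, 4, id_contract ) ) id_contract4
  from (
select id_contract, credit_card,
       row_number() over (partition by credit_card order by id_contract) rn,
       count(*) over (partition by credit_card) num_of_contracts
  from big_table
       )
 where num_of_contracts > 1
 group by credit_card;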

Variance of First time where

Milan, August 15, 2006 - 9:04 am UTC

Hi Tom,

I have following table:

create table sale_hist
(
tmstmp date,
sale_id number(4),
sale_type_id number(4),
value number(4)
)

insert into sale_hist values (to_date('01082006','ddmmyyyy'), 1, 1, 1);
insert into sale_hist values (to_date('01082006','ddmmyyyy'), 1, 2, 1);
insert into sale_hist values (to_date('01082006','ddmmyyyy'), 1, 3, 1);
insert into sale_hist values (to_date('02082006','ddmmyyyy'), 1, 1, 0);
insert into sale_hist values (to_date('02082006','ddmmyyyy'), 1, 2, 0);
insert into sale_hist values (to_date('02082006','ddmmyyyy'), 1, 3, 1);
insert into sale_hist values (to_date('03082006','ddmmyyyy'), 1, 3, 0);
insert into sale_hist values (to_date('05082006','ddmmyyyy'), 1, 1, 2);
insert into sale_hist values (to_date('05082006','ddmmyyyy'), 1, 2, 3);
insert into sale_hist values (to_date('06082006','ddmmyyyy'), 1, 1, 1);
insert into sale_hist values (to_date('06082006','ddmmyyyy'), 1, 3, 2);

For sale_id I have to sum the latest values for all sale_type_id

I think I can do it with

select
sum(value)
from
(
select distinct sale_id, sale_type_id,
first_value(value) over (partition by sale_id, sale_type_id
order by tmstmp desc) value
from sale_hist
where sale_id = :p_sale_id
and tmstmp <= :p_tmstmp
)

My problem is:
I have to found for every sale_id the latest tmstmp when the sum started to be positive.

I have read the similar questions on this page and I can do it for a single value (LAG ...). But I have a problem with the sum: I cannot relate the tmstmp and sale_id to the previous 'sum' statement. Could you help me, please?

Thanks in advance.
Milan


Tom Kyte
August 15, 2006 - 12:16 pm UTC

I don't know what you mean by "I have to found for every sale_id the latest tmstmp when the sum started to be positive."

Variance of First time where

Milan, August 15, 2006 - 5:20 pm UTC

Hi Tom,
Thanks for the fast response. I'll try to clarify.

I get the following values (the sum of the latest values for all sale_type_id) from the data, for tmstmp <= [1st column] and sale_id = 1

tmstmp <= sale_id sum
01.08.2006 1 3
02.08.2006 1 1
03.08.2006 1 0
04.08.2006 1 0
05.08.2006 1 5
06.08.2006 1 6

For example :p_tmstmp = 06.08.2006
If the sum for :p_tmstmp is positive (sum > 0), I have to find in the history the beginning of the continuous run of positive sums - it is 05.08.2006 in this case.
...
04.08.2006 1 0 -- zero
05.08.2006 1 5 -- positive <=== the beginning of the continuous run
06.08.2006 1 6 -- positive


Tom Kyte
August 15, 2006 - 6:16 pm UTC

do you mean like this:
ops$tkyte%ORA9IR2> select *
  2    from (
  3  select sale_id, tmstmp, value,
  4         max(flag) over (partition by sale_id order by tmstmp DESC) max_flag
  5    from (
  6  select sale_id, tmstmp, value,
  7         case when row_number() over (partition by sale_id order by tmstmp DESC) = 1
  8                  then 1
  9                          when value <= 0
 10                          then 2
 11                  end flag
 12    from sale_hist
 13         )
 14         )
 15   where max_flag = 1
 16   order by sale_id, tmstmp DESC
 17  /

   SALE_ID TMSTMP         VALUE   MAX_FLAG
---------- --------- ---------- ----------
         1 06-AUG-06          1          1
         1 06-AUG-06          2          1
         1 05-AUG-06          2          1
         1 05-AUG-06          3          1
 

Variance of First time where

Milan, August 16, 2006 - 4:29 am UTC

Hi Tom,
Thanks for your effort. OK, it works fine for the value.

But I have to work with the sum defined above (the sum of the latest values for all sale_type_id for one sale_id) instead of the value. My problem is substituting the sum for the value in your query.

I can do it with PL/SQL function, but I would like to solve it with one query (if it is possible).

Tom Kyte
August 16, 2006 - 8:35 am UTC

you need to really lay this out like a specification.

Pretend for a moment I was a coder.
Now, write the specification that a coder can use to produce code from.

Variance of First time where

Milan, August 16, 2006 - 10:42 am UTC

Hi Tom,

here is specification:

I have a history information about values.

The value cannot be empty; it can be negative, zero, or positive. The value is of some type. The value belongs to some object. The value relates to some timestamp.

The object can have a lot of values of various (or the same) types. These values can relate to various (or the same) timestamps, but the combination of object, type, value, timestamp has to be unique.

I need a SUM of latest values of every type - for the object and given timestamp.

Example:
timestamp object type value
---------- ------ ---- -----
...
03.08.2006 1 3 0
05.08.2006 1 1 2
05.08.2006 1 2 3 <-- the latest value for type = 2 and timestamp <= 06.08.2006
06.08.2006 1 1 1 <-- the latest value for type = 1 and timestamp <= 06.08.2006
06.08.2006 1 3 2 <-- the latest value for type = 3 and timestamp <= 06.08.2006


My SUM is then 1 + 3 + 2 = 6 for object = 1 and timestamp <= 06.08.2006
My SUM is then 2 + 3 + 0 = 5 for object = 1 and timestamp <= 05.08.2006
...

If this SUM is positive, I have to find a timestamp in the history - the beginning of the continuous run of positive SUMs

Example:
timestamp object SUM for timestamp
---------- ------ -----------------
...
04.08.2006 1 0 -- zero
05.08.2006 1 5 -- positive <=== the beginning of the continuous run
06.08.2006 1 6 -- positive

That's all. Sorry, I cannot express it better.

========================
create table object_hist
(
tmstmp date,
object_id number(4),
value_type number(4),
value number(4)
)

insert into object_hist values (to_date('01082006','ddmmyyyy'), 1, 1, 1);
insert into object_hist values (to_date('01082006','ddmmyyyy'), 1, 2, 1);
insert into object_hist values (to_date('01082006','ddmmyyyy'), 1, 3, 1);
insert into object_hist values (to_date('02082006','ddmmyyyy'), 1, 1, 0);
insert into object_hist values (to_date('02082006','ddmmyyyy'), 1, 2, 0);
insert into object_hist values (to_date('02082006','ddmmyyyy'), 1, 3, 1);
insert into object_hist values (to_date('03082006','ddmmyyyy'), 1, 3, 0);
insert into object_hist values (to_date('05082006','ddmmyyyy'), 1, 1, 2);
insert into object_hist values (to_date('05082006','ddmmyyyy'), 1, 2, 3);
insert into object_hist values (to_date('06082006','ddmmyyyy'), 1, 1, 1);
insert into object_hist values (to_date('06082006','ddmmyyyy'), 1, 3, 2);



Tom Kyte
August 16, 2006 - 11:16 am UTC

lost me here:
My SUM is then 1 + 3 + 2 = 6 for object = 1 and timestamp <= 06.08.2006
My SUM is then 2 + 3 + 0 = 5 for object = 1 and timestamp <= 05.08.2006

are we missing something like "this sum should be always the sum of the three most current records"???


reading on - you keep using "sum" but seem to be looking at VALUE only.


(and the example is uninteresting to test with until you add object "2" or some other object to analyze - unless what's missing from the spec is "we always look at one object only")

Could you rewrite this query to have less elapsed time.

Sreenivasa Rao, August 22, 2006 - 9:14 pm UTC

Hi Tom,
Currently this query takes more than 30 minutes and sometimes fails with an ORA-01555 error.
Could you rewrite this query to execute in a shorter span of time?
-------------------------------------------
select substr(a.owner,1,10),
substr(a.segment_name,1,40) "Table name",
substr(round(a.bytes),1,8) "Tsizes",
substr(b.index_name,1,40) "Index name",
substr(round(b.bytes),1,8) "Isizes"
from
(select owner,
segment_name,
bytes/1024/1024 bytes
from dba_segments
where segment_type = 'TABLE'
and owner not in ('SYS','SYSTEM','PERFSTAT','OUTLN','DBSNMP')
and segment_type not like '%PARTIT%') a,
(select dba_indexes.owner,
table_name,index_name,bytes/1024/1024 bytes
from dba_indexes,dba_segments
where dba_indexes.owner = dba_segments.owner
and index_name=segment_name
and dba_indexes.owner
not in ('SYS','SYSTEM','PERFSTAT',
'OUTLN','DBSNMP')) b
where a.owner = b.owner
and a.segment_name = b.table_name
and (b.bytes - a.bytes) > 10
------------------------------------------------

Thanks in advance,
Sreenivas.

Tom Kyte
August 27, 2006 - 3:14 pm UTC

what is the purpose of such a query? what question is it trying to answer here.

the b.bytes-a.bytes part looks "interesting", what is that about?

A query on analytics

Ravi, August 23, 2006 - 3:24 am UTC

Tom,
My test case:

create table t3 (a number, b number)
/



insert into t3 values(1,2);
insert into t3 values(2,2);
insert into t3 values(3,3);
insert into t3 values (4,4);
insert into t3 values (1,2);
insert into t3 values(2,3);

My query:

SELECT a, b, COUNT ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) y2
, LAST_VALUE ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) b3_2
, FIRST_VALUE ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) b3_2_2
, SUM ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) x1
, ROW_NUMBER ( ) OVER ( PARTITION BY b ORDER BY a DESC ) b4
, ROW_NUMBER ( ) OVER ( PARTITION BY b ORDER BY a ASC ) b4_2
, FIRST_VALUE ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) b5
, LAST_VALUE ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) b6
, MIN ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) b7
, MAX ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) b9
, DENSE_RANK ( ) OVER ( PARTITION BY b ORDER BY a DESC ) b9_2
, DENSE_RANK ( ) OVER ( PARTITION BY b ORDER BY a ASC ) b9_2_1
, SUM ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) x1
, SUM ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) x5
, MAX ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) b9_3
, MIN ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) b9_3_2
, SUM ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) x5
, COUNT ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) y1
FROM t3
ORDER BY b DESC

Please can you confirm whether I am right in saying, in all cases with the default RANGE UNBOUNDED PRECEDING:

1) If I'd ordered by a column DESC, there can be only ONE unique First_Value and Max value for the ordered column.

1b) However I might expect the LAST_VALUE to keep changing as I traverse down the rows, depending on the ordered column

2)If I'd ordered by a column ASC, there can be only ONE unique Last_Value and Min value for the ordered column.

2b) However I might expect the FIRST_VALUE to keep changing as I traverse down the rows, depending on the ordered column

Tom Kyte
August 27, 2006 - 3:19 pm UTC

1) if you order by a column desc, then first_value and max will be the same.

I am not sure what you mean by "there can be only ONE unique ..."??  there is always only ONE first_value and always only ONE max - by definition.  Whether they are unique in the grand scheme of things - I cannot say.


1b) by definition, yes.  the window is growing "down" like pulling a window shade down.  the bottom is getting further from the top.

2) No, last_value is changing.  Just like it did when you ordered desc

ops$tkyte%ORA10GR2> select x,
  2         min(x) over (order by x asc) min_x,
  3             last_value(x) over (order by x asc) last_x
  4    from t;

         X      MIN_X     LAST_X
---------- ---------- ----------
         1          1          1
         2          1          2


2b) no, by definition, the top of the window doesn't move

ops$tkyte%ORA10GR2> select x,
  2         min(x) over (order by x asc) min_x,
  3         last_value(x) over (order by x asc) last_x,
  4         first_value(x) over (order by x asc) first_x
  5    from t;

         X      MIN_X     LAST_X    FIRST_X
---------- ---------- ---------- ----------
         1          1          1          1
         2          1          2          1

 

Could you rewrite this query to have less elapsed time.

Sreenivasa Rao, August 28, 2006 - 4:20 am UTC

Hi Tom,

I know you will be surprised to see this reply.
Here we have a procedure (rather, it is a standard) to rebuild an index if its size exceeds the size of the corresponding table.

So please rewrite the same query to shorten its elapsed time.

Regards,
Sreenivas.

Tom Kyte
August 28, 2006 - 10:48 am UTC

why do you do that?

I'm not surprised, I'd love to hear the reasoning - don't you end up rebuilding the same indexes over and over and over again? So this doesn't actually save "disk space" over time, when you measured performance - does it go up, down or sideways after you do this?

wouldn't it be funny if

a) this didn't affect performance noticeably
b) affected it negatively

given the amount of resources you expend running the query....




Could you rewrite this query to have less elapsed time.

Sreenivasa Rao, August 29, 2006 - 3:27 pm UTC

Hi Tom,

Here we don't worry about space for merely 1 or 2 GB. When it comes to performance, we rebuild indexes in off-peak hours since it is a DSS environment.

The only requirement is that when the process is initiated, it must not be affected by index fragmentation,
since every day we do big uploads to the database at a specified time.

Regards,
Sreenivas.


Tom Kyte
August 29, 2006 - 4:46 pm UTC

I do not think you got my point.

Have you ever measured the results of this?!?

(meaning, if you don't do it - what then? has anyone taken the time to actually verify this is something you WANT TO BE DOING, or do you just do it "because"?)

Converting row values to a single column

Jonty, September 21, 2006 - 4:15 pm UTC

Hi,

Dept, Emp columns in a table.

I want to get results like

Dept1, 'emp1, emp2, emp3'
Dept2, 'emp11, emp22'
Dept3, 'emp9'

How can I get this ? Can Analytical function helps on this ?

Response will be highly appreciated.

Thanks

Jonty



Tom Kyte
September 22, 2006 - 2:48 am UTC

how about looking in the other place(s) you asked this identical question?

Which is better

Chris, October 27, 2006 - 11:42 am UTC

Tom,
I have a result set that has unique groupings with many attribute columns, one of which is a value. The value column is the same for every row within a group. I want to add a column to my select statement that is the sum of the per-group values over the whole set. I have shown sample data below (where ttl is the column I'd like to add).

group value col1 ttl
----- ----- ----- ---
1 5 A 20
1 5 B 20
2 10 C 20
2 10 D 20
2 10 E 20
3 5 F 20

I was hoping to do this without a nested subquery, but can't seem to think of a way to do it. At this point my options seem to be:

A: use row_number() to sum only the first row's value in each group

select group_col, value_col, col1, sum(rnval) over () ttl
from (
select group_col, value_col, col1,
decode(row_number() over (partition by group_col
order by group_col),
1, value_col, 0) rnval
from mytab)

B:

select group_col, value_col, col1,
sum(decode(drank,1,value_col,0)) over () ttl
from (
select group_col, value_col, col1,
dense_rank() over(partition by group_col order by group_col, rownum) drank
from mytab)

Would one of these be better than the other, and do you have yet a better approach you can think of?

Tom Kyte
October 27, 2006 - 6:24 pm UTC

just use row_number() - seems the most intuitive, the dense_rank is just plain confusing and by referencing rownum - you have to sort of pre-materialize stuff.



Analytical function to add values of two columns

Robin, December 08, 2006 - 9:31 am UTC

Let's say you have an employee who works in two different departments.

So KING may work for dept 10 and dept 20.

His records might look like this:

DEPTNO EMPNO NAME SAL MON TUE WED
10 222 KING 1000 4 6 8
20 222 KING 400 4 2 1


We want to make sure that KING doesn't work more than a total of 40 hours in any one week (we don't want to pay him OT). But we don't care if he works over 8 on a particular day. Neither supervisor knows how many hours KING works for the other dept. He usually works 5 days a week (but I could only show part of the table without screwing up formatting).

Anyway, I can use the following query to find the total for KING's daily hours.

SELECT EMPNO, NAME, SUM(MON) AS T_MON, SUM(TUE) AS T_TUE, SUM(WED) AS T_WED, SUM(THU) AS T_THU, SUM(FRI) AS T_FRI
FROM VW_WORKWEEK
GROUP BY EMPNO, NAME;

Results (with addition days):

EMPNO NAME T_SAL T_MON T_TUE T_WED T_THU T_FRI
222 KING 1400 8 8 9 10 8

But when I try to total the columns I get into trouble. What I want it to look like in the end is:

EMPNO NAME T_SAL T_WRK_WEEK
222 KING 1400 43

I've tried:

SELECT EMPNO, NAME, SUM(SAL) AS T_SAL, (SUM(MON) + SUM(TUE) + SUM(WED) + SUM(THU) + SUM(FRI)) AS T_WRK_WEEK
FROM VW_HOUR_COUNT
GROUP BY EMPNO, NAME;

Results:

EMPNO NAME T_SAL T_WRK_WEEK
222 KING 1400

Another try:

SELECT EMPNO, NAME, SUM(SAL) as T_SAL
FROM VW_WORKWEEK
GROUP BY EMPNO, NAME
HAVING (SUM(MON) + SUM(TUE) + SUM(WED) + SUM(THU) + SUM(FRI)) > 40;

Results (no rows returned):

EMPNO NAME T_SAL T_WRK_WEEK

And finally I've tried:

SELECT EMPNO, NAME, SUM(SUM(MON) AS T_MON, SUM(TUE) AS T_TUE, SUM(WED) AS T_WED, SUM(THU) AS T_THU, SUM(FRI) AS T_FRI)
FROM VW_WORKWEEK
GROUP BY EMPNO, NAME;

I get:

ORA-00923: FROM keyword not found where expected

Is it possible to do this?

thanks!




Tom Kyte
December 09, 2006 - 12:19 pm UTC

one of the sums must have been null


nvl(sum(mon),0)+nvl......
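
That is, something like:

SELECT EMPNO, NAME, SUM(SAL) AS T_SAL,
       NVL(SUM(MON),0) + NVL(SUM(TUE),0) + NVL(SUM(WED),0)
     + NVL(SUM(THU),0) + NVL(SUM(FRI),0) AS T_WRK_WEEK
  FROM VW_WORKWEEK
 GROUP BY EMPNO, NAME;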

I figured it out.

Robin, December 08, 2006 - 12:33 pm UTC

I forgot to include the NVL() function. DUH

Thanks anyway!

Appropriate indexing?

John Howard, January 23, 2007 - 11:19 am UTC

Tom,

Although loosely related to analytics this is more a question of indexing strategy. Sorry if it is a little off topic.

To illustrate my question I'll use the table ORDERS that comes with the OE sample schema (9.2).

 
SQL> desc oe.orders
 Name                      Null?    Type
 ------------------------- -------- --------------------------------------------
 ORDER_ID                  NOT NULL NUMBER(12)
 ORDER_DATE                NOT NULL TIMESTAMP(6) WITH LOCAL TIME ZONE
 ORDER_MODE                         VARCHAR2(8)
 CUSTOMER_ID               NOT NULL NUMBER(6)
 ORDER_STATUS                       NUMBER(2)
 ORDER_TOTAL                        NUMBER(8,2)
 SALES_REP_ID                       NUMBER(6)
 PROMOTION_ID                       NUMBER(6)



What I want is the most recent order, with a not null SALES_REP_ID, for each customer. That is for each CUSTOMER_ID I want the last record with a not null SALES_REP_ID when ordered by ORDER_DATE. I know that I can do this easily with the RANK() function :-

select customer_id,
        order_date,
        sales_rep_id,
        order_id
from
(select         customer_id,
                rank() over (partition by customer_id order by order_date desc) drank,
                order_date,
                sales_rep_id,
                order_id
 from   oe.orders
 where  sales_rep_id is not null
)
where drank = 1
/

CUSTOMER_ID ORDER_DATE                     SALES_REP_ID   ORDER_ID
----------- ------------------------------ ------------ ----------
        157 20-NOV-99 05.22.53.224175 AM            163       2398
        158 20-NOV-99 06.22.38.340990 AM            161       2399
        159 10-JUL-99 07.34.29.559387 AM            161       2400
        160 10-JUL-99 08.22.53.554822 AM            163       2401
        161 02-JUL-99 09.34.44.665170 AM            154       2402
....

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=70 Bytes=4550)
   1    0   VIEW (Cost=4 Card=70 Bytes=4550)
   2    1     WINDOW (SORT PUSHED RANK) (Cost=4 Card=70 Bytes=1680)
   3    2       TABLE ACCESS (FULL) OF 'ORDERS' (Cost=2 Card=70 Bytes=1680)


Now this is ok when the table is of such a size that a full scan is bearable. My question is regarding the best way (if any) to use an index to reduce the cost when the table becomes large but the number of customers remains fairly small. I was hoping that something like: -

 (CUSTOMER_ID, ORDER_DATE DESC, SALES_REP_ID) 


could be "skip-scanned" in some way so that for each CUSTOMER_ID we would only need to traverse along to the first entry with non-null SALES_REP_ID then skip to the next CUSTOMER_ID.
Anything I've tried so far full scans the index.

Am I barking up the wrong tree - or am I just barking?

If so - what is the preferred method for keeping a query like this from becoming slower as time goes by?

Many thanks in advance,
John

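(No follow-up was posted here, but one common trick for "only the non-null rows matter" queries is a function-based index that is entirely null for the uninteresting rows - rows where every indexed expression is null are not stored in a B*Tree index at all, so the index stays small. A sketch; whether the optimizer will actually use it for the window sort has to be verified against your data:

create index orders_rep_idx on oe.orders
( case when sales_rep_id is not null then customer_id end,
  case when sales_rep_id is not null then order_date  end );

The inner query would then need to be phrased in terms of those same CASE expressions for the index to be considered.)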


Even Distribution

Ed, February 14, 2007 - 6:19 pm UTC

Hi Tom,
I'd like to ask about "Even-Distribution"
(May not be the right term)

Say I have a bunch of bowling players of different skill levels, as indicated by avg_score in the table below.

I need to allot them to n teams (say 8) of equivalent strength at the TEAM level, so no team ends up with mostly high-scorers and vice versa.

(and yes, let's say the players cannot be evenly divided into the teams)

Would you shed some light on how to achieve this ?

Thanks

10gR2> create table players (id integer primary key, avg_score number, team_no integer) ;

10gR2> desc players
Name      Type   
--------- -------
ID        INTEGER                           
AVG_SCORE NUMBER                         
TEAM_NO   INTEGER

10gR2> BEGIN
  2    FOR i IN 1..113
  3    LOOP
  4        INSERT INTO players (id, avg_score)  VALUES(i,round(dbms_random.value(75,295)));
  5    END LOOP;
  6  END ;
  7  /

10gR2> commit;

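(No answer was posted here, but a common approach is a "snake draft": rank the players by avg_score and deal them out to teams 1..8, then 8..1, and so on, so every team draws one player from each band of eight. A sketch, using the table as created above:

merge into players p
using ( select id,
               case when mod( ceil(rn/8), 2 ) = 1
                    then mod(rn-1, 8) + 1      -- odd rounds deal teams 1..8
                    else 8 - mod(rn-1, 8)      -- even rounds deal teams 8..1
               end team
          from ( select id,
                        row_number() over (order by avg_score desc) rn
                   from players )
      ) s
   on ( p.id = s.id )
 when matched then update set p.team_no = s.team;
)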

Analytic Function?

Bea, April 11, 2007 - 3:36 pm UTC

Hi Tom,

I have this table with these values:

YEAR VALUE1
------------
2007 1.04
2008 1.04
2009 1.04
2010 1.04
2011 1.04

And I have to get this result

YEAR VALUE2
-----------
2007 1.04    (same value)
2008 1.0816  (1.04 (2007) * 1.04 (2008))
2009 1.12486 (1.0816 (2008) * 1.04 (2009))
2010 1.16985 (1.12486 (2009) * 1.04 (2010))
2011 1.21664 (1.16985 (2010) * 1.04 (2011))

That is:
For 2007, the same value in the table
For 2008, the value from the table*previous value in 2007
For 2009, the value from the table*calculated previous value in 2008

and so on

Thanks!!!





Tom Kyte
April 11, 2007 - 5:53 pm UTC

ops$tkyte%ORA10GR2> select year, v, exp( sum( ln(v) ) over (order by year) )
  2    from t;

      YEAR          V EXP(SUM(LN(V))OVER(ORDERBYYEAR))
---------- ---------- --------------------------------
      2007       1.04                             1.04
      2008       1.04                           1.0816
      2009       1.04                         1.124864
      2010       1.04                       1.16985856
      2011       1.04                        1.2166529

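(The trick: a running product can be rewritten as the exponential of a running sum of logarithms - v1*v2*...*vn = exp( ln(v1) + ln(v2) + ... + ln(vn) ) - and a running sum is exactly what the analytic SUM ... OVER (ORDER BY ...) computes. Note that LN requires every value to be strictly positive.)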

Eka, April 11, 2007 - 9:39 pm UTC

Hi Tom,

Here's the table:

SQL> create table test(goods_id varchar2(1),trx_date date,lastprod_qty number);
SQL> insert into test values('A',to_date('010407','ddmmyy'),20);
SQL> insert into test values('A',to_date('030407','ddmmyy'),0);
SQL> insert into test values('A',to_date('040407','ddmmyy'),0);
SQL> insert into test values('A',to_date('060407','ddmmyy'),80);
SQL> insert into test values('B',to_date('010407','ddmmyy'),50);
SQL> insert into test values('B',to_date('060407','ddmmyy'),10);
SQL> insert into test values('B',to_date('090407','ddmmyy'),0);

SQL> select * from test order by goods_id,trx_date;

GOODS_ID TRX_DATE    LASTPROD_QTY
-------- ----------- ------------
A        4/1/2007              20
A        4/3/2007               0
A        4/4/2007               0
A        4/6/2007              80
A        4/10/2007              0
B        4/1/2007              50
B        4/6/2007              10
B        4/9/2007               0

8 rows selected

The goal is to update the lastprod_qty column: wherever it is 0, replace it with the last non-zero value. I want the output to look like this:
GOODS_ID TRX_DATE    LASTPROD_QTY
-------- ----------- ------------
A        4/1/2007              20
A        4/3/2007              20
A        4/4/2007              20
A        4/6/2007              80
A        4/10/2007             80
B        4/1/2007              50
B        4/6/2007              10
B        4/9/2007              10

I can easily accomplish that using PL/SQL, looping through the rows one by one, but I wonder: is there a better solution? This table will be very big, and I hope I can use a single SQL statement, since it will be much faster, but I don't have any idea how to do that. I tried the LAG analytic function, but it didn't help in this case.
Tom, do you have any suggestion?

Thanks in advance,
Eka


Tom Kyte
April 12, 2007 - 8:57 am UTC

ops$tkyte%ORA10GR2> select *
  2    from (
  3  select rowid rid, goods_id, trx_date, lastprod_qty,
  4         last_value( decode(lastprod_qty,0,to_number(null),lastprod_qty) ignore nulls) over (partition by goods_id order by trx_date) last_qty
  5    from test
  6         )
  7   where lastprod_qty = 0;

RID                G TRX_DATE  LASTPROD_QTY   LAST_QTY
------------------ - --------- ------------ ----------
AAARU+AAEAAAAHPAAB A 03-APR-07            0         20
AAARU+AAEAAAAHPAAC A 04-APR-07            0         20
AAARU+AAEAAAAHPAAE A 10-APR-07            0         80
AAARU+AAEAAAAHPAAH B 09-APR-07            0         10

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> merge into test
  2  using(
  3  select *
  4    from (
  5  select rowid rid, goods_id, trx_date, lastprod_qty,
  6         last_value( decode(lastprod_qty,0,to_number(null),lastprod_qty) ignore nulls) over (partition by goods_id order by trx_date) last_qty
  7    from test
  8         )
  9   where lastprod_qty = 0
 10      ) test2
 11  on (test.rowid = test2.rid)
 12  when matched then update set lastprod_qty = test2.last_qty;

4 rows merged.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from test order by goods_id, trx_date;

G TRX_DATE  LASTPROD_QTY
- --------- ------------
A 01-APR-07           20
A 03-APR-07           20
A 04-APR-07           20
A 06-APR-07           80
A 10-APR-07           80
B 01-APR-07           50
B 06-APR-07           10
B 09-APR-07           10

8 rows selected.



or for 9i

ops$tkyte%ORA9IR2> select *
  2    from (
  3  select rowid rid, goods_id, trx_date, lastprod_qty,
  4         to_number(substr(max( case when lastprod_qty <> 0 then to_char(trx_date,'yyyymmdd')||lastprod_qty end) over (partition by goods_id order by trx_date),9)) last_qty
  5    from test
  6         )
  7   where lastprod_qty = 0;

RID                G TRX_DATE  LASTPROD_QTY   LAST_QTY
------------------ - --------- ------------ ----------
AAAJrxAAJAAAAGWAAB A 03-APR-07            0         20
AAAJrxAAJAAAAGWAAC A 04-APR-07            0         20
AAAJrxAAJAAAAGWAAE A 10-APR-07            0         80
AAAJrxAAJAAAAGWAAH B 09-APR-07            0         10

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> merge into test
  2  using
  3  (
  4  select *
  5    from (
  6  select rowid rid, goods_id, trx_date, lastprod_qty,
  7         to_number(substr(max( case when lastprod_qty <> 0 then to_char(trx_date,'yyyymmdd')||lastprod_qty end) over (partition by goods_id order by trx_date),9)) last_qty
  8    from test
  9         )
 10   where lastprod_qty = 0
 11  ) test2
 12  on (test.rowid = test2.rid)
 13  when matched then update set lastprod_qty = last_qty
 14  when not matched then insert (goods_id) values (null) /* never happens! */
 15  /

4 rows merged.

ops$tkyte%ORA9IR2> select * from test order by goods_id, trx_date;

G TRX_DATE  LASTPROD_QTY
- --------- ------------
A 01-APR-07           20
A 03-APR-07           20
A 04-APR-07           20
A 06-APR-07           80
A 10-APR-07           80
B 01-APR-07           50
B 06-APR-07           10
B 09-APR-07           10

8 rows selected.


To Eka

Michel CADOT, April 12, 2007 - 2:22 am UTC


SQL> select * from test order by goods_id,trx_date;

G TRX_DATE   LASTPROD_QTY
- ---------- ------------
A 01/04/2007           20
A 03/04/2007            0
A 04/04/2007            0
A 06/04/2007           80
B 01/04/2007           50
B 06/04/2007           10
B 09/04/2007            0

7 rows selected.

SQL> select goods_id, trx_date, lastprod_qty,
  2         last_value(nullif(lastprod_qty,0) ignore nulls) 
  3           over (partition by goods_id order by trx_date) new_qty
  4  from test
  5  order by goods_id, trx_date
  6  /

G TRX_DATE   LASTPROD_QTY    NEW_QTY
- ---------- ------------ ----------
A 01/04/2007           20         20
A 03/04/2007            0         20
A 04/04/2007            0         20
A 06/04/2007           80         80
B 01/04/2007           50         50
B 06/04/2007           10         10
B 09/04/2007            0         10

7 rows selected.

SQL> update test a 
  2  set lastprod_qty = 
  3        ( select new_qty
  4          from ( select last_value(nullif(lastprod_qty,0) ignore nulls) 
  5                          over (partition by goods_id order by trx_date) new_qty,
  6                        rowid row_id
  7                 from test ) b
  8          where a.rowid = b.row_id )
  9  where lastprod_qty = 0
 10  /

3 rows updated.

SQL> select * from test order by goods_id,trx_date;

G TRX_DATE   LASTPROD_QTY
- ---------- ------------
A 01/04/2007           20
A 03/04/2007           20
A 04/04/2007           20
A 06/04/2007           80
B 01/04/2007           50
B 06/04/2007           10
B 09/04/2007           10

7 rows selected.

Regards
Michel

Thanks Michel

eka@i4te.com, April 12, 2007 - 3:58 am UTC


What is the most efficient way to get top N rows

Shree TVM, April 13, 2007 - 12:02 am UTC

Dear Tom,
We have an application that queries the Top-N rows, based on a filter, from a very large table. We are currently using the DENSE_RANK() function like this:

select a.*
from (select c1,c2,c3, dense_rank() over (order by c1 desc,c3 desc) as rnk
from t1
where c1 < 12345
and c3 < sysdate
and c2 = 'XYZ' ) a
where rnk < :N
order by a.c1,a.c3;

Is this the most efficient way to achieve this?

I have been a reader for many, many years and am posting a question for the very first time.

Tom Kyte
April 13, 2007 - 2:04 pm UTC

why do you order by c1 desc, c3 desc in the rank - but not the final output?

What is the most efficient way to get top N rows

Shree, April 14, 2007 - 9:55 am UTC

The application expects the rowset in ascending order of c1, c3.

Tom Kyte
April 16, 2007 - 10:38 am UTC

ok, then it is as good as it is going to get.


I presume c1, c3 is NOT unique.
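
Worth noting for anyone with ties in (c1, c3): DENSE_RANK can return more than N-1 rows when several rows share a rank. If exactly N-1 rows are wanted regardless of ties, a ROW_NUMBER() variant (a sketch of the same query shape as above) would be:

select a.*
  from (select c1, c2, c3,
               row_number() over (order by c1 desc, c3 desc) as rnk
          from t1
         where c1 < 12345
           and c3 < sysdate
           and c2 = 'XYZ') a
 where rnk < :N
 order by a.c1, a.c3;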

Analytical Function

Shree, April 15, 2007 - 10:37 am UTC

I have a unique requirement; there is a table A (huge, 15+ million rows) in Oracle 9iR2, like

Table A(id,cd,cdate,uid);

111,1,02/04/2007,3
112,4,02/04/2007,3
112,5,02/04/2007,3
113,5,02/04/2007,3
114,4,02/04/2007,3
115,7,02/04/2007,3
115,4,02/04/2007,3
116,2,02/04/2007,3
116,3,02/04/2007,3
117,3,02/05/2007,3
117,5,02/05/2007,3

want an output like

111,1,02/04/2007,3
112,4,02/04/2007,3
113,5,02/04/2007,3
114,4,02/04/2007,3
115,7,02/04/2007,3
115,4,02/04/2007,3
116,2,02/04/2007,3
117,3,02/05/2007,3
117,5,02/05/2007,3

If, for an id, cd 4 and 5 both exist, merge them together (or eliminate the row with cd 5) and keep a row with cd 4 (e.g. ID 112); the same way, if for an id cd 2 and 3 both exist, merge them together (or eliminate the row with cd 3) and keep a row with cd 2 (e.g. ID 116). All other rows will appear in the output as is.

Is there any way I can accomplish this with a SQL statement, maybe some analytic function? My target here is to do it as efficiently as possible.


Thanks
Tom Kyte
April 16, 2007 - 1:02 pm UTC

no creates
no inserts
no look
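
For the record, the stated rules can be expressed with analytics. A sketch only - untested, since no create/insert scripts were supplied - assuming the column names above, and that "merge" just means keeping the cd 4 (or cd 2) row:

select id, cd, cdate, uid
  from ( select a.*,
                max( case when cd = 4 then 1 else 0 end ) over (partition by id) has4,
                max( case when cd = 2 then 1 else 0 end ) over (partition by id) has2
           from a )
 where not ( cd = 5 and has4 = 1 )
   and not ( cd = 3 and has2 = 1 );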

Analytical Function

Krish, April 25, 2007 - 6:38 am UTC

Hi Tom,

Could you please help me with an easier approach to get the result below from the emp table?

EMPNO ENAME 2_MAXSAL DEPTNO EMPNO ENAME  3_MAXSAL
 9999 CLARK     2450     10  7934 MILLER     1300
 7566 JONES     2975     20  7876 ADAMS      1100
 7499 ALLEN     1600     30  7844 TURNER     1500

Ideally I am trying to get the 2nd and 3rd top salary details in a single row.

I was trying to use the Top-N technique with inline views (virtual tables), but couldn't succeed.

Thanks in Advance.
Krish.
Tom Kyte
April 25, 2007 - 10:18 am UTC

scott%ORA10GR2> select deptno,
  2         max( decode( rn, 2, empno ) ) empno2,
  3         max( decode( rn, 2, ename ) ) ename2,
  4         max( decode( rn, 2, sal ) ) sal2,
  5         max( decode( rn, 3, empno ) ) empno3,
  6         max( decode( rn, 3, ename ) ) ename3,
  7         max( decode( rn, 3, sal ) ) sal3
  8    from (
  9  select *
 10    from (
 11  select empno, ename, deptno, sal,
 12         dense_rank() over (partition by deptno order by sal DESC nulls last)rn
 13    from emp
 14         )
 15   where rn between 2 and 3
 16         )
 17   group by deptno
 18   order by deptno
 19  /

    DEPTNO     EMPNO2 ENAME2           SAL2     EMPNO3 ENAME3           SAL3
---------- ---------- ---------- ---------- ---------- ---------- ----------
        10       7782 CLARK            2450       7934 MILLER           1300
        20       7566 JONES            2975       7876 ADAMS            1100
        30       7499 ALLEN            1600       7844 TURNER           1500



but, beware of what you ask for - ask yourself "what if there were 5 people making 2,975 in deptno=20 - what would happen then?"



Got the answer TOM

Krish, April 25, 2007 - 8:58 am UTC

Hi TOM,

I was able to find the answer by myself.
but could you please have a look at it and suggest a better way?

select a.ename, a.sal, a.deptno, b.ename, b.sal
  from (select deptno, ename, sal
          from (select deptno, ename, sal,
                       row_number() over (partition by deptno order by sal desc) top2
                  from scott.emp)
         where top2 = 2) a
  join (select deptno, ename, sal
          from (select deptno, ename, sal,
                       row_number() over (partition by deptno order by sal desc) top3
                  from scott.emp)
         where top3 = 3) b
    on (a.deptno = b.deptno)
Tom Kyte
April 25, 2007 - 10:38 am UTC

see above.

Thanks TOM

Krish, May 08, 2007 - 3:12 am UTC

The Best as usual...
Yes, you are right Tom. I didn't think about it.

VR, May 24, 2007 - 3:40 pm UTC

Tom,
Given the following:

create table maxtest(a int, b date, c date);
-- DATE literals, so the inserts run regardless of NLS_DATE_FORMAT
insert into maxtest values(1, DATE '2006-09-12', DATE '2006-12-31');
insert into maxtest values(1, DATE '2006-08-03', DATE '9999-12-31');
insert into maxtest values(1, DATE '2006-03-07', DATE '2006-10-15');
insert into maxtest values(1, DATE '2006-05-06', DATE '9999-12-31');
insert into maxtest values(1, DATE '2006-02-01', DATE '2006-12-31');


I need the following result set.

1,'2006-08-03','9999-12-31'

I basically need a way to get the max value of b for a given max value of c. The max value of c is 9999-12-31, and although 2006-09-12 is the max value of b, it is not associated with the max value of c; i.e., for 9999-12-31, which is the max value of c, the max b value is 2006-08-03.

How can I achieve this in one shot, without scanning the table multiple times? Can I use analytic functions to accomplish this?

Thanks
VR
Tom Kyte
May 26, 2007 - 11:46 am UTC

ops$tkyte%ORA10GR2> select *
  2    from (select t1.*, row_number() over (partition by a order by c DESC nulls last) rn
  3            from maxtest t1
  4         )
  5   where rn = 1;

         A B          C                  RN
---------- ---------- ---------- ----------
         1 2006-08-03 9999-12-31          1

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select a,
  2         to_date( substr( data, 15 ),'yyyymmddhh24miss') b,
  3         to_date( substr( data, 1, 14 ), 'yyyymmddhh24miss' ) c
  4    from (
  5  select a, max( to_char(c,'yyyymmddhh24miss') || to_char(b,'yyyymmddhh24miss') )  data
  6    from maxtest t1
  7   group by a
  8         );

         A B          C
---------- ---------- ----------
         1 2006-08-03 9999-12-31

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select a,
  2         max(b) KEEP (dense_rank first order by c desc nulls last) b,
  3         max(c)
  4    from maxtest t1
  5   group by a;

         A B          MAX(C)
---------- ---------- ----------
         1 2006-08-03 9999-12-31


A reader, May 24, 2007 - 6:18 pm UTC

Tom,
This is in reference to my question above. The best I can come up with is the query below.

select *
  from (select a, row_number() over (partition by a order by b desc) rn, b, c
          from maxtest
         where (a, c) in (select a, max(c) from maxtest group by a)
       )
 where rn = 1

As you can see, it scans the table twice, and I would like to avoid that if at all possible.

Thanks
VR

A reader, May 26, 2007 - 12:50 pm UTC

Thanks a lot Tom. However, with the first solution:

ops$tkyte%ORA10GR2> select *
2 from (select t1.*, row_number() over (partition by a order by c DESC nulls last) rn
3 from maxtest t1
4 )
5 where rn = 1;

A B C RN
---------- ---------- ---------- ----------
1 2006-08-03 9999-12-31 1

Is this result always guaranteed? Since we are only ordering by C DESC and there are two records with 9999-12-31, is there a possibility that the other record will be returned? In other words can the query return the following depending on the path the optimizer takes?

A B C RN
---------- ---------- ---------- ----------
1 2006-05-06 9999-12-31 1

Tom Kyte
May 26, 2007 - 1:04 pm UTC

you are right, we need to

order by C desc nulls last, B desc nulls last


probably,

order by c desc, b desc

and where c is not null and b is not null in the inline view instead.. as there could be a case where C is not null - but B is null for all of those C's

but you can decide on how to deal with NULLS

good catch - thanks.
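
Putting that together, a deterministic version of the first query (a sketch of exactly the fix just described) would be:

select *
  from ( select t1.*,
                row_number() over (partition by a
                                   order by c desc nulls last, b desc nulls last) rn
           from maxtest t1 )
 where rn = 1;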

Grouping Rule in Oracle Applications

vishnu, May 28, 2007 - 2:39 am UTC

Tom,

Thanks for providing this forum for answering questions related to Oracle.

I had one question related to Oracle Apps invoicing.
Is there any rule for that, like a Grouping Rule, in the AP or AR modules?

I have to do this

A reader, May 28, 2007 - 6:48 pm UTC

I have a table xx_contract_detials. The columns are:

contract_type VARCHAR2(100),
region VARCHAR2(100),
contract_NO VARCHAR2(1000),
organizations NUMBER,
dealsize NUMBER,
customer_name VARCHAR2(1000),
salesrep VARCHAR2(1000),
contract_effective_date DATE,
contract_desc VARCHAR2(1000),
contract_status VARCHAR2(100),
version_comments VARCHAR2(1000),
org_id NUMBER,
authoring_party_code VARCHAR2(1000),
owner_id NUMBER,
contract_expriation_date DATE

I need the output like this:

Type          Grouping of: Special Bid / Standard / Clause Standard Contract / Custom
Geo           Sales region (under the grouping of the type)
< $50K        Totals under $50K per region per type
$50K - $250K  Totals between $50K and $250K per region per type
> $250K       Totals over $250K per region per type
Totals        Totals of rows


Tom Kyte
May 30, 2007 - 10:18 am UTC

no create table
no insert into table
no look

and use the CODE BUTTON for fixed fonts.

Correlating timestamps across rows

David Wellborn, May 29, 2007 - 7:40 pm UTC

I have a table, like a log file, that captures disconnection and connection events for a device and timestamps them. Sometimes I get extra connections even though there is no disconnection. I'd like a query to correlate the connections and disconnections on a single row.

Here is the SQL to demonstrate:
CREATE TABLE CONN_HIST
(
  ID               NUMBER,
  CONNTYPE         VARCHAR2(3 BYTE),
  EVENT            VARCHAR2(4 BYTE),
  DEV_ID           VARCHAR2(24 BYTE),
  EVENT_TIMESTAMP  DATE 
)
LOGGING 
NOCACHE
NOPARALLEL
NOMONITORING;

COMMENT ON COLUMN CONN_HIST.CONNTYPE IS '1=RDR 2=SSR 3=CWA';

COMMENT ON COLUMN CONN_HIST.EVENT IS '1=CONN 2=DISC';

COMMENT ON COLUMN CONN_HIST.EVENT_TIMESTAMP IS 'TIME OF THE CONNECT OR DISCONNECT EVENT';

ALTER TABLE CONN_HIST ADD (
  CONSTRAINT CONN_HIST_PK
 PRIMARY KEY
 (ID));
 
 INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
2, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '04/10/2007 07:38:36 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
104, 'SSR', 'DISC', '07041007383600000000562D',  TO_Date( '05/29/2007 04:18:22 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
105, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '05/29/2007 04:26:26 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
106, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '05/29/2007 05:24:06 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
107, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '05/29/2007 05:24:27 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
108, 'SSR', 'DISC', '07041007383600000000562D',  TO_Date( '05/29/2007 05:27:50 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
109, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '05/29/2007 05:28:25 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
COMMIT;

SELECT ID, CONNTYPE, EVENT, DEV_ID, TO_CHAR(EVENT_TIMESTAMP, 'DD-MON-YYYY HH24:MI:SS') EVENT_TIME 
FROM CONN_HIST ORDER BY EVENT_TIMESTAMP DESC;



And the output:

Table created.


Comment created.


Comment created.


Comment created.


Table altered.


1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


Commit complete.


        ID CON EVEN DEV_ID                   EVENT_TIME
---------- --- ---- ------------------------ --------------------
       109 SSR CONN 07041007383600000000562D 29-MAY-2007 17:28:25
       108 SSR DISC 07041007383600000000562D 29-MAY-2007 17:27:50
       107 SSR CONN 07041007383600000000562D 29-MAY-2007 17:24:27
       106 SSR CONN 07041007383600000000562D 29-MAY-2007 17:24:06
       105 SSR CONN 07041007383600000000562D 29-MAY-2007 16:26:26
       104 SSR DISC 07041007383600000000562D 29-MAY-2007 16:18:22
         2 SSR CONN 07041007383600000000562D 10-APR-2007 19:38:36

7 rows selected.


Is there any way using an analytical query I could see the data like this:

CON DEV_ID                   DISC_TIME            CONN_TIME
--- ------------------------ -------------------- --------------------
SSR 07041007383600000000562D 29-MAY-2007 16:18:22 29-MAY-2007 16:26:26
SSR 07041007383600000000562D                      29-MAY-2007 17:24:06
SSR 07041007383600000000562D                      29-MAY-2007 17:24:27
SSR 07041007383600000000562D 29-MAY-2007 17:27:50 29-MAY-2007 17:28:25


Tom Kyte
May 30, 2007 - 11:26 am UTC

can you explain your output better?

disconnect time < connect time??

do you want a row with event = conn to be merged with a row where event = 'disc' if that row is the row before it?

what about id = 2?

Disconnect / Connect example

David Wellborn, May 30, 2007 - 4:08 pm UTC

We are interested in periods of disconnected time. This is why the disconnect timestamp is < connect timestamp.

So, yes, I want to merge the row as you stated above. The connect time should be >= the most recent disconnect time for the device.

I inadvertently left id 2 out of my desired result set.
Tom Kyte
May 30, 2007 - 4:16 pm UTC

ops$tkyte%ORA10GR2> select conntype, dev_id, disc_time, conn_time
  2    from (
  3  select conntype, dev_id,
  4         case when lag(event) over (order by event_timestamp) = 'DISC'
  5              then lag(event_timestamp) over (order by event_timestamp)
  6                  end disc_time,
  7                  event_timestamp conn_time,
  8                  event
  9    from conn_hist
 10         )
 11   where event = 'CONN'
 12  /

CON DEV_ID                   DISC_TIME            CONN_TIME
--- ------------------------ -------------------- --------------------
SSR 07041007383600000000562D                      10-apr-2007 19:38:36
SSR 07041007383600000000562D 29-may-2007 16:18:22 29-may-2007 16:26:26
SSR 07041007383600000000562D                      29-may-2007 17:24:06
SSR 07041007383600000000562D                      29-may-2007 17:24:27
SSR 07041007383600000000562D 29-may-2007 17:27:50 29-may-2007 17:28:25
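
If the table held more than one device, the lag() calls would presumably need to be partitioned by dev_id as well (a sketch; the sample data has a single device, so the output above is unchanged):

select conntype, dev_id, disc_time, conn_time
  from (
select conntype, dev_id,
       case when lag(event) over (partition by dev_id order by event_timestamp) = 'DISC'
            then lag(event_timestamp) over (partition by dev_id order by event_timestamp)
       end disc_time,
       event_timestamp conn_time,
       event
  from conn_hist
       )
 where event = 'CONN';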


Joe Reynolds, May 30, 2007 - 6:16 pm UTC

Using the same data set as above, is it possible to generate this result set using SQL?
DEV_ID                   ONTIME               OFFTIME
------------------------ -------------------- --------------------
07041007383600000000562D 4/10/2007 7:38:36 PM 4/10/2007 11:59:59 PM
07041007383600000000562D 5/29/2007 4:26:26 PM 5/29/2007 5:27:50 PM 
07041007383600000000562D 5/29/2007 5:28:25 PM 5/29/2007 11:59:59 PM

A description is: The DEV_ID stays connected until it either disconnects or the day ends. Ignore extra disconnects (i.e. without a connect) and extra connects (connect if already connected).

I tried using this query, but I can't figure out how to combine rows 2-4 into a single row:
select rownum, dev_id, EVENT_TIMESTAMP ontime, offtime from
 (    
select conntype, event, dev_id, EVENT_TIMESTAMP,
  case when event ='CONN' AND lead(conntype) over (partition by dev_id, trunc(EVENT_TIMESTAMP) order by EVENT_TIMESTAMP) is not null
            then lead(EVENT_TIMESTAMP) over (partition by dev_id, trunc(EVENT_TIMESTAMP) order by EVENT_TIMESTAMP)
       when event ='CONN' AND lead(conntype) over (partition by dev_id, trunc(EVENT_TIMESTAMP) order by EVENT_TIMESTAMP) is null
            then trunc(EVENT_TIMESTAMP+1)-1/(24*60*60)
       else null
  end offtime
from conn_hist
 ) 
where offtime is not null 
order by ontime

ROWNUM DEV_ID                   ONTIME               OFFTIME              
------ ------------------------ -------------------- -------------------- 
1      07041007383600000000562D 4/10/2007 7:38:36 PM 4/10/2007 11:59:59 PM
2      07041007383600000000562D 5/29/2007 4:26:26 PM 5/29/2007 5:24:06 PM 
3      07041007383600000000562D 5/29/2007 5:24:06 PM 5/29/2007 5:24:27 PM 
4      07041007383600000000562D 5/29/2007 5:24:27 PM 5/29/2007 5:27:50 PM 
5      07041007383600000000562D 5/29/2007 5:28:25 PM 5/29/2007 11:59:59 PM


Any suggestions? As usual, thanks a million for all these years of help.
Tom Kyte
May 30, 2007 - 6:31 pm UTC

you switched from 24 hour to am/pm - very confusing.


and what happens if you connect on april 10th, and disconnect on april 11th - does the end of day trump or is there a fictitious midnight connect created or what.

Whoopsy on the datemask!

Joe Reynolds, May 30, 2007 - 8:09 pm UTC

Datemask switcheroo is annoying... sorry about that.

All connections end at midnight, so there's a hard disconnect at 23:59:59. I know it seems odd, but there is a logical business reason why the data comes in that way.

My query could probably use some tweaking (still working to understand analytics), but I'm really struggling with trying to fold the 3 connect rows into a single row.

Thanks again.
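
One way to fold the consecutive CONN rows is a start-of-group ("gaps and islands") approach: a CONN starts a new on-period when the previous event for the device was a DISC (or there was no previous event, or the day changed); a running sum of those start flags labels each period; and the off time is the period's first DISC, capped at 23:59:59 of the on day. A sketch only, assuming the hard disconnect at 23:59:59 described above:

select dev_id,
       min(conn_time) ontime,
       least( nvl( min(disc_time), trunc(min(conn_time)) + 1 - 1/86400 ),
              trunc(min(conn_time)) + 1 - 1/86400 ) offtime
  from (
select dev_id, conn_time, disc_time,
       sum(start_flag) over (partition by dev_id order by event_timestamp) grp
  from (
select dev_id, event, event_timestamp,
       case when event = 'CONN' then event_timestamp end conn_time,
       case when event = 'DISC' then event_timestamp end disc_time,
       case when event = 'CONN'
             and (   nvl( lag(event) over (partition by dev_id
                                           order by event_timestamp), 'DISC' ) = 'DISC'
                  or trunc( lag(event_timestamp) over (partition by dev_id
                                                       order by event_timestamp) )
                     <> trunc(event_timestamp) )
            then 1 else 0 end start_flag
  from conn_hist
       )
       )
 group by dev_id, grp
having min(conn_time) is not null
 order by ontime;

This yields the three rows asked for from the sample data; a leading DISC with no CONN would form a group with no conn_time and be filtered out by the HAVING clause.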

custom aggregate function

Saket, June 01, 2007 - 5:18 pm UTC

My Table admin_fees is as follows

Name               Type         Nullable Default Comments
------------------ ------------ -------- ------- --------
ADMIN_FEES_ID      NUMBER
CLIENT_ID          NUMBER
BRANCH_GROUP_ID    NUMBER       Y
EMPLOYER_GROUP_ID  NUMBER       Y
DEPARTMENT         VARCHAR2(25) Y
EVENT_TYPE_CODE    VARCHAR2(20) Y
PURSUIT_LEVEL_CODE VARCHAR2(20) Y
FEE_PERCENT        NUMBER       Y
FEE_AMOUNT         NUMBER       Y

Sample Data :


ID Client BG EG DEPT ET P_L_C Amount Percent
1836 122 COMAUTO-I 2
1865 122 MEDMAL-UM 2
1873 122 OTHER-A 2
1885 122 PERAUTO-PL 2
1875 122 OVERPAY-R 2
1870 122 OTHER-UM 3
1907 122 UNKNOWN-A 3
1626 122 110097 5
1851 122 GENLIAB-UM 5
....
....
....

The table has many more rows than shown; this is just a sample.


Now I want an abstract result from the above table and data.

If any of the columns except amount, percent, and ID is not null, it shows Y, and N otherwise.

Then I want to perform a logical OR on each column's Y/N values, for each distinct combination of amount and percent.

So I do the following thing

select
new_or(client_id) client_id,
new_or(branch_group_id) branch_group_id,
new_or(employer_group_id) employer_group_id,
new_or(department) department,
new_or(event_type_code) event_type_code,
new_or(pursuit_level_code) pursuit_level_code,
fee_percent, fee_amount
from
(select distinct nvl2((case when client_id is not null and branch_group_id is null and employer_group_id is null and department is null and event_type_code is null and pursuit_level_code is null then client_id else null end),'Y', 'N') client_id,
nvl2(branch_group_id, 'Y', 'N') branch_group_id,
nvl2(employer_group_id, 'Y', 'N') employer_group_id,
nvl2(department, 'Y', 'N') department,
nvl2(event_type_code, 'Y', 'N') event_type_code,
nvl2(pursuit_level_code, 'Y', 'N') pursuit_level_code,
fee_percent,
fee_Amount
from admin_fees where client_id = 122
) group by fee_percent, fee_amount;


The result is exactly what I want:


client bg eg dept et pl percent amount

N N Y N Y N 2
N N N N Y N 3
N N Y N N N 4
N N Y N Y N 5
N N Y N Y N 8
N N N N Y N 9

The last query provides the distinct values of amount and percent, with aggregated values for each column (Y if the column is not null, N if it is null).

I have created a group function named new_or, based on a new type I created, my_or.

What it does is a normal logical OR function.

Y N N = Y
N N N N N = N
N N N N Y N N N N = Y


My question is

1. Why does Oracle not provide group functions like AND, OR, NAND, NOR, XOR, etc.?
2. Is there any other way to get the same results using a SQL query (if my DBA does not allow me to create a type)?

I am working on an Oracle Forms 10g application. I was really wondering if Oracle provides a facility for showing a meaningful message
when a constraint is violated, like "Employee Code already exists." instead of ORA-00001 unique constraint violated.

3. Has Oracle ever thought of adding a message string to constraint definitions, so that whenever a constraint is violated a more meaningful message could be displayed instead of the ORA-00001 unique constraint violated kind of message?
Tom Kyte
June 03, 2007 - 5:30 pm UTC

we do, bitand for example and many for raws in the utl_raw package.

but you have strings, strings are not "or'ed and anded"


ops$tkyte%ORA10GR2> select x, case when instr( x, 'Y' ) > 0 then 'Y' else 'N' end yn
  2    from t;

X     Y
----- -
NNY   Y
NNNNN N
YNYYN Y
      N

Custom Group Functions

Saket, June 04, 2007 - 9:12 am UTC

Thank you for your instant reply.

My question is a bit different here:

Supposing my data is as follows:


Col1 Col2 Col3 Amt
N Y Y 2
N Y N 2
N N Y 5
N N Y 6
N N Y 7
N Y N 7
Y N N 7


I want the following output form it:

Col1 Col2 Col3 Amt
N Y Y 2
N N Y 5
N N Y 6
Y Y Y 7

So here, what I have done is group by Amt and use a
group function called NEW_OR.

I don't think UTL_RAW.BIT_OR would be helpful in this scenario.

Also, I would remind you of question no. 3 in my previous posting.

Waiting for your reply.
Tom Kyte
June 05, 2007 - 8:50 am UTC

select max(col1), max(col2), max(col3), amt
  from t
 group by amt


eg:

ops$tkyte%ORA10GR2> select * from t;

X Y        AMT
- - ----------
N Y          1
N N          1
Y N          2
N N          2
Y N          3
N Y          3

6 rows selected.

ops$tkyte%ORA10GR2> select max(x), max(y), amt from t group by amt;

M M        AMT
- - ----------
N Y          1
Y N          2
Y Y          3

will work in your case since Y > N.

they are STRINGS, think of them as STRINGS.

enhancement requests go via support, #3.

RE: Custom Group Functions

Frank Zhou, June 04, 2007 - 1:32 pm UTC

Saket,

There are several possible ways to implement this query.
Here is one of the solutions.

create table test5 as
select 'N' as Col1, 'Y' as Col2, 'Y' as Col3 , 2 as Amt from dual
union all
select 'N' as Col1, 'Y' as Col2, 'N' as Col3 , 2 as Amt from dual
union all
select 'N' as Col1, 'N' as Col2, 'Y' as Col3 , 5 as Amt from dual
union all
select 'N' as Col1, 'N' as Col2, 'Y' as Col3 , 6 as Amt from dual
union all
select 'N' as Col1, 'N' as Col2, 'Y' as Col3 , 7 as Amt from dual
union all
select 'N' as Col1, 'Y' as Col2, 'N' as Col3 , 7 as Amt from dual
union all
select 'Y' as Col1, 'N' as Col2, 'N' as Col3 , 7 as Amt from dual;


SQL> select
2 case when instr( XMLAgg(XMLElement(x,Col1) ), 'Y' ) > 0 then 'Y' else 'N' end as Col1 ,
3 case when instr( XMLAgg(XMLElement(x,Col2) ), 'Y' ) > 0 then 'Y' else 'N' end as Col2 ,
4 case when instr( XMLAgg(XMLElement(x,Col3) ), 'Y' ) > 0 then 'Y' else 'N' end as Col3,
5 amt
6 from test5 group by Amt;

C C C AMT
- - - ----------
N Y Y 2
N N Y 5
N N Y 6
Y Y Y 7

SQL>

try this

Jay, June 04, 2007 - 1:51 pm UTC


with temptable as

(
select case
when col1 = 'N'
then null
else 'Y'
end col1,

case
when col2 = 'N'
then null
else'Y'
end col2,

case
when col3 = 'N'
then null
else 'Y'
end col3,

amt

from t

)

select nvl(max(col1),'N') col1,
nvl(max(col2),'N') col2,
nvl(max(col3),'N') col3,
amt

from t

group by amt

I made a mistake; here is the correct one

Jay, June 04, 2007 - 1:59 pm UTC

with temptable as

(
select case
    when col1 = 'N' 
    then null
    else 'Y'
       end col1,

       case
    when col2 = 'N' 
    then null
    else'Y'
       end col2,

       case
    when col3 = 'N' 
    then null
    else 'Y'
       end col3,

       amt    
    
  from t 

)

select nvl(max(col1),'N') col1,
       nvl(max(col2),'N') col2,
       nvl(max(col3),'N') col3,    
       amt

 from temptable ----- I had made a mistake here

group by amt



thanks!

group function

Saket, June 04, 2007 - 9:57 pm UTC

Thank you very much Tom and Jay.


Group Function

Saket, June 05, 2007 - 8:39 am UTC

Looks Nice. 

(Extremely) Sorry Tom, I failed to format text properly. I will keep this in my mind next time. This was my first posting. 

Thanks to Frank Zhou. 


A reader, July 02, 2007 - 4:04 pm UTC

Tom,

I have the following information in the table...

Cno org_id User Pass
100 500 100 abc
200 500
300 500


I want a query to return the following for the same org_id...

Cno org_id User Pass
100 500 100 abc
200 500 100 abc <<---
300 500 100 abc <<---

The query should return User/Pass for Cno 200 and 300 (though in the table we only have User/Pass for Cno 100).

Is there any way I can do this using analytics?

Thanks






Tom Kyte
July 03, 2007 - 10:02 am UTC

no create table
no inserts
no look

incomplete specification - what if you had

100 500 100 abc
200 500 
300 500 123 def
400 500 


no version either :(


read about last_value, ignore nulls.

  1* select empno, comm, last_value(comm ignore nulls) over (order by empno) from emp
scott%ORA10GR2> /

     EMPNO       COMM LAST_VALUE(COMMIGNORENULLS)OVER(ORDERBYEMPNO)
---------- ---------- ---------------------------------------------
      7369
      7499        300                                           300
      7521        500                                           500
      7566                                                      500
      7654       1400                                          1400
      7698                                                     1400
      7782                                                     1400
      7788                                                     1400
      7839                                                     1400
      7844          0                                             0
      7876                                                        0
      7900                                                        0
      7902                                                        0
      7934                                                        0

14 rows selected.
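
Applied to the table in the question, and assuming (as in the example data) that at most one row per org_id carries the credentials, a simple sketch would be ("user" is a reserved word in Oracle, so the column is written here as usr):

select cno, org_id,
       max(usr)  over (partition by org_id) usr,
       max(pass) over (partition by org_id) pass
  from t;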

Analytics NTILE ?

Martin, July 09, 2007 - 5:31 pm UTC

Hi Tom.

I have this table with two columns NAME and TOTAL.

CREATE TABLE TABLE_A
(
  NAME   VARCHAR2(50 BYTE),
  TOTAL  NUMBER(9)
);


SET DEFINE OFF;
Insert into TABLE_A (NAME, TOTAL) Values ('BILL', 25);
Insert into TABLE_A (NAME, TOTAL) Values ('PETER', 23);
Insert into TABLE_A (NAME, TOTAL) Values ('JOHN', 21);
Insert into TABLE_A (NAME, TOTAL) Values ('ROSIE', 18);
Insert into TABLE_A (NAME, TOTAL) Values ('STEVE', 14);
Insert into TABLE_A (NAME, TOTAL) Values ('CARL', 13);
Insert into TABLE_A (NAME, TOTAL) Values ('GREG', 11);
Insert into TABLE_A (NAME, TOTAL) Values ('DIRK', 11);
Insert into TABLE_A (NAME, TOTAL) Values ('MARY', 9);
Insert into TABLE_A (NAME, TOTAL) Values ('KATE', 7);
Insert into TABLE_A (NAME, TOTAL) Values ('ANA', 4);
COMMIT;

NAME  TOTAL
------ -------
BILL      25
PETER     23
JOHN      21
ROSIE     18
STEVE     14
CARL      13
DIRK      11
GREG      11
MARY      9
KATE      7
ANA       4


I need to order the table by the TOTAL column descending,
divide the table into groups of 3 records, and
calculate the average for each group.
If the last group doesn't have 3 records, it should be part of the previous group.

Expected Output:

NAME  AVERAGE
------ -------
BILL     23
PETER    23
JOHN     23
ROSIE    15
STEVE    15
CARL     15
DIRK     8.4
GREG     8.4
MARY     8.4
KATE     8.4
ANA      8.4  


I am trying to find an analytic function to do this. I've tried NTILE, but I am unable to get the result I want.

My Oracle version is 9i.

Thanks,

Martin

Tom Kyte
July 09, 2007 - 8:58 pm UTC

ops$tkyte%ORA10GR2> select name, avg(total) over (partition by new_grp) average
  2    from (
  3  select name, total, decode( count(*) over (partition by grp), 3, grp, grp-1) new_grp
  4    from (
  5  select name, total, ceil( row_number() over (order by total desc)/3 ) grp
  6    from table_a
  7         )
  8         )
  9  /

NAME                      AVERAGE
---------------------- ----------
BILL                           23
PETER                          23
JOHN                           23
ROSIE                          15
STEVE                          15
CARL                           15
GREG                          8.4
DIRK                          8.4
MARY                          8.4
KATE                          8.4
ANA                           8.4

11 rows selected.

analytical rocks

Akki, July 10, 2007 - 7:35 am UTC

.. again ! !

analytical rocks

A reader, July 10, 2007 - 9:44 am UTC

Amazing!!!!! Thanks a lot!

Get data in two groups.

Sean, July 24, 2007 - 3:31 pm UTC

Hi Tom,

I want to get data in two groups by c2:
group one: c2 <= 10,
group two: c2 > 10.

In each group, the data should be ordered by c1. Here is a sample of the data. Database 10g. Thanks so much.


c1 c2

2 8
3 6
5 10

1 14
4 11
6 13


create table t1 (c1 number, c2 number);

insert into t1(c1, c2)
values (2, 8);

insert into t1(c1, c2)
values (3, 6);

insert into t1(c1, c2)
values (5, 10);

insert into t1(c1, c2)
values (1, 14);

insert into t1(c1, c2)
values (4, 11);

insert into t1(c1, c2)
values (6, 13);
Tom Kyte
July 26, 2007 - 8:41 am UTC

ops$tkyte%ORA10GR2> select case when c2 <= 10 then 1 else 2 end grp, c1, c2
  2    from t1
  3   order by case when c2 <= 10 then 1 else 2 end, c1
  4  /

       GRP         C1         C2
---------- ---------- ----------
         1          2          8
         1          3          6
         1          5         10
         2          1         14
         2          4         11
         2          6         13

6 rows selected.


A reader, August 24, 2007 - 12:39 pm UTC

Tom

Here is my test case

drop table t;

--- Test to check what happens for Analytical Functions that need to have a row with a column value of null

create table t ( a number, b varchar2(15), c number);

begin
insert into t values (1, 'A',2);
insert into t values(1,'B',3);
insert into t values(1,'C',2);
insert into t values(1,'D',null);
insert into t values (1,null,2);
insert into t values (1,null,null);
insert into t values(1,null,4);

end;
/

select a, b, c,
       last_value( c )  over (partition by b order by b nulls first) n,
       last_value( c )  over (partition by b order by b nulls last)  l,
       first_value( c ) over (partition by b order by b nulls first) j,
       first_value( c ) over (partition by b order by b nulls last)  k,
       last_value( c )  over (partition by b)                        tt
  from t

1) Why does LAST_VALUE always return NULL for B = NULL when FIRST_VALUE returns 2? If the null value is 'indeterminate', it should be so for both FIRST_VALUE and LAST_VALUE.

2) select a,b,c, last_value( c ) over (partition by b ) n , last_value( nvl(c,1) ) over (partition by b order by b nulls last) l from t

How come the NVL in column L in the above query made the value in column N work, considering a similar column in question 1 (column TT) did not work?
Is that a bug, or is it a feature of analytic queries that when two columns use analytic functions they somehow 'interact'?

Tom Kyte
August 27, 2007 - 3:48 pm UTC

1) Maybe this'll help. Run the first query with b as null, and then all null b's set to xx

the results are "the same"

ops$tkyte%ORA10GR2> select a,b,c,
  2         last_value( c ) over (partition by b order by b nulls first range unbounded preceding) l  ,
  3         first_value( c ) over (partition by b order by b nulls first range unbounded preceding) f
  4    from t
  5   order by b nulls first
  6  /

         A B                        C          L          F
---------- --------------- ---------- ---------- ----------
         1                          2                     2
         1                          4                     2
         1                                                2
         1 A                        2          2          2
         1 B                        3          3          3
         1 C                        2          2          2
         1 D

7 rows selected.

ops$tkyte%ORA10GR2> update t set b = 'xx' where b is null;

3 rows updated.

ops$tkyte%ORA10GR2> select a,b,c,
  2         last_value( c ) over (partition by b order by b nulls first range unbounded preceding) l  ,
  3         first_value( c ) over (partition by b order by b nulls first range unbounded preceding) f
  4    from t
  5   order by b nulls first
  6  /

         A B                        C          L          F
---------- --------------- ---------- ---------- ----------
         1 A                        2          2          2
         1 B                        3          3          3
         1 C                        2          2          2
         1 D
         1 xx                       2                     2
         1 xx                       4                     2
         1 xx                                             2

7 rows selected.

ops$tkyte%ORA10GR2> spool off


when you break up all of the data by B (all of the null b's go together - for BOTH first and last), it is pretty 'clear'

I would say it could also be considered non-deterministic...

if you partition by B and then sort by B, the order of the rows in each partition is not deterministic and first/last value are free to return "something", but not necessarily the same thing run after run against the same data!


2) since I disagree with your analysis of "works", I'll pass on trying to interpret this one :)

It is working as I expect it should - null or not.

A reader, August 31, 2007 - 12:50 pm UTC

Tom

If I were to partition by the same columns (or criteria) in multiple analytic columns, would Oracle optimize the partitioning to do it just once, or once per column?

I mean

select count(something) over (partition by column1, column2) data1,
       sum(something) over (partition by column1, column2) data2
  from table

Would Oracle do the partitioning twice? We are on Oracle 9i.

I think I can prove it by making the query partition by a function's output and recording in dbms_session.action each time the function is called.

The reason I am asking is that a query I have written that way runs very slowly.

Now, presumably doing

select sum(something), count(something)
  from table
 group by column1, column2

would only create the groups once and apply the functions to the constituents of the grouped data.

Thanks

Ravi

Tom Kyte
September 05, 2007 - 9:25 am UTC

Oracle will "share" the partitions as much as possible - even nested ones - for example:

ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select ename,
  2         count(*) over (partition by deptno, job) ,
  3         sum(sal) over (partition by deptno, job) ,
  4         count(*) over (partition by deptno) ,
  5         sum(sal) over (partition by deptno)
  6    from emp
  7   order by deptno, job, ename
  8  /

Execution Plan
----------------------------------------------------------
Plan hash value: 3145491563

------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    15 |   585 |     3   (0)| 00:00:0
|   1 |  WINDOW SORT       |      |    15 |   585 |     3   (0)| 00:00:0
|   2 |   TABLE ACCESS FULL| EMP  |    15 |   585 |     3   (0)| 00:00:0
------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

see how there is only one window sort (partition) - and it basically did it by deptno, job, ENAME (it rolled the order by into the operation as well..)

but, if you add a partition by that cannot "fit" inside the others - it has to do it twice:


ops$tkyte%ORA10GR2> select ename,
  2         count(*) over (partition by deptno, job) ,
  3         sum(sal) over (partition by deptno, job) ,
  4         count(*) over (partition by deptno) ,
  5         sum(sal) over (partition by deptno) ,
  6         count(*) over (partition by job)
  7    from emp
  8   order by deptno, job, ename
  9  /

Execution Plan
----------------------------------------------------------
Plan hash value: 4086863039

------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    15 |   585 |     4  (25)| 00:00:
|   1 |  WINDOW SORT        |      |    15 |   585 |     4  (25)| 00:00:
|   2 |   WINDOW SORT       |      |    15 |   585 |     4  (25)| 00:00:
|   3 |    TABLE ACCESS FULL| EMP  |    15 |   585 |     3   (0)| 00:00:
------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

ops$tkyte%ORA10GR2> set autotrace off


so, it'll partition as few times as it can.

Use of analytic functions in UPDATE statements

Ananth, August 31, 2007 - 6:32 pm UTC

Hi Tom,

I have the following table of data:

CREATE TABLE A_TEST (
EFF_DT DATE,
A_DESC VARCHAR2(20 BYTE),
ACTL_CHRG NUMBER,
ACTL_CHRG_YTD NUMBER,
ACTL_CHRG_YTD_UPD NUMBER
);


INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('28-FEB-07','NV5000',-275.133708,-275.133708);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('28-FEB-07','NV5000',275.133708,0.000000);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-MAR-07','NV5000',-19.714995,-19.714995);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-MAR-07','NV5000',147.351333,127.636338);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('30-APR-07','NV5000',-77.253573,50.382765);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('30-APR-07','NV5000',125.007672,175.390437);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-MAY-07','NV5000',-65.570613,109.819824);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-MAY-07','NV5000',126.322005,236.141829);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('30-JUN-07','NV5000',-35.340954,200.800875);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('30-JUN-07','NV5000',111.864342,312.665217);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-JUL-07','NV5000',90.250866,402.916083);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-JUL-07','NV5000',0.000000,402.916083);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-AUG-07','NV5000',97.260642,500.176725);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-AUG-07','NV5000',0.000000,500.176725);


SELECT EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD, ACTL_CHRG_YTD_UPD
FROM a_test;

EFF_DT A_DESC ACTL_CHRG ACTL_CHRG_YTD ACTL_CHRG_YTD_UPD
========= ====== ============ ============= ==================
02/28/07 NV5000 -275.133708 -275.133708
02/28/07 NV5000 275.133708 0.000000
03/31/07 NV5000 -19.714995 -19.714995
03/31/07 NV5000 147.351333 127.636338
04/30/07 NV5000 -77.253573 50.382765
04/30/07 NV5000 125.007672 175.390437
05/31/07 NV5000 -65.570613 109.819824
05/31/07 NV5000 126.322005 236.141829
06/30/07 NV5000 -35.340954 200.800875
06/30/07 NV5000 111.864342 312.665217
07/31/07 NV5000 90.250866 402.916083
07/31/07 NV5000 0.000000 402.916083
08/31/07 NV5000 97.260642 500.176725
08/31/07 NV5000 0.000000 500.176725

-- update ACTL_CHRG_YTD_UPD (cumulative sum from ACTL_CHRG) column using MERGE statement

MERGE INTO a_test
USING ( SELECT rowid rid,
sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt)
new_actl_chrg_ytd_upd
FROM a_test
ORDER BY eff_dt) a
ON ( a_test.rowid = a.rid )
WHEN MATCHED
THEN UPDATE SET actl_chrg_ytd_upd = a.new_actl_chrg_ytd_upd
WHEN NOT MATCHED
THEN INSERT ( actl_chrg_ytd_upd ) VALUES ( null );


EFF_DT A_DESC ACTL_CHRG ACTL_CHRG_YTD ACTL_CHRG_YTD_UPD
======== ====== ============ ============== ===============
02/28/07 NV5000 -275.133708 -275.133708 0.000000 -275.133708
02/28/07 NV5000 275.133708 0.000000 0.000000
03/31/07 NV5000 -19.714995 -19.714995 127.636338 -19.714995
03/31/07 NV5000 147.351333 127.636338 127.636338
04/30/07 NV5000 -77.253573 50.382765 175.390437 50.382765
04/30/07 NV5000 125.007672 175.390437 175.390437
05/31/07 NV5000 -65.570613 109.819824 236.141829 109.819824
05/31/07 NV5000 126.322005 236.141829 236.141829
06/30/07 NV5000 -35.340954 200.800875 312.665217 200.800875
06/30/07 NV5000 111.864342 312.665217 312.665217
07/31/07 NV5000 90.250866 402.916083 402.916083
07/31/07 NV5000 0.000000 402.916083 402.916083
08/31/07 NV5000 97.260642 500.176725 500.176725
08/31/07 NV5000 0.000000 500.176725 500.176725

-- The updated ACTL_CHRG_YTD_UPD data does not match the ACTL_CHRG_YTD data.

Thanks for your help.


Tom Kyte
September 05, 2007 - 9:34 am UTC

I cannot reproduce in 9i or 10g


ops$tkyte%ORA9IR2> select a.*, sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt) newthing
  2    from a_test a
  3   order by eff_dt
  4  /

EFF_DT    A_DESC                ACTL_CHRG ACTL_CHRG_YTD ACTL_CHRG_YTD_UPD   NEWTHING
--------- -------------------- ---------- ------------- ----------------- ----------
28-FEB-07 NV5000               -275.13371    -275.13371                            0
28-FEB-07 NV5000               275.133708             0                            0
31-MAR-07 NV5000               -19.714995    -19.714995                   127.636338
31-MAR-07 NV5000               147.351333    127.636338                   127.636338
30-APR-07 NV5000               -77.253573     50.382765                   175.390437
30-APR-07 NV5000               125.007672    175.390437                   175.390437
31-MAY-07 NV5000               -65.570613    109.819824                   236.141829
31-MAY-07 NV5000               126.322005    236.141829                   236.141829
30-JUN-07 NV5000               -35.340954    200.800875                   312.665217
30-JUN-07 NV5000               111.864342    312.665217                   312.665217
31-JUL-07 NV5000                90.250866    402.916083                   402.916083
31-JUL-07 NV5000                        0    402.916083                   402.916083
31-AUG-07 NV5000                97.260642    500.176725                   500.176725
31-AUG-07 NV5000                        0    500.176725                   500.176725

14 rows selected.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> MERGE INTO a_test
  2  USING ( SELECT rowid rid,
  3                sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt)
  4                new_actl_chrg_ytd_upd
  5           FROM a_test
  6       ORDER BY eff_dt) a
  7  ON ( a_test.rowid = a.rid )
  8  WHEN MATCHED
  9       THEN UPDATE SET actl_chrg_ytd_upd = a.new_actl_chrg_ytd_upd
 10  WHEN NOT MATCHED
 11       THEN INSERT ( actl_chrg_ytd_upd ) VALUES ( null );

14 rows merged.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select a.*, sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt) newthing
  2    from a_test a
  3   order by eff_dt
  4  /

EFF_DT    A_DESC                ACTL_CHRG ACTL_CHRG_YTD ACTL_CHRG_YTD_UPD   NEWTHING
--------- -------------------- ---------- ------------- ----------------- ----------
28-FEB-07 NV5000               -275.13371    -275.13371                 0          0
28-FEB-07 NV5000               275.133708             0                 0          0
31-MAR-07 NV5000               -19.714995    -19.714995        127.636338 127.636338
31-MAR-07 NV5000               147.351333    127.636338        127.636338 127.636338
30-APR-07 NV5000               -77.253573     50.382765        175.390437 175.390437
30-APR-07 NV5000               125.007672    175.390437        175.390437 175.390437
31-MAY-07 NV5000               -65.570613    109.819824        236.141829 236.141829
31-MAY-07 NV5000               126.322005    236.141829        236.141829 236.141829
30-JUN-07 NV5000               -35.340954    200.800875        312.665217 312.665217
30-JUN-07 NV5000               111.864342    312.665217        312.665217 312.665217
31-JUL-07 NV5000                90.250866    402.916083        402.916083 402.916083
31-JUL-07 NV5000                        0    402.916083        402.916083 402.916083
31-AUG-07 NV5000                97.260642    500.176725        500.176725 500.176725
31-AUG-07 NV5000                        0    500.176725        500.176725 500.176725

14 rows selected.


Use of analytic functions in UPDATE statements

Ananth, September 05, 2007 - 12:17 pm UTC

Tom,

I cannot reproduce in 9i or 10g...

The script was created and tested in Release 9.2.0.1.0.

Problem:
I am not able to compute the cumulative total from the ACTL_CHRG column and update it in the ACTL_CHRG_YTD_UPD column.
(The cumulative data in the ACTL_CHRG_YTD_UPD column should look like the data in the ACTL_CHRG_YTD column.)

If I am not able to do this in SQL, can you suggest a PL/SQL approach?

Thanks,



Tom Kyte
September 05, 2007 - 5:33 pm UTC

well, you can take your "using" query and bulk collect that into arrays and then forall i update - sure.


select rowid, analytic() bulk collect into rid_array, analytic_array
  from t
.....;

forall i in 1 .. rid_array.count
  update t set column = analytic_array(i) where rowid = rid_array(i);
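
Applied to the a_test example above, a filled-in version of that outline might look like this (a sketch, not a tested solution):

declare
   type rid_list is table of rowid;
   type amt_list is table of a_test.actl_chrg_ytd_upd%type;
   l_rids rid_list;
   l_amts amt_list;
begin
   -- compute the running total once, remembering each row's rowid
   select rowid,
          sum(actl_chrg) over (partition by a_desc order by a_desc, eff_dt)
     bulk collect into l_rids, l_amts
     from a_test;

   -- then update by rowid in a single bulk statement
   forall i in 1 .. l_rids.count
      update a_test
         set actl_chrg_ytd_upd = l_amts(i)
       where rowid = l_rids(i);
end;
/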



A reader, September 06, 2007 - 5:40 am UTC

Tom,

Here is my test case

drop table t8;

create table t8 (a number, b number, c number);

truncate table t8;


begin

insert into t8 values (1, 2, 9);

insert into t8 values (1, 2, 7);

insert into t8 values (1,3,8);
insert into t8 values (1,3,9);
insert into t8 values (1,3,9);

insert into t8 values (1,3,6);

insert into t8 values (2, 7,6);

insert into t8 values (2, 7,4);

insert into t8 values (2,9,2);

insert into t8 values (2,9,3);

end;
/


If I were to do

SELECT a, b, c,
       LAST_VALUE ( c )  OVER ( PARTITION BY a ORDER BY b ) d,
       -- both first and last value will be the same for the same value of b
       FIRST_VALUE ( c ) OVER ( PARTITION BY a ORDER BY b
                                RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) l2,
       LAST_VALUE ( c )  OVER ( PARTITION BY a ORDER BY b
                                RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) m2
  FROM t8


Question:

1) Is it right to say that Oracle orders by column C as well, in addition to partitioning by A and ordering by B? This observation comes from the values in columns L2 and M2.


Thanks

Ravi
Tom Kyte
September 11, 2007 - 8:01 am UTC

no that is not right.

you will break up the data by A (partition by A)
you will order the data by B within each A partition.

that is all you can say - if the data happens to accidentally be sorted by C, that is just that - an accident and nothing you can be assured of.

Use of analytic functions in UPDATE statements

Ananth, September 06, 2007 - 2:47 pm UTC

MERGE INTO a_test
USING ( SELECT rowid rid,
sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt)
new_actl_chrg_ytd_upd
FROM a_test
ORDER BY eff_dt) a
ON ( a_test.rowid = a.rid )
WHEN MATCHED
THEN UPDATE SET actl_chrg_ytd_upd = a.new_actl_chrg_ytd_upd
WHEN NOT MATCHED
THEN INSERT ( actl_chrg_ytd_upd ) VALUES ( null );

Can you please help me convert the above SQL to use bulk collect for the update?

Thanks,

Tom Kyte
September 11, 2007 - 8:25 am UTC

I already did, right above, long time ago.

Please help me

Kailasanathan Subbaraman, September 07, 2007 - 12:48 pm UTC

Tom,

I have been a very big fan of yours. I have solved a lot of problems with analytic functions (truly,
analytics rock and analytics roll). But I have the following situation where I need your help.

I have a table with 2 columns :
Create Table A (F1 VARCHAR2(2), F2 VARCHAR2(3));

INSERT INTO A ( F1, F2 ) VALUES ( 'A1', '1');
INSERT INTO A ( F1, F2 ) VALUES ( 'A1', '2');
INSERT INTO A ( F1, F2 ) VALUES ( 'A1', '3');
INSERT INTO A ( F1, F2 ) VALUES ( 'A1', '4');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '2');
INSERT INTO A ( F1, F2 ) VALUES ( 'A3', '1');
INSERT INTO A ( F1, F2 ) VALUES ( 'A4', '5');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '3');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '4');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '5');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '6');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '7');
INSERT INTO A ( F1, F2 ) VALUES ( 'A3', '6');
INSERT INTO A ( F1, F2 ) VALUES ( 'A3', '7');
INSERT INTO A ( F1, F2 ) VALUES ( 'A3', '8');
INSERT INTO A ( F1, F2 ) VALUES ( 'A4', '5');
INSERT INTO A ( F1, F2 ) VALUES ( 'A4', '7');

I need to split these rows into, say, 3 groups (using a function such as NTILE) in such a way that all
records with the same value in column F1 end up in the same group.

e.g

A1 1 1
A1 2 1
A1 3 1
A1 4 1

A2 2 2
A2 3 2
A2 4 2
A2 5 2
A2 6 2
A2 7 2

A3 1 3
A3 6 3
A3 7 3
A3 8 3
A4 5 3
A4 7 3

Thanks

Tom Kyte
September 12, 2007 - 10:14 am UTC

insufficient specifications here.

I see no reason logically why A4 is in the group with A3 when A2 is not with A1

what if A3 had 10 rows - what would output be then.


Need *a lot* more information here as to the precise logic.

Kailasanathan

A reader, September 11, 2007 - 9:52 am UTC

why are
A3 8 3
and
A4 5 3
in the same group "3"
?

Kailash Subbaraman, September 11, 2007 - 10:46 am UTC

That's because the logic should be as follows:

1. Divide the total number of rows in the table into ranges so that each range has, as far as possible, an equal number of rows.
2. Records with the same value in F1 should not be split; they should be in the same set.

In the above example, based on these conditions, the first set has 4 rows and the second and third sets have 6 rows each, which satisfies condition 1. If F1 value A4 were in another set, that set would only have 2 records, which would violate condition 1.

Hope this helps
Tom Kyte
September 15, 2007 - 3:29 pm UTC

... Divide the total number of rows in the table into ranges so that each range
as far as possible has equal number of rows. ...

bin fitting problem - not easy to do in SQL, I do not recommend SQL for this particular solution.

This may be a starting point for you

Tyler, September 11, 2007 - 2:19 pm UTC

This could be a starting point for you; it will work out that A4 needs to be included in the A3 group, based on the requirements you've given...

Hope that helps.
WITH 
   BaseInfo AS
(
   SELECT 
      F1, 
      F2
   FROM A 
   ORDER BY F1 ASC
),
   SecondSet AS
(
   SELECT 
      F1, 
      F2, 
      COUNT(*)       OVER (PARTITION BY F1)                 AS GroupedCount,
      ROW_NUMBER()   OVER (PARTITION BY F1 ORDER BY F1 ASC) AS RowsPerGroup
   FROM BaseInfo
)
   SELECT 
         F1, 
         F2, 
         RowsPerGroup,
         GroupedCount,
         LAG(GroupedCount, RowsPerGroup - 1) OVER (PARTITION BY F1 ORDER BY F1 ASC),      
         CASE 
            WHEN 
                     GroupedCount < MAX(GroupedCount) OVER () 
               AND   (LAG(GroupedCount, RowsPerGroup - 1) OVER (PARTITION BY F1 ORDER BY F1 ASC) + GroupedCount) - MAX(GroupedCount) OVER () < 2
            THEN
               'INCLUDE IN PREVIOUS SET'
            ELSE
               'DON''T COMBINE'
         END           
   FROM SecondSet;

To: Kailash Subbaraman

Michel Cadot, September 16, 2007 - 2:34 am UTC


See an example of bin fitting problem and its solution with MODEL clause at:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:3170642805938#55879813219675

Regards
Michel

Analytic question

Lise, September 20, 2007 - 7:04 am UTC

I am sure I can do something smart with analytics here, but I am not sure what...
create table COMP
(
AMNT NUMBER,
MBMR NUMBER
);
create table AMOUNTS
(
AMNT NUMBER,
MBMR NUMBER
);
insert into COMP (AMNT, MBMR)
values (500, 1);
insert into COMP (AMNT, MBMR)
values (100, 1);
insert into COMP (AMNT, MBMR)
values (1000, 1);
insert into AMOUNTS (AMNT, MBMR)
values (100, 1);
insert into AMOUNTS (AMNT, MBMR)
values (500, 1);
insert into AMOUNTS (AMNT, MBMR)
values (250, 1);
insert into AMOUNTS (AMNT, MBMR)
values (250, 1);
insert into AMOUNTS (AMNT, MBMR)
values (250, 1);
insert into AMOUNTS (AMNT, MBMR)
values (250, 1);

I want to be able to match the payments in COMP with the payments in AMOUNTS.
Imagine I had 1000 in COMP and I wanted to find the rows in AMOUNTS that add up to 1000. It would be 250 * 4, but how can I automate this calculation?
How would I do this please?
Tom Kyte
September 24, 2007 - 7:19 am UTC

but if you find some 4 rows that make 1000, you might not be able to find the other X rows to make 100 or 500 - because if you used a row for 1000, it might have been the only row that could answer 100, but if you used it for 1000....

It seems to me that the matching of payments should have been done "during insertion" in real life.

I'm not sure a sql solution here is what you are looking for.

I agree

Lise, September 25, 2007 - 5:09 pm UTC

It is associated with payments received without a reference to what they are for, apart from the customer id. I think you are correct in that I would have to compute the sum of all possible combinations. I think the business needs to reconsider their approach, or live with the possibility that we can find up to 2 matched payments while the remainder has to be done as is.

Thanks Tom.

Analytical Help

Akki, October 02, 2007 - 6:10 am UTC

Hi Tom,

I am using a function to compute the required output field, but I wonder if it can be replaced with analytic functions (with the function, the performance is very poor).

This is the sample code for our transaction tables. The data in both tables is inserted using the same sequence.
I am dealing with historic data, so there is little scope for changing the table structures.

create table t_pur_orders (id number, advert_id number, area_code varchar2(10), advert_type varchar2(10),seq_id number);

create table t_orders (id number, advert_id number, area_code varchar2(10), advert_type varchar2(10));

INSERT INTO T_PUR_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 1, 1001, 'A', 'ONE', 1); 
INSERT INTO T_PUR_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 1, 1001, 'A', 'ONE', 2); 
INSERT INTO T_PUR_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 2, 1002, 'B', 'TWO', 1); 
INSERT INTO T_PUR_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 2, 1002, 'B', 'TWO', 2); 
INSERT INTO T_PUR_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 2, 1002, 'B', 'TWO', 3); 
INSERT INTO T_PUR_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 3, 1003, 'C', 'THREE', 1); 
INSERT INTO T_PUR_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 8, 1001, 'A', 'ONE', 1); 
INSERT INTO T_PUR_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 9, 1004, 'A', 'ONE', 1); 

COMMIT;

INSERT INTO T_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE ) VALUES ( 4, 1001, 'A', 'ONE'); 
INSERT INTO T_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE ) VALUES ( 5, 1002, 'B', 'TWO'); 
INSERT INTO T_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE ) VALUES ( 10, 1001, 'A', 'ONE'); 
INSERT INTO T_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE ) VALUES ( 11, 1004, 'A', 'ONE'); 
COMMIT;

Output
------

T_ORDERS.ID   T_PUR_ORDERS.ID   ADVERT_ID     AREA_CODE    ADVERT_TYPE
-----------   ---------------   ---------     ----------   ------------
4                    1          1001              A            ONE
5                    2          1002              B            TWO
10                   8          1001              A            ONE
11                   9          1004              A            ONE


The requirement is that I need to display all records from T_ORDERS along with the associated ID from T_PUR_ORDERS.
For every record in T_ORDERS, there will be one associated record in T_PUR_ORDERS. The matching columns in
both tables are ADVERT_ID, AREA_CODE and ADVERT_TYPE.

I need to get the latest ID from T_PUR_ORDERS where T_PUR_ORDERS.ID is less than or equal to T_ORDERS.ID, joining on the rest of the
columns.

I can achieve this by creating a function to get T_PUR_ORDERS.ID, but I wonder if this can be done using ANALYTIC FUNCTIONS.

The Function I am using so far.

Create or replace function f_get_pur_id (p_id number, p_advert_id number, p_area_code varchar2, p_advert_type varchar2) 
return number is
  v_id number;
begin
   select max(id) 
   into   v_id
   from   t_pur_orders
   where  id <= p_id
   and    advert_id = p_advert_id
   and    area_code = p_area_code
   and    advert_type = p_advert_type;
   
   return v_id;

end;
/


And the query I am using :
select id,
       advert_id,
       area_code,
       advert_type,
       f_get_pur_id(id, advert_id, area_code, advert_type) pur_id
  from t_orders;

Tom Kyte
October 03, 2007 - 5:02 pm UTC

you forgot to give us the first query - I'm not sure how to join these tables, no keys defined - no correlation names on the first result set, not clear.


but first - no function - just use sql - use a scalar subquery

select ..., 
      (select max(id) from t_pur_orders where id <= X.id and advert_id = X.advert_id and ... )
  from t_orders X;
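
Spelled out against the tables above (a sketch - the snippet abbreviates the predicate list):

select X.id, X.advert_id, X.area_code, X.advert_type,
       (select max(b.id)
          from t_pur_orders b
         where b.id <= X.id
           and b.advert_id = X.advert_id
           and b.area_code = X.area_code
           and b.advert_type = X.advert_type) pur_id
  from t_orders X;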




or just join and aggregate (analytics are cool, but they are used when you DO NOT want to aggregate)

ops$tkyte%ORA11GR1> select a.id, a.advert_id, a.area_code, a.advert_type, max(b.id)
  2    from t_orders a, t_pur_orders b
  3   where a.advert_id = b.advert_id
  4     and a.area_code = b.area_code
  5     and a.advert_type = b.advert_type
  6     and a.id >= b.id
  7   group by a.id, a.advert_id, a.area_code, a.advert_type
  8   order by a.id
  9  /

        ID  ADVERT_ID AREA_CODE  ADVERT_TYP  MAX(B.ID)
---------- ---------- ---------- ---------- ----------
         4       1001 A          ONE                 1
         5       1002 B          TWO                 2
        10       1001 A          ONE                 8
        11       1004 A          ONE                 9



Analytical Help

Akki, October 02, 2007 - 9:26 am UTC

Hi Tom,

Please ignore my previous request. I managed to get the output using the following query.

select id, advert_id, area_code, advert_type, lag
  from (select typ, id, advert_id, area_code, advert_type,
               lag(id) over (partition by advert_id, area_code, advert_type order by id) lag
          from (select 'po' typ, id, advert_id, area_code, advert_type from t_pur_orders
                union all
                select 'o' typ, id, advert_id, area_code, advert_type from t_orders
                order by id))
 where typ = 'o'


Thanks

Should I use an analytic function for this example?

BM, October 05, 2007 - 1:51 pm UTC

I have the following need. Do you think I can derive the following column (NEXT_CHNG_PR) using an analytic function?


      TIME  PRICE  NEXT_CHNG_PR (this is a derived field)
ROW1   800    2.4           2.5 (the next different price value, ordered by time)
ROW2   810    2.4           2.5
ROW3   820    2.5           2.3
ROW4   830    2.5           2.3
ROW5   900    2.3               (nothing - there is no next record)

Forgot to give version

BM, October 05, 2007 - 2:21 pm UTC

My oracle version is 9.2.0.8.

To BM

Jay, October 09, 2007 - 10:01 am UTC

Your question is so INCOMPLETE!! Give your create and inserts and explain your question more clearly.
You can't just give some values and request a query to arrive at that result. You know what I mean?
Thanks!

BM, October 09, 2007 - 5:31 pm UTC


CREATE TABLE TEST_NEXT_ATRB
(
  TIMEID  VARCHAR2(4 BYTE),
  SEQ_NB  NUMBER,
  PRICE   NUMBER(10,2)
)
/

INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0900', 5, 2.3); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0830', 4, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0820', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0810', 2, 2.4); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0800', 1, 2.5); 


SQL> select * from test_next_atrb
order by timeid, seq_nb

TIME     SEQ_NB      PRICE
---- ---------- ----------
0800          1        2.5
0810          2        2.4
0820          3        2.5
0830          4        2.5
0900          5        2.3


I need to derive a 4th column, NEXT_CHNG_PRICE, as below - 

SQL> select * from test_next_atrb
order by timeid, seq_nb

TIME     SEQ_NB      PRICE   NEXT_CHNG_PRICE
---- ---------- ---------- ---------------
0800          1        2.5   2.4
0810          2        2.4   2.5
0820          3        2.5   2.3
0830          4        2.5   2.3
0900          5        2.3   NULL


Try this out...

Jay, October 10, 2007 - 11:38 am UTC



select timeid,
       seq_nb,
       price,
       case
          when lead_price is null
          then lead(lead_price) over (partition by unique_key order by timeid, seq_nb)
          else lead_price
       end as next_diff
  from (
       select unique_key,
              timeid,
              seq_nb,
              price,
              case
                 when lead(price) over (partition by unique_key order by timeid, seq_nb) <> price
                 then lead(price) over (partition by unique_key order by timeid, seq_nb)
                 else null
              end as lead_price
         from (
              select 'Time_Group' as unique_key,
                     timeid,
                     seq_nb,
                     price
                from test_next_atrb
               order by timeid, seq_nb)
       )

Still an Issue

BM, October 11, 2007 - 1:27 pm UTC

Thanks for your time Jay.

Your query works for the data provided, but as soon as I have more than 2 records with the same price, it fails. For example, insert the following row and see what happens to the results: it returns NULL for 0820. The issue is that the data can have any number of rows before there is a change in value.

INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES (
'0845', 4, 2.5);


TIME     SEQ_NB      PRICE  NEXT_DIFF
---- ---------- ---------- ----------
0800          1        2.5        2.4
0810          2        2.4        2.5
0820          3        2.5       NULL
0830          4        2.5        2.3
0845          4        2.5        2.3
0900          5        2.3       NULL

hmm..

Jay, October 11, 2007 - 5:33 pm UTC

BM,

Sure, we can do this, but you need to have a common grouping indicator for the same prices. Here is an example of what I mean. Let me know if this makes sense.

--create
CREATE TABLE TEST_NEXT_ATRB
(
  TIMEID  VARCHAR2(4 BYTE),
  SEQ_NB  NUMBER,
  PRICE   NUMBER(10,2)
)



--insert
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0800', 1, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0810', 2, 2.4); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0820', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0830', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0845', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0900', 4, 2.3); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0915', 5, 3.2); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0930', 5, 3.2); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0945', 5, 3.2); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1000', 6, 3.8); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1015', 7, 4.0); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1030', 7, 4.0); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1045', 7, 4.0); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1100', 8, 5.3); 


--query

select timeid,       
       seq_nb,
       price,
       max(lead_price) over (partition by unique_key, seq_nb) as next_diff
       
  from (select unique_key,
               timeid,
               seq_nb,
               price,        
               case 
                when lead(price) over (partition by unique_key order by timeid, seq_nb) <> price
                then lead(price) over (partition by unique_key order by timeid, seq_nb)
                else null
               end as lead_price,
               rownum rn
          from 
                (
                select 'Time_Group' as unique_key,
                       timeid,
                       seq_nb,
                       price        
                  from test_next_atrb   
                order by timeid, seq_nb))                 



Here is the output:

 
TIMEID  SEQ_NB  PRICE  NEXT_DIFF
0800         1   2.50        2.4
0810         2   2.40        2.5
0820         3   2.50        2.3
0830         3   2.50        2.3
0845         3   2.50        2.3
0900         4   2.30        3.2
0915         5   3.20        3.8
0930         5   3.20        3.8
0945         5   3.20        3.8
1000         6   3.80        4
1015         7   4.00        5.3
1030         7   4.00        5.3
1045         7   4.00        5.3
1100         8   5.30



Thanks!

Still not resolved

BM, October 15, 2007 - 10:20 am UTC

Cool. But the problem is that we don't have any control over how the data comes. We do know that SEQ_NB is unique. So although the results look like what we want, they do not solve the original problem.

thanks for your time.

what???

Jay, October 15, 2007 - 11:39 am UTC

It's your database and it's your data. When you have no control over how the data comes, then you do have a problem. Probably a good time to talk to your data architect. Others might have a better solution. Good luck!

????

BM, October 18, 2007 - 10:35 am UTC

Well, this is how real-world situations are. If you can't solve the problem, you don't update the data to get around it.


Tom.. can you help?

Jay, October 23, 2007 - 11:06 am UTC

Hi Tom,

This isn't really my question, but I am wondering if the problem above can be solved with straight SQL or if we need to use cursors etc. to achieve this output?

Thanks a lot for your time!
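
Perhaps something along these lines would do it in straight SQL - a sketch only, and it relies on FIRST_VALUE ... IGNORE NULLS, so 10g or later: mark a row with the next price only when that price differs, then pull the first non-null marker at or after the current row:

select timeid, seq_nb, price,
       first_value(chg_price ignore nulls)
         over (order by timeid, seq_nb
               rows between current row and unbounded following) next_chng_price
  from ( select timeid, seq_nb, price,
                -- the next price, but only when it actually changes
                case when lead(price) over (order by timeid, seq_nb) <> price
                     then lead(price) over (order by timeid, seq_nb)
                end chg_price
           from test_next_atrb );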

A reader, December 12, 2007 - 2:44 am UTC

How do I execute a function which has an OUT parameter?
Tom Kyte
December 12, 2007 - 8:21 am UTC

variable x <datatype>
variable y <datatype>
exec :y := f(:x)


or

declare
x datatype;
y datatype;
begin
y := f(x);
end;
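
For instance, a concrete sketch with a hypothetical function f that has one OUT parameter and a return value:

create or replace function f( p_x out number ) return number
is
begin
   p_x := 42;    -- populate the OUT parameter
   return 0;     -- and return a value as well
end;
/

variable x number
variable y number
exec :y := f(:x)
print x y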

To 'BM'

Jay, December 14, 2007 - 11:14 am UTC

Based on Tom's 'carry forward technique', I think this query will solve your problem.

drop table TEST_NEXT_ATRB

create TABLE TEST_NEXT_ATRB
(
  TIMEID  VARCHAR2(4 BYTE),
  SEQ_NB  NUMBER,
  PRICE   NUMBER(10,2)
)
/

INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0900', 5, 2.3); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0830', 4, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0820', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0810', 2, 2.4); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0800', 1, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0845', 4, 2.5); 

commit;


QUERY:


select timeid,
       seq_nb,
       price,
       substr(max(test) over (order by timeid, seq_nb),12) as next_different_price
  from
(  
select timeid,
       seq_nb,
       price,
       case 
        when lead_price is not null
        then to_char(row_number() over (order by test),'fm00000000000') || lead_price
       end as test
  from       
(      
select timeid,
       seq_nb,
       price,
       test, 
       case 
        when lead(price) over (order by test) <> price
        then lead(price) over (order by test)
        else null
       end as lead_price
  from
(select timeid,
        seq_nb,
        price,
        row_number() over (order by timeid, seq_nb) as test       
   from test_next_atrb)))


Output:
TIMEID SEQ_NB PRICE NEXT_DIFFERENT_PRICE
0800      1   2.50   2.4
0810      2   2.40   2.5
0820      3   2.50   2.5
0830      4   2.50   2.5
0845      4   2.50   2.3
0900      5   2.30   2.3


Thanks,
Jay

ignore nulls

James Su, February 26, 2008 - 1:44 pm UTC

Hello Tom,
I want to use IGNORE NULLS in LAG but it's not supported, so what I did is replace it with:

last_value(... ignore nulls) over(partition by ... order by ... range between unbounded preceding and 1 preceding)

Is this correct or is there a better way?
Thank you.
Tom Kyte
February 26, 2008 - 3:12 pm UTC

depends, last_value might be sufficient - depending on your needs, it carries down the last non-null value, if that is what you want - then that is what you want.

If you understand what you've typed in - you'll be able to answer this yourself....

Is last value a replacement for lag - no way, doesn't do the same thing

Is last value what you want - perhaps, if you define your requirements not in terms of LAG (which appears to be not what you want) but in terms of "specification"

If you want the last non-null value in the partition above the current record - looks like you have it
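
For illustration, a minimal sketch against a hypothetical table t(id, val) where val is mostly null - the windowed LAST_VALUE stands in for the missing LAG ... IGNORE NULLS (note ROWS rather than RANGE, so "1 preceding" really means the previous row):

select id, val,
       last_value(val ignore nulls)
         over (order by id
               rows between unbounded preceding and 1 preceding) prev_non_null
  from t;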

More on Analytics

Shivdeep Modi, March 07, 2008 - 6:54 am UTC

Hi,

Basically I am trying to find all the roles that have only read-only (SELECT) privileges on tables and no other privileges, and which have
been granted to a specific set of users.

Here are the distinct privileges granted to the assigned roles (GRANTEE):

GRANTEE                        PRIVILEGE
------------------------------ ---------------
AO_SUPPORT_RO                  SELECT
BASE                           SELECT
DATA_ENTRY_CLERK               DELETE
DATA_ENTRY_CLERK               INSERT
DATA_ENTRY_CLERK               UPDATE
SUPERVISOR                     DELETE
SUPERVISOR                     EXECUTE
SUPERVISOR                     INSERT
SUPERVISOR                     SELECT
SUPERVISOR                     UPDATE
WNS_SUPERVISOR                 DELETE
WNS_SUPERVISOR                 EXECUTE
WNS_SUPERVISOR                 INSERT
WNS_SUPERVISOR                 UPDATE


I've written something like:

SQL > select grantee, count(privilege)
SQL > from (
SQL > select distinct grantee,privilege
SQL > from dba_tab_privs
SQL > where grantee in (
SQL > select distinct granted_role
SQL > from dba_role_privs
SQL > where grantee in (
SQL > select username
SQL > from dba_users
SQL > where replace(translate(username,'0123456789','+'),'+','') in ('N','U') -- Only find users like N122345 or U12345 etc
SQL > )
SQL > and granted_role not like 'IMDD%'
SQL > )
SQL > )
SQL > group by grantee
SQL > having count(privilege) = 1
SQL > /
GRANTEE                        COUNT(PRIVILEGE)
------------------------------ ----------------
AO_SUPPORT_RO                                 1
BASE                                          1

2 rows selected.

SQL >


But this is bound to fail if we have a single distinct privilege other than SELECT granted to the role(s). Can we use analytics, or is there a better way of doing it?

Regards,
Shivdeep
Tom Kyte
March 10, 2008 - 10:19 am UTC

just fyi - your use of the DISTINCT above shows that you don't have a full understanding of how set operations work:

SQL > where grantee in (
SQL > select distinct granted_role
SQL >  from dba_role_privs 


that distinct is superfluous - "IN" does the right thing, distinct is not necessary or desirable there.


start with this, you obviously have requirements above and beyond the stated (you have filters on usernames and what not)

ops$tkyte%ORA10GR2> select grantee,
  2         count( case when privilege = 'SELECT' then 1 end ) cnt_select,
  3             count( case when privilege <> 'SELECT' then 1 end ) cnt_not_select,
  4             count(*)
  5  from dba_tab_privs where grantee in (select role from dba_Roles)
  6  group by grantee
  7  /

GRANTEE                        CNT_SELECT CNT_NOT_SELECT   COUNT(*)
------------------------------ ---------- -------------- ----------
EXP_FULL_DATABASE                      20             19         39
AQ_ADMINISTRATOR_ROLE                  13              8         21
CTXAPP                                  0             10         10
XDBADMIN                                2              8         10
DBA                                     6             32         38
AQ_USER_ROLE                            0              4          4
SELECT_CATALOG_ROLE                  1767              3       1770
OLAP_USER                              11             30         41
HS_ADMIN_ROLE                          14              1         15
GATHER_SYSTEM_STATISTICS                1              3          4
IMP_FULL_DATABASE                       1             17         18
EXECUTE_CATALOG_ROLE                    0             77         77
DELETE_CATALOG_ROLE                     0              2          2
LOGSTDBY_ADMINISTRATOR                  0              2          2
OEM_MONITOR                             6              9         15
WM_ADMIN_ROLE                          12              0         12
MGMT_USER                              84              0         84

17 rows selected.

ops$tkyte%ORA10GR2> select grantee,
  2         count( case when privilege = 'SELECT' then 1 end ) cnt_select,
  3             count( case when privilege <> 'SELECT' then 1 end ) cnt_not_select,
  4             count(*)
  5  from dba_tab_privs where grantee in (select role from dba_Roles)
  6  group by grantee
  7  having count( case when privilege = 'SELECT' then 1 end ) = count(*)
  8  /

GRANTEE                        CNT_SELECT CNT_NOT_SELECT   COUNT(*)
------------------------------ ---------- -------------- ----------
WM_ADMIN_ROLE                          12              0         12
MGMT_USER                              84              0         84



Analytics

Prasanth, March 12, 2008 - 2:45 pm UTC

hi tom,

I am facing a problem with presenting row information as column values.

Create table account
( month number, acct number,subacct number,monvol number,yearvol number);

INSERT INTO account values(8,10,1,100,200);
INSERT INTO account values(8,10,2,20,30);
INSERT INTO account values(8,10,3,15,25);
INSERT INTO account values(9,10,1,50,10);
INSERT INTO account values(9,10,2,21,50);
INSERT INTO account values(9,20,1,2,15);

select * from account;

 MONTH  ACCT  SUBACCT  MONVOL  YEARVOL
------  ----  -------  ------  -------
     8    10        1     100      200
     8    10        2      20       30
     8    10        3      15       25
     9    10        1      50       10
     9    10        2      21       50
     9    20        1       2       15

The expected output is:

ACCT  SUBACCT  8_MONVOL  8_YEARVOL  9_MONVOL  9_YEARVOL
----  -------  --------  ---------  --------  ---------
  10        1       100        200        50         10
  10        3        15         25         0          0
  20        1         0          0         2         15

Could you help me to get this result? I tried using in-line queries but didn't get the expected output.
Can we get the result using an analytic function?

The business logic is:
1. All the account information is required, even if it is not present in one particular month.
2. Only accounts whose MONVOL or YEARVOL difference is greater than 10 are required.

Thank you
Tom Kyte
March 12, 2008 - 5:49 pm UTC

search this site for PIVOT


but anyway, are 8 and 9 the only ones or are you expecting this to magically figure out "8,9,10,11,12,...... whatever"

(if 8 and 9 are the only ones - this is easy, search for pivot and you are done. If you want magic - where it figures out how many columns the query will return - that won't be happening, you'd have to run a query to figure out how many columns you are going to have and then build a query that matches that using the same technique as 8, 9 did)
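
If 8 and 9 really are the only months, a minimal decode-based pivot sketch (it leaves out the "difference greater than 10" filter, which would need its own predicate once that rule is pinned down):

select acct, subacct,
       max(decode(month, 8, monvol,  0)) mon8_monvol,
       max(decode(month, 8, yearvol, 0)) mon8_yearvol,
       max(decode(month, 9, monvol,  0)) mon9_monvol,
       max(decode(month, 9, yearvol, 0)) mon9_yearvol
  from account
 group by acct, subacct
 order by acct, subacct;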

rank fn in 8.1.7.4.0

sara, April 02, 2008 - 1:44 pm UTC

Hi Tom,

The version I am using is, 8.1.7.4.0
When I run this on sql prompt, it works

SELECT empno from (
SELECT empno, ename, RANK() OVER (PARTITION BY ename ORDER BY JDT DESC ) authorrank
FROM emp
WHERE rownum < 5)
WHERE authorrank IN (1,2) AND ROWNUM < 5

But when I compile it as a procedure, it gives me an error:

CREATE OR REPLACE PROCEDURE test_anal(
p_emp_id IN NUMBER
, p_generic_cursor OUT report_pkg.cursor_typ )
IS
BEGIN
OPEN p_generic_cursor FOR
SELECT empno from (
SELECT empno, ename, RANK() OVER (PARTITION BY ename ORDER BY JDT DESC ) authorrank
FROM emp
WHERE rownum < 5)
WHERE authorrank IN (1,2) AND ROWNUM < 5;
END test_anal;

Please let me know if this is allowed in procedural calls.

Thanks

Sara
Tom Kyte
April 02, 2008 - 2:10 pm UTC

It is - in 9i and above.

in 8i, you can:
CREATE OR REPLACE PROCEDURE test_anal(
          p_emp_id    IN  NUMBER
        , p_generic_cursor  OUT report_pkg.cursor_typ )
IS
BEGIN
   OPEN p_generic_cursor FOR
       'SELECT empno from (
         SELECT empno, ename, 
                RANK() OVER (PARTITION BY ename  ORDER BY JDT DESC ) authorrank
         FROM   emp
         WHERE  rownum < 5)
      WHERE authorrank IN (1,2) AND ROWNUM < 5';
END test_anal;


use "dynamic" sql - that'll hide it from the plsql parser which didn't understand this construct way way way back when.

rank fn in 8.1.7.4.0

sara, April 02, 2008 - 3:20 pm UTC

Thanks Tom! That helps.

Sara

rank fn in 8.1.7.4.0

sara, April 02, 2008 - 3:21 pm UTC

Thank you Tom.

query

Sean, May 13, 2008 - 8:26 pm UTC

Hi Tom,

Here is the logic:
(1) when t1.c2 is not null, use both c1 and c2 in join
(2) when t1.c2 is null, use c1 for join only



create table t1(c1 number, c2 number);


create table t2(c1 number, c2 number);


insert into t1(c1, c2) values(1, 10);
insert into t1(c1) values(2);


insert into t2(c1, c2) values(2, 20);
insert into t2(c1, c2) values(1, 10);
insert into t2(c1) values(1);



Here is the result I want:



        C1         C2
---------- ----------
         1         10
         2         20



-- My query is like this one, but it does not work as I want.
select t2.c1, t2.c2
from t1, t2
where t2.c1 = t1.c1(+) and t2.c2 = t1.c2(+);

Thanks so much for your help. -- Sean


Tom Kyte
May 14, 2008 - 10:34 am UTC

that is not logical logic.

it is rather ugly.

but, your join is

t1.c1 = t2.c1

and then you have a filter

a) t2.c2 = t1.c2 OR t1.c2 is null


ops$tkyte%ORA10GR2> select *
  2    from t1, t2
  3   where t1.c1 = t2.c1
  4     and (t1.c2 is null OR t1.c2 = t2.c2)
  5  /

        C1         C2         C1         C2
---------- ---------- ---------- ----------
         2                     2         20
         1         10          1         10
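
That decomposition - join on c1, filter on "c2 matches or is null" - is perhaps easier to see in ANSI join syntax (a stylistic variant only):

select t2.c1, t2.c2
  from t2
  join t1
    on t1.c1 = t2.c1
   and ( t1.c2 is null or t1.c2 = t2.c2 );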

How analytic functions work

Dibyendu, May 14, 2008 - 5:34 am UTC

Hi Tom,
I have used a number of analytic functions, but I am not sure how these functions actually work internally. Could you please give an idea?
Thanks a lot.
Tom Kyte
May 14, 2008 - 3:27 pm UTC

not sure what you mean by "work internally"??

they sort, cache, sift through data - like order by, group by, whatever would

Ranking Function

Prasanth, May 29, 2008 - 9:42 am UTC

Hi Tom,
As per the business requirement, I want to compute a median-style RANK value for a column. Could you please tell me how to achieve it using aggregate or analytic functions?

Create table account ( month number(2), acct number(5),amt number(5));

INSERT INTO account values(1,101,90);
INSERT INTO account values(1,101,50);
INSERT INTO account values(1,101,50);
INSERT INTO account values(1,101,50);
INSERT INTO account values(1,101,40);
INSERT INTO account values(1,102,90);
INSERT INTO account values(1,102,50);
INSERT INTO account values(1,102,50);
commit;

select month,acct,amt,rank() over(partition by acct order by amt desc NULLS LAST) as rnk from account

     MONTH       ACCT        AMT        RNK
---------- ---------- ---------- ----------
         1        101         90          1
         1        101         50          2
         1        101         50          2
         1        101         50          2
         1        101         40          5
         1        102         90          1
         1        102         50          2
         1        102         50          2

But I require the RANK output as:

     MONTH       ACCT        AMT        RNK
---------- ---------- ---------- ----------
         1        101         90          1
         1        101         50          3
         1        101         50          3
         1        101         50          3
         1        101         40          5
         1        102         90          1
         1        102         50          3
         1        102         50          3

Thank you.
Tom Kyte
May 29, 2008 - 10:00 am UTC

well, I guess you'll have to explain your definition of a "median rank" in the manner you are presenting it.


Ranking Function

Prasanth, May 29, 2008 - 11:06 am UTC

Hi Tom,
Sorry for the confusing statement.
If there is a tie on the rank column (in this case, amount), instead of assigning the next rank number (i.e. 2 in this case) I want to assign the mean of the tied rows' rank numbers.

Take the tied rows' rank numbers in sequential order and compute (sum of those rank numbers / total number of tied rows).
For this example, for the tied amount 50 of account 101: (2+3+4) / 3 = 3.

Thank you
Tom Kyte
May 29, 2008 - 11:26 am UTC

perhaps something like:

ops$tkyte%ORA9IR2> select month, acct, amt, rnk,
  2         round( avg(rn) over (partition by acct, rnk) ) med_rnk
  3    from (
  4  select month,acct,amt,
  5         row_number() over(partition by acct order by amt desc NULLS LAST) as rn,
  6         rank() over(partition by acct order by amt desc NULLS LAST) as rnk
  7    from account
  8         )
  9  /

     MONTH       ACCT        AMT        RNK    MED_RNK
---------- ---------- ---------- ---------- ----------
         1        101         90          1          1
         1        101         50          2          3
         1        101         50          2          3
         1        101         50          2          3
         1        101         40          5          5
         1        102         90          1          1
         1        102         50          2          3
         1        102         50          2          3

8 rows selected.

Ranking Function

Prasanth, May 30, 2008 - 5:34 am UTC

Hi Tom,
Yes, this is exactly what I am looking for. Thank you for the quick response. We are migrating an application from the mainframe to Oracle, and the mainframe has this sort of ranking.

busy day busy hour

Ramakrishnan, June 09, 2008 - 9:09 am UTC

Hi Tom,

My table

CREATE TABLE TRAFFIC
( CO_ID NUMBER NOT NULL ENABLE,
START_TIME DATE NOT NULL ENABLE,
TCH_REQUEST NUMBER,
CONSTRAINT TRAFFIC_PK PRIMARY KEY (CO_ID, START_TIME)
)

I am using the following SQL to find the busy hour of a day - finding when the counter (TCH_REQUEST) is at its maximum within a day.

Select co_id, Day, hour, TCH_request FROM
( Select co_id, Day, hour, TCH_request, RANK() OVER (PARTITION BY Day ORDER BY Day,
TCH_request DESC NULLS LAST) as TCH_request_Rank
FROM
( SELECT co_id, TRUNC(START_TIME,'DD') Day, TRUNC(START_TIME,'HH24') Hour,
SUM(TCH_request) TCH_request
FROM omc.Traffic
group by TRUNC(START_TIME,'HH24'), co_id, TRUNC(START_TIME,'DD') ) groupByHr
) hr_Rank
WHERE TCH_request_Rank = 1;

The innermost SQL does the group by hour, the next one ranks the hours within each day, and the outermost SQL displays the busy hour of the day.


This SQL is perfectly fine.


1. But is it possible to find the "busy day busy hour", i.e. find the busy day in a week and, on that busy day, find the busy hour - in a single SQL statement?

2. Is it possible to find the busy day within every 3 days and, on that busy day, find the busy hour?

Best regards
Ramakrishnan.A

Tom Kyte
June 09, 2008 - 1:46 pm UTC

... This SQL is perfectly fine. ...

I think not.

it does not return the "busiest hour"

it returns the busiest hour on the first day for a GIVEN co_id

the innermost sql does not aggregate by hour, it aggregates by co_id and hour of day.

please rephrase this entirely - it doesn't add up.

busy day busy hour

Ramakrishnan.A, June 10, 2008 - 5:55 am UTC

Hi Tom,

Thanks for spotting that at first glance!!

I corrected my SQL; I added "PARTITION BY co_id, Day ORDER BY Day". I want to find the busiest hour for a given co_id, for every day.

Select co_id, to_char( Day, 'dd-mm-yyyy' ), to_char( hour, 'dd-mm-yyyy hh:mi' ), TCH_request FROM
( Select co_id, Day, hour, TCH_request, RANK() OVER ( PARTITION BY co_id, Day ORDER BY Day,
TCH_request DESC NULLS LAST) as TCH_request_Rank
FROM
( SELECT co_id, TRUNC(START_TIME,'DD') Day, TRUNC(START_TIME,'HH24') Hour,
SUM(TCH_request) TCH_request
FROM omc.Traffic
group by TRUNC(START_TIME,'HH24'), co_id, TRUNC(START_TIME,'DD') ) groupByHr
) hr_Rank
WHERE TCH_request_Rank = 1;

So my questions again:
1. Is it possible to find the "busy day busy hour", i.e. find the busy day in a week and, on that busy day, find the busy hour, in a single SQL statement?
2. Is it possible to find the busy day within every 3 days and, on that busy day, find the busy hour?


Best regards
Ramakrishnan.A
Tom Kyte
June 10, 2008 - 8:08 am UTC

so, now, I need data - interesting data - provide the create table and insert into statements - make sure to have more than one co_id, a busy day that is not the "first" day and so on.

Busy day busy hour

Ramakrishnan.A, June 10, 2008 - 12:05 pm UTC

Hi Tom ,

I didn't provide the data before because I thought it would fill the page.


CREATE TABLE "TRAFFIC"
( "CO_ID" NUMBER NOT NULL ENABLE,
"START_TIME" DATE NOT NULL ENABLE,
"TCH_REQUEST" NUMBER,
CONSTRAINT "TRAFFIC_PK" PRIMARY KEY ("CO_ID", "START_TIME"))

data :


insert into traffic values ( 1, to_date('9-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('9-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('9-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 12 );
insert into traffic values ( 1, to_date('9-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 2, to_date('9-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('9-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 12 );

insert into traffic values ( 1, to_date('9-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 65 );
insert into traffic values ( 2, to_date('9-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 30 );
insert into traffic values ( 3, to_date('9-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 9 );
insert into traffic values ( 1, to_date('9-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('9-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('9-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 9 );

insert into traffic values ( 1, to_date('9-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('9-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 70 );
insert into traffic values ( 3, to_date('9-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 35);
insert into traffic values ( 1, to_date('9-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('9-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('9-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 14 );

insert into traffic values ( 1, to_date('10-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 25 );
insert into traffic values ( 2, to_date('10-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('10-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 12 );
insert into traffic values ( 1, to_date('10-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 2, to_date('10-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('10-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 12 );

insert into traffic values ( 1, to_date('10-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 2, to_date('10-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 73 );
insert into traffic values ( 3, to_date('10-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 9 );
insert into traffic values ( 1, to_date('10-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('10-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('10-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 9 );

insert into traffic values ( 1, to_date('10-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('10-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('10-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 35);
insert into traffic values ( 1, to_date('10-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('10-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('10-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 14 );

insert into traffic values ( 1, to_date('11-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 25 );
insert into traffic values ( 2, to_date('11-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('11-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 12 );
insert into traffic values ( 1, to_date('11-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 2, to_date('11-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('11-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 12 );

insert into traffic values ( 1, to_date('11-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('11-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 30 );
insert into traffic values ( 3, to_date('11-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 9 );
insert into traffic values ( 1, to_date('11-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('11-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('11-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 9 );

insert into traffic values ( 1, to_date('11-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('11-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('11-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 85);
insert into traffic values ( 1, to_date('11-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('11-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('11-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 14 );

The busy day for each CO_ID is shown below; on this day I want the busy hour:

CO_ID        DAY TCH_REQUEST
----- ---------- -----------
    1 09-06-2008         127
    2 09-06-2008         142
    3 11-06-2008         141

Tom Kyte
June 10, 2008 - 2:47 pm UTC


ops$tkyte%ORA10GR2> select co_id, dt, tch, sum_tch
  2    from (
  3  select co_id, dt, tch, sum_tch, rank() over (partition by co_id order by sum_tch DESC nulls last) big_by_day
  4    from (
  5  select co_id, dt, trunc(dt,'dd'), tch,
  6         rank() over (partition by co_id, trunc(dt,'dd') order by tch DESC nulls last) big_by_hour,
  7             sum(tch) over (partition by co_id, trunc(dt,'dd') ) sum_tch
  8    from (
  9  select co_id, trunc(start_time,'hh') dt, sum(tch_request) tch
 10    from traffic
 11   group by co_id, trunc(start_time,'hh')
 12         )
 13             )
 14   where big_by_hour = 1
 15         )
 16   where big_by_day = 1
 17  /

     CO_ID DT                          TCH    SUM_TCH
---------- -------------------- ---------- ----------
         1 09-jun-2008 07:00:00         78        127
         2 09-jun-2008 08:00:00         80        142
         3 11-jun-2008 08:00:00         99        141



busy day busy hour

Ramakrishnan.A, June 13, 2008 - 12:25 pm UTC

Thanks Tom,
I am trying to find the busiest day of a week, and the busy hour on that busy day. I tested it and it works fine.
How can I find the busy day within every 3 days or 5 days?


select *
  from (
       select co_id, to_char(dt,'dd-mm-yyyy HH:MI') bdbh, to_char(dt,'IW') week,
              tch, big_by_hour, tch_day_sum, big_by_day
         from (
              select co_id, dt, tch, tch_day_sum,
                     dense_rank() over (partition by co_id, trunc(dt,'IW') order by tch_day_sum DESC nulls last) big_by_day,
                     dense_rank() over (partition by co_id, trunc(dt,'dd') order by tch DESC nulls last) big_by_hour
                from (
                     select co_id, dt, trunc(dt,'dd'), tch,
                            --rank() over (partition by co_id, trunc(dt,'dd') order by tch DESC nulls last) big_by_hour,
                            sum(tch) over (partition by co_id, trunc(dt,'dd')) tch_day_sum
                       from (
                            select co_id, trunc(start_time,'hh') dt, sum(tch_request) tch
                              from traffic
                             group by co_id, trunc(start_time,'hh')
                            )
                     )
              )
       )
 where big_by_day = 1 and big_by_hour = 1
 order by week, co_id

Tom Kyte
June 13, 2008 - 1:26 pm UTC

... How can I find Busy day with in 3 days or 5 days ?
....

define better what you mean. don't post a query that doesn't work and expect we can reverse engineer it to figure out what you really meant - post a textual description detailing precisely the logic you need to employ - like a requirement, like you were describing to your mom how to do this.
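
For what it's worth, if "every 3 days" simply means fixed three-day buckets counted from the earliest day in the table (an assumption - the requirement was never pinned down), one sketch would swap the week partition for a computed bucket:

select co_id, dt, tch, sum_tch
  from (
select co_id, dt, tch, sum_tch,
       rank() over (partition by co_id, bucket order by sum_tch DESC nulls last) big_by_bucket
  from (
select co_id, dt, tch,
       rank() over (partition by co_id, trunc(dt,'dd') order by tch DESC nulls last) big_by_hour,
       sum(tch) over (partition by co_id, trunc(dt,'dd') ) sum_tch,
       -- bucket = floor( (day - earliest day) / 3 )
       floor( (trunc(dt,'dd') - min(trunc(dt,'dd')) over ()) / 3 ) bucket
  from (
select co_id, trunc(start_time,'hh') dt, sum(tch_request) tch
  from traffic
 group by co_id, trunc(start_time,'hh')
       )
       )
 where big_by_hour = 1
       )
 where big_by_bucket = 1
/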

Sorting columns

Bhavesh Ghodasara, June 16, 2008 - 8:31 am UTC

Hi Tom,

I want to sort column values.

e.g.
create table t
(a number,
b number,
c number,
d number,
e number,
f number);

insert into t values (6,3,2,1,4,5);

insert into t values (40,10,60,30,20,50);

select *
from t;

  A   B   C   D   E   F
  6   3   2   1   4   5
 40  10  60  30  20  50


I want output like :

1#2#3#4#5#6
10#20#30#40#50#60

It is possible that some columns contain null values. In that case we can ignore those values.

Thanks in advance.
Tom Kyte
June 16, 2008 - 1:13 pm UTC

one has to wonder what sort of data this would be then, to be stored in a relational database - I'll bet the column names are like "C1, C2, ... C6" - meaning they are a 1:m relationship stored incorrectly IN RECORD.


ops$tkyte%ORA10GR2> select max( decode( rn, 1, column_value ) ) c1,
  2         max( decode( rn, 2, column_value ) ) c2,
  3         max( decode( rn, 3, column_value ) ) c3,
  4         max( decode( rn, 4, column_value ) ) c4,
  5         max( decode( rn, 5, column_value ) ) c5,
  6         max( decode( rn, 6, column_value ) ) c6
  7    from (
  8  select t.rowid rid, c.column_value, row_number() over (partition by t.rowid order by c.column_value) rn
  9    from T, table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c
 10         )
 11   group by rid
 12  /

        C1         C2         C3         C4         C5         C6
---------- ---------- ---------- ---------- ---------- ----------
         1          2          3          4          5          6
        10         20         30         40         50         60


why is odcinumberlist not working for a simple query?

A reader, June 18, 2008 - 4:34 pm UTC

askus@oracle11g> select * from table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c;
select * from table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c
                                                           *
ERROR at line 1:
ORA-00904: "F": invalid identifier


askus@oracle11g> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE    11.1.0.6.0      Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

askus@oracle11g> table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c;
SP2-0734: unknown command beginning "table ( se..." - rest of line ignored.
askus@oracle11g> select * from table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c;
select * from table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c
                                                           *
ERROR at line 1:
ORA-00904: "F": invalid identifier


askus@oracle11g> 

Tom Kyte
June 18, 2008 - 4:39 pm UTC

well, you are not selecting from a table that has F, you are selecting from dual which only has dummy

I had their table T

  9    from T, table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c


a,b,c,d,e,f came from T

How to do "right outer join" using lag/lead

Salman Syed, July 09, 2008 - 3:01 pm UTC

Tom,

I have been struggling with this for some time now. Here is a simplified version of my problem.

Suppose the data is as follows:

Month  Year  Salary
    1  2008     500
    2  2008     600
    1  2007     400
    2  2007     450
    3  2007     475
    4  2008     625

The output I want is:

Month  This_Year_Salary  Last_Year_Salary
    1               500               400
    2               600               450
    3            (null)               475
    4               625            (null)

Is there a way to achieve this using analytics?

If I use lag, I miss the third record in the output. If I use lead, I miss the fourth record in the output.

Thanks a lot for helping me out.

Tom Kyte
July 09, 2008 - 3:24 pm UTC

this is a pivot

select month, 
       max( decode( year, 2007, salary ) ) last_year,
       max( decode( year, 2008, salary ) ) this_year
  from t
 group by month
 order by month
/


analytics are for when you want all rows preserved (you do not, you want to squish out some)
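
A variant of the same pivot, if the two years should track the current date rather than being hard-coded (an assumption about the requirement):

select month,
       max( case when year = extract(year from sysdate) - 1 then salary end ) last_year,
       max( case when year = extract(year from sysdate)     then salary end ) this_year
  from t
 group by month
 order by month;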


Reader, August 04, 2008 - 10:35 am UTC

create table business_data
(dt date
,val varchar2(5)
,price1 number(5,2)
,price2 number(5,2)
);


select to_date('06/26/2008 9:30:20 AM' ,'mm/dd/yyyy HH:MI:SS AM') from dual

insert into business_data
values
(to_date('06/26/2008 9:30:20 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.98,34.03);

insert into business_data
values
(to_date('06/26/2008 9:30:20 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.98,34.02);


insert into business_data
values
(to_date('06/26/2008 09:30:21 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.98,34.06);

insert into business_data
values
(to_date('06/26/2008 09:30:21 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.98,34.07);


insert into business_data
values
(to_date('06/26/2008 09:30:25 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.05,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);


insert into business_data
values
(to_date('06/26/2008 09:30:26 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.02,34.1);

insert into business_data
values
(to_date('06/26/2008 09:30:28 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.02,34.1);

insert into business_data
values
(to_date('06/26/2008 09:30:29 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.02,34.08);

insert into business_data
values
(to_date('06/26/2008 09:30:29 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.02,34.08);


insert into business_data
values
(to_date('06/26/2008 09:30:39 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.03,34.03);

insert into business_data
values
(to_date('06/26/2008 09:30:39 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.03,34.06);

commit;

DT                      VAL    PRICE1  PRICE2
----------------------  -----  ------  ------
06/26/2008 09:30:20 am  ABC     33.98   34.03
06/26/2008 09:30:20 am  ABC     33.98   34.02
06/26/2008 09:30:21 am  ABC     33.98   34.06
06/26/2008 09:30:21 am  ABC     33.98   34.07
06/26/2008 09:30:25 am  ABC     33.99   34.09
06/26/2008 09:30:25 am  ABC     33.99   34.09
06/26/2008 09:30:25 am  ABC     34.05   34.09
06/26/2008 09:30:25 am  ABC     33.99   34.09
06/26/2008 09:30:25 am  ABC     33.99   34.09
06/26/2008 09:30:25 am  ABC     33.99   34.09
06/26/2008 09:30:25 am  ABC     33.99   34.09
06/26/2008 09:30:26 am  ABC     34.02   34.1
06/26/2008 09:30:28 am  ABC     34.02   34.1
06/26/2008 09:30:29 am  ABC     34.02   34.08
06/26/2008 09:30:29 am  ABC     34.02   34.08
06/26/2008 09:30:39 am  ABC     34.03   34.03
06/26/2008 09:30:39 am  ABC     34.03   34.06

For each minute in the business_data table, I need to get the min of the price1 column and the max of the price2 column, and sort the data by the val and dt columns.
Here is the query I used to get this data.

select distinct val,dt, min_price_1,max_price_2
from(
select val,
dt,
first_value(price1) over (partition by val,dt order by price1) min_price_1,
first_value(price2) over (partition by val,dt order by price2 desc) max_price_2
from business_data
)
order by val,dt;

VAL  DT                      MIN_PRICE_1  MAX_PRICE_2
---  ----------------------  -----------  -----------
ABC  06/26/2008 09:30:20 am        33.98        34.03
ABC  06/26/2008 09:30:21 am        33.98        34.07
ABC  06/26/2008 09:30:25 am        33.99        34.09
ABC  06/26/2008 09:30:26 am        34.02        34.1
ABC  06/26/2008 09:30:28 am        34.02        34.1
ABC  06/26/2008 09:30:29 am        34.02        34.08
ABC  06/26/2008 09:30:39 am        34.03        34.06

7 rows selected.



1st record: If the time difference between the first record and the second record is one second,
I need to get the minimum of PRICE_1 across the 1st and 2nd records (i.e. min(33.98, 33.98) = 33.98) and
the maximum of PRICE_2 across the 1st and 2nd records (i.e. max(34.03, 34.07) = 34.07).

Result:
val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:20 am, 33.98,34.07


2nd record: If the time difference between the 2nd record and the 1st record is one second, and the time difference between the 2nd and 3rd records is one second,
I need to take the min PRICE_1 among the first three records (i.e. min(33.98, 33.98, 33.99) = 33.98) and
the max PRICE_2 among the first three records (i.e. max(34.03, 34.07, 34.09) = 34.09).

val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:21 am,33.98,34.09


I do not have data for 06/26/2008 09:30:22 am,06/26/2008 09:30:23 am, 06/26/2008 09:30:24 am. I should get null values for that time

val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:22 am,null,null
ABC,06/26/2008 09:30:23 am,null,null
ABC,06/26/2008 09:30:24 am,null,null

For 09:30:25 am, I should compare it with 09:30:26 am only, since there is no data for 09:30:24 am.
val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:21 am,33.99,34.1

Is this possible to do using analytical functions?
Tom Kyte
August 04, 2008 - 2:06 pm UTC

you wrote:


For each minute in the business_data table, I need to get the min of price1
column and max of price2 column
and sort the data by val, dt column.
Here is the query I used to get this data.


but you don't do that - you are breaking the data up by val,dt - ordering by price1 - there is nothing about minutes here at all?


so, tell you what, lose the queries that do not work - and in text, describe what you want.


your description contradicts your query, which contradicts your subsequent "within one second" talk. You talk of first and next records - we presume you really mean "after partitioning by val", so there are many possible first records (and update the inserts to reflect that, make the data reasonable)

what if 6,000 records all exist within 1 second of each other - does "minute" still come into play? spend a little more time on the problem description - like you were writing specifications for a programmer to write code from - it'll take time to do that (and you might find the answer to your question as you do)


Reader, August 04, 2008 - 10:40 am UTC

The last record should be as below. Apologies for the typo.

val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:25 am,33.99,34.1

can rolling up be done easily ?

Pranav Parekh, August 05, 2008 - 10:59 am UTC

I want to roll up the sum of salaries for each employee to his manager in a hierarchy. For example, I want to add one more column to the query below to show, for each manager, the sum of the salaries of the employees who work under them. Can we do it in one query?

SELECT EMPNO, ENAME, SAL, MGR,
SYS_CONNECT_BY_PATH (ENAME, '/') HIER, LEVEL
FROM EMP
START WITH MGR IS NULL
CONNECT BY PRIOR EMPNO = MGR;
Tom Kyte
August 05, 2008 - 1:08 pm UTC


ops$tkyte%ORA10GR2> SELECT EMPNO, ENAME, SAL, MGR,  SYS_CONNECT_BY_PATH (ename, '/') HIER, LEVEL,
  2        (select sum(e2.sal) from scott.emp e2 start with e2.empno = e1.empno connect by prior e2.empno = e2.mgr) sum_sal
  3  FROM scott.EMP e1
  4  START WITH MGR IS NULL
  5  CONNECT BY PRIOR EMPNO = MGR;

     EMPNO ENAME             SAL        MGR HIER                                LEVEL    SUM_SAL
---------- ---------- ---------- ---------- ------------------------------ ---------- ----------
      7839 KING             5000            /KING                                   1      26825
      7566 JONES            2975       7839 /KING/JONES                             2       8675
      7788 SCOTT             800       7566 /KING/JONES/SCOTT                       3       1900
      7876 ADAMS            1100       7788 /KING/JONES/SCOTT/ADAMS                 4       1100
      7902 FORD             3000       7566 /KING/JONES/FORD                        3       3800
      7369 SMITH             800       7902 /KING/JONES/FORD/SMITH                  4        800
      7698 BLAKE            2850       7839 /KING/BLAKE                             2       9400
      7499 ALLEN            1600       7698 /KING/BLAKE/ALLEN                       3       1600
      7521 WARD             1250       7698 /KING/BLAKE/WARD                        3       1250
      7654 MARTIN           1250       7698 /KING/BLAKE/MARTIN                      3       1250
      7844 TURNER           1500       7698 /KING/BLAKE/TURNER                      3       1500
      7900 JAMES             950       7698 /KING/BLAKE/JAMES                       3        950
      7782 CLARK            2450       7839 /KING/CLARK                             2       3750
      7934 MILLER           1300       7782 /KING/CLARK/MILLER                      3       1300

14 rows selected.

Thanks !!

Pranav Parekh, August 06, 2008 - 3:38 pm UTC

Thanks Tom. It works.

can i use Analytic to ..

rain, August 07, 2008 - 4:21 am UTC

hi, Tom

select * from (
select empno,ename,job,deptno,
row_number() over(partition by deptno order by empno) as seq
from emp);

EMPNO  ENAME   JOB        DEPTNO  SEQ
-----  ------  ---------  ------  ---
 7369  SMITH   CLERK          20    1
 7566  JONES   MANAGER        20    2
 7788  SCOTT   ANALYST        20    3
 7876  ADAMS   CLERK          20    4
 7902  FORD    ANALYST        20    5
 7499  ALLEN   SALESMAN       30    1
 7521  WARD    SALESMAN       30    2
 7654  MARTIN  SALESMAN       30    3
 7698  BLAKE   MANAGER        30    4
 7844  TURNER  SALESMAN       30    5
 7900  JAMES   CLERK          30    6


If I want to get two employees from deptno 20 and three employees from deptno 30:

EMPNO  ENAME   JOB        DEPTNO  SEQ
-----  ------  ---------  ------  ---
 7369  SMITH   CLERK          20    1
 7566  JONES   MANAGER        20    2
 7499  ALLEN   SALESMAN       30    1
 7521  WARD    SALESMAN       30    2
 7654  MARTIN  SALESMAN       30    3

Right now I use two SQL statements UNIONed together to achieve this, but
can it be done with analytic functions in a single SQL statement?

Thanks
Maer
Tom Kyte
August 07, 2008 - 1:23 pm UTC

select * from (
select empno,ename,job,deptno,
row_number() over(partition by deptno order by empno) as seq
from emp)
where (deptno=20 and seq <= 2)
or (deptno=30 and seq<=3);

thanks~~

rain, August 08, 2008 - 12:09 am UTC

YA~YA~YA~ thanks !!
select * from (
select empno,ename,job,deptno,
row_number() over(partition by deptno order by empno) as seq from emp)
where (deptno>=30 and seq <= 3) or (deptno<30 and seq<=2);

I was too focused on the analytic functions and neglected the WHERE condition!!
Tom, you are the smartest.

Now, I still have a problem.
The deptnos are 10, 20, 30.
There are 3 emps in deptno 10, 5 emps in deptno 20 and 6 emps in deptno 30.
I want to display 5 emps: three from deptno 30, and two from below deptno 30 (deptno 10 or 20).

in my original sql is :

select * from(
select empno,ename,job,deptno from emp where deptno=30 order by sys_guid())
where rownum<4
union
select * from(
select empno,ename,job,deptno from emp where deptno <30 order by sys_guid())
where rownum<3;

 EMPNO ENAME    JOB         DEPTNO
------ -------- --------- --------
  7698 BLAKE    MANAGER         30
  7844 TURNER   SALESMAN        30
  7900 JAMES    CLERK           30
  7902 FORD     ANALYST         20
  7934 MILLER   CLERK           10

(Our production system has some queries like this; for performance reasons I want to tune them.)

I use analytic SQL as in my last question, but how do I group deptno 30 versus below deptno 30 (20, 10)?
Can this be achieved with analytic functions in a single SQL statement?

thanks

above

rain, August 13, 2008 - 4:50 am UTC

Hi,Tom
would you give me some hints on my last question.

thanks
Tom Kyte
August 18, 2008 - 8:54 am UTC

this:

I use analytic sql like last question, but how to group deptno 30 , below deptno 30(20,10) ?
Can it use analytic functions and single sql achieve??


did not make sense, "how to group deptno 30", what does grouping have to do with anything??



if you really mean:

we need 3 rows from deptno = 30, they are to be RANDOM rows

we need 2 rows not from deptno = 30, they are to be RANDOM rows

and you need this to be "efficient"

you are not going to get there from here easily......

RE

rain, August 19, 2008 - 5:41 am UTC

Tom ,
Thanks for your reply.

"we need 3 rows from deptno = 30, they are to be RANDOM rows "
"we need 2 rows not from deptno = 30, they are to be RANDOM rows "
Yes, we want it.
And do you mean that we should split it into two SQLs, one for deptno 30 and the other for below
deptno 30, and then UNION them - is that the best way in this case?

Performance of Analytic functions

Tilak, August 28, 2008 - 9:56 am UTC

Hi Tom,
I am writing a query like

select * from (SELECT c.*,
Row_number() OVER (PARTITION BY customer_number ORDER BY extract_date DESC) rnk
FROM cust_table c ) where rnk=1 ;
I want to extract one record in case of multiples, ordered by extract_date.

The table cust_table has around 3 million records and the query takes around 5 minutes to come up with a result. It doesn't seem to be using any indexes. Is there any way to reduce the time, as I want to use this in a view that needs to be quick?

thanks in advance,
Tom Kyte
August 29, 2008 - 10:30 pm UTC

thank goodness it uses no indexes!!!! can you imagine how slow it would be if it did!!! holy cow!!!

if you want to see that in action, do something like:


select * from cust_table c1
where extract_date = ( select max(c2.extract_date) from cust_table c2 where c2.customer_number = c1.customer_number)

and hint the heck out of it to use the index on (customer_number,extract_date) - then sit back and wait.....


You'd need to tell us what this query does and what it waits on. TKPROF it (with 10046 level 12 trace - or in 10g use dbms_monitor to trace wait events). Show us what it does and what it waits on.
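For reference, a minimal sketch of the tracing being suggested - the 10g dbms_monitor route, tracing your own session with waits and binds (the equivalent of a 10046 level 12 trace):

exec dbms_monitor.session_trace_enable( waits => true, binds => true );
-- run the slow query here...
exec dbms_monitor.session_trace_disable;

Then run tkprof on the resulting trace file from user_dump_dest and look at what the query actually did and waited on.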

Lag functions

Seetha, August 28, 2008 - 2:00 pm UTC

Hi Tom,

I want to fetch the previous record whose column value equals one. Note that we have to navigate back through the previous records until we find the record that meets this condition.

For ex:-

ColA  ColB  ColC
R1    1     0
R2    NULL  10
R3    NULL  10
R4    NULL  10
R5    NULL  10
R6    NULL  10

In this example, for R6, it should fetch R1 as the previous record because its ColB equals 1. We should ignore previous records with NULLs.

ColA  ColB  ColC
R1    1     0
R2    NULL  10

In this example, for R2, it should fetch R1 as the previous record because its ColB equals 1.

Thanks,
Seetha

Tom Kyte
August 29, 2008 - 10:57 pm UTC

I would have enjoyed showing you lag(x ignore nulls) had only you provided

a) a version
b) a create table
c) inserts into table....


but now you know what to look for!
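For reference, a minimal sketch of the approach being hinted at - assuming a release whose LAG supports IGNORE NULLS, and a hypothetical table t with the ColA/ColB/ColC columns from the example:

select cola, colb, colc,
       lag( case when colb = 1 then cola end ignore nulls )
           over (order by cola) prev_matching_row
  from t;

The CASE maps rows whose ColB is not 1 to NULL, and IGNORE NULLS then makes LAG skip back over those to the most recent qualifying row.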

sorting in analytical functions

hash, December 03, 2008 - 11:18 am UTC

create table merit (
emp_id number,
dob date,
rank_code number,
sen_date date
)
/
insert into merit values(1, to_date('03/04/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(1, to_date('03/04/76', 'dd/mm/rr'), 2, to_date('05/06/00', 'dd/mm/rr'));
insert into merit values(1, to_date('03/04/76', 'dd/mm/rr'), 3, to_date('05/06/04', 'dd/mm/rr'));
insert into merit values(1, to_date('03/04/76', 'dd/mm/rr'), 4, to_date('05/06/08', 'dd/mm/rr'));
insert into merit values(2, to_date('03/05/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(2, to_date('03/05/76', 'dd/mm/rr'), 2, to_date('05/06/00', 'dd/mm/rr'));
insert into merit values(2, to_date('03/05/76', 'dd/mm/rr'), 3, to_date('05/06/04', 'dd/mm/rr'));
insert into merit values(2, to_date('03/05/76', 'dd/mm/rr'), 4, to_date('05/06/08', 'dd/mm/rr'));
insert into merit values(3, to_date('03/06/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(3, to_date('03/06/76', 'dd/mm/rr'), 2, to_date('05/06/00', 'dd/mm/rr'));
insert into merit values(3, to_date('03/06/76', 'dd/mm/rr'), 3, to_date('05/06/04', 'dd/mm/rr'));
insert into merit values(3, to_date('03/06/76', 'dd/mm/rr'), 4, to_date('05/06/08', 'dd/mm/rr'));
insert into merit values(4, to_date('03/04/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(4, to_date('03/04/76', 'dd/mm/rr'), 2, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(5, to_date('03/04/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
commit
/

SQL> SELECT * FROM MERIT;

    EMP_ID DOB         RANK_CODE SEN_DATE
---------- ---------- ---------- ----------
         1 03/04/1976          1 05/06/1996
         1 03/04/1976          2 05/06/2000
         1 03/04/1976          3 05/06/2004
         1 03/04/1976          4 05/06/2008
         2 03/05/1976          1 05/06/1996
         2 03/05/1976          2 05/06/2000
         2 03/05/1976          3 05/06/2004
         2 03/05/1976          4 05/06/2008
         3 03/06/1976          1 05/06/1996
         3 03/06/1976          2 05/06/2000
         3 03/06/1976          3 05/06/2004
         3 03/06/1976          4 05/06/2008
         4 03/04/1976          1 05/06/1996
         4 03/04/1976          2 05/06/1996
         5 03/04/1976          1 05/06/1996

15 rows selected.

now the sort order for a given rank_code (say 4) is:

1. oldest sen_date first.
2. if there is a tie on sen_date, then the record with oldest sen_date in previous rank_code.
3. repeat step 2 until all rank_codes for a particular emp_id are exhausted.
4. if there is still a tie, then sort by DOB.

Tom Kyte
December 09, 2008 - 10:16 am UTC

what does this have to do with analytics?


#2 makes no sense. If there is a "tie", then you have two dates that are the same - what do you mean by referencing the "oldest of previous rank code" - you have two records already here.

Seems you want to

order by emp_id, sen_date, rank_code, dob;



I couldn't explain it in the first place

hash, December 11, 2008 - 6:48 am UTC

This is original data, 

SQL> select * from merit 
  2  order by emp_id, sen_date, rank_code, dob;

    EMP_ID DOB         RANK_CODE SEN_DATE         FLAG
---------- ---------- ---------- ---------- ----------
         1 03/04/1976          1 05/06/1996
         1 03/04/1976          2 05/06/2000
         1 03/04/1976          3 05/06/2004
         1 03/04/1976          4 05/06/2008
         2 03/05/1976          1 05/06/1996
         2 03/05/1976          2 05/06/2000
         2 03/05/1976          3 05/06/2004
         2 03/05/1976          4 05/06/2008
         3 03/06/1976          1 05/06/1996
         3 03/06/1976          2 05/06/2000
         3 03/06/1976          3 05/06/2004
         3 03/06/1976          4 05/06/2008
         4 03/04/1976          1 05/06/1996
         4 03/04/1976          2 05/06/1996
         5 03/04/1976          1 05/06/1996

I want to run query for a particular rank_code (i.e, 4)

  1  select * from merit
  2  where rank_code = 4
  3* order by emp_id, sen_date, rank_code, dob
SQL> /

    EMP_ID DOB         RANK_CODE SEN_DATE   
---------- ---------- ---------- ---------- 
         1 03/04/1976          4 05/06/2008
         2 03/05/1976          4 05/06/2008
         3 03/06/1976          4 05/06/2008


As you can see, all of the above 3 emp_ids have the same sen_date (seniority date).

  1  select * from merit
  2  where rank_code = 3
  3* order by emp_id, sen_date, rank_code, dob
SQL> /

    EMP_ID DOB         RANK_CODE SEN_DATE       
---------- ---------- ---------- ---------- 
         1 03/04/1976          3 05/06/2004
         2 03/05/1976          3 05/06/2004
         3 03/06/1976          3 05/06/2004

now again they have the same sen_date in the previous rank.

  1  select * from merit
  2  where rank_code = 2
  3* order by emp_id, sen_date, rank_code, dob
SQL> /

    EMP_ID DOB         RANK_CODE SEN_DATE       
---------- ---------- ---------- ---------- 
         1 03/04/1976          2 05/06/2000
         2 03/05/1976          2 05/06/2000
         3 03/06/1976          2 05/06/2000

now again in the previous rank, they have the same sen_date.

  1  select * from merit
  2  where rank_code = 1
  3* order by emp_id, sen_date, rank_code, dob
SQL> /

    EMP_ID DOB         RANK_CODE SEN_DATE       
---------- ---------- ---------- ---------- 
         1 03/04/1976          1 05/06/1996
         2 03/05/1976          1 05/06/1996
         3 03/06/1976          1 05/06/1996 

Up till here, all the rank_codes are exhausted. So the senior would be emp_id 1 because he was born first. If they have a tie in any of the previous ranks, the one with the earliest sen_date would be senior.

I think I have conveyed what I wanted to say


Tom Kyte
December 11, 2008 - 8:12 am UTC

depending on the nature of sen_date - I believe this might do it for you

ops$tkyte%ORA11GR1> select emp_id,
  2         min( case when rank_code = 4 then sen_date end ) sen_date,
  3             max( sen_date ) max_sen_date,
  4             max( dob )
  5    from merit
  6   where rank_code <= 4
  7   group by emp_id
  8  having max( rank_code ) = 4
  9  order by 2, 3, 4
 10  /

    EMP_ID SEN_DATE             MAX_SEN_DATE         MAX(DOB)
---------- -------------------- -------------------- --------------------
         1 05-jun-2008 00:00:00 05-jun-2008 00:00:00 03-apr-1976 00:00:00
         2 05-jun-2008 00:00:00 05-jun-2008 00:00:00 03-may-1976 00:00:00
         3 05-jun-2008 00:00:00 05-jun-2008 00:00:00 03-jun-1976 00:00:00




Thanks a lot

hash, December 11, 2008 - 9:27 am UTC

It works for me

thanks again
Tom Kyte
December 11, 2008 - 9:40 am UTC

make sure that you actually understand what it does - don't just test empirically - understand what I did there!

re: sorting by sen-date

Stew Ashton, December 18, 2008 - 10:05 am UTC


I think the test data lacks some boundary conditions. Let's say emp_id 2 got to rank 2 earlier, and emp_id 3 got to rank 3 earlier. In that case, shouldn't emp_id 3 come first and emp_id 2 come second?
update merit set sen_date = sen_date-1 where emp_id in(2,3) and emp_id = rank_code;
/
select emp_id,
       max(case when rank_code = 4 then sen_date end) sen_date_4,
       max(case when rank_code = 3 then sen_date end) sen_date_3,
       max(case when rank_code = 2 then sen_date end) sen_date_2,
       max(case when rank_code = 1 then sen_date end) sen_date_1,
       max(dob)
  from merit
where rank_code <= 4
group by emp_id
order by 2, 3, 4, 5, 6
/
EMP_ID SEN_DATE_4 SEN_DATE_3 SEN_DATE_2 SEN_DATE_1 MAX(DOB) 
------ ---------- ---------- ---------- ---------- ---------
3      05/06/08   04/06/04   05/06/00   05/06/96   03/06/76 
2      05/06/08   05/06/04   04/06/00   05/06/96   03/05/76 
1      05/06/08   05/06/04   05/06/00   05/06/96   03/04/76 
4                            05/06/96   05/06/96   03/04/76 
5                                       05/06/96   03/04/76

Reader, December 29, 2008 - 5:49 pm UTC

Tom,
create table map
(old_sym varchar2(100)
,new_sym varchar2(100)
,Cntry_CODE varchar2(100)
,start_dt date);

insert into map
values
('ABC','ABCE','US',to_date('11/04/2008','mm/dd/yyyy'));

insert into map
values
('ABCE','ABC','US',to_date('11/18/2008','mm/dd/yyyy'));

commit;

OLD_SYM  NEW_SYM  CNTRY_CODE  START_DT
ABC      ABCE     US          11/04/2008 - 1st record
ABCE     ABC      US          11/18/2008 - 2nd record

We get the 1st record on 11/04/2008 with old_sym = ABC and new_sym = ABCE.
This got reversed in the 2nd record, which we got on 11/18/2008.

If the difference between the start_dt on the 1st and 2nd records is one month,
and the new_sym of the 2nd record = the old_sym of the 1st record, I need to delete the 1st record.

If the difference between the start_dt on the 2nd and 1st records is one month,
and the old_sym of the 2nd record = the new_sym of the 1st record, I need to delete the 2nd record.

Can this be done using a merge statement?
Tom Kyte
January 05, 2009 - 8:26 am UTC

insufficient data here. How do you know these two records even "go together"

what does merge have to do with this? Seems you have a table with two rows in it?

what happens if there is a 3rd record? A fourth? And so on?

details are very much lacking here...

Function to list all the combination possibilities

Junior, January 19, 2009 - 5:26 pm UTC

Tom:
How do I write a function that takes a list of data and generates all possible combinations as the final result? E.g., pass a,b,c,d,e,f,g,h, expecting to list any 5 of the data I passed. The result is like:
a,b,c,d,e,f
a,b,c,d,e,g
a,b,c,d,e,h
a,b,c,d,f,g
...
Thanks.
Tom Kyte
January 19, 2009 - 9:53 pm UTC

is a,a,a,a,a a valid response?

is 5 the only number? or is that a parameter as well.

will there always be 8 things in the input?

Function to list all the combination possibilities

Junior, January 19, 2009 - 10:43 pm UTC

Tom:
Sorry, I did not make it clear. The input parameter data will be between 7 and 10 values (and they can be characters or numbers). The output will be fixed at 6 values only. E.g.
input parameter data: a,b,c,d,e,1,2 (up to 10 values)
expecting output result:
a,b,c,d,e,1
a,b,c,d,e,2
a,b,c,d,1,2
a,b,c,e,1,2
...
Thank you very much.
Tom Kyte
January 20, 2009 - 6:50 am UTC

you still haven't specified this

Ok, a string of inputs - 7 to 10 of them.

Output - six columns (not five)

Now, is a,a,a,a,a,a valid?

Are you expecting to see b,c,d,e,1,a as well as a,b,c,d,e,1

Function to list all the combination possibilities

Junior, January 20, 2009 - 7:28 am UTC

Tom:
Sorry. :o)
A string of inputs - 7 to 10 of them.
Output - six columns (not five)
The output values are unique, and none of them can be repeated, so a,a,a,a,a,a is not allowed.
b,c,d,e,1,a and a,b,c,d,e,1 are the same value and will only display once.
Thanks.

Tom Kyte
January 20, 2009 - 7:47 am UTC

ops$tkyte%ORA11GR1> variable txt varchar2(20)
ops$tkyte%ORA11GR1> exec :txt := 'a,b,c,d,e,f,g,1'

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select
  5    trim( substr (txt,
  6          instr (txt, ',', 1, level  ) + 1,
  7          instr (txt, ',', 1, level+1)
  8             - instr (txt, ',', 1, level) -1 ) )
  9      as token
 10     from (select ','||:txt||',' txt
 11             from dual)
 12   connect by level <=
 13      length(:txt)-length(replace(:txt,',',''))+1
 14   )
 15   select * from data;

TOK
---
a
b
c
d
e
f
g
1

8 rows selected.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> column token format a3
ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select distinct
  5    trim( substr (txt,
  6          instr (txt, ',', 1, level  ) + 1,
  7          instr (txt, ',', 1, level+1)
  8             - instr (txt, ',', 1, level) -1 ) )
  9      as token
 10     from (select ','||:txt||',' txt
 11             from dual)
 12   connect by level <=
 13      length(:txt)-length(replace(:txt,',',''))+1
 14   )
 15  select * from data d1, data d2, data d3, data d4, data d5, data d6
 16  where d1.token < d2.token
 17    and d2.token < d3.token
 18    and d3.token < d4.token
 19    and d4.token < d5.token
 20    and d5.token < d6.token;

TOK TOK TOK TOK TOK TOK
--- --- --- --- --- ---
1   a   b   c   d   e
1   a   b   d   e   g
1   a   c   d   e   g
1   b   c   d   e   g
a   b   c   d   e   g
1   a   b   c   e   g
1   a   b   c   d   g
1   a   d   e   f   g
1   b   d   e   f   g
a   b   d   e   f   g
1   c   d   e   f   g
a   c   d   e   f   g
b   c   d   e   f   g
1   a   b   e   f   g
1   a   c   e   f   g
1   b   c   e   f   g
a   b   c   e   f   g
1   a   b   d   f   g
1   a   c   d   f   g
1   b   c   d   f   g
a   b   c   d   f   g
1   a   b   c   f   g
1   a   b   d   e   f
1   a   c   d   e   f
1   b   c   d   e   f
a   b   c   d   e   f
1   a   b   c   e   f
1   a   b   c   d   f

28 rows selected.

Function to list all the combination possibilities

Junior, January 20, 2009 - 8:12 am UTC

Tom:
Thank you so much. FYI, it works perfectly on Oracle 10g in a UNIX environment. It did not work on Oracle 9.2.0.1.0 on Windows. The error message is:

  1  with data
  2   as
  3    (
  4    select
  5      trim( substr (txt,
  6            instr (txt, ',', 1, level  ) + 1,
  7            instr (txt, ',', 1, level+1)
  8               - instr (txt, ',', 1, level) -1 ) )
  9        as token
 10       from (select ','||:txt||',' txt
 11               from dual)
 12     connect by level <=
 13        length(:txt)-length(replace(:txt,',',''))+1
 14     )
 15*    select * from data;
SQL> /
   select * from data;
                     *
ERROR at line 15:
ORA-00911: invalid character

Tom Kyte
January 20, 2009 - 8:25 am UTC

it works fine in 9iR2 - what version of sqlplus were you using?

and make sure your script is "windows compliant", eg: it has the right end of line marker for that OS.

Function to list all the combination possibilities

Junior, January 20, 2009 - 8:27 am UTC

Tom:
Correction.
It works on both Oracle 9 and 10 in any environment.

:)

Neven, January 20, 2009 - 9:30 am UTC

-- another solution to the problem is the following:


drop type MYTABLEVARCHAR;
drop type MYOBJECTVARCHAR;
drop function parse_it_now;

CREATE OR REPLACE
TYPE MYOBJECTVARCHAR
AS OBJECT
(
A VARCHAR2(32000),
idx NUMBER
)
/


CREATE OR REPLACE
TYPE MYTABLEVARCHAR
AS TABLE OF myObjectVarchar
/


CREATE OR REPLACE FUNCTION parse_it_now (input_str CLOB, separator VARCHAR2)
  RETURN mytablevarchar PIPELINED
IS
  v PLS_INTEGER := 0;
BEGIN
  FOR i IN
    (SELECT TRIM (CASE
                    WHEN LEVEL = 1
                    THEN SUBSTR (input_str,
                                 1,
                                 INSTR (input_str, separator, 1, 1) - 1)
                    WHEN INSTR (input_str, separator, 1, LEVEL) <> 0
                    THEN SUBSTR (input_str,
                                 INSTR (input_str, separator, 1, LEVEL - 1) + 1,
                                 INSTR (input_str, separator, 1, LEVEL)
                                   - INSTR (input_str, separator, 1, LEVEL - 1)
                                   - 1)
                  END) x
        FROM DUAL
       WHERE INSTR (input_str, separator, 1, 1) <> 0
     CONNECT BY LEVEL < 9999999999
         AND INSTR (input_str, separator, 1, LEVEL) <> 0
      UNION ALL
      SELECT TRIM (SUBSTR (input_str, INSTR (input_str, separator, -1, 1) + 1))
        FROM DUAL)
  LOOP
    v := v + 1;
    PIPE ROW (myobjectvarchar (i.x, v));
  END LOOP;
  RETURN;
END;
/

--and the query is :

SELECT x.a col1,y.a col2,z.a col3,
A.a col4,B.a col5
FROM TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) X,
TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) Y,
TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) Z,
TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) A,
TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) B

-- If you have performance problems with the function, you could change the clause "CONNECT BY LEVEL < 9999999999" to a lower number

--HAVE FUN :)


Reader, February 09, 2009 - 10:33 am UTC

create table main_table
(dt_key date,
sym varchar2(20),
vol number
);

create table map_main
(old_sym varchar2(20),
new_sym varchar2(20),
effective_dt date
);

--main_table inserts
insert into main_table
values
(to_date('02/06/2009','mm/dd/yyyy'), 'CCR',1000);

insert into main_table
values
(to_date('02/05/2009','mm/dd/yyyy'), 'CJ',3000);

insert into main_table
values
(to_date('02/04/2009','mm/dd/yyyy'), 'CJ',2000);


insert into main_table
values
(to_date('02/06/2009','mm/dd/yyyy'), 'PHPL',2000);

insert into main_table
values
(to_date('02/05/2009','mm/dd/yyyy'), 'PHP',1000);

insert into main_table
values
(to_date('02/04/2009','mm/dd/yyyy'), 'PHP',4000);

insert into main_table
values
(to_date('02/03/2009','mm/dd/yyyy'), 'PHP',5000);

--map_main table inserts
insert into map_main
values
('CJ','CCR',to_date('02/06/2009','mm/dd/yyyy'));

insert into map_main
values
('PHP','PHPL',to_date('02/06/2009','mm/dd/yyyy'));

commit;
main_table data
DT_KEY    SYM                         VOL
--------- -------------------- ----------
06-FEB-09 CCR                        1000
05-FEB-09 CJ                         3000
04-FEB-09 CJ                         2000
06-FEB-09 PHPL                       2000
05-FEB-09 PHP                        1000
04-FEB-09 PHP                        4000
03-FEB-09 PHP                        5000

main_map table data
OLD_SYM              NEW_SYM              EFFECTIVE
-------------------- -------------------- ---------
CJ                   CCR                  06-FEB-09
PHP                  PHPL                 06-FEB-09

main_table has dt_key, sym and vol columns
main_map contains old_sym and new_sym and effective_dt

I need to get last 20 day average volume for each sym in main_table. I do not store data for holiday and weekends. The thing is, sym may be changed and the effective date will be in main_map table.
For example: CCR is changed to CJ on 2009-02-06 and CJ starts appearing in main_table starting 2009-02-06. If I calculate the 20 day average volume using
main_table, I have to consider sym=CJ for 2009-02-06 and for the data loaded previously, I need to take the SYM=CCR. The last 20D average for CCR=(1000 (loaded 02/06)+ 3000 (loaded 02/05) + 2000 (loaded 02/04))/20

The last 20D average for PHPL = (2000 (loaded 02/06) + 1000 (loaded 02/05) + 4000 (loaded 02/04) + 5000 (loaded 02/03))/20

Can you advise me how to do this?
Tom Kyte
February 09, 2009 - 7:27 pm UTC

... For example: CCR is changed to CJ on 2009-02-06 ...

main_map table data
OLD_SYM              NEW_SYM              EFFECTIVE
-------------------- -------------------- ---------
CJ                   CCR                  06-FEB-09



looks like CJ is changed to CCR???


could CJ change to CCR and CCR change to XYZ - do you need to account for that?

Reader, February 09, 2009 - 8:38 pm UTC

Apologies for the mistake in sending the information. CJ to CCR and PHP to PHPL.
Tom Kyte
February 10, 2009 - 7:01 am UTC

answer the other much more important question

Reader, February 10, 2009 - 9:29 am UTC

CJ changed to CCR and PHP changed to PHPL on 02/06/2009. CCR may change to something else in the future - as you said, to XYZ - and PHPL may change to ABC in the future.
I have to calculate the average volume based on CJ, CCR, XYZ.

insert into main_table
values
(to_date('02/09/2009','mm/dd/yyyy'), 'CCR',11111);

insert into main_table
values
(to_date('02/10/2009','mm/dd/yyyy'), 'XYZ',22222);


insert into map_main
values
('CCR','XYZ',to_date('02/10/2009','mm/dd/yyyy'));


For example, on 02/06/2009, I need to calculate as below:
CCR=(1000 (loaded 02/06)+ 3000
(loaded 02/05) + 2000 (loaded 02/04))/20

If CCR is changed to XYZ on 02/10/2009, I need to do the calculation as below on 02/10/2009:
CCR=( 2000 (CJ loaded 02/04) + 3000 (CJ loaded 02/05) + 1000 (CCR loaded 02/06) + 11111 (CCR loaded 02/09) + 22222 (XYZ loaded on 02/10) ) /20

- last 20 days average volume and I do not have data for holidays and weekends.


Tom Kyte
February 11, 2009 - 9:17 am UTC

can you pseudo-code out the logic a bit better - I'm not sure if you are changing ccr to cj or remapping cj back to ccr

Maybe you can start with this result set and tell us what the value of the character column should be for each row:

(and I'm always curious why people store data in a way that makes it almost impossible if not impossible to answer their questions?)

Reader, February 10, 2009 - 9:30 am UTC

Apologies for the typo in my previous question.
Here is what I need to calculate -

XYZ=( 2000 (CJ loaded 02/04) + 3000 (CJ loaded 02/05) + 1000 (CCR loaded 02/06) + 11111 (CCR
loaded 02/09) + 22222 (XYZ loaded on 02/10) ) /20

Reader, February 11, 2009 - 9:44 am UTC

Tom,
Thank you for your response. I am trying to handle ticker changes in the market data. I need to calculate the 20 day average volume for each ticker each day.

CJ is changed to CCR on Feb 06;
here CJ is the old ticker and CCR is the new ticker.

CJ and CCR are basically the same ticker for my calculation.
I need to consider the volumes of CJ prior to Feb 06 and the volume of CCR on Feb 06.


If CJ is changed to CCR on 02/06/2009, I need to do the calculation as below on 02/06/2009.
It should be displayed as CCR (the new ticker):
CCR=( 2000 (CJ loaded 02/04) + 3000 (CJ loaded 02/05) + 1000 (CCR loaded 02/06)) /20

Ticker date average20d
CCR 02/06/2009 300

CCR will be changed to XYZ on Feb 10. In that case CJ, CCR and XYZ are the same for my calculation.

If CCR is changed to XYZ on 02/10/2009, I need to do the calculation as below on 02/10/2009:
CCR=( 2000 (CJ loaded 02/04) + 3000 (CJ loaded 02/05) + 1000 (CCR loaded 02/06) + 11111 (CCR
loaded 02/09) + 22222 (XYZ loaded on 02/10) ) /20


Ticker date average20d
XYZ 02/10/2009 1966.65

main_table data
DT_KEY    SYM                         VOL
--------- -------------------- ----------
06-FEB-09 CCR                        1000
05-FEB-09 CJ                         3000
04-FEB-09 CJ                         2000
06-FEB-09 PHPL                       2000
05-FEB-09 PHP                        1000
04-FEB-09 PHP                        4000
03-FEB-09 PHP                        5000
09-FEB-09 CCR                       11111
10-FEB-09 XYZ                       22222

main_map table data
OLD_SYM              NEW_SYM              EFFECTIVE
-------------------- -------------------- ---------
CJ                   CCR                  06-FEB-09
PHP                  PHPL                 06-FEB-09
CCR                  XYZ                  10-FEB-09

I am not sure if I am clear. Please let me know. Thanks a lot for your response.
Tom Kyte
February 11, 2009 - 12:54 pm UTC

this should get you going


ops$tkyte%ORA11GR1> select rpad('*',2*level,'*') || old_sym osym,
  2         old_sym,
  3         new_sym,
  4             connect_by_root old_sym the_orig_sym,
  5             prior effective_dt ped,
  6             effective_dt
  7    from map_main
  8  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
  9  /

OSYM                 OLD_S NEW_S THE_O PED         EFFECTIVE_D
-------------------- ----- ----- ----- ----------- -----------
**CCR                CCR   XYZ   CCR               10-feb-2009
**CJ                 CJ    CCR   CJ                06-feb-2009
****CCR              CCR   XYZ   CJ    06-feb-2009 10-feb-2009
**PHP                PHP   PHPL  PHP               06-feb-2009

<b>fill out the hierarchy - getting the root sym</b>

ops$tkyte%ORA11GR1> select the_orig_sym, new_sym, effective_dt sdate,
  2         nvl(lead(effective_dt) over (partition by the_orig_sym order by
  3             effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
  4    from (
  5  select connect_by_root old_sym the_orig_sym,
  6         new_sym,
  7             effective_dt
  8    from map_main
  9  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
 10          )
 11  order by 1, 3
 12  /

THE_O NEW_S SDATE       EDATE
----- ----- ----------- -----------
CCR   XYZ   10-feb-2009 01-jan-3000
CJ    CCR   06-feb-2009 09-feb-2009
CJ    XYZ   10-feb-2009 01-jan-3000
PHP   PHPL  06-feb-2009 01-jan-3000

<b>make it now a range... start date to end date</b>

ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select the_orig_sym, new_sym, effective_dt sdate,
  5         nvl(lead(effective_dt) over (partition by the_orig_sym order by
  6             effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
  7    from (
  8  select connect_by_root old_sym the_orig_sym,
  9         new_sym,
 10             effective_dt
 11    from map_main
 12  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
 13          )
 14  )
 15  select *
 16    from main_table mt left outer join data d
 17      on (mt.dt_key between d.sdate and d.edate
 18              and mt.sym = d.new_sym)
 19    order by dt_key;

DT_KEY      SYM                         VOL THE_O NEW_S SDATE       EDATE
----------- -------------------- ---------- ----- ----- ----------- -----------
03-feb-2009 PHP                        5000
04-feb-2009 CJ                         2000
04-feb-2009 PHP                        4000
05-feb-2009 CJ                         3000
05-feb-2009 PHP                        1000
06-feb-2009 CCR                        1000 CJ    CCR   06-feb-2009 09-feb-2009
06-feb-2009 PHPL                       2000 PHP   PHPL  06-feb-2009 01-jan-3000
09-feb-2009 CCR                       11111 CJ    CCR   06-feb-2009 09-feb-2009
10-feb-2009 XYZ                       22222 CCR   XYZ   10-feb-2009 01-jan-3000
10-feb-2009 XYZ                       22222 CJ    XYZ   10-feb-2009 01-jan-3000

10 rows selected.



Reader, February 11, 2009 - 5:24 pm UTC

Tom,
Thanks a lot. Trying to understand your queries. I will have questions for you on this. I have not used this kind of SQL. Again, Thanks for all your answers.

Analytics with connect by

Dan, February 11, 2009 - 5:35 pm UTC

Hi Tom,

I know I saw this issue before in the thread around 2005 (and I think you said you filed a bug on it) but I'm running into it now, and I was wondering if you had any more insight.

select * from v$version
where banner like 'Oracle%';

BANNER                                                           
---------------------------------------------------------------- 
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi 

1 rows selected


Here is the base query - I am just trying to get the parent_dept_key name, so I was building my way to it.

with t as (
select 1 dept_key, 'Test' name, 3 parent_dept_key from dual UNION ALL
select 2 dept_key, 'Test1' name, 1 parent_dept_key from dual UNION ALL
select 3 dept_key, 'Test2' name, 3 parent_dept_key from dual )
select dept_key, 
       name, 
       parent_dept_key, 
       connect_by_root(name) parent_name,
       level lvl
from t
connect by nocycle
prior dept_key = parent_dept_key
  and level <= 2
order by dept_key, lvl;

DEPT_KEY               NAME  PARENT_DEPT_KEY        PARENT_NAME LVL                    
---------------------- ----- ---------------------- ----------- ---------------------- 
1                      Test  3                      Test        1                      
1                      Test  3                      Test2       2                      
2                      Test1 1                      Test1       1                      
2                      Test1 1                      Test        2                      
3                      Test2 3                      Test2       1                      

5 rows selected


Now in order to get only the "correct" rows, I figured I would just use analytic max() to know the max level per dept_key (and then wrap this all in an inline view and only pick the rows I wanted):

with t as (
select 1 dept_key, 'Test' name, 3 parent_dept_key from dual UNION ALL
select 2 dept_key, 'Test1' name, 1 parent_dept_key from dual UNION ALL
select 3 dept_key, 'Test2' name, 3 parent_dept_key from dual )
select dept_key, 
       name, 
       parent_dept_key, 
       connect_by_root(name) parent_name,
       level lvl,
       max(level) over (partition by dept_key) m
from t
connect by nocycle
prior dept_key = parent_dept_key
  and level <= 2
order by dept_key, lvl;

DEPT_KEY               NAME  PARENT_DEPT_KEY        PARENT_NAME LVL M   
---------------------- ----- ---------------------- ----------- --- --- 
1                      Test  3                      Test2       1   2   
1                      Test  3                      Test2       2   2   
2                      Test1 1                      Test2       1   2   
2                      Test1 1                      Test2       2   2   
3                      Test2 3                      Test2       1   1   

5 rows selected


But here's the problem - why did the Parent_name change?!

Now, I solved this by putting my original query into an inline view and then running the analytics off of that. But it seems to me that adding a column shouldn't change the result of other columns - at the very least, it's very counter-intuitive.

Any ideas on what's going on?
Tom Kyte
February 12, 2009 - 11:58 am UTC

confirmed, bug, will file after seeing if it isn't already filed

thanks!
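For reference, the workaround Dan describes - computing the CONNECT BY in an inline view and applying the analytic function outside it - would look something like this sketch:

with t as (
select 1 dept_key, 'Test' name, 3 parent_dept_key from dual UNION ALL
select 2 dept_key, 'Test1' name, 1 parent_dept_key from dual UNION ALL
select 3 dept_key, 'Test2' name, 3 parent_dept_key from dual )
select v.*,
       max(lvl) over (partition by dept_key) m
  from (select dept_key,
               name,
               parent_dept_key,
               connect_by_root(name) parent_name,
               level lvl
          from t
        connect by nocycle
        prior dept_key = parent_dept_key
          and level <= 2) v
order by dept_key, lvl;

Since the hierarchy (including connect_by_root) is fully materialized by the inline view before the analytic runs, the parent_name values are unaffected by the added max() column.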

Reader, February 12, 2009 - 12:47 pm UTC

Tom,
Using the query that you gave, can you please let me know how I should calculate the average for the last 20 days?

ops$tkyte%ORA11GR1> with data
2 as
3 (
4 select the_orig_sym, new_sym, effective_dt sdate,
5 nvl(lead(effective_dt) over (partition by the_orig_sym order by
6 effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
7 from (
8 select connect_by_root old_sym the_orig_sym,
9 new_sym,
10 effective_dt
11 from map_main
12 connect by prior new_sym = old_sym and prior effective_dt < effective_dt
13 )
14 )
15 select *
16 from main_table mt left outer join data d
17 on (mt.dt_key between d.sdate and d.edate
18 and mt.sym = d.new_sym)
19 order by dt_key;



Tom Kyte
February 12, 2009 - 4:30 pm UTC

I was hoping you could get there....

first, identify the data you want:


ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select the_orig_sym, new_sym, effective_dt sdate,
  5         nvl(lead(effective_dt) over (partition by the_orig_sym order by
  6             effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
  7    from (
  8  select connect_by_root old_sym the_orig_sym,
  9         new_sym,
 10             effective_dt
 11    from map_main
 12  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
 13          )
 14  )
 15  select nvl(the_orig_sym,sym) the_sym,
 16         dt_key,
 17         vol
 18    from main_table mt left outer join data d
 19      on (mt.dt_key between d.sdate and d.edate
 20              and mt.sym = d.new_sym)
 21    where (sdate is null and edate is null)
 22       or (dt_key between sdate and edate)
 23    order by the_sym, dt_key;

THE_SYM              DT_KEY           VOL
-------------------- --------- ----------
CCR                  10-FEB-09      22222
CJ                   04-FEB-09       2000
CJ                   05-FEB-09       3000
CJ                   06-FEB-09       1000
CJ                   09-FEB-09      11111
CJ                   10-FEB-09      22222
PHP                  03-FEB-09       5000
PHP                  04-FEB-09       4000
PHP                  05-FEB-09       1000
PHP                  06-FEB-09       2000

10 rows selected.


and then aggregate:

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select the_orig_sym, new_sym, effective_dt sdate,
  5         nvl(lead(effective_dt) over (partition by the_orig_sym order by
  6             effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
  7    from (
  8  select connect_by_root old_sym the_orig_sym,
  9         new_sym,
 10             effective_dt
 11    from map_main
 12  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
 13          )
 14  )
 15  select nvl(the_orig_sym,sym) the_sym, avg(vol)
 16    from main_table mt left outer join data d
 17      on (mt.dt_key between d.sdate and d.edate
 18              and mt.sym = d.new_sym)
 19    where ((sdate is null and edate is null)
 20       or (dt_key between sdate and edate))
 21          and dt_key >= trunc(sysdate)-20
 22          and nvl(the_orig_sym,sym) in ('CJ', 'PHP')
 23    group by nvl(the_orig_sym,sym)
 24    order by the_sym;

THE_SYM                AVG(VOL)
-------------------- ----------
CJ                       7866.6
PHP                        3000


I did average, you do whatever calculation you need - you have all of the data, you should be able to get there from here?

Reader, February 13, 2009 - 7:39 am UTC

Tom,
Thank You for your response.

on 02/06, I need to display

Symbol adv_20d
CCR 300

on 02/10, I need to display
symbol adv_20d
XYZ 1966.65

I understand about getting the data for the date range. Here main_table is basically the driving table, which has symbols each day, and I have to calculate the 20 day average for each symbol in that table and handle any symbol changes in the map table.


Can I use this WITH data query in a PL/SQL cursor to get only the symbol changes and their date ranges, and use that to join to the main_table to get the data I want?


Tom Kyte
February 16, 2009 - 10:51 am UTC

I gave you access to *every single column*, you cannot derive from the data what you need?

break the query apart - bit by bit, understand it, everything - everything is there - old symbol, new symbol, everything.

gaps

Amm Kh, February 17, 2009 - 3:29 pm UTC

Dear Tom,
can i know the gaps from a range of values ? I have a table like this :

select * from ss;

col
====
1000
1001
1002
1005
1006
1010
1011
1012
1013
1021
1022

I need the output to be like:

from To
==== =====
1000 - 1002
1005 - 1006
1010 - 1013
1021 - 1022

But without having an inner view which gets all the col values, as I have about 10 million records in the table, which would make it very slow.

thanks in advance.
Tom Kyte
February 17, 2009 - 4:48 pm UTC

Available

A reader, February 18, 2009 - 3:14 am UTC

Dear Tom,

Thanks for your quick response. The link provided was very useful. But how can I know the "Available Ranges" and not the "Missing Ranges"? It is a bit challenging for me.

Thanks in advance
Tom Kyte
February 18, 2009 - 3:10 pm UTC

you are a sql coder.... come on... did you give it a try (you know how you learn, you take a problem and you try to solve it - rather than have someone solve it for you.....)


You see the lag/lead trick, you know you can get the prior and the next row married up with every row in the table after sorting....

So, if you know that the row "in front" (lag) of you has an X value that is not your X value minus 1 - there was a gap.... you are the START of a range....

You know if your X value is not equal to the next (lead) X value minus one - you are the END of a range.....


so, something like:
select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
       case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
       case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  from t


this will result in rows with

a) start/end both null - uninteresting, we'll discard them
b) start/end both NOT NULL - interesting, they represent a range unto themselves
c) start is not null, end is null - interesting - we need to find the end
d) start is null, end is not null - and here it is....

ops$tkyte%ORA11GR1> select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
  2         case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
  3         case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  4    from t
  5  /

         X     LAST_X LEAD(X)OVER(ORDERBYX) START_RANGE    END_RANGE
---------- ---------- --------------------- -------------- ------------
         3                                4 <<<<=== start!
         4          3                     5
         5          4                    10                <<<<=== end!
        10          5                    11 <<<<=== start!
        11         10                    13                <<<<=== end!
        13         11                    20 <<<<=== start! <<<<=== end!
        20         13                    25 <<<<=== start! <<<<=== end!
        25         20                    26 <<<<=== start!
        26         25                                      <<<<=== end!

9 rows selected.


that is an example.

so, if we take that query and keep just:

where start_range is not null or end_range is not null


ops$tkyte%ORA11GR1> select *
  2    from (
  3  select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
  4         case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
  5         case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  6    from t
  7         )
  8   where start_range is not null or end_range is not null
  9   order by x;

         X     LAST_X LEAD(X)OVER(ORDERBYX) START_RANGE    END_RANGE
---------- ---------- --------------------- -------------- ------------
         3                                4 <<<<=== start!
         5          4                    10                <<<<=== end!
        10          5                    11 <<<<=== start!
        11         10                    13                <<<<=== end!
        13         11                    20 <<<<=== start! <<<<=== end!
        20         13                    25 <<<<=== start! <<<<=== end!
        25         20                    26 <<<<=== start!
        26         25                                      <<<<=== end!

8 rows selected.



we'll get everything we need. We just need to marry up the start/end rows, collapse them out. lead will do that nicely!



ops$tkyte%ORA11GR1> select x start_x, case when end_range is null then lead(x) over (order by x) else x end end_x, start_range
  2    from (
  3  select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
  4         case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
  5         case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  6    from t
  7         )
  8   where start_range is not null or end_range is not null
  9   order by x;

   START_X      END_X START_RANGE
---------- ---------- --------------
         3          5 <<<<=== start!
         5          5
        10         11 <<<<=== start!
        11         11
        13         13 <<<<=== start!
        20         20 <<<<=== start!
        25         26 <<<<=== start!
        26         26

8 rows selected.



but we only care about rows where start_range is not null so....

ops$tkyte%ORA11GR1> select start_x, end_x
  2    from (
  3  select x start_x, case when end_range is null then lead(x) over (order by x) else x end end_x, start_range
  4    from (
  5  select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
  6         case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
  7         case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  8    from t
  9         )
 10   where start_range is not null or end_range is not null
 11         )
 12   where start_range is not null
 13   order by start_x;

   START_X      END_X
---------- ----------
         3          5
        10         11
        13         13
        20         20
        25         26


and if you go back to my first query that shows all of the X's in order, you can see that I have the ranges now....

available ranges

Ammar Khwaira, February 18, 2009 - 4:35 am UTC

Dear Tom,

regarding my last review above, I could write the query to get the available ranges; it is like:


select s "from serial", nvl(lead(prev_serial) over (order by s) , (select max(s) from amm) ) "to_serial"

from (

select s,prev_serial

from ( select s, nvl(lag(s) over ( order by s) , -999) prev_serial
from (select s from amm
)

)
where s- 1 <> prev_serial
)
;

I hope it is a good one and works well.

Thanks for your support.
Ammar Khwaira
Tom Kyte
February 18, 2009 - 3:27 pm UTC

excellent :)

see above for another way (i posted before I read this...)




Reader, April 28, 2009 - 4:02 pm UTC

create table tst
(v_id number
,v_val varchar2(20)
);

insert into tst values (1,1000);
insert into tst values (2,2000);
insert into tst values (3,3000);
insert into tst values (4,4000);
insert into tst values (5,5000);
insert into tst values (6,6000);
insert into tst values (7,1000);
insert into tst values (8,2000);
insert into tst values (9,3000);
insert into tst values (10,4000);
insert into tst values (11,5000);
commit;

I want to get the sum of 10 rows starting from the 5th record for each v_id (ordered by v_id in ascending order). Can you advise if this can be done using analytic functions?
Tom Kyte
April 28, 2009 - 4:17 pm UTC

I want to get the sum of 10 rows starting from the 5th record for each v_id (ordered by v_id in
ascending order).



I didn't follow that - please explain in more detail.

Reader, April 28, 2009 - 6:09 pm UTC

For v_id=1, I need to calculate the sum of v_val starting from the v_id=5 and ending at v_id=10
For v_id=2, I need to calculate the sum of v_val starting from v_id=7 and ending at v_id=10
Basically, I need to skip the first 4 records for each v_id and take the sum of v_val from 5th record to 10 records from the beginning.

select v_id,v_val,
sum(v_val) over (order by v_id rows between 5 following and 10 following) sm
from tst

Not sure if I am clear, Please let me know.
Tom Kyte
April 29, 2009 - 8:37 am UTC

v_id=1 skip to v_id=5, skip 2,3,4 (three)    - add up 5,6,7,8,9,10 (six)
v_id=2 skip to v_id=7, skip 3,4,5,6 (four)   - add up 7,8,9,10 (four)


interesting, if you ask me that means when we get to v_id=3, we should skip 4,5,6,7,8 (I see a series here)

what pattern do you see???

come on - please....


(and you seem to already KNOW how to do this?? your sum() is picking off some rows and summing them, what is missing that makes it not possible for you to finish?)

ftd, mtd, ytd

w, May 19, 2009 - 4:09 am UTC

Hi Tom,

Currently in our DWH, we are using a materialized view to store the amount at an aggregate level. When we need to do ftd (for the day), mtd (month to date), ytd (year to date) comparisons, we end up writing three different queries for the different periods. We are planning to store all three figures precomputed in the MV and have achieved that using partition by. One issue we are currently facing is that the MV has rolled-up data on various dimensions, as below:

CREATE TABLE PROD.TEST_ANALY
(
OFF NUMBER(10),
PROD NUMBER(10),
AGENT NUMBER(10),
SUBAGENT VARCHAR2(50 BYTE),
INDI NUMBER(3),
DATE1 DATE,
MONTH1 VARCHAR2(20 BYTE),
YEAR1 VARCHAR2(20 BYTE),
AMOUNT1 NUMBER
)


SET DEFINE OFF;
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1901, 1002, 113644, '1012-122', NULL,
TO_DATE('04/28/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 6426);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1301, 1802, 196829, NULL, NULL,
TO_DATE('04/06/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 411);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1801, 1802, 2081, NULL, NULL,
TO_DATE('05/04/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'MAY.2009', '2009', 617);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(2401, 1803, 293347, NULL, NULL,
TO_DATE('04/08/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 0);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(2403, 1818, 1799, NULL, NULL,
TO_DATE('04/29/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 7589);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1401, 1818, 197, '0', NULL,
TO_DATE('04/09/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 3006);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1202, 1801, 49587, '90014594', NULL,
TO_DATE('04/24/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 3369);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1901, 8401, 1849, NULL, NULL,
TO_DATE('04/28/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 0);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1901, 1802, 1849, NULL, NULL,
TO_DATE('04/15/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 501);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1901, 9901, 233575, NULL, NULL,
TO_DATE('04/21/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 1000);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1101, 8401, 304173, NULL, NULL,
TO_DATE('05/13/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'MAY.2009', '2009', 7285);
COMMIT;

We are using the below query to get the mtd & ytd data (the column is OFF, per the DDL above):

select off, prod, agent, subagent, indi, date1, month1, year1, amount1,
sum(amount1) over (partition by off, agent, subagent, prod, indi, month1 order by date1) mtd,
sum(amount1) over (partition by off, agent, subagent, prod, indi, year1 order by date1) ytd
from test_analy
group by off, prod, agent, subagent, indi, date1, month1, year1, amount1

The issue is that when, on this aggregated data, we want to see only office-wise mtd and ytd for a particular date using

select off, sum(mtd), sum(ytd) from (
select off, prod, agent, subagent, indi, date1, month1, year1, amount1,
sum(amount1) over (partition by off, agent, subagent, prod, indi, month1 order by date1) mtd,
sum(amount1) over (partition by off, agent, subagent, prod, indi, year1 order by date1) ytd
from test_analy
where date1 = to_date('21-apr-2009','dd-mon-yyyy')
group by off, prod, agent, subagent, indi, date1, month1, year1, amount1)
group by off

these figures do not match the actual month and year figures. Is there some flaw in the logic?

version 10.2.0.3
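For what it's worth, a sketch of getting the office-wise mtd directly, without computing the windowed sums on a single-date slice first (assuming the OFF column from the DDL above):

select off,
       sum(amount1) mtd
  from test_analy
 where month1 = 'APR.2009'
   and date1 <= to_date('21-apr-2009','dd-mon-yyyy')
 group by off;

Filtering to one date before the analytic runs means the window can only ever see that one date's rows, which is why the figures do not match.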

mtd

w, May 19, 2009 - 7:19 am UTC

Figured out that not all of the 6-dimension combinations have a value on the queried date, hence it is giving a smaller amount.

problem:

We store the amount on 6 dimensions and by date, but not every dimension combination will have a transaction on every date. One option is to fill in the dates: for every unique dimension combination, store every date, with an amount of 0 where there is none. With a large data set this is an issue, since the combinations are huge. Is there a better way to accomplish this?

Hope this is not confusing.
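One option worth considering - a sketch only, not tested at scale: a 10g partitioned outer join densifies each dimension combination against the set of dates at query time, producing a zero row where nothing happened, without you having to maintain the filler rows yourself:

select d.date1, t.off, t.prod, t.agent, t.subagent, t.indi,
       nvl(t.amount1, 0) amount1
  from test_analy t
  partition by (t.off, t.prod, t.agent, t.subagent, t.indi)
  right outer join (select distinct date1 from test_analy) d
    on (t.date1 = d.date1);

In practice the date list would come from a calendar dimension rather than from test_analy itself. Note this still materializes one row per combination per date, so it does not make the volume disappear - it only pushes the densification into the query.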

generating aggregates - analytic functions

Dev Madeka, August 09, 2009 - 3:48 am UTC

Hi Tom,
Not sure if you consider this a new question or a follow-up.
We have a data warehouse which has about 15 years' worth of data. One of the materialized views that we maintain in this DWH has 3 kinds of aggregates - Rolling Quarter (QTR), Year to Date (YTD) and Moving Annual Total (MAT). The sales history stores both dollars (_D) and units (_U). Currently this view is built procedurally and takes roughly 12 hrs to build - this is the major hog in our load process. We get data every month for the last 60 months from an external provider (since old data can change, we have to drop the last 59 months of data and load 60 months).

I have tried to use analytic functions to automate the creation of this view; however, since the sales fact is sparse (no rows when sales were zero), I have to do a cartesian join on pack, month and geography to make sure I have rows for the QTR, YTD and MAT sums. However, when I roll this out to production the cartesian join is going to blow up the number of rows in the query to about 4.7 billion+ (1600 geographies, 180 months, 16500 packs). A lot of packs haven't sold in years, and all I really need is for a pack to have 12 rows following a sale in any geography so that the MAT calculations are correct. I could probably add a filter which removes any rows where all the calculations result in zero - but I am afraid to try this on our prod database. Is there a smarter way to do this?
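One idea worth testing before resorting to the cartesian join - a sketch against the test tables below, not a verified production solution: RANGE windows are value-based rather than row-based, so a moving annual total computed over the real date tolerates missing months without any densification; a month with no sales simply contributes nothing to the window:

select s.pack_code, s.geo_code, d.dt,
       sum(s.mth_d) over (partition by s.pack_code, s.geo_code
                          order by d.dt
                          range between numtoyminterval(11,'month') preceding
                                    and current row) mat_d
  from test_sales s, test_dt d
 where s.dt_code = d.dt_code;

This assumes month-start dates as in TEST_DT (month intervals can fail on month-end dates), and it only emits rows for months that actually have sales. If the report needs a row for each of the 12 months following a sale, some densification is still required - but a partitioned outer join per selling pack/geography combination is far smaller than the full pack x month x geography cartesian product.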


CREATE TABLE "TEST_DT"
( "DT_CODE" NUMBER(5,0) NOT NULL ENABLE,
"DT" DATE NOT NULL ENABLE,
"DT_CHAR" VARCHAR2(20 BYTE) NOT NULL ENABLE,
CONSTRAINT "TEST_DT_PK" PRIMARY KEY ("DT_CODE")
);



CREATE TABLE "TEST_GEO"
( "GEO_CODE" NUMBER(5,0) NOT NULL ENABLE,
"GEO_NAME" VARCHAR2(20 BYTE) NOT NULL ENABLE,
CONSTRAINT "TEST_GEO_PK" PRIMARY KEY ("GEO_CODE")
);




CREATE TABLE "TEST_PACK"
( "PACK_CODE" NUMBER(5,0) NOT NULL ENABLE,
"PACK_NAME" VARCHAR2(20 BYTE) NOT NULL ENABLE,
CONSTRAINT "PACK_GEO_PK" PRIMARY KEY ("PACK_CODE")
);



CREATE TABLE"TEST_SALES"
( "DT_CODE" NUMBER(5,0) NOT NULL ENABLE,
"GEO_CODE" NUMBER(5,0) NOT NULL ENABLE,
"PACK_CODE" NUMBER(5,0) NOT NULL ENABLE,
"MTH_D" NUMBER,
"MTH_U" NUMBER,
CONSTRAINT "TEST_SALES_PK" PRIMARY KEY ("DT_CODE", "GEO_CODE", "PACK_CODE")
);



REM INSERTING into TEST_DT
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (2,to_timestamp('01/FEB/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Feb-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (3,to_timestamp('01/MAR/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Mar-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (4,to_timestamp('01/APR/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Apr-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (5,to_timestamp('01/MAY/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-May-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (6,to_timestamp('01/JUN/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Jun-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (7,to_timestamp('01/JUL/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Jul-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (8,to_timestamp('01/AUG/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Aug-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (9,to_timestamp('01/SEP/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Sep-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (10,to_timestamp('01/OCT/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Oct-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (11,to_timestamp('01/NOV/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Nov-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (12,to_timestamp('01/DEC/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Dec-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (13,to_timestamp('01/JAN/09','DD/MON/RR HH12:MI:SSXFF AM'),'01-Jan-2009');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (14,to_timestamp('01/FEB/09','DD/MON/RR HH12:MI:SSXFF AM'),'01-Feb-2009');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (15,to_timestamp('01/MAR/09','DD/MON/RR HH12:MI:SSXFF AM'),'01-Mar-2009');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (16,to_timestamp('01/APR/09','DD/MON/RR HH12:MI:SSXFF AM'),'01-Apr-2009');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (1,to_timestamp('01/JAN/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Jan-2008');


REM INSERTING into TEST_GEO
Insert into TEST_GEO (GEO_CODE,GEO_NAME) values (20000,'SYDNEY');
Insert into TEST_GEO (GEO_CODE,GEO_NAME) values (20001,'MELBOURNE');
Insert into TEST_GEO (GEO_CODE,GEO_NAME) values (20002,'BRISBANE');

REM INSERTING into TEST_PACK
Insert into TEST_PACK (PACK_CODE,PACK_NAME) values (1000,'SONITES');
Insert into TEST_PACK (PACK_CODE,PACK_NAME) values (2000,'VODITES');


REM INSERTING into TEST_SALES
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20000,1000,2250,75);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20001,1000,2310,77);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20002,1000,1140,38);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (2,20000,1000,1650,55);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (2,20001,1000,2400,80);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20000,1000,2460,82);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20001,1000,210,7);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20002,1000,1830,61);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (4,20000,1000,1590,53);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (4,20002,1000,2400,80);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20001,1000,1620,54);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20002,1000,480,16);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20000,1000,180,6);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20001,1000,2220,74);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20002,1000,2400,80);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (7,20000,1000,2070,69);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (7,20002,1000,1230,41);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20000,1000,2850,95);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20001,1000,2430,81);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20002,1000,540,18);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20000,1000,2760,92);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20001,1000,900,30);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20002,1000,270,9);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20000,1000,1080,36);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20002,1000,420,14);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20000,1000,600,20);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20001,1000,2490,83);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20002,1000,2490,83);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (12,20000,1000,330,11);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (12,20002,1000,1980,66);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (13,20000,1000,180,6);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (13,20001,1000,990,33);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20000,1000,1290,43);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20001,1000,360,12);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20002,1000,1950,65);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (15,20000,1000,1140,38);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (15,20002,1000,930,31);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20000,1000,750,25);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20002,1000,1230,41);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20000,2000,250,5);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20001,2000,2010,67);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (2,20000,2000,360,12);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (2,20001,2000,330,11);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20000,2000,2070,69);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20001,2000,2190,73);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20002,2000,870,29);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (4,20000,2000,930,31);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (4,20001,2000,750,25);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20000,2000,120,4);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20001,2000,1860,62);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20002,2000,1260,42);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20000,2000,2940,98);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20001,2000,2790,93);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20002,2000,150,5);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (7,20001,2000,1050,35);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (7,20002,2000,510,17);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20000,2000,1080,36);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20001,2000,2820,94);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20002,2000,960,32);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20000,2000,60,2);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20001,2000,1320,44);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20002,2000,1230,41);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20000,2000,1020,34);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20001,2000,1050,35);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20002,2000,2880,96);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20001,2000,2040,68);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20002,2000,750,25);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (12,20001,2000,2250,75);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (12,20002,2000,2100,70);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (13,20001,2000,390,13);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (13,20002,2000,2700,90);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20000,2000,2910,97);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20001,2000,1770,59);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20002,2000,1410,47);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (15,20001,2000,990,33);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (15,20002,2000,60,2);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20000,2000,1050,35);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20001,2000,2250,75);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20002,2000,1230,41);


--query to generate the MV
select all_combi.dt, all_combi.geo_code, all_combi.pack_code,
nvl(mth_d,0) mth_d,
nvl(mth_u,0) mth_u,
sum(nvl(mth_d,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows 2 preceding) qtr_d,
sum(nvl(mth_u,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows 2 preceding) qtr_u,
sum(nvl(mth_d,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows 11 preceding) mat_d,
sum(nvl(mth_u,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows 11 preceding) mat_u,
sum(nvl(mth_d,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows
to_char(dt,'MM')-1 preceding) ytd_d,
sum(nvl(mth_u,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows
to_char(dt,'MM')-1 preceding) ytd_u
from
(select dt_code, dt, geo_code, pack_code
from test_dt, test_geo, test_pack ) all_combi,
test_sales
where
all_combi.dt_code = test_sales.dt_code(+)
and all_combi.geo_code = test_sales.geo_code(+)
and all_combi.pack_code = test_sales.pack_code(+)
order by all_combi.pack_code, all_combi.dt, all_combi.geo_code




Tom Kyte
August 11, 2009 - 1:33 pm UTC

how can you use analytics (which do not aggregate rows, they preserve the rows) to generate your aggregates?

and since the analytics would not be incrementally refreshable - would this be a good idea at all?

seems you would just build quarterly rollups which could be used to generate the other rollups?
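
For illustration only -- a minimal sketch of such a quarterly rollup, using the TEST_SALES script above (the ceil(dt_code/3) month-to-quarter mapping is an assumption; the real calendar lives in TEST_DT):

select geo_code, pack_code,
       ceil(dt_code/3) qtr,              -- assumed: dt_code 1..12 are months
       sum(mth_d) qtr_d,
       sum(mth_u) qtr_u
  from test_sales
 group by geo_code, pack_code, ceil(dt_code/3);

Such a rollup is a plain aggregate, so it could be maintained as a materialized view -- unlike the analytic version.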

Using analytic functions to calculate sales for the last three months

Prats, September 16, 2009 - 6:11 am UTC

Tom,

I need to write a query to provide the sum of sale amount for the last three months. I tried going through your website and tried a few of the options available on this page, but couldn't come up with a suitable solution. Please HELP...

create table sales (saledate date, amount number);

insert into sales values(to_date('01/01/2009','mm/dd/yyyy'), 10);
insert into sales values(to_date('02/01/2009','mm/dd/yyyy'), 1);
insert into sales values(to_date('03/01/2009','mm/dd/yyyy'), NULL);
insert into sales values(to_date('04/01/2009','mm/dd/yyyy'), 5);
insert into sales values(to_date('05/01/2009','mm/dd/yyyy'), 2);
insert into sales values(to_date('06/01/2009','mm/dd/yyyy'), 4);
insert into sales values(to_date('07/01/2009','mm/dd/yyyy'), 7);
insert into sales values(to_date('08/01/2009','mm/dd/yyyy'), 8);
insert into sales values(to_date('09/01/2009','mm/dd/yyyy'), 2);
insert into sales values(to_date('10/01/2009','mm/dd/yyyy'), 4);
insert into sales values(to_date('11/01/2009','mm/dd/yyyy'), 3);
insert into sales values(to_date('12/01/2009','mm/dd/yyyy'), 0);

Desired solution :
12/01/2009 7
11/01/2009 9
10/01/2009 14
09/01/2009 17
08/01/2009 19
......
......
......

Tom Kyte
September 16, 2009 - 4:22 pm UTC

ops$tkyte%ORA11GR1> select saledate, amount, add_months(saledate,-2),
  2         sum(amount) over (order by saledate asc range saledate-add_months(saledate,-2) preceding) sum_amt
  3    from sales
  4   order by saledate DESC
  5  /

SALEDATE       AMOUNT ADD_MONTHS    SUM_AMT
---------- ---------- ---------- ----------
12/01/2009          0 10/01/2009          7
11/01/2009          3 09/01/2009          9
10/01/2009          4 08/01/2009         14
09/01/2009          2 07/01/2009         17
08/01/2009          8 06/01/2009         19
07/01/2009          7 05/01/2009         13
06/01/2009          4 04/01/2009         11
05/01/2009          2 03/01/2009          7
04/01/2009          5 02/01/2009          6
03/01/2009            01/01/2009         11
02/01/2009          1 12/01/2008         11
01/01/2009         10 11/01/2008         10

12 rows selected.
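
A note on that window: "range saledate-add_months(saledate,-2) preceding" computes, per row, a width in days equal to the span back to the same day two months earlier. For first-of-month dates like these, an interval window expresses the same thing more directly -- a sketch:

select saledate, amount,
       sum(amount) over (order by saledate
                         range interval '2' month preceding) sum_amt
  from sales
 order by saledate desc;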



Prats, September 24, 2009 - 8:41 am UTC

Thanks for the query Tom.
One small problem I have been stuck with for the last few days is the handling of missing months.
Say I don't have an entry for April 2009: the query output wouldn't show the record corresponding to April 2009.
Can we do something about it? Please guide...


SALEDATE       AMOUNT ADD_MONTHS    SUM_AMT
---------- ---------- ---------- ----------
07/01/2009          7 05/01/2009         13
06/01/2009          4 04/01/2009         11
05/01/2009          2 03/01/2009          7
>>>04/01/2009       5 02/01/2009          6<<<
03/01/2009            01/01/2009         11
02/01/2009          1 12/01/2008         11
01/01/2009         10 11/01/2008         10



Tom Kyte
September 29, 2009 - 7:55 am UTC

ops$tkyte%ORA11GR1> select * from sales;

SALEDATE       AMOUNT
---------- ----------
01/01/2009         10
02/01/2009          1
03/01/2009
05/01/2009          2
06/01/2009          4
07/01/2009          7
08/01/2009          8
09/01/2009          2
10/01/2009          4
11/01/2009          3
12/01/2009          0

11 rows selected.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> with min_date as (select min(saledate) min_sd from sales),
  2       max_date as (select max(saledate) max_sd from sales),
  3           dates as (select add_months(min_sd,level-1) dt
  4                       from min_date, max_date
  5                            connect by level <= months_between(max_sd,min_sd)+1)
  6  select saledate, amount, add_months(saledate,-2),
  7         sum(amount) over (order by saledate asc range saledate-add_months(saledate,-2) preceding) sum_amt
  8    from (select dates.dt as saledate,
  9                 nvl(sales.amount,0) amount
 10            from dates, sales
 11                   where dates.dt = sales.saledate(+))
 12   order by saledate DESC
 13  /

SALEDATE       AMOUNT ADD_MONTHS    SUM_AMT
---------- ---------- ---------- ----------
12/01/2009          0 10/01/2009          7
11/01/2009          3 09/01/2009          9
10/01/2009          4 08/01/2009         14
09/01/2009          2 07/01/2009         17
08/01/2009          8 06/01/2009         19
07/01/2009          7 05/01/2009         13
06/01/2009          4 04/01/2009          6
05/01/2009          2 03/01/2009          2
04/01/2009          0 02/01/2009          1
03/01/2009          0 01/01/2009         11
02/01/2009          1 12/01/2008         11
01/01/2009         10 11/01/2008         10

12 rows selected.


Lag and Lead (analytic function)

Daphnia, September 29, 2009 - 6:03 am UTC

Tom,

I need your help.

Kindly see the below data.
-------------------------------------
REC_ID YEAR_MONTH   YR    VAL
-------------------------------------
10     200101      2001   null --> ignored
10     200102      2001   null --> ignored
10     200103      2001   2356
10     200104      2001   7898 ----> upper value (uv)
10     200105      2001   null -> 1+(uv+lv/count) = 7239
10     200106      2001   null -> 2+(uv+lv/count) = 7240
10     200107      2001   6578 ----> lower value (lv)
10     200108      2001   6878 ----> upper value (uv)
10     200109      2001   null
10     200110      2001   null
10     200111      2001   null
10     200112      2001   8078 ----> lower value (lv)
---------------------------------------
I need to calculate the null values by using the upper value (the value just above the nulls), the
lower value (the first value after the nulls) and the total null count. The logic goes
like n+(uv+lv/count), filling the result into the null rows. The count is the in-between count of the
uv and lv. Moreover, nulls before the first upper value must be ignored for rec_id 10. How can we
achieve this? Kindly help me. The query which I tried is:

SELECT YEAR_MONTH,REC_ID,val,
last_VALUE(val IGNORE NULLS) over (PARTITION BY REC_ID ORDER BY YEAR_MONTH) last_val,
CASE WHEN (val IS NOT NULL AND previous_value IS NOT NULL AND
next_value IS NULL) THEN 'BEFORE_NULL'
WHEN (val IS NOT NULL AND previous_value IS NULL AND
next_value IS NOT NULL) THEN 'AFTER_NULL'
END AS record_position ,
previous_value,
next_value
FROM (SELECT ROWNUM row_num,
REC_ID,
YEAR_MONTH,
YR,
VAL,
lag(val) over(PARTITION BY REC_ID ORDER BY YEAR_MONTH) previous_value,
lead(val) over(PARTITION BY REC_ID ORDER BY YEAR_MONTH) next_value
FROM TABLE_X
WHERE YR = 2001
AND REC_ID = 10
ORDER BY YEAR_MONTH);

but, how to calculate and update it over there in the null. kindly help me.
---------------------------------------------------------
scripts
---------------------------------------------------------
CREATE TABLE TABLE_X
(REC_ID NUMBER,
YEAR_MONTH NUMBER,
YR NUMBER,
VAL NUMBER);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200101, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200102, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200103, 2001, 7898);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200104, 2001, 2356);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200105, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200106, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200107, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200108, 2001, 6578);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200109, 2001, 6878);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200110, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200111, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200112, 2001, 8078);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200202, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200203, 2002, 14225);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200204, 2002, 20554);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200205, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200206, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200207, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200208, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200209, 2002, 52199);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200210, 2002, 58528);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200211, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200212, 2002, 71186);

--------------------------------------------------------

Awaiting your reply


Tom Kyte
October 02, 2009 - 9:07 am UTC

I hate it when the same question pops up all over the place. Look in the other place you asked the same exact thing.


http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:16389714357623

Analytic functions (sum over partition) with rollup?

johny, November 25, 2009 - 7:59 am UTC

Hi Tom,

Consider:

Oracle9i Enterprise Edition Release 9.2.0.6.0

CREATE TABLE reportstg
(
maid VARCHAR2(4),
maprice NUMBER(5,2),
mapart VARCHAR2(5),
irid VARCHAR2(4),
irdateyymm VARCHAR2(4),
irstatus VARCHAR2(4),
iramount NUMBER(5,2)
)

insert into reportstg
values ('ma1', 100.00, 'part1', 'ir1', to_char(sysdate, 'YYMM'), 'DOWN', 100.00);

insert into reportstg
values ('ma1', 100.00, 'part1', 'ir2', to_char(sysdate, 'YYMM'), 'UP', 100.00);

insert into reportstg
values ('ma1', 100.00, 'part1', 'ir3', to_char(add_months(sysdate, -2), 'YYMM'), 'DOWN', 200.00);

insert into reportstg
values ('ma1', 150.00, 'part2', 'ir4', to_char(sysdate,'YYMM'), 'DOWN', 50.00);

insert into reportstg
values ('ma1', 150.00, 'part2', 'ir5', to_char(sysdate,'YYMM'), 'UP', 60.00);

insert into reportstg
values ('ma1', 10.00, 'part3', 'ir6', to_char(add_months(sysdate, -1), 'YYMM'), 'UP', 20.00);

insert into reportstg
values ('ma1', 10.00, 'part4', 'ir7', to_char(sysdate, 'YYMM'), 'DOWN', 20.00);

insert into reportstg
values ('ma1', 10.00, 'part4', 'ir8', to_char(add_months(sysdate, -1), 'YYMM'), 'DOWN', 20.00);

insert into reportstg
values ('ma1', 15.00, 'part5', 'ir9', to_char(sysdate, 'YYMM'), 'DOWN', 10.00);

insert into reportstg
values ('ma2', 100.00, 'part6', 'ir7', to_char(add_months(sysdate, -2), 'YYMM'), 'DOWN', 500.00);

insert into reportstg
values ('ma2', 100.00, 'part6', 'ir8', to_char(sysdate, 'YYMM'), 'UP', 20.00);

insert into reportstg
values ('ma2', 150.00, 'part7', 'ir9', to_char(add_months(sysdate, -1), 'YYMM'), 'DOWN', 110.00);

insert into reportstg
values ('ma2', 10.00, 'part8', 'ir10', to_char(sysdate, 'YYMM'), 'DOWN', 120.00);

insert into reportstg
values ('ma3', 10.00, 'part9', 'ir11', to_char(sysdate, 'YYMM'), 'DOWN', 220.00);

select * from reportstg;

MAID    MAPRICE MAPAR IRID IRDA IRST   IRAMOUNT
---- ---------- ----- ---- ---- ---- ----------
ma1         100 part1 ir1  0911 DOWN        100
ma1         100 part1 ir2  0911 UP          100
ma1         100 part1 ir3  0909 DOWN        200
ma1         150 part2 ir4  0911 DOWN         50
ma1         150 part2 ir5  0911 UP           60
ma1          10 part3 ir6  0910 UP           20
ma1          10 part4 ir7  0911 DOWN         20
ma1          10 part4 ir8  0910 DOWN         20
ma1          15 part5 ir9  0911 DOWN         10
ma2         100 part6 ir7  0909 DOWN        500
ma2         100 part6 ir8  0911 UP           20
ma2         150 part7 ir9  0910 DOWN        110
ma3          10 part9 ir11 0911 DOWN        220
ma2          10 part8 ir10 0911 DOWN        120

14 rows selected.

select distinct rpt.maid,
rpt.maprice,
rpt.mapart,
rem.downamt,
rem.upamt,
sum(rpt.iramount) over (partition by rpt.maid, rpt.mapart) yrtotal
from reportstg rpt,
(select maid,
maprice,
mapart,
max(decode(irstatus, 'DOWN', monthtotal, null)) downamt,
max(decode(irstatus, 'UP', monthtotal, null)) upamt
from (
select distinct maid,
maprice,
mapart,
irstatus,
sum(iramount) over (partition by maid, mapart, irdateyymm, irstatus) monthtotal
from reportstg
where irdateyymm = '0911'
)
group by maid, maprice, mapart) rem
where rpt.maid = rem.maid(+)
and rpt.mapart = rem.mapart(+)
and rpt.irstatus != 'UP'

MAID    MAPRICE MAPAR    DOWNAMT      UPAMT    YRTOTAL
---- ---------- ----- ---------- ---------- ----------
ma1         100 part1        100        100        300
ma1         150 part2         50         60         50
ma1          10 part4         20                    40
ma1          15 part5         10                    10
ma2         100 part6                    20        500
ma2         150 part7                              110
ma2          10 part8        120                   120
ma3          10 part9        220                   220

8 rows selected.

Basically, I wanted a query to return the total amount of status 'DOWN' and 'UP'
on the most recent irdateyymm ('0911'), pivot via status, then get the total for the entire year -
for the entire year total, exclude the 'UP' status.

Some quick questions:
a. Is my query good? Not sure if there's any easy way on this.
b. Is it possible to have the output something like this:

MAID    MAPRICE MAPAR    DOWNAMT      UPAMT    YRTOTAL
---- ---------- ----- ---------- ---------- ----------
ma1         100 part1        100        100        300
ma1         150 part2         50         60         50
ma1          10 part4         20                    40
ma1          15 part5         10                    10
ma1  total                   180                   400
ma2         100 part6                    20        500
ma2         150 part7                              110
ma2          10 part8        120                   120
ma2  total                   120                   730
ma3          10 part9        220                   220
ma3  total                   220                   220

I wanted to rollup the totals based on maid. Thanks in advance.
Tom Kyte
November 27, 2009 - 3:43 pm UTC

ops$tkyte%ORA9IR2> select maid, decode( grouping(maprice), 1, 'Total:' ), maprice, mapart, downamt, upamt, sum(yrtotal)
  2    from (
  3  select distinct rpt.maid,
  4         rpt.maprice,
  5         rpt.mapart,
  6         rem.downamt,
  7         rem.upamt,
  8         sum(rpt.iramount) over (partition by rpt.maid, rpt.mapart) yrtotal
  9  from reportstg rpt,
 10  (select maid,
 11         maprice,
 12         mapart,
 13         max(decode(irstatus, 'DOWN', monthtotal, null)) downamt,
 14         max(decode(irstatus, 'UP', monthtotal, null)) upamt
 15    from (
 16         select distinct maid,
 17                maprice,
 18                mapart,
 19                irstatus,
 20                sum(iramount) over (partition by maid, mapart, irdateyymm,
 21  irstatus) monthtotal
 22         from reportstg
 23         where irdateyymm = '0911'
 24         )
 25  group by maid, maprice, mapart) rem
 26  where rpt.maid = rem.maid(+)
 27  and rpt.mapart = rem.mapart(+)
 28  and rpt.irstatus != 'UP'
 29        )
 30   group by grouping sets( (maid,maprice,mapart,downamt,upamt), (maid))
 31  /

MAID DECODE    MAPRICE MAPAR    DOWNAMT      UPAMT SUM(YRTOTAL)
---- ------ ---------- ----- ---------- ---------- ------------
ma1                 10 part4         20                      40
ma1                 15 part5         10                      10
ma1                100 part1        100        100          300
ma1                150 part2         50         60           50
ma1  Total:                                                 400
ma2                 10 part8        120                     120
ma2                100 part6                    20          500
ma2                150 part7                                110
ma2  Total:                                                 730
ma3                 10 part9        220                     220
ma3  Total:                                                 220

11 rows selected.

Analytic Functions

johny, November 28, 2009 - 8:36 am UTC

Awesome. Thanks a lot! :)

Analytic Functions

anon, December 18, 2009 - 11:54 am UTC

Hi Tom,

If I have table like this:

create table x (a number, b number);

insert into x values (1,1);
insert into x values (1,1);
insert into x values (1,2);
insert into x values (1,3);

Is it possible to select out a running count of distinct occurrences of b:

e.g.
a  b  result
------------
1  1  1
1  1  1
1  2  2
1  3  3

Tom Kyte
December 18, 2009 - 1:06 pm UTC
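
A sketch of one approach, assuming the running distinct count is taken in order of b within each a: dense_rank() numbers the distinct values consecutively, which is exactly the requested result.

select a, b,
       dense_rank() over (partition by a order by b) result
  from x;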

Ranking date overlaps

A reader, December 22, 2009 - 7:55 pm UTC

Hi Tom,

How can I get ranking from datetime overlaps

CODE  START_DATE       END_DATE          RN
----- ---------------- ---------------- ---
111   01-JAN-09 08:00  01-JAN-09 10:00    1
111   01-JAN-09 08:45  01-JAN-09 10:05    2
111   01-JAN-09 09:20  01-JAN-09 10:10    3
111   01-JAN-09 10:30  01-JAN-09 11:10    1
111   01-JAN-09 10:25  01-JAN-09 11:12    2
111   01-JAN-09 13:12  01-JAN-09 15:24    1
111   01-JAN-09 18:56  01-JAN-09 22:21    1

222   01-JAN-09 01:56  01-JAN-09 02:28    1
222   01-JAN-09 02:10  01-JAN-09 02:30    2
222   01-JAN-09 02:12  01-JAN-09 02:32    3
222   01-JAN-09 03:22  01-JAN-09 03:23    1
222   01-JAN-09 06:22  01-JAN-09 08:23    1





Tom Kyte
December 31, 2009 - 11:12 am UTC

you'd have to start by defining what:

... ranking from datetime overlaps
...


means, it means nothing to me.

Generating max+1

AG, January 19, 2010 - 9:53 pm UTC

Tom,

Needed some help to write a query that would assign/generate max+1 when there is null.

create table test(loc_name varchar2(10), id varchar2(10), lk number);

insert into test values('A','U1',0);
insert into test values('A','U2',0);
insert into test values('A','U3',0);
insert into test values('A','U4',2);
insert into test values('A','U5',3);
insert into test values('A','U6',5);
insert into test values('A','U7',null);
insert into test values('A','U8',null);

select * from test;

loc  id  lk
---- --  --
A    U1   0
A    U2   0
A    U3   0
A    U4   2
A    U5   3
A    U6   5
A    U7
A    U8




Result:


loc  id  lk
---- --  --
A    U1   0
A    U2   0
A    U3   0
A    U4   2
A    U5   3
A    U6   5
A    U7   6 <== (5+1)
A    U8   7 <== (6+1)


Thanks in Advance for any help.

AG
Tom Kyte
January 20, 2010 - 11:28 am UTC

I see no order by; the nulls will come out wherever they feel like it, and they won't come out last in many cases.


and what if the data is:

insert into test values('A','U1',0);
insert into test values('A','U2',0);
insert into test values('A','U3',0);
insert into test values('A','U4',2);
insert into test values('A','U5',3);
insert into test values('A','U6',5);
insert into test values('A','U7',null);
insert into test values('A','U8',null);
insert into test values('A','U9',6);

what then? More details - give all details, use an order by, think about strange situations and get back to us.

AG, January 20, 2010 - 8:05 am UTC

Tom,

Sorry about not posting the example clearly. Actually, the requirement is to write a query that would assign/generate, within a group/partition, max+1 when lk is null.

create table test(loc_name varchar2(10), id varchar2(10), lk number);

insert into test values('A','U1',0);
insert into test values('A','U2',0);
insert into test values('A','U3',0);
insert into test values('A','U4',2);
insert into test values('A','U5',3);
insert into test values('A','U6',5);
insert into test values('A','U7',null);
insert into test values('A','U8',null);
insert into test values('B','S1',0);
insert into test values('B','S2',0);
insert into test values('B','S3',null);
insert into test values('B','S4',null);


select * from test;

LOC_NAME   ID                 LK
---------- ---------- ----------
A          U1                  0
A          U2                  0
A          U3                  0
A          U4                  2
A          U5                  3
A          U6                  5
A          U7
A          U8
B          S1                  0
B          S2                  0
B          S3
B          S4

12 rows selected.




Result:


loc  id  lk
---- --  --
A    U1   0
A    U2   0
A    U3   0
A    U4   2
A    U5   3
A    U6   5
A    U7   6 <== (5+1)
A    U8   7 <== (6+1)
B    S1   0
B    S2   0
B    S3   1 <== (0+1)
B    S4   2 <== (1+1)


Thanks in Advance for any help.

AG

Tom Kyte
January 20, 2010 - 11:38 am UTC

See above, add my U9 to the example and explain how that works.
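
For what it is worth, a sketch of one literal reading (fill each null with the partition's max plus a running count of nulls, ordered by id). It reproduces the sample output, but as the U9 case shows, it silently assumes a non-null lk never follows a null:

select loc_name, id,
       nvl(lk,
           max(lk) over (partition by loc_name)
           + sum(nvl2(lk,0,1)) over (partition by loc_name order by id)) lk
  from test
 order by loc_name, id;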

Analytical function

Sudhakar, January 29, 2010 - 1:48 am UTC

Hi Tom,

create table daily_data(id number, status_id number, date_time date)

insert into daily_data values(1,0,to_date('12/4/2009 9:40:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,0,to_date('12/4/2009 9:47:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,1,to_date('12/4/2009 9:57:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,0,to_date('12/4/2009 10:57:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,0,to_date('12/4/2009 11:10:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,1,to_date('12/4/2009 11:17:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,1,to_date('12/4/2009 11:47:06','mm/dd/yyyy hh:mi:ss'));

As per the above data, I need to get the output as the time gap between a status_id of "1" and the next immediate status_id of "0".

For example
-----------
1) 12/4/2009 9:57:06 - 12/4/2009 10:57:06 =60 minutes
2) 12/4/2009 11:17:06 - 12/4/2009 11:47:06 = 30 minutes.

Could you please help me with a SQL statement to give the output specified above.

Best regards & Thank you,
Sudhakar
Tom Kyte
January 29, 2010 - 4:18 pm UTC

your output doesn't compute.


the time 11:47:06 corresponds to a status_id = 1 record, it is not the next 0 record.
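
For the literal requirement -- a status_id 1 row followed immediately by a status_id 0 row -- lead() is enough. A sketch; note it returns only the first of the two intervals above, consistent with the objection about 11:47:06:

select date_time, next_time,
       (next_time - date_time) * 24 * 60 as minutes
  from (select status_id, date_time,
               lead(status_id) over (order by date_time) next_status,
               lead(date_time)  over (order by date_time) next_time
          from daily_data)
 where status_id = 1
   and next_status = 0;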

Analytic Functions

Stephan, February 03, 2010 - 11:20 am UTC

Hi Tom,
i have below table data

select * from test;
fr_dt      to_dt       amount
01/01/2010 31/12/2010   1000
01/01/2011 31/12/2011   1000
01/01/2012 31/12/2012   1400
01/01/2013 31/12/2013   1400
01/01/2014 31/12/2014   1400
01/01/2015 31/12/2015   1400
01/01/2016 31/12/2016   1000
01/01/2017 31/12/2017   1000
01/01/2018 31/12/2018   1000

how to write a query to output as below.
fr_dt      to_dt       amount
01/01/2010 31/12/2011   1000
01/01/2012 31/12/2015   1400
01/01/2016 31/12/2018   1000

Please help.
Tom Kyte
February 03, 2010 - 12:28 pm UTC

... how to write a query to output as below. ...

use sql. I would.


You'll be able to do it because

a) you have the create table ( I don't)
b) you have the inserts ready to go (i don't)

It looks pretty easy. If I had a create table and inserts, I could even demonstrate it for you :)

remember my nls_date_format isn't yours, make it easy to run your test script.

Analytic Functions

Stephan, February 03, 2010 - 5:19 pm UTC

Thanks for your swift reply.

here is insert statements

create table test (fr_dt date,to_dt date,amount number);
insert into test values(to_date('01/01/2010','dd/mm/yyyy'),to_date('31/12/2010','dd/mm/yyyy'),1000);
insert into test values(to_date('01/01/2011','dd/mm/yyyy'),to_date('31/12/2011','dd/mm/yyyy'),1000);
insert into test values(to_date('01/01/2012','dd/mm/yyyy'),to_date('31/12/2012','dd/mm/yyyy'),1400);
insert into test values(to_date('01/01/2013','dd/mm/yyyy'),to_date('31/12/2013','dd/mm/yyyy'),1400);
insert into test values(to_date('01/01/2014','dd/mm/yyyy'),to_date('31/12/2014','dd/mm/yyyy'),1400);
insert into test values(to_date('01/01/2015','dd/mm/yyyy'),to_date('31/12/2015','dd/mm/yyyy'),1400);
insert into test values(to_date('01/01/2016','dd/mm/yyyy'),to_date('31/12/2016','dd/mm/yyyy'),1000);
insert into test values(to_date('01/01/2017','dd/mm/yyyy'),to_date('31/12/2017','dd/mm/yyyy'),1000);
insert into test values(to_date('01/01/2018','dd/mm/yyyy'),to_date('31/12/2018','dd/mm/yyyy'),1000);

Tom Kyte
February 04, 2010 - 12:20 pm UTC

ops$tkyte%ORA10GR2> select min(fr_dt), max(to_dt), max(amount)
  2    from (
  3  select fr_dt, to_dt, amount, last_value(grp ignore nulls) over (order by fr_dt) carry_down
  4    from (
  5  select fr_dt, to_dt, amount,
  6         case when nvl(lag(amount) over (order by fr_dt),-amount) <> amount
  7              then row_number() over (order by fr_dt)
  8          end grp
  9    from test
 10         )
 11         )
 12   group by carry_down
 13   order by 1;

MIN(FR_DT MAX(TO_DT MAX(AMOUNT)
--------- --------- -----------
01-JAN-10 31-DEC-11        1000
01-JAN-12 31-DEC-15        1400
01-JAN-16 31-DEC-18        1000
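
The pattern here: the innermost select marks the first row of each run of consecutive equal amounts with its row_number() (all other rows get null), last_value(... ignore nulls) carries that marker down through the rest of the run, and the outer group by collapses each run into a single row.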

Ugly but works... I think?

A reader, February 04, 2010 - 8:47 am UTC

select fr_dt, to_dt, amount
from
(
select case when start_pos is not null then fr_dt end as fr_dt,
lead(case when end_pos is not null then to_dt end ) over (order by fr_dt) as to_dt,
amount
from
(
select fr_dt, to_dt, amount, start_pos, case when ( start_pos is null and lead(start_pos,1,1) over (order by to_dt) is not null ) then 'End' else null end as end_pos
from
(
select fr_dt, to_dt, amount,
case when amount = lag(amount) over (order by to_dt) then null else 'Start' end as start_pos
from test
)
)
where start_pos is not null or end_pos is not null
)
where fr_dt is not null and to_dt is not null

Analytic Functions

Stephan, February 04, 2010 - 2:40 pm UTC

Thanks Tom, you're the best :)

Only 2 SELECTs

CezarN, February 05, 2010 - 12:30 pm UTC

select distinct
case when b=e or b>0 then fr_dt else lag(fr_dt) over (order by fr_dt) end as fr_dt,
case when e>0 then to_dt else lead(to_dt) over (order by fr_dt) end as to_dt,
amount
from
(
select case when lag(t.amount,1,-t.amount) over (order by fr_dt) =t.amount then 0 else row_number() over(order by fr_dt) end b,
case when lead(t.amount,1,-t.amount) over (order by fr_dt)=t.amount then 0 else row_number() over(order by fr_dt) end e,
t.*
from test t
)
where b+e<>0

eliminate records through analytics

Sanji, June 01, 2010 - 12:52 pm UTC

Tom,

Scenario

create table test1 (party_id varchar2(5), sam_flag varchar2(1));

insert into test1 values('123','N');
insert into test1 values('456','N');
insert into test1 values('789','N');
insert into test1 values('ABC','Y');
insert into test1 values('ABC','Y');
insert into test1 values('ABC','N');
insert into test1 values('DEF','N');
insert into test1 values('DEF','Y');
insert into test1 values('DEF','N');
insert into test1 values('GHI','Y');
insert into test1 values('GHI','Y');
insert into test1 values('GHI','N');

I'm attempting to rewrite the following query through analytics

SELECT * FROM TEST1
WHERE (PARTY_ID, SAM_FLAG) IN
(SELECT PARTY_ID,SAM_FLAG FROM TEST1 WHERE SAM_FLAG='Y')
OR PARTY_ID NOT IN
(SELECT DISTINCT PARTY_ID FROM TEST1 WHERE SAM_FLAG = 'Y' )
ORDER BY PARTY_ID

The result has to look like

PARTY SAM_FLAG
----- --------
123   N
456   Y
789   N
ABC   Y
ABC   Y
DEF   Y
GHI   Y
GHI   Y

I tried the following query but need logic to eliminate certain records:

SELECT PARTY_ID, SAM_FLAG,
ROW_NUMBER() OVER (PARTITION BY PARTY_ID ORDER BY SAM_FLAG) AS RANK1
FROM TEST1
/

PARTY S SALES_RANK
----- - ----------
123   N          1

456   Y          1

789   N          1

ABC   N          1
      Y          2
      Y          2

DEF   N          1
      N          1
      Y          3

GHI   N          1
      Y          2
      Y          2

The following records should not be part of the result set:
'ABC','N'
'DEF','N'
'DEF','N'
'GHI','N'

Would appreciate any insight.

Thanks
Sanji
Tom Kyte
June 08, 2010 - 8:51 am UTC

it would have been really nice to phrase the question as a question - rather than just giving a sql statement and thus forcing us to live in the confines of that sql statement.

we'll make the safe (hopefully) assumption that party_id is NOT NULL in real life.

You want (apparently) every row that is a 'Y' if there exists at least one 'Y' row for a party_id

OR

You want all rows if there are NO 'Y' records.


So, just see if there is a 'Y' record by party_id (easy) and then filter on its existence or lack thereof.




ops$tkyte%ORA11GR2> select party_id, sam_flag, sam_flag2
  2    from (
  3  select party_id, sam_flag,
  4         max(case when sam_flag = 'Y' then sam_flag end) over (partition by party_id) sam_flag2
  5    from test1
  6         )
  7   where (sam_flag = 'Y' and sam_flag2 = 'Y')
  8      or (sam_flag2 is null)
  9   order by party_id
 10  /

PARTY S S
----- - -
123   N
456   N
789   N
ABC   Y Y
ABC   Y Y
DEF   Y Y
GHI   Y Y
GHI   Y Y

8 rows selected.




followup to previous post

Sanji, June 02, 2010 - 3:15 pm UTC

Tom,
I rewrote the above query (previous post from me) but am not sure if this is the efficient way.

OPEN:SANJI:FDCTL@ORADEV1>select party_id,sam_flag, rank2
2 from(
3 select party_id,
4 lead(sam_flag) over (partition by party_id order by sam_flag) as rank2,
5 sam_flag
6 from test1)
7 where rank2 is null
8 or (rank2 = 'Y' and sam_flag=rank2)
9 order by 1
10 /

PARTY_ID   SAM_F RANK2
---------- ----- -----
123        N

456        Y

789        N

ABC        Y     Y
           Y

DEF        Y

GHI        Y     Y
           Y

8 rows selected.

Thanks
Sanji
Tom Kyte
June 08, 2010 - 10:46 am UTC

see above.

analytical query

Ravi B, July 20, 2010 - 6:43 pm UTC

Hi Tom,

I have a requirement where I have to query a table with multiple limiting conditions and show the results as a single row. Would this be possible with analytic functions? This can be accomplished by set operations, but I don't want to hit the same table more than once to get the results. Would analytics help?

example:

drop table tab;

create table tab (id number, publisher varchar2(10),name varchar2(10),version varchar2(5),ignore varchar2(5));

insert into tab (id,publisher,name,version,ignore) values (1,'MS','MS1','1.0','no');
insert into tab (id,publisher,name,version,ignore) values (2,'MS','MS1','1.0','no');
insert into tab (id,publisher,name,version,ignore) values (3,'MS','MS2','1.2','no');
insert into tab (id,publisher,name,version,ignore) values (4,'MS','MS3','1.3','yes');
insert into tab (id,publisher,name,version,ignore) values (-1,'MS','MS4','1.4','yes');
--
insert into tab (id,publisher,name,version,ignore) values (5,'AB','AB1','1.0','no');
insert into tab (id,publisher,name,version,ignore) values (6,'AB','AB2','1.0','no');
insert into tab (id,publisher,name,version,ignore) values (7,'AB','AB2','1.2','no');
insert into tab (id,publisher,name,version,ignore) values (8,'AB','AB3','1.3','yes');
insert into tab (id,publisher,name,version,ignore) values (-2,'AB','AB1','1.4','yes');

commit;

Queries:
--------
SELECT Count(*) Total FROM (SELECT DISTINCT publisher, NAME, version FROM tab)

SELECT Count(*) yes FROM (SELECT DISTINCT publisher, NAME, version FROM tab where ignore = 'yes')

SELECT Count(*) no FROM (SELECT DISTINCT publisher, NAME, version FROM tab where ignore is null or ignore <> 'yes')

SELECT Count(*) unknown FROM (SELECT DISTINCT publisher, NAME, version FROM tab where id > -1)

Output wanted as:

Total Yes No Unknown
----------------------
9 4 5 7

Thanks,
Ravi
Tom Kyte
July 23, 2010 - 7:59 am UTC

ops$tkyte%ORA11GR1> select count(distinct publisher||'/'||name||'/'||version) total,
  2         count(distinct case when ignore = 'yes' then publisher||'/'||name||'/'||version end) yes,
  3         count(distinct case when ignore is null or ignore <> 'yes' then publisher||'/'||name||'/'||version end) no,
  4         count(distinct case when id > -1 then publisher||'/'||name||'/'||version end) unknown
  5    from tab;

     TOTAL        YES         NO    UNKNOWN
---------- ---------- ---------- ----------
         9          4          5          7
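
One caveat on the concatenation trick: the '/' separator assumes publisher, name and version can never themselves contain '/'. If they can, distinct combinations could collide; pick a delimiter that cannot appear in the data.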

A reader, September 08, 2010 - 5:59 am UTC

Hi Tom,

I want all the department names and the count of employees in each department whose salary is greater than the average salary of that department.

Could you please suggest an approach for this?
1. without analytic functions
2. with analytic functions

emp table ( emp name, sal,dept_id(fk))
dep table (dept_id(pk),dept_name)




Many thanks

Tom Kyte
September 09, 2010 - 8:41 pm UTC

I smell.......


HOMEWORK.


this is pretty trivial - doesn't require a DEP table at all. would require a real life create table and inserts if you wanted to see how I would do it.

hint:

select emp.*, avg(sal) over (partition by dept_id) avg_sal from emp;


that would assign to each row the avg salary by its department, now, use an inline view and apply a where clause.


hint 2:

select dept_id, avg(sal) avg_sal from emp group by dept_id

what if you ran that - that would be the avg_sal by dept_id, if you used an inline view and joined to THAT query from the emp table by dept_id, what where clause could you add to get the rows you needed?


A reader, September 08, 2010 - 8:56 am UTC

WITHOUT ANALYTIC functions
==============================

All the department names (with and with out employees also)

select dept_name ,(select count(*) from emp where emp.dept_id=dept.dept_id and
sal > ( select avg(sal) from emp where emp.dept_id=dept.dept_id) ) count
from dept where exists (select null from emp where emp.dept_id=dept.dept_id)

Only department names having employees
---------------------------------------------
select dept_name,x.count
from (
select count(*) count,dept_id from emp a where sal > (select avg(sal) from emp b where a.dept_id=b.dept_id)
group by dept_id) X, dept Y
where x.dept_id=y.dept_id



WITH ANALYTIC functions
=======================

Tom Kyte
September 09, 2010 - 8:41 pm UTC

wrong, try again :) read above

A reader, September 08, 2010 - 8:57 am UTC

Sorry - above post is 'incorrect'

WITHOUT ANALYTIC functions
==============================

We want all the department names (with and with out employees also)

select dept_name ,(select count(*) from emp where emp.dept_id=dept.dept_id and
sal > ( select avg(sal) from emp where emp.dept_id=dept.dept_id) ) count
from dept


We want only department names having employees

select dept_name,x.count
from (
select count(*) count,dept_id from emp a where sal > (select avg(sal) from emp b where a.dept_id=b.dept_id)
group by dept_id) X, dept Y
where x.dept_id=y.dept_id

(OR)

select dept_name ,(select count(*) from emp where emp.dept_id=dept.dept_id and
sal > ( select avg(sal) from emp where emp.dept_id=dept.dept_id) ) count
from dept where exists (select null from emp where emp.dept_id=dept.dept_id)

WITH ANALYTIC functions
=======================
Tom Kyte
September 09, 2010 - 8:42 pm UTC

anything that queries the dep table is wrong, it is not needed for this homework assignment at all.

Homework

Michel Cadot, September 10, 2010 - 2:40 am UTC


And this is not an original homework.
Here are 3 closed ones posted in 2007, 2008 and 2009 that can give some ideas :)
http://www.orafaq.com/forum/mv/msg/79948/230696/102589/#msg_230696
http://www.orafaq.com/forum/mv/msg/122031/334607/102589/#msg_334607
http://www.orafaq.com/forum/mv/msg/148250/415194/102589/#msg_415194

Regards
Michel

Analytical Function Without Analytical Function

A reader, September 10, 2010 - 1:20 pm UTC

Just as an exercise, how do I implement row_number() functionality without using an analytical function? I'd like to rewrite the following SQL without using row_number().

CREATE TABLE x (
   a DATE,
   b INTEGER,
   c INTEGER
);

INSERT INTO x VALUES (TRUNC(TO_DATE('2010-09-10', 'yyyy-mm-dd')), 1, 3);
INSERT INTO x VALUES (TRUNC(TO_DATE('2010-09-10', 'yyyy-mm-dd')), 1, 1);
INSERT INTO x VALUES (TRUNC(TO_DATE('2010-09-10', 'yyyy-mm-dd')), 1, 2);
INSERT INTO x VALUES (TRUNC(TO_DATE('2010-09-10', 'yyyy-mm-dd')), 2, 1);
INSERT INTO x VALUES (TRUNC(TO_DATE('2010-09-09', 'yyyy-mm-dd')), 2, 1);
INSERT INTO x VALUES (TRUNC(TO_DATE('2010-09-09', 'yyyy-mm-dd')), 2, 2);

SELECT a, b, c,
       row_number() OVER (PARTITION BY a ORDER BY b, c) rn
FROM   x;

A                            B          C         RN
------------------- ---------- ---------- ----------
2010-09-09 00:00:00          2          1          1
2010-09-09 00:00:00          2          2          2
2010-09-10 00:00:00          1          1          1
2010-09-10 00:00:00          1          2          2
2010-09-10 00:00:00          1          3          3
2010-09-10 00:00:00          2          1          4

Tom Kyte
September 13, 2010 - 8:02 am UTC

it would be painfully slow and ugly and wrong.

ops$tkyte%ORA11GR2> SELECT a, b, c,
  2         row_number() OVER (PARTITION BY a ORDER BY b, c) rn,
  3         (select count(*) from x t2 where t2.a = x.a and (t2.b < x.b or ( t2.b = x.b and t2.c <= x.c)) ) rn2
  4  FROM   x
  5  order by a,b,c;

A                  B          C         RN        RN2
--------- ---------- ---------- ---------- ----------
09-SEP-10          2          1          1          1
09-SEP-10          2          2          2          2
10-SEP-10          1          1          1          1
10-SEP-10          1          2          2          2
10-SEP-10          1          3          3          3
10-SEP-10          2          1          4          4

6 rows selected.


beware of duplicate b,c records - they would change the answer of course, you would have to incorporate rowid into the predicate to 'fix' that.
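
A sketch of that rowid fix (the tie-breaker gives duplicate a,b,c rows distinct, arbitrary numbers):

SELECT a, b, c,
       (select count(*)
          from x t2
         where t2.a = x.a
           and (t2.b < x.b
                or (t2.b = x.b and t2.c < x.c)
                or (t2.b = x.b and t2.c = x.c and t2.rowid <= x.rowid)) ) rn2
FROM   x
order by a, b, c;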

Without analytic

Michel Cadot, September 11, 2010 - 2:35 am UTC


SQL> select a, b, c, rownum - nb rn,
  2         row_number() over (partition by a order by b, c) rownumber
  3  from ( select x.a, x.b, x.c, y.nb
  4         from ( select a, b, c from x order by a, b, c ) x,
  5              ( select x.a, nvl(count(y.a),0) nb
  6                from ( select distinct a from x ) x, x y
  7                where y.a (+) < x.a
  8                group by x.a ) y
  9         where y.a = x.a
 10         order by x.a, x.b, x.c )
 11  /
A                            B          C         RN  ROWNUMBER
------------------- ---------- ---------- ---------- ----------
09/09/2010 00:00:00          2          1          1          1
09/09/2010 00:00:00          2          2          2          2
10/09/2010 00:00:00          1          1          1          1
10/09/2010 00:00:00          1          2          2          2
10/09/2010 00:00:00          1          3          3          3
10/09/2010 00:00:00          2          1          4          4

Regards
Michel

Without analytic

Michel Cadot, September 13, 2010 - 9:20 am UTC


Ah ah! I'm still using old SQL before scalar subquery was allowed in select list.

Regards
Michel

null

Sokrates, September 13, 2010 - 10:10 am UTC

Tom's solution fails for

INSERT INTO x VALUES (TRUNC(TO_DATE('2010-09-09', 'yyyy-mm-dd')), 3, null);

Michel's solution fails for
INSERT INTO x VALUES (null, 1,2);

-> use row_number()
Tom Kyte
September 13, 2010 - 2:37 pm UTC

indeed :)

it is a silly thing to do (not use the analytics) for sure regardless.

Without analytic

Michel Cadot, September 13, 2010 - 3:12 pm UTC


Both solutions work if you just add NVL at the correct places.

That was just an exercise (but should be done in some RDBMS that don't know analytics).

Regards
Michel

Ignore Nulls

Joe, September 15, 2010 - 1:17 pm UTC

Hi Tom,

I'm a little confused as to why "ignore nulls" does not seem to be working in the example below:

create table t
(
  phone   varchar2(10),
  isprimary varchar2(1)
);

insert into t values (null, 'Y');
insert into t values ('5551212','N');


select distinct first_value(phone ignore nulls) over(order by isprimary desc, phone) phone
  from t;

Result:

PHONE
<null>
5551212


The above query is trying to get a single phone # (taking the "Primary" one first assuming it is not null). So I thought the "ignore nulls" would ignore the NULL values. Am I incorrect in my understanding of the "ignore nulls"?

(This is only simple query to represent the problem. I know in this example the query could be rewritten to exclude the nulls but I can't do that with my real query).

Thanks,
Joe

Oracle 10.2

Tom Kyte
September 15, 2010 - 1:38 pm UTC

ops$tkyte%ORA11GR2> select t.*, first_value(phone ignore nulls) over(order by isprimary desc, phone) phone2 from t order by isprimary DESC, phone;

PHONE      I PHONE2
---------- - ----------
           Y
5551212    N 5551212



It has no choice - the "first value" after ignoring nulls is still UNKNOWN for that first row. The first value in the window defined by the current row and all preceding rows (the default window), after sorting by isprimary DESC, phone, is in fact NULL - it is not known, it doesn't exist.


Now, if you change the default window from being the current row and unbounded preceding - to be the entire set, you get a different answer:

ops$tkyte%ORA11GR2> select t.*, first_value(phone ignore nulls) over(order by isprimary desc, phone rows between unbounded preceding and unbounded following) phone2 from t order by isprimary DESC, phone;

PHONE      I PHONE2
---------- - ----------
           Y 5551212
5551212    N 5551212

Arg! Throw something out the window

Duke Ganote, September 15, 2010 - 7:28 pm UTC

This is like the LAST_VALUE issue, no?
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1822122724803#10044959630864

I suppose I'll just have to get more comfortable with non-default windows.

And remembering that ROW_NUMBER defaults to the whole "partition". There's no optional window clause for it.

Or just do the usual trick of sticking rows with NULL values at the end of the ordering:

select t.*
     , FIRST_VALUE(phone ignore nulls) over
         (order by isprimary desc
                 , phone
          rows between unbounded preceding and unbounded following
         ) AS phone1
     , LAST_VALUE(phone ignore nulls) over
         (order by isprimary desc
                 , phone
          rows between unbounded preceding and unbounded following
         ) AS phone2
     , FIRST_VALUE(phone) over
         (order by case when phone is not null then isprimary else CHR(0) end DESC
                 , case when phone is not null then phone else '9999999999' end
         ) AS phone3
     , ROW_NUMBER() over
         (order by case when phone is not null then isprimary else CHR(0) end DESC
                 , case when phone is not null then phone else '9999999999' end
         ) AS r#
     , case when phone is not null then isprimary else CHR(0) end as o1
     , case when phone is not null then phone else '9999999999' end as o2
  from t
/
PHONE      I PHONE1     PHONE2     PHONE3             R# O1 O2
---------- - ---------- ---------- ---------- ---------- -- ----------
           Y 5551212    5551212    5551212             2    9999999999
5551212    N 5551212    5551212    5551212             1 N  5551212


decisions, decisions...

month/year top and low

Omsai, September 25, 2010 - 12:30 am UTC

Hi
Tom
Referring to the emp ranking example above: if hiredate is included as well, we want to know the monthly top and lowest earning employee based on sum(sal).
We are able to get it with the query below, but when we remove the WHERE clause it only returns the MAX one instead of all rows.
SELECT max(TO_CHAR(hireDATE,'MM/RR'))SMONTH
, min(empno) KEEP (DENSE_RANK last ORDER BY sum(sal)desc) "least"
, MIN(sum(sal)) KEEP (DENSE_RANK FIRST ORDER BY TO_CHAR(hiredate,'MM/RR')) "least"
, MAX(empno) KEEP (DENSE_RANK first ORDER BY sum(sal)desc) "Best"
, MAX(sum(sal)) KEEP (DENSE_RANK FIRST ORDER BY TO_CHAR(hiredate,'MM/RR')) "Bbest"

FROM scott.emp
WHERE to_char(hiredate,'MM/RR')='02/81' ----

GROUP BY TO_CHAR(hiredate,'MM/RR'),empno
rgds
Omsai
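
A sketch of one reading of the requirement (per hire month, the employee with the lowest and the highest salary), grouping by the month alone with no WHERE clause, so every month comes back:

select to_char(hiredate,'MM/RR') smonth,
       min(empno) keep (dense_rank first order by sal) least_emp,
       min(sal) least_sal,
       max(empno) keep (dense_rank last order by sal) best_emp,
       max(sal) best_sal
  from scott.emp
 group by to_char(hiredate,'MM/RR');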

State Transition Constraints

Parthiban Nagarajan, October 15, 2010 - 8:30 am UTC

Hi Tom

Please refer the link - Article by Joe Celko
http://www.simple-talk.com/sql/t-sql-programming/state-transition-constraints/

The child table requires two columns (current-state and previous-state) ... That seems redundant when we have analytics, doesn't it?

But how can we achieve the same state-transition constraints with analytics, i.e. without storing the *previous-state* column?

Is it feasible? Or does my question look silly?
I just want to have your thoughts, Tom ...
Tom Kyte
October 15, 2010 - 9:08 am UTC

A constraint like that would be hard (impractical) to enforce *across* rows - but straightforward "in a row"

analytics could be used to synthesize the information upon retrieval if it was not stored - but it could not really be used in a constraint
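
A sketch of synthesizing the previous state at retrieval time (the table and column names here are hypothetical):

select entity_id, state_date, state,
       lag(state) over (partition by entity_id order by state_date) previous_state
  from state_history;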

Analytics - running total

Josphat, October 24, 2010 - 9:46 am UTC

Hi;

Your site is fascinating. The power of analytics is beyond imagination. Data warehousing is bound to be more appealing with analytics. I found immense stuff on running totals and applied it as below.

drop table test_analytics;

create table test_analytics
(empid number,ldate date,getval number(15,2),intrate number(5,2),retval number(15,2),factor number(5));

insert into test_analytics
values(1,'01-jan-2010',500,5,50,1);


insert into test_analytics
values(1,'01-apr-2010',0,5,50,1);

insert into test_analytics
values(1,'01-jun-2010',0,5,70,1);


insert into test_analytics
values(1,'01-jul-2010',0,5,70,1);



insert into test_analytics
values(2,'01-apr-2010',1000,8,0,1);

insert into test_analytics
values(2,'01-jun-2010',0,8,150,1);


insert into test_analytics
values(2,'01-jul-2010',0,8,100,1);


insert into test_analytics
values(3,'01-apr-2010',400,8,0,1);

insert into test_analytics
values(3,'01-jun-2010',0,8,150,1);


insert into test_analytics
values(3,'01-jul-2010',0,8,100,1);



insert into test_analytics
values(3,'01-aug-2010',0,8,150,1);

commit;
/
set lines 1000

select * from test_analytics;

--------------------------------------display the running balance per empid, offsetting settlements (retval)

The object is to display running totals with the starting getval as the opening balance per empid,
applying simple interest multiplied by the factor and then deducting retval to get the net balance.
The net balance for an empid becomes the opening balance for the next row.


I can display running total as follows

SELECT empid, getval,retval, ldate,intrate,
LAG(ldate,1,NULL)
OVER (PARTITION BY empid
ORDER BY ldate, empid) runldate,
sum((getval- retval))
OVER (PARTITION BY empid
ORDER BY empid,ldate) bals,
LEAD(ldate,1,NULL)
OVER (PARTITION BY empid
ORDER BY ldate) compdate
FROM test_analytics ORDER BY empid,ldate;



EMPID GETVAL RETVAL LDATE     INTRATE RUNLDATE   BALS COMPDATE
----- ------ ------ --------- ------- --------- ----- ---------
    1    500     50 01-JAN-10       5             450 01-APR-10
    1      0     50 01-APR-10       5 01-JAN-10   400 01-JUN-10
    1      0     70 01-JUN-10       5 01-APR-10   330 01-JUL-10
    1      0     70 01-JUL-10       5 01-JUN-10   260
    2   1000      0 01-APR-10       8            1000 01-JUN-10
    2      0    150 01-JUN-10       8 01-APR-10   850 01-JUL-10
    2      0    100 01-JUL-10       8 01-JUN-10   750
    3    400      0 01-APR-10       8             400 01-JUN-10
    3      0    150 01-JUN-10       8 01-APR-10   250 01-JUL-10
    3      0    100 01-JUL-10       8 01-JUN-10   150 01-AUG-10
    3      0    150 01-AUG-10       8 01-JUL-10     0

-------------------------------------------challenge: introduce the interest element
-------------------------------------------have it as part of the running total and as a separate column
-------------------------------------------this only calculates interest on the first row

After attempting to introduce interest, it is only calculated on the first row of each empid, as follows:

SELECT empid, getval,retval, ldate,intrate,
LAG(ldate,1,NULL)
OVER (PARTITION BY empid
ORDER BY ldate, empid) runldate,
sum((getval*(intrate+100)/100*nvl(factor,1)- retval))
OVER (PARTITION BY empid
ORDER BY empid,ldate) bals,
LEAD(ldate,1,NULL)
OVER (PARTITION BY empid
ORDER BY ldate) compdate
FROM test_analytics ORDER BY empid,ldate;

EMPID GETVAL RETVAL LDATE     INTRATE RUNLDATE    BALS COMPDATE
----- ------ ------ --------- ------- --------- ------ ---------
    1    500     50 01-JAN-10       5              475 01-APR-10
    1      0     50 01-APR-10       5 01-JAN-10    425 01-JUN-10
    1      0     70 01-JUN-10       5 01-APR-10    355 01-JUL-10
    1      0     70 01-JUL-10       5 01-JUN-10    285
    2   1000      0 01-APR-10       8             1080 01-JUN-10
    2      0    150 01-JUN-10       8 01-APR-10 1016.4 01-JUL-10
    2      0    100 01-JUL-10       8 01-JUN-10    830
    3    400      0 01-APR-10       8              432 01-JUN-10
    3      0    150 01-JUN-10       8 01-APR-10    282 01-JUL-10
    3      0    100 01-JUL-10       8 01-JUN-10    182 01-AUG-10
    3      0    150 01-AUG-10       8 01-JUL-10     32





-----The expected output which I am not getting should be as follows (Bals and interest columns of output);


EMPID GETVAL RETVAL LDATE     INTRATE RUNLDATE    BALS COMPDATE  INTEREST
----- ------ ------ --------- ------- --------- ------ --------- --------
    1    500     50 01-JAN-10       5              475 01-APR-10    25
    1      0     50 01-APR-10       5 01-JAN-10 448.75 01-JUN-10    23.75
    1      0     70 01-JUN-10       5 01-APR-10 401.15 01-JUL-10    22.40
    1      0     70 01-JUL-10       5 01-JUN-10 351.25              20.10
    2   1000      0 01-APR-10       8             1080 01-JUN-10    80
    2      0    150 01-JUN-10       8 01-APR-10 1016.4 01-JUL-10    86.4
    2      0    100 01-JUL-10       8 01-JUN-10  997.7              81.30
    3    400      0 01-APR-10       8              432 01-JUN-10    32
    3      0    150 01-JUN-10       8 01-APR-10  316.6 01-JUL-10    34.60
    3      0    100 01-JUL-10       8 01-JUN-10  241.9 01-AUG-10    25.30
    3      0    150 01-AUG-10       8 01-JUL-10 111.25              19.35

oracle environment is 9ir2

Thanks
Tom Kyte
October 25, 2010 - 5:46 pm UTC

instead of giving me a query that doesn't work (and hence I cannot reverse engineer - since it isn't logical - it doesn't work - and therefore does not demonstrate what you need)....

how about you give me pseudo code - DETAILED, commented pseudo code that describes how you need the data processed.



Running total including simple interest

Josphat, October 27, 2010 - 5:07 am UTC

Sorry; Thanks for your time.
I meant to apply the following;
SELECT empid, getval,retval, ldate,intrate,
LAG(ldate,1,NULL)
OVER (PARTITION BY empid
ORDER BY ldate, empid) runldate,
sum((getval*(intrate+100)/100*nvl(factor,1)- retval))
OVER (PARTITION BY empid
ORDER BY empid,ldate) bals,
LEAD(ldate,1,NULL)
OVER (PARTITION BY empid
ORDER BY ldate) compdate
FROM test_analytics ORDER BY empid,ldate; ---out

I expected interest to be calculated on all rows, but it happens only on the first row of each partitioning key, i.e. employee id (empid).
---getval is the starting balance
---retval is a payment
---intrate is the percentage of interest in each row
---factor is a multiplier
---
In summary ---from a transactions table
/*
Get the starting point/row for every empid, which is the earliest ldate (paydate);
pick getval, apply the simple interest calculation and subtract retval, i.e. getval*(intrate+100)/100*nvl(factor,1) - retval,
to get the opening balance for the next row;
repeat until the end for each empid, reflecting the balance (running net total) and the interest on every row.

How can a pseudo column be derived for the interest component?
End
---nb, the table could contain millions of records
*/

I just started on analytics after going through your great site. I have already used the concept for serialization and for differences between dates, and it is fast!
Thanks
The test data is as

drop table test_analytics;

create table test_analytics
(empid number,ldate date,getval number(15,2),intrate number(5,2),retval number(15,2),factor
number(5));

insert into test_analytics
values(1,'01-jan-2010',500,5,50,1);


insert into test_analytics
values(1,'01-apr-2010',0,5,50,1);

insert into test_analytics
values(1,'01-jun-2010',0,5,70,1);


insert into test_analytics
values(1,'01-jul-2010',0,5,70,1);



insert into test_analytics
values(2,'01-apr-2010',1000,8,0,1);

insert into test_analytics
values(2,'01-jun-2010',0,8,150,1);


insert into test_analytics
values(2,'01-jul-2010',0,8,100,1);


insert into test_analytics
values(3,'01-apr-2010',400,8,0,1);

insert into test_analytics
values(3,'01-jun-2010',0,8,150,1);


insert into test_analytics
values(3,'01-jul-2010',0,8,100,1);



insert into test_analytics
values(3,'01-aug-2010',0,8,150,1);

commit;
/





Tom Kyte
October 27, 2010 - 6:15 am UTC

Sorry; Thanks for your time.
I meant to apply the following;
SELECT empid, getval,retval, ldate,intrate,
LAG(ldate,1,NULL)
OVER (PARTITION BY empid
ORDER BY ldate, empid) runldate,
sum((getval*(intrate+100)/100*nvl(factor,1)- retval))
OVER (PARTITION BY empid
ORDER BY empid,ldate) bals,
LEAD(ldate,1,NULL)
OVER (PARTITION BY empid
ORDER BY ldate) compdate
FROM test_analytics ORDER BY empid,ldate; ---out


No, you didn't mean to apply the above, if you did - it would mean it WORKS, but it doesn't. All it does is compound confusion - as it is a bit of logic (it is a bit of logic, it "works", it is logic) that doesn't do what you want. You see - if you give that to someone that knows analytics and say "this is what I want to do" - you'll get nothing from them - since they'll just say "ok, well then run it - it is valid SQL"


One wonders why you have two tables in one here. The rows with getval <> 0 obviously belong to one table and the rows with getval=0 belong to another. Actually, just HALF of the row with getval <> 0 belongs in one table. You have a table of "we charged you this (getval <> 0) and should have another table of "you paid us this" (all of the retvals)


You won't be able to compute the "running total with an interest bit added" using analytics. You don't have a column to lag back on, you would need recursion for this (the last FUNCTION value would have to be available - but it isn't - just the column values are)

sorry - this isn't a problem for analytics.
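
(An aside that does not help the 9iR2 poster: from 10g onward the MODEL clause can express exactly this row-to-row recursion in plain SQL. A sketch against the same test_analytics table; it reproduces the pipelined-function results shown below:)

select empid, ldate, getval, intrate, retval, factor, bal
  from test_analytics
 model
   partition by (empid)
   dimension by (row_number() over (partition by empid order by ldate) rn)
   measures (ldate, getval, intrate, retval, factor, 0 bal)
   rules ( bal[any] order by rn =
             (nvl(bal[cv(rn)-1],0) + getval[cv(rn)])
               * (intrate[cv(rn)]+100)/100 * nvl(factor[cv(rn)],1)
             - retval[cv(rn)] )
 order by empid, ldate;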

Running total including simple interest

Josphat, October 27, 2010 - 8:35 am UTC

Thank you once more for your time.

I will continue to explore otherwise. I agree, it is a bit of tough stuff, not for analytics; I have to go back and forth!

Any glint of plsql block / function welcome.

Thanks
Tom Kyte
October 27, 2010 - 12:34 pm UTC

... Any glint of plsql block / function welcome.
...

can you not take your pseudo code and code it? It is rather straightforward from a purely procedural perspective.



ops$tkyte%ORA11GR2> create or replace type myScalarType as object
  2  (empid number,
  3   ldate date,
  4  getval number(15,2),
  5  intrate number(5,2),
  6  retval number(15,2),
  7  factor number(5),
  8  running_total number )
  9  /

Type created.

ops$tkyte%ORA11GR2> create or replace type myTableType as table of myScalarType
  2  /

Type created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create or replace function foo( p_data in sys_refcursor )
  2  return myTableType
  3  PIPELINED
  4  as
  5      type array is table of test_analytics%rowtype;
  6  
  7      l_last_empid    number;
  8      l_running_total number := 0;
  9      l_rec           test_analytics%rowtype;
 10      l_data          array;
 11      l_limit         number := 100;
 12  begin
 13      loop
 14          fetch p_data bulk collect into l_data limit l_limit;
 15          for i in 1 .. l_data.count
 16          loop
 17              if ( l_last_empid <> l_data(i).empid or l_last_empid is null)
 18              then
 19                  l_last_empid := l_data(i).empid;
 20                  l_running_total := l_data(i).getval *
 21                     (l_data(i).intrate+100)/100*nvl(l_data(i).factor,1)-l_data(i).retval;
 22              else
 23                  l_running_total := l_running_total *
 24                     (l_data(i).intrate+100)/100*nvl(l_data(i).factor,1)-l_data(i).retval;
 25              end if;
 26              pipe row( myScalarType( l_data(i).empid, l_data(i).ldate, l_data(i).getval,
 27                                      l_data(i).intrate, l_data(i).retval, l_data(i).factor,
 28                                      l_running_total ));
 29          end loop;
 30          exit when p_data%notfound;
 31      end loop;
 32      close p_data;
 33      return;
 34  end;
 35  /

Function created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from TABLE( foo( cursor(select * from test_analytics order by empid, ldate ) ) );

     EMPID LDATE         GETVAL    INTRATE     RETVAL     FACTOR RUNNING_TOTAL
---------- --------- ---------- ---------- ---------- ---------- -------------
         1 01-JAN-10        500          5         50          1           475
         1 01-APR-10          0          5         50          1        448.75
         1 01-JUN-10          0          5         70          1      401.1875
         1 01-JUL-10          0          5         70          1    351.246875
         2 01-APR-10       1000          8          0          1          1080
         2 01-JUN-10          0          8        150          1        1016.4
         2 01-JUL-10          0          8        100          1       997.712
         3 01-APR-10        400          8          0          1           432
         3 01-JUN-10          0          8        150          1        316.56
         3 01-JUL-10          0          8        100          1      241.8848
         3 01-AUG-10          0          8        150          1    111.235584

11 rows selected.


YOU figure out how YOU want to do the rounding and carry forward...

make sure to use ORDER BY - the result set needs to be ordered for this to work of course.

also make sure to handle NO_DATA_NEEDED

Sokrates, October 28, 2010 - 2:32 am UTC

NO_DATA_NEEDED must also be handled when you code like this:

11.2 > drop type myntype;

Type dropped.

11.2 > create or replace type mynType as object(n number);
  2  /

Type created.

11.2 > create or replace type mynTypeTab is table of mynType;
 2  /


Type created.

11.2 > create or replace view test_analytics as select cast(null as number) n from dual;

View created.

create or replace function foo( p_data in sys_refcursor )
return mynTypeTab PIPELINED
as
type array is table of test_analytics%rowtype;
l_data array;
begin
   loop
      fetch p_data bulk collect into l_data limit 100;
      for i in 1..l_data.count loop
         pipe row(mynType(l_data(i).n));
      end loop;
      exit when p_data%notfound;
   end loop;
   dbms_output.put_line('close p_data ...');
   close p_data;
   dbms_output.put_line('closed.');
return;
end foo;
/

Function created.

11.2 > select * from TABLE( foo( cursor(select level from dual connect by level<=3)));

  N
----------
  1
  2
  3

close p_data ...
closed.
11.2 > REM ok, closed
11.2 > select * from TABLE( foo( cursor(select level from dual connect by level<=3))) where rownum=1;

  N
----------
  1

11.2 > REM not closed, my fault


I SHOULD have coded:


create or replace function foo( p_data in sys_refcursor )
return mynTypeTab PIPELINED
as
type array is table of test_analytics%rowtype;
l_data array;
begin
   loop
      fetch p_data bulk collect into l_data limit 100;
      begin
         for i in 1..l_data.count loop
            pipe row(mynType(l_data(i).n));
         end loop;
      exception when no_data_needed then exit;
      end;
      exit when p_data%notfound;
   end loop;
   dbms_output.put_line('close p_data ...');
   close p_data;
   dbms_output.put_line('closed.');
return;
end foo;
/

Function created.

11.2 > select * from TABLE( foo( cursor(select level from dual connect by level<=3)));

  N
----------
  1
  2
  3

close p_data ...
closed.
11.2 > select * from TABLE( foo( cursor(select level from dual connect by level<=3))) where rownum=1;

  N
----------
  1

close p_data ...
closed.


Tom Kyte
October 28, 2010 - 12:04 pm UTC

correct.


Running total include simple interest

Josphat, October 28, 2010 - 7:00 am UTC

The going SEEMS illuminating.

I have tried it though on my Oracle 9i R2:

create or replace type myTableType as table of myScalarType

Warning: Type created with compilation errors.

SQL> show errors

Errors for TYPE MYTABLETYPE:
LINE/COL ERROR
-------- -----------------------------------------------------------------
0/0      PL/SQL: Compilation unit analysis terminated
1/30     PLS-00905: object TEST.MYSCALARTYPE is invalid

Seems I will have to sort out database object issues before I can do much. But I will post developments.
Thanks



Tom Kyte
October 28, 2010 - 12:12 pm UTC

you didn't show us the create of myscalartype :(


Analytics for Time Calculation

Snehasish Das, December 22, 2010 - 9:09 am UTC

Hi Tom,

I have a peculiar requirement for calculating the time an order spends in a queue. I know we can do it via a function, but I am also sure that we can do it with an Oracle analytic function - can you help me with this?

Below is the requirement.

ORDER_NUM  QUEUE_IN_TIMT        QUEUE_OUT_TIMT       QUEUE_NAME
---------- -------------------- -------------------- ----------
ORDER1     21/Dec/2010 08:00:00 21/Dec/2010 10:00:00 Q1
ORDER1     21/Dec/2010 09:00:00 21/Dec/2010 11:00:00 Q2
ORDER1     21/Dec/2010 10:00:00 21/Dec/2010 12:00:00 Q3
ORDER1     21/Dec/2010 10:00:00 21/Dec/2010 14:00:00 Q4
ORDER1     21/Dec/2010 17:00:00 21/Dec/2010 20:00:00 Q5

The total time calculated should be the time spent in the queues, and we are not adding up the overlapping time,
i.e. the time spent in Q1 and Q2 should be the difference between 08:00:00 and 11:00:00, i.e. 3 hours
and not 4 hours. Secondly, we would not add up the time when the order was not in any queue, i.e. the time from 14:00:00 to 17:00:00.
In this example the output for ORDER1 should be 9 hours. I tried one way but it was too clumsy;

below is the query:

select stg.*, (MAX_Q_OUT_TIME - MIN_Q_IN_TIME - TIME_SUBS) * 24 time_in_queue
from (
    select stg.*,
           sum(case when q_in_time > lead_time then q_in_time - lead_time else 0 end)
               over (partition by serial_num) TIME_SUBS
    from (
        select serial_num,
               max(q_out_time) over (partition by serial_num) MAX_Q_OUT_TIME,
               min(q_in_time) over (partition by serial_num) MIN_Q_IN_TIME,
               lead(Q_OUT_TIME) over (partition by serial_num order by Q_OUT_TIME desc) LEAD_TIME,
               Q_IN_TIME,
               Q_OUT_TIME,
               row_number() over (partition by serial_num order by Q_OUT_TIME desc) RN
        from test_queue
    ) stg
) stg
where rn = 1

Looking forward to your help.
Snehasish Das,
Technical Assurance.

Tom Kyte
December 22, 2010 - 2:43 pm UTC

I know of no efficient/effective way to do this with analytics. The problem is you cannot look back just "a row" or even "N rows" - you don't know how many overlaps you will potentially have, and they do not even have to be next to each other.


if however you

a) have a table create statement
b) inserts into the table
c) give more information about the dates - are they hourly as depicted - or can they go down to the second.

we might be able to suggest a different approach

I think this works...

Brendan, December 23, 2010 - 6:01 am UTC

Order by the in time
For the current row, get the latest out time among prior records
The interval between (derived start time = the later of this latest prior out time and the current in time) and (the earlier of the current out time and the derived start time) gives the contribution of the current record non-overlapping with earlier records

DROP TABLE order_queues;
CREATE TABLE order_queues (
order_num VARCHAR2(10) NOT NULL,
queue_name VARCHAR2(10) NOT NULL,
queue_in_timt DATE NOT NULL,
queue_out_timt DATE,
CONSTRAINT order_queues_pk PRIMARY KEY (order_num, queue_name, queue_in_timt)
);

INSERT INTO order_queues VALUES ('ORDER1', 'Q1', To_Date ('21/Dec/2010 08:00:00', 'dd/Mon/yyyy hh24:mi:ss'), To_Date ('21/Dec/2010 10:00:00', 'dd/Mon/yyyy hh24:mi:ss'));
INSERT INTO order_queues VALUES ('ORDER1', 'Q2', To_Date ('21/Dec/2010 09:00:00', 'dd/Mon/yyyy hh24:mi:ss'), To_Date ('21/Dec/2010 11:00:00', 'dd/Mon/yyyy hh24:mi:ss'));
INSERT INTO order_queues VALUES ('ORDER1', 'Q3', To_Date ('21/Dec/2010 10:00:00', 'dd/Mon/yyyy hh24:mi:ss'), To_Date ('21/Dec/2010 12:00:00', 'dd/Mon/yyyy hh24:mi:ss'));
INSERT INTO order_queues VALUES ('ORDER1', 'Q4', To_Date ('21/Dec/2010 10:00:00', 'dd/Mon/yyyy hh24:mi:ss'), To_Date ('21/Dec/2010 14:00:00', 'dd/Mon/yyyy hh24:mi:ss'));
INSERT INTO order_queues VALUES ('ORDER1', 'Q5', To_Date ('21/Dec/2010 17:00:00', 'dd/Mon/yyyy hh24:mi:ss'), To_Date ('21/Dec/2010 20:00:00', 'dd/Mon/yyyy hh24:mi:ss'));

SELECT
order_num,
To_Char (queue_in_timt, 'hh24:mi:ss'),
To_Char (queue_out_timt, 'hh24:mi:ss'),
(Least (queue_out_timt, Greatest (queue_out_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt), queue_out_timt)))
- Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt), queue_in_timt)) )* 24 non_overlap_time
FROM (
SELECT
order_num,
queue_in_timt,
queue_out_timt,
Max (queue_out_timt)
OVER (ORDER BY order_num, queue_in_timt, ROWID RANGE UNBOUNDED PRECEDING) max_out
FROM order_queues
)
ORDER BY 2;

SELECT order_num, Sum (non_overlap_time)
FROM (
SELECT
order_num,
To_Char (queue_in_timt, 'hh24:mi:ss'),
To_Char (queue_out_timt, 'hh24:mi:ss'),
(Least (queue_out_timt, Greatest (queue_out_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt), queue_out_timt)))
- Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt), queue_in_timt)) )* 24 non_overlap_time
FROM (
SELECT
order_num,
queue_in_timt,
queue_out_timt,
Max (queue_out_timt)
OVER (ORDER BY order_num, queue_in_timt, ROWID RANGE UNBOUNDED PRECEDING) max_out
FROM order_queues
)
)
GROUP BY order_num;

ORDER_NUM  TO_CHAR( TO_CHAR( NON_OVERLAP_TIME
---------- -------- -------- ----------------
ORDER1     08:00:00 10:00:00                2
ORDER1     09:00:00 11:00:00                1
ORDER1     10:00:00 12:00:00                1
ORDER1     10:00:00 14:00:00                2
ORDER1     17:00:00 20:00:00                3


ORDER_NUM  SUM(NON_OVERLAP_TIME)
---------- ---------------------
ORDER1                         9

Not quite right, maybe this time...

Brendan, December 23, 2010 - 9:39 am UTC

Ok, not quite right, sorry - I added another test record and changed the logic slightly...

Order by the in time
For the current row, get the latest out time among prior rows
Non-overlapping section of current row:
In time:
the later of the current in time and the latest prior out time
Out time:
the later of the derived in time and the current out time

INSERT INTO order_queues VALUES ('ORDER1', 'Q4', To_Date ('21/Dec/2010 17:00:00', 'dd/Mon/yyyy hh24:mi:ss'), To_Date ('21/Dec/2010 19:00:00', 'dd/Mon/yyyy hh24:mi:ss'));

SELECT order_num,
To_Char (queue_in_timt, 'hh24:mi:ss') in_time,
To_Char (queue_out_timt, 'hh24:mi:ss') out_time,
queue_name,
non_overlap_time,
Sum (non_overlap_time) OVER (PARTITION BY 1) total
FROM (
SELECT
order_num,
queue_in_timt,
queue_out_timt,
queue_name,
(Greatest (queue_out_timt, Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt), queue_in_timt)))
- Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt), queue_in_timt)) )* 24 non_overlap_time
FROM (
SELECT
order_num,
queue_in_timt,
queue_out_timt,
queue_name,
Max (queue_out_timt)
OVER (ORDER BY order_num, queue_in_timt, ROWID RANGE UNBOUNDED PRECEDING) max_out
FROM order_queues
)
)
ORDER BY 1, 2;

ORDER_NUM  IN_TIME  OUT_TIME QUEUE_NAME NON_OVERLAP_TIME      TOTAL
---------- -------- -------- ---------- ---------------- ----------
ORDER1     08:00:00 10:00:00 Q1                        2          9
ORDER1     09:00:00 11:00:00 Q2                        1          9
ORDER1     10:00:00 12:00:00 Q3                        1          9
ORDER1     10:00:00 14:00:00 Q4                        2          9
ORDER1     17:00:00 20:00:00 Q5                        3          9
ORDER1     17:00:00 19:00:00 Q4                        0          9

6 rows selected.
Tom Kyte
December 23, 2010 - 11:22 am UTC

given the arbitrary overlaps - this isn't going to work.

what if you have overlaps like this:


q1
    q2
       q3
          q4
    q2
             q5
       q3
             q5
                  q6
q1
                  q6
          q4


try some examples like that - where the overlaps are really whacky

I still think it works - here are two more examples

Brendan, December 23, 2010 - 12:15 pm UTC

Ok, I tried a couple more examples with as whacky overlapping as I could think of - and unless I'm missing something it worked on both - please suggest an example where you think it doesn't...

TRUNCATE TABLE order_queues;

DECLARE

PROCEDURE Set_Queue (p_q VARCHAR2, p_hour PLS_INTEGER, p_duration PLS_INTEGER) IS
BEGIN

INSERT INTO order_queues VALUES ('ORDER1', 'Q' || To_Char(p_q),
To_Date ('21/Dec/2010 ' || To_Char (p_hour) || ':00:00', 'dd/Mon/yyyy hh24:mi:ss'),
To_Date ('21/Dec/2010 ' || To_Char (p_hour + p_duration) || ':00:00', 'dd/Mon/yyyy hh24:mi:ss'));

END Set_Queue;

BEGIN
/*

A column represents an hour from 1-6, a row represents a queue from 1-5; an x in row i, column j means the order is in queue i at hour j
Example has 5 hours in queue

Example 1 - expecting 5 hours result (hour 3 is free)
---------
H 123456
Q
-
1 x
2 xx xxx
3 xx  xx
4     x
5  x   x

Set_Queue (1, 1, 1);
Set_Queue (2, 1, 2);
Set_Queue (2, 4, 3);
Set_Queue (3, 1, 2);
Set_Queue (3, 5, 2);
Set_Queue (4, 5, 1);
Set_Queue (5, 2, 1);
Set_Queue (5, 6, 1);

ORDER_NUM  IN_TIME  OUT_TIME QUEUE_NAME NON_OVERLAP_TIME      TOTAL
---------- -------- -------- ---------- ---------------- ----------
ORDER1     01:00:00 02:00:00 Q1                        1          5
ORDER1     01:00:00 03:00:00 Q2                        1          5
ORDER1     01:00:00 03:00:00 Q3                        0          5
ORDER1     02:00:00 03:00:00 Q5                        0          5
ORDER1     04:00:00 07:00:00 Q2                        3          5
ORDER1     05:00:00 07:00:00 Q3                        0          5
ORDER1     05:00:00 06:00:00 Q4                        0          5
ORDER1     06:00:00 07:00:00 Q5                        0          5

8 rows selected.

Example 2 - expecting 3 hours result (hours 2, 4 and 5 are free)
---------
H 123456
Q
-
1 x    x
2   x
3   x
4 x x  x
5 x    x

Set_Queue (1, 1, 1);
Set_Queue (1, 6, 1);
Set_Queue (2, 3, 1);
Set_Queue (3, 3, 1);
Set_Queue (4, 1, 1);
Set_Queue (4, 3, 1);
Set_Queue (4, 6, 1);
Set_Queue (5, 1, 1);
Set_Queue (5, 6, 1);

ORDER_NUM  IN_TIME  OUT_TIME QUEUE_NAME NON_OVERLAP_TIME      TOTAL
---------- -------- -------- ---------- ---------------- ----------
ORDER1     01:00:00 02:00:00 Q1                        1          3
ORDER1     01:00:00 02:00:00 Q4                        0          3
ORDER1     01:00:00 02:00:00 Q5                        0          3
ORDER1     03:00:00 04:00:00 Q2                        1          3
ORDER1     03:00:00 04:00:00 Q3                        0          3
ORDER1     03:00:00 04:00:00 Q4                        0          3
ORDER1     06:00:00 07:00:00 Q1                        1          3
ORDER1     06:00:00 07:00:00 Q4                        0          3
ORDER1     06:00:00 07:00:00 Q5                        0          3

9 rows selected.

*/
Set_Queue (1, 1, 1);
Set_Queue (1, 6, 1);
Set_Queue (2, 3, 1);
Set_Queue (3, 3, 1);
Set_Queue (4, 1, 1);
Set_Queue (4, 3, 1);
Set_Queue (4, 6, 1);
Set_Queue (5, 1, 1);
Set_Queue (5, 6, 1);

END;
/
SELECT order_num,
To_Char (queue_in_timt, 'hh24:mi:ss') in_time,
To_Char (queue_out_timt, 'hh24:mi:ss') out_time,
queue_name,
non_overlap_time,
Sum (non_overlap_time) OVER (PARTITION BY order_num) total
FROM (
SELECT
order_num,
queue_in_timt,
queue_out_timt,
queue_name,
(Greatest (queue_out_timt, Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt), queue_in_timt)))
- Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt), queue_in_timt)) )* 24 non_overlap_time
FROM (
SELECT
order_num,
queue_in_timt,
queue_out_timt,
queue_name,
Max (queue_out_timt)
OVER (ORDER BY order_num, queue_in_timt, ROWID RANGE UNBOUNDED PRECEDING) max_out
FROM order_queues
)
)
ORDER BY 1, 2;

Tom Kyte
December 24, 2010 - 12:01 pm UTC

you might be onto something - but a partition by order num is going to have to be part of the solution:


ops$tkyte%ORA11GR2> CREATE TABLE order_queues (
  2      order_num        VARCHAR2(10)    NOT NULL,
  3      queue_name        VARCHAR2(10)    NOT NULL,
  4      queue_in_timt        DATE        NOT NULL,
  5      queue_out_timt        DATE,
  6      CONSTRAINT order_queues_pk PRIMARY KEY (order_num, queue_name, queue_in_timt)
  7  );

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into order_queues values ( 2, 'a',
  2  to_date( '01-jan-2010 01:00:00', 'dd-mon-yyyy hh24:mi:ss' ),
  3  to_date( '01-jan-2010 01:00:10', 'dd-mon-yyyy hh24:mi:ss') );

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into order_queues values ( 1, 'b',
  2  to_date( '01-jan-2010 01:00:02', 'dd-mon-yyyy hh24:mi:ss' ),
  3  to_date( '01-jan-2010 01:00:53', 'dd-mon-yyyy hh24:mi:ss') );

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> SELECT order_num,
  2      To_Char (queue_in_timt, 'hh24:mi:ss')    in_time,
  3      To_Char (queue_out_timt, 'hh24:mi:ss')    out_time,
  4      queue_name,
  5      non_overlap_time,
  6      Sum (non_overlap_time) OVER (PARTITION BY order_num) total,
  7      Sum (non_overlap_time) OVER (PARTITION BY order_num)*60*60 sec_total
  8    FROM (
  9  SELECT
 10      order_num,
 11      queue_in_timt,
 12      queue_out_timt,
 13      queue_name,
 14      (Greatest (queue_out_timt, Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (ORDER BY
 15  order_num, queue_in_timt), queue_in_timt)))
 16      - Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (ORDER BY order_num, queue_in_timt),
 17  queue_in_timt)) )* 24 non_overlap_time
 18    FROM (
 19  SELECT
 20      order_num,
 21      queue_in_timt,
 22      queue_out_timt,
 23      queue_name,
 24      Max (queue_out_timt)
 25              OVER (ORDER BY order_num, queue_in_timt, ROWID RANGE UNBOUNDED PRECEDING) max_out
 26    FROM order_queues
 27  )
 28  )
 29  ORDER BY 1, 2;

ORDER_NUM  IN_TIME  OUT_TIME QUEUE_NAME NON_OVERLAP_TIME      TOTAL  SEC_TOTAL
---------- -------- -------- ---------- ---------------- ---------- ----------
1          01:00:02 01:00:53 b                .014166667 .014166667         51
2          01:00:00 01:00:10 a                         0          0          0




order number one - 51 seconds
order number two - 10 seconds

should be the answer.

Yes, added partition by...

Brendan, December 26, 2010 - 5:40 am UTC

TRUNCATE TABLE order_queues;

DECLARE

g_order_num VARCHAR2(10);
PROCEDURE Set_Queue (p_q VARCHAR2, p_hour PLS_INTEGER, p_duration PLS_INTEGER) IS
BEGIN

INSERT INTO order_queues VALUES (g_order_num, 'Q' || To_Char(p_q),
To_Date ('21/Dec/2010 ' || To_Char (p_hour) || ':00:00', 'dd/Mon/yyyy hh24:mi:ss'),
To_Date ('21/Dec/2010 ' || To_Char (p_hour + p_duration) || ':00:00', 'dd/Mon/yyyy hh24:mi:ss'));

END Set_Queue;

BEGIN
/*

A column represents an hour from 1-9, a row represents a queue from 1-5; an x in row i, column j means the order is in queue i at hour j
Example has 5 hours in queue

Example 1 - expecting 5 hours ORDER1, 7 hours ORDER2
---------
H 123456789
Q
-
ORDER1

1 x
2 xx xxx
3 xx  xx
4     x
5  x   x

ORDER2 - as ORDER1 but with Q4 hours 8 and 9 and Q5 hour 8 added, so expecting 7 hours

1 x
2 xx xxx
3 xx  xx
4     x  xx
5  x   x x

*/
g_order_num := 'ORDER1';
Set_Queue (1, 1, 1);
Set_Queue (2, 1, 2);
Set_Queue (2, 4, 3);
Set_Queue (3, 1, 2);
Set_Queue (3, 5, 2);
Set_Queue (4, 5, 1);
Set_Queue (5, 2, 1);
Set_Queue (5, 6, 1);
--
-- First, insert same records for ORDER2...
--
g_order_num := 'ORDER2';
Set_Queue (1, 1, 1);
Set_Queue (2, 1, 2);
Set_Queue (2, 4, 3);
Set_Queue (3, 1, 2);
Set_Queue (3, 5, 2);
Set_Queue (4, 5, 1);
Set_Queue (5, 2, 1);
Set_Queue (5, 6, 1);
--
-- Then add two more...
--
Set_Queue (4, 8, 2);
Set_Queue (5, 8, 1);

END;
/
SET PAGES 1000
BREAK ON order_num
SELECT order_num,
To_Char (queue_in_timt, 'hh24:mi:ss') in_time,
To_Char (queue_out_timt, 'hh24:mi:ss') out_time,
queue_name,
non_overlap_time,
Sum (non_overlap_time) OVER (PARTITION BY order_num) total
FROM (
SELECT
order_num,
queue_in_timt,
queue_out_timt,
queue_name,
(Greatest (queue_out_timt, Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (PARTITION BY order_num ORDER BY order_num, queue_in_timt, queue_name), queue_in_timt)))
- Greatest (queue_in_timt, Nvl (Lag (max_out) OVER (PARTITION BY order_num ORDER BY order_num, queue_in_timt, queue_name), queue_in_timt)) )* 24 non_overlap_time
FROM (
SELECT
order_num,
queue_in_timt,
queue_out_timt,
queue_name,
Max (queue_out_timt)
OVER (PARTITION BY order_num ORDER BY order_num, queue_in_timt, queue_name RANGE UNBOUNDED PRECEDING) max_out
FROM order_queues
)
)
ORDER BY 1, 2, 4;

Output:

Table truncated.


PL/SQL procedure successfully completed.


ORDER_NUM  IN_TIME  OUT_TIME QUEUE_NAME NON_OVERLAP_TIME      TOTAL
---------- -------- -------- ---------- ---------------- ----------
ORDER1     01:00:00 02:00:00 Q1                        1          5
           01:00:00 03:00:00 Q2                        1          5
           01:00:00 03:00:00 Q3                        0          5
           02:00:00 03:00:00 Q5                        0          5
           04:00:00 07:00:00 Q2                        3          5
           05:00:00 07:00:00 Q3                        0          5
           05:00:00 06:00:00 Q4                        0          5
           06:00:00 07:00:00 Q5                        0          5
ORDER2     01:00:00 02:00:00 Q1                        1          7
           01:00:00 03:00:00 Q2                        1          7
           01:00:00 03:00:00 Q3                        0          7
           02:00:00 03:00:00 Q5                        0          7
           04:00:00 07:00:00 Q2                        3          7
           05:00:00 07:00:00 Q3                        0          7
           05:00:00 06:00:00 Q4                        0          7
           06:00:00 07:00:00 Q5                        0          7
           08:00:00 10:00:00 Q4                        2          7
           08:00:00 09:00:00 Q5                        0          7

18 rows selected.
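
For comparison, the merged-interval total can also be computed with the classic "start of group" (islands) technique - a sketch against the same order_queues table: flag each row whose in time does not fall inside any earlier interval, turn a running count of those flags into a group number, then measure each merged island exactly once:

SELECT order_num, SUM (island_hours) total_hours
  FROM (
    SELECT order_num, grp,
           (MAX (queue_out_timt) - MIN (queue_in_timt)) * 24 island_hours
      FROM (
        SELECT order_num, queue_in_timt, queue_out_timt,
               SUM (new_grp) OVER (PARTITION BY order_num
                                       ORDER BY queue_in_timt, queue_out_timt) grp
          FROM (
            SELECT oq.*,
                   -- 1 = this row starts a new island, 0 = it overlaps a prior row
                   CASE WHEN queue_in_timt <= MAX (queue_out_timt)
                             OVER (PARTITION BY order_num
                                       ORDER BY queue_in_timt
                                       ROWS BETWEEN UNBOUNDED PRECEDING
                                                AND 1 PRECEDING)
                        THEN 0 ELSE 1 END new_grp
              FROM order_queues oq
          )
      )
     GROUP BY order_num, grp
  )
 GROUP BY order_num;

On the ORDER1 data above this also returns 9 hours; the first row of each partition gets new_grp = 1 because the windowed MAX over zero preceding rows is NULL.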

A reader, February 08, 2011 - 7:15 pm UTC

Hi Tom ,

Can you please help me with how I can achieve this in one SQL statement?


My ad hoc report mainly involves the following columns:

driver#,entry_date,milage, veh_distance, qty, price.

The requirement is, for the same driver#, to get the rows as follows:

1) I have to choose the rows that has these values are greater than 0 : qty and price and milage

2) previous row details , if (current mileage - previous row mileage) > 1000 and (current.qty- previous.qty) > 10%

3) use the #2 as current record and get it is previous record that exactly like #2, means previous.mileage- priorrow of previous.mileage > 1000





Sample data


driver#  date        milage  veh_distance  qty  price

1234     01/01/2011       0             0   10     10
1234     01/02/2011    2000           100   20     10
1234     01/03/2011    3000           200   10     20
1234     01/04/2011    3000           300    0      0
1234     01/05/2011    5000           400   40     40

5555     01/01/2011    1000           100   10     10
1234     01/02/2011    1000           100   10     10
1234     01/03/2011    3000           200   60     60
1234     01/04/2011    3000           300   80    100
1234     01/05/2011    9000           400   80     80


Output should be

driver# entry_date previous_entry_date, current_mileage, pre_mileage, previous_pre_mileage, curr_dist, prev_dist, previous_pre_dist
1234 01/02/2011 01/03/2011 2000 3000 5000 100 200 400
5555 01/01/2011 01/03/2011 1000 3000 9000 100 200 400

Would this be possible to get in a single sql statement, or do we have to use a holding table to get the results?


Thank you
Tom Kyte
February 10, 2011 - 4:03 pm UTC

no create
no inserts
no look

how can I write a query if I have nothing to work with???

Script for You !

Rajeshwaran, Jeyabal, February 16, 2011 - 10:04 am UTC

Tom:

Here is the script for you!

drop table t purge;

create table t(
  driver    number,
  dt        date,
  milage    number,
  veh_dist  number,
  qty       number,
  price     number
);

insert into t values (1234,to_date('01/01/2011','mm/dd/yyyy'),0,0,10,10);
insert into t values (1234,to_date('01/02/2011','mm/dd/yyyy'),200,100,20,10);
insert into t values (1234,to_date('01/03/2011','mm/dd/yyyy'),3000,200,10,20);
insert into t values (1234,to_date('01/04/2011','mm/dd/yyyy'),3000,300,0,0);
insert into t values (1234,to_date('01/05/2011','mm/dd/yyyy'),5000,400,40,40);

insert into t values (5555,to_date('01/01/2011','mm/dd/yyyy'),1000,100,10,10);
insert into t values (1234,to_date('01/02/2011','mm/dd/yyyy'),1000,100,10,10);
insert into t values (1234,to_date('01/03/2011','mm/dd/yyyy'),3000,200,60,60);
insert into t values (1234,to_date('01/04/2011','mm/dd/yyyy'),3000,300,80,100);
insert into t values (1234,to_date('01/05/2011','mm/dd/yyyy'),9000,400,80,80);

commit;

Tom Kyte
February 16, 2011 - 11:38 am UTC

your "requirements" are ambiguous/wrong


1) I have to choose the rows that has these values are greater than 0 : qty and price and milage

2) previous row details , if (current mileage - previous row mileage) > 1000 and (current.qty-
previous.qty) > 10%

3) use the #2 as current record and get it is previous record that exactly like #2, means
previous.mileage- priorrow of previous.mileage > 1000


#1 - when you choose those rows and then in #2 ask for the previous row - does the previous row have the same constraint - that those three attributes must be > 0 - or should the previous row come from the entire set?

#2 current.qty - previous.qty > 10%?????? how can the left hand side be compared to a percent? Do you mean that the current qty differs from the previous qty by more than 10%, or was the % a typo?



I'll assume that for #1, we apply the predicate and that dictates what a current row and previous row are selected from.

I'll assume that for #2, you mean the % was a typo.


Sigh, I'll also have to assume you really mean 5555 for the last four records as well.

You can make corrections TO THE SQL - as you'll see the technique clearly.


but if I do that - there can be NO STEP THREE - since by the time we do #1 and #2 - we are left with:


ops$tkyte%ORA11GR2> select *
  2    from (
  3  select driver, dt, milage, veh_dist, qty, price,
  4         lag(milage) over (partition by driver order by dt) last_milage,
  5         lag(qty) over (partition by driver order by dt) last_qty
  6    from t
  7   where qty > 0 and price > 0 and milage > 0
  8         )
  9   where milage-last_milage > 1000
 10     and qty-last_qty > 10
 11  /

    DRIVER DT            MILAGE   VEH_DIST        QTY      PRICE LAST_MILAGE   LAST_QTY
---------- --------- ---------- ---------- ---------- ---------- ----------- ----------
      1234 05-JAN-11       5000        400         40         40        3000         10
      5555 03-JAN-11       3000        200         60         60        1000         10




and there is no longer any "previous" records for anything.

But now you see how, using inline views and selecting from a select - that you do not need/want/desire a temporary scratch table.

Is there a better way for me?

Jason, March 24, 2011 - 4:50 am UTC

Tom,

With the following table I would like to select only the latest id, version and revision of each product. What complicates matters is that you can have a higher revision of a lower version, e.g. version 0, revision 15 comes before version 1, revision 4.

CREATE TABLE SER_PRODUCT
(
PRO_ID NUMBER(20) NOT NULL,
PRO_VERSION NUMBER(3) NOT NULL,
PRO_REVISION NUMBER(3) NOT NULL
);

Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(1, 1, 1);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(1, 1, 2);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(1, 1, 3);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(2, 1, 1);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 1);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 2);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 3);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 4);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 5);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 6);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 7);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 8);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 9);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 10);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 11);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 12);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 13);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 14);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(101, 0, 15);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(122, 0, 1);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(122, 0, 2);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(122, 0, 3);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(122, 0, 4);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(122, 1, 0);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(142, 0, 1);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(142, 0, 2);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(142, 0, 3);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(142, 0, 4);
Insert into SER_PRODUCT_EXPORT
(PRO_ID, PRO_VERSION, PRO_REVISION)
Values
(142, 1, 0);


The output I would like is

  MAX(PRO)   MAX(VER)   MAX(REV)
---------- ---------- ----------
         1          1          3
         2          1          1
       101          0         15
       122          1          0
       142          1          0

I have a piece of SQL that achieves what I need but to me it does appear rather cumbersome - is there a better way?

Here is the sql.

SELECT MAX (pro), MAX (ver), MAX (rev)
  FROM (SELECT pro,
               ver,
               LAST_VALUE (pro_revision)
                  OVER (PARTITION BY ver ORDER BY pro_id) rev
          FROM (SELECT pro_id,
                       pro_version,
                       pro_revision,
                       pro,
                       LAST_VALUE (pro_version)
                          OVER (PARTITION BY pro ORDER BY pro_id) ver
                  FROM (SELECT pro_id,
                               pro_version,
                               pro_revision,
                               LAST_VALUE (pro_id)
                                  OVER (PARTITION BY pro_id ORDER BY pro_id) pro
                          FROM SER_PRODUCT)
                 WHERE pro_id = pro)
         WHERE pro_id = pro AND ver = pro_version)
 GROUP BY (pro, ver, rev)
 ORDER BY 1, 2, 3

Thanks

Jason


Tom Kyte
March 24, 2011 - 8:21 am UTC

tell me how to order the data (in the form of a specification, not a sql statement - don't make me reverse engineer something when you can just TELL ME) so that 15 comes before 1.4


this makes no sense:

LAST_VALUE (pro_id)
OVER (PARTITION BY pro_id ORDER BY pro_id)
pro

the last value of pro_id is the same as the first value or in fact the nth value - if you are partitioning by pro_id and ordering by it??????


I won't reverse engineer that query because I'm pretty darn sure it is wrong - since none of the things you order by are UNIQUE - the order bys are non-deterministic - meaning you could get a different answer from the same data if it was laid out on disk a little differently.

This is how I'd do it...

Brendan, March 24, 2011 - 11:48 am UTC

SELECT -- Get vsn-max per prd, rev-max per prd/vsn for each (prd/vsn-max/rev-max) [Filter out the non-max revision records]
ilv_1.pro_id, ilv_1.vsn_max, ilv_1.rev_max
FROM
( -- Get vsn-max per prd, rev-max per prd/vsn for each (prd/vsn-max/rev) [Filter out the non-max version records]
SELECT ilv.pro_id, ilv.vsn_max, ilv.rev_max, ilv.pro_revision
FROM
( -- Get vsn-max per prd, rev-max per prd/vsn for each (prd/vsn/rev)
SELECT prd.pro_id, prd.pro_version, prd.pro_revision,
Max(prd.pro_version) OVER (PARTITION BY prd.pro_id) vsn_max,
Max(prd.pro_revision) OVER (PARTITION BY prd.pro_id, prd.pro_version) rev_max
FROM ser_product prd
) ilv
WHERE ilv.pro_version = ilv.vsn_max
) ilv_1
WHERE ilv_1.pro_revision = ilv_1.rev_max
ORDER BY 1
Tom Kyte
March 24, 2011 - 12:08 pm UTC

how do you know you'd do it that way - we don't really know what "it" is yet :)

The original poster's query doesn't tell us how to get it - it only shows us a non-deterministic query that accidentally seems to work on their existing data...

if all they want is to break the data up by product and then sort by version, and within version by revision - and keep the last one - that would simply be:


ops$tkyte%ORA11GR2> select *
  2    from (select t.*, row_number() over (partition by pro_id order by pro_version DESC, pro_revision DESC) rn from ser_product_export t)
  3    where rn = 1;

    PRO_ID PRO_VERSION PRO_REVISION         RN
---------- ----------- ------------ ----------
         1           1            3          1
         2           1            1          1
       101           0           15          1
       122           1            0          1
       142           1            0          1




that gets the answer they asked for - still not 100% sure if it is the answer they mean to get ;)

Yes, that's a simpler solution...

Brendan, March 24, 2011 - 1:52 pm UTC

..and here, just for the hell of it (not saying it's better), is a non-analytic solution:
 SELECT pro_id, Max (pro_version || '.' || pro_revision) KEEP (DENSE_RANK LAST ORDER BY pro_version, pro_revision) rev
  FROM ser_product GROUP BY pro_id

I had never used KEEP before (hope I got it right!), so I learned something today, which is nice...
Tom Kyte
March 24, 2011 - 3:19 pm UTC

yup, that is another way..

assuming we have inferred correctly what the original poster meant.
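
If the version and revision are wanted back as separate numeric columns rather than a concatenated string, the same KEEP trick can simply be applied twice - a sketch against ser_product_export:

select pro_id,
       max(pro_version)  keep (dense_rank last order by pro_version, pro_revision) ver,
       max(pro_revision) keep (dense_rank last order by pro_version, pro_revision) rev
  from ser_product_export
 group by pro_id
 order by pro_id;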

brain droppings

Art V, March 24, 2011 - 8:24 pm UTC

SQL> select pro_id
      , floor(max(pro_version*1000+pro_revision) / 1000) pro_version
      , mod(max(pro_version*1000+pro_revision) , 1000 ) pro_revision
  FROM ser_product_export GROUP BY pro_id
;
  2    3    4    5  
    PRO_ID PRO_VERSION PRO_REVISION
---------- ----------- ------------
         1           1            3
         2           1            1
       101           0           15
       142           1            0
       122           1            0

SQL> SELECT pro_id, max (lpad(pro_version,3,'0') || '.' || lpad(pro_revision,3,'0')) rev
  FROM ser_product_export GROUP BY pro_id
;
  2    3  
    PRO_ID REV
---------- ------------------------------
         1 001.003
         2 001.001
       101 000.015
       142 001.000
       122 001.000

SQL> 

Thanks

Jason, March 28, 2011 - 3:18 am UTC

Yes - that is exactly what I meant and was asking for (albeit not clearly).

In future, I'll remember to detail a requirement rather than a guess...

First_Value

Ananth, April 07, 2011 - 3:39 pm UTC

Hi Tom,

I see some strange results in the values of the columns MGRNAME and MGRNAME2; ideally I want them to be the same.

The table and data are from the HR schema's EMPLOYEES table.

SELECT
EMPLOYEE_ID,
FIRST_NAME || ' ' || LAST_NAME NAME,
MANAGER_ID,
CONNECT_BY_ROOT EMPLOYEE_ID RTEMP,
FIRST_VALUE(FIRST_NAME || ' ' || LAST_NAME) OVER (PARTITION BY CONNECT_BY_ROOT EMPLOYEE_ID ORDER BY LEVEL) MGRNAME,
FIRST_VALUE(FIRST_NAME || ' ' || LAST_NAME) OVER (PARTITION BY CONNECT_BY_ROOT EMPLOYEE_ID ORDER BY LEVEL) MGRNAME2,
DEPARTMENT_ID
FROM EMPLOYEES
START WITH DEPARTMENT_ID = 90
CONNECT BY PRIOR EMPLOYEE_ID = MANAGER_ID

Actually I wrote

FIRST_VALUE(EMPLOYEE_ID) OVER (PARTITION BY CONNECT_BY_ROOT EMPLOYEE_ID ORDER BY LEVEL) MGRID,
FIRST_VALUE(FIRST_NAME || ' ' || LAST_NAME) OVER (PARTITION BY CONNECT_BY_ROOT EMPLOYEE_ID ORDER BY LEVEL) MGNAME,

but I saw MGRID remained the same for all the rows.

Can you please explain to me how this is happening?

Tom Kyte
April 12, 2011 - 1:32 pm UTC

I see different RTEMP values and different names. I cannot see your screen, so I don't know what you are seeing nor what you expected to see.

I got rtemp values of 100, 101, and 102.

Ananth, April 12, 2011 - 4:04 pm UTC

Hi Tom,

Below is the result set that I got (fetched the first 16 rows from PL/SQL Developer):

   EMPLOYEE_ID NAME              MANAGER_ID RTEMP MGRNAME     MGRNAME2      DEPARTMENT_ID
   ----------- ----------------- ---------- ----- ----------- ------------- -------------
 1         102 Lex De Haan              100   100 Lex De Haan Lex De Haan              90
 2         101 Neena Kochhar            100   100 Lex De Haan Neena Kochhar            90
 3         100 Steven King                    100 Lex De Haan Steven King              90
 4         121 Adam Fripp               100   100 Lex De Haan Steven King              50
 5         147 Alberto Errazuriz        100   100 Lex De Haan Steven King              80
 6         103 Alexander Hunold         102   100 Lex De Haan Lex De Haan              60
 7         114 Den Raphaely             100   100 Lex De Haan Steven King              30
 8         149 Eleni Zlotkey            100   100 Lex De Haan Steven King              80
 9         148 Gerald Cambrault         100   100 Lex De Haan Steven King              80
10         204 Hermann Baer             101   100 Lex De Haan Neena Kochhar            70
11         200 Jennifer Whalen          101   100 Lex De Haan Neena Kochhar            10
12         145 John Russell             100   100 Lex De Haan Steven King              80
13         146 Karen Partners           100   100 Lex De Haan Steven King              80
14         124 Kevin Mourgos            100   100 Lex De Haan Steven King              50
15         102 Lex De Haan              100   100 Lex De Haan Steven King              90
16         120 Matthew Weiss            100   100 Lex De Haan Steven King              50

Sorry for the indentation - let me know if you want the results presented in some other way.

Regards
Ananth

Tom Kyte
April 13, 2011 - 9:55 am UTC

so, why do you think that is wrong? Get the entire result set and you'll see different values.

You would really have to explain WHAT you think you should be seeing.

Use the CODE button to put CODE examples in a set of tags that will use a fixed font.

Ananth, April 13, 2011 - 11:11 am UTC

Hi Tom,

for the below query,

SELECT 
  EMPLOYEE_ID, 
  FIRST_NAME || ' ' || LAST_NAME NAME, 
  MANAGER_ID, 
  FIRST_VALUE(FIRST_NAME || ' ' || LAST_NAME) 
     OVER (PARTITION BY CONNECT_BY_ROOT EMPLOYEE_ID ORDER BY LEVEL) MGRNAME,
  FIRST_VALUE(FIRST_NAME || ' ' || LAST_NAME) 
    OVER (PARTITION BY CONNECT_BY_ROOT EMPLOYEE_ID ORDER BY LEVEL) MGRNAME2,  
  DEPARTMENT_ID
FROM EMPLOYEES
START WITH DEPARTMENT_ID = 90
CONNECT BY PRIOR EMPLOYEE_ID = MANAGER_ID


If you observe the columns MGRNAME and MGRNAME2, they give different results even though the selects for both columns are the same.

I am unable to analyse why the result shows different values.

EMPLOYEE_ID NAME              MANAGER_ID MGRNAME      MGRNAME2       DEPARTMENT_ID
----------- ---------------------------- ------------ ----------------------------
        102 Lex De Haan              100 Lex De Haan  Lex De Haan               90
        101 Neena Kochhar            100 Lex De Haan  Neena Kochhar             90
        100 Steven King                  Lex De Haan  Steven King               90
        121 Adam Fripp               100 Lex De Haan  Steven King               50
        147 Alberto Errazuriz        100 Lex De Haan  Steven King               80
        103 Alexander Hunold         102 Lex De Haan  Lex De Haan               60
        114 Den Raphaely             100 Lex De Haan  Steven King               30
        149 Eleni Zlotkey            100 Lex De Haan  Steven King               80
        148 Gerald Cambrault         100 Lex De Haan  Steven King               80
        204 Hermann Baer             101 Lex De Haan  Neena Kochhar             70
        200 Jennifer Whalen          101 Lex De Haan  Neena Kochhar             10
        145 John Russell             100 Lex De Haan  Steven King               80
        146 Karen Partners           100 Lex De Haan  Steven King               80
        124 Kevin Mourgos            100 Lex De Haan  Steven King               50
        102 Lex De Haan              100 Lex De Haan  Steven King               90
        120 Matthew Weiss            100 Lex De Haan  Steven King               50
        201 Michael Hartstein        100 Lex De Haan  Steven King               20
        108 Nancy Greenberg          101 Lex De Haan  Neena Kochhar            100
        101 Neena Kochhar            100 Lex De Haan  Steven King               90
        122 Payam Kaufling           100 Lex De Haan  Steven King

Tom Kyte
April 13, 2011 - 11:21 am UTC

what version, I'm in 11.2.0.2 and cannot reproduce.

(but in any case, why aren't you just using connect_by_root itself, first_value and analytics are the wrong way to get your data...)

Ananth, April 13, 2011 - 12:50 pm UTC

Hi Tom,

I am using Oracle 10g Express Edition for my development. At work we are using Oracle 10.2.0.4.

Regards
Ananth
Tom Kyte
April 13, 2011 - 1:53 pm UTC

might be a product issue in the old release. Does it reproduce in your patched up database - 10.2.0.4?

Ananth, April 13, 2011 - 2:08 pm UTC

Hi Tom,

Can you compare and tell me which is the more preferred way to achieve this:

1. Self Join
2. Connect By & Analytic on the output of Connect By.

Which one do you suggest and why?
I am not sure how Oracle evaluates CONNECT BY internally.

Regards
Ananth
Tom Kyte
April 13, 2011 - 2:51 pm UTC

I would use SQL - that would be the best way to achieve a self join and to use connect by and analytics.

Other than that - I don't have anything else to say - the question is far far too ambiguous.

I have no idea how to compare things that do entirely different things and that are not interchangeable.


Ananth, April 13, 2011 - 2:50 pm UTC

In other words,

let's take the HR schema.

To get EMPLOYEE_ID, MANAGER_ID, MANAGER EMAIL we can use:

Self Join:
---------
SELECT E.EMPLOYEE_ID, M.EMPLOYEE_ID, M.EMAIL
FROM EMPLOYEES E, EMPLOYEES M
WHERE
E.MANAGER_ID = M.EMPLOYEE_ID

Using Connect By
-----------------
SELECT * FROM
(
SELECT 
EMPLOYEE_ID EID, 
LEAD(MANAGER_ID,1, NULL) OVER (PARTITION BY EMPLOYEE_ID ORDER BY LEVEL) MID,
LEAD(EMAIL,1, NULL)      OVER (PARTITION BY EMPLOYEE_ID ORDER BY LEVEL) MEMAIL
FROM  EMPLOYEES E
CONNECT BY PRIOR MANAGER_ID = EMPLOYEE_ID
AND        LEVEL <= 2
)
WHERE MID IS NOT NULL

Which one is the better approach in the case of data around 200 million rows?

Tom Kyte
April 13, 2011 - 3:03 pm UTC

if you wanted to get all employees with their manager information - the first query is the only approach you should consider. The second approach using connect by would be considered a bit "bizarre".

that, and the second query doesn't return the same data as the first - that would be a problem too.


ops$tkyte%ORA11GR2> with data1
  2  as
  3  (SELECT * FROM
  4  (
  5  SELECT
  6  EMPLOYEE_ID EID,
  7  LEAD(MANAGER_ID,1, NULL) OVER (PARTITION BY EMPLOYEE_ID ORDER BY LEVEL) MID,
  8  LEAD(EMAIL,1, NULL)      OVER (PARTITION BY EMPLOYEE_ID ORDER BY LEVEL)
  9  MEMAIL
 10  FROM  hr.EMPLOYEES E
 11  CONNECT BY PRIOR MANAGER_ID = EMPLOYEE_ID
 12  AND        LEVEL <= 2
 13  )
 14  WHERE MID IS NOT NULL
 15  ),
 16  data2 as
 17  (
 18  SELECT E.EMPLOYEE_ID eid, M.EMPLOYEE_ID mid, M.EMAIL memail
 19  FROM hr.EMPLOYEES E, hr.EMPLOYEES M
 20  WHERE
 21  E.MANAGER_ID = M.EMPLOYEE_ID
 22  )
 23  (select 'in data1', data1.* from data1 MINUS select 'in data1', data2.* from data2)
 24  union all
 25  (select 'in data1', data1.* from data1 MINUS select 'in data1', data2.* from data2)
 26  /

'INDATA1        EID        MID MEMAIL
-------- ---------- ---------- -------------------------
in data1        101        100 NKOCHHAR
in data1        102        100 LDEHAAN
in data1        103        102 AHUNOLD
in data1        108        101 NGREENBE
in data1        114        100 DRAPHEAL
in data1        120        100 MWEISS
in data1        121        100 AFRIPP
in data1        122        100 PKAUFLIN
in data1        123        100 SVOLLMAN
in data1        124        100 KMOURGOS
in data1        145        100 JRUSSEL
in data1        146        100 KPARTNER
in data1        147        100 AERRAZUR
in data1        148        100 GCAMBRAU
in data1        149        100 EZLOTKEY
in data1        201        100 MHARTSTE
in data1        205        101 SHIGGINS
in data1        101        100 NKOCHHAR
in data1        102        100 LDEHAAN
in data1        103        102 AHUNOLD
in data1        108        101 NGREENBE
in data1        114        100 DRAPHEAL
in data1        120        100 MWEISS
in data1        121        100 AFRIPP
in data1        122        100 PKAUFLIN
in data1        123        100 SVOLLMAN
in data1        124        100 KMOURGOS
in data1        145        100 JRUSSEL
in data1        146        100 KPARTNER
in data1        147        100 AERRAZUR
in data1        148        100 GCAMBRAU
in data1        149        100 EZLOTKEY
in data1        201        100 MHARTSTE
in data1        205        101 SHIGGINS

34 rows selected.

Conditional Ranking

Maverick, April 14, 2011 - 9:18 am UTC

Tom, I am trying to rank / row_number records based on some condition [salary is between 35 and 40]. Only if the condition matches should it rank; otherwise the rank should be null. Here is what I am trying to do.

 CREATE TABLE Z_EMP
(
  EMPNO   INTEGER,
  ENAME   VARCHAR2(20 BYTE),
  SALARY  NUMBER(5,2),
  DEPTNO  INTEGER
);

SET DEFINE OFF;
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (1234, 'Michael', 35, 10);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (1235, 'Michael Madan', 39, 10);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (2345, 'Jackson', 55, 20);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (2346, 'Jeremy Jackson', 25, 20);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (1, 'madeup1', 123.56, 10);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (2, 'ser', 123, 10);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (4, 'asd', 35, 10);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (6, 'Testname', 23.45, 20);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (3, 'myname', 123, 10);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (3333, 'testXcel', 34, 20);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (4545, 'Testing MV', 23, 10);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (4546, 'Testing MV', 143, 10);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (6666, 'test', 24, NULL);
Insert into Z_EMP
   (EMPNO, ENAME, SALARY, DEPTNO)
 Values
   (8, 'madeup2', 24.99, 30);
COMMIT;


Query :

select empno,
       salary,
       deptno,
       case when salary between 35 and 40 then salary else 0 end salary_L100,
       row_number() over(partition by deptno order by salary,rownum nulls last) r_1,
       case when salary between 35 and 40 then row_number() over(partition by deptno order by salary,rownum nulls last) else null end r_2
  from z_emp
 where deptno=10
 order by r_1,r_2,empno ;
 
and the result:

EMPNO SALARY DEPTNO SALARY_L100 R_1 R_2
 4545     23     10           0   1
 1234     35     10          35   2   2
    4     35     10          35   3   3
 1235     39     10          39   4   4
    2    123     10           0   5
    3    123     10           0   6
    1 123.56     10           0   7
 4546    143     10           0   8


If you see here, R_2 should start with one; instead it increments but assigns a ranking only when matched. I want it to increment only when the condition is met. Hope I am able to explain it clearly.
I could do this by putting an outer query around it and ranking there, but I don't want to. Do you think it's possible?


[Sorry!..Result table was looking good to me before posting to asktom. in review I see it's distorted..can't help]
Thanks,
Tom Kyte
April 14, 2011 - 10:22 am UTC

ops$tkyte%ORA11GR2> select empno, salary, deptno, case when salary between 35 and 40 then salary end salary_l100,
  2         row_number() over (partition by deptno order by salary, rowid) r_1,
  3             case when salary between 35 and 40
  4                  then row_number() over
  5                                 (partition by deptno
  6                                      order by case when salary between 35 and 40 then salary end, rowid)
  7                   end r_2
  8    from z_emp
  9   where deptno = 10
 10   order by r_1, r_2, empno
 11  /

     EMPNO     SALARY     DEPTNO SALARY_L100        R_1        R_2
---------- ---------- ---------- ----------- ---------- ----------
      4545         23         10                      1
      1234         35         10          35          2          1
         4         35         10          35          3          2
      1235         39         10          39          4          3
         2        123         10                      5
         3        123         10                      6
         1     123.56         10                      7
      4546        143         10                      8

8 rows selected.

Uh .. typo??

Greg, April 14, 2011 - 9:19 am UTC

Hey Tom,

Either you're working some new magic on me that I don't understand, or you have a typo in your example, there :)

" 23 (select 'in data1', data1.* from data1 MINUS select 'in data1', data2.* from data2)
24 union all
25 (select 'in data1', data1.* from data1 MINUS select 'in data1', data2.* from data2)"

Both queries in the union all are identical .. did you mean to reverse the 2nd one?

25 (select 'in data2', data2.* from data2 MINUS select 'in data2', data1.* from data1)

??

Or am I just missing something completely obvious ? :\

Tom Kyte
April 14, 2011 - 10:24 am UTC

I meant to reverse the table names - but still - it shows there is stuff in one query not in the other.

block copy killed me on that one :) I forgot to reverse the names... But point still made.

Conditional Ranking

Maverick, April 14, 2011 - 3:56 pm UTC

Thanks for that solution Tom. It seems to be working, but if I remove rownum it's not working as expected. What is rownum doing there?
Tom Kyte
April 14, 2011 - 5:53 pm UTC

I didn't use rownum, what do you mean?

Conditional Ranking

A reader, April 15, 2011 - 7:07 am UTC

Sorry! I meant rowid [in the order by], not rownum.
Tom Kyte
April 15, 2011 - 10:05 am UTC

the rowid is necessary since salary is not unique - if you want a deterministic row_number assignment. You need to order by something UNIQUE.

The rowid is there to make the order by deterministic by ensuring a unique sort key.
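
In general, any unique column will do as the tiebreaker - rowid is just always available. A sketch against the z_emp table above, using empno (unique here) instead:

select empno, salary,
       -- salary alone is not unique (1234 and 4 both earn 35), so without
       -- the empno tiebreaker their row numbers could swap from run to run
       row_number() over (partition by deptno order by salary, empno) r
  from z_emp
 where deptno = 10;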

Could not find it documented, but...

Andrew, May 03, 2011 - 3:52 pm UTC

I just figured out that a query can look back a varying number of rows in the window. I looked, but could not find it documented, so here is my example.

From this data:
item_id item_seq item_number  item_desc          item_pct
------- -------- ------------ ------------------ --------
  61730        1 18030201     Test                    100
  61739        1 18001901     Black                    80
  61739        2 18001901_R   Red                      20
  65173        1 32150648     Yellow                  100
  67609        1 18050301     yellow                   50
  67609        2 TEST         red                      50
  70645        1 34119591ABLF Black                    50
  70645        2 34119591AMLS Medium Light Stone       50
  73841        1 214 811 005  ...                     100
  76538        1 212 313 007  off road                 10
  76538        2 212 313 010  PLA                      10
  76538        3 212 313 006  PDC                      10
  76538        4 212 320 002  trunk                    10
  76538        5 212 318 002  off road                 10
  76538        6 212 314 002  damper                   10
  76538        7 212 313 005  ECO                      10
  76538        8 262 948 001  ECO                      10
  76538        9 262 972 001  trunk                    10
  76538       10 262 973 001  trunk                    10

This query:
SELECT item_id,
       LAG( item_number, item_seq - 1, 1 )
          -- offset is from the result set
          OVER( PARTITION BY item_id
                ORDER BY item_id, item_seq ) AS parent_item_number,
       item_number,
       item_desc,
       item_pct
  FROM t;

Returns:
item_id parent_item_number item_number  item_desc          item_pct
------- ------------------ ------------ ------------------ --------
  61730 18030201           18030201     Test                    100
  61739 18001901           18001901     Black                    80
  61739 18001901           18001901_R   Red                      20
  65173 32150648           32150648     Yellow                  100
  67609 18050301           18050301     yellow                   50
  67609 18050301           TEST         red                      50
  70645 34119591ABLF       34119591ABLF Black                    50
  70645 34119591ABLF       34119591AMLS Medium Light Stone       50
  73841 214 811 005        214 811 005  ...                     100
  76538 212 313 007        212 313 007  off road                 10
  76538 212 313 007        212 313 010  PLA                      10
  76538 212 313 007        212 313 006  PDC                      10
  76538 212 313 007        212 320 002  trunk                    10
  76538 212 313 007        212 318 002  off road                 10
  76538 212 313 007        212 314 002  damper                   10
  76538 212 313 007        212 313 005  ECO                      10
  76538 212 313 007        262 948 001  ECO                      10
  76538 212 313 007        262 972 001  trunk                    10
  76538 212 313 007        262 973 001  trunk                    10

Now that is really useful. I had to find the parent item number - the value from the first row in the window - so I look back (item_seq - 1) rows.

Just wanted to share the coolness of this. Sorry for the interruption.
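
For anyone wanting to try the idea without Andrew's table, a self-contained sketch (the inline data here is made up for illustration) showing each row reaching back (item_seq - 1) rows, i.e. to the first row of its partition:

with t2 as (
  select 61739 item_id, 1 item_seq, '18001901'    item_number from dual union all
  select 61739,         2,          '18001901_R'              from dual union all
  select 76538,         1,          '212 313 007'             from dual union all
  select 76538,         2,          '212 313 010'             from dual
)
select item_id, item_seq, item_number,
       lag(item_number, item_seq - 1, item_number)  -- per-row offset
         over (partition by item_id order by item_seq) parent_item_number
  from t2;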

sri, July 05, 2011 - 9:28 am UTC

hi tom,

Your answers were most useful. I just completed one complex update using your guidance.

Now I have a problem with analytics. The problem is with this sql:
SELECT a.pats_appl_titl,
       a.pats_Appl_type_id,
       a.monthofyear,
       a.year_display,
       a.pats_tot_mtd,
       SUM(a.pats_tot_mtd) OVER (PARTITION BY a.year_display
                                     ORDER BY a.year_display, a.monthofyear) AS YTD,
       a.pats_tot_mtda,
       SUM(a.pats_tot_mtda) OVER (PARTITION BY a.year_display
                                      ORDER BY a.year_display, a.monthofyear) AS YTDa
  FROM sysadm.ps_pats_total_mtvw a
 GROUP BY a.pats_appl_titl, a.pats_Appl_type_id, a.year_display, a.monthofyear,
          a.pats_tot_mtd, a.pats_tot_mtda
 ORDER BY a.pats_appl_titl, a.pats_Appl_type_id, a.year_display, a.monthofyear,
          a.pats_tot_mtd, a.pats_tot_mtda

When I run the above sql in TOAD, the results are perfect. But when I create a view based on the above sql and then query the view, the results are not correct. Can you let me know how to make the above sql into a view and use the view again and again?
Tom Kyte
July 05, 2011 - 3:45 pm UTC

Oh - the results are CORRECT, just not what you expected.

The analytics are performed AFTER a where clause - so, if you code:


select analytic from t where condition;


the analytic is computed AFTER the where clause. However, if you code:

select * from (select analytic from t) where condition

the analytic is computed AND THEN the where clause is applied. that is what is happening here with the view. For example:


ops$tkyte%ORA11GR2> create table emp as select * from scott.emp;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select ename, empno, deptno, sum(sal) over (partition by deptno order by sal) sumsal
  2    from emp;

ENAME           EMPNO     DEPTNO     SUMSAL
---------- ---------- ---------- ----------
MILLER           7934         10       1300
CLARK            7782                  3750
KING             7839                  8750

SMITH            7369         20        800
ADAMS            7876                  1900
JONES            7566                  4875
SCOTT            7788                 10875
FORD             7902                 10875

JAMES            7900         30        950
MARTIN           7654                  3450
WARD             7521                  3450
TURNER           7844                  4950
ALLEN            7499                  6550
BLAKE            7698                  9400


14 rows selected.

<b>so that is the sum of salary - a running total - with no where clause..</b>

ops$tkyte%ORA11GR2> select ename, empno, deptno, sum(sal) over (partition by deptno order by sal) sumsal
  2    from emp
  3   where ename like '%A%';

ENAME           EMPNO     DEPTNO     SUMSAL
---------- ---------- ---------- ----------
CLARK            7782         10       2450

ADAMS            7876         20       1100

JAMES            7900         30        950
WARD             7521                  3450
MARTIN           7654                  3450
ALLEN            7499                  5050
BLAKE            7698                  7900


7 rows selected.

<b>if we filter on ename AND THEN apply the analytic - we get different running totals, whereas</b>

ops$tkyte%ORA11GR2> select *
  2    from (
  3  select ename, empno, deptno, sum(sal) over (partition by deptno order by sal) sumsal
  4    from emp
  5         )
  6   where ename like '%A%';

ENAME           EMPNO     DEPTNO     SUMSAL
---------- ---------- ---------- ----------
CLARK            7782         10       3750

ADAMS            7876         20       1900

JAMES            7900         30        950
MARTIN           7654                  3450
WARD             7521                  3450
ALLEN            7499                  6550
BLAKE            7698                  9400


7 rows selected.

<b>if we apply the analytic AND THEN filter, we get a running total that is the same as (but perhaps confusing in this context to look at) when we did not filter at all</b>



This is what is happening with your view, it is by design (it is supposed to do that) - and there isn't any way to have it "not happen" in that fashion.

There is a big difference between:


select analytic from t where condition

and

select * from (select analytic from t) where condition


The latter is your view, the former is what you are doing in 'toad'
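
To see that concretely as a view - a sketch (the view names are made up):

create or replace view emp_sumsal_v
as
select ename, empno, deptno,
       sum(sal) over (partition by deptno order by sal) sumsal
  from emp;

-- returns the "analytic AND THEN filter" numbers from above, no matter
-- what predicate you put on the view from outside:
select * from emp_sumsal_v where ename like '%A%';

-- the only way to get the "filter AND THEN analytic" numbers from a view
-- is to put the predicate inside the view text itself:
create or replace view emp_a_sumsal_v
as
select ename, empno, deptno,
       sum(sal) over (partition by deptno order by sal) sumsal
  from emp
 where ename like '%A%';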

Hi...

Sam, July 11, 2011 - 10:21 am UTC

Hi Tom,

I am stuck in a situation where I have to show, for all the emps, the max emp number, the min emp number, the count of total emps, and the dept name of each emp.

How can we write that query? Need your help on this. I have tried but am not getting the expected output.

Expected output ::

EMPNO DNAME      Max(empno) Min(empno) Total_count
 7369 RESEARCH         7934       7369          13
 7499 SALES            7934       7369          13
 7566 RESEARCH         7934       7369          13
 7654 SALES            7934       7369          13
 7698 SALES            7934       7369          13
 7782 ACCOUNTING       7934       7369          13
 7788 RESEARCH         7934       7369          13
 7839 ACCOUNTING       7934       7369          13
 7844 SALES            7934       7369          13
 7876 RESEARCH         7934       7369          13
 7900 SALES            7934       7369          13
 7902 RESEARCH         7934       7369          13
 7934 ACCOUNTING       7934       7369          13

Tom Kyte
July 13, 2011 - 1:27 pm UTC

scott%ORA11GR2> select e.empno, d.dname,
  2         max(e.empno) over () maxemp,
  3             min(e.empno) over () minemp,
  4             count(*) over () cnt
  5    from emp e, dept d
  6   where e.deptno = d.deptno
  7  /

     EMPNO DNAME              MAXEMP     MINEMP        CNT
---------- -------------- ---------- ---------- ----------
      7369 RESEARCH             7934       7369         14
      7499 SALES                7934       7369         14
      7521 SALES                7934       7369         14
      7566 RESEARCH             7934       7369         14
      7654 SALES                7934       7369         14
      7698 SALES                7934       7369         14
      7782 ACCOUNTING           7934       7369         14
      7788 RESEARCH             7934       7369         14
      7839 ACCOUNTING           7934       7369         14
      7844 SALES                7934       7369         14
      7876 RESEARCH             7934       7369         14
      7900 SALES                7934       7369         14
      7902 RESEARCH             7934       7369         14
      7934 ACCOUNTING           7934       7369         14

14 rows selected.

learn analytic functions

venkata, July 12, 2011 - 3:53 pm UTC

Hi Tom,

I want to learn how to use analytic functions in general - some pointer that gives me an insider's view of using them. It reminds me of my childhood days, when I was scared of mathematics; I feel it is the same with analytic functions. I can honestly say I don't know what the row_number, rank, dense_rank, over etc. functions do, or how and where to use them. Please help me to learn about analytic functions.
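
For reference, a minimal sketch against the standard SCOTT.EMP demo table showing the three ranking functions side by side - they differ only when the order-by values tie:

select ename, sal,
       row_number() over (order by sal desc) rn,   -- always unique: 1,2,3,... even on ties
       rank()       over (order by sal desc) rnk,  -- ties share a rank; a gap follows
       dense_rank() over (order by sal desc) drnk  -- ties share a rank; no gap follows
  from scott.emp;

SCOTT and FORD both earn 3000, so row_number gives them 2 and 3 (in an arbitrary order), rank gives both of them 2 and the next salary 4, and dense_rank gives both of them 2 and the next salary 3.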

Order of Result set with multiple row_number() analytical function

Masood Qaisar, October 20, 2011 - 12:23 am UTC

If I have a query with multiple row_number() calls, each with a different partition by & order by clause, how does Oracle sort the result set? e.g.

SELECT row_number() over (partition by ISSUE_DATE order by ctl_num) row_num1,
row_number() over (partition by ctl_num order by ISSUE_DATE) row_num2,
row_number() over (partition by inv_num order by inv_num) row_num3,
ctl_num,inv_num,ISSUE_DATE
from MY_INV

Tom Kyte
October 20, 2011 - 2:25 am UTC

it is not defined, you should have an order by in there if you want it ordered by something.

Note that it won't change the plan really - we *know* that if we've already sorted it - we don't have to sort it. But - you cannot rely on us having it sorted just based on the row_number order by!

ops$tkyte%ORA11GR2> create table t as select * from all_objects where 1=0;

Table created.

ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T', numrows => 1000000, numblks => 100000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select owner, object_type, object_name ,
  2         row_number() over (order by owner) rn1,
  3         row_number() over (order by object_type) rn2,
  4         row_number() over (order by object_name) rn3
  5    from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 718753377

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1000K|    42M|       | 61241   (1)| 00:12:15 |
|   1 |  WINDOW SORT         |      |  1000K|    42M|    53M| 61241   (1)| 00:12:15 |
|   2 |   WINDOW SORT        |      |  1000K|    42M|    53M| 61241   (1)| 00:12:15 |
|   3 |    WINDOW SORT       |      |  1000K|    42M|    53M| 61241   (1)| 00:12:15 |
|   4 |     TABLE ACCESS FULL| T    |  1000K|    42M|       | 27125   (1)| 00:05:26 |
-------------------------------------------------------------------------------------

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select owner, object_type, object_name ,
  2         row_number() over (order by owner) rn1,
  3         row_number() over (order by object_type) rn2,
  4         row_number() over (order by object_name) rn3
  5    from t
  6   order by owner;

Execution Plan
----------------------------------------------------------
Plan hash value: 718753377

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1000K|    42M|       | 49869   (1)| 00:09:59 |
|   1 |  WINDOW SORT         |      |  1000K|    42M|    53M| 49869   (1)| 00:09:59 |
|   2 |   WINDOW SORT        |      |  1000K|    42M|    53M| 49869   (1)| 00:09:59 |
|   3 |    WINDOW SORT       |      |  1000K|    42M|    53M| 49869   (1)| 00:09:59 |
|   4 |     TABLE ACCESS FULL| T    |  1000K|    42M|       | 27125   (1)| 00:05:26 |
-------------------------------------------------------------------------------------

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select owner, object_type, object_name ,
  2         row_number() over (order by owner) rn1,
  3         row_number() over (order by object_type) rn2,
  4         row_number() over (order by object_name) rn3
  5    from t
  6   order by object_type;

Execution Plan
----------------------------------------------------------
Plan hash value: 718753377

-------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  1000K|    42M|       | 49869   (1)| 00:09:59 |
|   1 |  WINDOW SORT         |      |  1000K|    42M|    53M| 49869   (1)| 00:09:59 |
|   2 |   WINDOW SORT        |      |  1000K|    42M|    53M| 49869   (1)| 00:09:59 |
|   3 |    WINDOW SORT       |      |  1000K|    42M|    53M| 49869   (1)| 00:09:59 |
|   4 |     TABLE ACCESS FULL| T    |  1000K|    42M|       | 27125   (1)| 00:05:26 |
-------------------------------------------------------------------------------------

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace off



You cannot rely on row_number - even when used all by itself, a single time, with no other analytics - returning the data in any sort of order.

If you need something ordered by something - you need to say "order by" at the end of the query.

A reader, January 16, 2012 - 3:01 am UTC

Hi Tom,

Is it possible to use an analytic function for the below query, as I need to take the max value of 2 fields? The reason for using an analytic is to tune the query.

SELECT MAX(c.effective_rate), MAX(c.amt_base)
INTO var_rate, var_amt_base
FROM ln_acct_dtls b, ln_tmp_rln_interest c
WHERE b.cod_acct_no = c.cod_acct_no
AND (c.dat_from = b.dat_last_ioa OR
c.dat_from = var_dat_last_process OR
c.dat_from = ADD_MONTHS(b.dat_last_charged, -1))
AND c.flg_int_type = 1
AND c.flg_from IN ('AI', 'OI', 'IN')
AND b.flg_mnt_status = 'A'
AND b.cod_acct_no = var_cod_acct_no
AND to_timestamp(c.date_time) =
(SELECT to_timestamp(MAX(date_time))
FROM ln_tmp_rln_interest
WHERE cod_acct_no = c.cod_acct_no
AND (dat_from = b.dat_last_ioa OR
dat_from = var_dat_last_process OR
dat_from = ADD_MONTHS(b.dat_last_charged, -1)))
GROUP BY b.cod_acct_no;
Tom Kyte
January 17, 2012 - 1:53 pm UTC

It is impossible for me to tell what is a column and what is a bind variable here. VAR_COD_ACCT_NO seems to be a plsql variable maybe?

No tables, we don't understand the schema at all.

We don't know how big anything is.

totally fuzzy join - this stuff:

 AND (c.dat_from = b.dat_last_ioa OR
               c.dat_from = var_dat_last_process OR
               c.dat_from = ADD_MONTHS(b.dat_last_charged, -1))


sends chills down my spine (and again, is var_dat_last_process a plsql variable, maybe?)


I *think*, if my assumption about var_cod_acct_no is correct, that


select max(erate), max(abase)
  INTO var_rate, var_amt_base
  from (
SELECT case when c.flg_int_type=1 and c.flg_from in ('AI','OI','IN') and b.flg_mnt_status = 'A' then c.effective_rate end erate,
       case when c.flg_int_type=1 and c.flg_from in ('AI','OI','IN') and b.flg_mnt_status = 'A' then c.amt_base       end abase,
       c.date_time,
       max(c.date_time) over () max_date_time
  FROM ln_acct_dtls b, ln_tmp_rln_interest c
 WHERE b.cod_acct_no = c.cod_acct_no
   AND (c.dat_from = b.dat_last_ioa OR
        c.dat_from = var_dat_last_process OR
        c.dat_from = ADD_MONTHS(b.dat_last_charged, -1))
   AND b.cod_acct_no = var_cod_acct_no
       )
 where date_time = max_date_time


might be equivalent.


join B to C by cod_acct_no and your fuzzy join, only get the one cod_acct_no from b (and hence c).

then, using case - null out anything that doesn't match the rest of your where clause.

also - keep the max(c.date_time) over () - over the entire set, we'll use it at the next level.

at the next level, we keep only the rows where the date is the same as the max date time - and max it then.


I don't know why you to_timestamp'ed things - that cannot be making anything faster.

A reader, January 18, 2012 - 2:30 pm UTC

<OP Query>

SELECT MAX(c.effective_rate), MAX(c.amt_base)
INTO var_rate, var_amt_base
FROM ln_acct_dtls b, ln_tmp_rln_interest c
WHERE b.cod_acct_no = c.cod_acct_no
AND (c.dat_from = b.dat_last_ioa OR
c.dat_from = var_dat_last_process OR
c.dat_from = ADD_MONTHS(b.dat_last_charged, -1))
AND c.flg_int_type = 1
AND c.flg_from IN ('AI', 'OI', 'IN')
AND b.flg_mnt_status = 'A'
AND b.cod_acct_no = var_cod_acct_no
AND to_timestamp(c.date_time) =
(SELECT to_timestamp(MAX(date_time))
FROM ln_tmp_rln_interest
WHERE cod_acct_no = c.cod_acct_no
AND (dat_from = b.dat_last_ioa OR
dat_from = var_dat_last_process OR
dat_from = ADD_MONTHS(b.dat_last_charged, -1)))
GROUP BY b.cod_acct_no;

</OP Query>


<Your Query>

select max(erate), max(abase)
INTO var_rate, var_amt_base
from (
SELECT case when c.flg_int_type=1 and c.flg_from in ('AI','OI','IN') and b.flg_mnt_status = 'A'
then c.effective_rate end erate,
case when c.flg_int_type=1 and c.flg_from in ('AI','OI','IN') and b.flg_mnt_status = 'A'
then c.amt_base end abase,
c.date_time,
max(c.date_time) over () max_date_time
FROM ln_acct_dtls b, ln_tmp_rln_interest c
WHERE b.cod_acct_no = c.cod_acct_no
AND (c.dat_from = b.dat_last_ioa OR
c.dat_from = var_dat_last_process OR
c.dat_from = ADD_MONTHS(b.dat_last_charged, -1))
AND b.cod_acct_no = var_cod_acct_no
)
where date_time = max_date_time

</Your Query>



in your query don't you need to add predicate "AND erate is not null and abase is not null" in the outer query


what if there are record(s) which satisfy the below condition

WHERE b.cod_acct_no = c.cod_acct_no
AND (c.dat_from = b.dat_last_ioa OR
c.dat_from = var_dat_last_process OR
c.dat_from = ADD_MONTHS(b.dat_last_charged, -1))
AND b.cod_acct_no = var_cod_acct_no


but do not satisfy

AND c.flg_int_type = 1
AND c.flg_from IN ('AI', 'OI', 'IN')
AND b.flg_mnt_status = 'A'

we do not have to output those records


So I think below is what you are looking for; just tweak your query a little to make it identical to the OP query


select max(erate), max(abase)
INTO var_rate, var_amt_base
from (
SELECT case when c.flg_int_type=1 and c.flg_from in ('AI','OI','IN') and b.flg_mnt_status = 'A'
then c.effective_rate end erate,
case when c.flg_int_type=1 and c.flg_from in ('AI','OI','IN') and b.flg_mnt_status = 'A'
then c.amt_base end abase,
c.date_time,
max(c.date_time) over () max_date_time
FROM ln_acct_dtls b, ln_tmp_rln_interest c
WHERE b.cod_acct_no = c.cod_acct_no
AND (c.dat_from = b.dat_last_ioa OR
c.dat_from = var_dat_last_process OR
c.dat_from = ADD_MONTHS(b.dat_last_charged, -1))
AND b.cod_acct_no = var_cod_acct_no
)
where date_time = max_date_time and erate is not null and abase is not null


Please correct me if I am wrong

Thanks...
Tom Kyte
January 18, 2012 - 3:20 pm UTC

... in your query don't you need to add predicate "AND erate is not null and abase
is not null" in the outer query ...

no, aggregates already ignore nulls.


... we do not have to output those records
..

the case statement took care of that, and we did need to output those records in order to get:

 max(c.date_time) over () max_date_time


we needed the max analytic function to be applied to all records that matched

 WHERE b.cod_acct_no = c.cod_acct_no
   AND (c.dat_from = b.dat_last_ioa OR
        c.dat_from = var_dat_last_process OR
        c.dat_from = ADD_MONTHS(b.dat_last_charged, -1))
   AND b.cod_acct_no = var_cod_acct_no


including the ones that didn't satisfy:

           AND c.flg_int_type = 1
           AND c.flg_from IN ('AI', 'OI', 'IN')
           AND b.flg_mnt_status = 'A'



... Please correct me if I am wrong
...

I know that adding:
 and erate is not null and abase is not null

in general makes your query wrong as that was never a condition in the original query - if the original query allows for more than one record to be retrieved BEFORE THE AGGREGATE is applied - then the max(erate) could come from a record that had a null abase and the max(abase) could come from a record that has a null erate. You could add:

and (erate is not null OR abase is not null)

but the max(erate)/max(abase) deals with that already.
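
A quick two-row illustration of both points (a concocted sketch):

with t as (
  select 1 erate, to_number(null) abase from dual union all
  select to_number(null) erate, 2 abase from dual
)
select max(erate), max(abase) from t;
-- returns 1, 2: each MAX comes from a different row; the NULLs are simply ignored

with t as (
  select 1 erate, to_number(null) abase from dual union all
  select to_number(null) erate, 2 abase from dual
)
select max(erate), max(abase) from t
 where erate is not null and abase is not null;
-- the extra predicate rejects both rows, so this returns a single row of NULL, NULL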

A reader, January 20, 2012 - 11:00 am UTC

I think I did not convey properly what I meant to say:
in the OP query, for a given value of "var_cod_acct_no", if there are no records which satisfy the other predicates,
the query returns no row, but in your query it returns a row


with t as
(
select cast (null as number) as c1, cast (null as number) as c2, 10 as c3 from dual union all
select cast (null as number), 55, 11 from dual
)
select --*
max(c1), max(c2)
from t
where c3=55


will return a row, even though there is no row which satisfies the condition "c3=55"
now if you add "group by" it will return no rows

Thanks
Tom Kyte
January 20, 2012 - 11:27 am UTC

they would have to add a group by, not a predicate.

an aggregate without a group by always returns a row - always - even if the table queried is empty.

your predicate would not change that, but it would break the answer :)


so yes, to mimic the original, they would need to add a group by to avoid getting a record when there is no matching var_cod_acct_no.
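
A two-query sketch of that behavior against SCOTT.EMP:

select max(sal) from scott.emp where 1=0;
-- 1 row selected: a single row with a NULL max

select max(sal) from scott.emp where 1=0 group by deptno;
-- no rows selected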

selecting different data from same table

Ravi B, January 24, 2012 - 12:17 pm UTC

Hi Tom,

I have a table as follows.

create table test(id number, description varchar2(10), hidden number, inst_date date);

insert into test values(1,'DESC1',0,'12-JAN-2012');
insert into test values(1,'DESC1',1,'14-JAN-2012');
insert into test values(2,'DESC2',0,'15-FEB-2012');
insert into test values(3,'DESC3',1,'13-APR-2012');
insert into test values(4,'DESC4',0,NULL);
insert into test values(4,'DESC4',1,'16-JUN-2012');
insert into test values(4,'DESC4',1,'27-AUG-2012');
insert into test values(5,'DESC5',0,NULL);
insert into test values(5,'DESC5',1,'19-DEC-2012');
insert into test values(5,'DESC5',1,'23-DEC-2012');

select * from test;

My requirement is to select all the columns from this table where hidden=0 and insert into another table.
The rules for selecting inst_date are as follows:

1) if inst_date IS NOT NULL select the inst_date from the current row.
2) if inst_date IS NULL, select the max of inst_date from the same table for the same id where hidden = 1

select ID,DESCRIPTION,HIDDEN,
       CASE WHEN INST_DATE is null 
       THEN (select max(inst_date)
               from test t2
              where t1.id=t2.id
                and hidden=1)
        ELSE inst_date END inst_date
  from test t1
where hidden=0;



This is a simplified version of what we are trying to do here.
1) This table would contain about 25 - 35 million rows.
2) There are a few more smaller (few hundred to few thousand rows) tables that are joined to this query

Could you please let me know if there is a better way to do this? Would analytics make things more efficient/easier?

Thanks!
Tom Kyte
January 24, 2012 - 1:49 pm UTC

ops$tkyte%ORA11GR2> select id, description, nvl(inst_date,max_inst_date) new_inst_date
  2    from (
  3  select id, description, hidden, inst_date, max( case when hidden=1 then inst_date end) over (partition by id) max_inst_date
  4    from test
  5         )
  6   where hidden = 0;

        ID DESCRIPTIO NEW_INST_
---------- ---------- ---------
         1 DESC1      12-JAN-12
         2 DESC2      15-FEB-12
         4 DESC4      27-AUG-12
         5 DESC5      23-DEC-12

ops$tkyte%ORA11GR2> 

A reader, January 25, 2012 - 10:33 am UTC

Hi Tom

I have the below query. I thought of using MERGE but was unable to, as I need the data grouped by account no, so I can't select dat_post from the ext_ch_nobook table.
Is it possible to do it without using the table rpt_ci_r112 two times, via MERGE or an analytic function? I can't even use update(select...), as there is no primary key on the table.

Orginal Query :

UPDATE rpt_ci_r112 c
SET c.interest_accr = NVL((SELECT NVL(SUM(B.amt_txn), 0.00)
FROM ext_ch_nobook b, rpt_ci_r112 a
WHERE B.cod_acct_no = A.cod_acct_no
AND B.dat_post >= a.dat_from
AND B.dat_post <= A.dat_to
and c.cod_acct_no = a.cod_acct_no
and c.code_sub_type in ('L1', 'A1')
and c.code_module in ('CK', 'CI')
AND B.cod_txn_mnemonic = 5001
GROUP BY B.cod_acct_no),
0.00);

Tom Kyte
January 26, 2012 - 10:22 am UTC

well, you could simplify it easily by realizing that when code_sub_type is NOT IN ('L1', 'A1') or code_module is NOT IN ('CK', 'CI') - you would set it to zero

set c.interest_accr = case when c.code_sub_type NOT IN ('L1', 'A1')
                                or
                                c.code_module NOT IN ('CK', 'CI')
                           then 0
                           else (select nvl( sum(b.amt_txn), 0 )
                                   from ext_ch_nobook b, rpt_ci_r112 a
                                  where b.cod_acct_no = c.cod_acct_no
                                    and a.cod_acct_no = c.cod_acct_no
                                    and b.cod_txn_mnemonic = 5001
                                    and b.dat_post between a.dat_from and a.dat_to)
                       end



but without understanding the schema (the constraints, the relationships, everything) and without understanding the actual problem you are trying to solve - that is about it.

What is the syntax of the query and how is the result computed?

J.H.Zheng, January 28, 2012 - 5:06 am UTC

Hi Tom,

I have the following query, but I have trouble understanding what the syntax means and how count_m is computed by count(*).

     Could you please help me understand 
A) The meaning of this query, 
B) How the count_m is computed,

Thanks,


SQL> select deptno,sal,empno, count(*) over ( partition by deptno
  2  order by sal range between sal/2 following and unbounded following) count_m from emp where deptno=30;

    DEPTNO        SAL      EMPNO    COUNT_M
---------- ---------- ---------- ----------
        30        950       7900          3
        30       1250       7521          1
        30       1250       7654          1
        30       1500       7844          1
        30       1600       7499          1
        30       2850       7698          0



Tom Kyte
January 31, 2012 - 5:37 pm UTC

Ok, in short - what that is doing is:


a) PARTITION BY DEPTNO breaking the data up by deptno. So, when we look at a result for a given row, we know the count_m is derived ONLY from other rows with that same deptno.

it wasn't really necessary in this case, since there was a predicate on deptno that restricted the rows to having the same deptno already. It would only be useful in the event there was more than one deptno.


b) order by sal. After breaking the data up by deptno, sort it by salary before computing the analytic function - so now we have all of the rows in a given deptno sorted by sal from small to big.

c) range between SAL/2 following and UNBOUNDED FOLLOWING says to include the rows that have a salary (since that is what we ordered by) that is between their salary PLUS sal/2 and "unbounded following" - meaning everything after that.

It is easier to see with a concocted data set and the use of first_value/last_value, consider:


ops$tkyte%ORA11GR2> create table emp
  2  as select 30 deptno, rownum sal, empno from scott.emp;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2>  select deptno,sal,empno, sal+sal/2,
  2          count(*) over ( partition by deptno order by sal range between sal/2 following and unbounded following) count_m ,
  3          first_value(sal) over ( partition by deptno order by sal range between sal/2 following and unbounded following) first_sal ,
  4          last_value(sal) over ( partition by deptno order by sal range between sal/2 following and unbounded following) last_sal
  5     from emp where deptno=30
  6  /

    DEPTNO        SAL      EMPNO  SAL+SAL/2    COUNT_M  FIRST_SAL   LAST_SAL
---------- ---------- ---------- ---------- ---------- ---------- ----------
        30          1       7369        1.5         13          2         14
        30          2       7499          3         12          3         14
        30          3       7521        4.5         10          5         14
        30          4       7566          6          9          6         14
        30          5       7654        7.5          7          8         14
        30          6       7698          9          6          9         14
        30          7       7782       10.5          4         11         14
        30          8       7788         12          3         12         14
        30          9       7839       13.5          1         14         14
        30         10       7844         15          0
        30         11       7876       16.5          0
        30         12       7900         18          0
        30         13       7902       19.5          0
        30         14       7934         21          0

14 rows selected.




Now, you can see for the guy with a salary of 5 - that the first row that contributed to their result was the guy with a salary of 8 - because 8 is the first salary that is greater than or equal to 5+5/2 = 7.5. And the last salary used was 14 - because that is the end of the result set.

For the guy with a salary of 10, since 10+10/2 = 15 - the range of rows would be between "15 and end of set" - but 15 is outside the bounds of the set - so the count is zero....



It is a fairly strange window to use - I see the blog you got it from; sort of weird that in an article purporting to explain analytics, it was the pick to start with (and then not explained in detail ;) )

Thank you very much! Now it is very clear....

J.H.zheng, January 31, 2012 - 7:28 pm UTC


looking a better approach

Praveen Ray, March 04, 2012 - 3:28 am UTC

Hi Tom,

create table t (bkp_date date, prev_rows number, curr_rows number);

insert into t values (to_date('11/25/2011','mm/dd/yyyy'), 1653, 1691);
insert into t values (to_date('12/01/2011','mm/dd/yyyy'), 1763, 1819);
insert into t values (to_date('12/02/2011','mm/dd/yyyy'), 1819, 1887);
insert into t values (to_date('12/05/2011','mm/dd/yyyy'), 1887, 1909);
insert into t values (to_date('12/08/2011','mm/dd/yyyy'), 1937, 1866);


On Nov 25, say, table X had 1691 rows. On Dec 01, it had 1819 rows, and here the previously recorded value is 1763; since it does not match Nov 25th's curr_rows, i.e. 1691, we report it as the Nov 30th (Dec 01 - 1) value. This way, I want you to please write a query to report the following output:

11/24/2011 -> 1653
11/25/2011 -> 1691
11/30/2011 -> 1763
12/01/2011 -> 1819
12/02/2011 -> 1887
12/05/2011 -> 1909
12/07/2011 -> 1937
12/08/2011 -> 1866

Praveen Ray
Tom Kyte
March 04, 2012 - 9:34 am UTC

ops$tkyte%ORA11GR2> with r as (select level-1 l from dual connect by level <= 2)
  2  select t.bkp_date-r.l, decode(r.l,0,curr_rows,prev_rows), lag_rows, r.l, t.bkp_date, t.prev_rows, t.curr_rows
  3    from r, (select bkp_date, prev_rows, curr_rows, lag(curr_rows) over (order by bkp_date) lag_rows from t) t
  4   where r.l = 0 or (r.l=1 and decode(t.prev_rows,t.lag_rows,0,1) = 1)
  5   order by t.bkp_date-r.l
  6  /

T.BKP_DAT DECODE(R.L,0,CURR_ROWS,PREV_ROWS)   LAG_ROWS          L BKP_DATE   PREV_ROWS  CURR_ROWS
--------- --------------------------------- ---------- ---------- --------- ---------- ----------
24-NOV-11                              1653                     1 25-NOV-11       1653       1691
25-NOV-11                              1691                     0 25-NOV-11       1653       1691
30-NOV-11                              1763       1691          1 01-DEC-11       1763       1819
01-DEC-11                              1819       1691          0 01-DEC-11       1763       1819
02-DEC-11                              1887       1819          0 02-DEC-11       1819       1887
05-DEC-11                              1909       1887          0 05-DEC-11       1887       1909
07-DEC-11                              1937       1909          1 08-DEC-11       1937       1866
08-DEC-11                              1866       1909          0 08-DEC-11       1937       1866

8 rows selected.



make each row become two rows (cartesian join to R)

get the true previous value (using lag)

keep the one row regardless (r.l = 0) and keep the other row if the prev_row differs from the true previous row..


acknowledgement

Praveen Ray, March 06, 2012 - 4:01 am UTC

Hi Tom, I tried not to pollute the thread, but I couldn't stop myself from acknowledging your generous effort. I am sorry :)

You are awesome.

more use of LAG

Praveen Ray, March 09, 2012 - 10:00 pm UTC

Hi Tom,

I have got another question with a bit of a twist. I am facing difficulty with how to iterate over the gaps (weekends, holidays).

create table h (holiday date);
insert into h values(to_date('11/28/2011','mm/dd/yyyy'));

create table t (bkp_date date, prev_rows number, curr_rows number);

insert into t values (to_date('11/25/2011','mm/dd/yyyy'), 1653, 1691); -- 1. Friday
insert into t values (to_date('11/29/2011','mm/dd/yyyy'), 1691, 1720); -- 2. Tuesday
-- 3. Wednesday - Missed
insert into t values (to_date('12/01/2011','mm/dd/yyyy'), 1763, 1819); -- 4. Thursday
insert into t values (to_date('12/02/2011','mm/dd/yyyy'), 1819, 1887); -- 5. Friday
insert into t values (to_date('12/05/2011','mm/dd/yyyy'), 1887, 1909); -- 6. Monday
-- 7. Tue - Missed
-- 8. Wed - Missed
insert into t values (to_date('12/08/2011','mm/dd/yyyy'), 1937, 1866); -- 9. Thursday

create view ray as select bkp_date, curr_rows from t;


I have access to the view and the holiday table H. Table X had 1691 rows on Nov-25. Since Nov-26 is a Saturday, Nov-27 is a Sunday, and Nov-28 is a holiday, the row count on Nov-29, i.e. 1720, is considered next-day data. My question is to write a query to output the increase in rows, but only if the data were backed up on consecutive working days, as shown:

11/25/2011
11/29/2011 29 (i.e. 1720 - 1691)
12/01/2011
12/02/2011 68 (i.e. 1887 - 1819)
12/05/2011 22 (i.e. 1909 - 1887)
12/08/2011

Praveen Ray


Tom Kyte
March 12, 2012 - 7:30 am UTC

My question is to
write a query to output the increase in rows, but only if the data were backed up on
consecutive working days, as shown:


how do you know if something was backed up?

LAG clarification

Praveen Ray, March 12, 2012 - 8:44 am UTC

if there is no Saturday, Sunday, or holiday falling between the earlier date and the current date, it is fine, and only then display the difference.
Tom Kyte
March 12, 2012 - 9:22 am UTC

ops$tkyte%ORA11GR2> select bkp_date, curr_rows, last_curr_rows, last_bkp_date, diff,
  2         case when diff >0 and bkp_date = last_bkp_date+1
  3              then diff
  4              when diff >0 and bkp_date = last_bkp_date+3 and to_char(bkp_date,'dy') = 'mon'
  5              then diff
  6              when diff >0 and bkp_date <> last_bkp_date+1 and
  7                   not exists ( select last_bkp_date+level
  8                                  from dual
  9                                 where to_char(last_bkp_date+level,'dy') not in ('sat','sun')
 10                               connect by level <= bkp_date-last_bkp_date-1
 11                                 minus
 12                                select holiday
 13                                  from h )
 14              then diff
 15          end esac
 16    from (
 17  select bkp_date, curr_rows, lag(curr_rows) over (order by bkp_date) last_curr_rows,
 18         lag(bkp_date) over (order by bkp_date) last_bkp_date,
 19         curr_rows-lag(curr_rows) over (order by bkp_date) diff
 20    from ray
 21         )
 22   order by bkp_date
 23  /

BKP_DATE   CURR_ROWS LAST_CURR_ROWS LAST_BKP_       DIFF       ESAC
--------- ---------- -------------- --------- ---------- ----------
25-NOV-11       1691
29-NOV-11       1720           1691 25-NOV-11         29         29
01-DEC-11       1819           1720 29-NOV-11         99
02-DEC-11       1887           1819 01-DEC-11         68         68
05-DEC-11       1909           1887 02-DEC-11         22         22
08-DEC-11       1866           1909 05-DEC-11        -43

6 rows selected.


is one approach.

thank you so much

Praveen Ray, March 12, 2012 - 11:52 am UTC

excellent!!!

Analytics

Krishna Boda, April 13, 2012 - 3:09 am UTC

Hi Tom,

Need help to frame a query using analytics to get the desired results as follows:

Following is the table data:

Case TransDate Type TRANSAMT
C1 1/1/2012 A 100
C1 2/1/2012 A 100
C1 15/1/2012 I 200
C1 16/1/2012 A 500
C1 18/1/2012 I 500
C1 28/1/2012 A 200

I want to compare the transamt of each 'I' with the sum(transamt) of the preceding 'A' rows.

Ex:

The sum(transamt) of the 1st two 'A' rows (i.e. 100+100) is compared with the following 'I' (i.e. 200), and so on for the following records. If the amounts are equal, ignore those records; else display them.

Thanks & Rgds,
Krishna
Tom Kyte
April 13, 2012 - 12:51 pm UTC

no creates
no inserts
no look

and remember - my date format isn't your date format, you better use an explicit format when you post your example.

Analytics Functions

Krishna, April 16, 2012 - 9:28 pm UTC

Sorry..below are the required scripts:


Create script:

CREATE TABLE "CASE_TABLE"
(
"CASE_NO" VARCHAR2(2),
"TRANS_DTE" DATE,
"TYPE" VARCHAR2(1),
"TRANS_AMT" NUMBER
);


Insert statements:


INSERT INTO CASE_TABLE (CASE_NO, TRANS_DTE, TYPE, TRANS_AMT) VALUES ('C1', TO_DATE('01/01/2012', 'DD/MM/YYYY'), 'A', 100);
INSERT INTO CASE_TABLE (CASE_NO, TRANS_DTE, TYPE, TRANS_AMT) VALUES ('C1', TO_DATE('02/01/2012', 'DD/MM/YYYY'), 'A', 100);
INSERT INTO CASE_TABLE (CASE_NO, TRANS_DTE, TYPE, TRANS_AMT) VALUES ('C1', TO_DATE('15/01/2012', 'DD/MM/YYYY'), 'I', 200);
INSERT INTO CASE_TABLE (CASE_NO, TRANS_DTE, TYPE, TRANS_AMT) VALUES ('C1', TO_DATE('16/01/2012', 'DD/MM/YYYY'), 'A', 500);
INSERT INTO CASE_TABLE (CASE_NO, TRANS_DTE, TYPE, TRANS_AMT) VALUES ('C1', TO_DATE('18/01/2012', 'DD/MM/YYYY'), 'I', 500);
INSERT INTO CASE_TABLE (CASE_NO, TRANS_DTE, TYPE, TRANS_AMT) VALUES ('C1', TO_DATE('28/01/2012', 'DD/MM/YYYY'), 'A', 200);



select * from case_table;

CASE_NO TRANS_DTE TYPE TRANS_AMT
C1 01/01/12 A 100
C1 02/01/12 A 100
C1 15/01/12 I 200
C1 16/01/12 A 500
C1 18/01/12 I 500
C1 28/01/12 A 200



Expected Results:

CASE_NO TRANS_DTE TYPE TRANS_AMT
C1 28/01/12 A 200


Since the sums of trans_amt for type 'A' matched the following 'I' rows, the first 5 rows must be ignored and only the last record should be displayed.


NOTE: My intention is not to run "select * from case_table where trans_dte = '28/01/12'" to get the result


Thanks in advance.

Krishna

Tom Kyte
April 17, 2012 - 1:41 am UTC


<b> we need to create groups - each group is all of the rows up to and including the row with 'I' in it - after breaking the data up by CASE_NO (you never said, but I assume the analysis takes place within a case_no, not across them) and sorting by trans_dte - so, we'll tag the beginning of each group:</b>

ops$tkyte%ORA11GR2> select c.*, case when nvl(lag(type) over (partition by case_no order by trans_dte),'I') = 'I'
  2                   then row_number() over (partition by case_no order by trans_dte)
  3               end grp
  4    from case_table c
  5  /

CA TRANS_DTE T  TRANS_AMT        GRP
-- --------- - ---------- ----------
C1 01-JAN-12 A        100          1
C1 02-JAN-12 A        100
C1 15-JAN-12 I        200
C1 16-JAN-12 A        500          4
C1 18-JAN-12 I        500
C1 28-JAN-12 A        200          6

6 rows selected.

<b>then we'll carry that group down to fill in the blanks:</b>

ops$tkyte%ORA11GR2> select x.*, max(grp) over (partition by case_no order by trans_dte) max_grp
  2    from (
  3  select c.*, case when nvl(lag(type) over (partition by case_no order by trans_dte),'I') = 'I'
  4                   then row_number() over (partition by case_no order by trans_dte)
  5               end grp
  6    from case_table c
  7         ) x
  8  /

CA TRANS_DTE T  TRANS_AMT        GRP    MAX_GRP
-- --------- - ---------- ---------- ----------
C1 01-JAN-12 A        100          1          1
C1 02-JAN-12 A        100                     1
C1 15-JAN-12 I        200                     1
C1 16-JAN-12 A        500          4          4
C1 18-JAN-12 I        500                     4
C1 28-JAN-12 A        200          6          6

6 rows selected.

<b> then we'll get the sum of the A's and the sum of the I's (only one of those) in each group:</b>

ops$tkyte%ORA11GR2> select case_no, trans_dte, type, trans_amt,
  2         nvl(sum( case when type = 'A' then trans_amt end ) over (partition by case_no, max_grp ),0) a_sum,
  3         nvl(sum( case when type = 'I' then trans_amt end ) over (partition by case_no, max_grp ),0) i_sum
  4    from (
  5  select x.*, max(grp) over (partition by case_no order by trans_dte) max_grp
  6    from (
  7  select c.*, case when nvl(lag(type) over (partition by case_no order by trans_dte),'I') = 'I'
  8                   then row_number() over (partition by case_no order by trans_dte)
  9               end grp
 10    from case_table c
 11         ) x
 12         )
 13  /

CA TRANS_DTE T  TRANS_AMT      A_SUM      I_SUM
-- --------- - ---------- ---------- ----------
C1 01-JAN-12 A        100        200        200
C1 02-JAN-12 A        100        200        200
C1 15-JAN-12 I        200        200        200
C1 16-JAN-12 A        500        500        500
C1 18-JAN-12 I        500        500        500
C1 28-JAN-12 A        200        200          0

6 rows selected.

<b> and only keep the rows where the sums do not match</b>

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select *
  2    from (
  3  select case_no, trans_dte, type, trans_amt,
  4         nvl(sum( case when type = 'A' then trans_amt end ) over (partition by case_no, max_grp ),0) a_sum,
  5         nvl(sum( case when type = 'I' then trans_amt end ) over (partition by case_no, max_grp ),0) i_sum
  6    from (
  7  select x.*, max(grp) over (partition by case_no order by trans_dte) max_grp
  8    from (
  9  select c.*, case when nvl(lag(type) over (partition by case_no order by trans_dte),'I') = 'I'
 10                   then row_number() over (partition by case_no order by trans_dte)
 11               end grp
 12    from case_table c
 13         ) x
 14         )
 15         )
 16   where a_sum <> i_sum
 17  /

CA TRANS_DTE T  TRANS_AMT      A_SUM      I_SUM
-- --------- - ---------- ---------- ----------
C1 28-JAN-12 A        200        200          0

1 row selected.

Analytics Functions

Krishna, April 17, 2012 - 4:07 am UTC

Thanks, Tom, for the quick response.

Group in 1 window

Matthew McPeak, April 17, 2012 - 9:54 am UTC

We had this discussion before... I think you can improve just a little by doing the initial group assignments in only one window?

<code>
WITH lvl1 AS
        (SELECT   c.*,
                  COUNT (
                     DECODE (c.TYPE, 'I', 1, NULL))
                  OVER (PARTITION BY c.case_no
                        ORDER BY c.trans_dte
                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
                  + 1
                     grp
         FROM     case_table c),
     lvl2 AS
        (SELECT lvl1.*,
                SUM (DECODE (lvl1.TYPE, 'A', lvl1.trans_amt, NULL))
                   OVER (PARTITION BY lvl1.case_no, lvl1.grp)
                   a_sum,
                SUM (DECODE (lvl1.TYPE, 'I', lvl1.trans_amt, NULL))
                   OVER (PARTITION BY lvl1.case_no, lvl1.grp)
                   i_sum
         FROM   lvl1)
SELECT *
FROM   lvl2
WHERE  NVL (a_sum, 0) != NVL (i_sum, 0)


</code>
Tom Kyte
April 18, 2012 - 2:52 am UTC

yes, you are probably correct - I forgot about that technique ;)

it is neat but a lot less intuitive.

help on this

venkat, October 23, 2012 - 4:49 am UTC

Hi Tom,

select deptno,avg(sal) from emp group by deptno;


will produce the department-wise avg salary - 3 records only. I want to use an analytic function to achieve the same result as above, so I used
select deptno, avg(sal) over( partition by deptno ) from emp;
which is producing 14 records. I don't know what is wrong here; hope you can help
Tom Kyte
October 23, 2012 - 12:24 pm UTC

the problem is

aggregates aggregate

analytics do not.


why do you want to use an analytic to do something aggregates do????

this would be like asking order by to aggregate - it doesn't make sense.
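
For illustration - if you truly insisted on the analytic form, you would need a DISTINCT to collapse the 14 rows back down to 3, which is just extra work compared to the plain aggregate (a sketch against SCOTT.EMP):

select deptno, avg(sal) avg_sal
  from scott.emp
 group by deptno;
-- 3 rows: the aggregate collapses the groups for you

select distinct deptno, avg(sal) over (partition by deptno) avg_sal
  from scott.emp;
-- the same 3 rows, but only because DISTINCT removed the 11 duplicates afterwards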

Analytic Function over partition

Santosh, October 24, 2012 - 5:10 pm UTC

Hi Tom,
Below is the scenario that I need your help.

CREATE TABLE TEST (LOCATION NUMBER, DEPT NUMBER, YR NUMBER,
PRD NUMBER, WK NUMBER, TOTL_RETL NUMBER) ;

Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 510, 2011, 1, 1,31476.4);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 520, 2011, 1, 1,63711.13);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 530, 2011, 1, 1,15672.74);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 540, 2011, 1, 1,38086.47);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 610, 2011, 1, 1,28140.75);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 620, 2011, 1, 1,30893.24);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 630, 2011, 1, 1,58493.09);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 640, 2011, 1, 1,27488.79);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK,TOTL_RETL) Values (18, 650, 2011, 1, 1,16408.94);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 660, 2011, 1, 1, 21689.94);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 700, 2011, 1, 1, 228342.52);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 990, 2011, 1, 1, 2150.84);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 125, 2011, 1, 2, 0);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 400, 2011, 1, 2, 2.95);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 510, 2011, 1, 2, 19844.29);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 520, 2011, 1, 2, 41771.71);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 530, 2011, 1, 2, 10841.08);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 540, 2011, 1, 2, 23134.07);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 610, 2011, 1, 2, 15120.12);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 620, 2011, 1, 2, 16449.79);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 630, 2011, 1, 2, 33649.653);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 640, 2011, 1, 2, 16101.17);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 650, 2011, 1, 2, 10331.21);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 660, 2011, 1, 2, 13489.1);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 700, 2011, 1, 2, 151811.877);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 990, 2011, 1, 2, 1153.71);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 100, 2011, 1, 3, 28.9);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 125, 2011, 1, 3, 0);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 510, 2011, 1, 3, 18381.26);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 520, 2011, 1, 3, 36350.48);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 530, 2011, 1, 3, 7087.94);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 540, 2011, 1, 3, 21359.93);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 610, 2011, 1, 3, 14403.45);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 620, 2011, 1, 3, 13509.96);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 630, 2011, 1, 3, 28255.173);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 640, 2011, 1, 3, 13829.64);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 650, 2011, 1, 3, 8852.46);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 660, 2011, 1, 3, 12370.38);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 700, 2011, 1, 3, 149759.997);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 990, 2011, 1, 3, 943.18);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 400, 2011, 1, 4, 0);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 510, 2011, 1, 4, 16949.37);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 520, 2011, 1, 4, 37383.85);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 530, 2011, 1, 4, 7497.58);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 540, 2011, 1, 4, 20788.84);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 610, 2011, 1, 4, 10992.85);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 620, 2011, 1, 4, 14992.43);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 630, 2011, 1, 4, 27450.39);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 640, 2011, 1, 4, 12489.45);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 650, 2011, 1, 4, 14542.03);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 660, 2011, 1, 4, 9567.7);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 700, 2011, 1, 4, 128321.87);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 990, 2011, 1, 4, 1126.89);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 100, 2011, 2, 1, 5.95);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 125, 2011, 2, 1, 39.9);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 510, 2011, 2, 1, 19809.88);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 520, 2011, 2, 1, 37047.37);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 530, 2011, 2, 1, 7588.09);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 540, 2011, 2, 1, 21242.25);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 610, 2011, 2, 1, 11596.95);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 620, 2011, 2, 1, 14378.69);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 630, 2011, 2, 1, 27952.592);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 640, 2011, 2, 1, 14877.69);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 650, 2011, 2, 1, 8862.82);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 660, 2011, 2, 1, 11269.37);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 700, 2011, 2, 1, 136642.098);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 990, 2011, 2, 1, 880.66);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 100, 2011, 2, 2, 24.71);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 125, 2011, 2, 2, 19.95);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 400, 2011, 2, 2, 0);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 510, 2011, 2, 2, 22428.46);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 520, 2011, 2, 2, 46155.66);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 530, 2011, 2, 2, 9443.22);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 540, 2011, 2, 2, 25715.86);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 610, 2011, 2, 2, 14119.74);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 620, 2011, 2, 2, 16562.08);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 630, 2011, 2, 2, 39380.098);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 640, 2011, 2, 2, 15643.81);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 650, 2011, 2, 2, 9459.89);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 660, 2011, 2, 2, 13126);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 700, 2011, 2, 2, 160920.012);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 990, 2011, 2, 2, 970.17);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 100, 2011, 2, 3, 39);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 510, 2011, 2, 3, 28723.9);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 520, 2011, 2, 3, 54913.34);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 530, 2011, 2, 3, 9875.94);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 540, 2011, 2, 3, 30609);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 610, 2011, 2, 3, 17938.32);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 620, 2011, 2, 3, 22780.97);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 630, 2011, 2, 3, 45516.787);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 640, 2011, 2, 3, 20507.64);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 650, 2011, 2, 3, 10772.56);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 660, 2011, 2, 3, 16980.97);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 700, 2011, 2, 3, 214895.303);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 990, 2011, 2, 3, 1030.81);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 100, 2011, 2, 4, 35.47);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 510, 2011, 2, 4, 35207.19);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 520, 2011, 2, 4, 74947.75);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 530, 2011, 2, 4, 16769.03);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 540, 2011, 2, 4, 40667.7);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 610, 2011, 2, 4, 24533.57);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 620, 2011, 2, 4, 30303.68);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 630, 2011, 2, 4, 60116.023);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 640, 2011, 2, 4, 28808.64);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 650, 2011, 2, 4, 15382.84);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 660, 2011, 2, 4, 25966.28);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 700, 2011, 2, 4, 321753.127);
Insert into TEST (LOCATION, DEPT, YR, PRD, WK, TOTL_RETL) Values (18, 990, 2011, 2, 4, 1473.17);
COMMIT;

I am trying to pull the sum of retail, the period-to-date retail, and the year-to-date retail with the following query

SELECT LOCATION, YR, PRD, WK, SUM(TOTL_RETL),SUM(PTD_T_RETL),SUM(YTD_T_RETL)
FROM
(
Select
x.LOCATION,
x.DEPT,
x.YR,
x.PRD,
x.WK,
x.TOTL_RETL,
sum(x.totl_retl) over (partition by x.location,x.dept,x.yr,x.prd order by x.prd,x.wk asc) PTD_T_RETL,
sum(x.totl_retl) over (partition by x.location,x.dept,x.yr order by x.prd,x.wk asc) YTD_T_RETL
FROM
(SELECT
A.LOCATION,
a.DEPT,
a.YR,
a.PRD,
a.WK,
SUM(TOTL_RETL) TOTL_RETL
FROM
TEST A
where location = 18
AND PRD IN(1, 2)
AND YR =2011
GROUP BY A.LOCATION, a.DEPT, a.YR, a.PRD, a.WK) X)
WHERE PRD = 2 AND WK = 4
GROUP BY LOCATION, YR, PRD, WK

The result is 18 2011 2 4 675964.47 1836653.13 3380141.76.

The PTD and YTD values are incorrect, as they are only including the depts that are common to every week.
Tom Kyte
October 25, 2012 - 8:55 am UTC

explain in the form of a specification what you are trying to do - use pseudo code, text, whatever to explain it.

posting a non-functioning bit of code doesn't help - it only hurts.


Santosh, October 25, 2012 - 9:37 am UTC

I am trying to pull the sum of retail, the period-to-date retail, and the year-to-date retail.

Below is the result when I run the following query

SELECT * FROM
(
Select
x.LOCATION,
x.DEPT,
x.YR,
x.PRD,
x.WK,
x.TOTL_RETL,
sum(x.totl_retl) over (partition by x.location,x.dept,x.yr,x.prd order by x.prd,x.wk asc)
PTD_T_RETL,
sum(x.totl_retl) over (partition by x.location,x.dept,x.yr order by x.prd,x.wk asc) YTD_T_RETL
FROM
(SELECT
A.LOCATION,
a.DEPT,
a.YR,
a.PRD,
a.WK,
SUM(TOTL_RETL) TOTL_RETL
FROM
WAREHOUSE.JDA_RTC_INVSTK A
where location = 18
AND PRD IN(1, 2)
AND YR =2011
GROUP BY A.LOCATION, a.DEPT, a.YR, a.PRD, a.WK) X)
ORDER BY PRD, WK;

LOCATION,DEPT,YR,PRD,WK,TOTL_RETL,PTD_T_RETL,YTD_T_RETL
18,620,2011,1,1,30893.24,30893.24,30893.24
18,990,2011,1,1,2150.84,2150.84,2150.84
18,700,2011,1,1,228342.52,228342.52,228342.52
18,660,2011,1,1,21689.94,21689.94,21689.94
18,650,2011,1,1,16408.94,16408.94,16408.94
18,640,2011,1,1,27488.79,27488.79,27488.79
18,630,2011,1,1,58493.09,58493.09,58493.09
18,610,2011,1,1,28140.75,28140.75,28140.75
18,540,2011,1,1,38086.47,38086.47,38086.47
18,530,2011,1,1,15672.74,15672.74,15672.74
18,520,2011,1,1,63711.13,63711.13,63711.13
18,510,2011,1,1,31476.4,31476.4,31476.4
18,990,2011,1,2,1153.71,3304.55,3304.55
18,620,2011,1,2,16449.79,47343.03,47343.03
18,610,2011,1,2,15120.12,43260.87,43260.87
18,660,2011,1,2,13489.1,35179.04,35179.04
18,540,2011,1,2,23134.07,61220.54,61220.54
18,640,2011,1,2,16101.17,43589.96,43589.96
18,125,2011,1,2,0,0,0
18,700,2011,1,2,151811.877,380154.397,380154.397
18,520,2011,1,2,41771.71,105482.84,105482.84
18,630,2011,1,2,33649.653,92142.743,92142.743
18,510,2011,1,2,19844.29,51320.69,51320.69
18,650,2011,1,2,10331.21,26740.15,26740.15
18,400,2011,1,2,2.95,2.95,2.95
18,530,2011,1,2,10841.08,26513.82,26513.82
18,100,2011,1,3,28.9,28.9,28.9
18,125,2011,1,3,0,0,0
18,510,2011,1,3,18381.26,69701.95,69701.95
18,520,2011,1,3,36350.48,141833.32,141833.32
18,530,2011,1,3,7087.94,33601.76,33601.76
18,540,2011,1,3,21359.93,82580.47,82580.47
18,610,2011,1,3,14403.45,57664.32,57664.32
18,620,2011,1,3,13509.96,60852.99,60852.99
18,630,2011,1,3,28255.173,120397.916,120397.916
18,640,2011,1,3,13829.64,57419.6,57419.6
18,650,2011,1,3,8852.46,35592.61,35592.61
18,660,2011,1,3,12370.38,47549.42,47549.42
18,700,2011,1,3,149759.997,529914.394,529914.394
18,990,2011,1,3,943.18,4247.73,4247.73
18,400,2011,1,4,0,2.95,2.95
18,510,2011,1,4,16949.37,86651.32,86651.32
18,520,2011,1,4,37383.85,179217.17,179217.17
18,530,2011,1,4,7497.58,41099.34,41099.34
18,540,2011,1,4,20788.84,103369.31,103369.31
18,610,2011,1,4,10992.85,68657.17,68657.17
18,990,2011,1,4,1126.89,5374.62,5374.62
18,630,2011,1,4,27450.39,147848.306,147848.306
18,640,2011,1,4,12489.45,69909.05,69909.05
18,650,2011,1,4,14542.03,50134.64,50134.64
18,660,2011,1,4,9567.7,57117.12,57117.12
18,700,2011,1,4,128321.87,658236.264,658236.264
18,620,2011,1,4,14992.43,75845.42,75845.42
18,540,2011,2,1,21242.25,21242.25,124611.56
18,610,2011,2,1,11596.95,11596.95,80254.12
18,125,2011,2,1,39.9,39.9,39.9
18,530,2011,2,1,7588.09,7588.09,48687.43
18,620,2011,2,1,14378.69,14378.69,90224.11
18,100,2011,2,1,5.95,5.95,34.85
18,630,2011,2,1,27952.592,27952.592,175800.898
18,510,2011,2,1,19809.88,19809.88,106461.2
18,700,2011,2,1,136642.098,136642.098,794878.362
18,650,2011,2,1,8862.82,8862.82,58997.46
18,640,2011,2,1,14877.69,14877.69,84786.74
18,520,2011,2,1,37047.37,37047.37,216264.54
18,660,2011,2,1,11269.37,11269.37,68386.49
18,990,2011,2,1,880.66,880.66,6255.28
18,530,2011,2,2,9443.22,17031.31,58130.65
18,520,2011,2,2,46155.66,83203.03,262420.2
18,510,2011,2,2,22428.46,42238.34,128889.66
18,400,2011,2,2,0,0,2.95
18,125,2011,2,2,19.95,59.85,59.85
18,100,2011,2,2,24.71,30.66,59.56
18,990,2011,2,2,970.17,1850.83,7225.45
18,700,2011,2,2,160920.012,297562.11,955798.374
18,660,2011,2,2,13126,24395.37,81512.49
18,650,2011,2,2,9459.89,18322.71,68457.35
18,640,2011,2,2,15643.81,30521.5,100430.55
18,630,2011,2,2,39380.098,67332.69,215180.996
18,620,2011,2,2,16562.08,30940.77,106786.19
18,610,2011,2,2,14119.74,25716.69,94373.86
18,540,2011,2,2,25715.86,46958.11,150327.42
18,620,2011,2,3,22780.97,53721.74,129567.16
18,650,2011,2,3,10772.56,29095.27,79229.91
18,660,2011,2,3,16980.97,41376.34,98493.46
18,540,2011,2,3,30609,77567.11,180936.42
18,530,2011,2,3,9875.94,26907.25,68006.59
18,610,2011,2,3,17938.32,43655.01,112312.18
18,100,2011,2,3,39,69.66,98.56
18,510,2011,2,3,28723.9,70962.24,157613.56
18,640,2011,2,3,20507.64,51029.14,120938.19
18,700,2011,2,3,214895.303,512457.413,1170693.677
18,630,2011,2,3,45516.787,112849.477,260697.783
18,990,2011,2,3,1030.81,2881.64,8256.26
18,520,2011,2,3,54913.34,138116.37,317333.54
18,650,2011,2,4,15382.84,44478.11,94612.75
18,530,2011,2,4,16769.03,43676.28,84775.62
18,700,2011,2,4,321753.127,834210.54,1492446.804
18,540,2011,2,4,40667.7,118234.81,221604.12
18,640,2011,2,4,28808.64,79837.78,149746.83
18,610,2011,2,4,24533.57,68188.58,136845.75
18,620,2011,2,4,30303.68,84025.42,159870.84
18,510,2011,2,4,35207.19,106169.43,192820.75
18,630,2011,2,4,60116.023,172965.5,320813.806
18,660,2011,2,4,25966.28,67342.62,124459.74
18,990,2011,2,4,1473.17,4354.81,9729.43
18,100,2011,2,4,35.47,105.13,134.03
18,520,2011,2,4,74947.75,213064.12,392281.29

I want to pull the ptd_t_retl for period = 2 and week = 4.

When I add up the totl_retl for period = 2, the result is 1,836,712.98.

However, when I add up the ptd_t_retl for period = 2 and week = 4, the result is 1,836,653.13.

I need the PTD calculated through the analytic function to give the same result, 1,836,712.98.
Tom Kyte
October 25, 2012 - 9:51 am UTC

stop giving code that does not work - what is the point of giving someone code that does not work???

especially since the output is unreadable.


give the full specification, lose the query, pretend no one ever saw the query. write down on paper what you would have to write down to explain to someone clearly - very very very clearly - what the question you are trying to answer is - and the processing that needs to take place to get there.

Santosh, October 25, 2012 - 10:18 am UTC

1. The data is populated as location, departments , year, period(month), week(week number in the period) and retail sales.

2. I need to pull the retail value, the period-to-date retail value, and the year-to-date retail value for any given period and week. (Ex: I need the retail, ptd_retail, ytd_retail values for period = 2, week = 4; this should give me the sum of all the retail sales in every department for that period-week, the sum of retail sales in every department for that entire period (as week 4 is the last week of the period), and the sum of retail sales in every department for periods 1 and 2 (which would give me the year-to-date sales).)

3. Not all departments have sales in every week.

4. When I wrote the query to pull the results for period-to-date sales and year-to-date sales, the departments that had no sales in some weeks were getting omitted from the aggregation.

Tom Kyte
October 25, 2012 - 11:29 am UTC

ops$tkyte%ORA11GR2> select * from (
  2  select location, dept, yr, prd, wk, totl_retl,
  3         sum(totl_retl) over (partition by location, dept, yr, prd order by wk) prd_to_date,
  4         sum(totl_retl) over (partition by location, dept, yr      order by prd, wk) yr_to_date
  5    from test
  6  )
  7  where prd = 2 and wk = 4;

  LOCATION       DEPT         YR        PRD         WK  TOTL_RETL PRD_TO_DATE YR_TO_DATE
---------- ---------- ---------- ---------- ---------- ---------- ----------- ----------
        18        100       2011          2          4      35.47      105.13     134.03
        18        510       2011          2          4   35207.19   106169.43  192820.75
        18        520       2011          2          4   74947.75   213064.12  392281.29
        18        530       2011          2          4   16769.03    43676.28   84775.62
        18        540       2011          2          4    40667.7   118234.81  221604.12
        18        610       2011          2          4   24533.57    68188.58  136845.75
        18        620       2011          2          4   30303.68    84025.42  159870.84
        18        630       2011          2          4  60116.023    172965.5 320813.806
        18        640       2011          2          4   28808.64    79837.78  149746.83
        18        650       2011          2          4   15382.84    44478.11   94612.75
        18        660       2011          2          4   25966.28    67342.62  124459.74
        18        700       2011          2          4 321753.127   834210.54  1492446.8
        18        990       2011          2          4    1473.17     4354.81    9729.43

13 rows selected.



that gives what you asked for in #2


Santosh, October 25, 2012 - 12:35 pm UTC

Yes, I just did the same thing, and the result (the sum of retail sales for period 2) is 1,836,653.13.

However, the correct result which I need is 1,836,712.98.

For period = 2, in weeks 1 and 2 there were sales in dept 125.
There were no sales for dept 125 in weeks 3 and 4, so there were no entries in the table.

The aggregated sum needs to consider those sales as well.
Tom Kyte
October 28, 2012 - 10:25 pm UTC

if there were no sales - what numbers are to be considered????

what sales need to be considered???

Santosh, November 02, 2012 - 10:29 pm UTC

Sorry, I was stuck in Sandy and could not reply sooner.

If there were no sales, we need to consider that as 0.

I think the issue we have is that the departments with no sales are not loaded in the table.

The analytic function, when partitioned on dept, only considers the sum of sales of the depts present in a particular week for the whole period.

For the time being I am using two different queries:
1. To pull weekly sales using the analytic function.
2. To pull the PTD and YTD, not including the dept in the analytic function.

This is working for me.

If there is a better way, could you please suggest it.
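
One better way (a sketch of my own, not from the thread) is "data densification" with a partitioned outer join: manufacture a zero-sales row for every dept/week combination that is missing, then the original analytics work unchanged. Assuming Tom's test table from above:

select location, dept, yr, prd, wk, totl_retl,
       sum(totl_retl) over (partition by location, dept, yr, prd order by wk) prd_to_date,
       sum(totl_retl) over (partition by location, dept, yr order by prd, wk) yr_to_date
  from (
        -- partitioned outer join: every (location, dept) gets a row for
        -- every (yr, prd, wk) that appears anywhere in the table
        select t.location, t.dept, w.yr, w.prd, w.wk,
               nvl(t.totl_retl, 0) totl_retl
          from test t
               partition by (t.location, t.dept)
               right outer join (select distinct yr, prd, wk from test) w
               on (t.yr = w.yr and t.prd = w.prd and t.wk = w.wk)
       )
 where prd = 2 and wk = 4;

Dept 125 now gets zero-sales rows for weeks 3 and 4, so its week 1 and 2 sales roll forward into the period-to-date and year-to-date figures.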

PARTITION BY CLAUSE with more than 1 column

Mukund Nampally, April 04, 2013 - 1:18 pm UTC

Hi Tom,

The Analytical Functions are a great advantage in Oracle.
My Question is -
Does the PARTITION BY clause support more than one column?
Here is what I'm testing -

create table zz_tst1(a number,b number,c varchar2(1))
/
Insert Into Zz_Tst1 Values (1,2,'I')
/
Insert Into Zz_Tst1 Values (1,4,'I')
/
Insert Into Zz_Tst1 Values (10,2,'I')
/
Insert Into Zz_Tst1 Values (11,2,'A')
/
Insert into zz_tst1 values (12,2,'A')
/
And this is the query I'm running -
select count(*) over(partition by case when c='I' then (a,b) end)
from Zz_Tst1

And this gives an error - "missing Right Parentheses"

Tom Kyte
April 22, 2013 - 2:06 pm UTC

it definitely supports more than one column

the problem is your syntax "case when c='I' then (a,b) end" doesn't make any sense. I'm not sure what you are trying to do there? what is (a,b) - that would make sense in an in-list but that is about it.


maybe you mean

partition by case when c='I' then a end, case when c='I' then b end

that would partition by A and B when c='I' or NULL and NULL otherwise.
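
A minimal runnable sketch of that suggestion against the zz_tst1 table above:

select a, b, c,
       count(*) over (partition by case when c = 'I' then a end,
                                   case when c = 'I' then b end) cnt
  from zz_tst1;

The three 'I' rows are partitioned by their distinct (a,b) pairs, while both 'A' rows fall into the single (NULL, NULL) partition.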

reply to mukund's query

Ranjan, April 11, 2013 - 6:37 am UTC

Hi Mukund,

############
select count(*) over(partition by case when c='I' then a||b end)
from Zz_Tst1;
###########

In a normal GROUP BY we write columns separated by commas, but an advanced GROUP BY or PARTITION BY that contains a CASE doesn't support that, so we need to use concatenation (like I did above).
Maybe Tom has some other tricks for that.

regards,
Ranjan
Tom Kyte
April 22, 2013 - 2:48 pm UTC

I'm not really sure what he was trying to do; in general a||b is *not* going to work.


consider
a = 1, b = 11
and
a = 11, b = 1

both concatenate to '111', so two different (a,b) pairs would land in the same partition.


Mukund, April 15, 2013 - 6:04 am UTC

Thanks that was so Useful....

A reader, April 22, 2013 - 6:41 pm UTC

but how about a||','||b ?

Thank you,
Tom Kyte
April 22, 2013 - 8:32 pm UTC

what if a or b contains a ','

no, concatenation is the wrong way to go. If they want to partition by a, b IF something = something else, they can do it the way I described above:

partition by case when c='I' then a end, case when c='I' then b end


if not, they'll have to tell us what they meant to be doing.

:)

A reader, April 23, 2013 - 7:16 am UTC

Hi Tom,

Nice to see you back here in asktom.

Yes, I missed that.

thanks,
Ranjan.

Running Balance

Anand, May 05, 2013 - 1:26 pm UTC

Hi Tom,

I have a table as below :

drop table test
/
create table test (date_trn date,trn_typ char(1),amt number)
/
insert into test values ('01-apr-2013','C',1000)
/
insert into test values ('02-apr-2013','D',500)
/
insert into test values ('03-apr-2013','C',200)
/
insert into test values ('04-apr-2013','C',100)
/

Now I want to get the running balance for each day a transaction happened. If trn_typ = 'C' then add, and if trn_typ = 'D' then subtract.
So my output should be as below:

date_trn     running_balance
01-apr-2013  1000
02-apr-2013   500 --> formula (1000 - 500)
03-apr-2013   700 --> formula (1000 - 500 + 200)
04-apr-2013   800 --> formula (1000 - 500 + 200 + 100)
Tom Kyte
May 06, 2013 - 7:23 pm UTC

ops$tkyte%ORA11GR2> select date_trn, sum( case when trn_typ = 'C' then amt
  2                             when trn_typ = 'D' then -amt
  3                          end ) over (order by date_trn) rt
  4    from test;

DATE_TRN          RT
--------- ----------
01-APR-13       1000
02-APR-13        500
03-APR-13        700
04-APR-13        800

ops$tkyte%ORA11GR2> 

Anand, May 06, 2013 - 2:59 pm UTC

Hi Tom,

I got the solution for above question.

SQL> select date_trn,
  2         sum(decode(trn_typ, 'C', amt, -amt)) over(order by date_trn range unbounded preceding)
  3    from test
  4  /

DATE_TRN    SUM(DECODE(TRN_TYP,'C',AMT,-AM
----------- ------------------------------
01/04/2013                            1000
02/04/2013                             500
03/04/2013                             700
04/04/2013                             800

SQL> 

Thanks for your support and a great site.

FIRST_VALUE

mfz, September 23, 2013 - 3:03 pm UTC

I have a question on the "FIRST_VALUE" function. My understanding is that, if I use IGNORE NULLS,
the function should return the first non-null value in the set.

So, shouldn't it return the value of "FV" in row 1 as 5550004444?
Why does it return null in this case?

SQL> select * from v$version ;

BANNER
-------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for 64-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> Create table t
  2    (
  3    id int  ,
  4    type_cd varchar2(1) ,
  5    tele_no int  ) ;

Table created.

SQL>     insert into t values ( 1 , 'M' , '5550004444') ;

1 row created.

SQL>   insert into t values ( 1 , 'P' , null ) ;

1 row created.

SQL>   insert into t values ( 2 , 'P' , '7775550000' ) ;

1 row created.

SQL> commit;

Commit complete.

SQL>   select id , type_cd , tele_no ,
  2    first_value(tele_no IGNORE NULLS) over ( partition by id order by ( Case When type_cd ='P' then 1 else 2 end ) )  fv
  3    from t;

        ID T    TELE_NO         FV
---------- - ---------- ----------
         1 P
         1 M 5550004444 5550004444
         2 P 7775550000 7775550000

SQL>

for mfz

Ranjan, September 27, 2013 - 12:55 pm UTC

Hi Mfz,

The result you are getting is correct.

Just have a look at the output of the query below (I ran it under the scott schema).

select deptno, comm, first_value(comm ignore nulls) over (partition by deptno order by comm desc) x from emp;

DEPTNO       COMM          X
------ ---------- ----------
    10
    10
    10
    20
    20
    20
    20
    20
    30
    30
    30       1400       1400
    30        500       1400
    30        300       1400
    30          0       1400

From the above result we see that the first two rows of department 30 got a null first_value, because until the window reaches the first non-null value (1400) there is nothing to return for those rows, so they are simply skipped.

Now we will just add a NULLS LAST clause to the above query and look at the result.

select deptno, comm, first_value(comm ignore nulls) over (partition by deptno order by comm desc nulls last) x from emp;

DEPTNO       COMM          X
------ ---------- ----------
    10
    10
    10
    20
    20
    20
    20
    20
    30       1400       1400
    30        500       1400
    30        300       1400
    30          0       1400
    30                  1400
    30                  1400

Now see how it prints 1400 as the first value for those two rows with a null comm: the first_value of 1400 has already been calculated by the time they are reached, and whatever comes after it in the group just shows that value.

Hope this explanation helps you.

So now, coming to your query: if you write it like below (I am only adding DESC to the order by) -

select id, type_cd, tele_no,
       first_value(tele_no ignore nulls) over ( partition by id order by ( case when type_cd = 'P' then 1 else 2 end ) desc ) fv
  from t;

        ID T    TELE_NO         FV
---------- - ---------- ----------
         1 M 5550004444 5550004444
         1 P            5550004444
         2 P 7775550000 7775550000


Hope this helps you understand the concept.

Regards,
Ranjan.
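
Worth adding (a sketch of my own, not from the thread): the root cause is the default analytic window, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Keeping mfz's original ordering but widening the window to the whole partition also returns the non-null value on every row:

select id, type_cd, tele_no,
       first_value(tele_no ignore nulls)
         over ( partition by id
                order by case when type_cd = 'P' then 1 else 2 end
                rows between unbounded preceding and unbounded following ) fv
  from t;

With the full-partition window, the (1, P) row can see the non-null tele_no on the (1, M) row, so FV is 5550004444 for both rows of id 1.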

harini, October 10, 2013 - 2:56 am UTC

If a*0.5 + b*1.5 + c*6 = 100 and a + b + c = 100, then find the values of a, b and c using PL/SQL.
Tom Kyte
October 10, 2013 - 12:58 pm UTC

go for it, it sounds like an intriguing HOMEWORK problem.

For update nowait

Anand, November 01, 2013 - 10:43 am UTC

Hi Tom,

I have a table like below:

drop table t
/
create table t (record_id number ,date_p date)
/
insert into t values (1,'01-jan-2013')
/
insert into t values (2,'02-jan-2013')
/
insert into t values (3,'03-jan-2013')
/
insert into t values (4,'04-jan-2013')
/
insert into t values (5,'05-jan-2013')
/
insert into t values (6,'06-jan-2013')
/

If I run the below query it works fine:

select record_id from t where date_p = (select min(date_p) from t ) for update nowait

but the below query gives me an error:

select record_id
from (select record_id, date_p, min(date_p) over() min_date from t)
where date_p = min_date
for update nowait

I want to use an analytic function for query tuning but am unable to do this. Need your suggestion.
Tom Kyte
November 01, 2013 - 9:36 pm UTC

you cannot. that is not an updatable view once you add the analytic function.

and in fact, the analytic would probably "detune" this query - not tune it - in this case.

think about it - getting the min(date_p) using an index min/max scan would be around 3 IOs. Then getting that record would take about 4 IOs (3 of which would definitely be in memory since we just used that index).

whereas the analytic... well - it would likely full scan, wouldn't it...


ops$tkyte%ORA11GR2> create table t
  2  as
  3  select * from all_objects;

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> update t set created = sysdate-10000 where rownum = 1;

1 row updated.

ops$tkyte%ORA11GR2> create index t_idx on t(created);

Index created.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> set autotrace on
ops$tkyte%ORA11GR2> select object_name, created from t where created = (select min(created) from t);

OBJECT_NAME                    CREATED
------------------------------ ---------
ICOL$                          16-JUN-86


Execution Plan
----------------------------------------------------------
Plan hash value: 4019783748

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |    60 |  1980 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | T     |    60 |  1980 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN           | T_IDX |    60 |       |     1   (0)| 00:00:01 |
|   3 |    SORT AGGREGATE            |       |     1 |     8 |            |          |
|   4 |     INDEX FULL SCAN (MIN/MAX)| T_IDX |     1 |     8 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CREATED"= (SELECT MIN("CREATED") FROM "T" "T"))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        609  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte%ORA11GR2> select object_name, created from (select t.*, min(created) over () min_created from t)
  2  where created = min_created;

OBJECT_NAME                    CREATED
------------------------------ ---------
ICOL$                          16-JUN-86


Execution Plan
----------------------------------------------------------
Plan hash value: 3283722134

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      | 73021 |  2495K|   293   (1)| 00:00:04 |
|*  1 |  VIEW               |      | 73021 |  2495K|   293   (1)| 00:00:04 |
|   2 |   WINDOW BUFFER     |      | 73021 |  2353K|   293   (1)| 00:00:04 |
|   3 |    TABLE ACCESS FULL| T    | 73021 |  2353K|   293   (1)| 00:00:04 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CREATED"="MIN_CREATED")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1045  consistent gets
          0  physical reads
          0  redo size
        609  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte%ORA11GR2> set autotrace off



analytics sometimes

make queries go faster
make queries go no faster nor slower
make queries go a lot slower...


there are no silver bullets.

Find Popular day of week, Popular Hour, Popular User

A Reader, November 29, 2013 - 4:13 am UTC

Hi Tom,

I am not able to find the solution to this problem. Need help.

Create table test (day_of_week VARCHAR(10), HR VARCHAR(2), USERNAME VARCHAR2(10));

insert into test values ('Friday','00','U1') ;
insert into test values ('Friday','01','U1') ;
insert into test values ('Friday','02','U2') ;
insert into test values ('Saturday','00','U1') ;
insert into test values ('Saturday','00','U2') ;

In a single scan of the test table I need to find the popular DAY_OF_WEEK, popular HR and popular USERNAME, based on counts of how many times each value appears in the table.

The answer here is Friday, 00 and U1, as each has the MAX count of 3 against the other values.

Thanks
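
One possible single-scan approach (a sketch, not an answer from the thread) is Oracle's STATS_MODE aggregate, which returns the most frequent value of an expression in a single pass:

select stats_mode(day_of_week) popular_day,
       stats_mode(hr)          popular_hr,
       stats_mode(username)    popular_user
  from test;

Against the sample data this returns Friday, 00 and U1. Note that STATS_MODE picks one value arbitrarily when there are ties.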

Overlapping data

Reader, January 22, 2014 - 7:15 pm UTC

Tom,

I am trying to find some overlapping data in this list. I do not have a way to group this. Can you please suggest how to get the results shown below?

PLAN_MIN PLAN_MAX BASE_CD
220554 229079 AGT
220554 229079 AGT
2210A5 2210A5 AGT
2210A8 2210A8 AGT
2220A5 2220A5 AGT
2220A8 2220A8 AGT
2230A5 2230A5 AGT
2230A8 2230A8 AGT

220554 229079 BRK
220554 229079 BRK
2210A5 2210A5 BRK
2210A8 2210A8 BRK
2220A5 2220A5 BRK
2220A8 2220A8 BRK
2230A5 2230A5 BRK
2230A8 2230A8 BRK


Results
PLAN_MIN PLAN_MAX BASE_CD PLAN_MIN_1 PLAN_MAX_1
2210A5 2210A5 AGT 220554 229079
2210A5 2210A5 BRK 220554 229079
2210A8 2210A8 AGT 220554 229079
2210A8 2210A8 BRK 220554 229079
2220A5 2220A5 BRK 220554 229079
2220A5 2220A5 AGT 220554 229079
2220A8 2220A8 BRK 220554 229079
2220A8 2220A8 AGT 220554 229079
2230A5 2230A5 AGT 220554 229079
2230A5 2230A5 BRK 220554 229079
2230A8 2230A8 BRK 220554 229079
2230A8 2230A8 AGT 220554 229079

Scripts:
create table plan_details
(PLAN_MIN varchar2(20)
,PLAN_MAX varchar2(20)
, base_cd varchar2(5)
);

--AGT
insert into plan_details values ('220554' ,'229079' , 'AGT');
insert into plan_details values ('2210A5' ,'2210A5' , 'AGT');
insert into plan_details values ('2210A8' ,'2210A8' , 'AGT');
insert into plan_details values ('2220A5' ,'2220A5' , 'AGT');
insert into plan_details values ('2220A8' ,'2220A8' , 'AGT');
insert into plan_details values ('2230A5' ,'2230A5' , 'AGT');
insert into plan_details values ('2230A8' ,'2230A8' , 'AGT');
--BRK
insert into plan_details values ('220554' ,'229079' , 'BRK');
insert into plan_details values ('220554' ,'229079' , 'BRK');
insert into plan_details values ('2210A5' ,'2210A5' , 'BRK');
insert into plan_details values ('2210A8' ,'2210A8' , 'BRK');
insert into plan_details values ('2220A5' ,'2220A5' , 'BRK');
insert into plan_details values ('2220A8' ,'2220A8' , 'BRK');
insert into plan_details values ('2230A5' ,'2230A5' , 'BRK');
insert into plan_details values ('2230A8' ,'2230A8' , 'BRK');

commit;
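
A sketch of one possible reading (not from the thread): treat rows where plan_min = plan_max as individual plans and rows where plan_min <> plan_max as ranges, then join each plan to the range that contains it for the same base_cd. This assumes plain string comparison is the intended ordering, which happens to hold for this sample data:

select distinct s.plan_min, s.plan_max, s.base_cd,
       r.plan_min plan_min_1, r.plan_max plan_max_1
  from plan_details s
       join plan_details r
         on r.base_cd = s.base_cd
        and r.plan_min <> r.plan_max                      -- r is a range row
        and s.plan_min = s.plan_max                       -- s is a single plan
        and s.plan_min between r.plan_min and r.plan_max  -- plan falls inside the range
 order by s.plan_min, s.base_cd;

DISTINCT collapses the duplicated range rows in the sample data, giving the 12 rows shown in the expected results.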


reader

reader, July 11, 2014 - 4:56 pm UTC

Tom,

I need some help with the LAG and LEAD analytic functions.

I want to get either the lag or lead row that matches a condition in the where clause.
For example: get the salary of an employee where the difference between the current row and the selected row is at least 1000. Is there a way to get this using an analytic function?

Thanks

Thanks for the insight

SlothB77, February 22, 2018 - 3:02 pm UTC

So, in other words there is no way to avoid a subquery when you want to add a WHERE clause that references the result of an analytic function.

I.e.,

select zip_code, 
date, 
temperature, 
avg(temperature) over (partition by zip_code) average_temperature
from zip_temps
where temperature > average_temperature


won't work, but

select * from
(
select zip_code, 
date, 
temperature, 
avg(temperature) over (partition by zip_code) average_temperature
from zip_temps
)
where temperature > average_temperature


will work.

Here I just want to return the rows where the recorded temperature for a certain zip code on a certain date is higher than the average temperature across all dates for that zip code, assuming the data set includes temperatures recorded for multiple zip codes over multiple dates.

I want to improve the aesthetics of the query by avoiding the use of a subquery, but I guess that isn't possible.
Chris Saxon
February 22, 2018 - 4:44 pm UTC

Correct.

You could go with subquery factoring:

with temps as (
  select zip_code, date, temperature, 
         avg(temperature) over (partition by zip_code) average_temperature
  from   zip_temps
)
  select * from temps
  where  temperature > average_temperature


I find the aesthetics of it better than inline views :)

Analytic over (partition by

Asim, September 11, 2022 - 9:35 am UTC


Can we safely say that

....query text... analyticfunctionthatallowsorderby ()
OVER(PARTITION BY col1 ORDER BY col1, col2....rest of the query...

is always equal to

....query text... analyticfunctionthatallowsorderby ()
OVER(PARTITION BY col1 ORDER BY col2....rest of the query...

for all intents and purposes?





Connor McDonald
September 12, 2022 - 3:13 am UTC

Nope.

Just because we partition by "X" does not guarantee that we will order by "X"

It's similar to this discussion

https://asktom.oracle.com/pls/apex/asktom.search?tag=does-a-group-by-gaurantee-that-output-data-is-sorted

ie, just because it does today does not mean it will tomorrow (or ever)

But why

Asim, September 12, 2022 - 4:54 am UTC


That link is about GROUP BY and the query's final ORDER BY.

Here it is the analytic window's ORDER BY, which just orders within a partition.

Here, when the analytic window is partitioned by col1, all rows of the partition have the same value of col1, so why do we need to ORDER BY col1, col2 and not just col2?
Connor McDonald
September 14, 2022 - 10:08 am UTC

That link is about GROUP BY and the query's final order by.

Sigh.... As I said: "It's similar to this discussion"

That link is MORE than just that. It is a statement that anything you do not explicitly order by has NO guarantees to be ordered.

Which means the same here

OVER(PARTITION BY col1 ORDER BY col1, col2
OVER(PARTITION BY col1 ORDER BY col2

could end up being quite different. The raw data will be the same, but the sorting in the final result could be same or different.

Right

Asim, September 14, 2022 - 2:55 pm UTC

Oh yes, you are right. Actually I was only focusing on the ordering inside the analytic window, and therefore meant that in terms of raw data the results will be the same.

OK now, if they are the same in terms of raw data, then what about performance? Or is Oracle smart enough to transform the query such that unnecessary columns (i.e. the partition key) are removed from the order by?

Please reply

Asim, September 30, 2022 - 1:17 pm UTC

Can you please reply to the above? I am very curious about it.

You can suppose that the final order by is on a unique key.

Connor McDonald
October 03, 2022 - 3:34 am UTC

Anything you do not explicitly order by has NO guarantees to be ordered.

Anything you do explicitly order by is guaranteed to be ordered.
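
A small illustration of that point (my own sketch, using scott.emp): within a deptno partition, deptno is constant, so the two orderings define the same analytic values; only the query's final ORDER BY guarantees the order of the result rows.

select empno, deptno, sal,
       sum(sal) over (partition by deptno order by deptno, sal, empno) s1,
       sum(sal) over (partition by deptno order by sal, empno) s2
  from emp
 order by empno;

s1 and s2 are equal on every row; whether the optimizer avoids a redundant sort for the longer ordering is a separate (and version-dependent) question.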

how to make this query to work with analytic function

Brian, December 12, 2023 - 5:30 am UTC

I am trying to write this query to get the total count of sessions by inst_id and username, along with the ACTIVE status count and INACTIVE status count. But this query does not produce the correct output and seems to be missing something. Can you please show how to get this working with an analytic function. Thanks.


select inst_id, username,
count(*) over (partition by inst_id, username) total_cnt,
count( CASE WHEN status='ACTIVE' THEN 1 END) over (partition by inst_id, username) active_cnt,
count( CASE WHEN status='INACTIVE' THEN 1 END) over (partition by inst_id, username) inactive_cnt
from gv$session a
where inst_id=2
--group by inst_id, username
order by 3 DESC
/

Connor McDonald
January 15, 2024 - 5:14 am UTC

Why an analytic? It seems to me you just want

select inst_id, username,
count(*) total_cnt,
count( CASE WHEN status='ACTIVE' THEN 1 END) active_cnt,
count( CASE WHEN status='INACTIVE' THEN 1 END) inactive_cnt
from gv$session a
where inst_id=2
group by inst_id, username
order by 3 DESC
/
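
If the analytic form really is wanted (an assumption on my part), the catch is that analytics keep one row per session, so the same per-user counts repeat on every session row. A DISTINCT collapses them:

select distinct inst_id, username,
       count(*) over (partition by inst_id, username) total_cnt,
       count( CASE WHEN status='ACTIVE' THEN 1 END) over (partition by inst_id, username) active_cnt,
       count( CASE WHEN status='INACTIVE' THEN 1 END) over (partition by inst_id, username) inactive_cnt
  from gv$session
 where inst_id = 2
 order by 3 DESC;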
