Question and Answer

Tom Kyte

Thanks for the question, Rich.

Asked: March 24, 2002 - 11:28 pm UTC

Answered by: Tom Kyte - Last updated: February 22, 2018 - 4:44 pm UTC

Category: Database - Version: 8.1.7

Viewed 50K+ times!

You Asked

Tom,

Thanks for providing this forum for answering questions. I'm trying to get the hang of analytic functions and I'm having a problem with the following code. I have to comment out "where dr <= 3)" to get it to work. "dr" shows up as the column heading, but when I add my conditional statement I get "invalid column name". Help!

Ps. I have your book and I hope you sign on the dotted line for the next one.

select *
FROM (select candidate_id,
deptid,
actual_start_date,
annual_sal_amt,
dense_rank()
over (partition by deptid
order by annual_sal_amt desc) dr
from obapp1.offer
where actual_start_date >= '01-JAN-2002'
and annual_sal_amt > 0)
-- where dr <= 3)
order by deptid,
annual_sal_amt desc


and we said...

scott@ORA817DEV.US.ORACLE.COM> select *
2 FROM (select empno,
3 deptno,
4 hiredate,
5 sal,
6 dense_rank()
7 over (partition by deptno
8 order by sal desc) dr
9 from emp
10 where hiredate >= to_date( '01-JAN-1980', 'dd-mon-yyyy' )
11 and sal > 0)
12 where dr <= 3
13 order by deptno, sal desc
14 /

EMPNO DEPTNO HIREDATE SAL DR
---------- ---------- --------- ---------- ----------
7839 10 17-NOV-81 5000 1
7782 10 09-JUN-81 2450 2
7934 10 23-JAN-82 1300 3
7788 20 09-DEC-82 3000 1
7902 20 03-DEC-81 3000 1
7566 20 02-APR-81 2975 2
7876 20 12-JAN-83 1100 3
7698 30 01-MAY-81 2850 1
7499 30 20-FEB-81 1600 2
7844 30 08-SEP-81 1500 3

10 rows selected.

You have to apply the constraint against the analytic function (what you call DR) outside of the inline view. The analytic functions have to be evaluated AND THEN you can "where" on them. Simply put the where dr <= 3 where I have it.
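The same pattern can be sketched outside Oracle too. Below is a minimal, hypothetical example against SQLite (which also supports `dense_rank()` in recent versions); the table and data are made up, but the shape is the same: the predicate on the analytic alias has to sit outside the inline view, because the alias only becomes visible one query level up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INT, deptno INT, sal INT);
    INSERT INTO emp VALUES
        (1, 10, 5000), (2, 10, 2450), (3, 10, 1300), (4, 10, 800),
        (5, 20, 3000), (6, 20, 2975), (7, 20, 1100), (8, 20, 950);
""")

# The WHERE on dr lives OUTSIDE the inline view: the analytic function
# is computed after the inner query, so its alias is only usable here.
rows = conn.execute("""
    SELECT empno, deptno, sal, dr
      FROM (SELECT empno, deptno, sal,
                   dense_rank() OVER (PARTITION BY deptno
                                      ORDER BY sal DESC) dr
              FROM emp)
     WHERE dr <= 3
     ORDER BY deptno, sal DESC
""").fetchall()

for r in rows:
    print(r)
```

Moving `WHERE dr <= 3` into the inner query would raise "no such column: dr", which is the SQLite spelling of the ORA-00904 the question ran into.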


Reviews

Analytic Functions

March 25, 2002 - 2:07 pm UTC

Reviewer: Rich Sekunda from Montvale, NJ

Thanks for pointing that out. These new (for me and my application) functions are full of possibilities but will take a little getting used to.

Again, thanks for your help!

grouping analytic functions

September 11, 2002 - 12:19 pm UTC

Reviewer: A reader

hi tom

I have this query

select deptno,job, sum(sal), sum(sal) over ()
from emp
group by deptno, job, sal

it works, but I don't understand why it shouldn't be

select deptno,job, sum(sal), sum(sal) over ()
from emp
group by deptno, job, sum(sal) over ()


???

Tom Kyte

Followup  

September 11, 2002 - 12:32 pm UTC

because analytics are done last. You probably mean to do:

1 select a.*, sum(sal) over ()
2 from (
3 select deptno,job, sum(sal) sal
4 from emp
5 group by deptno, job
6* ) a
scott@ORA920.US.ORACLE.COM> /

DEPTNO JOB SAL SUM(SAL)OVER()
------ --------- ---------- --------------
10 CLERK 1300 29025
10 MANAGER 2450 29025
10 PRESIDENT 5000 29025
20 ANALYST 6000 29025
20 CLERK 1900 29025
20 MANAGER 2975 29025
30 CLERK 950 29025
30 MANAGER 2850 29025
30 SALESMAN 5600 29025

9 rows selected.


I think
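The "analytics are done last" point can be sketched with a made-up table (SQLite here so the demo is self-contained; the column names are assumptions): aggregate first in an inline view, then let the analytic total run across the already-grouped rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (deptno INT, job TEXT, sal INT);
    INSERT INTO emp VALUES
        (10, 'CLERK', 1300), (10, 'MANAGER', 2450),
        (20, 'CLERK', 800),  (20, 'CLERK', 1100),
        (20, 'MANAGER', 2975);
""")

# GROUP BY collapses to one row per (deptno, job); the analytic
# sum(...) OVER () then runs over those grouped rows.
rows = conn.execute("""
    SELECT a.deptno, a.job, a.sal, sum(a.sal) OVER () AS total
      FROM (SELECT deptno, job, sum(sal) AS sal
              FROM emp
             GROUP BY deptno, job) a
     ORDER BY a.deptno, a.job
""").fetchall()
```

Every row carries the same grand total, because the empty `OVER ()` window spans the whole grouped result.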

what the difference between

September 11, 2002 - 3:23 pm UTC

Reviewer: A reader

select a.*, sum(sal) over ()
from (
select deptno, sum(sal) sal
from emp
where deptno in (10, 20)
group by deptno, sal) a

and

select deptno, sum(sal), sum(sal) over ()
from emp
where deptno in (10, 20)
group by deptno, sal

?

Tom Kyte

Followup  

September 11, 2002 - 3:43 pm UTC

besides that they give different results in general?


ops$tkyte@ORA920.US.ORACLE.COM> select * from emp
  2  /

DEPTNO        SAL
------ ----------
    10        100
    10        100
    10        100
    10        100
    10        100
    20        200
    20        200
    20        200
    20        200

9 rows selected.

ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> select a.*, sum(sal) over ()
  2  from (
  3  select deptno, sum(sal) sal
  4  from emp
  5  where deptno in (10, 20)
  6  group by deptno, sal) a
  7  /

DEPTNO        SAL SUM(SAL)OVER()
------ ---------- --------------
    10        500           1300
    20        800           1300

ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> select deptno, sum(sal), sum(sal) over ()
  2  from emp
  3  where deptno in (10, 20)
  4  group by deptno, sal
  5  /

DEPTNO   SUM(SAL) SUM(SAL)OVER()
------ ---------- --------------
    10        500            300
    20        800            300

ops$tkyte@ORA920.US.ORACLE.COM>


The query:

select deptno, sum(sal), sum(sal) over ()
from emp
where deptno in (10, 20)
group by deptno, sal

gets the result set:

select deptno, sum(sal), SAL
  from emp
 where deptno in ( 10, 20 )
 group by DEPTNO, SAL

and then applies the analytic function (they are dead last) to the SAL column -- which, since you grouped by it, rolls all of the same values into one row. Doing that, you get the sum of the DISTINCT sals.


I got the sum of the summed sals.
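The difference is easy to reproduce; here is a sketch with the same toy data (SQLite stands in for Oracle): grouping by `sal` collapses the duplicates before the analytic runs, while summing in an inline view first hands the analytic the per-department totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (deptno INT, sal INT);
    INSERT INTO emp VALUES
        (10, 100), (10, 100), (10, 100), (10, 100), (10, 100),
        (20, 200), (20, 200), (20, 200), (20, 200);
""")

# GROUP BY deptno, sal collapses duplicates BEFORE the analytic runs,
# so sum(sal) OVER () sees one 100-row and one 200-row: 300.
grouped = conn.execute("""
    SELECT deptno, sum(sal), sum(sal) OVER ()
      FROM emp
     GROUP BY deptno, sal
     ORDER BY deptno
""").fetchall()

# Summing in an inline view first gives the sum of the summed sals: 1300.
summed = conn.execute("""
    SELECT a.deptno, a.sal, sum(a.sal) OVER ()
      FROM (SELECT deptno, sum(sal) AS sal
              FROM emp
             GROUP BY deptno) a
     ORDER BY a.deptno
""").fetchall()
```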


 

conditional analytic functions

October 30, 2002 - 11:26 am UTC

Reviewer: raju

Can the analytic function be evaluated based on a condition for that function only?

We have surveys; I need to display the survey status (live or not).


SQL> 
SQL>    SELECT
  2     sur.ppl_survey_question_start_date  surve_start_date,
  3     sur.ppl_survey_question_desc survey_desc,
  4     decode(SUBSTR(dr,9),ROWID,'Live',NULL) survey_status
  5     FROM
  6     (
  7     SELECT
  8     ppl_survey_question_start_date,
  9     ppl_survey_question_desc ,
 10     MAX(to_char(ppl_survey_question_start_date,'YYYYMMDD') || ROWID) over(PARTITION BY ppl_division_id) dr
 11     FROM
 12     ppl_survey_question_ref
 13     ) sur
 14  /

SURVE_START_DATE SURVEY_DESC                                                                      SURVEY_STATUS
---------------- -------------------------------------------------------------------------------- -------------
10/25/2002       What do you think?                                                               
10/26/2002       What do you think?                                                               
10/27/2002       What do you think?                                                                               
11/1/2002        How many questions are in this survey?                                           Live


I need to display the survey status as "Live" only when the survey start date <= sysdate; surveys whose start dates are greater than sysdate shouldn't be live.

In the above query, the survey with start date 10/27/2002 should be live.

i wrote the following query:
SQL>    SELECT
  2     sur.ppl_survey_question_start_date  surve_start_date,
  3     sur.ppl_survey_question_desc survey_desc,
  4     decode(SUBSTR(dr,9),ROWID,'Live',NULL) survey_status
  5     FROM
  6     (
  7     SELECT
  8     ppl_survey_question_start_date,
  9     ppl_survey_question_desc ,
 10     MAX(to_char(decode(sign(ppl_survey_question_start_date-trunc(SYSDATE)),1,to_date('01010001','ddmmyyyy'),ppl_survey_question_start_date),'YYYYMMDD') || ROWID) over(PARTITION BY ppl_division_id) dr
 11     FROM
 12     ppl_survey_question_ref
 13     ) sur
 14  /

SURVE_START_DATE SURVEY_DESC                                                                      SURVEY_STATUS
---------------- -------------------------------------------------------------------------------- -------------
10/25/2002       What do you think?                                                               
10/26/2002       What do you think?                                                               
10/27/2002       What do you think?                                                               Live
11/1/2002        How many questions are in this survey? 


i want to know, is there a better way to do this?

Also, would dense_rank be better than the max function?

Tom Kyte

Followup  

October 31, 2002 - 8:13 pm UTC

I'm not sure why you are using analytics here at all???


select ppl_survey_question_start_date, ppl_survey_question_desc,
case when ppl_survey_question_start_date <= sysdate then 'LIVE'
else 'NOT LIVE'
end status
from ppl_survey_question_ref
/



conditional analytic functions

October 31, 2002 - 9:22 pm UTC

Reviewer: raju

tom,

I don't want all the surveys to be live; I want a survey to be live when its start_date is the max start date less than today for its division.

We can only have one live survey at a time for a division.

for example

if today is : 10/31/2002

survey id  start date  status  Division
---------  ----------  ------  --------
        1  10/27/2002  Live    abc
        2  11/2/2002           abc
        3  10/26/2002  Live    cde
        4  12/1/2002           abc
        5  10/24/2002          cde
        6  11/21/2002          cde
        7  10/26/2002          abc

thanks



Tom Kyte

Followup  

October 31, 2002 - 9:41 pm UTC

then your selecting of

max( start_date || rowid ) over ( partition by division) max_start_rowid



was good; you just need to:


select ...
decode( start_date || rowid, max_start_rowid, 'LIVE', null ) status
from ( .....
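A runnable sketch of the trick (SQLite's `rowid` and `CASE` stand in for Oracle's ROWID and DECODE; the table, columns, and dates below are invented): concatenating the rowid onto the date makes the max value unique, so exactly one row per division can match it and go 'Live'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey (id INT, start_date TEXT, division TEXT);
    INSERT INTO survey VALUES
        (1, '2002-10-27', 'abc'), (2, '2002-11-02', 'abc'),
        (7, '2002-10-26', 'abc'), (3, '2002-10-26', 'cde'),
        (5, '2002-10-24', 'cde'), (6, '2002-11-21', 'cde');
""")

today = '2002-10-31'

# Only rows already started compete for the max; the winning
# start_date || rowid value is unique, so one row per division matches.
rows = conn.execute("""
    SELECT id, division,
           CASE WHEN start_date || rid = mx THEN 'Live' END AS status
      FROM (SELECT rowid AS rid, id, start_date, division,
                   max(CASE WHEN start_date <= :today
                            THEN start_date || rowid END)
                       OVER (PARTITION BY division) AS mx
              FROM survey)
     ORDER BY id
""", {"today": today}).fetchall()

live = [r[0] for r in rows if r[2] == 'Live']
```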

unable to allocate memory error with analytic function

November 18, 2002 - 3:31 pm UTC

Reviewer: raju

the following ref cursor with an analytic function gives an error:

ORA-04031: unable to allocate 4200 bytes of shared memory ("shared pool","SELECT ppl_content_typ_c ...","library cache","kkslpkp - literal info.")
ORA-06512: at "PCSFB.PPL_HOME_PKG", line 78 ORA-06512: at line 1


OPEN o_content_details FOR
SELECT
ppl_content_typ_c content_type,
ppl_content_hdng content_heading,
ppl_content_t content_desc,
ppl_content_url content_url,
ppl_video_url video_url,
ppl_audio_url audio_url
FROM
(
SELECT
a.ppl_content_typ_c,
a.ppl_content_hdng,
a.ppl_content_t,
a.ppl_content_url,
a.ppl_video_url,
a.ppl_audio_url,
dense_rank() over (PARTITION BY a.ppl_content_typ_c ORDER BY decode(ppl_divsn_id_c ,'ALL',1,0) ASC,a.ppl_content_eff_d DESC NULLS LAST) dr
FROM ppl_tcontent a WHERE a.ppl_content_eff_d <= v_curr_date AND a.ppl_divsn_id_c IN ('ALL',v_divsn_id)) WHERE dr = 1;



I don't understand why it is giving that error, as this is static SQL.
Can you guide me on where I can use bind variables to avoid this error?

purpose of the query is : we have contents for each division, we need to pick the content that has max effective date and which is less than sysdate



Tom Kyte

Followup  

November 18, 2002 - 8:52 pm UTC

you've trashed the shared pool -- it is full. Flush it or restart (so it's "clean") and this query won't do that. It's in the parse phase, trying to allocate library cache memory, and you've plain run out.

Most likely you aren't using bind variables elsewhere in your system, or your shared pool really is too small (the latter is very rare; the bind variable problem, unfortunately, is not).

What are the side-effects of Flushing the shared Pool ?

November 19, 2002 - 2:47 am UTC

Reviewer: pasko from Hamburg,Germany

Hi Tom,
Thanks for a Great Website + Great Answers!

Besides the fact that flushing the shared pool would remove all cached PL/SQL and cursors, what are the other side-effects of flushing the shared pool?


Thanks and best regards.




Tom Kyte

Followup  

November 19, 2002 - 7:25 am UTC

that is the side effect.

if you find you NEED to flush the shared pool on a recurring basis - you are killing performance, limiting scalability and need to find the app that isn't using bind variables and fix it.
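The bind-variable principle is not Oracle-specific: one parameterized statement text executed many times is prepared once and reused, while a fresh literal text per value forces a new parse (and, in Oracle, a new shared-pool entry) each time. A hedged sketch with SQLite placeholders, using a made-up table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")

# Bad: every execution is a distinct statement text. In Oracle, each
# of these would be hard-parsed and cached as a separate cursor.
for i in range(3):
    conn.execute(f"INSERT INTO t (val) VALUES ('row{i}')")

# Good: one statement text, values bound per execution, so the
# prepared statement can be reused.
conn.executemany("INSERT INTO t (val) VALUES (?)", [("row3",), ("row4",)])

count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
```

As the followup below notes, static SQL inside PL/SQL already binds everything for you; the literal-per-value pattern usually creeps in through dynamically built strings.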

unable to allocate memory error with analytic function

November 19, 2002 - 10:06 am UTC

Reviewer: raju

Tom,

Thanks for the response. Even when we flush the shared pool, after some time we keep getting this error.

Earlier (for around a month) it was working well; all of a sudden we started getting this error.

This is the rest of the stored procedure code. Can you please tell me, am I not using bind variables anywhere?

BEGIN
--
SELECT
ppl_survey_question_id,
ppl_survey_question_desc
INTO
v_survey_id_c,
v_survey_t
FROM
(
SELECT
ppl_survey_question_id,
ppl_survey_question_desc,
ppl_division_id,
ppl_survey_question_start_date
FROM
ppl_survey_question_ref
WHERE ppl_division_id IN ('ALL',v_divsn_id)
AND ppl_survey_question_start_date <= TRUNC(SYSDATE)
ORDER BY decode(ppl_division_id,'ALL',1,0) ASC, ppl_survey_question_start_date DESC NULLS LAST
)
WHERE
/*NOT EXISTS (SELECT 1 FROM ppl_tem_survey_choice emsr
WHERE emsr.ppl_survey_id_c = srv.ppl_survey_id_c
AND emsr.sid_em_id_c = i_emp_id)
AND*/
ROWNUM = 1;
BEGIN
SELECT
'Y'
INTO
v_survey_answered_ind
FROM
ppl_employee_survey_answer res
WHERE res.sid_employee_id = i_emp_id
AND res.ppl_survey_question_id = v_survey_id_c
AND rownum = 1;
EXCEPTION
WHEN no_data_found THEN
v_survey_answered_ind := 'N';
END;
--
OPEN o_survey FOR
SELECT
v_survey_id_c survey_id,
v_survey_t survey_desc,
v_survey_answered_ind question_answered
FROM dual;
--
OPEN o_survey_choice FOR
SELECT
chc.ppl_survey_answer_id choice_seq,
chc.ppl_survey_answer_desc choice_desc
FROM
ppl_survey_answer_ref chc
WHERE chc.ppl_survey_question_id = v_survey_id_c;
--
EXCEPTION
WHEN no_data_found THEN
OPEN o_survey FOR
SELECT 1 FROM dual WHERE 1 = 0;
OPEN o_survey_choice FOR
SELECT 1 FROM dual WHERE 1 = 0;
END;


thanks


Tom Kyte

Followup  

November 19, 2002 - 5:22 pm UTC

see

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1163635055580

to see if you are using binds.

STATIC sql in plsql is always 100% bind friendly.

I don't get the same results as in your book, p 546

February 11, 2003 - 10:32 am UTC

Reviewer: Jon from Hartford, CT USA

At first I thought I was typing it in wrong, but I downloaded the .sql file and did a cut and paste.  Why am I not getting my running totals as shown in the book?

SQL> break on deptno skip 1
SQL> select ename, deptno, sal,
  2         sum(sal) over
  3                (order by deptno, ename) running_total,
  4         sum(sal) over
  5                (partition by deptno
  6                 order by ename) department_total,
  7         row_number() over
  8                (partition by deptno
  9                 order by ename  ) seq
 10  from emp
 11  order by deptno, ename
 12  /

ENAME                   DEPTNO       SAL RUNNING_TOTAL DEPARTMENT_TOTAL       SEQ
-------------------- --------- --------- ------------- ---------------- ---------
Smith                        1        70        106000           106000         1
Smith                                 90        106000           106000         2
Smith                                110        106000           106000         3
Smith                                130        106000           106000         4
Smith                                150        106000           106000         5
Smith                                170        106000           106000         6
Smith                                190        106000           106000         7
Smith                                210        106000           106000         8
Smith                                230        106000           106000         9
Smith                                250        106000           106000        10  

Tom Kyte

Followup  

February 11, 2003 - 4:50 pm UTC

and so what does select * from EMP look like to you?


looks like you have a whole lotta SMITHs in there...

Answer to your question

February 12, 2003 - 8:11 am UTC

Reviewer: Jon from Hartford, CT USA

Yes, I just generated some test data using something like

begin
for i in 1..10 loop
for j in 1..100 loop
  insert into emp values ('Smith',i, j*70);
end loop;
end loop;
end;
/

I never committed the data.  My purpose was to get timing statistics in our environment similar to what you reflected in your book.  But when I got the questionable results, I wondered why.

Using the above,
SQL> select * from emp where rownum < 6;

ENAME                   DEPTNO       SAL
-------------------- --------- ---------
Smith                        1        70
Smith                                140
Smith                                210
Smith                                280
Smith                                350

and the first few rows from the query are:
SQL> select ename, deptno, sal,
  2         sum(sal) over
  3                (order by deptno, ename) running_total,
  4         sum(sal) over
  5                (partition by deptno
  6                 order by ename) department_total,
  7         row_number() over
  8                (partition by deptno
  9                 order by ename  ) seq
 10  from emp
 11  order by deptno, ename
 12  /

ENAME                   DEPTNO       SAL RUNNING_TOTAL DEPARTMENT_TOTAL       SEQ
-------------------- --------- --------- ------------- ---------------- ---------
Smith                        1        70        353500           353500         1
Smith                                140        353500           353500         2
Smith                                210        353500           353500         3
Smith                                280        353500           353500         4
Smith                                350        353500           353500         5
 

Tom Kyte

Followup  

February 12, 2003 - 8:38 am UTC

they all have the same ENAMES -- when you order by them -- they are all the SAME. Mine worked simply because the ENAMES are unique in the scott/tiger EMP table.

add

, rowid


to the query:

select ename, deptno, sal,
2 sum(sal) over
3 (order by deptno, ename, ROWID) running_total,
4 sum(sal) over
5 (partition by deptno
6 order by ename, ROWID ) department_total,
7 row_number() over
8 (partition by deptno
9 order by ename, ROWID ) seq
10 from emp
11 order by deptno, ename
12 /


or more simply, just run demobld.sql to get the real tables loaded up if you want to follow the examples blow by blow
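The effect is easy to reproduce. Here is a small sketch (SQLite, invented data) of why duplicate ENAMEs flatten the running total, and how the ROWID tiebreaker restores it: with `ORDER BY ename` alone, all rows are peers of one another, and the default window frame includes peers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (ename TEXT, deptno INT, sal INT);
    INSERT INTO emp VALUES
        ('Smith', 1, 70), ('Smith', 1, 90), ('Smith', 1, 110);
""")

# Same sort key on every row => every row is a peer of every other,
# and the default RANGE frame includes peers: each row gets the total.
peers = [r[0] for r in conn.execute(
    "SELECT sum(sal) OVER (ORDER BY ename) FROM emp")]

# Adding rowid makes the sort key unique, so the sum really runs.
running = [r[0] for r in conn.execute(
    "SELECT sum(sal) OVER (ORDER BY ename, rowid) FROM emp ORDER BY rowid")]
```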

Thanks

February 12, 2003 - 8:57 am UTC

Reviewer: Jon from Hartford, CT USA

I knew there had to be a reason. That's a really important thing to know, and it hadn't occurred to me. In the real world, we might end up with duplicate last names (although I don't live in "Smith-ville") and then you really won't get a running total unless you use the technique you just explained.

As always, you had the answer - you're the best, Tom!

Analytic Function in PL/SQL ref_cursor

May 08, 2003 - 5:24 pm UTC

Reviewer: Dale Ware from St. Louis, Missouri

Tom,

I run the following anonymous PL/SQL block in my 8i instances.

set term off serverout off feed off timing off head off pause off arraysize 100
col owner form a16
col index_owner form a16
col blevel form 99,999
col leaf_blocks head 'Leaf|Blocks'
col blocks head 'Table|Blocks'
ttitle skip2 "-- Top 10 Indexes which are potential Candidates for Rebuilding" skip 2
variable x refcursor;
set term on serverout on head on pagesize 44 linesize 132 autoprint on
declare
pi_version varchar2(20);
pi_compatibility varchar2(20);
begin
dbms_utility.db_version(pi_version,pi_compatibility);
IF pi_version > '8.1.0.0.0' THEN
OPEN :x FOR 'select owner, index_name, leaf_blocks , blevel, tablespace_name, blocks
from (select idx.owner, index_name, leaf_blocks,
blevel, idx.tablespace_name, blocks,
rank () over (order by leaf_blocks desc nulls last) rank
from dba_indexes idx,
dba_tables tbl
where idx.table_owner = tbl.owner
and idx.table_name = tbl.table_name
and idx.tablespace_name is not null
and idx.leaf_blocks > (tbl.blocks * .75))
where rank <= 10';
END IF;
end;
/
set term off serverout off head off


However, when I run this in 8.0 instances it throws the following error:


OPEN :x FOR 'select owner, index_name, leaf_blocks , blevel, tablespace_name, blocks
*
ERROR at line 7:
ORA-06550: line 7, column 19:
PLS-00103: Encountered the symbol "select owner, index_name, leaf_blocks , blevel, tablespace_name," when expecting one of the
following:
select

Is it possible to run this in 8.0 and below? If so how?

Thanks,
Dale

does this require analytic functions or can it be done without them

May 21, 2003 - 2:43 pm UTC

Reviewer: Asim Naveed from Pakistan

I am trying to write a SQL query for a question, but I failed to make it work. I thought maybe analytic functions could do something. Here are the CREATE and INSERT statements.

Create table T1
(
QUARTER VARCHAR2(100),
DIVISION VARCHAR2(100),
SALES NUMBER(10)
);

INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div1', 3);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div1', 3);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div2', 13);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div2', 13);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div3', 43);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div3', 43);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1998', 'Div4', 23);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1998', 'Div1', 43);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1998', 'Div2', 33);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1998', 'Div3', 33);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1998', 'Div4', 4);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1998', 'Div1', 4);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1998', 'Div2', 23);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1998', 'Div3', 2);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1998', 'Div4', 1);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1998', 'Div1', 1);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1998', 'Div2', 23);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1998', 'Div3', 2);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1998', 'Div4', 21);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1999', 'Div1', 123);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1999', 'Div2', 3);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1999', 'Div3', 4);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr1-1999', 'Div4', 6);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1999', 'Div1', 7);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1999', 'Div2', 8);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1999', 'Div3', 6);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr2-1999', 'Div4', 5);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1999', 'Div1', 5);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1999', 'Div2', 4);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1999', 'Div3', 6);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr3-1999', 'Div4', 7);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1999', 'Div1', 5);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1999', 'Div2', 55);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1999', 'Div3', 36);
INSERT INTO T1 (QUARTER, DIVISION, SALES) VALUES ('Qtr4-1999', 'Div4', 43);


As you can see, there are four divisions in each quarter. The question is: show all those quarters in which the sales of Div1 are greater than the sales of Div2. The output should look like this:

QUARTER DIVISION SUM(SALES)
---------------------------------------------
Qtr1-1999 Div1 123
Qtr1-1999 Div2 3
Qtr2-1998 Div1 43
Qtr2-1998 Div2 33
Qtr3-1999 Div1 5
Qtr3-1999 Div2 4


To achieve the above result I wrote the following query:


Select quarter, division, sum(sales) from T1 where division in ('Div1', 'Div2') GROUP BY quarter, division having quarter
IN
(
SELECT quarter
FROM T1
GROUP BY QUARTER
HAVING SUM(DECODE(division, 'Div1', sales, 0))
> SUM(decode(division, 'Div2', sales, 0))
)

This query is working fine, but for some reason I want a query without DECODE/CASE which returns the same result. Could analytic functions help?

Thanks in advance



Tom Kyte

Followup  

May 21, 2003 - 3:14 pm UTC

well, the right way to do this is in one pass. You get more pleasing output as well. We could unpivot the data using DUAL if necessary, but DECODE is the best answer. Just to make you happy, here is a non-decode (but slower) version as well.

ops$tkyte@ORA920> select quarter,
  2         sum(decode(division,'Div1',sales,null)) Div1,
  3         sum(decode(division,'Div2',sales,null)) div2
  4    from t1
  5   group by quarter
  6  having sum(decode(division,'Div1',sales,null)) > sum(decode(division,'Div2',sales,null))
  7  /

QUARTER          DIV1       DIV2
---------- ---------- ----------
Qtr1-1999         123          3
Qtr2-1998          43         33
Qtr3-1999           5          4

ops$tkyte@ORA920>
ops$tkyte@ORA920> select quarter,
  2         sum(instr(division,'Div1')*sales) Div1,
  3         sum(instr(division,'Div2')*sales) Div2
  4    from t1
  5  group by quarter
  6  having sum(instr(division,'Div1')*sales) > sum(instr(division,'Div2')*sales)
  7  /

QUARTER          DIV1       DIV2
---------- ---------- ----------
Qtr1-1999         123          3
Qtr2-1998          43         33
Qtr3-1999           5          4

ops$tkyte@ORA920>
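A quick runnable check of the conditional-aggregation pivot, with `CASE` in place of DECODE (SQLite sketch, loading only a subset of the thread's data): each division is pivoted into its own column, then the two sums are compared per quarter in the HAVING clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (quarter TEXT, division TEXT, sales INT);
    INSERT INTO t1 VALUES
        ('Qtr1-1999', 'Div1', 123), ('Qtr1-1999', 'Div2', 3),
        ('Qtr2-1999', 'Div1', 7),   ('Qtr2-1999', 'Div2', 8);
""")

# Pivot each division into its own column, then compare the two
# sums per quarter; only quarters where Div1 > Div2 survive.
rows = conn.execute("""
    SELECT quarter,
           sum(CASE WHEN division = 'Div1' THEN sales END) AS div1,
           sum(CASE WHEN division = 'Div2' THEN sales END) AS div2
      FROM t1
     GROUP BY quarter
    HAVING sum(CASE WHEN division = 'Div1' THEN sales END) >
           sum(CASE WHEN division = 'Div2' THEN sales END)
""").fetchall()
```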


 

reason of not using decode

May 22, 2003 - 3:46 am UTC

Reviewer: Asim Naveed from Pakistan

Thanks for your reply.
I am using Oracle Version 9i Release 2

Although your answer produces the information without using decode, my problem is still not solved. The reason I avoid decode/case (and now also instr) is that I cannot break the expression SUM(sales) into SUM(DECODE(division, 'Div1', sales, null)): I am writing dynamic SQL, and the expression "sum(sales)" comes from a table at run time. It could be anything, e.g. SUM(sales), SUM(price*volume), COUNT(DISTINCT salesid), COUNT(DISTINCT customerid), etc.

see

SELECT
quarter,
division,
"here goes the full expression which is coming from table"

Also, my output requirement is to get divisions on rows instead of columns (how can we do this with DUAL?).

Today I have written the following query which does exactly what I want, but if possible I need a better query.

SELECT quarter, division, SUM(sales)
from t1 c
where division IN ('Div1', 'Div2')
and quarter IN (
select quarter from t1 a where division = 'Div1'
GROUP BY quarter
HAVING SUM(sales) >
(select NVL(SUM(sales), 0)
from t1 b
where b.quarter = a.quarter AND division = 'Div2')
)
GROUP BY quarter, division


See: in the above query, I just replace "SUM(sales)" with whatever expression I get from the table (e.g. COUNT(salesid), SUM(price*volume)) and it's done.

The main point is that a solution is required in which the aggregate expression can be plugged in as it comes from the table, without breaking that expression apart.


Tom Kyte

Followup  

May 23, 2003 - 7:55 am UTC

you can do this with decode, I see no issues with that at all. why do you believe otherwise?

anyway, guess you are back to your solution.

remember this:

generic is great
generic is slow
specific solutions will massively outperform generic solutions
it is up to you to make the tradeoff -- decide which is more important.

One more thing to mention about instr

May 22, 2003 - 3:54 am UTC

Reviewer: Asim Naveed from Pakistan

Your solution with instr can produce wrong results if the input divisions are like this:
first division = 'Div1'
second division = 'Div1Div2'


Tom Kyte

Followup  

May 23, 2003 - 7:55 am UTC

i never really expected anyone to use it, it would be "wrong"

Question

June 10, 2003 - 2:21 pm UTC

Reviewer: PRS from PRS

Hi Tom,
I have a table named comments. It has the following columns:
opp_id number(10);
comm_id number(10);
comment varchar2(4000);

Data is shown as below.

opp_id  comm_id  comment
     1      101  test1
     1      202  test2
     1      300  test3
     1      550  test4
     2      220  test1
     3      301  test1
     3      501  test2

Here opp_id and comm_id are both generated by an Oracle sequence.

I need to transfer this data into the following table.

RSF_COMMENT
opp_id number(10);
comm_id number(10);
comment varchar2(4000);

But here comm_id is a sequential number. Data is shown below.

opp_id  comm_id  comment
     1        1  test1
     1        2  test2
     1        3  test3
     1        4  test4
     2        1  test1
     3        1  test1
     3        2  test2

I cannot use a straight INSERT INTO ... SELECT, as I need to derive the sequential comment id; that can be achieved with a PL/SQL procedure.

Is there any way we can derive the sequential id in the SELECT statement itself using an analytic function?

Thanks for your help,
PRS

Tom Kyte

Followup  

June 10, 2003 - 2:26 pm UTC

ops$tkyte@ORA920> insert into t2
  2  select opp_id, row_number() over (partition by opp_id order by comm_id), com
  3    from t1
  4  /

7 rows created.

ops$tkyte@ORA920> select * from t2;

    OPP_ID    COMM_ID COM
---------- ---------- -----
         1          1 test1
         1          2 test2
         1          3 test3
         1          4 test4
         2          1 test1
         3          1 test1
         3          2 test2

7 rows selected. 
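The same insert...select can be checked end to end in a throwaway database. A SQLite sketch (the `com` column name follows the answer above; everything else is from the question's sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (opp_id INT, comm_id INT, com TEXT);
    CREATE TABLE t2 (opp_id INT, comm_id INT, com TEXT);
    INSERT INTO t1 VALUES
        (1, 101, 'test1'), (1, 202, 'test2'), (1, 300, 'test3'),
        (1, 550, 'test4'), (2, 220, 'test1'),
        (3, 301, 'test1'), (3, 501, 'test2');
""")

# row_number() restarts at 1 inside each opp_id partition, turning the
# gappy sequence-generated comm_id into a dense per-opportunity counter.
conn.execute("""
    INSERT INTO t2
    SELECT opp_id,
           row_number() OVER (PARTITION BY opp_id ORDER BY comm_id),
           com
      FROM t1
""")

rows = conn.execute("SELECT * FROM t2 ORDER BY opp_id, comm_id").fetchall()
```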

Analytics can solve this one i suppose -- for Asim Naveed

June 11, 2003 - 3:43 pm UTC

Reviewer: A reader from MN, USA

1 select quarter, division, sum(sales) over (partition by quarter, division) sum_sales
2 from t1
3 where quarter in ('Qtr1-1999', 'Qtr2-1998', 'Qtr3-1999')
4* and division in ('Div1', 'Div2')
RKPA01> /

QUARTER DIVISION SUM_SALES
---------- ---------- ----------
Qtr1-1999 Div1 123
Qtr1-1999 Div2 3
Qtr2-1998 Div1 43
Qtr2-1998 Div2 33
Qtr3-1999 Div1 5
Qtr3-1999 Div2 4

6 rows selected.


for MN from USA

June 13, 2003 - 11:59 am UTC

Reviewer: Asim Naveed from Pakistan

MN,

Please read the question carefully; I asked for those records in which sum(sales) of Div1 is greater than sum(sales) of Div2. The output has to show both records (i.e. of Div1 and Div2), e.g.:

qtr1-1998 div1 200000
qtr1-1998 div2 1000

See above: Div1 sales are greater than Div2 in quarter 1 of 1998.

Is this a candidate for Analytic function

July 16, 2003 - 9:54 am UTC

Reviewer: Arun from Mumbai,

Tom,
I have a table with this kind of data

REQID C_NODE P_NODE
3838 1 0
3838 2 9
3838 2 10
3838 3 11
3838 7 3
3838 7 8
3838 8 11
3838 9 3
3838 9 8
3838 10 7
3838 11 1
Now i need a result set such that, for a criterion of 10
i will get
7
3
8
11
1

ie
3838 10 7
3838 7 3
3838 7 8
3838 3 11
3838 8 11
3838 11 1

How could I go about this? I thought I could use analytic functions, but found we can only get running totals etc. through them. Is there a possibility of doing this without procedures?

Thanks for all your help,
Regards,
Arun


Tom Kyte

Followup  

July 16, 2003 - 10:43 am UTC

proof that what is obvious to one person is just a meaningless jumble to another.

"Now i need a result set such that, for a criterion of 10
i will get "

a criterion of 10 what? against what?

I cannot fathom how you got from input data to that result set

Sorry

July 16, 2003 - 11:04 am UTC

Reviewer: A reader

Tom,
Sorry for wasting your time. It never occurred to me... I had been puzzling over this problem for the past 4 hours :-(...

For the table

REQID C_NODE P_NODE
3838 1 0
3838 2 9
3838 2 10
3838 3 11
3838 7 3
3838 7 8
3838 8 11
3838 9 3
3838 9 8
3838 10 7
3838 11 1

i need to print
7
3
8
11
1
when the where clause is this

select ...... <tablename > where cnode = 10

which comes by a look up like this
C_NODE P_NODE
10 ---->>>>> 7
---->>> 7 ---->>>>> 3
7 ---->>>>> 8
3 ---->>>>> 11
Regards,
Arun

Tom Kyte

Followup  

July 16, 2003 - 11:28 am UTC

Much clearer -- that is a hierarchical query -- connect by is the answer, analytics not so useful.  consider:

ops$tkyte@ORA920LAP> column graph format a10
ops$tkyte@ORA920LAP> select rpad( '*', 2*level, '*' ) graph, level, t.*
  2    from t
  3   start with c_node = 10
  4   connect by prior p_node = c_node;

GRAPH           LEVEL      REQID     C_NODE     P_NODE
---------- ---------- ---------- ---------- ----------
**                  1       3838         10          7
****                2       3838          7          3
******              3       3838          3         11
********            4       3838         11          1
**********          5       3838          1          0
****                2       3838          7          8
******              3       3838          8         11
********            4       3838         11          1
**********          5       3838          1          0

9 rows selected.

<b>that is your entire tree -- looking at this:</b>


ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> select rpad( '*', 2*level, '*' ) graph, level, t.*
  2    from t
  3   start with c_node = 10
  4   connect by prior p_node = c_node and level <= 4;

GRAPH           LEVEL      REQID     C_NODE     P_NODE
---------- ---------- ---------- ---------- ----------
**                  1       3838         10          7
****                2       3838          7          3
******              3       3838          3         11
********            4       3838         11          1
****                2       3838          7          8
******              3       3838          8         11
********            4       3838         11          1

7 rows selected.

<b>it looks like you want everything at level 4 and below in the tree... so</b>


ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> select distinct p_node
  2    from t
  3   start with c_node = 10
  4   connect by prior p_node = c_node and level <= 4;

    P_NODE
----------
         1
         3
         7
         8
        11

<b>there you go</b> 
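For anyone porting this pattern off Oracle: CONNECT BY is Oracle-specific, but the same walk can be expressed as a recursive common table expression. Below is a minimal sketch using Python's sqlite3 as an illustrative stand-in for the table above (not Tom's code); the seed query plays START WITH, the join condition plays the PRIOR comparison, and the `lvl` counter reproduces LEVEL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (reqid int, c_node int, p_node int);
insert into t values
 (3838,1,0),(3838,2,9),(3838,2,10),(3838,3,11),(3838,7,3),(3838,7,8),
 (3838,8,11),(3838,9,3),(3838,9,8),(3838,10,7),(3838,11,1);
""")

# Recursive CTE equivalent of:
#   start with c_node = 10  connect by prior p_node = c_node and level <= 4
rows = conn.execute("""
with recursive walk(c_node, p_node, lvl) as (
  select c_node, p_node, 1 from t where c_node = 10
  union all
  select t.c_node, t.p_node, walk.lvl + 1
    from t join walk on t.c_node = walk.p_node
   where walk.lvl + 1 <= 4
)
select distinct p_node from walk order by p_node
""").fetchall()
print([r[0] for r in rows])  # [1, 3, 7, 8, 11]
```

The distinct p_node set matches Tom's output above.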

Thanks

July 18, 2003 - 1:11 am UTC

Reviewer: Arun

Thank you very much Tom.
Regards,
Arun

can I use analytic here?

July 25, 2003 - 4:24 pm UTC

Reviewer: steve from nyc,USA

Hi Tom,

I have 2 tables:
table1: dept#, itemId, ...
table2: item_id, item_type,...

table2 contains all the items. And I just want to insert
into table1 the items which exist in table2 not in the dept of table1 with type 'A' and 'B'.
e.g.
dept# itemid
----------------
1 100
1 101
2 102
2 103
3 100
4 103
5 101

T2:
item_id type
------- ----
100 A
101 A
102 B
103 C
104 C
105 A

and result I expect:
dept# itemid
----------------
1 100
1 101
1 102
1 105
2 102
2 103
2 100
2 101
2 105
.... ...
5 100
5 101
5 102
5 105

I can get the result with the following SQL.
Can I use an analytic function to reach the same goal?
I ask because, if a new dept is added, I would have to add new code.

insert into t1 (dept_no,item_id)
select 1,item_id
FROM t2
WHERE item_type in ('B', 'A')
and item_id not in ( select item_id from t1
where t1.dept_no = 1
)
union all
insert into t1 (dept_no,item_id)
select 2,item_id
FROM t2
WHERE item_type in ('B', 'A')
and item_id not in ( select item_id from t1
where t1.dept_no = 2
)
union all
...
union all
insert into t1 (dept_no,item_id)
select 5,item_id
FROM t2
WHERE item_type in ('B', 'A')
and item_id not in ( select item_id from t1
where t1.dept_no = 5
)



Thanks!

Steve


Tom Kyte

Followup  

July 25, 2003 - 5:10 pm UTC

insert into t1 ( dept_no, item_id )
select *
from (select distinct deptno from t1 ),
(select item_id from t2 )
where (dept_no, item_id ) NOT IN ( select dept_no, item_id from t1 );

cartesian product all of the deptnos with all of the item ids and insert the ones that do not exist.

A minor correction

July 25, 2003 - 8:10 pm UTC

Reviewer: A reader

A minor correction: Item_type needs to be added.

insert into t1 ( dept_no, item_id )
select *
from (select distinct deptno from t1 ),
(select item_id from t2 where item_type in ('A', 'B'))
where (dept_no, item_id ) NOT IN ( select dept_no, item_id from t1 );
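Tom's cartesian-product-plus-anti-join pattern is portable as-is. Here is an illustrative sketch in Python's sqlite3 using the sample data from the question (table and column names follow the question, not any real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t1 (dept_no int, item_id int);
create table t2 (item_id int, item_type text);
insert into t1 values (1,100),(1,101),(2,102),(2,103),(3,100),(4,103),(5,101);
insert into t2 values (100,'A'),(101,'A'),(102,'B'),(103,'C'),(104,'C'),(105,'A');
""")

# Cartesian-product every existing dept with every eligible item,
# then keep only the (dept, item) pairs not already present.
conn.execute("""
insert into t1 (dept_no, item_id)
select d.dept_no, i.item_id
  from (select distinct dept_no from t1) d,
       (select item_id from t2 where item_type in ('A','B')) i
 where (d.dept_no, i.item_id) not in (select dept_no, item_id from t1)
""")

count = conn.execute("select count(*) from t1").fetchone()[0]
dept1 = [r[0] for r in conn.execute(
    "select item_id from t1 where dept_no = 1 order by item_id")]
print(count, dept1)  # 22 [100, 101, 102, 105]
```

The usual NOT IN caveat applies: if the subquery could return NULLs, prefer NOT EXISTS.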


rank() and plsql

August 25, 2003 - 10:07 am UTC

Reviewer: rd from sweden

Hi,

Please tell me why the following code fails.

-----------------
declare
type cur_type is ref cursor;
c1 cur_type;
vSql varchar2(500):=
'select acc.rowid accrow
,dense_rank() over (partition by acc.social_security_number
order by limit_increase_amount desc
,account_utilisation desc
,account_number desc
) dr
from cm_accounts acc
,cm_customer cus
where acc.social_security_number=cus.social_security_number
and acc.social_security_number=''3507249658'')';

type rid_tab is table of rowid index by binary_integer;
type rank_tab is table of number(2) index by binary_integer;
rids rid_tab;
drs rank_tab;
begin
open c1 for vSql;
loop
fetch c1 bulk collect into rids,drs limit 1000;
exit when c1%notfound;
forall i in 1 .. rids.count
update cm_accounts
set acc_rank=drs(i)
where rowid=rids(i);
end loop;
commit;
end ;
-----------------

Thanks

Tom Kyte

Followup  

August 25, 2003 - 1:41 pm UTC

well, when I generated some tables to go with this (be nice if tables came with test case), i got this first:

declare
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at line 4


so for me at least, 500 wasn't big enough there... upping vSQL to 5000, i got:

declare
*
ERROR at line 1:
ORA-00933: SQL command not properly ended
ORA-06512: at line 21

meaning your sql was bad...


running just your SQL, i see it was:

ops$tkyte@ORA920>             select acc.rowid accrow
  2                     ,dense_rank() over (partition by acc.social_security_number
  3                                                    order by limit_increase_amount desc
  4                                                       ,account_utilisation desc
  5                                                       ,account_number desc
  6                                                         ) dr
  7              from   cm_accounts acc
  8                     ,cm_customer cus
  9              where  acc.social_security_number=cus.social_security_number
 10              and acc.social_security_number='3507249658')
 11  /
            and acc.social_security_number='3507249658')
                                                       *
ERROR at line 10:
ORA-00933: SQL command not properly ended


there was a paren that didn't need to be there. taking that out, I ran it and got:

ops$tkyte@ORA920> declare
  2  type cur_type is ref cursor;
  3  c1  cur_type;
  4  vSql varchar2(5000):=
  5              'select acc.rowid accrow
  6                     ,dense_rank() over (partition by acc.social_security_number
  7                                                    order by limit_increase_amount desc
  8                                                       ,account_utilisation desc
  9                                                       ,account_number desc
 10                                                         ) dr
 11              from   cm_accounts acc
 12                     ,cm_customer cus
 13              where  acc.social_security_number=cus.social_security_number
 14              and acc.social_security_number=''3507249658''';
 15
 16  type rid_tab is table of rowid index by binary_integer;
 17  type rank_tab is table of number(2) index by binary_integer;
 18  rids rid_tab;
 19  drs        rank_tab;
 20  begin
 21     open c1 for vSql;
 22     loop
 23            fetch c1 bulk collect into rids,drs limit 1000;
 24         exit when c1%notfound;
 25         forall i in 1 .. rids.count
 26                   update cm_accounts
 27                set     acc_rank=drs(i)
 28                where     rowid=rids(i);
 29     end loop;
 30     commit;
 31  end ;
 32  /
 
PL/SQL procedure successfully completed.

Now, if you are in 8i (you never said), you would find that this will raise:

declare
*
ERROR at line 1:
ORA-01001: invalid cursor
ORA-06512: at line 23
 

and that is because BULK FETCH on a dynamically opened refcursor was simply "not supported" then.  The solution would be:

ops$tkyte@ORA817DEV> create or replace view v
  2  as
  3  select acc.rowid accrow
  4         ,dense_rank() over (partition by acc.social_security_number
  5                             order by limit_increase_amount desc
  6                                     ,account_utilisation desc
  7                                     ,account_number desc) dr
  8     from  cm_accounts acc
  9          ,cm_customer cus
 10    where  acc.social_security_number=cus.social_security_number
 11           and acc.social_security_number=userenv('client_info');
 
View created.
 
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV> declare
  2          type rid_tab is table of rowid index by binary_integer;
  3          type rank_tab is table of number(2) index by binary_integer;
  4          rids rid_tab;
  5          drs  rank_tab;
  6
  7          cursor c1 is select * from v;
  8  begin
  9          dbms_application_info.set_client_info( '3507249658' );
 10          open c1;
 11          loop
 12                  fetch c1 bulk collect into rids,drs limit 1000;
 13          forall i in 1 .. rids.count
 14            update cm_accounts set acc_rank=drs(i) where rowid=rids(i);
 15           exit when c1%notfound;
 16     end loop;
 17     commit;
 18  end ;
 19  /
 
PL/SQL procedure successfully completed.


<b>NOTE the important MOVING Of the exit when c1%notfound!!!!! it would be a bug to have it before the update -- else your last fetch of say 999 rows, which raises the c1%notfound, would never get processed!!!!!</b>
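The exit-placement rule generalizes beyond PL/SQL. Here is a hypothetical Python analog (illustrative only, not Oracle code): the slice plays FETCH ... BULK COLLECT ... LIMIT, the extend plays the FORALL update, and the length test plays c1%notfound. Testing before processing would silently drop the final partial batch.

```python
# Hypothetical Python analog of the PL/SQL batching loop above; the point
# is the ORDER of "process the batch" vs. "exit when no more rows".
def fetch_in_batches(rows, limit):
    """Process `rows` in chunks of `limit`, like FETCH ... LIMIT."""
    processed = []
    i = 0
    while True:
        batch = rows[i:i + limit]   # the bulk fetch
        i += limit
        processed.extend(batch)     # the FORALL update: do this FIRST
        if len(batch) < limit:      # c1%notfound: exit AFTER processing
            break
    return processed

# 2 full batches of 1000 plus a final partial batch of 999 --
# exiting before the extend() would lose those last 999 rows.
data = list(range(2999))
print(len(fetch_in_batches(data, 1000)))  # 2999
```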







 

August 25, 2003 - 7:15 pm UTC

Reviewer: A reader

Hi,
Here I have something about analytic functions; I hope it helps somebody.
http://www.geocities.com/juancarlosreyesp/OracleAskTom.pdf

:)

August 26, 2003 - 5:19 am UTC

Reviewer: Helena Marková from Bratislava, Slovakia


Analytics?

September 18, 2003 - 10:16 am UTC

Reviewer: Steve from UK

I have a table with the following data:
location cpyear period prod_code, cases, cups
99951 2002 12 EU17 1 72
99951 2003 5 EU17 3 216
99951 2002 13 EU17 1 72

and I need output as:

99951 2003 5 EU17 5 360

As you can see cases and cups are summed and I'm looking for the latest year and period.

Thanks

Steve

Tom Kyte

Followup  

September 18, 2003 - 11:13 am UTC

ops$tkyte@ORA920> select location, max( cpyear * 100 + period ), prod_code, sum(cases), sum(cups)
  2    from t
  3   group by location, prod_code;
 
  LOCATION MAX(CPYEAR*100+PERIOD) PROD_CODE  SUM(CASES)  SUM(CUPS)
---------- ---------------------- ---------- ---------- ----------
     99951                 200305 EU17                5        360
 
<b>or...</b>


ops$tkyte@ORA920>
ops$tkyte@ORA920> select location,
  2         trunc(max( cpyear * 100 + period )/100) cpyear,
  3         mod(max( cpyear * 100 + period ),100) period,
  4         prod_code, sum(cases), sum(cups)
  5    from t
  6   group by location, prod_code;
 
  LOCATION     CPYEAR     PERIOD PROD_CODE  SUM(CASES)  SUM(CUPS)
---------- ---------- ---------- ---------- ---------- ----------
     99951       2003          5 EU17                5        360
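The encode/decode trick generalizes to any database: pack the ordering columns into one number, take MAX, and unpack with integer division and modulo (valid as long as period stays below 100). An illustrative sqlite3 sketch of the same query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (location int, cpyear int, period int,
                prod_code text, cases int, cups int);
insert into t values
 (99951, 2002, 12, 'EU17', 1, 72),
 (99951, 2003,  5, 'EU17', 3, 216),
 (99951, 2002, 13, 'EU17', 1, 72);
""")

# cpyear*100+period makes (2003, 5) sort above (2002, 13) as 200305 > 200213;
# integer division and modulo split the MAX back into its two parts.
row = conn.execute("""
select location,
       max(cpyear * 100 + period) / 100 as cpyear,
       max(cpyear * 100 + period) % 100 as period,
       prod_code, sum(cases), sum(cups)
  from t
 group by location, prod_code
""").fetchone()
print(row)  # (99951, 2003, 5, 'EU17', 5, 360)
```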
 

September 18, 2003 - 11:40 am UTC

Reviewer: Steve from UK

Excellent thanks! I was convinced that I needed to use analytics and so couldn't see the wood for the trees!

Steve

Going further

October 08, 2003 - 8:25 pm UTC

Reviewer: bobspa

Tom,

is it possible to convert this code fragment using analytics => for a performance improvement(?) [ i'm not yet too familiar with this 8i feature -- been sleeping for years :( ]. i'm using 8.1.6 EE

select bank_id, txn_date,acct_type,dep_cd,dep_term, interest_rate,currency_code,sum(amt_in_dollars),sum(amt_in_org_curr),count(distinct dep_name), tax_disc
from deposit_txns
group by bank_id, txn_date,acct_type,dep_cd,dep_term, interest_rate,currency_code,tax_disc

the table deposit_txns has millions of rows and is taking a lot of time. best time is 8 minutes elapsed. (sort area is already 8KB)

thanks and best regards

Tom Kyte

Followup  

October 09, 2003 - 4:04 pm UTC



there is no need for analytics there at all, they do not apply at all in this case. this query only does normal aggregation.


sort area is already '8kb'?? is that a typo?



Calculating count Using analytic functions

October 27, 2003 - 5:25 pm UTC

Reviewer: Sridhar Reddy from USA

Hi Tom,
I have a small issue for which I could not find a solution on this site, so I am asking you.

I have a table with the following values:

date       varchar2  number  number
1/12/2002  first          1       2
1/11/2002  first          3       4
1/10/2002  first          5       6
1/13/2002  second         7       8
1/14/2002  second         9      10

I would like to select something like this:

date       varchar2  number  number  count
1/12/2002  first          1       2      3
1/11/2002  first          3       4      3
1/10/2002  first          5       6      3
1/13/2002  second         7       8      2
1/14/2002  second         9      10      2

That is, do a count of the values in the varchar2 column and assign it to each row. Is this possible in one single statement? Please help.

Thanks,
Sridhar






Tom Kyte

Followup  

October 27, 2003 - 6:49 pm UTC

ops$tkyte@ORA920> select d, v, n1, n2, count(*) over (partition by v)
  2    from t;
 
D          V                  N1         N2 COUNT(*)OVER(PARTITIONBYV)
---------- ---------- ---------- ---------- --------------------------
01/12/2002 first               1          2                          3
01/11/2002 first               3          4                          3
01/10/2002 first               5          6                          3
01/13/2002 second              7          8                          2
01/14/2002 second              9         10                          2
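That query runs nearly verbatim anywhere window functions exist. An illustrative sqlite3 sketch with the question's data (dates recast as ISO strings for sorting):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (d text, v text, n1 int, n2 int);
insert into t values
 ('2002-01-12','first',1,2), ('2002-01-11','first',3,4),
 ('2002-01-10','first',5,6), ('2002-01-13','second',7,8),
 ('2002-01-14','second',9,10);
""")

# COUNT(*) OVER (PARTITION BY v): every row keeps its detail columns and
# also carries the size of its group -- no join, no second scan.
rows = conn.execute("""
select d, v, n1, n2, count(*) over (partition by v) as cnt
  from t order by d
""").fetchall()
for r in rows:
    print(r)
```

The 'first' rows each carry 3, the 'second' rows each carry 2, matching Tom's output.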
 

Thanks a lot Tom

October 28, 2003 - 9:07 am UTC

Reviewer: Sridhar Reddy from USA

Tom ,
It helped me. thanks a lot for your quick response.
Thanks,
Sridhar.

Taking it a step ahead

October 28, 2003 - 9:23 am UTC

Reviewer: Sridhar Reddy from USA

Tom,
Your response helped me a lot, but I was curious how we can do the same thing taking into account another column.

date       v2     n1  n2
1/12/2002  first  10   2
1/11/2002  first  10   4
1/10/2002  first  10   6
1/13/2002  first  20   8
1/14/2002  first  20  10
1/12/2002  two    10   2
1/11/2002  two    10   4
1/10/2002  two    10   6
1/13/2002  two    20   8
1/14/2002  two    20  10

I would like to select something like this:

date       v2     n1  n2  count
1/12/2002  first  10   2      3
1/11/2002  first  10   4      3
1/10/2002  first  10   6      3
1/13/2002  first  20   8      2
1/14/2002  first  20  10      2
1/12/2002  two    10   2      3
1/11/2002  two    10   4      3
1/10/2002  two    10   6      3
1/13/2002  two    20   8      2
1/14/2002  two    20  10      2

What should we do if we need the count based on N1 as well?
Even though "first" is repeated 5 times, I want to get 3 and 2 as the counts based on the column N1. Is this possible in one single statement? Please help.
Thanks,
Sridhar



Tom Kyte

Followup  

October 28, 2003 - 10:21 am UTC

count(*) over (partition by v2, n1)
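Sketching that one-liner against the sample data, again in sqlite3 for illustration: partitioning by both columns makes each (v2, n1) pair its own group, so 'first' splits into a group of 3 (n1=10) and a group of 2 (n1=20).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (d text, v2 text, n1 int, n2 int);
insert into t values
 ('2002-01-12','first',10,2),('2002-01-11','first',10,4),
 ('2002-01-10','first',10,6),('2002-01-13','first',20,8),
 ('2002-01-14','first',20,10),('2002-01-12','two',10,2),
 ('2002-01-11','two',10,4),('2002-01-10','two',10,6),
 ('2002-01-13','two',20,8),('2002-01-14','two',20,10);
""")

# PARTITION BY v2, n1: the group key is the PAIR, not v2 alone.
rows = conn.execute("""
select v2, n1, count(*) over (partition by v2, n1) as cnt
  from t order by v2, n1, d
""").fetchall()
print(rows[0], rows[3])  # ('first', 10, 3) ('first', 20, 2)
```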



Another Example

October 28, 2003 - 11:22 am UTC

Reviewer: BK from NJ, USA

Thought of contributing my 2 cents. I don't know how this could be achieved without analytic functions.

select description,sales_account,count(*) over(partition
by sales_account) Count_SA,
cost_of_sales_account,count(*) over(partition by
cost_of_sales_account) Count_COSA
from mtl_system_items

Hope this helps.

can I use analytical function in having clause

February 06, 2004 - 8:53 pm UTC

Reviewer: A reader

I need the order ids that were packed by different people.
Let's say:
ID packed_by
123 james
123 james
122 james
122 james
122 jeff
140 jeff
140 jeff
128 jeff
128 james
I need orders with id 122 and 128.
Can I use select something like this:
select id, packed_by
from t1,t1
where t1.ID=t2.ID
having count(*) over (partition by id) <>
count(*) over (partition by packed_by)


Tom Kyte

Followup  

February 07, 2004 - 2:24 pm UTC

select id, packed_by
from t
where id in (select id from t group by id having count(distinct packed_by)>1)
/

or

select id, packed_by, cnt
from ( select id, packed_by, count(distinct packed_by) over (partition by id)cnt
from t )
where cnt > 1;
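Tom's first form ports directly; an illustrative sqlite3 sketch with the question's data (SQLite, used here only as a test bed, does not accept COUNT(DISTINCT ...) as a window function, so the GROUP BY/HAVING variant is the portable one):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (id int, packed_by text);
insert into t values
 (123,'james'),(123,'james'),(122,'james'),(122,'james'),(122,'jeff'),
 (140,'jeff'),(140,'jeff'),(128,'jeff'),(128,'james');
""")

# Find the ids with more than one distinct packer,
# then pull the detail rows for those ids.
rows = conn.execute("""
select id, packed_by
  from t
 where id in (select id from t
               group by id
              having count(distinct packed_by) > 1)
 order by id, packed_by
""").fetchall()
print(sorted({r[0] for r in rows}))  # [122, 128]
```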



Tom...

February 20, 2004 - 3:26 pm UTC

Reviewer: Denise from Virginia,USA

please bear with me...I know I've been a real pest recently.
I'm working on several different projects..the primary
one being getting the required tables loaded into the 9i
db so I can start doing analysis reports using Analytic
Functions.

I mentioned in a previous thread about this matrix report.

I found a pl/sql block that another programmer was developing to run this report 3 years ago. It never got completed; however, I'm looking at it from the standpoint of a flowchart.

I've read somewhere that Analytic Functions can now be
used with PL/SQL in 9i as long as you compile the cursor
SELECT as a view and select from the VIEW instead of
the actual TABLE.

I'm reviewing Chapter 12 of your book.

Perhaps, given the power of analytic functions in SQL, I should just scrap
PL/SQL altogether.

what is your take on this?

thanks...and I promise to stay away from your website(
for the rest of the day!)





Tom Kyte

Followup  

February 20, 2004 - 3:31 pm UTC

In 8i, you had to "hide" analytics with dynamic sql or a view.

In 9i, just "code away", you don't have to hide them anymore.

analytics work great in plsql in 9i -- no issues.


Is this doable in a single statement

April 15, 2004 - 11:02 am UTC

Reviewer: Kishan from Rockville, MD

The following is part of a big table, where data is queried from different tables and populated. We are converting the table into a view. Is it possible to populate these columns in one or two statements using analytics or some other, better way?

create view p_flat as
select ...
...
(select count(*) over(partition by p_link_id) from p_flat where ast_cls ='1' and prop_sts = '1') as tot_l_inv,
(select count(*) over(partition by p_link_id from p_flat where ast_cls ='1' and prop_sts = '1') as tot_e_inv,
(select count(*) over(partition by p_link_id) from p_flat where ast_cls ='1' and prop_sts = '1') as tot_b_inv,
(select tot_l_inv + tot_e_inv + tot_b_inv from dual) as tot_r_inv
...);


create table p_temp as
select distinct(p_link_id) PLINK_ID, count(*) p_count
from p_flat
where ast_cls = '1' and prop_sts = '1'
group by p_link_id;

update p_flat t0 set (tot_l_inv)=
(select p_count from p_temp t1 where t0.p_link_id = t1.PLINK_ID);

create table p_temp as
select distinct(p_link_id) PLINK_ID, count(*) p_count
from p_flat
where ast_cls = '3' and prop_sts = '1'
group by p_link_id;

update p_flat t0 set (tot_e_inv)=
(select p_count from p_temp t1 where t0.p_link_id = t1.PLINK_ID);

create table p_temp as
select distinct(p_link_id) PLINK_ID, count(*) p_count
from p_flat
where ast_cls = '2' and prop_sts = '1'
group by p_link_id;

update p_flat t0 set (tot_b_inv)=
(select p_count from p_temp t1 where t0.p_link_id = t1.PLINK_ID);

update p_flat t0 set tot_r_inv=tot_l_inv + tot_e_inv + tot_b_inv;

Tom Kyte

Followup  

April 15, 2004 - 11:22 am UTC

sorry -- i don't understand that first view (didn't read further).



Analytical aggregate based on data sets

April 15, 2004 - 11:56 am UTC

Reviewer: Lee from NY, USA

Tom,

I have a situation that I need to tackle in a query, I tried using analytical function but I can’t figure out how to exactly do this:

I have a table with 10 columns 2 of which are date_effective and date_tarmination, 1 other column is person_id.

I need to find the min date_effective and max date_termination for each person_id, for records where everything but the dates is identical.

This is my problem, if I have the following record set:

PERSON_ID DATE_EFFECTIVE DATE_TERMINATION COL_ZZZ
1 01-JAN-01 01-FEB-01 A
1 02-FEB-01 01-MAR-01 A
1 02-APR-01 01-MAY-01 B
1 02-MAY-01 01-JUN-01 A
1 02-JUN-01 01-JUL-01 A
1 02-JUL-01 01-AUG-01 A

If I use the regular min and max analytic functions grouped by the columns, I'll get the following for the set of records with all identical values:

MIN_DATE_EFFECTIVE MAX_DATE_EFFECTIVE
01-JAN-01 01-AUG-01
01-JAN-01 01-AUG-01
02-APR-01 01-MAY-01
01-JAN-01 01-AUG-01
01-JAN-01 01-AUG-01
01-JAN-01 01-AUG-01

However, I want each group to have it’s own min and max dates based on the date order, like the following:

MIN_DATE_EFFECTIVE MAX_DATE_EFFECTIVE
01-JAN-01 01-MAR-01
01-JAN-01 01-MAR-01
02-APR-01 01-MAY-01
01-MAY-01 01-AUG-01
01-MAY-01 01-AUG-01
01-MAY-01 01-AUG-01

I want to say something like:
Give me the min and max dates where all the other values are identical, from a group that is ordered by date and unbroken by non-identical values. Every time a group is "broken", a new group starts.

PERSON_ID DATE_EFFECTIVE DATE_TERMINATION COL_ZZZ
1 01-JAN-01 01-FEB-01 A group 1
1 02-FEB-01 01-MAR-01 A group 1
1 02-APR-01 01-MAY-01 B group 2
1 02-MAY-01 01-JUN-01 A group 3
1 02-JUN-01 01-JUL-01 A group 3
1 02-JUL-01 01-AUG-01 A group 3




I know I can probably do it with PL/SQL, but this is part of an INSERT FIRST query and I would really like to do it in SQL; I tried different variations of analytic functions but couldn't get it.

Your help is appreciated.

Thanks.








Tom Kyte

Followup  

April 15, 2004 - 12:51 pm UTC

ops$tkyte@ORA9IR2> select person_id, edate, tdate, col_zzz, last_col_zzz,
  2         max(
  3         case when rn = 1
  4                   OR (last_col_zzz <> col_zzz)
  5                   OR (last_col_zzz is null and col_zzz is not null)
  6                   OR (last_col_zzz is not null and col_zzz is null )
  7              then rn
  8          end) over (partition by person_id order by edate) grp
  9    from (
 10  select person_id, edate, tdate, col_zzz,
 11         row_number() over (partition by person_id order by edate) rn,
 12         lag(col_zzz) over (partition by person_id order by edate) last_col_zzz
 13    from t
 14         )
 15  /
 
 PERSON_ID EDATE     TDATE     C L        GRP
---------- --------- --------- - - ----------
         1 01-JAN-01 01-FEB-01 A            1
         1 02-FEB-01 01-MAR-01 A A          1
         1 02-APR-01 01-MAY-01 B A          3
         1 02-MAY-01 01-JUN-01 A B          4
         1 02-JUN-01 01-JUL-01 A A          4
         1 02-JUL-01 01-AUG-01 A A          4
 
6 rows selected.
 

<b>If zzz is NOT NULL, you do not need the other checks (is null/is not null), to support "more columns", it would look like this:</b>


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> alter table t add col_aaa char(1);
 
Table altered.
 
ops$tkyte@ORA9IR2> update t set col_aaa = rownum;
 
6 rows updated.
 
ops$tkyte@ORA9IR2> update t set col_aaa = 1 where col_aaa = 2;
 
1 row updated.
 
ops$tkyte@ORA9IR2> update t set col_aaa = 5 where col_aaa = 4;
 
1 row updated.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select person_id, edate, tdate,
  2         col_zzz,
  3         col_aaa,
  4         last_col_aaa,
  5         last_col_zzz,
  6         max(
  7         case when rn = 1
  8                   OR (last_col_zzz <> col_zzz)
  9                   OR (last_col_zzz is null and col_zzz is not null)
 10                   OR (last_col_zzz is not null and col_zzz is null )
 11                   OR (last_col_aaa <> col_aaa)
 12                   OR (last_col_aaa is null and col_aaa is not null)
 13                   OR (last_col_aaa is not null and col_aaa is null )
 14              then rn
 15          end) over (partition by person_id order by edate) grp
 16    from (
 17  select person_id, edate, tdate, col_zzz, col_aaa,
 18         row_number() over (partition by person_id order by edate) rn,
 19         lag(col_zzz) over (partition by person_id order by edate) last_col_zzz,
 20         lag(col_aaa) over (partition by person_id order by edate) last_col_aaa
 21    from t
 22         )
 23  /
 
 PERSON_ID EDATE     TDATE     C C L L        GRP
---------- --------- --------- - - - - ----------
         1 01-JAN-01 01-FEB-01 A 1              1
         1 02-FEB-01 01-MAR-01 A 1 1 A          1
         1 02-APR-01 01-MAY-01 B 3 1 A          3
         1 02-MAY-01 01-JUN-01 A 5 3 B          4
         1 02-JUN-01 01-JUL-01 A 5 5 A          4
         1 02-JUL-01 01-AUG-01 A 6 5 A          6
 
6 rows selected.
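For readers porting this: the core idea is "carry forward the row number of the last group start", which is what the MAX(CASE ...) OVER (ORDER BY edate) computes. Here it is restated as an illustrative plain-Python sketch (column names shortened; rows assumed already ordered by edate, as the window ORDER BY guarantees):

```python
# A new group starts whenever the non-date column changes from the
# previous row; remembering the row number of the last group start
# reproduces Tom's grp column.
rows = [  # (edate, tdate, col_zzz), ordered by edate
    ('2001-01-01', '2001-02-01', 'A'),
    ('2001-02-02', '2001-03-01', 'A'),
    ('2001-04-02', '2001-05-01', 'B'),
    ('2001-05-02', '2001-06-01', 'A'),
    ('2001-06-02', '2001-07-01', 'A'),
    ('2001-07-02', '2001-08-01', 'A'),
]

groups = []
grp = 0
prev = None
for rn, (edate, tdate, zzz) in enumerate(rows, start=1):
    if prev is None or zzz != prev:
        grp = rn                  # this row starts a new group
    groups.append(grp)
    prev = zzz
print(groups)  # [1, 1, 3, 4, 4, 4]
```

Once every row carries its group id, the min/max dates per group are an ordinary grouped MIN/MAX over (person_id, grp).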
 
 

Query

April 15, 2004 - 11:58 am UTC

Reviewer: PRS from USA

I have table named PS_RSF_OPP_NOTE as shown below.
ps_rsf_opp_note
(opportunity_id VARCHAR2(15) NOT NULL,
note_seq_nbr NUMBER(*,0) NOT NULL,
rc_summary VARCHAR2(80) NOT NULL,
rc_note_type VARCHAR2(5) NOT NULL,
rc_visibility VARCHAR2(4) NOT NULL,
row_added_dttm DATE,
row_added_oprid VARCHAR2(30) NOT NULL,
row_lastmant_dttm DATE,
row_lastmant_oprid VARCHAR2(30) NOT NULL,
syncid NUMBER(*,0),
syncdttm DATE,
descrlong LONG)

I want to find out the last descrlong column for a given
opportunity_id and max(note_seq_nbr). I wrote the following query.
select descrlong from ps_rsf_opp_note h where h.opportunity_id = '498624973' and
note_seq_nbr = (select count(*) from ps_rsf_opp_note g where g.opportunity_id = '498624973')
This works, but I am looking up PS_RSF_OPP_NOTE twice: once to find the max sequence number and
again to fetch the actual record with the last description.

This is a very huge table with 35 million rows. Is there any analytic function available so I can do this in one lookup?
I have descrlong column datatype LONG.

Any idea is appreciated.

Thanks for your help.
PRS

Tom Kyte

Followup  

April 15, 2004 - 12:53 pm UTC

select ...
from ( select ....
from t
where ....
order by note_seq_nbr DESC )
where rownum = 1;

is but one approach.


select ...
from ( select ...., MAX(note_seq_nbr) over () max_note_seq_nbr
from t
where .... )
where note_seq_nbr = max_note_seq_nbr
/

is another.
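Both shapes are easy to try out elsewhere. An illustrative sqlite3 sketch with a hypothetical notes table standing in for PS_RSF_OPP_NOTE (names and data invented for the example; LIMIT 1 is the SQLite spelling of the ordered-inline-view-plus-ROWNUM trick):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table notes (opportunity_id text, note_seq_nbr int, descr text);
insert into notes values
 ('498624973', 1, 'first note'),
 ('498624973', 2, 'second note'),
 ('498624973', 3, 'latest note'),
 ('111',       1, 'other opp');
""")

# Approach 1: sort descending, keep the first row.
latest = conn.execute("""
select descr from notes
 where opportunity_id = '498624973'
 order by note_seq_nbr desc limit 1
""").fetchone()[0]

# Approach 2: carry MAX(note_seq_nbr) OVER () alongside each row and
# filter on it -- one pass over the filtered rows, as in Tom's second query.
latest2 = conn.execute("""
select descr
  from (select descr, note_seq_nbr,
               max(note_seq_nbr) over () as max_seq
          from notes where opportunity_id = '498624973')
 where note_seq_nbr = max_seq
""").fetchone()[0]
print(latest, latest2)  # latest note latest note
```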

April 15, 2004 - 12:16 pm UTC

Reviewer: Kishan

Let me try and explain...

The update statements are part of a table.
CREATE table p_flat as
(ast_id number(10) primary key,
plink_id varchar2(10),
ast_cls varchar2(10),
prop_sts varchar2(10),
tot_l_inv number(10),
tot_e_inv number(10),
tot_b_inv number(10),
tot_r_inv number(10)
...
...
...)
Records are inserted/updated to this table from different tables using several sql scripts. This process takes 30-45 minutes daily. We are re-engineering this table by converting it to a view/materialized view and hoping to cut down the process time. We are creating the view by reviewing each script and building the appropriate columns by selecting them from corresponding tables given in the script or computing it on the fly.

The statements that I had posted earlier is part of one such script. I wanted to know if we could compute those columns on the fly and hence have that information computed by the database rather than saved in a table.

So instead of the table we want to have this view as follows:

create view p_flat as
SELECT x.*,
t2.p_link_id AS P_LINK_ID,
(select count(*) over(partition by p_link_id) from p_flat where ast_cls ='1' and prop_sts = '1') as tot_l_inv,
(select count(*) over(partition by p_link_id from p_flat where ast_cls ='1' and prop_sts = '1') as tot_e_inv,
(select count(*) over(partition by p_link_id) from p_flat where ast_cls ='1' and prop_sts = '1') as tot_b_inv,
(select tot_l_inv + tot_e_inv + tot_b_inv from dual) as tot_r_inv
...
...
FROM (select ast_id,c1,c2,c3....
FROM T1) x
FROM T2,t3,t4....
WHERE x.ast_id = t2.ast_id and <other conditions> ...

Hope I am clear.

Thanks for all the help in advance.

Regards,
Kishan.

Tom Kyte

Followup  

April 15, 2004 - 12:59 pm UTC

i don't understand why "partition by" would be in a scalar subquery -- that is what is confusing me

partition by would imply "more than one row" and a scalar subquery cannot return more than one row.


and all of those columns look "the same" to me.




Inline view gives me error

April 15, 2004 - 1:40 pm UTC

Reviewer: PRS from USA

The following view gives me ORA-00904 invalid identifier. As soon as I added the inline view to the FROM clause join, it stopped seeing the outer tables.
SELECT B.opportunity_id
,B.BO_ID_CUST
,A.WSI_ACCOUNT_ID
,E.WSI_ACCOUNT_NO
,B.FIRST_NAME
,B.LAST_NAME
,B.OPPORTUNITY_STATUS
,f.xlatlongname status_desc
,A.TD_OPP_FLW_UP_ACT
,g.xlatlongname acat_desc
,B.PERSON_ID
,D.NAME
,A.RA_CAMPAIGN_ID
,C.RA_CMPGN_NAME
,B.ROW_ADDED_DTTM
,B.ROW_LASTMANT_DTTM
,B.NEXT_ACTIVITY_DT
,B.ACT_CLOSE_DT
,B.ACT_REVENUE
,B.EST_REVENUE
,B.PHONE
,'' dummy

,h.descrlong
,', Created By ' || h.row_added_oprid || ', on ' || h.row_added_dttm note
FROM PS_WSI_RSF_OPP_FLD A
, PS_RSF_OPPORTUNITY B
, PS_RA_CAMPAIGN C
, PS_RD_PERSON_NAME D
, PS_WSI_ACCOUNT E
,PSXLATITEM F
,PSXLATITEM G
,(select descrlong,row_added_oprid,row_added_dttm
from ( select opportunity_id,descrlong,row_added_oprid,row_added_dttm
from ps_rsf_opp_note k
where k.opportunity_id = b.opportunity_id
order by note_seq_nbr desc)
where rownum = 1) h
WHERE A.OPPORTUNITY_ID = B.OPPORTUNITY_ID
AND A.RA_CAMPAIGN_ID = C.RA_CAMPAIGN_ID(+)
AND B.PERSON_ID = D.PERSON_ID(+)
AND A.WSI_ACCOUNT_ID = E.WSI_ACCOUNT_ID(+)
AND f.fieldvalue = b.opportunity_status
AND f.fieldname = 'OPPORTUNITY_STATUS'
AND e.wsi_account_no = '33312345'
AND g.fieldvalue = A.TD_OPP_FLW_UP_ACT
AND g.fieldname = 'TD_OPP_FLW_UP_ACT'


Tom Kyte

Followup  

April 15, 2004 - 2:19 pm UTC

same answer as the last page you put this on.


please -- just once does it.

Sorry for confusing you....

April 15, 2004 - 2:39 pm UTC

Reviewer: Kishan

Man...I am so muddled up that I end up confusing even "THE TOM"....:)

What we want to do is to compute the columns tot_l_inv, tot_e_inv, tot_b_inv as the count of unique p_link_id values where ast_cls and prop_sts conditions are met.
My SQL skills are not great, and I guess the statements below do the job.

What is the best way to do this?

create view p_flat as
SELECT x.*,
T2.p_link_id AS P_LINK_ID,
(select count(*) from p_flat p1 where ast_cls ='1' and prop_sts = '1' and p1.p_link_id = T2.p_link_id group by p1.p_link_id) as tot_l_inv,
(select count(*) from p_flat p1 where ast_cls ='2' and prop_sts = '1' and p1.p_link_id = T2.p_link_id group by p1.p_link_id) as tot_e_inv,
(select count(*) from p_flat p1 where ast_cls ='3' and prop_sts = '1' and p1.p_link_id = T2.p_link_id group by p1.p_link_id) as tot_b_inv
...
...
FROM (select ast_id FROM T1) x
FROM T2,T3,T4....
WHERE x.ast_id = T2.ast_id;


Tom Kyte

Followup  

April 15, 2004 - 2:43 pm UTC

are you trying to transpose rows into columns?


i would not use a scalar subquery for that -- i would use decode/case and group by.

but i'm not following the example at all -- can you start from the beginning, WITH THE SOURCE table. this view with a self referencing thing is just confusing. you have a view p_flat that references p_flat -- not getting it.

This is what we require...

April 16, 2004 - 9:45 am UTC

Reviewer: Kishan

Dear Tom:
The tables and the required view information with sample data are given below. Is it possible to accomplish this in a single SQL statement? If not, what would be the best approach, as the number of records in the asset and prop tables are around 100,000 and 2,500,000 respectively and growing?

CREATE TABLE ASSET (
AST_ID NUMBER(10) PRIMARY KEY,
AST_CLS VARCHAR2(10),
PROP_STS VARCHAR2(10));

CREATE TABLE PROP(
AST_ID NUMBER(10) PRIMARY KEY REFERENCES ASSET(AST_ID),
P_LINK_ID NUMBER(10));

insert into asset (ast_id, ast_cls, prop_sts) values(100, '1', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(101, '1', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(102, '1', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(103, '2', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(104, '2', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(105, '3', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(106, '1', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(107, '1', '2');
insert into asset (ast_id, ast_cls, prop_sts) values(108, '1', '2');
insert into asset (ast_id, ast_cls, prop_sts) values(109, '4', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(110, '7', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(111, '7', '1');
insert into asset (ast_id, ast_cls, prop_sts) values(112, '7', '1');

insert into prop(ast_id, p_link_id) values(100,'100');
insert into prop(ast_id, p_link_id) values(101,'100');
insert into prop(ast_id, p_link_id) values(102,'100');
insert into prop(ast_id, p_link_id) values(103,'200');
insert into prop(ast_id, p_link_id) values(104,'200');
insert into prop(ast_id, p_link_id) values(105,'300');
insert into prop(ast_id, p_link_id) values(106,'400');
insert into prop(ast_id, p_link_id) values(107,'400');
insert into prop(ast_id, p_link_id) values(108,'500');
insert into prop(ast_id, p_link_id) values(109,'500');

asset:
AST_ID AST_CLS PROP_STS
---------- ---------- ----------
100 1 1
101 1 1
102 1 1
103 2 1
104 2 1
105 3 1
106 1 1
107 1 2
108 1 2
109 4 1
110 7 1
111 7 1
112 7 1

prop:
AST_ID P_LINK_ID
---------- ----------
100 100
101 100
102 100
103 200
104 200
105 300
106 400
107 400
108 500
109 500

This is the expected result:
AST_ID AST_CLS PROP_STS P_LINK_ID TOT_L_INV TOT_E_INV TOT_B_INV TOT_R_INV
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
100 1 1 100 3 0 0 3
101 1 1 100 3 0 0 3
102 1 1 100 3 0 0 3
103 2 1 200 0 2 0 2
104 2 1 200 0 2 0 2
105 3 1 300 0 0 1 1
106 1 1 400 1 0 0 1
107 1 2 400 1 0 0 1
108 1 2 500 0 0 0 0
109 4 1 500 0 0 0 0
110 7 1 0 0 0 0
111 7 1 0 0 0 0
112 7 1 0 0 0 0

AST_ID NUMBER(10) - from asset
AST_CLS VARCHAR2(10) - from asset
PROP_STS VARCHAR2(10) - from asset
P_LINK_ID NUMBER(10) - from prop
TOT_L_INV NUMBER - computed value
TOT_E_INV NUMBER - computed value
TOT_B_INV NUMBER - computed value
TOT_R_INV NUMBER - computed value

Conditions:
1. asset.ast_id = prop.ast_id
2. tot_l_inv = for this p_link_id, count the assets sharing it where ast_cls = '1' and prop_sts = '1'
3. tot_e_inv = for this p_link_id, count the assets sharing it where ast_cls = '2' and prop_sts = '1'
4. tot_b_inv = for this p_link_id, count the assets sharing it where ast_cls = '3' and prop_sts = '1'
5. tot_r_inv = tot_l_inv + tot_e_inv + tot_b_inv

Thankful as always...
Kishan.

Tom Kyte

Followup  

April 16, 2004 - 10:15 am UTC

much better to get the question and not an attempted answer :)


ops$tkyte@ORA9IR2> select x.*, tot_l_inv+tot_e_inv+tot_b_inv tot_r_inv
  2    from (
  3  select a.ast_id, a.ast_cls, a.prop_sts, b.p_link_id,
  4    count( case when ast_cls = '1' and prop_sts = '1' then b.p_link_id end )
  5                  over ( partition by p_link_id ) tot_l_inv,
  6    count( case when ast_cls = '2' and prop_sts = '1' then b.p_link_id end )
  7                  over ( partition by p_link_id ) tot_e_inv,
  8    count( case when ast_cls = '3' and prop_sts = '1' then b.p_link_id end )
  9                  over ( partition by p_link_id ) tot_b_inv
 10    from asset a, prop b
 11   where a.ast_id = b.ast_id(+)
 12         ) x
 13  /
 
AST_ID A P P_LINK_ID  TOT_L_INV  TOT_E_INV  TOT_B_INV  TOT_R_INV
------ - - --------- ---------- ---------- ---------- ----------
   100 1 1       100          3          0          0          3
   101 1 1       100          3          0          0          3
   102 1 1       100          3          0          0          3
   103 2 1       200          0          2          0          2
   104 2 1       200          0          2          0          2
   105 3 1       300          0          0          1          1
   106 1 1       400          1          0          0          1
   107 1 2       400          1          0          0          1
   108 1 2       500          0          0          0          0
   109 4 1       500          0          0          0          0
   110 7 1                    0          0          0          0
   111 7 1                    0          0          0          0
   112 7 1                    0          0          0          0
 
13 rows selected.
 

Please help

April 21, 2004 - 3:29 am UTC

Reviewer: Ram from Bangalore,India

Hello Tom,
Please see the code below
SQL> create table sales(monthly_trans_date date,total_sales number);

Table created.

SQL> insert into sales values(to_date('31-Jan-04','dd-mon-rr'),5000);
SQL> insert into sales values(to_date('29-Feb-04','dd-mon-rr'),15000);
SQL> insert into sales values(to_date('31-Mar-04','dd-mon-rr'),1500);
SQL> insert into sales values(to_date('30-Apr-04','dd-mon-rr'),6000);
SQL> insert into sales values(to_date('31-May-04','dd-mon-rr'),9000);
SQL> insert into sales values(to_date('30-Jun-04','dd-mon-rr'),7800);
SQL> insert into sales values(to_date('31-Jul-04','dd-mon-rr'),6700);
SQL> insert into sales values(to_date('30-Aug-04','dd-mon-rr'),5000);
SQL> insert into sales values(to_date('30-Sep-04','dd-mon-rr'),4000);
SQL> insert into sales values(to_date('31-Oct-04','dd-mon-rr'),2000);
SQL> insert into sales values(to_date('30-Nov-04','dd-mon-rr'),9000);
SQL> insert into sales values(to_date('31-Dec-04','dd-mon-rr'),6900);

SQL> select * from sales;

MONTHLY_T TOTAL_SALES
--------- -----------
31-JAN-04        5000
29-FEB-04       15000
31-MAR-04        1500
30-APR-04        6000
31-MAY-04        9000
30-JUN-04        7800
31-JUL-04        6700
30-AUG-04        5000
30-SEP-04        4000
31-OCT-04        2000
30-NOV-04        9000
31-DEC-04        6900

12 rows selected.

My requirement is:
I would like to have quarter-wise total sales, as in:
 SQL> SELECT ....
    Quarter     sum_sales
    -------     ----------
    JAN-MAR      21500
    APR-JUN      22800
    JUL-SEP      15700
    OCT-DEC      17900
Could you please provide the query?
I think we can make use of analytic functions with the "range interval"
option, but I don't know how to proceed with the query.

 

Tom Kyte

Followup  

April 21, 2004 - 8:03 pm UTC

no analytics here, simple "aggregate"

ops$tkyte@ORA9IR2> select to_char(dt,'MON')||'-'||
  2         to_char(add_months(dt,2),'MON') dt,
  3         s
  4    from (
  5  select trunc(monthly_trans_date,'q') dt, sum(total_sales) s
  6    from sales
  7   group by trunc(monthly_trans_date,'q')
  8         )
  9  /
 
DT               S
------- ----------
JAN-MAR      21500
APR-JUN      22800
JUL-SEP      15700
OCT-DEC      17900
 
ops$tkyte@ORA9IR2>
 

Please help

May 06, 2004 - 5:44 am UTC

Reviewer: Raju from Hyderabad,India

Dear Tom,
I would like to get average of amount sold in a sales table for a 
"period gap of three days each".The table I use is as follows 

SQL>create table sales(trans_date date,amount_sold number);

SQL>insert into sales
    select trunc(sysdate)+rownum, (rownum*100)+20
    from all_objects where rownum <= 9;

I think we can use "avg(amount_sold) over(interval '03' range between rows preceding and current row)" or something similar to this, but
I don't know how to group this.

Am I correct and could you please help with the query?
 

Tom Kyte

Followup  

May 06, 2004 - 7:47 am UTC

you are mostly interested in the first two columns, the last three I use for "verification".  It is many times useful, with sliding windows, to "see" the top of the window, the bottom of the window and the count of observations in the window itself.  first_value/last_value and count are great for that:

ops$tkyte@ORA9IR2> select trans_date,
  2         avg(amount_sold) over (order by trans_date range 2 preceding) avg_amt,
  3         first_value(trans_date) over (order by trans_date range 2 preceding) start_of_window,
  4         last_value(trans_date) over (order by trans_date range 2 preceding) end_of_window,
  5         count(*) over (order by trans_date range 2 preceding) cnt
  6    from sales
  7  /
 
TRANS_DAT    AVG_AMT START_OF_ END_OF_WI        CNT
--------- ---------- --------- --------- ----------
07-MAY-04        120 07-MAY-04 07-MAY-04          1
08-MAY-04        170 07-MAY-04 08-MAY-04          2
09-MAY-04        220 07-MAY-04 09-MAY-04          3
10-MAY-04        320 08-MAY-04 10-MAY-04          3
11-MAY-04        420 09-MAY-04 11-MAY-04          3
12-MAY-04        520 10-MAY-04 12-MAY-04          3
13-MAY-04        620 11-MAY-04 13-MAY-04          3
14-MAY-04        720 12-MAY-04 14-MAY-04          3
15-MAY-04        820 13-MAY-04 15-MAY-04          3
 
9 rows selected.
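One nuance worth spelling out, since the original guess mixed the two keywords: RANGE windows are defined by the values of the ORDER BY column, while ROWS windows are defined by a physical count of rows. This is only an illustrative sketch against the same sales table; with exactly one row per day the two columns agree, but delete a day and they diverge:

```sql
select trans_date,
       -- value-based window: every row whose trans_date lies within
       -- the 2 days' worth of dates before the current row's date
       avg(amount_sold) over (order by trans_date range 2 preceding) range_avg,
       -- count-based window: the current row plus the 2 physically
       -- preceding rows, regardless of gaps in the dates
       avg(amount_sold) over (order by trans_date rows 2 preceding) rows_avg
  from sales
 order by trans_date;
```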
 

Thanks

May 06, 2004 - 10:37 am UTC

Reviewer: Raju from Hyderabad,India

Dear Tom,
Thanks for your response.But I expected the result set in the format of

Trans_date Amount_sold
---------- -----------
07-May - 09-May ...
10-May - 12-May ...
13-May - 15-May ...
How to get that?
Please do reply.

Tom Kyte

Followup  

May 06, 2004 - 1:27 pm UTC

hmm, that's even easier. You can do that in straight SQL; it is just a little 'math' (subtract and divide, then group).

you need a start date (07-may-2004) to be passed to the query


ops$tkyte@ORA9IR2> variable x varchar2(20)
ops$tkyte@ORA9IR2> exec :x := '07-may-2004'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select min(trans_date), max(trans_date), sum(amount_sold)
  2    from sales
  3   group by floor((trans_date-to_date(:x,'dd-mon-yyyy'))/3);
 
MIN(TRANS MAX(TRANS SUM(AMOUNT_SOLD)
--------- --------- ----------------
07-MAY-04 09-MAY-04              660
10-MAY-04 12-MAY-04             1560
13-MAY-04 15-MAY-04             2460
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
 

Thanks

May 07, 2004 - 1:58 am UTC

Reviewer: Raju from Hyderabad,India

Dear Tom,
Thanks for your reply. But the clause
"group by floor((trans_date-to_date(....))/3)"
*frightens* me like anything.
I can't understand it. But I used an easy approach:

SQL> select 'First Three Days' as Three_days,avg(amount_sold) from sales
   2 where to_char(trans_date,'dd') between '7' and '9'
   3 union all
   4 select 'Next Three Days' as Three_days,avg(amount_sold) from sales
   5 where to_char(trans_date,'dd') between '10' and '12'
   6 union all
   7 select 'Last Three Days' as Three_days,avg(amount_sold) from sales
   8 where to_char(trans_date,'dd') between '13' and '15'
  
  

Tom Kyte

Followup  

May 07, 2004 - 8:04 am UTC


osh kosh bgosh -- we use computers right?  we are technical people right? this is *simple math* the type of which we learn in grade school (subtraction, division, no eigen vectors, no fast fourier transforms).

your query -- 3 full scans, sorts and aggregates.

Mine -- 1 full scan.


Look at the raw data:


ops$tkyte@ORA9IR2> variable x varchar2(20)
ops$tkyte@ORA9IR2> exec :x := '08-may-2004'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select trans_date-to_date(:x,'dd-mon-yyyy') daysbtwn,
  2         (trans_date-to_date(:x,'dd-mon-yyyy'))/3 grp,
  3         floor((trans_date-to_date(:x,'dd-mon-yyyy'))/3) grp_trunc,
  4         trans_date, amount_sold
  5    from sales
  6  /
 
  DAYSBTWN        GRP  GRP_TRUNC TRANS_DAT AMOUNT_SOLD
---------- ---------- ---------- --------- -----------
         0          0          0 08-MAY-04         120
         1 .333333333          0 09-MAY-04         220
         2 .666666667          0 10-MAY-04         320
         3          1          1 11-MAY-04         420
         4 1.33333333          1 12-MAY-04         520
         5 1.66666667          1 13-MAY-04         620
         6          2          2 14-MAY-04         720
         7 2.33333333          2 15-MAY-04         820
         8 2.66666667          2 16-MAY-04         920
 
9 rows selected.

This is a common, very simple technique that you should be applying in lots of cases.  it is *math*.  you subtract.  you divide.  you floor.  after a while you get to do it so often that you do it in your sleep.


If you want to use more than 3 times the resources (n times, as your 3 days might become 5 days or something else in the future), go for it.  hardware vendors love that.


Me, I'm still in search of fast=true. 
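Along those lines, the bucket width can be just another bind variable. The sketch below is an illustration, not part of the original answer (the :n bind is invented here); it turns the same one-full-scan query into an n-day report:

```sql
variable x varchar2(20)
variable n number
exec :x := '07-may-2004'
exec :n := 5

-- the same subtract/divide/floor math, still a single full scan;
-- only the divisor changes when the window grows from 3 to 5 days
select min(trans_date), max(trans_date), sum(amount_sold)
  from sales
 group by floor((trans_date - to_date(:x,'dd-mon-yyyy')) / :n)
 order by 1;
```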

Thanks

May 07, 2004 - 8:13 am UTC

Reviewer: Raju from Hyderabad,India

Dear Tom,
Thanks for your elaborate effort. I really appreciate that.
And now, if we have a requirement of generating a report for the sales table
(i) semi-annually
(ii) bi-monthly
how do we do that?

We can assume a sales table with a *whole year's data*. Could you
please provide some code snippets for that?

I mean
1)sum of sales semi-annually
2)sum of sales bi-monthly

Please do reply.
Bye!


Tom Kyte

Followup  

May 07, 2004 - 8:58 am UTC

what is "semi-annually"

In any case -- use the same sort of technique.  Here is one that rolls

jan+feb -> jan
mar+apr -> mar
may+jun -> may

and so on:

ops$tkyte@ORA9IR2> select trans_date,
  2         trunc(decode(length(to_char(trans_date,'mm')/2), 1, add_months(trans_date,-1), trans_date),'mm') trans_date2
  3    from (select add_months(to_date('12-jan-2004'),rownum-1) trans_date
  4            from all_objects
  5                   where rownum <= 14 )
  6  /
 
TRANS_DAT TRANS_DAT
--------- ---------
12-JAN-04 01-JAN-04
12-FEB-04 01-JAN-04
12-MAR-04 01-MAR-04
12-APR-04 01-MAR-04
12-MAY-04 01-MAY-04
12-JUN-04 01-MAY-04
12-JUL-04 01-JUL-04
12-AUG-04 01-JUL-04
12-SEP-04 01-SEP-04
12-OCT-04 01-SEP-04
12-NOV-04 01-NOV-04
12-DEC-04 01-NOV-04
12-JAN-05 01-JAN-05
12-FEB-05 01-JAN-05
 
14 rows selected.


The trunc(decode....) takes any date and returns the first day of jan, mar, may, jul, ....  we simply map the INPUT date to our target date.  So, a query against "sales" that groups by month pairs could look like this:

ops$tkyte@ORA9IR2> create table t
  2  as
  3  select created trans_date, object_id sales
  4    from all_objects
  5  /
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select trunc(decode(length(to_char(trans_date,'mm')/2), 1, add_months(trans_date,-1), trans_date),'mm') trans_date,
  2         min(trans_date),
  3             max(trans_date),
  4             count(*),
  5         sum(sales) sales
  6    from t
  7   group by trunc(decode(length(to_char(trans_date,'mm')/2), 1, add_months(trans_date,-1), trans_date),'mm')
  8  /
 
TRANS_DAT MIN(TRANS MAX(TRANS   COUNT(*)      SALES
--------- --------- --------- ---------- ----------
01-MAY-02 12-MAY-02 12-MAY-02      29234  434296846
01-NOV-03 25-NOV-03 16-DEC-03         52    1594079
01-JAN-04 02-JAN-04 11-FEB-04       1307   42788901
01-MAR-04 17-MAR-04 15-APR-04         17     617068
01-MAY-04 03-MAY-04 07-MAY-04          4     150141



Now, you do the same for "semi-annual"

Look at the DATE FORMATS, there are some for qtrs and other stuff as well -- fully documented.
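One possible sketch of that "do the same" step, using plain month arithmetic instead of the length() trick -- this variant is an illustration, not the original answer, and the constant 2 is the bucket width in months (change both occurrences to 6 for the semi-annual report):

```sql
-- map each date to the first month of its 2-month bucket:
-- jan/feb -> jan, mar/apr -> mar, ..., nov/dec -> nov
select add_months( trunc(trans_date,'yyyy'),
                   2 * floor((to_number(to_char(trans_date,'mm')) - 1) / 2) ) bucket,
       min(trans_date), max(trans_date), count(*), sum(sales) sales
  from t
 group by add_months( trunc(trans_date,'yyyy'),
                      2 * floor((to_number(to_char(trans_date,'mm')) - 1) / 2) )
 order by 1;
```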
 

But still confused

June 23, 2004 - 11:37 am UTC

Reviewer: Jon Waterhouse from St. John's, Newfoundland

I thought I could use analytic functions to do the following. I have information on a client with a start and end date. Where the periods are contiguous, i.e. I have a row for January and a row for February, I wanted to combine them, i.e. into one row with a start date of January 1 and an end date of the end of February.

But I didn't even manage to get the first step done: marking rows which followed on from a previous period or were immediately followed by another period.

My table looked like this:

CREATE TABLE testMEMBERS (
CLIENT NUMBER (4) NOT NULL,
DATESTART DATE NOT NULL,
DATEEND DATE,
PERSTART NUMBER (1),
PEREND NUMBER (1)) ;

Five rows of data

begin
insert into testmembers (client,datestart,dateend) values (1,to_date('20040101','yyyymmdd'),to_date('20040131','yyyymmdd'));
insert into testmembers (client,datestart,dateend) values (1,to_date('20040201','yyyymmdd'),to_date('20040331','yyyymmdd'));
insert into testmembers (client,datestart,dateend) values (1,to_date('20040501','yyyymmdd'),to_date('20040531','yyyymmdd'));
insert into testmembers (client,datestart,dateend) values (1,to_date('20040601','yyyymmdd'),to_date('20040630','yyyymmdd'));
insert into testmembers (client,datestart,dateend) values (1,to_date('20040701','yyyymmdd'),to_date('20040731','yyyymmdd'));
end;
/


This query properly identifies the rows that should be marked:

select C.client,C.DATESTART,Q.client,Q.DATESTART,
DECODE(lag(Q.dateend) OVER (PARTITION BY Q.client ORDER BY Q.datestart),Q.datestart-numtodsinterval(1,'DAY'),0,1) contigb4,
DECODE(lead(Q.datestart) OVER (PARTITION BY Q.client ORDER BY Q.datestart),Q.dateend+numtodsinterval(1,'DAY'),0,1) contigafter
from testmembers c,testMEMBERS Q
where C.client = 1
AND c.datestart = q.datestart
AND c.client = q.client

but in this form of the query

SELECT C.client,C.DATESTART FROM testMEMBERS C
WHERE c.client = 1 AND (0,1) IN
(SELECT DECODE(lag(dateend) OVER (PARTITION BY client
ORDER BY datestart),datestart-numtodsinterval(1,'DAY'),0,1) contigb4,
DECODE(lead(datestart) OVER (PARTITION BY client
ORDER BY datestart),dateend+numtodsinterval(1,'DAY'),0,1) contigafter
FROM testmembers q
where c.datestart = q.datestart AND c.client = q.client)
the window function doesn't get called at all; I guess that is because it is always the last part of the query to be done. This form returns no rows: the subquery always returns (1,1).

What I was actually trying to do was to update the values in perstart and perend, like so:

update testMEMBERS C set (perstart,perend) =
(SELECT DECODE(lag(dateend) OVER (PARTITION BY client
ORDER BY datestart),datestart-numtodsinterval(1,'DAY'),0,1) contigb4,
DECODE(lead(datestart) OVER (PARTITION BY client
ORDER BY datestart),dateend+numtodsinterval(1,'DAY'),0,1) contigafter
FROM testmembers q
WHERE c.datestart = q.datestart AND c.client = q.client)
where client = 1
I figure there should be a way to use analytic functions to do it, but how?

Thanks for all the help! (I've read the analytic functions chapter in Expert 1on1 and I still don't really understand why it doesn't work. I'm using 9.0.1)


Tom Kyte

Followup  

June 23, 2004 - 11:58 am UTC

it is much simpler than that - no self joins (that is what the analytics do for us)

ops$tkyte@ORA9IR2> select client,
  2         min(datestart),
  3         max(dateend)
  4    from (
  5  select client,
  6         datestart,
  7         dateend,
  8         max(grp) over (partition by client order by dateend) grp
  9    from (
 10  select client, datestart, dateend,
 11         case when lag(dateend)
 12                 over (partition by client order by dateend) != datestart-1
 13                   or
 14                   row_number()
 15                     over (partition by client order by dateend) = 1
 16              then row_number()
 17                     over (partition by client order by dateend)
 18          end grp
 19    from testmembers
 20         )
 21         )
 22   group by client, grp
 23   order by 1, 2
 24  /
 
    CLIENT MIN(DATES MAX(DATEE
---------- --------- ---------
         1 01-JAN-04 31-MAR-04
         1 01-MAY-04 31-JUL-04


 

Nice

June 23, 2004 - 7:13 pm UTC

Reviewer: A reader


Can I use analytic functions here?

July 20, 2004 - 8:46 am UTC

Reviewer: Nina

Tom --

I was wondering if I can use analytic functions in the following code:
I have the following query:

SELECT
SUBSTR(spriden_last_name, 1, 20) last_name,
nbbposn_posn posn,
spriden_id id,

FROM spriden, perjtot, pebempl, nbbposn, ptrecls

WHERE spriden_change_ind is NULL
AND spriden_pidm = perjtot_pidm
AND spriden_pidm = pebempl_pidm
AND perjtot_year = year_parm
AND perjtot_posn = nbbposn_posn

This query brings back the following data:

id posn last_name
------------------------------------------------------
545454 A0345 Smith
545454 A9845 Smith
969696 S3456 Williams
568974 A8745 Sherry
568974 H1234 Sherry

As you can see per each id there exist multiple positions (posn). Each position in turn can
have multiple salaries attached to it. To get the salaries I have the following query:

SELECT
nbrbjob_id,
nbrbjob_posn,
nbrbjob_ann_salary,
nbrjobs_contract_type
FROM nbrbjob, nbrjobs X
WHERE nbrbjob_id = v_id (an id from the query above)
AND nbrbjob_posn = v_posn (posn from the query above)
AND nbrbjob_id = nbrjobs_id
AND nbrbjob_posn = nbrjobs_posn
AND nbrbjob_suff = nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date =
(SELECT MAX(nbrjobs_effective_date)
FROM nbrjobs
WHERE nbrjobs_id = X.nbrjobs_id
AND nbrjobs_posn = X.nbrjobs_posn
AND nbrjobs_suff = X.nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date <= TO_DATE(date_parm,'DD-MON-YY'))
ORDER BY nbrbjob_contract_type;

So if we take the first record from above for example with id 545454, the results might be as follows:

id posn salary contract_type
---------------------------------------------------------
545454 A0345 30,000 P
545454 A0345 25,000 O
545454 A0345 15,000 S


After passing to the above query the next posn (545454 id) - A9845, the results are as follows:

id posn salary contract_type
------------------------------------------------------------
545454 A9845 16,000 P


In short, what I need to get for id 545454 is the total sum of all the salaries across all of his
positions (posn) that were selected in the query #1. The thing is that query #1 has a year parm. So if say the year
parm is 2003, id 545454 might have only 1 position A0345 and if parm was 2002 for example, the id 545454 will have
2 positions A0345 and A9845. Hence, I did not see any other way but to have 2 separate queries. Moreover, since
each of the positions can have multiple salaries, to get the sum of the salaries per position, I would have to do an inline
view (no problem there..). The final requirenment is that each positon should have one contract_type of P. So,
what I actually need to get out of the second query is the sum of the salaries per positon and only THE FIRST record with P (ordering by and selecting rownum < 2 should do it.)
I am only interested in contract_type P. Of course, if a record has an S or something with it, I take that if there is no
P. So now the query # 2 looks as follows:

SELECT total_amount, nbrbjob_contract_type
FROM nbrbjob, nbrjobs X,
(SELECT SUM(nbrjobs_ann_salary) total_amount, nbrjobs_id ID
FROM nbrbjob, nbrjobs X
WHERE nbrbjob_id = '3041913'
AND nbrbjob_posn = '102238'
AND nbrbjob_id = nbrjobs_pidm
AND nbrbjob_posn = nbrjobs_posn
AND nbrbjob_suff = nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date =
(SELECT MAX(nbrjobs_effective_date)
FROM nbrjobs
WHERE nbrjobs_id = X.nbrjobs_id
AND nbrjobs_posn = X.nbrjobs_posn
AND nbrjobs_suff = X.nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date <= TO_DATE(date_parm,'DD-MON-YY')
ORDER BY nbrbjob_contract_type
GROUP BY nbrjobs_pidm ) A
WHERE nbrbjob_id = v_id
AND nbrbjob_posn = v_posn
AND nbrbjob_id = nbrjobs_id
AND nbrbjob_posn = nbrjobs_posn
AND nbrbjob_suff = nbrjobs_suff
AND nbrjobs_id=A.ID
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date =
(SELECT MAX(nbrjobs_effective_date)
FROM nbrjobs
WHERE nbrjobs_id = X.nbrjobs_id
AND nbrjobs_posn = X.nbrjobs_posn
AND nbrjobs_suff = X.nbrjobs_suff
AND nbrjobs_status = 'A'
AND nbrjobs_effective_date <= TO_DATE('31-DEC-03','DD-MON-YY'))
AND rownum<2
ORDER BY nbrbjob_contract_type;

I am getting rownum here and ordering by just to get the P contract type first. Anyway, sorry for this long explanation.
In short, I created 2 PL/SQL tables, I get the data form the first query into the first table, then from the second into the second,
then compare the results and print id, name and sum of salary (that is the end goal here - just to print the sum of salaries per
id - no position needed). I was wondering though if I could simplify all of this and use analytic functions? Is it
possible here?

thank you in advance.



Tom Kyte

Followup  

July 20, 2004 - 7:54 pm UTC

lots of tables there -- I got lost when you added the year.

but I don't see why you cannot just join your first query to the second query (use an inline view and join to it) by the same two fields you are passing to it and then just "sum"?

analytics in plsql

July 20, 2004 - 1:59 pm UTC

Reviewer: kit from england

is there a restriction on analytics in PL/SQL? e.g.
cursor c_xx is
SELECT 'FINAL', xxx_tid, princ
FROM (
SELECT xxx_tid , ABS (principal_amt) princ
, row_number () over (PARTITION BY wss_tid ORDER BY version DESC) rn
FROM tempa
WHERE xxx_tid = xxx_tid)
WHERE rn = 1;

fails to compile in PL/SQL but is OK in SQL. It complains about the "(" after OVER in the row_number call.

Tom Kyte

Followup  

July 20, 2004 - 8:42 pm UTC

ops$tkyte@ORA9IR2> create table tempa ( xxx_tid number, principal_amt number, wss_tid number , version number );
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>    SELECT 'FINAL', xxx_tid, princ
  2  FROM (
  3  SELECT xxx_tid , ABS (principal_amt) princ
  4   , row_number () over (PARTITION BY wss_tid ORDER BY version DESC) rn
  5  FROM tempa
  6  WHERE xxx_tid = xxx_tid)
  7  WHERE  rn = 1;
 
no rows selected
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> declare
  2  cursor c_xx is
  3     SELECT 'FINAL', xxx_tid, princ
  4  FROM (
  5  SELECT xxx_tid , ABS (principal_amt) princ
  6   , row_number () over (PARTITION BY wss_tid ORDER BY version DESC) rn
  7  FROM tempa
  8  WHERE xxx_tid = xxx_tid)
  9  WHERE  rn = 1;
 10
 11  begin
 12  null;
 13  end;
 14  /
 
PL/SQL procedure successfully completed.
 


worked for me; need more details on exactly how to reproduce. 

Analytic functions

July 21, 2004 - 5:19 am UTC

Reviewer: kit from england

I'm working on an 8.1.7.4 instance.
I still get the same error, and I also reproduced your scenario with the same problem.

I presume it's like CASE: you cannot use it in PL/SQL but it is OK in normal SQL.

The thing is, I have used other analytics before in 8.1.7.

Tom Kyte

Followup  

July 21, 2004 - 8:29 am UTC

in 817, analytics, group by cube, group by rollup, order by in a subquery, scalar subqueries and other SQL constructs are not supported in PL/SQL.

you have never used analytics in 817 in plsql unless you

a) used a view to hide the construct
b) used dynamic sql to do the same.
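For instance, workaround (b) might look like the sketch below (the variable names here are invented for illustration; the point is that the analytic text lives inside a string, so the 8.1.7 PL/SQL compiler never has to parse it):

```sql
declare
    type rc is ref cursor;
    c_xx      rc;
    l_label   varchar2(5);
    l_xxx_tid tempa.xxx_tid%type;
    l_princ   number;
begin
    -- dynamic SQL: the analytic syntax is seen only by the SQL engine
    open c_xx for
        'SELECT ''FINAL'', xxx_tid, princ
           FROM (SELECT xxx_tid, ABS(principal_amt) princ,
                        row_number() over
                          (PARTITION BY wss_tid ORDER BY version DESC) rn
                   FROM tempa)
          WHERE rn = 1';
    loop
        fetch c_xx into l_label, l_xxx_tid, l_princ;
        exit when c_xx%notfound;
        null;  -- process each row here
    end loop;
    close c_xx;
end;
/
```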

analytics

September 01, 2004 - 8:54 am UTC

Reviewer: Isi from london - uk

I looked and looked, but you do not mention how to compute an average, within a query, over all values between the 5th and 95th percentiles. I believe this is a challenge.

Tom Kyte

Followup  

September 01, 2004 - 9:06 am UTC

do you mean something like this:

ops$tkyte@ORA9IR2> select emp.deptno, avg(sal)
  2    from emp,
  3         (select deptno,
  4                 percentile_cont(0.05) within group (order by sal) mins,
  5                 percentile_cont(0.95) within group (order by sal) maxs
  6            from emp
  7           group by deptno ) x
  8   where emp.sal between x.mins and x.maxs
  9     and emp.deptno = x.deptno
 10   group by emp.deptno;
 
    DEPTNO   AVG(SAL)
---------- ----------
        10   12403.13
        20 15145.3167
        30   7846.875
 

Call ranging different windows!

September 05, 2004 - 10:48 am UTC

Reviewer: Prasad from UK

Hi tom,

Hope you are doing fine.
Here is my requirement
There are 3 time Windows
(MON,TUE,WED,THU,FRI) AND TIME BETWEEN (08:00:00 TO 18:00:00) => WEEKDAY
(MON,TUE,WED,THU,FRI) and (TIME < 08:00:00 OR TIME > 18:00:00) => EVENING
(SAT,SUN) => WEEKEND

Now, my inputs are
1: Time of call
2: Duration of call (in seconds)

Now, with these two inputs, I have to find out which of these boundaries my call crossed.

e.g. If the call originated on Friday at 17:55:00 and lasted for 22200 seconds (6 Hrs 10 mins) then it crossed:
5 mins => WEEKDAY CALL
6 Hrs => EVENING CALL
5 mins => WEEKEND CALL

Please do let me know if I'm not clear here.

Thanks as always !

Tom Kyte

Followup  

September 05, 2004 - 12:18 pm UTC

I guess I would write a stored procedure or some other function in whatever language I was best at coding in to procedurally figure this one out.

Any suggestion how to go about it?

September 06, 2004 - 4:33 am UTC

Reviewer: A reader


Tom Kyte

Followup  

September 06, 2004 - 10:24 am UTC

it seems to me to be rather straightforward procedural code, solved easily in an almost "brute force" fashion - coding "101" stuff. anyway.

a loop, 3 counters, a little addition, an exit when start_date >= end_date.....

you have:

(MON,TUE,WED,THU,FRI) AND TIME BETWEEN (08:00:00 TO 18:00:00) => WEEKDAY
(MON,TUE,WED,THU,FRI) and (TIME < 08:00:00 OR TIME > 18:00:00) => EVENING
(SAT,SUN) => WEEKEND

Now, my inputs are
1: Time of call
2: Duration of call (in seconds)
............

this is purely "pseudo code"

l_start_date := input;
l_end_date   := input + 1/24/60/60*input_seconds;
while l_start_date < l_end_date
loop
    if (to_char(l_start_date,'DY') in ( 'SAT', 'SUN' ))
    then
        -- if weekend, advance to midnight monday morning.....
        l_next_time := least( trunc(next_day(l_start_date,'MON')), l_end_date);
        l_weekend_time := l_weekend_time + (l_next_time-l_start_date);
    elsif (to_number(to_char(l_start_date,'hh24')) between 8 and 17)
    then
        -- else, not a weekend, is it between 8am and 6pm? if so, advance to
        -- 6pm on the same day....
        l_next_time := least(trunc(l_start_date)+18/24, l_end_date);
        l_weekday := l_weekday + (l_next_time-l_start_date);
    elsif (to_number(to_char(l_start_date,'hh24')) < 8 )
    then
        -- else, not a weekend, not between 8am and 6pm, but it is before 8am
        -- advance to 8am then...
        l_next_time := least(trunc(l_start_date)+8/24, l_end_date );
        l_evening := l_evening + (l_next_time-l_start_date);
    else
        -- else, not a weekend, after 6pm. advance to midnight of the
        -- next day
        l_next_time := least(trunc(l_start_date)+1, l_end_date);
        l_evening := l_evening + (l_next_time-l_start_date);
    end if;
    l_start_date := l_next_time;
end loop;
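For what it is worth, that pseudo code translates almost line for line into real PL/SQL. The following is only a sketch under assumptions not in the original (the procedure name, the OUT parameters, and returning each bucket as a fraction of a day are all invented here):

```sql
create or replace procedure classify_call(
    p_start        in  date,
    p_seconds      in  number,
    p_weekday_days out number,   -- time in each window, as fractions of a day
    p_evening_days out number,
    p_weekend_days out number )
as
    l_start date := p_start;
    l_end   date := p_start + p_seconds/86400;  -- 86400 seconds per day
    l_next  date;
begin
    p_weekday_days := 0; p_evening_days := 0; p_weekend_days := 0;
    while l_start < l_end
    loop
        if to_char(l_start,'DY','nls_date_language=english') in ('SAT','SUN')
        then
            -- weekend: advance to midnight monday (or the call's end)
            l_next := least( trunc(next_day(l_start,'MON')), l_end );
            p_weekend_days := p_weekend_days + (l_next - l_start);
        elsif to_number(to_char(l_start,'hh24')) between 8 and 17
        then
            -- weekday window 08:00-18:00: advance to 6pm the same day
            l_next := least( trunc(l_start) + 18/24, l_end );
            p_weekday_days := p_weekday_days + (l_next - l_start);
        elsif to_number(to_char(l_start,'hh24')) < 8
        then
            -- weekday before 8am: advance to 8am
            l_next := least( trunc(l_start) + 8/24, l_end );
            p_evening_days := p_evening_days + (l_next - l_start);
        else
            -- weekday after 6pm: advance to midnight of the next day
            l_next := least( trunc(l_start) + 1, l_end );
            p_evening_days := p_evening_days + (l_next - l_start);
        end if;
        l_start := l_next;
    end loop;
end;
/
```

Multiplying each OUT value by 86400 gives seconds; walking the Friday 17:55:00 + 22200 seconds example through the loop reproduces the 5 min weekday / 6 hrs evening / 5 min weekend split described above.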





U R A GENIUS!!

September 06, 2004 - 1:23 pm UTC

Reviewer: A reader


How to merge overlapping records?

September 07, 2004 - 4:08 am UTC

Reviewer: Tony from India

I've data as below and want to merge the records with overlapping dates. The MIN and MAX of the start and end dates of the overlapping records should become the start and end dates of the merged record. Is it possible to do this in a single query using analytic or any other features of Oracle9i (9.2)?

Before merge:

Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 20-Jun-2004
111 22-May-2004 07-Jun-2004
111 20-Jun-2004 13-Aug-2004
111 27-May-2004 30-Aug-2004
111 02-Sep-2004 23-Dec-2004
222 21-May-2004 19-Aug-2004
...
...

After the merge:

Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 30-Aug-2004
111 02-Sep-2004 23-Dec-2004
222 21-May-2004 19-Aug-2004
...
...

Scripts:

create table item(item_code number, start_date date, end_date date);

insert into item values (111, to_date('15-May-2004','dd-mon-yyyy'), to_date('20-Jun-2004','dd-mon-yyyy'));
insert into item values (111, to_date('22-May-2004','dd-mon-yyyy'), to_date('07-Jun-2004','dd-mon-yyyy'));
insert into item values (111, to_date('20-Jun-2004','dd-mon-yyyy'), to_date('13-Aug-2004','dd-mon-yyyy'));
insert into item values (111, to_date('27-May-2004','dd-mon-yyyy'), to_date('30-Aug-2004','dd-mon-yyyy'));
insert into item values (111, to_date('02-Sep-2004','dd-mon-yyyy'), to_date('23-Dec-2004','dd-mon-yyyy'));
insert into item values (222, to_date('21-May-2004','dd-mon-yyyy'), to_date('19-Aug-2004','dd-mon-yyyy'));


Thanks in advance.


Tom Kyte

Followup  

September 07, 2004 - 7:43 am UTC

variation on a theme:
http://otn.oracle.com/oramag/oracle/04-mar/o24asktom.html

"analytics to the rescue"

ops$tkyte@ORA9IR2> select item_code,
  2         start_date,
  3         end_date,
  4         case when lag(end_date) over (partition by item_code order by end_date) < start_date
  5                   or
  6                   lag(end_date) over (partition by item_code order by end_date) is null
  7              then row_number() over (partition by item_code order by end_date)
  8          end rn
  9    from item
 10   order by 1, 3
 11  /
 
 ITEM_CODE START_DAT END_DATE          RN
---------- --------- --------- ----------
       111 22-MAY-04 07-JUN-04          1
       111 15-MAY-04 20-JUN-04
       111 20-JUN-04 13-AUG-04
       111 27-MAY-04 30-AUG-04
       111 02-SEP-04 23-DEC-04          5
       222 21-MAY-04 19-AUG-04          1
 
6 rows selected.
 
ops$tkyte@ORA9IR2> select item_code,
  2         start_date,
  3         end_date,
  4         max(rn) over (partition by item_code order by end_date) max_rn
  5    from (
  6  select item_code,
  7         start_date,
  8         end_date,
  9         case when lag(end_date) over (partition by item_code order by end_date) < start_date
 10                   or
 11                   lag(end_date) over (partition by item_code order by end_date) is null
 12              then row_number() over (partition by item_code order by end_date)
 13          end rn
 14    from item
 15         )
 16   order by 1, 3
 17  /
 
 ITEM_CODE START_DAT END_DATE      MAX_RN
---------- --------- --------- ----------
       111 22-MAY-04 07-JUN-04          1
       111 15-MAY-04 20-JUN-04          1
       111 20-JUN-04 13-AUG-04          1
       111 27-MAY-04 30-AUG-04          1
       111 02-SEP-04 23-DEC-04          5
       222 21-MAY-04 19-AUG-04          1
 
6 rows selected.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select item_code, min(start_date), max(end_date)
  2    from (
  3  select item_code,
  4         start_date,
  5         end_date,
  6         max(rn) over (partition by item_code order by end_date) max_rn
  7    from (
  8  select item_code,
  9         start_date,
 10         end_date,
 11         case when lag(end_date) over (partition by item_code order by end_date) < start_date
 12                   or
 13                   lag(end_date) over (partition by item_code order by end_date) is null
 14              then row_number() over (partition by item_code order by end_date)
 15          end rn
 16    from item
 17         )
 18         )
 19   group by item_code, max_rn
 20   order by 1, 2
 21  /
 
 ITEM_CODE MIN(START MAX(END_D
---------- --------- ---------
       111 15-MAY-04 30-AUG-04
       111 02-SEP-04 23-DEC-04
       222 21-MAY-04 19-AUG-04
 
 

Any function available

September 07, 2004 - 7:11 am UTC

Reviewer: P.Karthick from India

Hi,

I have created two tables as shown below.

create table temp_trans (dept_id char(4),emp_id char(4),emp_name char(20))

create table temp_trans_1 (emp_id char(4))


I have inserted the following data in temp_trans.

insert into temp_trans values('D001','E001','KARTHICK')

insert into temp_trans values('D001','E002','KAUSHIK')

insert into temp_trans values('D001','E003','RAVI')

insert into temp_trans values('D001','E004','SESHU')

insert into temp_trans values('D002','E001','VIMAL')

insert into temp_trans values('D002','E002','VIJAY')

insert into temp_trans values('D002','E003','SURESH')


I have inserted the following data in temp_trans_1.

insert into temp_trans_1 values('E001')

insert into temp_trans_1 values('E002')

insert into temp_trans_1 values('E003')

insert into temp_trans_1 values('E004')

insert into temp_trans_1 values('E005')



Now I am trying to get output in this format.

------------------------------------------------------
DEPT E001 E002 E003 E004 E005
------------------------------------------------------

D001 KARTHICK KAUSHIK RAVI SESHU

D002 VIMAL VIJAY SURESH

------------------------------------------------------


Where E001,E002... are got from temp_trans_1.

I don't want to use dynamic SQL. Is it possible with a simple SQL statement? Is there
any function available to take the transpose of a table, i.e. interchange rows and columns?


Thank you.



Tom Kyte

Followup  

September 07, 2004 - 9:26 am UTC

go look on the other pages where you posted this. and please try to just sort of do it "once" in the future....


How to merge overlapping records?

September 08, 2004 - 1:34 am UTC

Reviewer: Tony from India

Your answer to my previous question on "merging overlapping records" is wonderful. I forgot to mention one more scenario. The merge should also include records whose start date is one day after the end date of any other record.

Before merge:

Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 20-Jun-2004
111 21-Jun-2004 07-Aug-2004
222 21-May-2004 19-Aug-2004

After the merge:

Item Code Start_date End_date
============== =========== ===========
111 15-May-2004 07-Aug-2004
222 21-May-2004 19-Aug-2004


Is it possible to tweak your above query to achieve this in a single query? Sorry for troubling you.

Though I have worked with Oracle for several years, the analytic features are very difficult to understand and use appropriately. I think most developers will agree with me. Are you planning to write a book on analytic functions? If so, it'll be of great help to Oracle lovers. Can you point to any URL or book on analytic functions?

Thanks in advance.








Tom Kyte

Followup  

September 08, 2004 - 9:24 am UTC

do you have my book "expert one on one Oracle"?

there is an entire chapter on it in there.

there is an entire chapter on it in the "Data warehousing guide" as well (otn.oracle.com)

as for your followup -- give it a try -- it is fairly obvious how "records that overlap" are assigned here, the case statement does that. tweak it. also read the related article, it goes step by step what is happening here.
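
A hedged sketch of that tweak (untested): since ranges touching across a one-day gap should now merge too, the comparison against the previous end_date just needs to allow one extra day, i.e. compare against start_date - 1:

```sql
select item_code, min(start_date), max(end_date)
  from (
select item_code, start_date, end_date,
       max(rn) over (partition by item_code order by end_date) max_rn
  from (
select item_code, start_date, end_date,
       -- start a new group only when the previous range ends more than
       -- one day before this range starts (or there is no previous range)
       case when lag(end_date) over (partition by item_code order by end_date) < start_date - 1
                 or
                 lag(end_date) over (partition by item_code order by end_date) is null
            then row_number() over (partition by item_code order by end_date)
        end rn
  from item
       )
       )
 group by item_code, max_rn
 order by 1, 2
```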

ora7.3 and ora8i

September 09, 2004 - 10:13 am UTC

Reviewer: ygp from India

I have two oracle instances 'ora8i" and "ora73".

I wanted to generate a report by using analytic functions on a table "ora73tab" in ora73.

I created a DB-link "ora8i_to_ora73" in ora8i and then created a synonym "ora8isyn" for the table "ora73tab@ora8i_to_ora73".

Now, using analytic functions on the synonym ora8isyn ( in instance ora8i, of course )is allowed; it works.

Does it mean that the query given to ora8i is not supplied to ora73 as it is?


Tom Kyte

Followup  

September 09, 2004 - 11:39 am UTC

ora73 didn't have analytics, therefore the data is getting pulled into Oracle8i and processed.

You can verify that with explain plan or autotrace -- it'll show you the sql that is being sent to 7.3

It's fun to play with Analytics using Tony's question

September 09, 2004 - 1:59 pm UTC

Reviewer: Winston from Canada

Using 'Analytics to the Rescue' techniques, I found the following query will serve the purpose.

select item_code,
min(start_date) start_date,
max(end_date) end_date
from
(
select item_code, start_date,end_date,
max(rn) over (partition by item_code order by end_date) max_rn from
(
select item_code, start_date, end_date,
case
when start_date - pr_enddate < 1/24
or pr_enddate is null
then row_num
end rn
from
(
select item_code, start_date, end_date,
lag(end_date)
over (partition by item_code order by end_date) pr_enddate ,
row_number()
over ( order by end_date) row_num
from item)
))
group by item_code, max_rn
order by 1

September 10, 2004 - 6:03 pm UTC

Reviewer: A reader

SQL> select * from t;

         X D
---------- ---------
       100 10-SEP-04
       100 10-SEP-04
       200 09-SEP-04
       200 09-SEP-04
       300 09-SEP-04
       300 09-SEP-04

I need the following

select x, (the summary of x for the whole month of september)
from t where d='10-SEP-04'

what is the best way to do this. Can we do this using analytic functions? 

Tom Kyte

Followup  

September 10, 2004 - 7:44 pm UTC

yup and if you would have given create table/insert into (as the page you used to add this said in bold...) i would even have shown you how!

September 10, 2004 - 10:17 pm UTC

Reviewer: A reader

Here we go Tom, sorry for not giving you the DDL and inserts in the first place. In addition, can you also show me how I can populate this kind of data with one single insert?

create table t ( x number, d date)
/
insert into t ( 100, sysdate);
/
insert into t ( 200, sysdate-1);
/
insert into t ( 300, sysdate-2);
/
insert into t ( 400, sysdate-3);
/
insert into t ( 500, sysdate-4);
/
insert into t ( 600, sysdate-5);
/



Tom Kyte

Followup  

September 11, 2004 - 8:14 am UTC

It would be nice if you tested them (they are not SQL as typed)...  And if they did work -- you would insert each row twice, since the trailing / re-executes the statement in the SQL*Plus buffer.

anyway, to make it more interesting, I added another month:

ops$tkyte@ORA9IR2> select x, d, sum(x) over (partition by trunc(d,'mm'))
  2    from t;
 
         X D         SUM(X)OVER(PARTITIONBYTRUNC(D,'MM'))
---------- --------- ------------------------------------
       100 11-SEP-04                                 2100
       200 10-SEP-04                                 2100
       300 09-SEP-04                                 2100
       400 08-SEP-04                                 2100
       500 07-SEP-04                                 2100
       600 06-SEP-04                                 2100
        50 11-OCT-04                                 1050
       100 10-OCT-04                                 1050
       150 09-OCT-04                                 1050
       200 08-OCT-04                                 1050
       250 07-OCT-04                                 1050
       300 06-OCT-04                                 1050
 
12 rows selected.
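
As for the side question above (populating this kind of data with one single insert), which went unanswered -- a sketch, untested, using the common connect-by row-generator trick against dual:

```sql
-- level runs 1..6, giving x = 100..600 with dates stepping back one day at a time
insert into t (x, d)
select 100 * level, trunc(sysdate) - (level - 1)
  from dual
connect by level <= 6;
```

(connect by level against dual became formally supported behavior in 10g; on older releases people often generated rows from an existing table or a pipelined function instead.)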
 
 

you missed the where clause

September 11, 2004 - 9:38 am UTC

Reviewer: A reader

1.
select x, (the summary of x for the whole month of september) from t where d='10-SEP-04'

Tom, you missed the where clause in your solution, my requirement is with the where clause included.

2.
alter table t add ( z varchar2(10) default 'A')
/
UPDATE table T set z = 'B' where x < 300
/

( my sqlplus is hanging when I'm trying to connect to my database, I could not test the above, please forgive me)

Now I want the totals for the month grouped by z for the whole month in the last column.

Thanks in advance.


Tom Kyte

Followup  

September 11, 2004 - 10:50 am UTC


1)

select *
from (
select x, d, sum(x) over (partition by trunc(d,'mm'))
from t )
where d = TO_DATE('.....')



2) give it a go! try it out! really -- the best way to learn analytics is to take the concepts I've shown here on this site over and over and over and apply them.

(hint: partition by is like group by)
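
For the second requirement (monthly totals per z), the hint points at something along these lines -- a sketch, untested:

```sql
select *
  from ( select x, d, z,
                -- total of x per calendar month AND per z value
                sum(x) over (partition by trunc(d,'mm'), z) sum_x_by_month_and_z
           from t )
 where d = TO_DATE('.....')
```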

Analytics vs Scalar Subquery

September 11, 2004 - 1:03 pm UTC

Reviewer: Reader..

What is the gain (if any) you obtain using here Analytics vs any other thing (say, a scalar subquery) ?

Tom Kyte

Followup  

September 11, 2004 - 1:07 pm UTC

analytics offer a one pass flexibility on the data that scalar subqueries don't

they are both useful in their own right.

September 11, 2004 - 2:18 pm UTC

Reviewer: A reader

Tom, just curious to know how you can solve the above using a scalar subquery. Also, I want to know what "scalar" means in terms of SQL.

Tom Kyte

Followup  

September 11, 2004 - 2:59 pm UTC

ops$tkyte@ORA9IR2> select x, d,
  2         (select sum(x)
  3            from t t2
  4           where d between trunc(t.d,'mm') and add_months(trunc(t.d,'mm'),1)
  5             and d <> add_months(trunc(t.d,'mm'),1) ) sum_x
  6    from t
  7  /
 
         X D              SUM_X
---------- --------- ----------
       100 11-SEP-04       2100
       200 10-SEP-04       2100
       300 09-SEP-04       2100
       400 08-SEP-04       2100
       500 07-SEP-04       2100
       600 06-SEP-04       2100
        50 11-OCT-04       1050
       100 10-OCT-04       1050
       150 09-OCT-04       1050
       200 08-OCT-04       1050
       250 07-OCT-04       1050
       300 06-OCT-04       1050
 
12 rows selected.


a scalar subquery is a subquery used where you could use the string 'FOOBAR' (as an expression).  it returns a single column and 0 or 1 rows. 

September 11, 2004 - 3:14 pm UTC

Reviewer: A reader

ops$tkyte@ORA9IR2> select x, d,
2 (select sum(x)
3 from t t2
4 where d between trunc(t.d,'mm') and add_months(trunc(t.d,'mm'),1)
5 and d <> add_months(trunc(t.d,'mm'),1) ) sum_x
6 from t
7 /


Why do we need to do a between in the where clause? Can't we just do: where to_char(d, 'mm/yyyy') = to_char(date_suppliedasparam, 'mm/yyyy')

Isn't the below more appropriate?

select x, d,
(select sum(x)
from t t2
where to_char(t2.d, 'mm/yyyy') = to_char(date_param, 'mm/yyyy')) monthly_sum_of_x
from t
where d = date_param


or



select
a.x,
a.d,
a.monthly_sum_of_x
from
( select x, d,
sum(x) over (partition by trunc(d,'mm')) monthly_sum_of_x
from t
where to_char(d, 'mm/yyyy') = to_char(date_param, 'mm/yyyy')) a
where a.d = date_param



1. Please comment on both of them.

2. What do you mean by "one pass flexibility on the data"?

Tom Kyte

Followup  

September 11, 2004 - 3:34 pm UTC

why the between? it is probable we want to do an index range scan there, not a full table scan for every execution of the scalar subquery.

where column between 'value' and 'value' --> index range scan
where f(column) = 'value' --> no index (assuming no function based indexes that is)


so, avoid performing f(column) -- period.
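
(The parenthetical about function-based indexes refers to the exception: if you really must filter on f(column), you can index the expression itself. A sketch, untested, assuming table t from above:)

```sql
-- hypothetical function-based index on the month of d;
-- with it in place, a predicate such as
--   where trunc(d,'mm') = to_date('01-sep-2004','dd-mon-yyyy')
-- can use an index range scan after all
create index t_month_idx on t ( trunc(d,'mm') );
```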


the analytics can get the answer in a single pass on the table. the scalar subquery would

a) read the table to find rows
b) for each row found, re-read the table to get the summation



September 11, 2004 - 3:25 pm UTC

Reviewer: A reader

2.
alter table t add ( z varchar2(10) default 'A')
/
UPDATE table T set z = 'B' where x < 300
/

( my sqlplus is hanging when I'm trying to connect to my database, I could not
test the above, please forgive me)

Now I want the totals for the month grouped by z for the whole month in the last
column.

Tom, as I have said above, my Oracle session hangs when I try to connect. I need to have this solution ready by Monday next week. Please help.

Would the below work

select
a.x,
a.d,
a.monthly_sum_of_x_for_z

( select x, d, z,
sum(x) over (partition by trunc(d,'mm'), z) monthly_sum_of_x_for_z
from t
where to_char(d, 'mm/yyyy') = to_char(date_param, 'mm/yyyy'))

where a.d =date_param



Tom Kyte

Followup  

September 11, 2004 - 3:35 pm UTC

if you cannot log into your database, you don't need any queries no?

really -- I'm not a "program everything for me please" service. Sounds like you might need to go into work -- if you have work due on Monday morning?

Analytic Functions

October 12, 2004 - 7:02 pm UTC

Reviewer: sam from nj

I just want to add one more condition to the first followup.

You gave a example

scott@ORA817DEV.US.ORACLE.COM> select *
2 FROM (select empno,
3 deptno,
4 hiredate,
5 sal,
6 dense_rank()
7 over (partition by deptno
8 order by sal desc) dr
9 from emp
10 where hiredate >= to_date( '01-JAN-1980', 'dd-mon-yyyy' )
11 and sal > 0)
12 where dr <= 3
13 order by deptno, sal desc
14 /

EMPNO DEPTNO HIREDATE SAL DR
---------- ---------- --------- ---------- ----------
7839 10 17-NOV-81 5000 1
7782 10 09-JUN-81 2450 2
7934 10 23-JAN-82 1300 3
7788 20 09-DEC-82 3000 1
7902 20 03-DEC-81 3000 1
7566 20 02-APR-81 2975 2
7876 20 12-JAN-83 1100 3
7698 30 01-MAY-81 2850 1
7499 30 20-FEB-81 1600 2
7844 30 08-SEP-81 1500 3

10 rows selected.

Can you help me modify above query to have
if the top salary in the department is greater than 3000, show all the ranked rows; else show just the first-ranked row(s).

So the result should look like

EMPNO DEPTNO HIREDATE SAL DR
---------- ---------- --------- ---------- ----------
7839 10 17-NOV-81 5000 1
7782 10 09-JUN-81 2450 2
7934 10 23-JAN-82 1300 3
7788 20 09-DEC-82 3000 1
7902 20 03-DEC-81 3000 1
7698 30 01-MAY-81 2850 1

Thanks for your help as always.


Tom Kyte

Followup  

October 12, 2004 - 7:56 pm UTC

ops$tkyte@ORA9IR2> select *
  2  FROM (select empno,
  3               deptno,
  4               hiredate,
  5               sal,
  6               dense_rank()
  7               over (partition by deptno
  8                     order by sal desc) dr,
  9                       max(sal)
 10                           over (partition by deptno) max_sal
 11               from emp
 12               where hiredate >= to_date( '01-JAN-1980', 'dd-mon-yyyy' )
 13               and sal > 0)
 14           where dr <= 3
 15  order by deptno, sal desc
 16  /
 
     EMPNO     DEPTNO HIREDATE         SAL         DR    MAX_SAL
---------- ---------- --------- ---------- ---------- ----------
      7839         10 17-NOV-81       5000          1       5000
      7782         10 09-JUN-81       2450          2       5000
      7934         10 23-JAN-82       1300          3       5000
      7788         20 09-DEC-82       3000          1       3000
      7902         20 03-DEC-81       3000          1       3000
      7566         20 02-APR-81       2975          2       3000
      7876         20 12-JAN-83       1100          3       3000
      7698         30 01-MAY-81       2850          1       2850
      7499         30 20-FEB-81       1600          2       2850
      7844         30 08-SEP-81       1500          3       2850
 
10 rows selected.
 


That should be sufficient to get you going........... now what should the where clause be -- that should be easy. 
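
One possible where clause matching the requested output -- a sketch, untested: keep all top-3 rows when the department's top salary exceeds 3000, otherwise keep only the first rank:

```sql
select *
  FROM (select empno, deptno, hiredate, sal,
               dense_rank() over (partition by deptno order by sal desc) dr,
               max(sal) over (partition by deptno) max_sal
          from emp
         where hiredate >= to_date( '01-JAN-1980', 'dd-mon-yyyy' )
           and sal > 0)
 where dr <= 3
   and (max_sal > 3000 or dr = 1)    -- dept 10 keeps 3 rows; depts 20 and 30 keep rank 1 only
 order by deptno, sal desc
```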

An observation...

October 12, 2004 - 11:18 pm UTC

Reviewer: Yogee from Mahwah, NJ

Is it always like

select ....
from ( ...query with analytical function)
where ....

Does it always have to be a sub-query?
Would that add to performance problems?

Tom Kyte

Followup  

October 13, 2004 - 8:04 am UTC

if you want to "where" on the analytics, you have to use inline views

it is not always like that, there are analytics without inline views, but when you want to where on the analytic, you need the inline view.


it would not be "adding to a performance problem". consider:


select *
from ( select a.*, max(dt) over (partition by empno) max_dt_by_empno
from emp_history )
where dt = max_dt_by_empno
/

to get the "most current emp record from a history". Compare that to:

select *
from emp_history X
where dt = ( select max(dt) from emp_history Y where x.empno = y.empno );


or

select *
from emp_history
where (empno,dt) in (select empno,max(dt) from emp_history group by empno);


I see the last two as performance issues, the first one as the solution to them.
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:5280776862234#5767134024470
for example....


Analytic Functions

October 13, 2004 - 9:49 am UTC

Reviewer: sam from nj

Thanks for replying back.

Here is another scenario

The query looks like

select * from
( select AUCTION_ID,
AUCTION_VER,
AUCTION_SUB_CAT_ID,
REC_STAT,
dense_rank() over (partition by auction_id order by
auction_ver desc) dr
from SECLEND_AUCTION )
where dr <= 2

The result is
AUCTION_ID AUCTION_VER AUCTION_SUB REC_STAT DR
1 2 5 4 1
1 1 5 3 2
2 1 5 3 1
42 2 5 2 1
42 1 5 4 2
43 2 5 2 1
43 1 5 6 2

What I want is if the rec_stat = 2 then show 2 rows
If rec_stat <> 2 (we can not do min / max on this column it's random)then get only first row.
So in above case AUCTION_ID = 1 & AUCTION_VER = 2 should not be shown.

Thanks for your help.



Tom Kyte

Followup  

October 13, 2004 - 10:06 am UTC

so, use

max( decode( rec_stat, 2, rec_stat ) ) over (partition by whatever) max_rs


or

count( decode(rec_stat,2,2) ) over (partition by whatever) cnt_rs


-- and that'll tell you if you have at least ONE rec_stat in your partition by clause. Now, you can add the predicate on that, like "and cnt_rs > 0" for example.


you know the technique, now it is just a matter of "application"
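
Putting the count/decode variant together -- a sketch, untested, using the "at least one rec_stat = 2 in the partition" reading:

```sql
select *
  from ( select auction_id, auction_ver, auction_sub_cat_id, rec_stat,
                dense_rank() over (partition by auction_id
                                   order by auction_ver desc) dr,
                -- counts the rows with rec_stat = 2 in each auction_id
                count( decode(rec_stat, 2, 2) )
                      over (partition by auction_id) cnt_rs
           from seclend_auction )
 where dr = 1
    or (dr = 2 and cnt_rs > 0)
```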

Analytic Functions

October 13, 2004 - 10:41 am UTC

Reviewer: sam from nj

Tom,

Sorry not to be crystal clear in my requirement.

select * from
( select AUCTION_ID,
AUCTION_VER,
AUCTION_SUB_CAT_ID,
REC_STAT,
dense_rank() over (partition by auction_id order by
auction_ver desc) dr
from SECLEND_AUCTION )
where dr <= 2

The result is
AUCTION_ID AUCTION_VER AUCTION_SUB REC_STAT DR
1 2 5 4 1
1 1 5 2 2
2 1 5 2 1
42 2 5 2 1
42 1 5 4 2
43 12 5 2 1
43 11 5 6 2

It's not just rec_Stat 2 present in partition or not.
It's the position of REC_STAT = 2.
As you see in above example Auction_id 1 has 2 rows.
AUCTION_VER 1 and 2 , (they are not necessary to be 1 & 2)
So the logic is
for max(AUCTION_VER) if REC_STAT = 2 then show 2 rows.
else show only 1 row.

So in above case for AUCTION_ID 1 only 1 row is displayed.
For AUCTION_ID 2 only one row is displayed since there is no second row.





Tom Kyte

Followup  

October 13, 2004 - 12:00 pm UTC

so, first_value(rec_stat) over (partition by key order by whatever desc)

use the analytics -- you know the technique now -- just a matter of seeing what is available and applying them!!!
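
Applying first_value to the refined requirement -- a sketch, untested: show the second row only when the highest-versioned row's rec_stat is 2:

```sql
select *
  from ( select auction_id, auction_ver, auction_sub_cat_id, rec_stat,
                dense_rank() over (partition by auction_id
                                   order by auction_ver desc) dr,
                -- rec_stat of the row with max(auction_ver) per auction_id
                first_value(rec_stat)
                    over (partition by auction_id
                          order by auction_ver desc) top_rec_stat
           from seclend_auction )
 where dr = 1
    or (dr = 2 and top_rec_stat = 2)
```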




how to get this output

October 18, 2004 - 8:11 am UTC

Reviewer: A reader

Hi

I have this table


select * from test01;

ID_TRANSACCION ID_TIPO_TRANS
-------------- -------------
1 1
1 1
1 3
2 2
2 2
3 1
4 1

I would like to get those rows which have more than 1 combination of both columns so should get this output

ID_TRANSACCION ID_TIPO_TRANS
-------------- -------------
1 1
1 3
2 2
3 1
4 1

Can we do this with row_number() ?

Tom Kyte

Followup  

October 18, 2004 - 9:07 am UTC

that looks simply like "distinct" -- but distinct conflicts with "more than 1 combination" -- but then again, your output conflicts with that as well....

try select distinct.

I beg your pardon

October 18, 2004 - 9:29 am UTC

Reviewer: A reader

Hi

I am sorry, posted the wrong output

ID_TRANSACCION ID_TIPO_TRANS
-------------- -------------
1 1
1 1
1 3
2 3
2 2
3 1
4 1

I should get this


ID_TRANSACCION
--------------
1
2

Those IDs which have more than one combination.

Sorry :(

Tom Kyte

Followup  

October 18, 2004 - 9:50 am UTC

<quote>
I would like to get those rows which have more than 1 combination of both
columns so should get this output
</quote>

well, that conflicts with everything. (being able to clearly state the goal is *vital* in our profession)

See -- here I have to guess. I have to guess "is the text right" or "is the picture right" or "is my interpretation of the picture right" -- or actually "which one of my interpretations is correct"

I guess you want to see distinct values for id_transaccion such that there is more than one value for that id_transaccion in that table (eg: id_tipo_trans is quite simply not relevant to the problem at all -- it has no bearing on the outcome).

that could be:

select id_transaccion from t group by id_transaccion having count(*) > 1;

Or, maybe it could be "you want to see the distinct values for id_transaccion such that there are more than one distinct id_tipo_trans values for it" -- that is, id_tipo_trans is very relevant.

that could be:

select id_transaccion from t group by id_transaccion
having count(distinct id_tipo_trans) > 1;


or there could be some other thing you wanted -- but I didn't guess it :)





the second query

October 18, 2004 - 10:01 am UTC

Reviewer: A reader

Hi

The second query is what I need. I need to see those id_transaccion values which belong to more than one id_tipo_trans. I have a further question: if I want to output both id_transaccion and id_tipo_trans, the group by query will not work. I guess I can do that with an analytic function, correct?


Tom Kyte

Followup  

October 18, 2004 - 10:17 am UTC

select *
from ( select a, b, count(distinct b) over (partition by a) cnt
from t )
where cnt > 1;



Analytics in 817

October 20, 2004 - 10:30 am UTC

Reviewer: Al from Edinburgh

Tom,
I can run the following query in 817 without any problems. I want to use the results of the query to update a separate table based on the curr and prev columns, but when I create an anonymous block and embed the SQL in a cursor I get an error. Is there another way to do this?
SELECT MAX( DECODE(
a.ccy_hedge_event_type_id,
13, a.present_value
)) AS curr_ffp,
MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
)) AS curr_ffr,
MAX( DECODE(
a.ccy_hedge_event_type_id,
17, a.present_value
)) AS curr_ffs,
LAG( MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
))) OVER ( PARTITION BY a.deal_reference ORDER BY date_from ) prev_ffr,
a.deal_reference,
date_from,
date_to
FROM ccy_hedge_daily_value a, ccy_hedge_deal b
WHERE b.ccy_hedge_deal_type_id IN( 2, 3 )
AND a.deal_reference = b.deal_reference
GROUP BY a.deal_reference, date_from, date_to;
No problem; it returns 4000 records in < 1 sec.

DECLARE
CURSOR c_data IS
SELECT MAX( DECODE(
a.ccy_hedge_event_type_id,
13, a.present_value
)) AS curr_ffp,
MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
)) AS curr_ffr,
MAX( DECODE(
a.ccy_hedge_event_type_id,
17, a.present_value
)) AS curr_ffs,
LAG( MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
))) OVER ( PARTITION BY a.deal_reference ORDER BY date_from ) prev_ffr,
a.deal_reference,
date_from,
date_to
FROM ccy_hedge_daily_value a, ccy_hedge_deal b
WHERE b.ccy_hedge_deal_type_id IN( 2, 3 )
AND a.deal_reference = b.deal_reference
GROUP BY a.deal_reference, date_from, date_to;
BEGIN
FOR c_rec IN c_data LOOP
DBMS_OUTPUT.put_line('reference ' || c_rec.deal_reference );
END LOOP;
END;

ORA-06550: line 19, column 30:
PLS-00103: Encountered the symbol "(" when expecting one of the following:

Tom Kyte

Followup  

October 20, 2004 - 11:45 am UTC

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:3027089372477


in 8i, you must use native dynamic sql or a view.

Can you use analytics to compare 'snapshot' data?

October 20, 2004 - 4:45 pm UTC

Reviewer: Robert from Memphis, USA.

Tom,

I have searched your website for this answer, and spent several hours working on this (I finally came up with an effective 'brute force' method).

Is there a simple, slick method of using analytics to compare 'snapshot' records within the same table?

Here is a sample "segment growth" table...

create table t1
( snaptime date
, seg_name varchar2(10)
, seg_size number
);

insert into t1 values (trunc(sysdate) - 20, 'X1', 3000);
insert into t1 values (trunc(sysdate) - 20, 'X2', 10000);
insert into t1 values (trunc(sysdate) - 5,  'X1', 4000);
insert into t1 values (trunc(sysdate) - 5,  'X2', 12000);
insert into t1 values (trunc(sysdate) - 5,  'X3', 600);
insert into t1 values (trunc(sysdate) - 0,  'X1', 5000);
insert into t1 values (trunc(sysdate) - 0,  'X3', 500);

SQL> select * from t1;

SNAPTIME  SEG_NAME    SEG_SIZE
--------- ---------- ---------
30-SEP-04 X1              3000
30-SEP-04 X2             10000
15-OCT-04 X1              4000
15-OCT-04 X2             12000
15-OCT-04 X3               600
20-OCT-04 X1              5000
20-OCT-04 X3               500

Here is a simple report comparing the latest 2 snapshots...

BEGIN       END         SEGMENT GROWTH
----------- ----------- ------- -------
15-OCT-2004 20-OCT-2004 X1         1000
15-OCT-2004 20-OCT-2004 X2       -12000
15-OCT-2004 20-OCT-2004 X3         -100

Can one use analytics to easily write a query to generate a report like the one above... and even more sophisticated reports?

Thanks,

Robert. 

Tom Kyte

Followup  

October 20, 2004 - 5:16 pm UTC

sure, lag and lead.

select * from (
select snaptime st, lead(snaptime)
over (partition by segment order by snaptime) end,
segment,
lead(seg_size)
over (partition by segment order by snaptime) - seg_size growth
from table )
where end is not null;


just take the current row and use lead to look forward a row.

That's it! Thanks, Tom!

October 20, 2004 - 5:20 pm UTC

Reviewer: Robert from Memphis, USA.


One enhancement....

October 20, 2004 - 6:34 pm UTC

Reviewer: Robert from Memphis, USA

Tom,

How would I return only one row per segment. The row with the MAX endtime?

Here is how the query works now....

SQL> select * from (
  2  select snaptime st
  3       , lead(snaptime) over (partition by seg_name order by snaptime) end
  4       , seg_name,
  5         lead(seg_size) over (partition by seg_name order by snaptime) - seg_size growth
  6  from t1 )
  7  where end is not null
  8  /

ST        END       SEG_NAME      GROWTH
--------- --------- ---------- ---------
30-SEP-04 15-OCT-04 X1              1000
15-OCT-04 20-OCT-04 X1              1000
30-SEP-04 15-OCT-04 X2              2000
15-OCT-04 20-OCT-04 X3              -100

4 rows selected.

How would I get the following...only the 'most recent' row for each segment?

ST        END       SEG_NAME      GROWTH
--------- --------- ---------- ---------
15-OCT-04 20-OCT-04 X1              1000
30-SEP-04 15-OCT-04 X2              2000
15-OCT-04 20-OCT-04 X3              -100

Thanks,

Robert. 

Never mind! ..... I've got it.......But thanks for your help!

October 20, 2004 - 7:02 pm UTC

Reviewer: Robert from Memphis, USA

Tom,
Never mind.... here is the answer to my own question.
Thanks,
Robert.

"analytics rock!" :)

  1  select * from (
  2  select snaptime st
  3       , lead(snaptime) over (partition by seg_name order by snaptime) end
  4       , seg_name
  5       , lead(seg_size) over (partition by seg_name order by snaptime) - seg_size growth
  6       , max(snaptime) over (partition by seg_name) max_snap
  7  from t1 )
  8  where end is not null
  9* and end = max_snap
SQL> /

ST        END       SEG_NAME      GROWTH MAX_SNAP
--------- --------- ---------- --------- ---------
15-OCT-04 20-OCT-04 X1              1000 20-OCT-04
30-SEP-04 15-OCT-04 X2              2000 15-OCT-04
15-OCT-04 20-OCT-04 X3              -100 20-OCT-04
 

Dynamic SQL implementation restriction

October 21, 2004 - 6:17 am UTC

Reviewer: Al from Edinburgh

hi Tom,
the view worked fine however the dynamic sql gave me a error
when I tried to run it, I changed the block to

BEGIN
FOR c_rec IN 'SELECT MAX( DECODE(
a.ccy_hedge_event_type_id,
13, a.present_value
)) AS curr_ffp,
MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
)) AS curr_ffr,
MAX( DECODE(
a.ccy_hedge_event_type_id,
17, a.present_value
)) AS curr_ffs,
LAG( MAX( DECODE(
a.ccy_hedge_event_type_id,
15, a.present_value
))) OVER ( PARTITION BY a.deal_reference ORDER BY date_from ) prev_ffr,
a.deal_reference,
date_from,
date_to
FROM ccy_hedge_daily_value a, ccy_hedge_deal b
WHERE b.ccy_hedge_deal_type_id IN( 2, 3 )
AND a.deal_reference = b.deal_reference'
LOOP
DBMS_OUTPUT.put_line('reference ' || c_rec.deal_reference );
END LOOP;
END;

The following error has occurred:

ORA-06550: line 3, column 17:
PLS-00999: implementation restriction (may be temporary)
ORA-06550: line 3, column 4:
PL/SQL: Statement ignored

Is there some parameter that needs to be set before allowing this type of statement?




Tom Kyte

Followup  

October 21, 2004 - 7:03 am UTC

that is not the way to do dynamic sql.


declare
    type rc is ref cursor;
    l_cursor rc;
begin
    open l_cursor for '......';
    loop
        fetch l_cursor into ....;
        exit when l_cursor%notfound;
        -- process row
    end loop;
    close l_cursor;
end;

re-dynamic sql implementation restriction

October 21, 2004 - 7:11 am UTC

Reviewer: Al from Edinburgh

Thanks Tom,
I should have checked your book; the answer was on page 700. That was a quick response -- I thought you would be sleeping, as I posted it at 10:00 UK time.



Analytic-enabled regular aggregate functions

November 05, 2004 - 11:36 am UTC

Reviewer: A reader

Oracle has analytic-enabled the regular aggregate built-in functions like sum, min, max, etc. I am having trouble understanding how that is useful.

I mean, isn't

sum(col1) over (partition by col2)

the same as

sum(col) ... group by col2

Yes, I realize that sum(col) over () is something that cannot be done without analytic functions, but can't everything else be done using the traditional group by?

Thanks

Tom Kyte

Followup  

November 05, 2004 - 5:16 pm UTC

1 select empno, sal,
2 avg(sal) over (partition by deptno) avg_sal_by_deptno,
3 avg(sal) over (partition by job) avg_sal_by_job,
4 sum(sal) over (partition by mgr) sum_sal_by_mgr
5* from emp
scott@ORA9IR2> /

EMPNO SAL AVG_SAL_BY_DEPTNO AVG_SAL_BY_JOB SUM_SAL_BY_MGR
---------- ---------- ----------------- -------------- --------------
7934 1300 2916.66667 1037.5 1300
7782 2450 2916.66667 2758.33333 8275
7839 5000 2916.66667 5000 5000
7788 3000 2175 3000 6000
7902 3000 2175 3000 6000
7876 1100 2175 1037.5 1100
7566 2975 2175 2758.33333 8275
7369 800 2175 1037.5 800
7900 950 1566.66667 1037.5 6550
7521 1250 1566.66667 1400 6550
7844 1500 1566.66667 1400 6550
7654 1250 1566.66667 1400 6550
7499 1600 1566.66667 1400 6550
7698 2850 1566.66667 2758.33333 8275

14 rows selected.


Try doing that with "group by" -- group by squishes out rows (highly technical term there -- squish).

You would have to have 3 inline views and join to get this without analytics.

You want to compare the salary of each individual to the various averages and compute the pct of salary made by mgr.



Analytics give you the "aggregates" with the "details"
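
For contrast, a sketch (untested) of what the same report requires without analytics -- the three inline views plus a join mentioned above, including the null-mgr handling a row like KING's would need:

```sql
select e.empno, e.sal,
       d.avg_sal_by_deptno, j.avg_sal_by_job, m.sum_sal_by_mgr
  from emp e,
       ( select deptno, avg(sal) avg_sal_by_deptno from emp group by deptno ) d,
       ( select job,    avg(sal) avg_sal_by_job    from emp group by job )    j,
       ( select mgr,    sum(sal) sum_sal_by_mgr    from emp group by mgr )    m
 where e.deptno = d.deptno
   and e.job    = j.job
   -- mgr is null for the president, so a plain equality join would lose that row
   and ( e.mgr = m.mgr or ( e.mgr is null and m.mgr is null ) )
```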

Code change ?

November 05, 2004 - 5:54 pm UTC

Reviewer: A reader

So Tom, that means that if we had something like this..

SQL@9iR2> select ename, ip_address
2 from hostnames
3 where ip_address in ( select ip_address from hostnames group by ip_address having count ( * ) > 1 );

.. that finds how many IP addresses are repeated and then lists the address and hostnames (please ignore the naming, it's just what I am trying to express -- I didn't find anything more adequate), should be changed to something like ..

SQL@9iR2> select ename,
2 ip_address
3 from (
4 select ename,
5 ip_address,
6 case when count( * ) over( partition by ip_address ) > 1
7 then rownum
8 end rn
9 from hostnames
10 )
11 where rn is not null
12 /

.. because of the performance gain from using analytics? Is that how developers nowadays should be instructed?

Tom Kyte

Followup  

November 05, 2004 - 6:25 pm UTC

If I was processing all of the rows or most of the rows in the table, yes I would code:

select *
from ( select x.*, count(*) over (partition by ip_address) cnt
from hostnames x
)
where cnt > 1;




There are times -- when you have predicates on the other columns -- that we would not use that technique, since analytics are applied AFTER the where clause.
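A sketch of that pitfall (STATUS is a hypothetical extra column on HOSTNAMES, added here only for illustration): because the WHERE clause is evaluated before the analytic, the count sees only the filtered rows.

```sql
-- Sketch: the analytic COUNT(*) is computed AFTER the where clause,
-- so it counts only the 'ACTIVE' rows -- which may or may not be
-- what you want. (STATUS is a hypothetical column.)
select *
  from ( select x.*, count(*) over (partition by ip_address) cnt
           from hostnames x
          where status = 'ACTIVE'   -- applied first, shrinks the partitions
       )
 where cnt > 1;
```

If you need the count over ALL rows but only the active rows returned, compute the analytic first and apply the status predicate in the outer query instead.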


November 18, 2004 - 4:49 am UTC

Reviewer: Helena Marková from Bratislava, Slovakia


Repeating values and Running total

November 19, 2004 - 12:56 am UTC

Reviewer: Sanjeev Sapre from India

Tom,
We were trying to run following query

SELECT ename "Ename", deptno "Deptno", sal "Sal",
SUM(sal)
OVER (ORDER BY deptno,ename ) "Running Total",
SUM(SAL)
OVER (PARTITION BY deptno
ORDER BY ename) "Dept Total",
ROW_NUMBER()
OVER (PARTITION BY deptno
ORDER BY ENAME) "Seq"
FROM emp
ORDER BY deptno, ename

the result is like this

scott@VIS> /
Press any key to continue..

Ename Deptno Sal Running Total Dept Total Seq
---------- ---------- ---------- ------------- ---------- ----------
CLARK 10 2450 2450 2450 1
KING 10 5000 7450 7450 2
Kairaj 10 10000 17450 17450 3
MILLER 10 1300 21350 21350 4
MILLER 10 1300 21350 21350 5
MILLER 10 1300 21350 21350 6
ADAMS 20 1100 22450 1100 1
FORD 20 3000 25450 4100 2
JONES 20 2975 28425 7075 3
SCOTT 20 2975 34375 13025 4
SCOTT 20 2975 34375 13025 5
SMITH 20 1100 35475 14125 6
ALLEN 30 1600 37075 1600 1
BLAKE 30 2850 39925 4450 2
JAMES 30 950 40875 5400 3
MARTIN 30 1250 42125 6650 4
TURNER 30 1500 43625 8150 5
WARD 30 1250 44875 9400 6
Jim 50 25000 69875 25000 1

19 rows selected.

There are two cases where sal is repeated (1300 and 2975), and among the employees with the same salary there are cases of the same ename. For these employees the running sum is clubbed together rather than increasing incrementally.

We are unable to understand how this works. Is this correct or is it a bug? Please explain.

regards


Tom Kyte

Followup  

November 19, 2004 - 10:22 am UTC

when you order your data by deptno, ename -- which comes first

MILLER 10

or

MILLER 10

or

MILLER 10

? none do, they are "the same".


If you order by something that ultimately makes them all "unique" -- then they do have an order. Consider:

scott@ORA9IR2> edit
Wrote file afiedt.buf

1 select ename, sal,
2 sum(sal) over (order by sal) sum1,
3 sum(sal) over (order by sal, EMPNO) sum2
4 from emp
5* order by sal
scott@ORA9IR2> /

ENAME SAL SUM1 SUM2
---------- ---------- ---------- ----------
SMITH 800 800 800
JAMES 950 1750 1750
ADAMS 1100 2850 2850
WARD 1250 5350 4100
MARTIN 1250 5350 5350

MILLER 1300 6650 6650
TURNER 1500 8150 8150
ALLEN 1600 9750 9750
CLARK 2450 12200 12200
BLAKE 2850 15050 15050
JONES 2975 18025 18025
SCOTT 3000 24025 21025
FORD 3000 24025 24025
KING 5000 29025 29025

14 rows selected.

In the above, when we order by SAL -- 1250 does not come before or after 1250; it comes at the same time. So the analytic, looking at the data ordered by SAL, considers those two rows to be "the same". But order by SAL and then EMPNO, and they become "different".


This is the expected behavior.
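A related option, if you want a strictly row-by-row running total without adding a tie-breaking column: change the window from the default RANGE to ROWS (a sketch against the standard EMP table). The per-row totals across ties are then arbitrary but strictly increasing.

```sql
-- The default window is RANGE UNBOUNDED PRECEDING, which treats
-- order-by ties as one group. ROWS UNBOUNDED PRECEDING accumulates
-- one row at a time, even across ties.
select ename, sal,
       sum(sal) over (order by sal) range_sum,
       sum(sal) over (order by sal
                      rows between unbounded preceding
                               and current row) rows_sum
  from emp
 order by sal;
```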



Using Group By clause with AF

November 19, 2004 - 9:57 am UTC

Reviewer: denni50 from na

Hi Tom

I'm trying to re-write and expand the following query using AF:

Select idnumber,sum(giftamount)
from gift where giftdate>=to_date('01-JAN-2004', 'dd-mon-yyyy')
group by idnumber
having sum(giftamount)>=2500
order by idnumber;

To:

select idnumber from (select idnumber, sum(giftamount) SumGifts,
dense_rank()
over (partition by sum(giftamount)
order by idnumber) dr
from gift
where giftdate >= to_date('01-JAN-2004', 'dd-mon-yyyy')
having sum(giftamount) >= 2500)
where dr <= 100
order by idnumber;

and keep getting the below error:
ERROR at line 1:
ORA-00937: not a single-group group function

I know the idnumber needs to be grouped to get the sum(giftamount) but I'm not quite sure where or how to insert the group by clause within the partition clause.

What I would like to achieve is a listing like the below example:

sum(giftamount) idnumber
between
10000 - 5000
1111
2222
3333

2500 - 4999.99 4444
5555
6666
7777.....and so on.

thanks a bunch!







Tom Kyte

Followup  

November 19, 2004 - 11:33 am UTC

add more layers of inline views there....

do the dense_rank on an inline view that computed the sum(giftamount) by idnumber.
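Following that advice, one way to layer it (a sketch, using the GIFT table described above): aggregate first, then rank the aggregated rows.

```sql
-- Sketch: compute sum(giftamount) per idnumber in the innermost view,
-- THEN apply dense_rank to the aggregated rows, THEN filter on the rank.
select idnumber, sumgifts
  from ( select idnumber, sumgifts,
                dense_rank() over (order by sumgifts desc) dr
           from ( select idnumber, sum(giftamount) sumgifts
                    from gift
                   where giftdate >= to_date('01-JAN-2004','dd-mon-yyyy')
                   group by idnumber
                  having sum(giftamount) >= 2500
                )
       )
 where dr <= 100
 order by idnumber;
```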

simple query

November 25, 2004 - 8:32 pm UTC

Reviewer: Menon

I have been struggling with a seemingly simple requirement.
I have a list of rows of a number as:
1
2
3
1
2
3
1
2
...

I want to assign a rank of 0 to first set of numbers
till 3, 1 to the second set of (1-3), 2 to the third
set and so on:
num rank
---- ----
1 0
2 0
3 0
1 1
2 1
3 1
1 2
2 2
3 2

The problem is that I can also have a set such as:
num rank
3 0
1 1
2 1
3 1
1 2
2 2
3 2

Also the upper limit 3 should not be hard coded in the
solution - it should work for sets of numbers
with limit of max 10.

Any ideas?

Thanx.

Tom Kyte

Followup  

November 25, 2004 - 8:38 pm UTC

problem is -- rows have no order....

so, where is the missing column here -- rows don't come out in any order -- so give us the create table, insert into's and the select that makes them come out in 1,2,3,1,2,3,1,2,3 order -- and then we can talk about how to assign 0, 1, 2 groups...

thanx!

November 25, 2004 - 8:55 pm UTC

Reviewer: Menon

here you go...
---
scott@ORA10G> drop table t;

Table dropped.

scott@ORA10G> create table t ( x number, y number );

Table created.

scott@ORA10G> insert into t values( 1, 2 );

1 row created.

scott@ORA10G> insert into t values( 2, 3 );

1 row created.

scott@ORA10G> insert into t values( 3, 1 );

1 row created.

scott@ORA10G> insert into t values( 4, 2 );

1 row created.

scott@ORA10G> insert into t values( 5, 3 );

1 row created.

scott@ORA10G> insert into t values( 6, 1 );

1 row created.

scott@ORA10G> insert into t values( 7, 2 );

1 row created.

scott@ORA10G> insert into t values( 8, 3 );

1 row created.

scott@ORA10G> insert into t values( 9, 1 );

1 row created.

scott@ORA10G> insert into t values( 10, 2 );

1 row created.

scott@ORA10G> insert into t values( 11, 3 );

1 row created.

scott@ORA10G> insert into t values( 12, 1 );

1 row created.

scott@ORA10G> insert into t values( 13, 2 );

1 row created.

scott@ORA10G> insert into t values( 14, 3 );

1 row created.

scott@ORA10G> insert into t values( 15, 1 );

1 row created.

scott@ORA10G> insert into t values( 16, 2 );

1 row created.

scott@ORA10G> insert into t values( 17, 3 );

1 row created.

scott@ORA10G> insert into t values( 18, 1 );

1 row created.

scott@ORA10G> insert into t values( 19, 2 );

1 row created.

scott@ORA10G> insert into t values( 20, 3 );

1 row created.

scott@ORA10G> select * from t order by x, y;

X Y
---------- ----------
1 2
2 3
3 1
4 2
5 3
6 1
7 2
8 3
9 1
10 2
11 3
12 1
13 2
14 3
15 1
16 2
17 3
18 1
19 2
20 3

20 rows selected.
---
The values in the Y column start with 2; they could have started with 1 or 3 as well.
Again, the repeating set of numbers could run from 1 to 4, or from 1 to 5, and so on.
Btw, that was a quick response on thanxgiving -
Bet you are working on your book now:)


Tom Kyte

Followup  

November 26, 2004 - 8:58 am UTC

ops$tkyte@ORA9IR2> select y, dense_rank() over ( order by grp )-1 rnk
  2    from (
  3  select y, max(grp) over (order by x) grp
  4    from (
  5  select x,
  6         y,
  7         case when nvl(lag(y) over (order by x),y+1) > y then row_number() over (order by x) end grp
  8    from t
  9         )
 10         )
 11  /
 
         Y        RNK
---------- ----------
         2          0
         3          0
         1          1
         2          1
         3          1
         1          2
         2          2
         3          2
         1          3
         2          3
         3          3
         1          4
         2          4
         3          4
         1          5
         2          5
         3          5
         1          6
         2          6
         3          6
 
20 rows selected.
 

excellent!

November 26, 2004 - 10:35 am UTC

Reviewer: Menon

Thanx. It took me a few minutes to understand it:)
Had to rerun the query with the inner ones only.
The technique that you used is interesting. You
first created a query to "mark" the points where
the number changed to something less than the
previous number. Then you used dense_rank to
get at the answer. This seems like a very useful
technique in general. I believe I have seen something
like this in an article written by you?
And you have some name to it as well ("mark and..")?
Or may be I am thinking of something else altogether.
Anyway, hope you had a great thanxgiving.
I will see you in the OOW.

Tom Kyte

Followup  

November 26, 2004 - 10:49 am UTC

you remember correctly, it is this technique:

</code> http://www.oracle.com/technology/oramag/oracle/04-mar/o24asktom.html <code>

in "Analytics to the Rescue".

thanx.

November 26, 2004 - 10:52 am UTC

Reviewer: Menon

Yes. I found that article in some other reply of yours just now.
I swear you had given a name to it (was it mark and forward)?
Anyway, thanx a lot.

ok - here is the original problem

November 26, 2004 - 11:51 am UTC

Reviewer: Menon

The problem where I wanted to apply the above technique
was to generate a calendar for a given month and year
using SQL.
I am sure you would have a different and most likely
a more elegant solution than the one I have:)
--------
scott@ORA10G> -- generate a calendar for a given month
scott@ORA10G> variable month number;
scott@ORA10G> variable year number;
scott@ORA10G> exec :month := &1

PL/SQL procedure successfully completed.

scott@ORA10G> exec :year := &2

PL/SQL procedure successfully completed.

scott@ORA10G> column sun format 999
scott@ORA10G> column mon format 999
scott@ORA10G> column tue format 999
scott@ORA10G> column wed format 999
scott@ORA10G> column thu format 999
scott@ORA10G> column fri format 999
scott@ORA10G> column sat format 999
scott@ORA10G> prompt calendar for the month &1 year &2
calendar for the month 12 year 2004
scott@ORA10G> select max( decode( day_of_week, 1, day_num, null) ) sun,
2 max( decode( day_of_week, 2, day_num, null) ) mon,
3 max( decode( day_of_week, 3, day_num, null) ) tue,
4 max( decode( day_of_week, 4, day_num, null) ) wed,
5 max( decode( day_of_week, 5, day_num, null) ) thu,
6 max( decode( day_of_week, 6, day_num, null) ) fri,
7 max( decode( day_of_week, 7, day_num, null) ) sat
8 from
9 (
10 select day_num, day_of_week,
11 max( grp ) over ( order by day_num ) grp
12 from
13 (
14 select day_num, day_of_week,
15 case when day_of_week < lag_day_of_week then rn end grp
16 from
17 (
18 select day_num, day_of_week,
19 nvl(lag( day_of_week) over( order by day_num ), day_of_week+1) lag_day_of_week,
20 row_number() over( order by day_num) rn
21 from
22 (
23 select day_num, decode( to_char( curr_date, 'dy'), 'sun', 1,
24 'mon', 2,
25 'tue', 3,
26 'wed', 4,
27 'thu', 5,
28 'fri', 6,
29 'sat', 7) day_of_week
30 from
31 (
32 select rownum day_num, to_date( :month || '-' ||rownum || '-' || :year, 'MM-DD-YYYY') curr_date
33 from all_objects
34 where rownum <= to_number(to_char(last_day(to_date(:month ||'-01-'||:year, 'MM-DD-YYYY')), 'DD'))
35 )
36 )
37 )
38 )
39 )
40 group by grp;

SUN MON TUE WED THU FRI SAT
---- ---- ---- ---- ---- ---- ----
1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31




Tom Kyte

Followup  

November 26, 2004 - 12:27 pm UTC

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:14741686777707 <code>

thanx

November 26, 2004 - 1:09 pm UTC

Reviewer: Menon

well, that solution is definitely much better:)
I did not think of looking for a format that gives
me the week number. In the above solution that I
posted, for what it is worth, we can substitute the
select day_num, decode( to_char( curr_date, 'dy'), 'sun', 1,
'mon', 2,
'tue', 3,
'wed', 4,
'thu', 5,
'fri', 6,
'sat', 7) day_of_week

with
select day_num, to_number(to_char( curr_date, 'D')) day_of_week

so the select becomes:
select max( decode( day_of_week, 1, day_num, null) ) sun,
max( decode( day_of_week, 2, day_num, null) ) mon,
max( decode( day_of_week, 3, day_num, null) ) tue,
max( decode( day_of_week, 4, day_num, null) ) wed,
max( decode( day_of_week, 5, day_num, null) ) thu,
max( decode( day_of_week, 6, day_num, null) ) fri,
max( decode( day_of_week, 7, day_num, null) ) sat
from
(
select day_num, day_of_week,
max( grp ) over ( order by day_num ) grp
from
(
select day_num, day_of_week,
case when day_of_week < lag_day_of_week then rn end grp
from
(
select day_num, day_of_week,
nvl(lag( day_of_week) over( order by day_num ), day_of_week+1) lag_day_of_week,
row_number() over( order by day_num) rn
from
(
select day_num, to_number(to_char( curr_date, 'D')) day_of_week
from
(
select rownum day_num, to_date( :month || '-' ||rownum || '-' || :year, 'MM-DD-YYYY') curr_date
from all_objects
where rownum <= to_number(to_char(last_day(to_date(:month ||'-01-'||:year, 'MM-DD-YYYY')), 'DD'))
)
)
)
)
)
group by grp;

I am glad I tried my solution before looking at yours
because:
1. I reinforced the idea of your "marking" trick.
2. I learnt about some new date formats after I looked
at your solution.




RE: simple query

November 28, 2004 - 9:05 pm UTC

Reviewer: Joaquin Martinez from Dominican Republic

I was playing around with analytics over the weekend (since I cannot practice them at work -- not much need there, and besides, it is version 8.1.7 SE, so no dice :( ), so I don't lose the 'habit' of using them, and used this thread to *try* to reproduce Tom's answers. I happened to wonder how to do this 'without' them. Surprisingly, I came up with an answer.

SQL> create table t as
  2  select x, decode( y, 0, 3, y ) y
  3    from (
  4  select rownum x, mod( ( rownum + 1 ),3 ) y
  5    from all_objects
  6   where rownum <= 20
  7         )
  8  /

Table created.

SQL> select x,
  2         y,
  3         trunc( ( rownum + ( trunc( z / 100 ) - abs( ( mod( z,100 ) - ( trunc( z / 100 ) + 2 ) ) ) ) )
  4                / trunc( z / 100 ) ) rnk
  5    from (
  6  select x,
  7         y,
  8         ( select max( y ) * 100 + mod( min( rownum * 100 + y ),100 )
  9             from t ) z
 10    from t
 11         )
 12  /

         X          Y        RNK
---------- ---------- ----------
         1          2          0
         2          3          0
         3          1          1
         4          2          1
         5          3          1
         6          1          2
         7          2          2
         8          3          2
         9          1          3
        10          2          3
        11          3          3
        12          1          4
        13          2          4
        14          3          4
        15          1          5
        16          2          5
        17          3          5
        18          1          6
        19          2          6
        20          3          6

20 rows selected.

.. of course, not as efficient as using analytics (two table accesses, for example), but I wanted to share it with all of you.

analytic function

November 30, 2004 - 5:19 am UTC

Reviewer: karthick from india

I have a table ,

create table temp_rollup(dept1 char(2),no number(2))

and data are as follows,

insert into temp_rollup values('a',2)
insert into temp_rollup values('a',2)
insert into temp_rollup values('a',2)
insert into temp_rollup values('b',2)
insert into temp_rollup values('b',2)
insert into temp_rollup values('b',2)


I am trying to get the output as follows:

Dept No
------------
a 2
a 2
a 2
sub total 6
b 2
b 2
b 2
sub total 6
total 12

Can analytic functions help me in achieving this? I tried rollup but didn't get the exact output.

I don't want to go with union all.

Tom Kyte

Followup  

November 30, 2004 - 7:57 am UTC

You know -- I don't like it when people say "I don't want to go with feature X". What if feature X were the best, only, superior, correct way to do it? Why rule out feature "X"?

Anyway, feature "X" isn't needed this time.

<b>in pure SQL</b>

ops$tkyte@ORA9IR2> select decode( grouping(dept1)+grouping(rowid), 1, 'Subtotal', 2, 'Total', dept1 ), sum(no)
  2    from temp_rollup
  3   group by rollup(dept1,rowid)
  4  /
 
DECODE(G    SUM(NO)
-------- ----------
a                 2
a                 2
a                 2
Subtotal          6
b                 2
b                 2
b                 2
Subtotal          6
Total            12
 
9 rows selected.
 

<b>and if this is a sqlplus report, this will do it</b>


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> break on dept1       on report
ops$tkyte@ORA9IR2> compute sum of no on report
ops$tkyte@ORA9IR2> compute sum of no on dept1
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select dept1, no from temp_rollup order by dept1, no;
 
DEPT1              NO
---------- ----------
a                   2
                    2
                    2
********** ----------
sum                 6
b                   2
                    2
                    2
********** ----------
sum                 6
           ----------
sum                12
 
6 rows selected.
 
 

Here's a 3-month calendar

November 30, 2004 - 4:01 pm UTC

Reviewer: A reader

select * from (
select SUN_MON_TUE_WED_THU_FRI_SAT from (
select '--------'||rpad(rtrim(TO_CHAR(add_months(sysdate,-1),'Month')),9,'-')||'-----------' ||' '||
'--------'||rpad(rtrim(TO_CHAR(sysdate,'Month')),9,'-')||'-----------' ||' '||
'--------'||rpad(rtrim(TO_CHAR(add_months(sysdate,+1),'Month')),9,'-')||'-----------' SUN_MON_TUE_WED_THU_FRI_SAT, 1 rowno from dual
union --
select 'SUN MON TUE WED THU FRI SAT' ||' '||
'SUN MON TUE WED THU FRI SAT' ||' '||
'SUN MON TUE WED THU FRI SAT' SUN_MON_TUE_WED_THU_FRI_SAT, 2 rowno from dual
union --
select '----------------------------' ||' '||
'----------------------------' ||' '||
'----------------------------' SUN_MON_TUE_WED_THU_FRI_SAT , 3 rowno from dual
union --
select '----------'||TO_CHAR(add_months(sysdate,-1),'YYYY')||'--------------'||' '||
'----------'||TO_CHAR(sysdate,'YYYY')||'--------------'||' '||
'----------'||TO_CHAR(add_months(sysdate,+1),'YYYY')||'--------------' SUN_MON_TUE_WED_THU_FRI_SAT, 10 rowno from dual
union --
SELECT (case when day1 <= add_months(last_day(l_date),-1) OR day1 > last_day(l_date) then ' ' ELSE ' '||to_char(day1,'DD') ||' ' end)
||(case when day2 <= add_months(last_day(l_date),-1) OR day2 > last_day(l_date) then ' ' ELSE ' '||to_char(day2,'DD') ||' ' end)
||(case when day3 <= add_months(last_day(l_date),-1) OR day3 > last_day(l_date) then ' ' ELSE ' '||to_char(day3,'DD') ||' ' end)
||(case when day4 <= add_months(last_day(l_date),-1) OR day4 > last_day(l_date) then ' ' ELSE ' '||to_char(day4,'DD') ||' ' end)
||(case when day5 <= add_months(last_day(l_date),-1) OR day5 > last_day(l_date) then ' ' ELSE ' '||to_char(day5,'DD') ||' ' end)
||(case when day6 <= add_months(last_day(l_date),-1) OR day6 > last_day(l_date) then ' ' ELSE ' '||to_char(day6,'DD') ||' ' end)
||(case when day7 <= add_months(last_day(l_date),-1) OR day7 > last_day(l_date) then ' ' ELSE ' '||to_char(day7,'DD') ||' ' end)
||' '||(case when day8 <= add_months(last_day(m_date),-1) OR day8 > last_day(m_date) then ' ' ELSE (case when trunc(day8) = trunc(m_date) then '-'||to_char(day8,'DD')||'-' ELSE ' '||to_char(day8,'DD') ||' ' end) end)
||(case when day9 <= add_months(last_day(m_date),-1) OR day9 > last_day(m_date) then ' ' ELSE (case when trunc(day9) = trunc(m_date) then '-'||to_char(day9,'DD')||'-' ELSE ' '||to_char(day9,'DD') ||' ' end) end)
||(case when day10 <= add_months(last_day(m_date),-1) OR day10 > last_day(m_date) then ' ' ELSE (case when trunc(day10) = trunc(m_date) then '-'||to_char(day10,'DD')||'-' ELSE ' '||to_char(day10,'DD') ||' ' end) end)
||(case when day11 <= add_months(last_day(m_date),-1) OR day11 > last_day(m_date) then ' ' ELSE (case when trunc(day11) = trunc(m_date) then '-'||to_char(day11,'DD')||'-' ELSE ' '||to_char(day11,'DD') ||' ' end) end)
||(case when day12 <= add_months(last_day(m_date),-1) OR day12 > last_day(m_date) then ' ' ELSE (case when trunc(day12) = trunc(m_date) then '-'||to_char(day12,'DD')||'-' ELSE ' '||to_char(day12,'DD') ||' ' end) end)
||(case when day13 <= add_months(last_day(m_date),-1) OR day13 > last_day(m_date) then ' ' ELSE (case when trunc(day13) = trunc(m_date) then '-'||to_char(day13,'DD')||'-' ELSE ' '||to_char(day13,'DD') ||' ' end) end)
||(case when day14 <= add_months(last_day(m_date),-1) OR day14 > last_day(m_date) then ' ' ELSE (case when trunc(day14) = trunc(m_date) then '-'||to_char(day14,'DD')||'-' ELSE ' '||to_char(day14,'DD') ||' ' end) end)
||' '||(case when day15 <= add_months(last_day(r_date),-1) OR day15 > last_day(r_date) then ' ' ELSE ' '||to_char(day15,'DD') ||' ' end)
||(case when day16 <= add_months(last_day(r_date),-1) OR day16 > last_day(r_date) then ' ' ELSE ' '||to_char(day16,'DD') ||' ' end)
||(case when day17 <= add_months(last_day(r_date),-1) OR day17 > last_day(r_date) then ' ' ELSE ' '||to_char(day17,'DD') ||' ' end)
||(case when day18 <= add_months(last_day(r_date),-1) OR day18 > last_day(r_date) then ' ' ELSE ' '||to_char(day18,'DD') ||' ' end)
||(case when day19 <= add_months(last_day(r_date),-1) OR day19 > last_day(r_date) then ' ' ELSE ' '||to_char(day19,'DD') ||' ' end)
||(case when day20 <= add_months(last_day(r_date),-1) OR day20 > last_day(r_date) then ' ' ELSE ' '||to_char(day20,'DD') ||' ' end)
||(case when day21 <= add_months(last_day(r_date),-1) OR day21 > last_day(r_date) then ' ' ELSE ' '||to_char(day21,'DD') ||' ' end)
SUN_MON_TUE_WED_THU_FRI_SAT, rownum+3 rowno
FROM (select (add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 1) + (7*(rownum-1)) day1,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 2) + (7*(rownum-1))day2,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 3) + (7*(rownum-1))day3,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 4) + (7*(rownum-1))day4,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 5) + (7*(rownum-1))day5,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 6) + (7*(rownum-1))day6,
(add_months(last_day(a.l_date),-1)
- to_number(to_char(add_months(last_day(a.l_date),-1),'D') ) + 7) + (7*(rownum-1))day7,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 1) + (7*(rownum-1)) day8,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 2) + (7*(rownum-1))day9,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 3) + (7*(rownum-1))day10,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 4) + (7*(rownum-1))day11,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 5) + (7*(rownum-1))day12,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 6) + (7*(rownum-1))day13,
(add_months(last_day(a.m_date),-1)
- to_number(to_char(add_months(last_day(a.m_date),-1),'D') ) + 7) + (7*(rownum-1))day14,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 1) + (7*(rownum-1)) day15,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 2) + (7*(rownum-1))day16,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 3) + (7*(rownum-1))day17,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 4) + (7*(rownum-1))day18,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 5) + (7*(rownum-1))day19,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 6) + (7*(rownum-1))day20,
(add_months(last_day(a.r_date),-1)
- to_number(to_char(add_months(last_day(a.r_date),-1),'D') ) + 7) + (7*(rownum-1))day21,
a.l_date ,a.m_date,a.r_date
from all_objects, (select add_months(s_date,-1) l_date, s_date m_date, add_months(s_date,+1) r_date
FROM (select sysdate s_date from dual)) a where rownum <= 6)
) --to_date('1/26/2002','mm/dd/yyyy')
ORDER BY rowno )
/


Squish

December 08, 2004 - 10:46 pm UTC

Reviewer: A reader

"Try doing that with "group by" -- group by squishes out rows (highly technical term there -- squish)"
...
Analytics give you the "aggregates" with the "details" "

Regarding using analytic-enabled aggregate functions...

Often, I find that I need to add a DISTINCT to my queries to remove dupes when I use analytic functions! In your example above, it was not needed because your query used 3 analytic functions and the whole row was not duplicated.

So, if I find that I need to DISTINCT, am I using the analytic functions when a simple GROUP BY would suffice?

As your example above showed, doing 3 (sum and 2 averages) without using AF would be hairy indeed, but would you say that if my query involves only ONE analytic function, it can ALWAYS be done using regular GROUP BY?

Thanks

Tom Kyte

Followup  

December 09, 2004 - 1:08 pm UTC

maybe -- maybe not.

You say "i have to use distinct" -- I personally see nothing wrong in general with getting two rows that have the same "values" -- it is not something you HAVE to remove. Your application might need it, but it is not mandatory.


select ename, deptno, count(*) over (partition by deptno) cnt_by_dept
from emp;

that cannot be done simply with a group by

And that could contain "duplicates"

but I would say removing the duplicates would be an ERROR. If two SMITH's work in deptno = 10 and my report was to show the above data, removing extra smiths would be an error.

Analytic Functions

December 11, 2004 - 12:28 am UTC

Reviewer: krishna padala from NJ USA

Hi Tom ,

We use dense_rank in our application and it is very useful code. Your example helped me a lot.
Could you please help in the following case?

I have dept_no severity_code
101 s1
101 s1
101 s2
101 s2
101 s2
102 s1
103 s2
103 s2
103 s3
103 s4

Now we would like to have the output as

D.id S1 S2 S3 S4

101 2 3 0 0
102 1 0 0 0
103 0 2 1 1

Thanks in advance...

Tom Kyte

Followup  

December 11, 2004 - 7:55 am UTC

search for

pivot
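For the data above, a pivot along those lines might look like this (a sketch; the table and column names dept_severity, dept_no, and severity_code are assumed from the question):

```sql
-- Classic pivot: one row per dept_no, one column per severity code,
-- counting how many rows fall in each bucket.
select dept_no,
       count(case when severity_code = 's1' then 1 end) s1,
       count(case when severity_code = 's2' then 1 end) s2,
       count(case when severity_code = 's3' then 1 end) s3,
       count(case when severity_code = 's4' then 1 end) s4
  from dept_severity        -- hypothetical table name
 group by dept_no
 order by dept_no;
```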

rewrite w/o PARTITION BY clause

December 13, 2004 - 1:55 pm UTC

Reviewer: Shalu from VA, USA

Tom, could you please help me rewrite this query without using the analytic function? (I am writing this query in the After Parameter Form trigger in Reports designer, which supports DENSE_RANK() but doesn't support the PARTITION BY clause.)

SELECT portfolio_id, name
FROM (SELECT p.portfolio_id, p.name,
DENSE_RANK () OVER
(PARTITION BY p.name
ORDER BY TRUNC (p.asof_date) DESC, p.portfolio_id DESC) AS dr
FROM vn_portfolio p, vn_project pr
WHERE p.project_id = pr.project_id
AND pr.name = 'FAS13'
AND TRUNC (p.asof_date) <=
TO_DATE ('12/02/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS')
AND p.name IN ('A', 'C', 'B'))
WHERE dr = 1
/

It returns:
PORTFOLIO_ID NAME
------------ -----
10 A
45 B
50 C

This SQL gives the ids of the latest portfolios in the FAS13 project for 12/02, because a project can have duplicate portfolios (multiple As or Bs or Cs) for a day, each with a unique portfolio_id (portfolio_id is a running sequence). Now A, B, and C are loaded every day. Let's say for 12/02 we don't have 'A' available; then I want the latest A from prior dates.

Here are the underlying ddls:

Create table My_project(project_id number, name varchar2(5))
/

Insert into My_project (project_id, name) values (100, 'FAS13')
/
Insert into my_project (project_id, name) values (200, 'XYZ')
/

Create table my_portfolio(portfolio_id number, name varchar2(5), asof_date date, project_id number)
/

Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (10, 'A', to_date('12/01/2004 06:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (20, 'B', to_date('12/01/2004 14:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (30, 'C', to_date('12/01/2004 12:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (40, 'B', to_date('12/02/2004 16:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (50, 'C', to_date('12/02/2004 20:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/
Insert into my_portfolio (portfolio_id, name, asof_date, project_id)
values (45, 'B', to_date('12/02/2004 16:00:00','MM/DD/YYYY HH24:MI:SS'), 100)
/

Thanks!


Tom Kyte

Followup  

December 13, 2004 - 2:23 pm UTC

can you hide the construct in a view?
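A sketch of that idea, using the my_portfolio/my_project tables from the question (the view name is made up): put the analytic query behind a view, so the report-side query contains no PARTITION BY at all.

```sql
-- Hide the analytic construct in a view; the report then queries the view.
-- In practice the as-of date would come from a parameter table or context
-- rather than a literal, since a view cannot take bind variables.
create or replace view latest_portfolios as
select portfolio_id, name
  from ( select p.portfolio_id, p.name,
                dense_rank() over
                  (partition by p.name
                   order by trunc(p.asof_date) desc, p.portfolio_id desc) dr
           from my_portfolio p, my_project pr
          where p.project_id = pr.project_id
            and pr.name = 'FAS13'
            and trunc(p.asof_date) <= to_date('12/02/2004','MM/DD/YYYY')
            and p.name in ('A','B','C')
       )
 where dr = 1;

-- Report side: a plain query, no analytic syntax needed.
select portfolio_id, name from latest_portfolios;
```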

Dupes

December 13, 2004 - 3:09 pm UTC

Reviewer: A reader

"You say "i have to use distinct" -- I personally see nothing wrong in general with getting two rows that have the same "values" -- it is not something you HAVE to remove"

Um..how would you like it if your bank statement shows you the same transaction repeated twice?

"select ename, deptno, count(*) over (partition by deptno) cnt_by_dept from emp;

If two SMITH's work in deptno = 10 and my report was to show the above data, removing extra smiths would be an error"

Yes, but your query above is too simplistic. A more real-world query might be like

select emp_pk, ename, deptno, count(*) over (partition by deptno) cnt_by_dept from emp;

The app might choose to hide the emp_pk from the user depending on the requirements.

Even after adding the emp_pk in there, I still get "dupes" because I am NOT doing a group by.

Yes, analytics are great for showing the aggregate and detail in the same row, but to avoid these dupes, one needs to use DISTINCT and that interferes with view merging, predicate pushing and all that fun stuff?

Thanks

Tom Kyte

Followup  

December 13, 2004 - 3:14 pm UTC

Um, why would my bank statement show the same transaction twice?

the only way for the same transaction to appear twice would have been for the SOURCE DATA TO HAVE IT TWICE.

For you see -- analytics don't "add rows".

What dups could possibly come from:

select emp_pk, ename, deptno, count(*) over (partition by deptno)
cnt_by_dept from emp;

????? (assuming that emp_pk is the primary key).


Show me a dupe with that -- please?

ANALYTICS DO NOT, CANNOT, WILL NOT produce "new records". Show me a duplicate.... And I'll show you source data that has duplicates in it already.

December 13, 2004 - 6:32 pm UTC

Reviewer: A reader

Yes, analytics don't generate new records; it's that they don't "squish" source records the way a GROUP BY does.

select deptno,count(*) over (partition by deptno) from emp;

would produce dupes

select deptno,count(*) from emp group by deptno;

will not.

That's what I meant.

See, I frequently use analytics to *avoid* the need to group by a whole bunch of columns when all I need is a little summary level data along with a bunch of details

col1,col2,col3,count(*) over (partition by col4,col5,col6)

instead of

...,count(*)... group by col4,col5,col6

Well, there is really no equivalent without using analytics.

In this case, there are lots of dupes so I need to DISTINCT the whole thing which kind of defeats the purpose?

Or maybe I am missing some basic concept here?

Thanks

Tom Kyte

Followup  

December 13, 2004 - 7:08 pm UTC

but now you are going totally against your prior example -- of "would you like to see your banking transaction twice"

there are times to use analytics.

There are times to use group by.

they are utterly DIFFERENT BEASTS, totally.

I would not be using analytics to avoid group bys of lots of columns (how CAN you -- you'd be partitioning by them!)

It would be wasteful to use analytics and then distinct, vs just doing what you meant to do in the first place -- GROUP BY.

why would you want to *avoid* the right way to do it??

Analytics are not creating dups here, analytics are just simply doing what you said to do.


Show me a real world case where:

col1,col2,col3,count(*) over (partition by col4,col5,col6)

makes "sense"? (dups or not)

If that returns dups -- you have really big problems -- what count(*) was 5 for?


I'm not getting it at all.
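As an editorial aside (not from the original thread): the "squish" difference the reviewer describes, and Tom's point that analytics plus DISTINCT just re-derives a GROUP BY, can be sketched outside Oracle too. A minimal illustration using Python's sqlite3, assuming a SQLite build with window-function support; the table and values are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER);
    INSERT INTO emp VALUES (7839, 10), (7782, 10), (7788, 20);
""")

# GROUP BY "squishes" the detail rows down to one row per group ...
grouped = conn.execute(
    "SELECT deptno, COUNT(*) FROM emp GROUP BY deptno ORDER BY deptno"
).fetchall()

# ... while the analytic keeps every detail row, so selecting only the
# partition columns repeats values; DISTINCT is needed to match GROUP BY.
analytic = conn.execute(
    "SELECT DISTINCT deptno, COUNT(*) OVER (PARTITION BY deptno) "
    "FROM emp ORDER BY deptno"
).fetchall()

print(grouped)   # [(10, 2), (20, 1)]
print(analytic)  # [(10, 2), (20, 1)]
```

Same answer either way, but the GROUP BY states the intent directly, which is Tom's point.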

That's just the same situation I'm in

January 06, 2005 - 10:09 am UTC

Reviewer: Stefan from Switzerland

Hi Tom,

I just got your two books, Expert One-on-One and Effective Oracle by Design, and they are a very interesting read so far. I especially like the depth of information to be found in 1 on 1; it's just great :), amazing work!

Anyhow, I was trying to get a simple result set as in "show me the average number of servers in use at any given hour during office hours".

An example:

auditing@PURPOSE> create table server (id number, when date );

Table created.

insert into server values (1,to_date('01012005080000','DDMMYYYYHH24MISS'));
insert into server values (2,to_date('01012005080500','DDMMYYYYHH24MISS'));
insert into server values (3,to_date('01012005081000','DDMMYYYYHH24MISS'));
insert into server values (4,to_date('01012005082000','DDMMYYYYHH24MISS'));
insert into server values (5,to_date('01012005090000','DDMMYYYYHH24MISS'));
insert into server values (6,to_date('01012005090500','DDMMYYYYHH24MISS'));
insert into server values (7,to_date('01012005090900','DDMMYYYYHH24MISS'));

auditing@PURPOSE> select * from server;

ID WHEN
---------- ---------
1 01-JAN-05
2 01-JAN-05
3 01-JAN-05
4 01-JAN-05
5 01-JAN-05
6 01-JAN-05
7 01-JAN-05

7 rows selected.

The GROUP BY approach:
auditing@PURPOSE> select to_char(when,'DD HH24'), count(*) from server group by to_char(when,'DD HH24');

TO_CH COUNT(*)
----- ----------
01 08 4
01 09 3

Elapsed: 00:00:00.00

and the Analytics approach:

auditing@PURPOSE> select to_char(when,'DD HH24'), count(*) over (partition by to_char(when,'DD HH24')) from server;

TO_CH COUNT(*)OVER(PARTITIONBYTO_CHAR(WHEN,'DDHH24'))
----- -----------------------------------------------
01 08 4
01 08 4
01 08 4
01 08 4
01 09 3
01 09 3
01 09 3

7 rows selected.

Elapsed: 00:00:00.00

Not the same results, so I have to add that evil DISTINCT:

auditing@PURPOSE> select distinct to_char(when,'DD HH24'), count(*) over (partition by to_char(when,'DD HH24')) from server;

TO_CH COUNT(*)OVER(PARTITIONBYTO_CHAR(WHEN,'DDHH24'))
----- -----------------------------------------------
01 08 4
01 09 3

Elapsed: 00:00:00.00

which does give me the right result. But when the GROUP BY query and the DISTINCT + analytics query are compared in, for example, Runstats:

auditing@PURPOSE> select to_char(when,'DD HH24'), count(*) from server group by to_char(when,'DD HH24');

TO_CH COUNT(*)
----- ----------
01 08 4
01 09 3

auditing@PURPOSE> exec rs.rs_middle;

PL/SQL procedure successfully completed.

auditing@PURPOSE> select distinct to_char(when,'DD HH24'), count(*) over (partition by to_char(when,'DD HH24')) from server;

TO_CH COUNT(*)OVER(PARTITIONBYTO_CHAR(WHEN,'DDHH24'))
----- -----------------------------------------------
01 08 4
01 09 3

auditing@PURPOSE> exec rs.rs_stop(500);
Run1 ran in 1914 hsecs
Run2 ran in 1122 hsecs
Run1 ran in 170.59% of the time
Name Run1 Run2 Diff
LATCH.cache buffers chains 461 2,026 1,565

Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
3,966 6,041 2,075 65.65%

PL/SQL procedure successfully completed.


We can see that the group by in a case like this appears to be the far better choice (and when I ran this on a) more data and b) in a for 1..1000 loop, the difference got enormous).

Am I correct that this kind of query just won't be doable in a performant way using analytics?

PS Thanks a lot for providing this forum to the community, this site just rocks! :)




Tom Kyte

Followup  

January 06, 2005 - 11:13 am UTC

this would be inappropriate for analytics.

analytics do not mean "group by is dead", far from it.

analytics are used when you WANT the detail records -- group by is used when you want to aggregate. You clearly wanted to aggregate here, group by would be the correct approach.

Help on SQL using analytics

January 12, 2005 - 7:42 am UTC

Reviewer: Vikas Khanna from INDIA

Hi Tom,

Please help me in writing a SQL query which outputs the results as a concatenated string in the form of

Ename1,Ename2,Ename3,Ename4,Ename5,Ename6 and so on. using the EMP table under Scott schema.

Is it possible to do that?

What possible methods are there to do that if the output cannot be returned using plain SQL?

Regards,


Tom Kyte

Followup  

January 12, 2005 - 9:49 am UTC

search this site for stragg
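For readers following along: stragg was a user-defined string-aggregate function Tom published; later Oracle releases ship LISTAGG built in. A rough sketch of the same idea using SQLite's group_concat from Python (a stand-in for illustration, not Oracle syntax; the sample table is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (ename TEXT, deptno INTEGER);
    INSERT INTO emp VALUES ('KING', 10), ('CLARK', 10), ('SMITH', 20);
""")

# group_concat collapses each group into one comma-separated string.
# Note: SQLite does not guarantee the concatenation order within a group.
rows = conn.execute(
    "SELECT deptno, group_concat(ename, ',') FROM emp "
    "GROUP BY deptno ORDER BY deptno"
).fetchall()
print(rows)
```

In Oracle the LISTAGG equivalent also lets you specify ordering (WITHIN GROUP (ORDER BY ...)), which group_concat does not.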

Query

April 26, 2005 - 11:21 pm UTC

Reviewer: A reader

I have one table like this:

PERSON_ID PRODUCT_CODE ACCOUNT_ID
1000 1 ...
1000 2 ...
1000 3 ...
1001 1 ...
1001 1 ...
1002 2 ...
1002 3 ...
1003 1 ...
1003 1 ...
1003 2 ...

I want to get a list of all PERSON_IDs excluding those who have just accounts with PRODUCT_CODE=1.

So in above example, I want to exclude 1001 person_id since he has only those accounts for which product_code=1. 1003 will be included in the result set because although he has two accounts with product_code=1 he also has some other account for some different product_code.

How can I write the SQL to do this?


Tom Kyte

Followup  

April 27, 2005 - 7:41 am UTC

do you need all of the rows for that person_id or just a list of person_ids?

select distinct person_id from t where product_code <> 1;

gets a list of person_ids


select *
from ( select t.*, count( case when product_code <> 1 then 1 end )
over (partition by person_id) cnt
from t )
where cnt > 0;

if you need all of the details (even product_code = 1 records) for person_ids that have some row with product_code <> 1;
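A quick check of that second query's logic (a sketch using Python's sqlite3, whose window functions behave the same way for this case; the sample data mirrors the reader's example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (person_id INTEGER, product_code INTEGER);
    INSERT INTO t VALUES
      (1000, 1), (1000, 2), (1001, 1), (1001, 1), (1003, 1), (1003, 2);
""")

# The COUNT(CASE ...) OVER (...) counts, per person, the rows whose
# product_code is not 1; any person with cnt > 0 keeps ALL of their rows.
rows = conn.execute("""
    SELECT person_id, product_code FROM (
        SELECT t.*,
               COUNT(CASE WHEN product_code <> 1 THEN 1 END)
                   OVER (PARTITION BY person_id) AS cnt
        FROM t)
    WHERE cnt > 0
    ORDER BY person_id, product_code
""").fetchall()
print(rows)  # [(1000, 1), (1000, 2), (1003, 1), (1003, 2)]
```

Person 1001, whose only accounts have product_code = 1, drops out entirely, while 1003 keeps even the product_code = 1 rows.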


count using analytic functions

May 05, 2005 - 2:05 pm UTC

Reviewer: A reader

Hi Tom,

I have a table t as

create table t
(
d_id number,
status char(1),
crdate date
);

insert into t values (1, 'Y', to_date('04-05-2005 00:00:00', 'mm-dd-yyyy hh24:mi:ss'));
insert into t values (2, 'Y', to_date('04-05-2005 00:00:00', 'mm-dd-yyyy hh24:mi:ss'));
insert into t values (3, 'N', to_date('04-05-2005 00:00:00', 'mm-dd-yyyy hh24:mi:ss'));
insert into t values (4, 'Y', to_date('04-09-2005 00:00:00', 'mm-dd-yyyy hh24:mi:ss'));
insert into t values (5, 'Y', to_date('04-09-2005 00:00:00', 'mm-dd-yyyy hh24:mi:ss'));
insert into t values (6, 'N', to_date('04-09-2005 00:00:00', 'mm-dd-yyyy hh24:mi:ss'));
insert into t values (7, 'N', to_date('04-09-2005 00:00:00', 'mm-dd-yyyy hh24:mi:ss'));
insert into t values (8, 'N', to_date('04-09-2005 00:00:00', 'mm-dd-yyyy hh24:mi:ss'));
commit;

I need result of this table to look like the following:

crdate s1 s2
------- --- ----

04-05-2005 00:00:00 2 (status = 'Y') 1 (status = 'N')
04-09-2005 00:00:00 2 (status = 'Y') 3 (status = 'N')


Thanks.



Tom Kyte

Followup  

May 05, 2005 - 2:07 pm UTC

ops$tkyte@ORA10G> select crdate, count(case when status = 'Y' then 1 end) s1, count(case when status='N' then 1 end) s2
  2    from t
  3   group by crdate;
 
CRDATE                      S1         S2
------------------- ---------- ----------
04-05-2005 00:00:00          2          1
04-09-2005 00:00:00          2          3
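The trick here is that COUNT ignores NULLs, so a CASE with no ELSE branch acts as a per-column filter inside the aggregate. A sketch of the same pattern with Python's sqlite3 (dates stored as plain text for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (d_id INTEGER, status TEXT, crdate TEXT);
    INSERT INTO t VALUES
      (1,'Y','2005-04-05'), (2,'Y','2005-04-05'), (3,'N','2005-04-05'),
      (4,'Y','2005-04-09'), (5,'Y','2005-04-09'), (6,'N','2005-04-09'),
      (7,'N','2005-04-09'), (8,'N','2005-04-09');
""")

# CASE with no ELSE yields NULL for non-matching rows, and COUNT only
# counts non-NULL values -- a conditional count per status in one pass.
rows = conn.execute("""
    SELECT crdate,
           COUNT(CASE WHEN status = 'Y' THEN 1 END) AS s1,
           COUNT(CASE WHEN status = 'N' THEN 1 END) AS s2
    FROM t
    GROUP BY crdate
    ORDER BY crdate
""").fetchall()
print(rows)  # [('2005-04-05', 2, 1), ('2005-04-09', 2, 3)]
```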
 

You rock Tom!!!! Thanks a lot!!!!

May 05, 2005 - 2:31 pm UTC

Reviewer: A reader


Error or WAD (Works as Designed)?

May 31, 2005 - 5:01 pm UTC

Reviewer: A reader

CREATE TABLE DEPT (
DEPT_NO NUMBER,
PT_DEPT_NO NUMBER,
CONSTRAINT dept_pk PRIMARY KEY ( DEPT_NO ),
CONSTRAINT dept_fk FOREIGN KEY ( PT_DEPT_NO ) REFERENCES DEPT( DEPT_NO )
);

INSERT INTO DEPT VALUES( 1, NULL );
INSERT INTO DEPT VALUES( 2, 1 );

SELECT DEPT_NO, PT_DEPT_NO, PRIOR DEPT_NO PRIOR_DEPT_NO
FROM DEPT
START WITH PT_DEPT_NO IS NULL
CONNECT BY PRIOR DEPT_NO = PT_DEPT_NO;

SELECT DEPT_NO, PT_DEPT_NO, PRIOR DEPT_NO PRIOR_DEPT_NO, COUNT(*) OVER ()
FROM DEPT
START WITH PT_DEPT_NO IS NULL
CONNECT BY PRIOR DEPT_NO = PT_DEPT_NO;

Adding in the analytic kills the PRIOR data. I would think you would either get the prior column or get a syntax error.

Tom Kyte

Followup  

June 01, 2005 - 8:44 am UTC

well, the analytics will destroy the sort order in general anyway, so the way I would do this would be:

ops$tkyte@ORA9IR2> select x.*, count(*) over ()
  2    from (
  3          SELECT DEPT_NO, PT_DEPT_NO, PRIOR DEPT_NO PRIOR_DEPT_NO, rownum r
  4          FROM DEPT
  5          START WITH PT_DEPT_NO IS NULL
  6          CONNECT BY PRIOR DEPT_NO = PT_DEPT_NO
  7         ) x
  8   order by r
  9                  ;
 
   DEPT_NO PT_DEPT_NO PRIOR_DEPT_NO          R COUNT(*)OVER()
---------- ---------- ------------- ---------- --------------
         1                                   1              2
         2          1             1          2              2
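CONNECT BY and PRIOR are Oracle-specific. As a sketch of the same "materialize the traversal, then apply the analytic outside" idea in a database without CONNECT BY, here is the two-row dept hierarchy as a recursive CTE in Python's sqlite3 (the depth column r stands in for Tom's captured rownum; an illustration, not the Oracle query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_no INTEGER PRIMARY KEY, pt_dept_no INTEGER);
    INSERT INTO dept VALUES (1, NULL), (2, 1);
""")

# Walk the tree with a recursive CTE, carrying the parent key (the
# PRIOR value) and a depth counter, then apply the analytic outside.
rows = conn.execute("""
    WITH RECURSIVE tree(dept_no, pt_dept_no, prior_dept_no, r) AS (
        SELECT dept_no, pt_dept_no, NULL, 1 FROM dept WHERE pt_dept_no IS NULL
        UNION ALL
        SELECT d.dept_no, d.pt_dept_no, t.dept_no, t.r + 1
        FROM dept AS d JOIN tree AS t ON d.pt_dept_no = t.dept_no
    )
    SELECT dept_no, prior_dept_no, COUNT(*) OVER () AS total
    FROM tree
    ORDER BY r
""").fetchall()
print(rows)  # [(1, None, 2), (2, 1, 2)]
```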

 

Prior vs Analytics

June 01, 2005 - 1:25 pm UTC

Reviewer: A reader

I didn't actually care about the order of the result set. It just seemed odd that PRIOR was syntactically correct but returned the wrong answer based on the analytic being there.

Why does the addition of the analytic kill the PRIOR from working?

Tom Kyte

Followup  

June 01, 2005 - 5:17 pm UTC

Don't know, I did file a bug to have it looked at.

Wondering if this can be done using analytics

June 20, 2005 - 5:35 pm UTC

Reviewer: dharma from CA USA

Hi Tom,

This query can be done using tables, but out of curiosity I want to know if this can be done using analytics.

I have amounts in a bit of a sequence

10 13 17 21

and a tablular chart like this

Month paid | 1 2 3 4
--------------------------
Month Born
1 |10 21 17 13
|
2 |13 10 21 17
|
3 |17 13 10 21
|
4 |21 17 13 10


If I give the "month born" and "month paid" I have to get the amount. (Actually the months would be from 1..12)

Thanks



Tom Kyte

Followup  

June 20, 2005 - 5:46 pm UTC

does not compute. where does a "chart" come into play in the database and what is a "bit of sequence" in database speak?

June 20, 2005 - 6:15 pm UTC

Reviewer: dharma from CA USA

The sequence: I meant that the differences between the amounts resemble a sequence. 10 13 17 21 would be 3,4,4,
and this is seen in the chart. If month born=1 and month paid=1 then the amount is 10; so the data would look like (1,1)=10 (1,4)=13 (1,3)=17 (1,2)=21.

I have to get the amount to be paid, based on when a person is born and when he pays, from the chart.


Tom Kyte

Followup  

June 20, 2005 - 6:21 pm UTC

give me tables. wait, first read this:

</code> http://asktom.oracle.com/Misc/how-to-ask-questions.html <code>

and understand that to you, this all makes obvious sense. To us, this is just text on the screen.

10 13 17 21 would be 3,4,4???? huh?

phrase the question more precisely, eg: what are the inputs and outputs here.


is it as easy as:

select decode( :MONTH_PAID, 1, c1, 2, c2, 3, c3, ...., 12, c12 )
from t
where month_born = :MONTH_BORN;


(turn your chart into a table, with columns month_born and c1..c12 for month paid?)

If so, I would not model it as a chart - but pretend you have no inside knowledge, now -- describe the problem to someone.
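Tom's decode-per-column lookup can be sketched with CASE (the portable equivalent of DECODE). A toy version in Python's sqlite3, assuming a 4-month chart table as he describes (table and values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chart (month_born INTEGER, c1 INTEGER, c2 INTEGER,
                        c3 INTEGER, c4 INTEGER);
    INSERT INTO chart VALUES (1, 10, 21, 17, 13), (2, 13, 10, 21, 17);
""")

def amount(month_born, month_paid):
    # CASE picks the column that corresponds to the month paid,
    # just like Tom's decode(:MONTH_PAID, 1, c1, 2, c2, ...).
    return conn.execute("""
        SELECT CASE :mp WHEN 1 THEN c1 WHEN 2 THEN c2
                        WHEN 3 THEN c3 WHEN 4 THEN c4 END
        FROM chart WHERE month_born = :mb
    """, {"mp": month_paid, "mb": month_born}).fetchone()[0]

print(amount(1, 1))  # 10
print(amount(1, 4))  # 13
```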

June 20, 2005 - 6:37 pm UTC

Reviewer: dharma

In my first post I have already mentioned that I have done this using tables. I wanted to know from the chart if it was possible to write a query to get the desired result by passing the "month paid" and "month born" parameters. I am a regular user of this site and I have already read "How to ask a question" a couple of times. It's just that I thought the question was clear enough. My apologies if it's not.
Thanks

Tom Kyte

Followup  

June 21, 2005 - 7:50 am UTC

ummm, well, do you care to share your tables. I'm not understanding what a "chart" is.

If the "chart" is a table with rows and columns, the query is above, the one with decode.

If you've already done this using tables, then you must have already answered your own question? I mean, you have the data in some tables, you are running queries to get the answer? So -- yes, I am still somewhat confused as to the question here.

Look at your question, do you see any create tables? insert intos? I see questions about "what is a chart in relation to a database" for example?

Need Top N query with sum of the rest.

June 21, 2005 - 6:01 am UTC

Reviewer: Shrikant from Mumbai, India

Dear Tom,

Thanks a lot for the valuable information/suggestions/links you are providing. And also thanks a lot for your book "Expert One-on-One".


I have a question regarding the TOP-N-Query. I have following Fact Table "Invoice_Fact"

Country Customer Sale_Amount Quantity
USA USA1 5000 50
AUS AUS1 500 30
USA USA1 2000 30
USA USA2 700 30
USA USA3 250 40
AUS AUS2 300 70
USA USA4 1700 30
AUS AUS3 600 20
USA USA4 1700 30
USA USA5 1200 40
USA USA6 300 30
USA USA7 500 70
AUS AUS4 2600 20
AUS AUS4 1600 20
AUS AUS5 1800 10
AUS AUS6 1400 15


The end user wants a report that lists the top 5 customers based on sales amount and also gives the total for the rest of the customers. I mean the output should be like this.

Country Rank Customer Sum_Sales Quantity %on_Total_Sale
--------------------------------------------------------
USA 1 USA1 7000 80 52.43%
USA 2 USA4 3400 60 25.47%
USA 3 USA5 1200 40 8.99%
USA 4 USA2 700 30 5.24%
USA 5 USA7 500 70 3.75%
USA 6 others 550 70 4.12%

Here "others" means sum of USA3 and USA6. And the percentage is taken based on the grand sales total for USA which is 13350.

I wanted to achieve this in SQL, please guide me in this.

P.S :-
Tom, I know I should not ask a fresh question in a Review, but believe me, I searched for a similar question and couldn't find one. I found that the base question and related answers closely match my requirement; that's why I put the question here.


Thanks a lot in advance.

Tom Kyte

Followup  

June 21, 2005 - 4:44 pm UTC

not a create table or insert into in sight....

I will help with the inserts and create table

June 21, 2005 - 6:59 pm UTC

Reviewer: AS from UK

Hi Tom,

I will add the required scripts, as I was trying to do this myself.
I have also worked out a solution as per my understanding, and I am anxiously waiting for your comments. I am trying to get the hang of analytic and aggregate functions.

Thanks

create table t ( country varchar2(10) ,
customer varchar2 (10),
sale_amount number (5),
quantity number(5) ) ;

insert into t
select 'USA', 'USA1', 5000, 50 from dual union
select 'AUS', 'AUS1', 500 , 30 from dual union
select 'USA', 'USA1', 2000, 30 from dual union
select 'USA', 'USA2', 700, 30 from dual union
select 'USA', 'USA3', 250, 40 from dual union
select 'AUS', 'AUS2', 300, 70 from dual union
select 'USA', 'USA4', 1700, 30 from dual union
select 'AUS', 'AUS3', 600, 20 from dual union
select 'USA', 'USA4' , 1700, 30 from dual union
select 'USA', 'USA5', 1200, 40 from dual union
select 'USA', 'USA6', 300, 30 from dual union
select 'USA', 'USA7', 500, 70 from dual union
select 'AUS', 'AUS4', 2600, 20 from dual union
select 'AUS', 'AUS4', 1600, 20 from dual union
select 'AUS', 'AUS5', 1800, 10 from dual union
select 'AUS', 'AUS6', 1400, 15 from dual
;
commit ;


I came up with this solution .

select mytab.*, ratio_to_report( sale_amount ) over ( ) * 100 as on_total_sale
from
(select country,
rank,
customer,
sale_amount,
quantity
from
(select country,
sale_amount,
rank( ) over ( order by sale_amount desc) rank,
-- dense_rank ( ) over( order by sale_amount desc ) drank,
customer ,
quantity
from (select country,
customer,
sum(sale_amount) as sale_amount,
sum(quantity) as quantity
from t
group by country, customer)
order by sale_amount desc)
where rank <=5
UNION ALL
select 'USA',
6,
'Others',
sum(sale_amount) as sale_amount,
sum(quantity)
from
(select country,
sale_amount,
rank( ) over ( order by sale_amount desc) rank,
-- dense_rank ( ) over( order by sale_amount desc ) drank,
customer ,
quantity
from (select country,
customer,
sum(sale_amount) as sale_amount,
sum(quantity) as quantity
from t
group by country, customer)
order by sale_amount desc)
where rank > 5
group by 'USA', 6,'Others') mytab ;


COUNTRY RANK CUSTOMER SALE_AMOUNT QUANTITY ON_TOTAL_SALE
---------- ------------- ---------- ------------- ------------- -------------
USA 1 USA1 7000 80 34.23
AUS 2 AUS4 4200 40 20.54
AUS 3 AUS5 1800 10 8.80
USA 4 USA4 1700 30 8.31
AUS 5 AUS6 1400 15 6.85
USA 6 Others 4350 330 21.27

6 rows selected.


Tom Kyte

Followup  

June 21, 2005 - 11:53 pm UTC

ops$tkyte@ORA10GR1> select cty, cust, amt, qty, dr, ratio_to_report(amt) over ()*100 tot_pct
  2    from (
  3  select case when dr <= 5 then country else 'oth' end cty,
  4         case when dr <= 5 then customer else 'others' end cust,
  5             sum(amt) amt, sum(qty) qty, dr
  6    from (
  7  select country, customer, amt, qty,
  8         least( dense_rank() over (order by amt desc nulls last), 6 ) dr
  9    from (
 10  select country, customer, sum(sale_amount) amt, sum(quantity) qty
 11    from t
 12   group by country, customer
 13         )
 14             )
 15   group by case when dr <= 5 then country else 'oth' end ,
 16            case when dr <= 5 then customer else 'others' end ,
 17                    dr
 18             )
 19   order by dr
 20  /
 
CTY        CUST              AMT        QTY         DR    TOT_PCT
---------- ---------- ---------- ---------- ---------- ----------
USA        USA1             7000         80          1 34.2298289
AUS        AUS4             4200         40          2 20.5378973
AUS        AUS5             1800         10          3 8.80195599
USA        USA4             1700         30          4 8.31295844
AUS        AUS6             1400         15          5 6.84596577
oth        others           4350        330          6 21.2713936
 
6 rows selected.
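The key move in Tom's query is least(dense_rank(), 6): everything ranked past 5 collapses into a single bucket that can then be grouped into "others". A compact sketch of that pattern with Python's sqlite3, starting from the per-customer totals (the country/'oth' column and the ratio_to_report step are omitted to keep it short; data matches the thread's per-customer sums):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (country TEXT, customer TEXT, amt INTEGER);
    INSERT INTO t VALUES
      ('USA','USA1',7000), ('AUS','AUS4',4200), ('AUS','AUS5',1800),
      ('USA','USA4',1700), ('AUS','AUS6',1400), ('USA','USA5',1200),
      ('USA','USA2',700),  ('AUS','AUS3',600),  ('USA','USA7',500),
      ('AUS','AUS1',500),  ('USA','USA6',300),  ('AUS','AUS2',300),
      ('USA','USA3',250);
""")

# MIN(rnk, 6) is SQLite's scalar least(): ranks 6 and beyond all become
# 6, so grouping on it folds the tail into one "others" row.
rows = conn.execute("""
    SELECT CASE WHEN rnk <= 5 THEN customer ELSE 'others' END AS cust,
           SUM(amt) AS amt,
           MIN(rnk, 6) AS dr
    FROM (SELECT customer, amt,
                 DENSE_RANK() OVER (ORDER BY amt DESC) AS rnk
          FROM t)
    GROUP BY cust, dr
    ORDER BY dr
""").fetchall()
print(rows)
```

The "others" bucket sums to 4350, matching Tom's output above.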
 
 

Thanks You Very Much, Tom.

June 22, 2005 - 3:52 am UTC

Reviewer: A reader

You are great.

June 22, 2005 - 7:22 am UTC

Reviewer: A reader

Hi Tom,

I have the following requirement:

create table t1
(
datecol date,
idcol1 number,
idcol2 number,
col1 varchar2(20),
charcol char(1),
opcol varchar2(10)
);

I am doing the following queries in a single procedure all based on t1.

select max(datecol), max(idcol1), max(idcol2)
into tdatecol, tidcol1, tidcol2
from t1
where col1 = '1234'
and charcol = 'A';

select max(datecol), max(idcol2)
into tdatecol1, tidcol21
from t1
where idcol1 = tidcol1
and charcol = 'B';

select opcol into topcol
from t1
where idcol2 = tidcol2;

if (tidcol21 is not null) then
select opcol into topcol1
from t1
where idcol2 = tidcol21;
end if;

Is it possible for me to put this all in a single query? Can I use analytic functions for this? Please help.

Thanks.


Tom Kyte

Followup  

June 22, 2005 - 7:31 am UTC

please, ask things once and in one place. I'll play with this when I get a chance later.

June 22, 2005 - 10:05 am UTC

Reviewer: A reader

Hi Tom,

I am sorry about that. Though I already posted the same question in another topic, I think two people asked two questions at the same time and I didn't know if you got my question.

Thanks a lot for your help.



Possible answer for "A Reader"

June 22, 2005 - 11:45 am UTC

Reviewer: Sean from Columbus, OH USA

I wasn't sure about constraints and so on, so I took a guess, based on the queries, that idcol2 must be unique.

I don't consider this optimal, but it's a start.

Hope it helps!


SELECT tdatecol, tidcol1, tidcol2,
tdatecol1, tidcol21, topcol,
(CASE
WHEN tidcol21 IS NOT NULL
THEN (SELECT opcol
FROM t1
WHERE idcol2 = tidcol21)
ELSE 'UNKNOWN'
-- no "else" in original question,
--so I didn't know what to do here
END
) topcol1
FROM (SELECT DISTINCT tdatecol, tidcol1, tidcol2,
MAX(datecol) OVER () tdatecol1,
MAX(b.idcol2) OVER () tidcol21,
(SELECT opcol FROM t1
WHERE idcol2 = tidcol2) topcol
FROM (SELECT MAX(datecol) tdatecol,
MAX(idcol1) tidcol1,
MAX(idcol2) tidcol2
FROM t1
WHERE col1 = '1234'
AND charcol = 'A'
) a,
t1 b
WHERE b.idcol1(+) = a.tidcol1
AND b.charcol(+) = 'B')

Shameful cross posting

June 23, 2005 - 2:35 am UTC

Reviewer: Mike from England

This is a shameful cross posting, as I've not generated a lot of discussion on google groups. There is, however, an interesting (I think) analytics challenge on there.

See

</code> http://groups-beta.google.com/group/comp.databases.oracle.server/browse_frm/thread/d132a82f68dc7862/a8f467ef122f7f02?hl=en#a8f467ef122f7f02 <code>

or if that doesn't work, search for:

SQL Challenge DEGREES

I've got a C++ developer breathing down my neck in terms of performance (as I'm using a bit of PL/SQL), so I could do with some help to save the good name of Oracle and SQL.

Big thanks to those who follow the link

Cheers,

Mike.

Little debugging …

June 24, 2005 - 4:12 pm UTC

Reviewer: Gabe

Mike:

[I only read the google groups … don’t like to leave my email … so here it is].

Jonathan missed a little thing in there … he should use row_number() and not rank().

Consider:

create table test_data
( id number(9) not null
,degrees number(3) not null check (degrees between 0 and 359)
,sequence_no number(9)
);

insert into test_data values ( 1, 358, null);
insert into test_data values ( 1, 10, null);
insert into test_data values ( 1, 180, null);
insert into test_data values ( 1, 181, null);

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 rank() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 rank() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

ID SEQUENCE_NO DEGREES NEW_SEQ
---------- ----------- ---------- ----------
1 358 1
1 10 2
1 180 3
1 181 4

Seems OK … but …

flip@FLOP> insert into test_data values ( 1, 2, null);

1 row created.

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 rank() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 rank() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

ID SEQUENCE_NO DEGREES NEW_SEQ
---------- ----------- ---------- ----------
1 358 1
1 2 1 <== wrong sequence number
1 10 3
1 180 4
1 181 5

One row_number() seems to fix it …

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 row_number() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 rank() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

ID SEQUENCE_NO DEGREES NEW_SEQ
---------- ----------- ---------- ----------
1 358 1
1 2 2
1 10 3
1 180 4
1 181 5

But [while proving the results are non-deterministic] we also see:

flip@FLOP> delete from test_data where id=1 and degrees=358;

1 row deleted.

flip@FLOP> insert into test_data values ( 1, 358, null);

1 row created.

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 row_number() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 rank() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

ID SEQUENCE_NO DEGREES NEW_SEQ
---------- ----------- ---------- ----------
1 2 1
1 358 2 <== wrong sequence number
1 10 3
1 180 4
1 181 5

So, both calls to rank() should be replaced with row_number() …

flip@FLOP> select
2 id,
3 sequence_no,
4 degrees,
5 row_number() over (
6 partition by id
7 order by
8 decode(first,1,0,degrees)
9 ) new_seq
10 from (
11 select
12 id,
13 sequence_no,
14 degrees,
15 row_number() over (
16 partition by id
17 order by least(degrees, abs(360-degrees))
18 ) first
19 from
20 test_data
21 )
22 ;

ID SEQUENCE_NO DEGREES NEW_SEQ
---------- ----------- ---------- ----------
1 2 1
1 10 2
1 180 3
1 181 4
1 358 5

Which still leaves the result non-deterministic ... if that concerns you.
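Gabe's point, distilled: RANK() assigns tied rows the same number (and then skips), while ROW_NUMBER() always yields a gap-free 1..n sequence, made deterministic by adding a tie-breaker to the ORDER BY. A stripped-down sketch with Python's sqlite3 using just the inner "distance from north" ranking (the degrees tie-breaker below is my addition, to address the non-determinism he mentions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_data (id INTEGER, degrees INTEGER);
    INSERT INTO test_data VALUES (1, 358), (1, 10), (1, 180), (1, 181), (1, 2);
""")

# RANK(): 2 and 358 are both "2 away" from 0/360, so both get rank 1
# and the sequence jumps straight to 3 -- the duplicate Gabe flagged.
rk = conn.execute("""
    SELECT degrees,
           RANK() OVER (PARTITION BY id
                        ORDER BY MIN(degrees, ABS(360 - degrees))) AS r
    FROM test_data
    ORDER BY r, degrees
""").fetchall()

# ROW_NUMBER() with an explicit tie-breaker gives a gap-free,
# deterministic 1..n sequence.
rn = conn.execute("""
    SELECT degrees,
           ROW_NUMBER() OVER (PARTITION BY id
                              ORDER BY MIN(degrees, ABS(360 - degrees)),
                                       degrees) AS seq
    FROM test_data
    ORDER BY seq
""").fetchall()
print(rk)  # rank values: 1, 1, 3, 4, 5
print(rn)  # row_number values: 1, 2, 3, 4, 5
```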

Cheers.



payments distribution problem, could you help.

August 09, 2005 - 5:49 am UTC

Reviewer: Alexander Semenov from Ukraine

Hi, Tom!

Is there any way to implement the task:

There are two tables “Debt_tab” and “Pmnt_tab” with such structure:

create table debt_tab(
debt_id number,
ag_id number,
debt_sum number
);

create table pmnt_tab(
pm_id number,
ag_id number,
pmnt_sum number
);

insert into debt_tab
values(1, 25, 101.0);
insert into debt_tab
values(2, 25, 200.0);
insert into debt_tab
values(3, 35, 213.0);
insert into debt_tab
values(4, 35, 15.0);
insert into debt_tab
values(5, 35, 132.0);
insert into debt_tab
values(6, 35, 39.0);


insert into pmnt_tab
values(1, 25, 25.0);
insert into pmnt_tab
values(2, 25, 80.0);
insert into pmnt_tab
values(3, 25, 250.0);
insert into pmnt_tab
values(4, 35, 100.0);
insert into pmnt_tab
values(5, 35, 120.0);
insert into pmnt_tab
values(6, 35, 108.0);

Is it possible to write SQL that will return the distribution of payments across debts according to their order:

debt_id ag_id debt_sum pm_id pmnt_sum rest_topay
1 25 101.0 1 25.0 76.0
1 25 101.0 2 76.0 0.0
2 25 200.0 2 4.0 196.0
2 25 200.0 3 196.0 0.0
25 3 54.0 -54.0
3 35 213.0 4 100.0 113.0
3 35 213.0 5 113.0 0.0
4 35 15.0 5 7.0 8.0
4 35 15.0 6 8.0 0.0
5 35 132.0 6 100.0 32.0

Thank you.

Tom Kyte

Followup  

August 09, 2005 - 10:11 am UTC

it would be super tricky to get this part out:

1 25 101.0 2 76.0 0.0
2 25 200.0 2 4.0 196.0 <<=== that one
2 25 200.0 3 196.0 0.0

nothing pops to mind for this.

Solution for above?

August 09, 2005 - 2:13 pm UTC

Reviewer: Bob B from Albany, NY

I'm not sure where the 5th record came from (the one with no debt id or debt sum), but this query returns all the other records in the same order and seems to have the logic correct.

SELECT *
FROM (
SELECT
d.DEBT_ID,
d.AG_ID,
d.DEBT_SUM,
p.PM_ID,
LEAST( p.PMNT_SUM, d.RUNNING_TOTAL - p.RUNNING_TOTAL + p.PMNT_SUM, d.DEBT_SUM - d.RUNNING_TOTAL + p.RUNNING_TOTAL ) PMNT_SUM,
CASE
WHEN d.RUNNING_TOTAL - p.RUNNING_TOTAL < -1 * p.PMNT_SUM THEN NULL
WHEN d.RUNNING_TOTAL - p.RUNNING_TOTAL > d.DEBT_SUM THEN NULL
ELSE GREATEST( d.RUNNING_TOTAL - p.RUNNING_TOTAL, 0 )
END REST_TO_PAY
FROM (
SELECT DEBT_ID, AG_ID, DEBT_SUM, SUM( DEBT_SUM ) OVER ( PARTITION BY AG_ID ORDER BY DEBT_ID ) RUNNING_TOTAL
FROM DEBT_TAB
) d, (
SELECT PM_ID, AG_ID, PMNT_SUM, SUM( PMNT_SUM ) OVER ( PARTITION BY AG_ID ORDER BY PM_ID ) RUNNING_TOTAL
FROM PMNT_TAB pt
)p
WHERE d.AG_ID = p.AG_ID
)
WHERE REST_TO_PAY IS NOT NULL
ORDER BY DEBT_ID, PM_ID

DEBT_ID AG_ID DEBT_SUM PM_ID PMNT_SUM REST_TO_PAY
1 25 101 1 25 76
1 25 101 2 76 0
2 25 200 2 4 196
2 25 200 3 196 0
3 35 213 4 100 113
3 35 213 5 113 0
4 35 15 5 7 8
4 35 15 6 8 0
5 35 132 6 100 32


Tom Kyte

Followup  

August 09, 2005 - 3:22 pm UTC

thanks -- these are great "puzzles" aren't they sometimes.

Puzzles are always fun

August 09, 2005 - 4:01 pm UTC

Reviewer: Bob B from Albany, NY

It was simpler than my original idea of using the MODEL clause. The problem is similar to the merge portion of a sort merge. Take two sequenced lists and use the elements from each list in order, according to the "sort" rules. In this case the rules were:

1) Net owed (REST_TO_PAY) must be less than or equal to the current debt record
2) Prior net owed (REST_TO_PAY + current payment) must be greater than 0
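Bob's merge idea can be sketched compactly: compute a running total on each side, join where the intervals overlap, and clamp the applied amount with LEAST/GREATEST. A Python sqlite3 version for just ag_id 25 (it folds in the extra d.debt_sum guard Alexander adds later in the thread; SQLite's multi-argument MIN/MAX play the role of Oracle's LEAST/GREATEST):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE debt_tab (debt_id INTEGER, ag_id INTEGER, debt_sum INTEGER);
    CREATE TABLE pmnt_tab (pm_id INTEGER, ag_id INTEGER, pmnt_sum INTEGER);
    INSERT INTO debt_tab VALUES (1, 25, 101), (2, 25, 200);
    INSERT INTO pmnt_tab VALUES (1, 25, 25), (2, 25, 80), (3, 25, 250);
""")

# Each side carries a running total; a debt/payment pair is relevant when
# the payment's running-total window overlaps the debt's.
rows = conn.execute("""
    SELECT d.debt_id, p.pm_id,
           MIN(p.pmnt_sum,
               d.rt - p.rt + p.pmnt_sum,
               d.debt_sum - d.rt + p.rt,
               d.debt_sum) AS applied,
           MAX(d.rt - p.rt, 0) AS rest_to_pay
    FROM (SELECT debt_id, ag_id, debt_sum,
                 SUM(debt_sum) OVER (PARTITION BY ag_id ORDER BY debt_id) AS rt
          FROM debt_tab) AS d
    JOIN (SELECT pm_id, ag_id, pmnt_sum,
                 SUM(pmnt_sum) OVER (PARTITION BY ag_id ORDER BY pm_id) AS rt
          FROM pmnt_tab) AS p
      ON d.ag_id = p.ag_id
    WHERE d.rt - p.rt >= -p.pmnt_sum
      AND d.rt - p.rt <= d.debt_sum
    ORDER BY d.debt_id, p.pm_id
""").fetchall()
print(rows)  # [(1, 1, 25, 76), (1, 2, 76, 0), (2, 2, 4, 196), (2, 3, 196, 0)]
```

Payment 2 (80) finishes debt 1 (76 applied) and starts debt 2 (4 applied), which is exactly the tricky split Tom pointed at.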

Puzzles are always fun

August 10, 2005 - 10:02 am UTC

Reviewer: Alexander Semenov from Ukraine

Bob, thanks a lot!


Found a bug in the result view of the puzzle but not in distribution result.

August 17, 2005 - 2:19 pm UTC

Reviewer: Alexander Semenov from Ukraine

Hi,
The bug resides at PMNT_SUM:

LEAST( p.PMNT_SUM,
d.RUNNING_TOTAL - p.RUNNING_TOTAL + p.PMNT_SUM,
d.DEBT_SUM - d.RUNNING_TOTAL + p.RUNNING_TOTAL ) as PMNT_SUM

The bug isn't seen in the example because of the "ideal conditions" shown there. It arises when a payment shifts and is spread across a few debts.
Try this:

create table debt_tab(
debt_id number,
ag_id number,
debt_sum number
);

create table pmnt_tab(
pm_id number,
ag_id number,
pmnt_sum number
);

insert into debt_tab
values(3, 35, 1650.0);
insert into debt_tab
values(4, 35, 804.0);
insert into debt_tab
values(5, 35, 525.0);
insert into debt_tab
values(6, 35, 765.0);


insert into pmnt_tab
values(4, 35, 520.0);
insert into pmnt_tab
values(5, 35, 19.0);
insert into pmnt_tab
values(6, 35, 2800.0);

and after executing the distribution query, instead of the expected result we get:
DEBT_ID AG_ID DEBT_SUM PM_ID p_sum PMNT_SUM REST_TO_PAY
3 35 1650 4 520 520 1130
3 35 1650 5 19 19 1111
3 35 1650 6 2800 1111 0
4 35 804 6 2800 1689 0 <---here
5 35 525 6 2800 885 0
6 35 765 6 2800 360 405

1689=2800-1111 = d.Debt_Sum-d.Running_Total+p.Running_Total

It seems that some other condition is missing, or something else...
Could it be fixed?

Thank you.

Bug is fixed

August 18, 2005 - 6:12 am UTC

Reviewer: Alexander Semenov from Ukraine

Yesterday I was too tired...
The solution:

LEAST(
p.PMNT_SUM,
d.RUNNING_TOTAL - p.RUNNING_TOTAL + p.PMNT_SUM,
d.DEBT_SUM - d.RUNNING_TOTAL + p.RUNNING_TOTAL,
d.DEBT_SUM <--
) as PMNT_SUM



A query that will fill in the missing rows for the payments distribution problem posted on August 09

August 24, 2005 - 11:39 am UTC

Reviewer: Frank Zhou from Braintree, MA

Here is the SQL, built on top of Bob's nice query.
This SQL seems to return a result set that fills in the missing rows:

(     25               3       54.0       -54.0 )  <==========missing rows

Thanks,

Frank



SQL> SELECT debt_id , ag_id , debt_sum, pm_id, PMNT_SUM, rest_to_pay
  2   FROM
  3  (    SELECT debt_id   ,
  4              ag_id     ,
  5              DECODE( r, 1, debt_sum, NULL ) debt_sum,
  6              pm_id,
  7              CASE WHEN r = 1 THEN PMNT_SUM
  8                   ELSE CASE WHEN diff > 0
  9                     THEN diff
 10                     ELSE NULL END
 11                   END AS PMNT_SUM,
 12              CASE WHEN r = 2
 13                   THEN 0 - diff
 14               ELSE  REST_TO_PAY END AS rest_to_pay,
 15            flag, r , diff
 16      FROM ( SELECT *
 17          FROM ( SELECT  d.DEBT_ID,
 18                         d.AG_ID,
 19                  d.DEBT_SUM,
 20                         p.PM_ID,
 21                    LEAST(p.PMNT_SUM,
 22                          d.RUNNING_TOTAL - p.RUNNING_TOTAL + p.PMNT_SUM,
 23                          d.DEBT_SUM - d.RUNNING_TOTAL + p.RUNNING_TOTAL,
 24                          d.DEBT_SUM ) AS PMNT_SUM,
 25               CASE
 26               WHEN d.RUNNING_TOTAL - p.RUNNING_TOTAL < -1 * p.PMNT_SUM 
                  THEN NULL
 27               WHEN d.RUNNING_TOTAL - p.RUNNING_TOTAL > d.DEBT_SUM 
                  THEN NULL
 28               ELSE GREATEST( d.RUNNING_TOTAL - p.RUNNING_TOTAL, 0 )
 29               END REST_TO_PAY,
 30               pm_sum - de_sum  AS diff,
 31                CASE WHEN d.ag_id != lead(d.ag_id ) over (ORDER BY d.ag_id)
 32                 THEN '1' END AS flag
 33                 FROM (
 34                      SELECT DEBT_ID, AG_ID, DEBT_SUM,
 35                      SUM( DEBT_SUM ) OVER ( PARTITION BY AG_ID 
                            ORDER BY DEBT_ID ) RUNNING_TOTAL ,
 36                      SUM( DEBT_SUM ) OVER ( PARTITION BY AG_ID) de_sum
 37                      FROM DEBT_TAB) d,
 38                     ( SELECT PM_ID, AG_ID, PMNT_SUM,
 39                       SUM( PMNT_SUM ) OVER ( PARTITION BY AG_ID 
                              ORDER BY PM_ID ) RUNNING_TOTAL,
 40                       SUM( PMNT_SUM ) OVER ( PARTITION BY AG_ID) pm_sum
 41                           FROM PMNT_TAB pt ) p
 42                  WHERE d.AG_ID = p.AG_ID)
 43                WHERE REST_TO_PAY IS NOT NULL),
 44                (SELECT 1 r FROM dual UNION ALL SELECT 2 r FROM dual)
 45            )
 46           WHERE r = 1 OR ( flag = 1 AND diff > 0 )
 47       ORDER BY DEBT_ID, PM_ID
 48  /

   DEBT_ID      AG_ID   DEBT_SUM      PM_ID   PMNT_SUM REST_TO_PAY              
---------- ---------- ---------- ---------- ---------- -----------              
         1         25        101          1         25          76              
         1         25        101          2         76           0              
         2         25        200          2          4         196              
         2         25        200          3        196           0              
         2         25                     3         54         -54 <==      
         3         35        213          4        100         113              
         3         35        213          5        113           0              
         4         35         15          5          7           8              
         4         35         15          6          8           0              
         5         35        132          6        100          32              

10 rows selected.

SQL> spool off
 

Overlapping data sets

September 09, 2005 - 4:11 am UTC

Reviewer: Russell from Melbourne, VIC Australia

Hi Tom,

I have a table structure as follows, and relevant data.

CREATE TABLE TABLE1 (
REFDATE DATE,
AGS VARCHAR2(10),
CONFIRMATIONDATE DATE,
STARTTIME DATE,
COMPLETIONDATE DATE,
SORTORDER NUMBER );

Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:59:23', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 07:59:24', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:23:58', 'MM/DD/YYYY HH24:MI:SS'), 24);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:24:25', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:24:26', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:24:28', 'MM/DD/YYYY HH24:MI:SS'), 23);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:24:47', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:24:48', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:24:50', 'MM/DD/YYYY HH24:MI:SS'), 22);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:25:09', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:10', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:13', 'MM/DD/YYYY HH24:MI:SS'), 21);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:25:34', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:35', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:37', 'MM/DD/YYYY HH24:MI:SS'), 20);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:25:58', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:25:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:01', 'MM/DD/YYYY HH24:MI:SS'), 19);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:26:21', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:22', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:24', 'MM/DD/YYYY HH24:MI:SS'), 18);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:26:43', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:44', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:26:46', 'MM/DD/YYYY HH24:MI:SS'), 17);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:27:06', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:27:07', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:27:09', 'MM/DD/YYYY HH24:MI:SS'), 16);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:57:08', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:57:09', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 13:18:06', 'MM/DD/YYYY HH24:MI:SS'), 15);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 13:19:20', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 13:19:21', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:56:53', 'MM/DD/YYYY HH24:MI:SS'), 14);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 14:57:29', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:57:30', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:57:32', 'MM/DD/YYYY HH24:MI:SS'), 13);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 14:57:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:58:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:58:05', 'MM/DD/YYYY HH24:MI:SS'), 12);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:54:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 11:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 15:00:00', 'MM/DD/YYYY HH24:MI:SS'), 11);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:55:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 10:30:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 15:02:00', 'MM/DD/YYYY HH24:MI:SS'), 10);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:58:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 11:30:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 15:04:00', 'MM/DD/YYYY HH24:MI:SS'), 9);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 07:57:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 09:30:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 15:06:00', 'MM/DD/YYYY HH24:MI:SS'), 8);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:32:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:32:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:00:00', 'MM/DD/YYYY HH24:MI:SS'), 7);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:30:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:32:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:02:00', 'MM/DD/YYYY HH24:MI:SS'), 6);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 14:58:26', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 14:58:27', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:03:27', 'MM/DD/YYYY HH24:MI:SS'), 5);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 16:03:50', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:03:51', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:03:53', 'MM/DD/YYYY HH24:MI:SS'), 4);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 12:31:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 12:31:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:14:00', 'MM/DD/YYYY HH24:MI:SS'), 3);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 16:05:27', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:05:28', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:31:00', 'MM/DD/YYYY HH24:MI:SS'), 2);
Insert into Table1 Values (TO_DATE('06/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), '36852707', TO_DATE('06/29/2005 16:31:24', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:31:25', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('06/29/2005 16:34:34', 'MM/DD/YYYY HH24:MI:SS'), 1);

The records indicate tasks completed by our field workforce. The ConfirmationDate is the time when a technician selects a task. They are meant to travel to the customer's premises and, once onsite, they start working (i.e. STARTTIME). Once finished, they COMPLETE the task. It is possible for a technician to have more than one task at a time (how they work on more than one is a mystery to me). The SortOrder field is something I have added to the task, and is basically based on the CompletionDate in descending order.

I am trying to work out work and travel times. The travel time should be the time between the earliest CONFIRMATIONDATE and the earliest STARTTIME (displayed on the first task only). The work time would be the time between the minimum STARTTIME and the maximum COMPLETIONDATE, divided by the number of jobs in the set. I am trying to ascertain the complete overlapping set. I have attempted, using analytic functions shown on your site, to calculate this result. I would be fine if I could just put the tasks in order based on one column (i.e. COMPLETIONDATE), but having two columns with conflicting sorts (CONFIRMATIONDATE and COMPLETIONDATE), and then establishing the total overlapping records for both of these sets, is really testing me.

Looking at the dataset, records with SORTORDER between 12 and 24 are separate tasks without overlapping times, but tasks 1 to 11 overlap each other and, ultimately, tasks 12 to 24. Therefore, for this example, I am looking for a result like:
Travel time between 07:54:00 and 07:59:24, and work time between 07:59:24 and 16:34:34 (divided by 24).

I am using Oracle 9i. Is it possible to do this in a single SQL statement?

I hope this is clear enough

Thanks
Russell

Tom Kyte

Followup  

September 09, 2005 - 7:16 am UTC

I don't see one easily -- no.

the problem is, we'd need to know the number of tasks in the group (in order to find the maximum completion time), but in order to know the number of jobs, we need to know the ultimate completion times....



can we treat this data at minute-level granularity instead of seconds? or even better - say a granularity of 5 minutes - we can "explode the data out" to have an observation per minute/5 minutes/whatever and just aggregate.
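To make the "explode and aggregate" idea concrete, here is a sketch against the TABLE1 structure above; the row-generator size (500) and the 5-minute slot arithmetic are my assumptions, not a tested solution:

```sql
-- Sketch: break each task into one row per 5-minute slot it touches,
-- then count DISTINCT slots so overlapping tasks are only counted once.
SELECT ags, refdate,
       COUNT(DISTINCT slot) * 5 AS busy_minutes
  FROM (SELECT t.ags, t.refdate,
               -- slot 0 = midnight..00:05, slot 1 = 00:05..00:10, ...
               TRUNC((t.starttime - t.refdate) * 24 * 12) + r.n AS slot
          FROM table1 t,
               (SELECT LEVEL - 1 AS n
                  FROM dual
               CONNECT BY LEVEL <= 500) r    -- 288 slots cover a day
         WHERE r.n <= (t.completiondate - t.starttime) * 24 * 12)
 GROUP BY ags, refdate;
```

Once overlaps have been collapsed to distinct slots, the travel/work split can be derived from the MIN(confirmationdate), MIN(starttime) and MAX(completiondate) of the same grouping.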

Calculate month-wise

November 02, 2005 - 1:48 am UTC

Reviewer: Kumar from India

Hi Tom,

I have table like this,

Action-type startdt enddt
ADD 10/01/2005 11/01/2005
ADD 09/01/2005 10/01/2005
UPDATE 10/01/2005 10/01/2005
DELETE 10/01/2005 10/01/2005

Now, I have to generate a report (based on the enddate) as follows,

ActionType currentmonthcount month-1_count month-2_count
Add 2 2 1
update 1 1 1
Delete 1 0 0

Can you please help me in this regard?

Thanks

Tom Kyte

Followup  

November 02, 2005 - 5:03 am UTC

I'd really like to, but there wasn't any create table or insert into :(

Calculate month-wise

November 02, 2005 - 6:44 am UTC

Reviewer: Kumar from India

Hi Tom,

Thanks for your reply.

Please find below the create table statement and my select query. I would like to know, is there a better way to frame this query.

CREATE TABLE WORKFLOW_HISTORY (
ACTION_TYPE VARCHAR2 (255),
WORK_START_DTM DATE,
WORK_END_DTM DATE
) ;

select a. action_type,
decode(b.cnt, null, 0, b.cnt) December,
decode(c.cnt, null, 0, c.cnt) January,
decode(d.cnt, null, 0, d.cnt) Feburary,
decode(e.cnt, null, 0, e.cnt) March,
decode(f.cnt, null, 0, f.cnt) April,
decode(g.cnt, null, 0, g.cnt) May,
decode(h.cnt, null, 0, h.cnt) June,
decode(j.cnt, null, 0, j.cnt) July,
decode(k.cnt, null, 0, k.cnt) August,
decode(m.cnt, null, 0, m.cnt) September,
decode(n.cnt, null, 0, n.cnt) October,
decode(p.cnt, null, 0, p.cnt) November
from workflow_history a,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-11), 'mon')
and work_end_dtm is not null
group by action_type) b,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-10), 'mon')
and work_end_dtm is not null
group by action_type) c,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-9), 'mon')
and work_end_dtm is not null
group by action_type) d,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-8), 'mon')
and work_end_dtm is not null
group by action_type) e,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-7), 'mon')
and work_end_dtm is not null
group by action_type) f,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-6), 'mon')
and work_end_dtm is not null
group by action_type) g,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-5), 'mon')
and work_end_dtm is not null
group by action_type) h,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-4), 'mon')
and work_end_dtm is not null
group by action_type) j,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-3), 'mon')
and work_end_dtm is not null
group by action_type) k,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-2), 'mon')
and work_end_dtm is not null
group by action_type) m,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(add_months(sysdate,-1), 'mon')
and work_end_dtm is not null
group by action_type) n,
(select action_type, count(action_type) cnt from workflow_history
where to_char(trunc(work_end_dtm), 'mon') = to_char(sysdate, 'mon')
and work_end_dtm is not null
group by action_type) p
where a.action_type = b.action_type (+)
and a.action_type = c.action_type (+)
and a.action_type = d.action_type (+)
and a.action_type = e.action_type (+)
and a.action_type = f.action_type (+)
and a.action_type = g.action_type (+)
and a.action_type = h.action_type (+)
and a.action_type = j.action_type (+)
and a.action_type = k.action_type (+)
and a.action_type = m.action_type (+)
and a.action_type = n.action_type (+)
and a.action_type = p.action_type (+)
group by a.action_type, b.cnt, c.cnt, d.cnt, e.cnt, f.cnt, g.cnt, h.cnt, j.cnt,
k.cnt, m.cnt, n.cnt, p.cnt

Tom Kyte

Followup  

November 03, 2005 - 5:03 am UTC

yes there is.


select action_type,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-11),'mm') then action_type end ) dec,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-10),'mm') then action_type end ) jan,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-9 ),'mm') then action_type end ) feb
/* .... */
from workflow_history
where work_end_dtm >= add_months(sysdate,-11)
group by action_type;



When dealing with dates, try not to use TO_CHAR except to format output (an exception would be when looking for 'Saturdays', for example - but in general, use TRUNC, not TO_CHAR).
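A quick demonstration of why: a month-only TO_CHAR comparison cannot tell two different years apart, while TRUNC keeps the full date.

```sql
SELECT CASE WHEN TO_CHAR(DATE '2004-11-15', 'mon') =
            TO_CHAR(DATE '2005-11-15', 'mon')
            THEN 'same' ELSE 'different' END AS by_to_char,
       CASE WHEN TRUNC(DATE '2004-11-15', 'mm') =
            TRUNC(DATE '2005-11-15', 'mm')
            THEN 'same' ELSE 'different' END AS by_trunc
  FROM dual;
-- by_to_char = 'same'       (November 2004 collides with November 2005)
-- by_trunc   = 'different'  (TRUNC(d,'mm') keeps the year)
```

So a table holding more than twelve months of history would silently double-count with the TO_CHAR formulation.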



Re: Month-wise

November 02, 2005 - 9:49 am UTC

Reviewer: Jagjeet Singh from Delhi, India

SQL> select * from a;

ACTION_TYP WORK_STAR WORK_END_
---------- --------- ---------
ADD        10-JAN-05 11-JAN-05
ADD        09-JAN-05 10-JAN-05
UPDATE     10-JAN-05 10-JAN-05
DELETE     10-JAN-05 10-JAN-05

SQL> select
  2  action_type,
  3  sum(decode(mm,1,1,0)) a1,
  4  sum(decode(mm,2,1,0)) a2,
  5  sum(decode(mm,3,1,0)) a3,
  6  sum(decode(mm,5,1,0)) a4,
  7  sum(decode(mm,6,1,0)) a5,
  8  sum(decode(mm,7,1,0)) a6,
  9  sum(decode(mm,8,1,0)) a7,
 10  sum(decode(mm,9,1,0)) a8,
 11  sum(decode(mm,10,1,0)) a9,
 12  sum(decode(mm,11,1,0)) a10,
 13  sum(decode(mm,12,1,0)) a11,
 14  sum(decode(mm,12,1,0)) a12
 15  from
 16  (select action_type,to_number(to_char(trunc(work_end_dtm),'mm')) mm from a)
 17  group by action_type
SQL> /

ACTION_TYP  A1  A2  A3  A4  A5  A6  A7  A8  A9 A10 A11 A12
---------- --- --- --- --- --- --- --- --- --- --- --- ---
ADD          2   0   0   0   0   0   0   0   0   0   0   0
DELETE       1   0   0   0   0   0   0   0   0   0   0   0
UPDATE       1   0   0   0   0   0   0   0   0   0   0   0


A1 = Jan
A2 = Feb 
 ..
 

Re: Month-wise - TYPO

November 02, 2005 - 9:59 am UTC

Reviewer: Jagjeet Singh from Delhi, India

Typo: instead of using 4, I repeated 12 two times in the month list.

RE: Calculate month-wise

November 02, 2005 - 10:06 am UTC

Reviewer: Duke Ganote from THE EASTERN FRONTIER of Cincinnati, Ohio USA

Kumar-- I'm curious to see what analytic-function oriented query Tom might propose. I'm still wrestling with the analytic functions, so I've only formulated a non-analytic alternative. For comparison, here's what I came up with.

First the set up. I assume there are no records with work_end_dtm older than 365 days. Then I generated some sample data:

create table workflow_history
as select case mod(level,3)
when 0 then 'Delete'
when 1 then 'Add'
when 2 then 'Update'
end AS ACTION_TYPE,
sysdate-mod(level,365)-5 AS work_start_dtm,
sysdate-mod(level,365) AS work_end_dtm
from dual connect by level < 50000
/
analyze table workflow_history compute statistics
/

Next I ran the following non-analytic-function query:

select action_type
, sum(back11) December
, sum(back10) January
, sum(back9) February
, sum(back8) March
, sum(back7) April
, sum(back6) May
, sum(back5) June
, sum(back4) July
, sum(back3) August
, sum(back2) September
, sum(back1) October
, sum(current_month) November
from (
select action_type,
case when to_char(sysdate, 'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as CURRENT_month,
case when to_char(add_months(sysdate,-1),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK1,
case when to_char(add_months(sysdate,-2),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK2,
case when to_char(add_months(sysdate,-3),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK3,
case when to_char(add_months(sysdate,-4),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK4,
case when to_char(add_months(sysdate,-5),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK5,
case when to_char(add_months(sysdate,-6),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK6,
case when to_char(add_months(sysdate,-7),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK7,
case when to_char(add_months(sysdate,-8),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK8,
case when to_char(add_months(sysdate,-9),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK9,
case when to_char(add_months(sysdate,-10),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK10,
case when to_char(add_months(sysdate,-11),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK11,
case when to_char(add_months(sysdate,-12),'mon')
= to_char(trunc(work_end_dtm), 'mon')
then 1 end as BACK12
from workflow_history
where work_end_dtm IS NOT NULL
)
group by action_type
/

The statistics are:

Statistics
---------------------------------------------------
0 recursive calls
0 db block gets
192 consistent gets
0 physical reads
0 redo size
1224 bytes sent via SQL*Net to client
784 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3 rows processed

That's 192 LIOs; I got 2496 when I ran the query you provided previously:

Statistics
---------------------------------------------------
1 recursive calls
0 db block gets
2496 consistent gets
0 physical reads
0 redo size
1224 bytes sent via SQL*Net to client
2254 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
13 sorts (memory)
0 sorts (disk)
3 rows processed

We're looking forward to Tom's return from UKOUG...

Re: Month-wise

November 02, 2005 - 11:08 pm UTC

Reviewer: Kumar from India

Duke Ganote:

Thanks for your reply. This looks much better than what I wrote.

Curious to see tom's reply also.

Regards
Kumar

Re:

November 02, 2005 - 11:23 pm UTC

Reviewer: Reader from IND

Hi Kumar,

Have you tried Jagjeet's query?
Is there anything wrong with it?

Re: Month-wise

November 03, 2005 - 2:56 am UTC

Reviewer: Kumar from India

Hi,

Jagjeet's query is also correct and much faster than mine.

Kumar

code compare

November 03, 2005 - 5:46 am UTC

Reviewer: Duke Ganote from THE EASTERN FRONTIER (of Cincinnati, Ohio USA)

Jagjeet used absolute dates, specifically
to_number(to_char(trunc(work_end_dtm),'mm'))
Tom and I used relative dating like Kumar did in his example. Tom's formulation:
trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-11),'mm')

Tom's example (you have to look back right after Kumar's query -- Tom can jump back and answer prior entries) didn't use subqueries, which is cleaner code:

select action_type,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-11),'mm') then action_type end )
dec,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-10),'mm') then action_type end )
jan,
count( case when trunc(work_end_dtm,'mm') =
trunc(add_months(sysdate,-9 ),'mm') then action_type end )
feb
/* .... */
from workflow_history
where work_end_dtm >= add_months(sysdate,-11)
group by action_type;


LIO compare & "the great disappearing ROWNUM"

November 03, 2005 - 9:38 am UTC

Reviewer: Duke Ganote from THE EASTERN FRONTIER of Cincinnati civilization, Ohio USA

The LIOs for Tom's query were comparable to the query I formulated, but his code is cleaner.

But we've strayed from analytic functions... I noticed this behavior, which I call "the great disappearing ROWNUM", when ordering by a constant:

select 'by rownum desc' ordering, rownum, level l
, row_number() over (order by rownum desc) "row_nbr"
from dual connect by level < 5
UNION
select 'by null or constant', rownum, level l
, row_number() over (order by null desc)
from dual connect by level < 5
/
ORDERING                ROWNUM          L    row_nbr
------------------- ---------- ---------- ----------
by null or constant          1          1
by null or constant          2          2
by null or constant          3          3
by null or constant          4          4
by rownum desc               1          1          4
by rownum desc               2          2          3
by rownum desc               3          3          2
by rownum desc               4          4          1

9i and 10g show the same results, and I'm trying to understand the usefulness and intentionality of the behavior.

Tom Kyte

Followup  

November 04, 2005 - 2:32 am UTC

that looks like a bug, I'll file one.

Re: Month Wise

November 03, 2005 - 10:36 pm UTC

Reviewer: Jagjeet Singh from Delhi, India

> Jagjeet used absolute dates, specifically
> to_number(to_char(trunc(work_end_dtm),'mm'))
> Tom and I used relative dating like Kumar did in his
> example.

Will it have any effect on output ?

Tom Kyte

Followup  

November 04, 2005 - 3:36 am UTC

to use relative versus absolute dates? Sure, wait a couple of weeks and see what happens...



"the great disappearing ROWNUM"

November 04, 2005 - 2:42 am UTC

Reviewer: Matthias Rogel from Kaiserslautern, Germany

Re: Month Wise (relative vs absolute dates); ROWNUM

November 04, 2005 - 6:32 am UTC

Reviewer: Duke Ganote from The EASTERN FRINGE of greater Cincinnati, Ohio USA

There's a bit of requirements confusion because Kumar's original example asked for results like this:

ActionType currentmonthcount month-1_count month-2_count
Add 2 2 1
update 1 1 1
Delete 1 0 0

but in his query, the column labels are absolute ("December", etc) though the calculations are for relative dates.
******************
Matthias Rogel indeed found the same issue I did for ROWNUM vs ROW_NUMBER. However, I can't imagine a real-world query where I'd use both ROWNUM and a ROW_NUMBER ordered by a constant in the same query: ROWNUM would be simpler.

Tom Kyte

Followup  

November 04, 2005 - 8:50 am UTC

if you can use rownum, you should (top-n query, more efficient).

use row_number only when you need to partition the data and do a top-n by some set of columns
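To illustrate the distinction with the EMP table used earlier in this thread (a sketch):

```sql
-- Top-3 salaries overall: ROWNUM is enough, and the optimizer can
-- perform an efficient top-n sort on the inline view.
SELECT *
  FROM (SELECT * FROM emp ORDER BY sal DESC)
 WHERE ROWNUM <= 3;

-- Top-3 salaries PER DEPARTMENT: ROWNUM cannot do this,
-- ROW_NUMBER with a PARTITION BY can.
SELECT *
  FROM (SELECT e.*,
               ROW_NUMBER() OVER (PARTITION BY deptno
                                  ORDER BY sal DESC) rn
          FROM emp e)
 WHERE rn <= 3;
```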

Re: Month-wise

November 04, 2005 - 8:35 am UTC

Reviewer: Jagjeet Singh from Delhi, India

> to use relative versus absolute dates? Sure, wait a couple of weeks and see
> what happens...

I was talking about the output of Kumar's query.



Tom Kyte

Followup  

November 04, 2005 - 8:55 am UTC

today - nothing.

but if you use absolute dates versus relative dates - problems come in the future.
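A small illustration of the difference, using the WORKFLOW_HISTORY table from above (the column aliases are mine):

```sql
SELECT action_type,
       -- absolute: always counts January, no matter when the query runs
       SUM(DECODE(TO_NUMBER(TO_CHAR(work_end_dtm, 'mm')), 1, 1, 0))
         AS jan_fixed,
       -- relative: counts whatever the current month is at run time
       COUNT(CASE WHEN TRUNC(work_end_dtm, 'mm') = TRUNC(SYSDATE, 'mm')
                  THEN 1 END)
         AS current_month
  FROM workflow_history
 GROUP BY action_type;
```

Today the two columns may agree; run the same report in a few months and only the relative form still answers "this month".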

Calculate month-wise

November 07, 2005 - 1:12 am UTC

Reviewer: Kumar from India

Hello Tom,

Many thanks for your reply and suggestion.

Using Analytic Functions within Pro*C code

November 09, 2005 - 1:20 pm UTC

Reviewer: Prat from UK

Hi Tom,

Please could you let us know if we can use analytic functions within Pro*C code. I tried using the DENSE_RANK function in my Pro*C code and compiled it with my Pro*C precompiler (9.2.0.4), but it appears that it does not recognise the DENSE_RANK syntax.

Thanks,
Prat

Tom Kyte

Followup  

November 11, 2005 - 10:24 am UTC

you would need to use dynamic sql for that unfortunately.

November 14, 2005 - 3:07 pm UTC

Reviewer: A reader from MD, USA

Dear Tom,

My attempts to get the following did not make much head way. Can I seek your help, please?
I want to get the time spent on scattered read for a given active session in the past 60 minutes in buckets of 5 minutes using v$active_session_history.

Thanks in advance.
Shivaswamy


Tom Kyte

Followup  

November 14, 2005 - 4:24 pm UTC

you would just need to "normalize" the sample time down to 5-minute intervals - no analytics or anything fancy.



sample_time,

to_char(sample_time,'dd-mon-yyyy hh24:')||
to_char(5*trunc(to_number(to_char(sample_time,'mi'))/5),'fm00')


just fix up the "5 minute" part.
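Putting the whole thing together might look like this (a sketch; :sid, the event name, and the assumption that each ASH sample represents roughly one second of wait time are mine - and querying v$active_session_history requires the Diagnostics Pack license):

```sql
-- Bucket the last hour of samples for one session into 5-minute
-- intervals and count samples per bucket.
SELECT bucket_start, COUNT(*) AS seconds_waited  -- ~1 sample per second
  FROM (SELECT TRUNC(CAST(sample_time AS DATE), 'hh24')
               + FLOOR(TO_NUMBER(TO_CHAR(sample_time, 'mi')) / 5) * 5
                 / (24 * 60) AS bucket_start
          FROM v$active_session_history
         WHERE session_id = :sid
           AND event = 'db file scattered read'
           AND sample_time >= SYSTIMESTAMP - INTERVAL '60' MINUTE)
 GROUP BY bucket_start
 ORDER BY bucket_start;
```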

Restriction on CLOB?

December 02, 2005 - 11:12 am UTC

Reviewer: Yuan from Newark, NJ USA

Is there a restriction on using a CLOB as an argument to LAG() and LEAD()? When I tried, I got:

ORA-00932: inconsistent datatypes: expected - got CLOB

I was unable to find any such restriction in the documentation.

Tom Kyte

Followup  

December 02, 2005 - 11:39 am UTC

analytics take "expressions", here is a potential workaround:

ops$tkyte@ORA10GR2> select id, x, (select x from t where id = last_id) last_x
  2    from (
  3  select id, x, lag(id) over (order by id) last_id
  4    from t
  5         )
  6  /

        ID X          LAST_X
---------- ---------- ----------
         1 hello
         2 world      hello
 

Restriction on CLOB? (continued)

December 02, 2005 - 1:14 pm UTC

Reviewer: Yuan from Newark, NJ USA

I don't have a PK field on the table in question. Can I use ROWID?

Tom Kyte

Followup  

December 03, 2005 - 9:55 am UTC

you don't have a primary key on a table?????? what?

yes, you can use rowid, but hey - you are missing something here.

No PK

December 05, 2005 - 7:36 am UTC

Reviewer: Yuan from Newark, NJ USA

It's a history/audit table of another table. Every iteration of every record on the other table is on this table. This table will only have records added and never updated. Do you still think I need a PK? If so, why?

Tom Kyte

Followup  

December 06, 2005 - 4:18 am UTC

rowid will be fine.
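Applying the workaround with ROWID in place of a primary key might look like this (a sketch; t, its column id, and its CLOB column x are the hypothetical table from the earlier example):

```sql
-- LAG() cannot return a CLOB, but it can return the ROWID of the
-- previous row; a scalar subquery then fetches that row's CLOB.
SELECT id, x,
       (SELECT x FROM t WHERE t.rowid = v.last_rid) AS last_x
  FROM (SELECT id, x,
               LAG(rowid) OVER (ORDER BY id) AS last_rid
          FROM t) v;
```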



What is wrong with this ?

December 12, 2005 - 6:39 pm UTC

Reviewer: A reader

I am using
db: oracle 8.1.7.4
os: sun sol v8, 64 bit

dev-->
dev--> CREATE OR REPLACE PROCEDURE select_rec (
2 my_cur OUT credit.cursortype,
3 num_days IN NUMBER
4 )
5 AS
6 BEGIN
7 OPEN my_cur FOR
8 SELECT *
9 FROM (SELECT oh.*, DENSE_RANK() OVER(PARTITION BY ID_GROUP) rn
10 FROM ora_history oh
11 WHERE status = 2)
12 WHERE rn = 1;
13
14 END select_rec;
15 /

Warning: Procedure created with compilation errors.

Elapsed: 00:00:00.00
dev--> show err
Errors for PROCEDURE SELECT_REC:

LINE/COL ERROR
-------- -----------------------------------------------------------------
9/48 PLS-00103: Encountered the symbol "(" when expecting one of the
following:
, from


Do you know why this error occurred?

Tom Kyte

Followup  

December 12, 2005 - 7:28 pm UTC

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:3027089372477 <code>

Missing a table alias???

December 13, 2005 - 9:07 am UTC

Reviewer: Mike from Minneapolis

Is it possible that the inner table expression is missing the required "alias" after the parentheses? My formulation of the query seems to work just fine.

SELECT MISSING_TABLE_ALIAS.* FROM
(SELECT oh.*, DENSE_RANK() OVER(PARTITION BY ID_GROUP) rn
FROM ora_history oh
WHERE oh.status = 2) MISSING_TABLE_ALIAS
WHERE MISSING_TABLE_ALIAS.rn = 1;



Tom Kyte

Followup  

December 13, 2005 - 9:50 am UTC

no, read the above link - 8i PLSQL did not understand analytics or <list of stuff>, the workaround is documented in full there.

Thanks Tom.

December 13, 2005 - 10:41 am UTC

Reviewer: A reader

I have another question. Please let me know if this is possible in a single SQL statement in Oracle 8.1.7.4.

I have following query..

SELECT *
FROM (SELECT oh.*, DENSE_RANK() OVER(PARTITION BY ID , oh_side order by oh_date desc) rn
FROM ora_history oh
WHERE status = 2)
WHERE rn = 1;

now, the dense_rank is created based on partitioning by
id and oh_side. oh_side can assume only 2 values, 1 or 0.
my problem is: when oh_side is 0 I want to sort "order by oh_date desc",
and if oh_side = 1 I want to sort "order by oh_date asc".

is that possible somehow? maybe using decode inside the OVER clause?
-- over( PARTITION BY ID , oh_side order by oh_date decode(oh_side,1,desc,2,asc))

possible ? good idea ?

Tom Kyte

Followup  

December 13, 2005 - 1:07 pm UTC

partition by id, oh_side
order by decode( oh_side, 0, oh_date ) DESC,
         decode( oh_side, 1, oh_date ) ASC
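Assembled into the full query from the question (a sketch based on the described columns; per the stated requirement, oh_side = 0 sorts descending and oh_side = 1 ascending):

```sql
SELECT *
  FROM (SELECT oh.*,
               DENSE_RANK() OVER (
                 PARTITION BY id, oh_side
                 ORDER BY DECODE(oh_side, 0, oh_date) DESC,
                          DECODE(oh_side, 1, oh_date) ASC) rn
          FROM ora_history oh
         WHERE status = 2) ranked
 WHERE rn = 1;
```

Within an oh_side = 0 partition, the first DECODE carries the date (sorted descending) while the second is NULL for every row, so it has no effect; the roles swap when oh_side = 1.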

Thanks tom.

December 13, 2005 - 2:32 pm UTC

Reviewer: A reader


UKOUG-2005

December 17, 2005 - 6:11 am UTC

Reviewer: Reader from US

I attended your session at UKOUG-2005 --SQL Techniques
I am now going thru the slides .
You showed an excellent use of fmt ....
I am unable to find that SQL .
Can you help ?

Thanks

Ashok


Tom Kyte

Followup  

December 17, 2005 - 11:17 am UTC

fmt?

or do you mean "fm" the format modifier in to_char?

select owner,
to_number(substr(data,1,10)) cnt,
to_date(substr(data,11,14),'yyyymmddhh24miss') min_created,
to_date(substr(data,25),'yyyymmddhh24miss') max_created
from (
select owner,
(select to_char( count(*), 'fm0000000000') ||
to_char( min(created),'yyyymmddhh24miss') ||
to_char( max(created),'yyyymmddhh24miss')
from big_table.big_table b
where b.owner = a.owner and b.object_type = 'TABLE' ) data
from (
select a.owner
from big_table.big_table_owners a
order by a.owner
) a
where rownum <= 2
)
/


I used fm there to remove the leading whitespace on a fixed-width, positive integer.
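For example, TO_CHAR reserves a leading position for the sign of a number; fm suppresses it, which keeps the concatenated string truly fixed-width:

```sql
SELECT '[' || TO_CHAR(42, '0000000000')   || ']' AS with_sign_space,
       '[' || TO_CHAR(42, 'fm0000000000') || ']' AS without_it
  FROM dual;
-- with_sign_space = '[ 0000000042]'
-- without_it      = '[0000000042]'
```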

First time where .....

December 20, 2005 - 3:38 pm UTC

Reviewer: Yuan from Newark, NJ USA

Suppose I have a Sale_History table as such:

CREATE TABLE Sale_History (Dt DATE,
Sale_Id INTEGER,
Status VARCHAR2(1),
Other VARCHAR2(1))
/
INSERT INTO Sale_History VALUES ('1-DEC-05', 1, 'A', '1')
/
INSERT INTO Sale_History VALUES ('2-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('3-DEC-05', 1, 'B', '2')
/
INSERT INTO Sale_History VALUES ('4-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('5-DEC-05', 1, 'A', '1')
/
COMMIT
/

How do I find the earliest time Sale_Id 1 became its current status this time around? In this example, I'd want to see 12/4, not 12/1.

Tom Kyte

Followup  

December 20, 2005 - 4:42 pm UTC

added another sale_id to make it "more interesting"

ops$tkyte@ORA10GR2> select * from sale_history order by sale_id, dt;

DT           SALE_ID S O
--------- ---------- - -
01-DEC-05          1 A 1
02-DEC-05          1 A 2
03-DEC-05          1 B 2
04-DEC-05          1 A 2
05-DEC-05          1 A 1
02-DEC-05          2 A 1
03-DEC-05          2 A 2
04-DEC-05          2 B 2
05-DEC-05          2 A 2
06-DEC-05          2 A 1

10 rows selected.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select sale_id, max(min_dt)
  2    from (
  3  select sale_id, min(dt) over (partition by sale_id, maxgrp) min_dt
  4    from (
  5  select sale_id, dt, status,
  6         max(grp) over (partition by sale_id order by dt) maxgrp
  7    from (
  8  select dt, sale_id, status,
  9         case when lag(status) over (partition by sale_id order by dt) <> status
 10              then row_number() over (partition by sale_id order by dt)
 11          end grp
 12    from sale_history
 13         )
 14   where status = 'A'
 15         )
 16         )
 17   group by sale_id
 18  /

   SALE_ID MAX(MIN_D
---------- ---------
         1 04-DEC-05
         2 05-DEC-05


Lines 8..12 - start to set up for grouping by marking the beginning of each group.

Lines 5..14 - carry the group marker "down" and keep only the 'A' rows; contiguous 'A' records now share the same maxgrp within each sale_id...

Lines 3..15 - for each group, compute the min_dt.

Lines 1..17 - by sale_id, find the maximum of those dates - which is the minimum date of the last group...
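The same "last contiguous run of 'A' rows per sale_id" idea can be sketched procedurally. This Python illustration (data and names are mine) collapses each sale's history into runs of equal status, keeps the 'A' runs, and takes the start date of the last one:

```python
from itertools import groupby
from operator import itemgetter

rows = [  # (dt, sale_id, status), already sorted by sale_id, dt
    ("2005-12-01", 1, "A"), ("2005-12-02", 1, "A"), ("2005-12-03", 1, "B"),
    ("2005-12-04", 1, "A"), ("2005-12-05", 1, "A"),
    ("2005-12-02", 2, "A"), ("2005-12-03", 2, "A"), ("2005-12-04", 2, "B"),
    ("2005-12-05", 2, "A"), ("2005-12-06", 2, "A"),
]

result = {}
for sale_id, sale_rows in groupby(rows, key=itemgetter(1)):
    a_run_starts = [next(run)[0]            # first dt of each contiguous run
                    for status, run in groupby(sale_rows, key=itemgetter(2))
                    if status == "A"]
    result[sale_id] = a_run_starts[-1]      # start of the last 'A' run

print(result)   # {1: '2005-12-04', 2: '2005-12-05'}
```

The output matches the SQL above: 04-DEC for sale 1, 05-DEC for sale 2.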

 

First time where ..... (continued)

December 20, 2005 - 4:00 pm UTC

Reviewer: Yuan from Newark, NJ USA

I think I've found an answer to my question, but please feel free to post a better one:

SELECT A.Status, D.*
  FROM (SELECT Sale_Id, MAX(Dt) Dt
          FROM (SELECT Sale_Id, Dt, Status,
                       LEAD(Status) OVER (PARTITION BY Sale_Id ORDER BY Dt DESC) Next_Status
                  FROM Sale_History)
         WHERE Status != Next_Status
         GROUP BY Sale_Id) D,
       Sale_History A
 WHERE A.Sale_Id = D.Sale_Id
   AND A.Dt = D.Dt

My solution assumes the table is unique by Sale_Id, Dt, which I believe to be an acceptable assumption in my real table, which actually has Timestamp instead of date.

Analytic Functions

December 20, 2005 - 4:52 pm UTC

Reviewer: Reader123 from New Jersey

This site is very helpful ,as always.

First time where ..... (continued)

December 21, 2005 - 10:33 am UTC

Reviewer: Yuan from Newark, NJ USA

You gave me a breakthrough on how to get around not being able to use analytic functions in the where clause, but I'm very confused about why so many steps were necessary. Wouldn't you be able to do the same thing with this?

select sale_id, max(dt)
  from (
select dt, sale_id, status,
       case when lag(status) over (partition by sale_id order by dt) <> status
            then row_number() over (partition by sale_id order by dt)
        end grp
  from sale_history
       )
 WHERE grp IS NOT null
 group by sale_id

Another thing, your solution assumes the final status is 'A', which is not a valid assumption. I've added a couple of rows to your Sale_Id 2 to illustrate this point.

CREATE TABLE Sale_History (Dt DATE,
Sale_Id INTEGER,
Status VARCHAR2(1),
Other VARCHAR2(1))
/
INSERT INTO Sale_History VALUES ('1-DEC-05', 1, 'A', '1')
/
INSERT INTO Sale_History VALUES ('2-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('3-DEC-05', 1, 'B', '2')
/
INSERT INTO Sale_History VALUES ('4-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('5-DEC-05', 1, 'A', '1')
/
INSERT INTO Sale_History VALUES ('02-DEC-05', 2, 'A', '1')
/
INSERT INTO Sale_History VALUES ('03-DEC-05', 2, 'A', '2')
/
INSERT INTO Sale_History VALUES ('04-DEC-05', 2, 'B', '2')
/
INSERT INTO Sale_History VALUES ('05-DEC-05', 2, 'A', '2')
/
INSERT INTO Sale_History VALUES ('06-DEC-05', 2, 'A', '1')
/
INSERT INTO Sale_History VALUES ('07-DEC-05', 2, 'D', '1')
/
INSERT INTO Sale_History VALUES ('08-DEC-05', 2, 'D', '0')
/
COMMIT
/

If you simply remove your line 14 from your query, that seems to work just fine, but wouldn't mine be even simpler? Am I missing something?

One other requirement I forgot to mention. I need to know the current status for each sale. For 1, it'd be 'A' and for 2 'D'. My original solution does that, but I'll try to work with what you've taught me to see if I can use that too.

Tom Kyte

Followup  

December 21, 2005 - 7:43 pm UTC

good points all - and a good show of why boundary conditions are important...

I can break your example too 


   WHERE grp IS NOT null

(think of something with ALL A's) - but yes, you are right, we can optimize this...

if you added this:

INSERT INTO Sale_History VALUES ('06-DEC-05', 3, 'A', '1')
/


you would not find this record BUT, modify just a bit:

ops$tkyte@ORA10GR2> select sale_id, max(dt)
  2      from (
  3    select dt, sale_id, status,
  4           case when nvl(lag(status) over (partition by sale_id order by dt),'xx') <> status
  5                then row_number() over (partition by sale_id order by dt)
  6            end grp
  7      from sale_history
  8           )
  9      WHERE grp IS NOT null
 10     group by sale_id
 11  /

   SALE_ID MAX(DT)
---------- ---------
         1 04-DEC-05
         2 07-DEC-05
         3 06-DEC-05



and we do - I was thinking "too procedurally" :) 
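The optimized single-pass logic can be sketched procedurally as well. In this Python illustration (data is mine), a row starts a new group when its status differs from the previous row's, with a missing previous row counting as "different" - mirroring the NVL(..., 'xx') above - and the answer per sale_id is simply the latest group-start date:

```python
rows = [  # (dt, sale_id, status), sorted by sale_id, dt
    ("2005-12-01", 1, "A"), ("2005-12-02", 1, "A"), ("2005-12-03", 1, "B"),
    ("2005-12-04", 1, "A"), ("2005-12-05", 1, "A"),
    ("2005-12-06", 3, "A"),     # a sale whose status never changed
]

latest_change = {}
prev = {}                                # last status seen per sale_id
for dt, sale_id, status in rows:
    if prev.get(sale_id) != status:      # NULL-safe lag(status) <> status
        latest_change[sale_id] = dt      # this row starts a new group
    prev[sale_id] = status

print(latest_change)   # {1: '2005-12-04', 3: '2005-12-06'}
```

Note that sale 3, with a single unchanged status, is found too - the boundary case the NVL was added for.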

First time where ..... (continued)

December 21, 2005 - 10:45 am UTC

Reviewer: Yuan from Newark, NJ USA

This seems to work:
select *
  from sale_history
 where rowid in (select first_value(rid) OVER (PARTITION BY sale_id ORDER BY dt desc) rid
                   from (
                 select dt, sale_id, status,
                        case when lag(status) over (partition by sale_id order by dt) <> status
                             then row_number() over (partition by sale_id order by dt)
                         end grp, ROWID rid
                   from sale_history
                        )
                  WHERE grp IS NOT NULL)

Would I be better off using EXISTS rather than IN on a big table?

Tom Kyte

Followup  

December 21, 2005 - 7:43 pm UTC

see above, but I'd rather do a single pass, with analytics.

First time where ..... (continued)

December 21, 2005 - 11:37 am UTC

Reviewer: Yuan from Newark, NJ USA

Simpler yet:

select *
  from sale_history
 where rowid in (select first_value(rid) OVER (PARTITION BY sale_id ORDER BY dt desc) rid
                   from (
                 select dt, sale_id, status,
                        case when lag(status) over (partition by sale_id order by dt) <> status
                             then 1
                         end grp, ROWID rid
                   from sale_history
                        )
                  WHERE grp = 1)

First time where ..... (continued)

December 22, 2005 - 10:13 am UTC

Reviewer: Yuan from Newark, NJ USA

You're absolutely right! I totally forgot about those that never changed statuses. I thank you and my users would too... if they ever learned of my oversight. 8^D

How about this?

select *
  from sale_history
 where rowid in (select first_value(rid) OVER (PARTITION BY sale_id ORDER BY dt desc) rid
                   from (
                 select dt, sale_id, status,
                        case when lag(status) over (partition by sale_id order by dt) <> status
                             then 1
                             when lag(status) over (partition by sale_id order by dt) is null
                             then 1
                         end grp, ROWID rid
                   from sale_history
                        )
                  WHERE grp = 1)

I prefer the second when statement to handle this scenario so I never have to worry about a possible new 'xx' status getting added to the list of possible statuses.

Also, I think you missed this part:
<quote>One other requirement I forgot to mention. I need to know the current status for each sale. For 1, it'd be 'A' and for 2 'D'. My original solution does that, but I'll try to work with what you've taught me to see if I can use that
too. </quote>

Your final solution did not provide the status for each sale. That's why my final solution does another pass.

Tom Kyte

Followup  

December 22, 2005 - 11:05 am UTC

status was a varchar2(1) :)

but yes, this would work just fine - or

decode( lag(status) over (partit.....), status, null, 1 )


(decode is nice for a "compare A to B and include NULLs as being a 'value'")
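The DECODE trick can be sketched in Python, with None standing in for NULL (this is an illustration, not Oracle code): DECODE treats two NULLs as equal, so decode(lag_status, status, NULL, 1) is a NULL-safe "did the status change?" flag computed in a single expression.

```python
def changed(lag_status, status):
    # DECODE(a, b, NULL, 1): returns NULL when a = b, NULL matching NULL
    return None if lag_status == status else 1

assert changed("A", "A") is None    # same status: not a group start
assert changed("A", "B") == 1       # status changed
assert changed(None, "A") == 1      # first row of a partition: a new group
```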



First time where ..... (continued)

December 22, 2005 - 2:56 pm UTC

Reviewer: Yuan from Newark, NJ USA

Right again. I'll use DECODE. It also saves having to evaluate the LAG twice, which my way was doing.

How about getting the status? Do you see a better way than what I'm doing?

Tom Kyte

Followup  

December 22, 2005 - 5:22 pm UTC

but don't we know the status ? it is "a"

First time where ..... (continued)

December 22, 2005 - 8:25 pm UTC

Reviewer: Yuan from Newark, NJ USA

No, A for Sales 1 and 3 in our examples, but D for Sale 2. The status for each Sale would be whatever it was changed to last.

Tom Kyte

Followup  

December 23, 2005 - 12:16 pm UTC

Ahh, then my first query totally misinterpreted your question - I thought "A" was "Active" and that is what you were looking for.

So, it is the minimum timestamp for the last "group" in SALE_ID (grouping by status)


ops$tkyte@ORA10GR2> select sale_id,
  2         to_date( substr(data,6,14), 'yyyymmddhh24miss') dt,
  3             substr(data,20) status
  4    from (
  5  select sale_id, min( to_char(rn,'fm00000') || to_char(dt,'yyyymmddhh24miss') || status ) data
  6    from (
  7  select sale_id, dt, status,
  8         decode(lead(status) over (partition by sale_id order by dt DESC),
  9                status, to_number(null),
 10                row_number() over (partition by sale_id order by dt DESC)) rn
 11    from sale_history
 12         )
 13   where rn is not null
 14   group by sale_id
 15         )
 16  /

   SALE_ID DT        ST
---------- --------- --
         1 04-DEC-05 A
         2 07-DEC-05 D
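The "carry extra columns through an aggregate" trick in that query can be sketched procedurally. This Python illustration (data is mine, taken from the example's answer rows) builds one sortable string per row - zero-padded row number, then the date, then the status - takes the MIN per sale_id, and slices the pieces back out:

```python
rows = {  # sale_id -> (rn, dt, status) for the rows where rn is not null
    1: [(1, "20051204000000", "A"), (3, "20051203000000", "B")],
    2: [(1, "20051207000000", "D"), (3, "20051205000000", "A")],
}

current = {}
for sale_id, candidates in rows.items():
    # fm00000-style zero padding keeps the string sort order numeric
    data = min(f"{rn:05d}{dt}{status}" for rn, dt, status in candidates)
    current[sale_id] = (data[5:19], data[19:])   # substr(6,14), substr(20)

print(current)   # {1: ('20051204000000', 'A'), 2: ('20051207000000', 'D')}
```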

 

First time where ..... (continued)

December 23, 2005 - 2:28 pm UTC

Reviewer: Yuan from Newark, NJ USA

Very interesting way of doing it. Thanks. I think I've improved upon mine, though. No 2nd pass.

select distinct sale_id,
       first_value(dt) OVER (PARTITION BY sale_id ORDER BY dt desc) dt,
       FIRST_VALUE(status) OVER (PARTITION BY sale_id ORDER BY dt desc) status
  from (select dt, sale_id, status,
               decode(lag(status) over (partition by sale_id order by dt), status, NULL, 1) grp
          from sale_history)
 WHERE grp = 1

Tom Kyte

Followup  

December 23, 2005 - 2:51 pm UTC

we do about the same - i grouped by - you distinct-ed. (but you have two buffer sorts, I'll have one)

First time where ..... (continued)

December 24, 2005 - 8:04 am UTC

Reviewer: Yuan from Newark, NJ USA

Yes, one point for you there, but mine wouldn't need modification if date turned into timestamp or timestamp precision changed. One point for me?

Tom Kyte

Followup  

December 24, 2005 - 9:29 am UTC

I didn't know we were racing?

If the precision changed, that would be a datatype change, all queries become suspect and must be looked at - but yes, yours might not have to change, mine would.

First time where ..... (continued)

December 27, 2005 - 6:59 am UTC

Reviewer: Yuan from Newark, NJ USA

I'm always racing. Just ask my wife.

Thanks again for everything!

DISTINCT and analytics

December 27, 2005 - 2:02 pm UTC

Reviewer: VA from New Jersey, USA

I have

select col1,col2,col3
from ...
where ...

This gives 20 rows

select distinct col1,col2,col3
from ...
where ...

gives 10 rows but

select distinct col1,col2,col3,count(*) over () cnt
from ...
where ...

Here 'cnt' gives 20 even though only 10 rows are returned.

How can I get the 'cnt' column to give me the real count after the distinct?

Thanks




Tom Kyte

Followup  

December 27, 2005 - 2:25 pm UTC

select a.*, count(*) over () cnt
  from (select distinct c1, c2, c3 from t) a
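The point of the fix is ordering: run the DISTINCT in an inline view first, then count what is left. A quick sqlite3 sketch of the same shape (this assumes a SQLite build with window-function support, 3.25+; table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (c1, c2, c3);
    INSERT INTO t VALUES (1,1,1),(1,1,1),(2,2,2),(2,2,2);
""")

# DISTINCT runs in the inline view first; COUNT(*) OVER () then counts
# the de-duplicated rows, not the raw ones.
rows = conn.execute("""
    SELECT a.*, COUNT(*) OVER () AS cnt
      FROM (SELECT DISTINCT c1, c2, c3 FROM t) a
""").fetchall()
print(sorted(rows))   # [(1, 1, 1, 2), (2, 2, 2, 2)]
```

Each surviving row carries cnt = 2, the count after the DISTINCT, not the raw row count of 4.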




January 12, 2006 - 10:44 am UTC

Reviewer: A reader

hi tom,

could you please give me a hand with the following case:

drop table MyTab ;

create table MyTab(
MyGroup varchar2( 10 ),
MyVersion number,
MyMember varchar2( 10 ),
constraint XPKMyTab primary key( MyGroup, MyVersion, MyMember )
) ;

insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 1, 'Member 1' ) ;
insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 2, 'Member 1' ) ;
insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 2, 'Member 2' ) ;
insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 3, 'Member 1' ) ;
insert into MyTab( MyGroup, MyVersion, MyMember )
values( 'Group A', 3, 'Member 3' ) ;
commit ;

the challenge (from my current point of view) is to write a SQL statement that doesn't only generate the output of:

select MyGroup, MyVersion, count(*) Cnt
from MyTab
group by MyGroup, MyVersion
order by MyGroup, MyVersion ;

... but also includes additional information about whether CONSECUTIVE 'versions' contain the SAME 'members' or not. For example: 'versions' 2 and 3 of 'group' A contain the same number of 'members' but do not include the same 'members'; therefore 'version' 3 of 'group' A should be marked as "members did change" ...

is this something that can be achieved with analytics such as "lag()" and the like at all (at least pure SQL)?

Tom Kyte

Followup  

January 12, 2006 - 11:12 am UTC

how big is this table in real life.

January 12, 2006 - 4:14 pm UTC

Reviewer: A reader

best guess: at most 500,000 rows (the set is generated by a query)

Tom Kyte

Followup  

January 13, 2006 - 10:54 am UTC

You can use a technique like this

a) create a set that has your columns PLUS a count of rows by your group key (data in my example)

b) outer join this result with itself by group/version/member and group by your group key

c) compare the counts to decide if the next version has the same members

I expanded your test case to include some examples where the next version was actually the "same":

ops$tkyte@ORA9IR2> select * from myTab;

MYGROUP     MYVERSION MYMEMBER
---------- ---------- ----------
Group A             1 Member 1
Group A             2 Member 1
Group A             2 Member 2
Group A             3 Member 1
Group A             3 Member 3
Group A             4 Member 1
Group A             4 Member 3
Group b             1 Member 1
Group b             2 Member 1
Group b             2 Member 2
Group b             3 Member 1
Group b             3 Member 3

12 rows selected.

ops$tkyte@ORA9IR2> with data
  2  as
  3  (
  4  select mygroup, myversion, mymember, count(*) over (partition by mygroup, myversion) cnt
  5    from mytab
  6  )
  7  select a.mygroup, a.myversion,
  8         case when a.cnt = max(b.cnt) and a.cnt = count(b.mymember)
  9                  then 'Next Group Same'
 10                          else 'Next Group Different'
 11                  end what
 12    from data a, data b
 13   where a.mygroup = b.mygroup(+)
 14     and a.myversion = (b.myversion(+)-1)
 15     and a.mymember = b.mymember(+)
 16   group by a.mygroup, a.myversion, a.cnt
 17  /

MYGROUP     MYVERSION WHAT
---------- ---------- ------------------------------
Group A             1 Next Group Different
Group A             2 Next Group Different
Group A             3 Next Group Same
Group A             4 Next Group Different
Group b             1 Next Group Different
Group b             2 Next Group Different
Group b             3 Next Group Different

7 rows selected.
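The comparison itself can be sketched procedurally with sets. This Python illustration (Group A of the expanded test case only, for brevity) collects each version's member set and asks whether the next version's set is identical:

```python
from collections import defaultdict

rows = [("Group A", 1, "Member 1"),
        ("Group A", 2, "Member 1"), ("Group A", 2, "Member 2"),
        ("Group A", 3, "Member 1"), ("Group A", 3, "Member 3"),
        ("Group A", 4, "Member 1"), ("Group A", 4, "Member 3")]

members = defaultdict(set)
for grp, ver, mem in rows:
    members[(grp, ver)].add(mem)

flags = {}
for (grp, ver), mems in members.items():
    # identical sets means same count AND same matched members -
    # the two conditions the outer-join query checks with counts
    same = members.get((grp, ver + 1)) == mems
    flags[(grp, ver)] = "Next Group Same" if same else "Next Group Different"

print(flags[("Group A", 3)])   # Next Group Same
```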
 

group by composite primary key

January 13, 2006 - 1:58 am UTC

Reviewer: apparao from kakinada

I have a table A which I am copying into another table B that has a composite primary key defined on the first two columns. Table A has duplicate rows, so I cannot insert it as-is into table B due to the primary key constraint. What can I do to merge the offending rows of table A into one row and then insert into table B? Any chance of using GROUP BY?

Tom Kyte

Followup  

January 13, 2006 - 11:12 am UTC

describe how you would "merge them". that bit is missing here - what is your "algorithm" for turning multiple rows into 1.

can we use group by on composite primary key

January 15, 2006 - 8:41 pm UTC

Reviewer: apparao from kakinada

Hi Tom, here is my merge logic.

Table A

        col1 col2 col3 col4 col5
row one    1    2   10   20  www
row two    1    2   11   10  www

On inserting into Table B (condition: col1 and col2 form a composite primary key in table B):

col1 col2 col3 col4 col5
   1    2   21   30  www

Merge logic: I would like to add up the remaining columns for duplicate primary keys (the composite primary key composed of col1 and col2).

Assume that for non-numeric columns like col5 in table A, it contains the same values for row 1 and row 2.

regards
apparao

Tom Kyte

Followup  

January 16, 2006 - 9:04 am UTC

I can see how to get col3 and col4 - but col5 - what if it was

www
yyy

in the two rows, what would col5 be ?

January 16, 2006 - 8:50 pm UTC

Reviewer: apparao from kakinada

Assume that for non-numeric columns like col5 in table A, it contains the same values for row 1 and row 2.

The business rule is: I add up numeric columns, and for non-numeric columns the values contained are assumed to be the same, i.e. there will be no case where it will be
www
yyy

it will always be
www
www

or

yyy
yyy

Just in case that assumption did not hold and the business rule said to select any one non-numeric value randomly, how would I do that?

I thought about the solution to my original question and came up with the following answer. Is there a better way to do it?

select col1,col2 ,sum(col3),sum(col4),max(col5) group by col1,col2


Tom Kyte

Followup  

January 16, 2006 - 9:07 pm UTC

then you can either do the

select c1, c2, sum(c3), sum(c4), c5
group by c1, c2, c5;


or max(c5) and group by c1, c2


IF your assumption ever proves wrong, that there is a www yyy - then the first one will let you "see it".
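The merge can be sketched procedurally too. This Python illustration (data is the two duplicate rows from the example) groups by the composite key (col1, col2), SUMs the numeric columns, and takes MAX of the non-numeric column - any row's value would do, since the business rule says they are equal within a key:

```python
from collections import defaultdict

rows = [(1, 2, 10, 20, "www"),   # the two duplicate rows from the example
        (1, 2, 11, 10, "www")]

merged = defaultdict(lambda: [0, 0, ""])
for c1, c2, c3, c4, c5 in rows:
    acc = merged[(c1, c2)]
    acc[0] += c3                  # sum(col3)
    acc[1] += c4                  # sum(col4)
    acc[2] = max(acc[2], c5)      # max(col5)

print(dict(merged))   # {(1, 2): [21, 30, 'www']}
```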

January 16, 2006 - 8:53 pm UTC

Reviewer: apparao from kakinada

Sorry, a typo - I meant to say:

select col1,col2 ,sum(col3),sum(col4),max(col5) from tableA group by col1,col2


Thank you for the response

January 17, 2006 - 11:16 pm UTC

Reviewer: apparao

As always Thank you for the timely and quick response.

regards
apparao

analytic function

January 26, 2006 - 8:28 am UTC

Reviewer: omer from Pakistan

Hi Tom!
I have a table lot_temp

 SQL> select * 
  2  from lot_temp
  3  /

    LOT_NO    AVAIL_BALES
---------- --------------
         9              2
         1              3
        12              4
         4              5
         2              8
         3             13
         8              7

7 rows selected.

I have to give 30 bales from all lots and it should be equally given from every lot, must be integer
so the result will like.

SQL> select lot_no, avail_bales, given_bales --calculated
  2  from lot_temp
  3* order by lot_no
SQL> /

    LOT_NO    AVAIL_BALES GIVEN_BALES
---------- -------------- -----------
         1              3           5
         2              8           5
         3             13           4
         4              5           4
         8              7           4
         9              2           4
        12              4           4

7 rows selected.

i am giving 5 bales from 1,2 lots because they are older (i have date in another table, but lot_no can be used to sort)

but , given bales should not greater than available bales. here you can see given bales are greater than available bales.
Lot no 1 has 3 available bales so 3 bales should be given from lot 1, and remaining 27 should be equally divide in remaining 6 lots.

please help me to solve this problem.
I am using SQL.

Regards 

Tom Kyte

Followup  

January 26, 2006 - 10:23 am UTC

bin fitting is not something you will be doing easily in SQL - you might be able to cook up something with the model clause (i'm not going to try myself though), but it requires an "algorithm" that can procedurally process the data.



try this

January 26, 2006 - 1:10 pm UTC

Reviewer: Jaromir D.B. Nemec from Vienna, Austria

Hi Omer,
There are probably more elegant solutions, but I'll start with this simple approach:
lots under the average give all of their available bales;
the rest of the bales is divided uniformly, with a possible correction of +1 to get the sum right.
See select below

select lot_no, avail_bales,
       case when avail_bales <= avg_bal
            then avail_bales                              -- below avg give all
            else div + case when rn <= mod then 1 else 0 end
        end as given_bales
  from (
select lot_no, avail_bales, avg_bal, rn,
       trunc((30 - sum_le_avg) / cnt_gt_avg) div,
       mod((30 - sum_le_avg), cnt_gt_avg) mod
  from (
select lot_no, avail_bales, avg_bal,
       sum(case when avail_bales <= avg_bal then avail_bales else 0 end) over () sum_le_avg,
       sum(case when avail_bales > avg_bal then 1 else 0 end) over () cnt_gt_avg,
       row_number() over (partition by case when avail_bales > avg_bal then 1 else 0 end
                              order by case when avail_bales > avg_bal then avail_bales else 0 end) rn
  from (
select lot_no, avail_bales,
       trunc(30 / count(avail_bales) over ()) avg_bal
  from lot_temp
)))
order by lot_no;

    LOT_NO AVAIL_BALES GIVEN_BALES
---------- ----------- -----------
         1           3           3
         2           8           5
         3          13           5
         4           5           6
         8           7           5
         9           2           2
        12           4           4

7 rows selected.


Regards,

Jaromir D.B. Nemec


analytic question

January 27, 2006 - 1:17 am UTC

Reviewer: omer from Pakistan

Hi!
OK Tom, would you then solve this in PL/SQL?

Thanks Jaromir for your answer, but as you can see the result is not correct:
at the last level, when you add the average share, it can exceed the available bales.

Regards

A SQL solution

January 30, 2006 - 9:25 am UTC

Reviewer: Michel Cadot from France

Hi,

Here's a query with model and analytic functions.
If you worked with floating point numbers it would be easy (I put some comments and additional columns in to show how it works):

SQL> select lot_no, avail_bales from lot_temp order by lot_no;
    LOT_NO AVAIL_BALES
---------- -----------
         1           3
         2           8
         3          13
         4           5
         8           7
         9           2
        12           4

7 rows selected.

SQL> def tot_bales=30
SQL> break on report
SQL> compute sum label Total of given_bales on report
SQL> def max_iter=10
SQL> select ln lot_no, ab avail_bales, gb given_bales, rb, rl, pr, ni
  2  from lot_temp
  3  model
  4    dimension by (lot_no ln)
  5    measures (avail_bales ab, cast(null as number) gb,
  6              cast(null as number) rb, cast(null as number) rl,
  7              cast(null as number) pr, 0 as ni)
  8    rules update sequential order 
  9      iterate(&max_iter) 
 10      until (rl[1] = 0)
 11      ( -- rb = remaining bales at the beginning
 12        rb[ANY] = &tot_bales - nvl(sum(gb)[ANY],0),
 13        -- rl = remaining lots at the beginning
 14        rl[ANY] = count(*)[ANY] - count(gb)[ANY],
 15        -- gb = given bales
 16        gb[ANY] = case
 17                  when gb[cv()] is null and rb[cv()]/rl[cv()] >= ab[cv()]
 18                    -- if share does not fit take only available
 19                    then ab[cv()]
 20                  when gb[cv()] is not null
 21                    -- don't change what's already set
 22                    then gb[cv()]
 23                  when pr[cv()] = rl[cv()] and gb[cv()] is null 
 24                    -- no change in the previous iteration -> fill the rest
 25                    then rb[cv()]/rl[cv()]
 26                  end,
 27        -- memorize previous remaining lots
 28        pr[ANY] = rl[cv()],
 29        -- rb = remaining bales at the end
 30        rb[ANY] = &tot_bales - nvl(sum(gb)[ANY],0),
 31        -- rl = remaining lots at the end
 32        rl[ANY] = count(*)[ANY] - count(gb)[ANY],
 33        -- number of iterations (just curious)
 34        ni[ANY] = ni[cv()] + 1
 35      )
 36  order by 1
 37  /
    LOT_NO AVAIL_BALES GIVEN_BALES         RB         RL         PR         NI
---------- ----------- ----------- ---------- ---------- ---------- ----------
         1           3           3 1.0000E-38          0          3          4
         2           8  5.33333333 1.0000E-38          0          3          4
         3          13  5.33333333 1.0000E-38          0          3          4
         4           5           5 1.0000E-38          0          3          4
         8           7  5.33333333 1.0000E-38          0          3          4
         9           2           2 1.0000E-38          0          3          4
        12           4           4 1.0000E-38          0          3          4
                       -----------
Total                           30

7 rows selected.

Now we can enclose this query to get the integer behaviour you want:

SQL> select ln lot_no, ab avail_bales, -- gb, 
  2         case
  3         when gb = trunc(gb) then gb
  4         when sum(ceil(gb)-gb) over(order by ln) 
  5              < sum(gb-trunc(gb)) over()
  6           then ceil(gb)
  7         else trunc(gb)
  8         end given_bales
  9  from ( select ln, ab, gb
 10         from lot_temp
 11         model
 12           dimension by (lot_no ln)
 13           measures (avail_bales ab, cast(null as number) gb,
 14                     cast(null as number) rb, cast(null as number) rl,
 15                     cast(null as number) pr, 0 as ni)
 16           rules update sequential order 
 17             iterate(&max_iter) 
 18             until (rl[1] = 0)
 19             ( -- rb = remaining bales at the beginning
 20               rb[ANY] = &tot_bales - nvl(sum(gb)[ANY],0),
 21               -- rl = remaining lots at the beginning
 22               rl[ANY] = count(*)[ANY] - count(gb)[ANY],
 23               -- gb = given bales
 24               gb[ANY] = case
 25                         when    gb[cv()] is null
 26                             and rb[cv()]/rl[cv()] >= ab[cv()]
 27                           -- if share does not fit take only available
 28                           then ab[cv()]
 29                         when gb[cv()] is not null
 30                           -- don't change what's already set
 31                           then gb[cv()]
 32                         when pr[cv()] = rl[cv()] and gb[cv()] is null 
 33                           -- no change in the previous iteration
 34                           -- -> fill the rest
 35                           then rb[cv()]/rl[cv()]
 36                         end,
 37               -- memorize previous remaining lots
 38               pr[ANY] = rl[cv()],
 39               -- rb = remaining bales at the end
 40               rb[ANY] = &tot_bales - nvl(sum(gb)[ANY],0),
 41               -- rl = remaining lots at the end
 42               rl[ANY] = count(*)[ANY] - count(gb)[ANY],
 43               -- number of iterations (just curious)
 44               ni[ANY] = ni[cv()] + 1
 45             )
 46       )
 47  order by 1
 48  /
    LOT_NO AVAIL_BALES GIVEN_BALES
---------- ----------- -----------
         1           3           3
         2           8           6
         3          13           5
         4           5           5
         8           7           5
         9           2           2
        12           4           4
                       -----------
Total                           30

7 rows selected.

Regards
Michel
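What the MODEL rules iterate can also be sketched procedurally (this Python is an illustration of the algorithm, not Oracle code): repeatedly cap every lot that cannot absorb an equal share, and once no more lots hit their cap, split the remainder evenly. The fractional result matches the first query's output; rounding to integers is the separate second step.

```python
def distribute(avail, total):
    given = {lot: None for lot in avail}     # None = not yet decided
    while True:
        remaining = total - sum(g for g in given.values() if g is not None)
        open_lots = [lot for lot, g in given.items() if g is None]
        share = remaining / len(open_lots)
        capped = [lot for lot in open_lots if avail[lot] <= share]
        if not capped:                       # stable: split the rest evenly
            for lot in open_lots:
                given[lot] = share
            return given
        for lot in capped:                   # these lots give all they have
            given[lot] = avail[lot]

avail = {1: 3, 2: 8, 3: 13, 4: 5, 8: 7, 9: 2, 12: 4}
result = distribute(avail, 30)
print(round(sum(result.values()), 6))   # 30.0
```

Lots 1, 9, 12 and 4 are capped at their available bales; the remaining 16 bales split 5.33 each across lots 2, 3 and 8, matching the RB/RL iteration above.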

 

model clause

February 01, 2006 - 1:20 am UTC

Reviewer: Omer from Pakistan

Hi!
Superb query. I was not aware of this new model clause.
Thanks for this excellent piece of code.

Regards

ROW_NUMBER() ordering

February 03, 2006 - 4:59 pm UTC

Reviewer: Zahir M from Monroe NJ

I have two identical SQLs (except for the where clause);
the order of the result sets is different, even though my order by clause is the same in the ROW_NUMBER function.
Please explain.

SQL> Select * from ( 
  2    Select ename ,deptno , row_number() over( order by deptno) from emp) ;

ENAME       DEPTNO ROW_NUMBER()OVER(ORDERBYDEPTNO)
------- ---------- -------------------------------
CLARK           10                               1
Zahir           10                               2
Zahir           10                               3
MILLER          10                               4
KING            10                               5
JONES           20                               6
SMITH           20                               7
FORD            20                               8
ADAMS           20                               9
SCOTT           20                              10
WARD            30                              11

ENAME       DEPTNO ROW_NUMBER()OVER(ORDERBYDEPTNO)
------- ---------- -------------------------------
TURNER          30                              12
ALLEN           30                              13
JAMES           30                              14
MARTIN          30                              15
BLAKE           30                              16

16 rows selected.


  1  Select * from (
  2    Select ename ,deptno , row_number() over( order by deptno) rn from emp )
  3*   Where rn < 17
SQL> /

ENAME       DEPTNO         RN
------- ---------- ----------
Zahir           10          1
Zahir           10          2
CLARK           10          3
KING            10          4
MILLER          10          5
SMITH           20          6
JONES           20          7
SCOTT           20          8
ADAMS           20          9
FORD            20         10
ALLEN           30         11

ENAME       DEPTNO         RN
------- ---------- ----------
WARD            30         12
TURNER          30         13
JAMES           30         14
MARTIN          30         15
BLAKE           30         16

SQL> Select version from v$instance;

VERSION
-----------------
10.2.0.1.0 

Follow up to ROW_NUMBER() ordering

February 07, 2006 - 4:06 pm UTC

Reviewer: Michel Cadot from France

Hi Zahir,

In your row_number you said to order by deptno, and your row numbers are indeed sorted by deptno.

If you want a finer sort, then give more columns in your "order by" clause.
If you don't specify an order for the other columns, there is no reason to get a specific order for them.

Finally, the reason you get a different order is that you didn't fully specify one.
Moreover, you should be happy to get the final result in row number order (the rn column); that could just as well have been yet another order.

Regards
Michel

Rank() Order By

March 29, 2006 - 1:47 pm UTC

Reviewer: Arjun from NY

Still not clear why the order is different

Tom Kyte

Followup  

March 29, 2006 - 2:23 pm UTC

because the query doesn't have an order by on it, without the query having an order by - you cannot assume anything about the order of rows in the result set.

Finding rows between 2 alphanumeric values

March 31, 2006 - 4:55 pm UTC

Reviewer: TS from USA

Tom,

I have a table in which the accounts are alphanumeric,
something like this:-

table_a
=======
accountid
---------
A1234
B123
B235
B9000
C6754

select * from table_a where account_id between 'B100'
and 'B300';

gets me all the rows with accountid's starting with B%
I only want 2 records retrieved:
B123 and B235

Could you pls let me know of a way.





Tom Kyte

Followup  

March 31, 2006 - 5:19 pm UTC

are you sure about that

  1* select * from t where accountid between 'B100' and 'B300'
ops$tkyte@ORA9IR2> /

ACCOUNTID
----------
B123
B235

ops$tkyte@ORA9IR2> select * from t;

ACCOUNTID
----------
A1234
B123
B235
B9000
C6754
 

Finding rows between 2 alphanumeric values

March 31, 2006 - 5:48 pm UTC

Reviewer: TS from USA

ok. my bad. Actually the problem is like this:-

Acctid
------
B100
B235
B23456
B11234

select * from table_a where acctid between 'B100' and 'B300';

gets me all four rows. I think the reason is that it compares the values character by character, and since the leading characters of the 3rd and 4th rows fall between 'B1' and 'B3', it retrieves those rows also.

Is there a work around for this?


Thanks
TS


Tom Kyte

Followup  

March 31, 2006 - 7:00 pm UTC

well, I don't know what workaround there could be - you seem to be thinking this is two fields

<string> <number>


if they are in fact in this format, you'd have to do something with substr and to_number()

otherwise, this is working exactly as you would expect. B2<ANYTHING> is by definition between B1.. and B3.. - regardless of what the .. is.

this is how strings "by their very definition" compare.
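This character-by-character comparison is easy to see outside the database too. A Python illustration (data from the example; the key function assumes the one-letter-then-digits format): 'B23456' sorts between 'B100' and 'B300' no matter how many digits follow, while splitting each value into a letter prefix plus a numeric part - the substr/to_number idea - restores numeric ordering:

```python
ids = ["B100", "B235", "B23456", "B11234"]

# plain string comparison: only the leading characters decide
lexical = [a for a in ids if "B100" <= a <= "B300"]
print(lexical)          # all four values pass

def key(acct):
    # assumes the format is one letter followed by digits
    return acct[0], int(acct[1:])

# compare (letter, number) tuples instead
numeric = [a for a in ids if key("B100") <= key(a) <= key("B300")]
print(numeric)          # ['B100', 'B235']
```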

select query from metadata

April 04, 2006 - 7:02 am UTC

Reviewer: Tony from USA

Tom

How are you?

I have one query .

I am creating one select statement from the table metadata.

the metadata table contains

V_COL_NAME | V_SRC_TAB | V_FILTER_CLAUSE          | V_GRP_BY_CLAUSE
-----------+-----------+--------------------------+----------------
empno      | emp       | emp.deptno = dept.deptno | empno
empname    | dept      |                          | empname
dname      |           |                          | dname
sum(sal)   |           |                          |

The table creation script is:

CREATE TABLE REP_QUERY_DATA(
   V_COL_NAME      VARCHAR2(500 BYTE),
   V_SRC_TAB       VARCHAR2(500 BYTE),
   V_FILTER_CLAUSE VARCHAR2(500 BYTE),
   V_GRP_BY_CLAUSE VARCHAR2(500 BYTE)
)

I want to create the select query using this table.

The query will be :

select
empno,
empname,
dname,
sum(sal)
from
emp,
dept
where emp.deptno = dept.deptno
group by
empno,
empname,
dname

can you please guide me?

Tom Kyte

Followup  

April 04, 2006 - 9:58 am UTC

it is your data - you are storing it, you designed it.


but you cannot put it all together yourself???


I won't even comment on how bad an idea I think this is. The data model you are using is wholly inappropriate. You might want to give a bit more thought to that - I see the need for at least 5 tables to do this right.

Help on Query using analytics

April 27, 2006 - 5:24 pm UTC

Reviewer: Jonty from NJ, USA

Hello Tom,

I have following data in my table.

Row# Item Order_Date Order_No
---- ----- ---------- ---------
1 101 1-Jan-2006 9001
2 101 2-Jan-2006 9002
3 101 8-Jan-2006 9003
4 101 16-Jan-2006 9004
5 101 18-Jan-2006 9005

6 102 1-Jan-2006 9006
7 102 5-Jan-2006 9007
8 102 7-Jan-2006 9008
9 102 20-Jan-2006 9009

10 103 5-Jan-2006 9010
11 103 7-Jan-2006 9011
12 103 9-Jan-2006 9012


I need to find out which are the latest group of Orders for each Item.

Group of Orders - any 2 orders whose order_dates are within a 1-week gap.

In above example,
For
Item 101 - Row # 4,5 will be group # 1
Row # 1,2,3 will be group # 2

Item 102 - Row # 9 will be group # 1
Row # 6,7,8 will be group # 2

Item 103 - Row # 10, 11, 12 will be group # 1

I am only interested in group # 1. Query results should be like this

Item # Order No
-------- ------------
101 9004
101 9005

102 9009

103 9010
103 9011
103 9012

I really appreciate your help on this.

Thanks

Jonty
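This one got no direct follow-up in the thread, but one common analytic approach (a sketch, not Tom's answer) is: compute the gap to the next-later order with LAG over a descending date sort, then take a running SUM of "gap exceeded 7 days" flags; rows whose running count is still 0 form the latest group. Here it is run through Python's sqlite3 (julianday stands in for Oracle date arithmetic; table and column names are made up to match the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (item INTEGER, order_date TEXT, order_no INTEGER);
    INSERT INTO orders VALUES
     (101,'2006-01-01',9001),(101,'2006-01-02',9002),(101,'2006-01-08',9003),
     (101,'2006-01-16',9004),(101,'2006-01-18',9005),
     (102,'2006-01-01',9006),(102,'2006-01-05',9007),(102,'2006-01-07',9008),
     (102,'2006-01-20',9009),
     (103,'2006-01-05',9010),(103,'2006-01-07',9011),(103,'2006-01-09',9012);
""")

latest_groups = conn.execute("""
    SELECT item, order_no
      FROM (SELECT item, order_no,
                   -- running count of >7-day gaps, newest order first;
                   -- 0 means no gap crossed yet: the latest group
                   SUM(CASE WHEN gap > 7 THEN 1 ELSE 0 END)
                     OVER (PARTITION BY item ORDER BY order_date DESC) AS grp
              FROM (SELECT item, order_no, order_date,
                           julianday(LAG(order_date) OVER
                                       (PARTITION BY item
                                        ORDER BY order_date DESC))
                             - julianday(order_date) AS gap
                      FROM orders))
     WHERE grp = 0
     ORDER BY item, order_no
""").fetchall()
```

This returns exactly the rows listed in the question: 9004/9005 for item 101, 9009 for item 102, and 9010-9012 for item 103.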

Summation over week's data

June 01, 2006 - 6:02 pm UTC

Reviewer: A reader

Hi Tom,

Can you please assist me with the SQL, using analytics, to compute the sum of the weekly data for a report? I will then extend it to cover monthly and yearly data.

Here is the sql:

SELECT TRUNC (a_date), COUNT (DISTINCT p_id)
FROM a_table
WHERE (a_str = 'Login')
AND (a_date >= TRUNC (SYSDATE - (TO_CHAR (SYSDATE, 'D') - 1))
AND (a_date < TRUNC (SYSDATE) + 1)
)
GROUP BY (TRUNC (a_date))

Row# TRUNC(A_DATE) COUNT(DISTINCT P_ID)

1 05/28/2006 143
2 05/29/2006 472
3 05/30/2006 3959
4 05/31/2006 4065
5 06/01/2006 3720

On the sixth row, I'd like to have the sum of the count values.

Thanks for your help.

Ma$e

Tom Kyte

Followup  

June 01, 2006 - 6:24 pm UTC

give me a table and inserts to work with!

(you want a group by rollup apparently.... that might be enough for you to get it...)


scott@ORA10GR2> select trunc(hiredate,'y'), count(empno)
2 from emp
3 group by rollup(trunc(hiredate,'y'))
4 /

TRUNC(HIR COUNT(EMPNO)
--------- ------------
01-JAN-80 1
01-JAN-81 10
01-JAN-82 2
01-JAN-83 1
14


The distinct in the select is the culprit....

June 01, 2006 - 11:41 pm UTC

Reviewer: A reader

Hi Tom:

I'm working on getting the DDL and DML for you, but in the meantime here is what I think is the problem.

It is the distinct in the select that is the key.

SQL> SELECT   TRUNC (d_date), COUNT (DISTINCT p_id)
  2      FROM d_table
  3     WHERE (    d_date >=
  4                                   TRUNC (to_date('06-JAN-2006','DD-MON-YYYY') - (TO_CHAR 
  5  (to_date('06-JAN-2006','DD-MON-YYYY'), 'D') - 1))
  6            AND (D_DATE < TRUNC (to_date('06-JAN-2006','DD-MON-YYYY')) + 1)
  7           )
  8  GROUP BY rollup (TRUNC (D_DATE));

TRUNC(D_D COUNT(DISTINCTP_ID)
--------- -------------------
01-JAN-06                   1
02-JAN-06                   1
03-JAN-06                   2
04-JAN-06                   3
05-JAN-06                   2
06-JAN-06                   1
                            6

I want the total to be 10. The rollup should perform the sum of the count(...) values.

Basically the SQL is to answer: how many total unique user logins have occurred daily, with the total of that at the bottom (i.e. the total of all unique logins).

However:

SQL> SELECT COUNT(DISTINCT P_ID) FROM D_TABLE WHERE D_DATE BETWEEN '01-JAN-2006'  AND '06-JAN-2006';


COUNT(DISTINCTP_ID)
-------------------
                  5


As usual your assistance is greatly appreciated.

Ma$e 

here is the dml, ddl....

June 02, 2006 - 12:04 am UTC

Reviewer: A reader

SQL> desc d_table;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 P_ID                                      NOT NULL NUMBER(10)
 D_DATE                                    NOT NULL DATE


insert into d_table values (7052,'01-JAN-06');
insert into d_table values (7052,'02-JAN-06');
insert into d_table values (7052,'02-JAN-06');
insert into d_table values (7052,'02-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (1000,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7204,'03-JAN-06');
insert into d_table values (7058,'04-JAN-06');
insert into d_table values (7040,'04-JAN-06');
insert into d_table values (1000,'04-JAN-06');
insert into d_table values (1000,'04-JAN-06');
insert into d_table values (7040,'04-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7052,'05-JAN-06');
insert into d_table values (7204,'05-JAN-06');
insert into d_table values (7022,'06-JAN-06');

thanks
 

Tom Kyte

Followup  

June 02, 2006 - 10:23 am UTC

ops$tkyte@ORA10GR2> select dt, sum(cnt)
  2    from (
  3  SELECT   TRUNC (d_date) dt, COUNT (DISTINCT p_id) cnt
  4      FROM d_table
  5     WHERE (    d_date >=
  6           TRUNC (to_date('06-JAN-2006','DD-MON-YYYY') - (TO_CHAR
  7  (to_date('06-JAN-2006','DD-MON-YYYY'), 'D') - 1))
  8            AND (D_DATE < TRUNC (to_date('06-JAN-2006','DD-MON-YYYY')) + 1)
  9           )
 10  GROUP BY (TRUNC (D_DATE))
 11         )
 12  group by rollup(dt);
 
DT          SUM(CNT)
--------- ----------
01-JAN-06          1
02-JAN-06          1
03-JAN-06          2
04-JAN-06          3
05-JAN-06          2
06-JAN-06          1
                  10
 
7 rows selected.
 

Most problems do have simple solutions....if you can think out of the box...

June 02, 2006 - 11:35 am UTC

Reviewer: A reader

Thanks Tom,

I should have expected the solution would be that simple, but I was stuck on a single query and could not think out of the box...

Ma$e


Ami

June 09, 2006 - 10:37 am UTC

Reviewer: Ami from Albany, NY

Hi Tom,

I need to create table B from table A. Below is sample data from table A.

col_1 col_2 col_3 col_4
489915 89831 76592641 415082
492207 89831 76592641 415082
638185 89831 76592641 415082
638813 89831 76592641 415082
638172 89831 76592641 415082
492433 89831 76592641 415082

526170 89834 45897659 167876
533538 89834 45897659 167876
531911 89834 45897659 167876

838980 89840 67731620 561277
926029 89840 67731620 561277
925192 89840 67731620 561277
923034 89840 67731620 561277

900096 89844 26091723 2854
863387 89844 794581462 16122
678308 89844 99188161 68802
884397 89844 6952568 599327
882958 89844 6952568 599327
881272 89844 6952568 599327
879516 89844 6952568 599327

487415 89905 1414739 82226
638480 89905 1414739 82226
638135 89905 1414739 82226
636633 89905 1414739 82226
636631 89905 1414739 82226
491705 89905 931873913 158128

The table is unique by col_1 .
For table B, I need one record from each group of col_2 based on max(col_1), but if a group has more than one value in col_4 then the record I need has to be the one with max(col_4).

Any ideas on how to do this in sql?
As usual, your advice is greatly appreciated.

Thank you,
Ami



Tom Kyte

Followup  

June 09, 2006 - 1:22 pm UTC

sure, we could do this...

but without a create table and insert into to demonstrate with, I won't.

You can. It involves analytics - lots of examples of analytics on the site.

Your recommendation

August 01, 2006 - 11:00 am UTC

Reviewer: Mark from Boston, MA

Hi Tom,

I'd like your *economic* recommendation of how to go about learning the Analytics Functions in Oracle 9i R2. Can you recommend a good hands-on book or anything that has clear examples? I'm browsing the Oracle docs but perhaps looking for something a bit better and more hands-on.

Regards,
Mark

Tom Kyte

Followup  

August 01, 2006 - 6:47 pm UTC

They hid the best bits in the data warehousing guide:

</code> http://docs.oracle.com/docs/cd/B10501_01/server.920/a96520/analysis.htm#1020 <code>

and I've been told the analytics chapter in "Expert one on one Oracle" (my first book, shipped as a CD with the latest book "Expert Oracle Database Architecture") is pretty readable.



row_number()

August 03, 2006 - 12:37 pm UTC

Reviewer: Su Baba

CREATE TABLE x (
name VARCHAR2(10),
theDate DATE
);

INSERT INTO x VALUES ('John', TO_DATE('2006/01/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('John', TO_DATE('2006/04/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('John', TO_DATE('2006/07/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('Mary', TO_DATE('2006/04/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('Mary', TO_DATE('2006/07/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('Ken', TO_DATE('2006/04/01', 'YYYY/MM/DD'));
INSERT INTO x VALUES ('Ken', TO_DATE('2006/07/01', 'YYYY/MM/DD'));
commit;


SELECT name,
row_number() OVER (PARTITION BY name ORDER BY theDate) rn
FROM x;

NAME RN
---------- ----------
John 1
John 2
John 3
Ken 1
Ken 2
Mary 1
Mary 2

Now, is there any way to use analytic functions to produce rn2 (see below), which numbers the rows across names but stays the same within the same name?

NAME RN RN2
---------- ---------- ---
John 1 1
John 2 1
John 3 1
Ken 1 2
Ken 2 2
Mary 1 3
Mary 2 3


Tom Kyte

Followup  

August 03, 2006 - 4:25 pm UTC

dense_rank() over (order by name)
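Spelled out against the poster's own table (a sketch run through Python's sqlite3, whose window functions behave the same way here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (name TEXT, theDate TEXT);
    INSERT INTO x VALUES
     ('John','2006-01-01'),('John','2006-04-01'),('John','2006-07-01'),
     ('Mary','2006-04-01'),('Mary','2006-07-01'),
     ('Ken','2006-04-01'),('Ken','2006-07-01');
""")

rows = conn.execute("""
    SELECT name,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY theDate) AS rn,
           -- dense_rank over the full set: same value for every row of
           -- a given name, incrementing by 1 per distinct name
           DENSE_RANK() OVER (ORDER BY name) AS rn2
      FROM x
     ORDER BY name, theDate
""").fetchall()
```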

select

August 10, 2006 - 5:07 am UTC

Reviewer: Robert Hanrahan from Rome - Italy

hi Tom,

I have a question for you:

I have a select like this:
SELECT a.id_contract, a.credit_card, b.num_of_contracts
FROM big_table a,(select count(credit_card) num_of_contracts, credit_card from big_table a
group by credit_card
having count(credit_card) > 1) b
where a.credit_card = b.credit_card
order by credit_card, id_contract;

and the output is this:
id_contract credit_card n
----------- -------------- -
7349873 36009899812733 2
7349922 36009899812733 2
7762917 371232759765007 4
7766297 371232759765007 4
7771404 371232759765007 4
7771426 371232759765007 4

Now, I have to insert this data inside another table in this way:

id_contract1 id_contract2 id_contract3 id_contract4 credit_card
7349873 7349922 NULL NULL 36009899812733(2)
7762917 7766297 7771404 7771426 37123275976500(4)

What do I have to do to accomplish this?

let me know

Robert Hanrahan

Tom Kyte

Followup  

August 10, 2006 - 9:20 am UTC

you'll use row_number() over (partition by credit_card order by id_contract) to assign a number 1, 2, 3, 4... to the rows - so take your query above, call it q1

select q1.*, row_number() over (... )rn
from (Q1) q1

call that q2. then pivot it.

select credit_card,
max( decode( rn, 1, id_contract ) ),
max( decode( rn, 2, id_contract ) ),
...
max( decode( rn, N, id_contract ) ) /* n is max number of id_contracts
per credit_card */
from (Q2)
group by credit_card
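Put together end to end, the rank-then-pivot looks like this (a sketch using Python's sqlite3, with CASE in place of Oracle's DECODE; the table layout follows the question, and N=4 because the sample data has at most four contracts per card):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE big_table (id_contract INTEGER, credit_card TEXT);
    INSERT INTO big_table VALUES
     (7349873,'36009899812733'),(7349922,'36009899812733'),
     (7762917,'371232759765007'),(7766297,'371232759765007'),
     (7771404,'371232759765007'),(7771426,'371232759765007');
""")

pivoted = conn.execute("""
    SELECT credit_card,
           -- each rn value appears at most once per card, so MAX picks
           -- that contract (or NULL when the card has fewer contracts)
           MAX(CASE WHEN rn = 1 THEN id_contract END) AS id_contract1,
           MAX(CASE WHEN rn = 2 THEN id_contract END) AS id_contract2,
           MAX(CASE WHEN rn = 3 THEN id_contract END) AS id_contract3,
           MAX(CASE WHEN rn = 4 THEN id_contract END) AS id_contract4
      FROM (SELECT credit_card, id_contract,
                   ROW_NUMBER() OVER (PARTITION BY credit_card
                                      ORDER BY id_contract) AS rn
              FROM big_table)
     GROUP BY credit_card
     ORDER BY credit_card
""").fetchall()
```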

Variance of First time where

August 15, 2006 - 9:04 am UTC

Reviewer: Milan from Prague

Hi Tom,

I have following table:

create table sale_hist
(
tmstmp date,
sale_id number(4),
sale_type_id number(4),
value number(4)
)

insert into sale_hist values (to_date('01082006','ddmmyyyy'), 1, 1, 1);
insert into sale_hist values (to_date('01082006','ddmmyyyy'), 1, 2, 1);
insert into sale_hist values (to_date('01082006','ddmmyyyy'), 1, 3, 1);
insert into sale_hist values (to_date('02082006','ddmmyyyy'), 1, 1, 0);
insert into sale_hist values (to_date('02082006','ddmmyyyy'), 1, 2, 0);
insert into sale_hist values (to_date('02082006','ddmmyyyy'), 1, 3, 1);
insert into sale_hist values (to_date('03082006','ddmmyyyy'), 1, 3, 0);
insert into sale_hist values (to_date('05082006','ddmmyyyy'), 1, 1, 2);
insert into sale_hist values (to_date('05082006','ddmmyyyy'), 1, 2, 3);
insert into sale_hist values (to_date('06082006','ddmmyyyy'), 1, 1, 1);
insert into sale_hist values (to_date('06082006','ddmmyyyy'), 1, 3, 2);

For sale_id I have to sum the latest values for all sale_type_id

I think I can do it with

select
sum(value)
from
(
select distinct sale_id, sale_type_id,
first_value(value) over (partition by sale_id, sale_type_id
order by tmstmp desc) value
from sale_hist
where sale_id = :p_sale_id
and tmstmp <= :p_tmstmp
)

My problem is:
I have to found for every sale_id the latest tmstmp when the sum started to be positive.

I have read the similar questions on this page and I can do it for single value (LAG ...). But I have problem with the sum, I cannot relate the tmstmp and sale_id to previous 'sum' statement. Could you help me, please?

Thanks in advance.
Milan


Tom Kyte

Followup  

August 15, 2006 - 12:16 pm UTC

I don't know what you mean by "I have to found for every sale_id the latest tmstmp when the sum started to be positive."

Variance of First time where

August 15, 2006 - 5:20 pm UTC

Reviewer: Milan from Prague

Hi Tom,
thanks for the fast response. I'll try to clarify it.

I get following values (sum the latest values for all sale_type_id) from the data for tmstmp <= [1. column] and sale_id = 1

tmstmp <= sale_id sum
01.08.2006 1 3
02.08.2006 1 1
03.08.2006 1 0
04.08.2006 1 0
05.08.2006 1 5
06.08.2006 1 6

For example :p_tmstmp = 06.08.2006
If the sum for :p_tmstmp is positive (sum > 0) I have to find in the history the beginning of the continuous run of positive sums - it is 05.08.2006 in this case.
...
04.08.2006 1 0 -- zero
05.08.2006 1 5 -- positive <=== the beginning of the continuous run
06.08.2006 1 6 -- positive


Tom Kyte

Followup  

August 15, 2006 - 6:16 pm UTC

do you mean like this:
ops$tkyte%ORA9IR2> select *
  2    from (
  3  select sale_id, tmstmp, value,
  4         max(flag) over (partition by sale_id order by tmstmp DESC) max_flag
  5    from (
  6  select sale_id, tmstmp, value,
  7         case when row_number() over (partition by sale_id order by tmstmp DESC) = 1
  8                  then 1
  9                          when value <= 0
 10                          then 2
 11                  end flag
 12    from sale_hist
 13         )
 14         )
 15   where max_flag = 1
 16   order by sale_id, tmstmp DESC
 17  /

   SALE_ID TMSTMP         VALUE   MAX_FLAG
---------- --------- ---------- ----------
         1 06-AUG-06          1          1
         1 06-AUG-06          2          1
         1 05-AUG-06          2          1
         1 05-AUG-06          3          1
 

Variance of First time where

August 16, 2006 - 4:29 am UTC

Reviewer: Milan from Prague

Hi Tom,
thanks for your effort. Ok, it works fine for the value.

But I have to work with the sum defined above (the sum of the latest values for all sale_type_id for one sale_id) instead of the value. I have trouble substituting that sum for the value in your query.

I can do it with PL/SQL function, but I would like to solve it with one query (if it is possible).

Tom Kyte

Followup  

August 16, 2006 - 8:35 am UTC

you need to really lay this out like a specification.

Pretend for a moment I was a coder.
Now, write the specification that a coder can use to produce code from.

Variance of First time where

August 16, 2006 - 10:42 am UTC

Reviewer: Milan from Prague

Hi Tom,

here is specification:

I have a history information about values.

The value cannot be empty; it can be negative, zero, or positive. The value is of some type, belongs to some object, and relates to some timestamp.

The object can have many values of various (or the same) types. These values can relate to various (or the same) timestamps, but the combination of object, type, value, timestamp has to be unique.

I need a SUM of latest values of every type - for the object and given timestamp.

Example:
timestamp object type value
---------- ------ ---- -----
...
03.08.2006 1 3 0
05.08.2006 1 1 2
05.08.2006 1 2 3 <-- the latest value for type = 2 and timestamp <= 06.08.2006
06.08.2006 1 1 1 <-- the latest value for type = 1 and timestamp <= 06.08.2006
06.08.2006 1 3 2 <-- the latest value for type = 3 and timestamp <= 06.08.2006


My SUM is then 1 + 3 + 2 = 6 for object = 1 and timestamp <= 06.08.2006
My SUM is then 2 + 3 + 0 = 5 for object = 1 and timestamp <= 05.08.2006
...

If this SUM is positive I have to find a timestamp in the history - the beginning of the continuous run of positive SUMs

Example:
timestamp object SUM for timestamp
---------- ------ -----------------
...
04.08.2006 1 0 -- zero
05.08.2006 1 5 -- positive <=== the beginning of the continuous run
06.08.2006 1 6 -- positive

That's all. Sorry, I can not express it better.

========================
create table object_hist
(
tmstmp date,
object_id number(4),
value_type number(4),
value number(4)
)

insert into object_hist values (to_date('01082006','ddmmyyyy'), 1, 1, 1);
insert into object_hist values (to_date('01082006','ddmmyyyy'), 1, 2, 1);
insert into object_hist values (to_date('01082006','ddmmyyyy'), 1, 3, 1);
insert into object_hist values (to_date('02082006','ddmmyyyy'), 1, 1, 0);
insert into object_hist values (to_date('02082006','ddmmyyyy'), 1, 2, 0);
insert into object_hist values (to_date('02082006','ddmmyyyy'), 1, 3, 1);
insert into object_hist values (to_date('03082006','ddmmyyyy'), 1, 3, 0);
insert into object_hist values (to_date('05082006','ddmmyyyy'), 1, 1, 2);
insert into object_hist values (to_date('05082006','ddmmyyyy'), 1, 2, 3);
insert into object_hist values (to_date('06082006','ddmmyyyy'), 1, 1, 1);
insert into object_hist values (to_date('06082006','ddmmyyyy'), 1, 3, 2);



Tom Kyte

Followup  

August 16, 2006 - 11:16 am UTC

lost me here:
My SUM is then 1 + 3 + 2 = 6 for object = 1 and timestamp <= 06.08.2006
My SUM is then 2 + 3 + 0 = 5 for object = 1 and timestamp <= 05.08.2006

are we missing something like "this sum should be always the sum of the three most current records"???


reading on - you keeping using "sum" but seem to be looking VALUE only.


(and the example is un-interesting to test with until you add object "2" or some other object to analyze - unless missing from the spec is "we always look at one object only")

Could you rewrite this query to have less elapsed time.

August 22, 2006 - 9:14 pm UTC

Reviewer: Sreenivasa Rao from india

Hi Tom,
Currently this query takes more than 30 minutes and sometimes fails with an ORA-01555 error.
Could you rewrite this query to execute in a shorter span of time?
-------------------------------------------
select substr(a.owner,1,10),
substr(a.segment_name,1,40) "Table name",
substr(round(a.bytes),1,8) "Tsizes",
substr(b.index_name,1,40) "Index name",
substr(round(b.bytes),1,8) "Isizes"
from
(select owner,
segment_name,
bytes/1024/1024 bytes
from dba_segments
where segment_type = 'TABLE'
and owner not in ('SYS','SYSTEM','PERFSTAT','OUTLN','DBSNMP')
and segment_type not like '%PARTIT%') a,
(select dba_indexes.owner,
table_name,index_name,bytes/1024/1024 bytes
from dba_indexes,dba_segments
where dba_indexes.owner = dba_segments.owner
and index_name=segment_name
and dba_indexes.owner
not in ('SYS','SYSTEM','PERFSTAT',
'OUTLN','DBSNMP')) b
where a.owner = b.owner
and a.segment_name = b.table_name
and (b.bytes - a.bytes) > 10
------------------------------------------------

Thanks in advance,
Sreenivas.

Tom Kyte

Followup  

August 27, 2006 - 3:14 pm UTC

what is the purpose of such a query? what question is it trying to answer here.

the b.bytes-a.bytes part looks "interesting", what is that about?

A query on analytics

August 23, 2006 - 3:24 am UTC

Reviewer: Ravi

Tom,
My test case:

create table t3 (a number, b number)
/



insert into t3 values(1,2);
insert into t3 values(2,2);
insert into t3 values(3,3);
insert into t3 values (4,4);
insert into t3 values (1,2);
insert into t3 values(2,3);

My query:

SELECT a, b, COUNT ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) y2
, LAST_VALUE ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) b3_2
, FIRST_VALUE ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) b3_2_2
, SUM ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) x1
, ROW_NUMBER ( ) OVER ( PARTITION BY b ORDER BY a DESC ) b4
, ROW_NUMBER ( ) OVER ( PARTITION BY b ORDER BY a ASC ) b4_2
, FIRST_VALUE ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) b5
, LAST_VALUE ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) b6
, MIN ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) b7
, MAX ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) b9
, DENSE_RANK ( ) OVER ( PARTITION BY b ORDER BY a DESC ) b9_2
, DENSE_RANK ( ) OVER ( PARTITION BY b ORDER BY a ASC ) b9_2_1
, SUM ( a ) OVER ( PARTITION BY b ORDER BY a DESC ) x1
, SUM ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) x5
, MAX ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) b9_3
, MIN ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) b9_3_2
, SUM ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) x5
, COUNT ( a ) OVER ( PARTITION BY b ORDER BY a ASC ) y1
FROM t3
ORDER BY b DESC

Please can you confirm if I am right if I say, all cases the default Range Unbounded Preceding:

1) If I'd ordered by a column DESC, there can be only ONE unique First_Value and Max value for the ordered column.

1b) However I might expect the LAST_VALUE to keep changing as I traverse down the rows, depending on the ordered column

2)If I'd ordered by a column ASC, there can be only ONE unique Last_Value and Min value for the ordered column.

2b) However I might expect the FIRST_VALUE to keep changing as I traverse down the rows, depending on the ordered column

Tom Kyte

Followup  

August 27, 2006 - 3:19 pm UTC

1) if you order by a column desc, then first_value and max will be the same.

I am not sure what you mean by "there can be only ONE unique ..."??  there is always only ONE first_value and always only ONE max - by definition.  Whether they are unique in the grand scheme of things - I cannot say.


1b) by definition, yes.  the window is growing "down" like pulling a window shade down.  the bottom is getting further from the top.

2) No, last_value is changing.  Just like it did when you ordered desc

ops$tkyte%ORA10GR2> select x,
  2         min(x) over (order by x asc) min_x,
  3             last_value(x) over (order by x asc) last_x
  4    from t;

         X      MIN_X     LAST_X
---------- ---------- ----------
         1          1          1
         2          1          2


2b) no, by definition, the top of the window doesn't move

ops$tkyte%ORA10GR2> select x,
  2         min(x) over (order by x asc) min_x,
  3         last_value(x) over (order by x asc) last_x,
  4         first_value(x) over (order by x asc) first_x
  5    from t;

         X      MIN_X     LAST_X    FIRST_X
---------- ---------- ---------- ----------
         1          1          1          1
         2          1          2          1

 

Could you rewrite this query to have less elapsed time.

August 28, 2006 - 4:20 am UTC

Reviewer: Sreenivasa Rao from india

Hi Tom,

I know you will be surprised to see this reply.
Here we have a procedure (rather, it is standard practice) to rebuild an index if its size exceeds the size of the corresponding table.

So please rewrite the same query to shorten its elapsed time.

Regards,
Sreenivas.

Tom Kyte

Followup  

August 28, 2006 - 10:48 am UTC

why do you do that?

I'm not surprised, I'd love to hear the reasoning - don't you end up rebuilding the same indexes over and over and over again? So this doesn't actually save "disk space" over time, when you measured performance - does it go up, down or sideways after you do this?

wouldn't it be funny if

a) this didn't affect performance noticably
b) affected it negatively

given the amount of resources you expend running the query....




Could you rewrite this query to have less elapsed time.

August 29, 2006 - 3:27 pm UTC

Reviewer: Sreenivasa Rao from INDIA

Hi Tom,

Here we don't consider space merely for 1 or 2 GB. When it comes to performance, we rebuild indexes in off-peak hours since it is a DSS environment.

The only requirement is that when the process is initiated, it must not be affected by index fragmentation, since every day we do big uploads to the database at a specified time.

Regards,
Sreenivas.


Tom Kyte

Followup  

August 29, 2006 - 4:46 pm UTC

I do not think you got my point.

Have you ever measured the results of this?!?

(meaning, if you don't do it - what then? has anyone taken the time to actually verify this is something you WANT TO BE DOING, or do you just do it "because"?)

Converting row values to a single column

September 21, 2006 - 4:15 pm UTC

Reviewer: Jonty from NY

Hi,

Dept, Emp columns in a table.

I want to get results like

Dept1, 'emp1, emp2, emp3'
Dept2, 'emp11, emp22'
Dept3, 'emp9'

How can I get this ? Can Analytical function helps on this ?

Response will be highly appreciated.

Thanks

Jonty



Tom Kyte

Followup  

September 22, 2006 - 2:48 am UTC

how about looking in the other place(s) you asked this identical question?

Which is better

October 27, 2006 - 11:42 am UTC

Reviewer: Chris from Louisville, KY USA

Tom,
I have a result set that has unique groupings with many attribute columns, one of which is a value. The value column is the same for each row within a group. I want to add a column to my select statement that is the sum, over the whole set, of each group's value counted once. I have shown sample data below (where ttl is the column I'd like to add).

group value col1 ttl
----- ----- ----- ---
1 5 A 20
1 5 B 20
2 10 C 20
2 10 D 20
2 10 E 20
3 5 F 20

I was hoping to do this without a nested subquery, but can't seem to think of a way to do it. At this point my options seem to be:

A: use row_number() to sum the first row's

select group_col, value_col, col1, sum(rnval) over () ttl
from (
select group_col, value_col, col1,
decode(row_number() over (partition by group_col
order by group_col),
1, value_col, 0) rnval
from mytab)

B:

select group_col, value_col, col1,
sum(decode(drank,1,value_col,0)) over () ttl
from (
select group_col, value_col, col1,
dense_rank() over(partition by group_col order by group_col, rownum) drank
from mytab)

Would one of these be better than the other, and do you have yet a better approach you can think of?

Tom Kyte

Followup  

October 27, 2006 - 6:24 pm UTC

just use row_number() - seems the most intuitive, the dense_rank is just plain confusing and by referencing rownum - you have to sort of pre-materialize stuff.
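Tom's preferred option A, written out end to end (a sketch via Python's sqlite3; group_col/value_col/col1 come from the question, and CASE replaces DECODE):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytab (group_col INTEGER, value_col INTEGER, col1 TEXT);
    INSERT INTO mytab VALUES
     (1,5,'A'),(1,5,'B'),(2,10,'C'),(2,10,'D'),(2,10,'E'),(3,5,'F');
""")

rows = conn.execute("""
    SELECT group_col, value_col, col1,
           -- count each group's value exactly once (on its first row),
           -- then sum that over the whole result set
           SUM(CASE WHEN rn = 1 THEN value_col ELSE 0 END) OVER () AS ttl
      FROM (SELECT group_col, value_col, col1,
                   ROW_NUMBER() OVER (PARTITION BY group_col
                                      ORDER BY col1) AS rn
              FROM mytab)
     ORDER BY group_col, col1
""").fetchall()
```

Every row comes back with ttl = 20, i.e. 5 + 10 + 5 counted once per group.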



Analytical function to add values of two columns

December 08, 2006 - 9:31 am UTC

Reviewer: Robin from USA

Lets say you have an employee who works in two different departments.

So KING may work for dept 10 and dept 20.

His records might look like this:

DEPTNO EMPNO NAME SAL MON TUE WED
10 222 KING 1000 4 6 8
20 222 KING 400 4 2 1


We want to make sure that KING doesn't work more than a total of 40 hours in any one week (we don't want to pay him OT), but we don't care if he works over 8 on a particular day. Neither supervisor knows how many hours KING works for the other dept. He usually works 5 days a week (but I could only show part of the table without screwing up the formatting).

Anyway, I can use the following query to find the total for KING's daily hours.

SELECT EMPNO, NAME, SUM(MON) AS T_MON, SUM(TUE) AS T_TUE, SUM(WED) AS T_WED, SUM(THU) AS T_THU, SUM(FRI) AS T_FRI
FROM VW_WORKWEEK
GROUP BY EMPNO, NAME;

Results (with addition days):

EMPNO NAME T_SAL T_MON T_TUE T_WED T_THU T_FRI
222 KING 1400 8 8 9 10 8

But when I try to total the columns I get into trouble. What I want it to look like in the end is:

EMPNO NAME T_SAL T_WRK_WEEK
222 KING 1400 43

I've tried:

SELECT EMPNO, NAME, SUM(SAL) AS T_SAL, (SUM(MON) + SUM(TUE) + SUM(WED) + SUM(THU) + SUM(FRI)) AS T_WRK_WEEK
FROM VW_HOUR_COUNT
GROUP BY EMPNO, NAME;

Results:

EMPNO NAME T_SAL T_WRK_WEEK
222 KING 1400

Another try:

SELECT EMPNO, NAME, SUM(SAL) as T_SAL
FROM VW_WORKWEEK
GROUP BY EMPNO, NAME
HAVING (SUM(MON) + SUM(TUE) + SUM(WED) + SUM(THU) + SUM(FRI)) > 40;

Results (no rows returned):

EMPNO NAME T_SAL T_WRK_WEEK

And finally I've tried:

SELECT EMPNO, NAME, SUM(SUM(MON) AS T_MON, SUM(TUE) AS T_TUE, SUM(WED) AS T_WED, SUM(THU) AS T_THU, SUM(FRI) AS T_FRI)
FROM VW_WORKWEEK
GROUP BY EMPNO, NAME;

I get:

ORA-00923: FROM keyword not found where expected

Is it possible to do this?

thanks!




Tom Kyte

Followup  

December 09, 2006 - 12:19 pm UTC

one of the sums must have been null


nvl(sum(mon),0)+nvl......

I figured it out.

December 08, 2006 - 12:33 pm UTC

Reviewer: Robin

I forgot to include the NVL() function. DUH

Thanks anyway!
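The rule Robin ran into is worth spelling out: NULL is contagious through SQL arithmetic, so if any one SUM(...) is NULL the whole expression is NULL. A two-line sketch (Python's sqlite3; COALESCE plays the role of Oracle's NVL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Any NULL operand makes the whole arithmetic expression NULL...
broken = conn.execute("SELECT 8 + 8 + NULL + 10 + 8").fetchone()[0]

# ...so defaulting each term to 0, as Tom suggests with NVL, fixes it.
fixed = conn.execute("SELECT 8 + 8 + COALESCE(NULL, 0) + 10 + 8").fetchone()[0]
```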

Appropriate indexing?

January 23, 2007 - 11:19 am UTC

Reviewer: John Howard from London, UK

Tom,

Although loosely related to analytics this is more a question of indexing strategy. Sorry if it is a little off topic.

To illustrate my question I'll use the table ORDERS that comes with the OE sample schema (9.2).

 
SQL> desc oe.orders
 Name                      Null?    Type
 ------------------------- -------- --------------------------------------------
 ORDER_ID                  NOT NULL NUMBER(12)
 ORDER_DATE                NOT NULL TIMESTAMP(6) WITH LOCAL TIME ZONE
 ORDER_MODE                         VARCHAR2(8)
 CUSTOMER_ID               NOT NULL NUMBER(6)
 ORDER_STATUS                       NUMBER(2)
 ORDER_TOTAL                        NUMBER(8,2)
 SALES_REP_ID                       NUMBER(6)
 PROMOTION_ID                       NUMBER(6)



What I want is the most recent order with a not-null SALES_REP_ID for each customer. That is, for each CUSTOMER_ID I want the last record with a not-null SALES_REP_ID when ordered by ORDER_DATE. I know that I can do this easily with the RANK() function:

select customer_id,
        order_date,
        sales_rep_id,
        order_id
from
(select         customer_id,
                rank() over (partition by customer_id order by order_date desc) drank,
                order_date,
                sales_rep_id,
                order_id
 from   oe.orders
 where  sales_rep_id is not null
)
where drank = 1
/

CUSTOMER_ID ORDER_DATE                     SALES_REP_ID   ORDER_ID
----------- ------------------------------ ------------ ----------
        157 20-NOV-99 05.22.53.224175 AM            163       2398
        158 20-NOV-99 06.22.38.340990 AM            161       2399
        159 10-JUL-99 07.34.29.559387 AM            161       2400
        160 10-JUL-99 08.22.53.554822 AM            163       2401
        161 02-JUL-99 09.34.44.665170 AM            154       2402
....

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=70 Bytes=4550)
   1    0   VIEW (Cost=4 Card=70 Bytes=4550)
   2    1     WINDOW (SORT PUSHED RANK) (Cost=4 Card=70 Bytes=1680)
   3    2       TABLE ACCESS (FULL) OF 'ORDERS' (Cost=2 Card=70 Bytes=1680)


Now this is OK when the table is of such a size that a full scan is bearable. My question is regarding the best way (if any) to use an index to reduce the cost when the table becomes large but the number of customers remains fairly small. I was hoping that something like:

 (CUSTOMER_ID, ORDER_DATE DESC, SALES_REP_ID) 


could be "skip-scanned" in some way so that for each CUSTOMER_ID we would only need to traverse along to the first entry with non-null SALES_REP_ID then skip to the next CUSTOMER_ID.
Anything I've tried so far full scans the index.

Am I barking up the wrong tree - or am I just barking?

If so - what is the preferred method for keeping a query like this from becoming slower as time goes by?

Many thanks in advance,
John



Even Distribution

February 14, 2007 - 6:19 pm UTC

Reviewer: Ed

Hi Tom,
I'd like to ask about "Even-Distribution"
(May not be the right term)

Say I have a bunch of bowling players of different skill level as indicated by his avg_score in the table below.

I need to allot them into n teams (say 8) of equivalent strength at the TEAM level, so no team ends up with mostly high scorers and vice versa.

(And yes, let's say the players cannot be evenly divided among the teams.)

Would you shed some light on how to achieve this ?

Thanks

10gR2> create table players (id integer primary key, avg_score number, team_no integer) ;

10gR2> desc players
Name      Type   
--------- -------
ID        INTEGER                           
AVG_SCORE NUMBER                         
TEAM_NO   INTEGER

10gR2> BEGIN
  2    FOR i IN 1..113
  3    LOOP
  4        INSERT INTO players (id, avg_score)  VALUES(i,round(dbms_random.value(75,295)));
  5    END LOOP;
  6  END ;
  7  /

10gR2> commit;
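This follow-up is not answered in the excerpt above, but the usual technique is worth sketching: rank the players by avg_score descending and deal them out in a "snake draft" (team 1..8, then 8..1, and so on), which keeps the team totals close. In SQL this would be ROW_NUMBER() plus some MOD arithmetic; below is a plain-Python sketch of the same idea (the function name and the random seed are mine, purely illustrative):

```python
import random
from collections import Counter

def snake_draft(scores, n_teams):
    """Assign each player a team number 1..n_teams, picking players in
    descending-score order and reversing direction every round
    (1,2,...,n then n,...,2,1) so team totals stay balanced."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    teams = [0] * len(scores)
    for pos, i in enumerate(order):
        round_no, slot = divmod(pos, n_teams)
        teams[i] = slot + 1 if round_no % 2 == 0 else n_teams - slot
    return teams

random.seed(1)                                    # stand-in for dbms_random
scores = [round(random.uniform(75, 295)) for _ in range(113)]
teams = snake_draft(scores, 8)

totals = Counter()
for s, t in zip(scores, teams):
    totals[t] += s
print(sorted(totals.items()))
```

In Oracle, the same assignment could be computed from row_number() over (order by avg_score desc), using MOD and integer division to derive the round and slot.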


Analytic Function?

April 11, 2007 - 3:36 pm UTC

Reviewer: Bea from Spain

Hi Tom,

I have this table with these values:

YEAR VALUE1
------------
2007 1.04
2008 1.04
2009 1.04
2010 1.04
2011 1.04

And I have to get this result

YEAR VALUE2
-----------
2007 1.04    (same value)
2008 1.0816  (1.04 (2007) * 1.04 (2008))
2009 1.12486 (1.0816 (2008) * 1.04 (2009))
2010 1.16985 (1.12486 (2009) * 1.04 (2010))
2011 1.21664 (1.16985 (2010) * 1.04 (2011))

That is:
For 2007, the same value in the table
For 2008, the value from the table*previous value in 2007
For 2009, the value from the table*calculated previous value in 2008

and so on

Thanks!!!





Tom Kyte

Followup  

April 11, 2007 - 5:53 pm UTC

ops$tkyte%ORA10GR2> select year, v, exp( sum( ln(v) ) over (order by year) )
  2    from t;

      YEAR          V EXP(SUM(LN(V))OVER(ORDERBYYEAR))
---------- ---------- --------------------------------
      2007       1.04                             1.04
      2008       1.04                           1.0816
      2009       1.04                         1.124864
      2010       1.04                       1.16985856
      2011       1.04                        1.2166529
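The trick works because ln turns multiplication into addition: exp(sum(ln(v))) over an ordered window is a running product. A quick Python check of the arithmetic, using the 1.04 factors from the question (note that ln requires v > 0, so zero or negative factors would need special handling in real data):

```python
import math

values = [(2007, 1.04), (2008, 1.04), (2009, 1.04), (2010, 1.04), (2011, 1.04)]

running_log = 0.0
result = []
for year, v in values:
    running_log += math.log(v)                    # sum(ln(v)) over (order by year)
    result.append((year, math.exp(running_log)))  # exp(...) turns it back into a product

for year, prod in result:
    print(year, round(prod, 7))
```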


April 11, 2007 - 9:39 pm UTC

Reviewer: Eka from Indonesia

Hi Tom,

Here's the table:

SQL> create table test(goods_id varchar2(1),trx_date date,lastprod_qty number);
SQL> insert into test values('A',to_date('010407','ddmmyy'),20);
SQL> insert into test values('A',to_date('030407','ddmmyy'),0);
SQL> insert into test values('A',to_date('040407','ddmmyy'),0);
SQL> insert into test values('A',to_date('060407','ddmmyy'),80);
SQL> insert into test values('B',to_date('010407','ddmmyy'),50);
SQL> insert into test values('B',to_date('060407','ddmmyy'),10);
SQL> insert into test values('B',to_date('090407','ddmmyy'),0);

SQL> select * from test order by goods_id,trx_date;

GOODS_ID TRX_DATE    LASTPROD_QTY
-------- ----------- ------------
A        4/1/2007              20
A        4/3/2007               0
A        4/4/2007               0
A        4/6/2007              80
A        4/10/2007              0
B        4/1/2007              50
B        4/6/2007              10
B        4/9/2007               0

8 rows selected

The goal is to update the column lastprod_qty, where it is 0, with the last non-zero value. I want the output to look like this:
GOODS_ID TRX_DATE    LASTPROD_QTY
-------- ----------- ------------
A        4/1/2007              20
A        4/3/2007              20
A        4/4/2007              20
A        4/6/2007              80
A        4/10/2007             80
B        4/1/2007              50
B        4/6/2007              10
B        4/9/2007              10

I can easily accomplish that with PL/SQL, looping over the rows one by one, but I wonder: is there a better solution? This table will be very big, and I hope I can use a single SQL statement, since it will be much faster, but I don't have any idea how to do that. I tried the LAG analytic function, but it didn't help in this case.
Tom, do you have any suggestion?

Thanks in advance,
Eka


Tom Kyte

Followup  

April 12, 2007 - 8:57 am UTC

ops$tkyte%ORA10GR2> select *
  2    from (
  3  select rowid rid, goods_id, trx_date, lastprod_qty,
  4         last_value( decode(lastprod_qty,0,to_number(null),lastprod_qty) ignore nulls) over (partition by goods_id order by trx_date) last_qty
  5    from test
  6         )
  7   where lastprod_qty = 0;

RID                G TRX_DATE  LASTPROD_QTY   LAST_QTY
------------------ - --------- ------------ ----------
AAARU+AAEAAAAHPAAB A 03-APR-07            0         20
AAARU+AAEAAAAHPAAC A 04-APR-07            0         20
AAARU+AAEAAAAHPAAE A 10-APR-07            0         80
AAARU+AAEAAAAHPAAH B 09-APR-07            0         10

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> merge into test
  2  using(
  3  select *
  4    from (
  5  select rowid rid, goods_id, trx_date, lastprod_qty,
  6         last_value( decode(lastprod_qty,0,to_number(null),lastprod_qty) ignore nulls) over (partition by goods_id order by trx_date) last_qty
  7    from test
  8         )
  9   where lastprod_qty = 0
 10      ) test2
 11  on (test.rowid = test2.rid)
 12  when matched then update set lastprod_qty = test2.last_qty;

4 rows merged.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from test order by goods_id, trx_date;

G TRX_DATE  LASTPROD_QTY
- --------- ------------
A 01-APR-07           20
A 03-APR-07           20
A 04-APR-07           20
A 06-APR-07           80
A 10-APR-07           80
B 01-APR-07           50
B 06-APR-07           10
B 09-APR-07           10

8 rows selected.



or for 9i

ops$tkyte%ORA9IR2> select *
  2    from (
  3  select rowid rid, goods_id, trx_date, lastprod_qty,
  4         to_number(substr(max( case when lastprod_qty <> 0 then to_char(trx_date,'yyyymmdd')||lastprod_qty end) over (partition by goods_id order by trx_date),9)) last_qty
  5    from test
  6         )
  7   where lastprod_qty = 0;

RID                G TRX_DATE  LASTPROD_QTY   LAST_QTY
------------------ - --------- ------------ ----------
AAAJrxAAJAAAAGWAAB A 03-APR-07            0         20
AAAJrxAAJAAAAGWAAC A 04-APR-07            0         20
AAAJrxAAJAAAAGWAAE A 10-APR-07            0         80
AAAJrxAAJAAAAGWAAH B 09-APR-07            0         10

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> merge into test
  2  using
  3  (
  4  select *
  5    from (
  6  select rowid rid, goods_id, trx_date, lastprod_qty,
  7         to_number(substr(max( case when lastprod_qty <> 0 then to_char(trx_date,'yyyymmdd')||lastprod_qty end) over (partition by goods_id order by trx_date),9)) last_qty
  8    from test
  9         )
 10   where lastprod_qty = 0
 11  ) test2
 12  on (test.rowid = test2.rid)
 13  when matched then update set lastprod_qty = last_qty
 14  when not matched then insert (goods_id) values (null) /* never happens! */
 15  /

4 rows merged.

ops$tkyte%ORA9IR2> select * from test order by goods_id, trx_date;

G TRX_DATE  LASTPROD_QTY
- --------- ------------
A 01-APR-07           20
A 03-APR-07           20
A 04-APR-07           20
A 06-APR-07           80
A 10-APR-07           80
B 01-APR-07           50
B 06-APR-07           10
B 09-APR-07           10

8 rows selected.
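The logic of both versions is the same: within each goods_id, scan in trx_date order and carry the most recent non-zero quantity forward. Here is that carry-forward rule in plain Python (illustrative only; the data mirrors the test table above):

```python
from itertools import groupby

rows = [
    ('A', '2007-04-01', 20), ('A', '2007-04-03', 0), ('A', '2007-04-04', 0),
    ('A', '2007-04-06', 80), ('A', '2007-04-10', 0),
    ('B', '2007-04-01', 50), ('B', '2007-04-06', 10), ('B', '2007-04-09', 0),
]

def fill_forward(rows):
    out = []
    # "partition by goods_id order by trx_date" becomes sort + groupby
    for goods, grp in groupby(sorted(rows), key=lambda r: r[0]):
        last = None                      # last non-zero value seen so far
        for _, d, qty in grp:
            if qty != 0:
                last = qty
            out.append((goods, d, last if qty == 0 else qty))
    return out

print(fill_forward(rows))
```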


To Eka

April 12, 2007 - 2:22 am UTC

Reviewer: Michel CADOT from France


SQL> select * from test order by goods_id,trx_date;

G TRX_DATE   LASTPROD_QTY
- ---------- ------------
A 01/04/2007           20
A 03/04/2007            0
A 04/04/2007            0
A 06/04/2007           80
B 01/04/2007           50
B 06/04/2007           10
B 09/04/2007            0

7 rows selected.

SQL> select goods_id, trx_date, lastprod_qty,
  2         last_value(nullif(lastprod_qty,0) ignore nulls) 
  3           over (partition by goods_id order by trx_date) new_qty
  4  from test
  5  order by goods_id, trx_date
  6  /

G TRX_DATE   LASTPROD_QTY    NEW_QTY
- ---------- ------------ ----------
A 01/04/2007           20         20
A 03/04/2007            0         20
A 04/04/2007            0         20
A 06/04/2007           80         80
B 01/04/2007           50         50
B 06/04/2007           10         10
B 09/04/2007            0         10

7 rows selected.

SQL> update test a 
  2  set lastprod_qty = 
  3        ( select new_qty
  4          from ( select last_value(nullif(lastprod_qty,0) ignore nulls) 
  5                          over (partition by goods_id order by trx_date) new_qty,
  6                        rowid row_id
  7                 from test ) b
  8          where a.rowid = b.row_id )
  9  where lastprod_qty = 0
 10  /

3 rows updated.

SQL> select * from test order by goods_id,trx_date;

G TRX_DATE   LASTPROD_QTY
- ---------- ------------
A 01/04/2007           20
A 03/04/2007           20
A 04/04/2007           20
A 06/04/2007           80
B 01/04/2007           50
B 06/04/2007           10
B 09/04/2007           10

7 rows selected.

Regards
Michel

Thanks Michel

April 12, 2007 - 3:58 am UTC

Reviewer: eka@i4te.com from Indonesia


What is the most efficient way to get top N rows

April 13, 2007 - 12:02 am UTC

Reviewer: Shree TVM from MD USA

Dear Tom,
We have an application that queries the top-N rows, based on a filter, from a very large table. We are currently using the DENSE_RANK() function like this:

select a.*
from (select c1,c2,c3, dense_rank() over (order by c1 desc,c3 desc) as rnk
from t1
where c1 < 12345
and c3 < sysdate
and c2 = 'XYZ' ) a
where rnk < :N
order by a.c1,a.c3;

Is this the most efficient way to achieve this?

I have been a reader for many, many years and am posting a question for the very first time.

Tom Kyte

Followup  

April 13, 2007 - 2:04 pm UTC

why do you order by c1 desc, c3 desc in the rank - but not the final output?

What is the most efficient way to get top N rows

April 14, 2007 - 9:55 am UTC

Reviewer: Shree from MD USA

The application expects the rowset in ascending order of c1, c3.

Tom Kyte

Followup  

April 16, 2007 - 10:38 am UTC

ok, then it is as good as it is going to get.


I presume c1, c3 is NOT unique.
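For reference, here is what DENSE_RANK computes in Shree's query, rendered in plain Python (a sketch with made-up rows): ties share a rank and ranks have no gaps, so "rnk < :N" can return more than N rows when (c1, c3) is not unique, which is exactly why the uniqueness question matters.

```python
def dense_rank_desc(rows, keyfn):
    """Return (rank, row) pairs; equal keys share a rank, ranks have no gaps."""
    ordered = sorted(rows, key=keyfn, reverse=True)
    ranks, rank, prev = [], 0, object()
    for row in ordered:
        if keyfn(row) != prev:          # new key value -> next dense rank
            rank += 1
            prev = keyfn(row)
        ranks.append((rank, row))
    return ranks

# hypothetical (c1, c2, c3) rows, with a tie on (c1, c3)
rows = [(12000, 'XYZ', 5), (11000, 'XYZ', 5), (11000, 'XYZ', 5), (9000, 'XYZ', 1)]
top = [r for rk, r in dense_rank_desc(rows, lambda r: (r[0], r[2])) if rk < 3]
print(top)
```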

Analytical Function

April 15, 2007 - 10:37 am UTC

Reviewer: Shree from MD USA

I have a unique requirement. There is a table A (huge, 15+ million rows) in Oracle 9iR2, like:

Table A(id,cd,cdate,uid);

111,1,02/04/2007,3
112,4,02/04/2007,3
112,5,02/04/2007,3
113,5,02/04/2007,3
114,4,02/04/2007,3
115,7,02/04/2007,3
115,4,02/04/2007,3
116,2,02/04/2007,3
116,3,02/04/2007,3
117,3,02/05/2007,3
117,5,02/05/2007,3

want an output like

111,1,02/04/2007,3
112,4,02/04/2007,3
113,5,02/04/2007,3
114,4,02/04/2007,3
115,7,02/04/2007,3
115,4,02/04/2007,3
116,2,02/04/2007,3
117,3,02/05/2007,3
117,5,02/05/2007,3

If, for an id, cd 4 and cd 5 both exist, merge them (i.e. eliminate the row with cd 5) and keep the row with cd 4 (e.g. ID 112). Likewise, if cd 2 and cd 3 both exist for an id, merge them (eliminate the row with cd 3) and keep the row with cd 2 (e.g. ID 116). All other rows will appear in the output as is.

Is there any way I can accomplish this with a SQL statement, maybe some analytic function? My goal is to do it as efficiently as possible.


Thanks
Tom Kyte

Followup  

April 16, 2007 - 1:02 pm UTC

no creates
no inserts
no look

Analytical Function

April 25, 2007 - 6:38 am UTC

Reviewer: Krish from India

Hi Tom,

Could you please help me with an easier approach to get the below result from emp table?

EMPNO ENAME 2_MAXSAL DEPTNO EMPNO ENAME 3_MAXSAL
9999 CLARK 2450 10 7934 MILLER 1300
7566 JONES 2975 20 7876 ADAMS 1100
7499 ALLEN 1600 30 7844 TURNER 1500

Ideally I am trying to get the 2nd- and 3rd-highest salary details in a single row.

I was trying to write a Top-N query with inline views (virtual tables), but couldn't succeed.

Thanks in Advance.
Krish.
Tom Kyte

Followup  

April 25, 2007 - 10:18 am UTC

scott%ORA10GR2> select deptno,
  2         max( decode( rn, 2, empno ) ) empno2,
  3         max( decode( rn, 2, ename ) ) ename2,
  4         max( decode( rn, 2, sal ) ) sal2,
  5         max( decode( rn, 3, empno ) ) empno3,
  6         max( decode( rn, 3, ename ) ) ename3,
  7         max( decode( rn, 3, sal ) ) sal3
  8    from (
  9  select *
 10    from (
 11  select empno, ename, deptno, sal,
 12         dense_rank() over (partition by deptno order by sal DESC nulls last)rn
 13    from emp
 14         )
 15   where rn between 2 and 3
 16         )
 17   group by deptno
 18   order by deptno
 19  /

    DEPTNO     EMPNO2 ENAME2           SAL2     EMPNO3 ENAME3           SAL3
---------- ---------- ---------- ---------- ---------- ---------- ----------
        10       7782 CLARK            2450       7934 MILLER           1300
        20       7566 JONES            2975       7876 ADAMS            1100
        30       7499 ALLEN            1600       7844 TURNER           1500



but, beware of what you ask for - ask yourself "what if there were 5 people making 2,975 in deptno=20 - what would happen then?"
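The shape of the query - rank inside, filter to ranks 2..3, then collapse with max(decode(...)) grouped by deptno - can be sketched in plain Python like this (illustrative only; unlike DENSE_RANK, the simple enumerate below ignores ties, which is precisely the caveat about five people making 2,975):

```python
from collections import defaultdict

# (empno, ename, deptno, sal) - a few emp rows for deptno 10
emp = [(7839, 'KING', 10, 5000), (7782, 'CLARK', 10, 2450), (7934, 'MILLER', 10, 1300)]

bydept = defaultdict(list)
for e in emp:
    bydept[e[2]].append(e)              # partition by deptno

pivot = {}
for deptno, rows in bydept.items():
    rows.sort(key=lambda r: -r[3])                      # order by sal desc
    ranked = {rn + 1: r for rn, r in enumerate(rows)}   # rank 1, 2, 3, ... (no ties)
    pivot[deptno] = (ranked.get(2), ranked.get(3))      # max(decode(rn, 2/3, ...))

print(pivot[10])
```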



Got the answer TOM

April 25, 2007 - 8:58 am UTC

Reviewer: Krish from India

Hi TOM,

I was able to find the answer by myself.
but could you please have a look at it and suggest a better way?

select a.ename, a.sal,a.deptno, b.ename, b.sal from

(select deptno, ename, sal from (SELECT * FROM (
SELECT deptno, ename, sal, ROW_NUMBER()
OVER (
PARTITION BY deptno ORDER BY sal DESC
) Top2 FROM scott.emp
)
WHERE Top2 <= 2) where top2 = 2 ) a join

(select deptno, ename, sal from (SELECT * FROM (
SELECT deptno, ename, sal, ROW_NUMBER()
OVER (
PARTITION BY deptno ORDER BY sal DESC
) Top3 FROM scott.emp
)
WHERE Top3 <= 3) where top3 = 3) b on(a.deptno=b.deptno)
Tom Kyte

Followup  

April 25, 2007 - 10:38 am UTC

see above.

Thanks TOM

May 08, 2007 - 3:12 am UTC

Reviewer: Krish from Bangalore,India

The Best as usual...
Yes, you are right Tom. I didn't think about it.

May 24, 2007 - 3:40 pm UTC

Reviewer: VR from Boston, USA

Tom,
Given the following:

create table maxtest(a int, b date, c date);
insert into maxtest values(1, '2006-09-12','2006-12-31');
insert into maxtest values(1, '2006-08-03','9999-12-31');
insert into maxtest values(1, '2006-03-07','2006-10-15');
insert into maxtest values(1, '2006-05-06','9999-12-31');
insert into maxtest values(1, '2006-02-01','2006-12-31');


I need the following result set.

1,'2006-08-03','9999-12-31'

I basically need a way to get the max value of b for a given max value of c. The max value of c is 9999-12-31, and although 2006-09-12 is the max value of b, it is not associated with the max value of c; i.e. for 9999-12-31, which is the max value of c, the max b value is 2006-08-03.

How can I achieve this in one shot, without scanning the table multiple times? Can I use analytic functions to accomplish this?

Thanks
VR
Tom Kyte

Followup  

May 26, 2007 - 11:46 am UTC

ops$tkyte%ORA10GR2> select *
  2    from (select t1.*, row_number() over (partition by a order by c DESC nulls last) rn
  3            from maxtest t1
  4         )
  5   where rn = 1;

         A B          C                  RN
---------- ---------- ---------- ----------
         1 2006-08-03 9999-12-31          1

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select a,
  2         to_date( substr( data, 15 ),'yyyymmddhh24miss') b,
  3         to_date( substr( data, 1, 14 ), 'yyyymmddhh24miss' ) c
  4    from (
  5  select a, max( to_char(c,'yyyymmddhh24miss') || to_char(b,'yyyymmddhh24miss') )  data
  6    from maxtest t1
  7   group by a
  8         );

         A B          C
---------- ---------- ----------
         1 2006-08-03 9999-12-31

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select a,
  2         max(b) KEEP (dense_rank first order by c desc nulls last) b,
  3         max(c)
  4    from maxtest t1
  5   group by a;

         A B          MAX(C)
---------- ---------- ----------
         1 2006-08-03 9999-12-31


May 24, 2007 - 6:18 pm UTC

Reviewer: A reader

Tom,
This is in reference to my question above. The best I can come up with is the query below.

select * from
(select a,row_number() over(partition by a order by b desc) rn, b, c from maxtest where (a,c) in
(select a, max(c) from maxtest group by a)
)
where rn = 1

As you can see it scans the table twice, and I would like to avoid that if at all possible.

Thanks
VR

May 26, 2007 - 12:50 pm UTC

Reviewer: A reader

Thanks a lot Tom. However, with the first solution:

ops$tkyte%ORA10GR2> select *
2 from (select t1.*, row_number() over (partition by a order by c DESC nulls last) rn
3 from maxtest t1
4 )
5 where rn = 1;

A B C RN
---------- ---------- ---------- ----------
1 2006-08-03 9999-12-31 1

Is this result always guaranteed? Since we are only ordering by C DESC and there are two records with 9999-12-31, is there a possibility that the other record will be returned? In other words can the query return the following depending on the path the optimizer takes?

A B C RN
---------- ---------- ---------- ----------
1 2006-05-06 9999-12-31 1

Tom Kyte

Followup  

May 26, 2007 - 1:04 pm UTC

you are right, we need to

order by C desc nulls last, B desc nulls last


probably,

order by c desc, b desc

and add "where c is not null and b is not null" to the inline view instead, as there could be a case where C is not null but B is null for all of those C's.

but you can decide on how to deal with NULLS

good catch - thanks.

Grouping Rule In Oracle Applications

May 28, 2007 - 2:39 am UTC

Reviewer: vishnu from USA

Tom,

Thanks for providing this forum for answering questions related to Oracle.

I had one question related to Oracle Apps invoicing.
Is there any rule for that, like a Grouping Rule, in the AP or AR modules?

I have to do this

May 28, 2007 - 6:48 pm UTC

Reviewer: A reader

I have a TABLE xx_contract_detials
the COLUMNS are
contract_type VARCHAR2(100),
region VARCHAR2(100),
contract_NO VARCHAR2(1000),
organizations NUMBER,
dealsize NUMBER,
customer_name VARCHAR2(1000),
salesrep VARCHAR2(1000),
contract_effective_date DATE,
contract_desc VARCHAR2(1000),
contract_status VARCHAR2(100),
version_comments VARCHAR2(1000),
org_id NUMBER,
authoring_party_code VARCHAR2(1000),
owner_id NUMBER,
contract_expriation_date DATE

I need the output like this:

Type Grouping of: Special Bid / Standard /Clause Standard Contract / Custom
Geo Sales region (under grouping of the type)
< $50K Totals under $50K per region per type
$50K - $250K Totals between $50K and $250K per region per type
> $250 Totals over $250K per region per type
Totals Totals of rows


Tom Kyte

Followup  

May 30, 2007 - 10:18 am UTC

no create table
no insert into table
no look

and use the CODE BUTTON for fixed fonts.

Correlating timestamps across rows

May 29, 2007 - 7:40 pm UTC

Reviewer: David Wellborn from Richardson, TX

I have a table, like a log file, that captures disconnection and connection events for a device and timestamps them. Sometimes I get extra connections even though there is no disconnection. I'd like a query to correlate the connections and disconnections on a single row.

Here is the SQL to demonstrate:
CREATE TABLE CONN_HIST
(
  ID               NUMBER,
  CONNTYPE         VARCHAR2(3 BYTE),
  EVENT            VARCHAR2(4 BYTE),
  DEV_ID           VARCHAR2(24 BYTE),
  EVENT_TIMESTAMP  DATE 
)
LOGGING 
NOCACHE
NOPARALLEL
NOMONITORING;

COMMENT ON COLUMN CONN_HIST.CONNTYPE IS '1=RDR 2=SSR 3=CWA';

COMMENT ON COLUMN CONN_HIST.EVENT IS '1=CONN 2=DISC';

COMMENT ON COLUMN CONN_HIST.EVENT_TIMESTAMP IS 'TIME OF THE CONNECT OR DISCONNECT EVENT';

ALTER TABLE CONN_HIST ADD (
  CONSTRAINT CONN_HIST_PK
 PRIMARY KEY
 (ID));
 
 INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
2, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '04/10/2007 07:38:36 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
104, 'SSR', 'DISC', '07041007383600000000562D',  TO_Date( '05/29/2007 04:18:22 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
105, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '05/29/2007 04:26:26 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
106, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '05/29/2007 05:24:06 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
107, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '05/29/2007 05:24:27 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
108, 'SSR', 'DISC', '07041007383600000000562D',  TO_Date( '05/29/2007 05:27:50 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO CONN_HIST ( ID, CONNTYPE, EVENT, DEV_ID,
EVENT_TIMESTAMP ) VALUES ( 
109, 'SSR', 'CONN', '07041007383600000000562D',  TO_Date( '05/29/2007 05:28:25 PM', 'MM/DD/YYYY HH:MI:SS AM')); 
COMMIT;

SELECT ID, CONNTYPE, EVENT, DEV_ID, TO_CHAR(EVENT_TIMESTAMP, 'DD-MON-YYYY HH24:MI:SS') EVENT_TIME 
FROM CONN_HIST ORDER BY EVENT_TIMESTAMP DESC;



And the output:

Table created.


Comment created.


Comment created.


Comment created.


Table altered.


1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


Commit complete.


        ID CON EVEN DEV_ID                   EVENT_TIME
---------- --- ---- ------------------------ --------------------
       109 SSR CONN 07041007383600000000562D 29-MAY-2007 17:28:25
       108 SSR DISC 07041007383600000000562D 29-MAY-2007 17:27:50
       107 SSR CONN 07041007383600000000562D 29-MAY-2007 17:24:27
       106 SSR CONN 07041007383600000000562D 29-MAY-2007 17:24:06
       105 SSR CONN 07041007383600000000562D 29-MAY-2007 16:26:26
       104 SSR DISC 07041007383600000000562D 29-MAY-2007 16:18:22
         2 SSR CONN 07041007383600000000562D 10-APR-2007 19:38:36

7 rows selected.


Is there any way using an analytical query I could see the data like this:

CON DEV_ID                   DISC_TIME            CONN_TIME
--- ------------------------ -------------------- --------------------
SSR 07041007383600000000562D 29-MAY-2007 16:18:22 29-MAY-2007 16:26:26
SSR 07041007383600000000562D                      29-MAY-2007 17:24:06
SSR 07041007383600000000562D                      29-MAY-2007 17:24:27
SSR 07041007383600000000562D 29-MAY-2007 17:27:50 29-MAY-2007 17:28:25


Tom Kyte

Followup  

May 30, 2007 - 11:26 am UTC

can you explain your output better?

disconnect time < connect time??

do you want a row with event = 'conn' to be merged with a row where event = 'disc' if that row is the row before it?

what about id = 2?

Disconnect / Connect example

May 30, 2007 - 4:08 pm UTC

Reviewer: David Wellborn from Richardson, TX

We are interested in periods of disconnected time. This is why the disconnect timestamp is < connect timestamp.

So, yes, I want to merge the row as you stated above. The connect time should be >= the most recent disconnect time for the device.

I inadvertently left id 2 out of my desired result set.
Tom Kyte

Followup  

May 30, 2007 - 4:16 pm UTC

ops$tkyte%ORA10GR2> select conntype, dev_id, disc_time, conn_time
  2    from (
  3  select conntype, dev_id,
  4         case when lag(event) over (order by event_timestamp) = 'DISC'
  5              then lag(event_timestamp) over (order by event_timestamp)
  6                  end disc_time,
  7                  event_timestamp conn_time,
  8                  event
  9    from conn_hist
 10         )
 11   where event = 'CONN'
 12  /

CON DEV_ID                   DISC_TIME            CONN_TIME
--- ------------------------ -------------------- --------------------
SSR 07041007383600000000562D                      10-apr-2007 19:38:36
SSR 07041007383600000000562D 29-may-2007 16:18:22 29-may-2007 16:26:26
SSR 07041007383600000000562D                      29-may-2007 17:24:06
SSR 07041007383600000000562D                      29-may-2007 17:24:27
SSR 07041007383600000000562D 29-may-2007 17:27:50 29-may-2007 17:28:25


May 30, 2007 - 6:16 pm UTC

Reviewer: Joe Reynolds from Philadelphia, PA

Using the same data set as above, is it possible to generate this result set using SQL?
DEV_ID                   ONTIME               OFFTIME
------------------------ -------------------- --------------------
07041007383600000000562D 4/10/2007 7:38:36 PM 4/10/2007 11:59:59 PM
07041007383600000000562D 5/29/2007 4:26:26 PM 5/29/2007 5:27:50 PM 
07041007383600000000562D 5/29/2007 5:28:25 PM 5/29/2007 11:59:59 PM

A description is: The DEV_ID stays connected until it either disconnects or the day ends. Ignore extra disconnects (i.e. without a connect) and extra connects (connect if already connected).

I tried using this query, but I can't figure out how to combine rows 2-4 into a single row:
select rownum, dev_id, EVENT_TIMESTAMP ontime, offtime from
 (    
select conntype, event, dev_id, EVENT_TIMESTAMP,
  case when event ='CONN' AND lead(conntype) over (partition by dev_id, trunc(EVENT_TIMESTAMP) order by EVENT_TIMESTAMP) is not null
            then lead(EVENT_TIMESTAMP) over (partition by dev_id, trunc(EVENT_TIMESTAMP) order by EVENT_TIMESTAMP)
       when event ='CONN' AND lead(conntype) over (partition by dev_id, trunc(EVENT_TIMESTAMP) order by EVENT_TIMESTAMP) is null
            then trunc(EVENT_TIMESTAMP+1)-1/(24*60*60)
       else null
  end offtime
from conn_hist
 ) 
where offtime is not null 
order by ontime

ROWNUM DEV_ID                   ONTIME               OFFTIME              
------ ------------------------ -------------------- -------------------- 
1      07041007383600000000562D 4/10/2007 7:38:36 PM 4/10/2007 11:59:59 PM
2      07041007383600000000562D 5/29/2007 4:26:26 PM 5/29/2007 5:24:06 PM 
3      07041007383600000000562D 5/29/2007 5:24:06 PM 5/29/2007 5:24:27 PM 
4      07041007383600000000562D 5/29/2007 5:24:27 PM 5/29/2007 5:27:50 PM 
5      07041007383600000000562D 5/29/2007 5:28:25 PM 5/29/2007 11:59:59 PM


Any suggestions? As usual, thanks a million for all these years of help.
Tom Kyte

Followup  

May 30, 2007 - 6:31 pm UTC

you switched from 24 hour to am/pm - very confusing.


and what happens if you connect on april 10th, and disconnect on april 11th - does the end of day trump or is there a fictitious midnight connect created or what.

Whoopsy on the datemask!

May 30, 2007 - 8:09 pm UTC

Reviewer: Joe Reynolds from Philadelphia, PA

Datemask switcheroo is annoying... sorry about that.

All connections end at midnight, so there's a hard disconnect at 23:59:59. I know it seems odd, but there is a logical business reason why the data comes in that way.

My query could probably use some tweaking (still working to understand analytics), but I'm really struggling with trying to fold the 3 connect rows into a single row.

Thanks again.
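This follow-up is not answered in the excerpt, but folding the consecutive CONN rows is a classic gaps-and-islands problem. A hedged sketch of the rule (mine, not the thread's answer): only the first CONN of a run opens a session, only a DISC that follows a CONN closes one, and a run still open at end of day closes at 23:59:59. In SQL the "first CONN of a run" test would again be lag(event); in Python, for the May 29 events only (day-boundary partitioning omitted for brevity):

```python
events = [
    ('CONN', '2007-05-29 16:26:26'), ('CONN', '2007-05-29 17:24:06'),
    ('CONN', '2007-05-29 17:24:27'), ('DISC', '2007-05-29 17:27:50'),
    ('CONN', '2007-05-29 17:28:25'),
]

sessions, on_time, prev = [], None, None
for event, ts in events:
    if event == 'CONN' and prev != 'CONN':
        on_time = ts                     # first CONN of a run opens a session
    elif event == 'DISC' and prev == 'CONN':
        sessions.append((on_time, ts))   # a DISC after a CONN run closes it
    prev = event
if prev == 'CONN':                       # still connected: hard stop at end of day
    sessions.append((on_time, '2007-05-29 23:59:59'))

print(sessions)
```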

custom aggregate function

June 01, 2007 - 5:18 pm UTC

Reviewer: Saket from Louisville, KY, USA

My Table admin_fees is as follows

Name Type Nullable Default Comments
------------------ ------------ -------- ------- --------
ADMIN_FEES_ID NUMBER
CLIENT_ID NUMBER
BRANCH_GROUP_ID NUMBER Y
EMPLOYER_GROUP_ID NUMBER Y
DEPARTMENT VARCHAR2(25) Y
EVENT_TYPE_CODE VARCHAR2(20) Y
PURSUIT_LEVEL_CODE VARCHAR2(20) Y
FEE_PERCENT NUMBER Y
FEE_AMOUNT NUMBER Y

Sample Data :


ID Client BG EG DEPT ET P_L_C Amount Percent
1836 122 COMAUTO-I 2
1865 122 MEDMAL-UM 2
1873 122 OTHER-A 2
1885 122 PERAUTO-PL 2
1875 122 OVERPAY-R 2
1870 122 OTHER-UM 3
1907 122 UNKNOWN-A 3
1626 122 110097 5
1851 122 GENLIAB-UM 5
....
....
....

The table has many more rows than shown.


Now I want a summarized result from the above table and data.

For each of the columns except amount, percent and ID: show Y if the column is not null, otherwise N.

Then I want to perform a logical OR on those flags, grouped by the distinct values of amount and percent.

So I do the following:

select
new_or(client_id) client_id,
new_or(branch_group_id) branch_group_id,
new_or(employer_group_id) employer_group_id,
new_or(department) department,
new_or(event_type_code) event_type_code,
new_or(pursuit_level_code) pursuit_level_code,
fee_percent, fee_amount
from
(select distinct nvl2((case when client_id is not null and branch_group_id is null and employer_group_id is null and department is null and event_type_code is null and pursuit_level_code is null then client_id else null end),'Y', 'N') client_id,
nvl2(branch_group_id, 'Y', 'N') branch_group_id,
nvl2(employer_group_id, 'Y', 'N') employer_group_id,
nvl2(department, 'Y', 'N') department,
nvl2(event_type_code, 'Y', 'N') event_type_code,
nvl2(pursuit_level_code, 'Y', 'N') pursuit_level_code,
fee_percent,
fee_Amount
from admin_fees where client_id = 122
) group by fee_percent, fee_amount;


The result is exactly what I want:


client bg eg dept et pl percent amount

N N Y N Y N 2
N N N N Y N 3
N N Y N N N 4
N N Y N Y N 5
N N Y N Y N 8
N N N N Y N 9

The last query provides the distinct values of amount and percent, with aggregate values for each column (Y if the column is not null, N if it is null).

I have created an aggregate function named new_or, based on a new type, my_or.

It performs a normal logical OR:

Y N N = Y
N N N N N = N
N N N N Y N N N N = Y


My question is

1. Why doesn't Oracle provide group functions like AND, OR, NAND, NOR, XOR, etc.?
2. Is there any other way to get the same results with a SQL query (in case my DBA does not allow me to create a type)?

I am working on an Oracle Forms 10g application. I was really wondering whether Oracle provides a facility for showing a meaningful message when a constraint is violated, like "Employee Code already exists." instead of "ORA-00001: unique constraint violated".

3. Has Oracle ever thought of adding a message string to constraint definitions, so that whenever the constraint is violated a more meaningful message could be displayed, instead of the ORA-00001 kind of message?
Tom Kyte

Followup  

June 03, 2007 - 5:30 pm UTC

we do - BITAND for example, and many for RAWs in the UTL_RAW package.

but you have strings; strings are not "OR'ed and AND'ed"


ops$tkyte%ORA10GR2> select x, case when instr( x, 'Y' ) > 0 then 'Y' else 'N' end yn
  2    from t;

X     Y
----- -
NNY   Y
NNNNN N
YNYYN Y
      N

Custom Group Functions

June 04, 2007 - 9:12 am UTC

Reviewer: Saket from Louisville, KY, USA

Thank you for your instant reply.

My question is a bit different here:

Supposing my data is as follows:


Col1 Col2 Col3 Amt
N Y Y 2
N Y N 2
N N Y 5
N N Y 6
N N Y 7
N Y N 7
Y N N 7


I want the following output form it:

Col1 Col2 Col3 Amt
N Y Y 2
N N Y 5
N N Y 6
Y Y Y 7

So here, what I have done is group by Amt and use the aggregate function called NEW_OR.

I don't think UTL_RAW.BIT_OR could be helpful in this scenario.

Also, I would remind you of question no. 3 in my previous posting.

Waiting for your reply.
Tom Kyte

Followup  

June 05, 2007 - 8:50 am UTC

select max(col1), max(col2), max(col3), amt
  from t
 group by amt


eg:

ops$tkyte%ORA10GR2> select * from t;

X Y        AMT
- - ----------
N Y          1
N N          1
Y N          2
N N          2
Y N          3
N Y          3

6 rows selected.

ops$tkyte%ORA10GR2> select max(x), max(y), amt from t group by amt;

M M        AMT
- - ----------
N Y          1
Y N          2
Y Y          3

will work in your case since Y > N.

they are STRINGS, think of them as STRINGS.

enhancement requests go via support, #3.
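The point that MAX over 'Y'/'N' strings behaves as a logical OR - because 'Y' > 'N' in string ordering - is easy to verify outside the database. An illustrative Python equivalent of `select max(x), max(y), amt from t group by amt`, using the same six rows:

```python
from collections import defaultdict

# (x, y, amt) rows matching Tom's example table
rows = [('N', 'Y', 1), ('N', 'N', 1), ('Y', 'N', 2), ('N', 'N', 2),
        ('Y', 'N', 3), ('N', 'Y', 3)]

groups = defaultdict(list)
for x, y, amt in rows:
    groups[amt].append((x, y))           # group by amt

# max of 'Y'/'N' flags is 'Y' exactly when at least one flag is 'Y'
result = {amt: (max(x for x, _ in g), max(y for _, y in g))
          for amt, g in groups.items()}
print(sorted(result.items()))
```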

RE: Custom Group Functions

June 04, 2007 - 1:32 pm UTC

Reviewer: Frank Zhou from Braintree, MA

Saket,

There are several possible ways to implement this query
Here is one of the solutions.

create table test5 as
select 'N' as Col1, 'Y' as Col2, 'Y' as Col3 , 2 as Amt from dual
union all
select 'N' as Col1, 'Y' as Col2, 'N' as Col3 , 2 as Amt from dual
union all
select 'N' as Col1, 'N' as Col2, 'Y' as Col3 , 5 as Amt from dual
union all
select 'N' as Col1, 'N' as Col2, 'Y' as Col3 , 6 as Amt from dual
union all
select 'N' as Col1, 'N' as Col2, 'Y' as Col3 , 7 as Amt from dual
union all
select 'N' as Col1, 'Y' as Col2, 'N' as Col3 , 7 as Amt from dual
union all
select 'Y' as Col1, 'N' as Col2, 'N' as Col3 , 7 as Amt from dual


SQL> select
2 case when instr( XMLAgg(XMLElement(x,Col1) ), 'Y' ) > 0 then 'Y' else 'N' end as Col1 ,
3 case when instr( XMLAgg(XMLElement(x,Col2) ), 'Y' ) > 0 then 'Y' else 'N' end as Col2 ,
4 case when instr( XMLAgg(XMLElement(x,Col3) ), 'Y' ) > 0 then 'Y' else 'N' end as Col3,
5 amt
6 from test5 group by Amt;

C C C AMT
- - - ----------
N Y Y 2
N N Y 5
N N Y 6
Y Y Y 7

SQL>

try this

June 04, 2007 - 1:51 pm UTC

Reviewer: Jay from Herndon, VA


with temptable as

(
select case
when col1 = 'N'
then null
else 'Y'
end col1,

case
when col2 = 'N'
then null
else 'Y'
end col2,

case
when col3 = 'N'
then null
else 'Y'
end col3,

amt

from t

)

select nvl(max(col1),'N') col1,
nvl(max(col2),'N') col2,
nvl(max(col3),'N') col3,
amt

from t

group by amt

made a mistake. here is the correct one

June 04, 2007 - 1:59 pm UTC

Reviewer: Jay from Herndon, VA

with temptable as

(
select case
    when col1 = 'N' 
    then null
    else 'Y'
       end col1,

       case
    when col2 = 'N' 
    then null
    else 'Y'
       end col2,

       case
    when col3 = 'N' 
    then null
    else 'Y'
       end col3,

       amt    
    
  from t 

)

select nvl(max(col1),'N') col1,
       nvl(max(col2),'N') col2,
       nvl(max(col3),'N') col3,    
       amt

 from temptable ----- I had made a mistake here

group by amt



thanks!

group function

June 04, 2007 - 9:57 pm UTC

Reviewer: Saket from Louisville, KY

Thank you very much Tom and Jay.


Group Function

June 05, 2007 - 8:39 am UTC

Reviewer: Saket from Louisville, KY

Looks Nice. 

(Extremely) sorry Tom, I failed to format the text properly. I will keep this in mind next time. This was my first posting. 

Thanks to Frank Zhou. 


July 02, 2007 - 4:04 pm UTC

Reviewer: A reader

Tom,

I have the following information in the table...

Cno org_id User Pass
100 500 100 abc
200 500
300 500


I want a query to return the following for the same org_id:

Cno org_id User Pass
100 500 100 abc
200 500 100 abc <<---
300 500 100 abc <<---

The query should return User/Pass for Cno 200 and 300 (though in the table we only have User/Pass for Cno 100).

Is there any way I can do this using analytics?

Thanks






Tom Kyte

Followup  

July 03, 2007 - 10:02 am UTC

no create table
no inserts
no look

incomplete specification - what if you had

100 500 100 abc
200 500 
300 500 123 def
400 500 


no version either :(


read about last_value, ignore nulls.

  1* select empno, comm, last_value(comm ignore nulls) over (order by empno) from emp
scott%ORA10GR2> /

     EMPNO       COMM LAST_VALUE(COMMIGNORENULLS)OVER(ORDERBYEMPNO)
---------- ---------- ---------------------------------------------
      7369
      7499        300                                           300
      7521        500                                           500
      7566                                                      500
      7654       1400                                          1400
      7698                                                     1400
      7782                                                     1400
      7788                                                     1400
      7839                                                     1400
      7844          0                                             0
      7876                                                        0
      7900                                                        0
      7902                                                        0
      7934                                                        0

14 rows selected.
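Applied to the original Cno/org_id question, a hedged sketch might look like the following. The table and column names are my assumptions, since no create table or inserts were posted (USER is a reserved word in Oracle, so USR is used instead):

```sql
-- Hypothetical: assumes a table t(cno, org_id, usr, pass) matching the
-- question's sample data. last_value(... ignore nulls) carries the last
-- non-null usr/pass down to the later rows within each org_id:
select cno, org_id,
       last_value(usr  ignore nulls) over (partition by org_id order by cno) usr,
       last_value(pass ignore nulls) over (partition by org_id order by cno) pass
  from t
 order by org_id, cno;
```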

Analytics NTILE ?

July 09, 2007 - 5:31 pm UTC

Reviewer: Martin from Montreal, CANADA

Hi Tom.

I have this table with two columns NAME and TOTAL.

CREATE TABLE TABLE_A
(
  NAME   VARCHAR2(50 BYTE),
  TOTAL  NUMBER(9)
);


SET DEFINE OFF;
Insert into TABLE_A (NAME, TOTAL) Values ('BILL', 25);
Insert into TABLE_A (NAME, TOTAL) Values ('PETER', 23);
Insert into TABLE_A (NAME, TOTAL) Values ('JOHN', 21);
Insert into TABLE_A (NAME, TOTAL) Values ('ROSIE', 18);
Insert into TABLE_A (NAME, TOTAL) Values ('STEVE', 14);
Insert into TABLE_A (NAME, TOTAL) Values ('CARL', 13);
Insert into TABLE_A (NAME, TOTAL) Values ('GREG', 11);
Insert into TABLE_A (NAME, TOTAL) Values ('DIRK', 11);
Insert into TABLE_A (NAME, TOTAL) Values ('MARY', 9);
Insert into TABLE_A (NAME, TOTAL) Values ('KATE', 7);
Insert into TABLE_A (NAME, TOTAL) Values ('ANA', 4);
COMMIT;

NAME  TOTAL
------ -------
BILL      25
PETER     23
JOHN      21
ROSIE     18
STEVE     14
CARL      13
DIRK      11
GREG      11
MARY      9
KATE      7
ANA       4


I need to:
1. Order the table by the TOTAL column descending.
2. Divide the table into groups of 3 records.
3. Calculate the average for each group.
4. If the last group doesn't have 3 records, it should be part of the previous group.

Expected Output:

NAME  TOTAL
------ -------
BILL     23
PETER    23
JOHN     23
ROSIE    15
STEVE    15
CARL     15
DIRK     8.4
GREG     8.4
MARY     8.4
KATE     8.4
ANA      8.4  


I am trying to find an analytical function to do this, I've tried with NTILE, but I am unable to get the result I want.

My oracle version is 9i.

Thanks,

Martin

Tom Kyte

Followup  

July 09, 2007 - 8:58 pm UTC

ops$tkyte%ORA10GR2> select name, avg(total) over (partition by new_grp) average
  2    from (
  3  select name, total, decode( count(*) over (partition by grp), 3, grp, grp-1) new_grp
  4    from (
  5  select name, total, ceil( row_number() over (order by total desc)/3 ) grp
  6    from table_a
  7         )
  8         )
  9  /

NAME                      AVERAGE
---------------------- ----------
BILL                           23
PETER                          23
JOHN                           23
ROSIE                          15
STEVE                          15
CARL                           15
GREG                          8.4
DIRK                          8.4
MARY                          8.4
KATE                          8.4
ANA                           8.4

11 rows selected.

analytical rocks

July 10, 2007 - 7:35 am UTC

Reviewer: Akki

.. again ! !

analytical rocks

July 10, 2007 - 9:44 am UTC

Reviewer: A reader

Amazing!!!!! Thanks a lot!

Get data in two groups.

July 24, 2007 - 3:31 pm UTC

Reviewer: Sean from NJ, USA

Hi Tom,

I want to get data in two groups by c2,
group one: c2 <= 10,
group two: c2 > 10.

In each group, data should be ordered by c1. Here is a sample of the data. Database is 10g. Thanks so much.


c1 c2

2 8
3 6
5 10

1 14
4 11
6 13


create table t1 (c1 number, c2 number);

insert into t1(c1, c2)
values (2, 8);

insert into t1(c1, c2)
values (3, 6);

insert into t1(c1, c2)
values (5, 10);

insert into t1(c1, c2)
values (1, 14);

insert into t1(c1, c2)
values (4, 11);

insert into t1(c1, c2)
values (6, 13);
Tom Kyte

Followup  

July 26, 2007 - 8:41 am UTC

ops$tkyte%ORA10GR2> select case when c2 <= 10 then 1 else 2 end grp, c1, c2
  2    from t1
  3   order by case when c2 <= 10 then 1 else 2 end, c1
  4  /

       GRP         C1         C2
---------- ---------- ----------
         1          2          8
         1          3          6
         1          5         10
         2          1         14
         2          4         11
         2          6         13

6 rows selected.


August 24, 2007 - 12:39 pm UTC

Reviewer: A reader

Tom

Here is my test case

drop table t;

--- Test to check how analytic functions behave when a row has a null column value

create table t ( a number, b varchar2(15), c number);

begin
  insert into t values (1, 'A', 2);
  insert into t values (1, 'B', 3);
  insert into t values (1, 'C', 2);
  insert into t values (1, 'D', null);
  insert into t values (1, null, 2);
  insert into t values (1, null, null);
  insert into t values (1, null, 4);
end;
/

select a, b, c,
       last_value( c )  over (partition by b order by b nulls first) n,
       last_value( c )  over (partition by b order by b nulls last)  l,
       first_value( c ) over (partition by b order by b nulls first) j,
       first_value( c ) over (partition by b order by b nulls last)  k,
       last_value( c )  over (partition by b) tt
  from t;

1) Why does LAST_VALUE always return NULL for B = NULL while FIRST_VALUE returns 2? If the null value is 'indeterminate', it should be so for both FIRST_VALUE and LAST_VALUE.

2) select a,b,c, last_value( c ) over (partition by b ) n , last_value( nvl(c,1) ) over (partition by b order by b nulls last) l from t

How come the NVL in column L in the above query made the value in column N work, when a similar column in question 1 (column TT) did not work?
Is that a bug? Or is it a feature of analytic queries that when two columns use analytic functions they somehow 'interact'?

Tom Kyte

Followup  

August 27, 2007 - 3:48 pm UTC

1) Maybe this'll help. Run the first query with b as null, and then all null b's set to xx

the results are "the same"

ops$tkyte%ORA10GR2> select a,b,c,
  2         last_value( c ) over (partition by b order by b nulls first range unbounded preceding) l  ,
  3         first_value( c ) over (partition by b order by b nulls first range unbounded preceding) f
  4    from t
  5   order by b nulls first
  6  /

         A B                        C          L          F
---------- --------------- ---------- ---------- ----------
         1                          2                     2
         1                          4                     2
         1                                                2
         1 A                        2          2          2
         1 B                        3          3          3
         1 C                        2          2          2
         1 D

7 rows selected.

ops$tkyte%ORA10GR2> update t set b = 'xx' where b is null;

3 rows updated.

ops$tkyte%ORA10GR2> select a,b,c,
  2         last_value( c ) over (partition by b order by b nulls first range unbounded preceding) l  ,
  3         first_value( c ) over (partition by b order by b nulls first range unbounded preceding) f
  4    from t
  5   order by b nulls first
  6  /

         A B                        C          L          F
---------- --------------- ---------- ---------- ----------
         1 A                        2          2          2
         1 B                        3          3          3
         1 C                        2          2          2
         1 D
         1 xx                       2                     2
         1 xx                       4                     2
         1 xx                                             2

7 rows selected.

ops$tkyte%ORA10GR2> spool off


when you break up all of the data by B (all of the null b's go together - for BOTH first and last), it is pretty 'clear'

I would say it could also be considered non-deterministic...

if you partition by B and then sort by B, the order of the rows in each partition is not deterministic and first/last value are free to return "something", but not necessarily the same thing run after run against the same data!


2) since I disagree with your analysis of "works", I'll pass on trying to interpret this one :)

It is working as I expect it should - null or not.
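One way to make the output repeatable (my suggestion, not part of the followup above) is to order within each partition by something deterministic, for example C itself, with an explicit whole-partition window:

```sql
-- Sketch: ordering by c (instead of the partition key b) plus an explicit
-- whole-partition window makes the first/last values deterministic, since
-- tied c values yield the same result either way:
select a, b, c,
       first_value( c ) over (partition by b order by c nulls first
                              rows between unbounded preceding and unbounded following) f,
       last_value( c )  over (partition by b order by c nulls first
                              rows between unbounded preceding and unbounded following) l
  from t;
```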

August 31, 2007 - 12:50 pm UTC

Reviewer: A reader

Tom

If I were to partition by the same columns (or criteria) in multiple analytic functions, would Oracle optimize the partitioning to do it just once, or once per function?

I mean

select count(something) over (partition by column1, column2) data1,
sum(something) over (partiton by column1, column2) data2
from table

Would Oracle have done the partition two times? We are on Oracle 9i.

I think I can prove it by making it partition by a function's output and recording in dbms_session.action each time the function is called.

The reason I am asking is a query I have written that way runs very slowly.

Now, presumably doing

Select sum(something), count(something) from
table group by column1,column2
would only create the Groups once and do the functions on the constituents of the Grouped data.

Thanks

Ravi

Tom Kyte

Followup  

September 05, 2007 - 9:25 am UTC

Oracle will "share" the partitions as much as possible - even nested ones - for example:

ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select ename,
  2         count(*) over (partition by deptno, job) ,
  3         sum(sal) over (partition by deptno, job) ,
  4         count(*) over (partition by deptno) ,
  5         sum(sal) over (partition by deptno)
  6    from emp
  7   order by deptno, job, ename
  8  /

Execution Plan
----------------------------------------------------------
Plan hash value: 3145491563

------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    15 |   585 |     3   (0)| 00:00:0
|   1 |  WINDOW SORT       |      |    15 |   585 |     3   (0)| 00:00:0
|   2 |   TABLE ACCESS FULL| EMP  |    15 |   585 |     3   (0)| 00:00:0
------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

see how there is only one window sort (partition) - it basically did it by deptno, job, ENAME (it rolled the order by into the operation as well..)

but, if you add a partition by that cannot "fit" inside the others - it has to do it twice:


ops$tkyte%ORA10GR2> select ename,
  2         count(*) over (partition by deptno, job) ,
  3         sum(sal) over (partition by deptno, job) ,
  4         count(*) over (partition by deptno) ,
  5         sum(sal) over (partition by deptno) ,
  6         count(*) over (partition by job)
  7    from emp
  8   order by deptno, job, ename
  9  /

Execution Plan
----------------------------------------------------------
Plan hash value: 4086863039

------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    15 |   585 |     4  (25)| 00:00:
|   1 |  WINDOW SORT        |      |    15 |   585 |     4  (25)| 00:00:
|   2 |   WINDOW SORT       |      |    15 |   585 |     4  (25)| 00:00:
|   3 |    TABLE ACCESS FULL| EMP  |    15 |   585 |     3   (0)| 00:00:
------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

ops$tkyte%ORA10GR2> set autotrace off


so, it'll partition as few times as it can.

Use of analytic functions in UPDATE statements

August 31, 2007 - 6:32 pm UTC

Reviewer: Ananth from USA

Use of analytic functions in UPDATE statements

Hi Tom,

I have the following table of data:

CREATE TABLE A_TEST (
EFF_DT DATE,
A_DESC VARCHAR2(20 BYTE),
ACTL_CHRG NUMBER,
ACTL_CHRG_YTD NUMBER,
ACTL_CHRG_YTD_UPD NUMBER
);


INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('28-FEB-07','NV5000',-275.133708,-275.133708);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('28-FEB-07','NV5000',275.133708,0.000000);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-MAR-07','NV5000',-19.714995,-19.714995);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-MAR-07','NV5000',147.351333,127.636338);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('30-APR-07','NV5000',-77.253573,50.382765);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('30-APR-07','NV5000',125.007672,175.390437);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-MAY-07','NV5000',-65.570613,109.819824);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-MAY-07','NV5000',126.322005,236.141829);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('30-JUN-07','NV5000',-35.340954,200.800875);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('30-JUN-07','NV5000',111.864342,312.665217);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-JUL-07','NV5000',90.250866,402.916083);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-JUL-07','NV5000',0.000000,402.916083);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-AUG-07','NV5000',97.260642,500.176725);
INSERT INTO A_TEST (EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD) VALUES('31-AUG-07','NV5000',0.000000,500.176725);


SELECT EFF_DT, A_DESC, ACTL_CHRG, ACTL_CHRG_YTD, ACTL_CHRG_YTD_UPD
FROM a_test;

EFF_DT A_DESC ACTL_CHRG ACTL_CHRG_YTD ACTL_CHRG_YTD_UPD
========= ====== ============ ============= ==================
02/28/07 NV5000 -275.133708 -275.133708
02/28/07 NV5000 275.133708 0.000000
03/31/07 NV5000 -19.714995 -19.714995
03/31/07 NV5000 147.351333 127.636338
04/30/07 NV5000 -77.253573 50.382765
04/30/07 NV5000 125.007672 175.390437
05/31/07 NV5000 -65.570613 109.819824
05/31/07 NV5000 126.322005 236.141829
06/30/07 NV5000 -35.340954 200.800875
06/30/07 NV5000 111.864342 312.665217
07/31/07 NV5000 90.250866 402.916083
07/31/07 NV5000 0.000000 402.916083
08/31/07 NV5000 97.260642 500.176725
08/31/07 NV5000 0.000000 500.176725

-- update ACTL_CHRG_YTD_UPD (cumulative sum from ACTL_CHRG) column using MERGE statement

MERGE INTO a_test
USING ( SELECT rowid rid,
sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt)
new_actl_chrg_ytd_upd
FROM a_test
ORDER BY eff_dt) a
ON ( a_test.rowid = a.rid )
WHEN MATCHED
THEN UPDATE SET actl_chrg_ytd_upd = a.new_actl_chrg_ytd_upd
WHEN NOT MATCHED
THEN INSERT ( actl_chrg_ytd_upd ) VALUES ( null );


EFF_DT A_DESC ACTL_CHRG ACTL_CHRG_YTD ACTL_CHRG_YTD_UPD
======== ====== ============ ============== ===============
02/28/07 NV5000 -275.133708 -275.133708 0.000000 -275.133708
02/28/07 NV5000 275.133708 0.000000 0.000000
03/31/07 NV5000 -19.714995 -19.714995 127.636338 -19.714995
03/31/07 NV5000 147.351333 127.636338 127.636338
04/30/07 NV5000 -77.253573 50.382765 175.390437 50.382765
04/30/07 NV5000 125.007672 175.390437 175.390437
05/31/07 NV5000 -65.570613 109.819824 236.141829 109.819824
05/31/07 NV5000 126.322005 236.141829 236.141829
06/30/07 NV5000 -35.340954 200.800875 312.665217 200.800875
06/30/07 NV5000 111.864342 312.665217 312.665217
07/31/07 NV5000 90.250866 402.916083 402.916083
07/31/07 NV5000 0.000000 402.916083 402.916083
08/31/07 NV5000 97.260642 500.176725 500.176725
08/31/07 NV5000 0.000000 500.176725 500.176725

-- The updated ACTL_CHRG_YTD_UPD data is not matching with ACTL_CHRG_YTD data.

Thanks for your help.


Tom Kyte

Followup  

September 05, 2007 - 9:34 am UTC

I cannot reproduce in 9i or 10g


ops$tkyte%ORA9IR2> select a.*, sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt) newthing
  2    from a_test a
  3   order by eff_dt
  4  /

EFF_DT    A_DESC                ACTL_CHRG ACTL_CHRG_YTD ACTL_CHRG_YTD_UPD   NEWTHING
--------- -------------------- ---------- ------------- ----------------- ----------
28-FEB-07 NV5000               -275.13371    -275.13371                            0
28-FEB-07 NV5000               275.133708             0                            0
31-MAR-07 NV5000               -19.714995    -19.714995                   127.636338
31-MAR-07 NV5000               147.351333    127.636338                   127.636338
30-APR-07 NV5000               -77.253573     50.382765                   175.390437
30-APR-07 NV5000               125.007672    175.390437                   175.390437
31-MAY-07 NV5000               -65.570613    109.819824                   236.141829
31-MAY-07 NV5000               126.322005    236.141829                   236.141829
30-JUN-07 NV5000               -35.340954    200.800875                   312.665217
30-JUN-07 NV5000               111.864342    312.665217                   312.665217
31-JUL-07 NV5000                90.250866    402.916083                   402.916083
31-JUL-07 NV5000                        0    402.916083                   402.916083
31-AUG-07 NV5000                97.260642    500.176725                   500.176725
31-AUG-07 NV5000                        0    500.176725                   500.176725

14 rows selected.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> MERGE INTO a_test
  2  USING ( SELECT rowid rid,
  3                sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt)
  4                new_actl_chrg_ytd_upd
  5           FROM a_test
  6       ORDER BY eff_dt) a
  7  ON ( a_test.rowid = a.rid )
  8  WHEN MATCHED
  9       THEN UPDATE SET actl_chrg_ytd_upd = a.new_actl_chrg_ytd_upd
 10  WHEN NOT MATCHED
 11       THEN INSERT ( actl_chrg_ytd_upd ) VALUES ( null );

14 rows merged.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select a.*, sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt) newthing
  2    from a_test a
  3   order by eff_dt
  4  /

EFF_DT    A_DESC                ACTL_CHRG ACTL_CHRG_YTD ACTL_CHRG_YTD_UPD   NEWTHING
--------- -------------------- ---------- ------------- ----------------- ----------
28-FEB-07 NV5000               -275.13371    -275.13371                 0          0
28-FEB-07 NV5000               275.133708             0                 0          0
31-MAR-07 NV5000               -19.714995    -19.714995        127.636338 127.636338
31-MAR-07 NV5000               147.351333    127.636338        127.636338 127.636338
30-APR-07 NV5000               -77.253573     50.382765        175.390437 175.390437
30-APR-07 NV5000               125.007672    175.390437        175.390437 175.390437
31-MAY-07 NV5000               -65.570613    109.819824        236.141829 236.141829
31-MAY-07 NV5000               126.322005    236.141829        236.141829 236.141829
30-JUN-07 NV5000               -35.340954    200.800875        312.665217 312.665217
30-JUN-07 NV5000               111.864342    312.665217        312.665217 312.665217
31-JUL-07 NV5000                90.250866    402.916083        402.916083 402.916083
31-JUL-07 NV5000                        0    402.916083        402.916083 402.916083
31-AUG-07 NV5000                97.260642    500.176725        500.176725 500.176725
31-AUG-07 NV5000                        0    500.176725        500.176725 500.176725

14 rows selected.


Use of analytic functions in UPDATE statements

September 05, 2007 - 12:17 pm UTC

Reviewer: Ananth from USA

Tom,

I cannot reproduce in 9i or 10g...

The script was created and tested in Release 9.2.0.1.0.

Problem:
I am not able to compute the cumulative total from the ACTL_CHRG column and update it into the ACTL_CHRG_YTD_UPD column.
(The cumulative data in the ACTL_CHRG_YTD_UPD column should look like the data in the ACTL_CHRG_YTD column.)

If I am not able to do this in SQL, can you suggest a PL/SQL approach?

Thanks,



Tom Kyte

Followup  

September 05, 2007 - 5:33 pm UTC

well, you can take your "using" query and bulk collect that into arrays and then forall i update - sure.


select rowid, analytic() bulk collect into rid_array, analytic_array
  from t
.....;

forall i in 1 .. rid_array.count
  update t set column = analytic_array(i) where rowid = rid_array(i);



September 06, 2007 - 5:40 am UTC

Reviewer: A reader

Tom,

Here is my test case

drop table t8;

create table t8 (a number, b number, c number);

truncate table t8;


begin
  insert into t8 values (1, 2, 9);
  insert into t8 values (1, 2, 7);
  insert into t8 values (1, 3, 8);
  insert into t8 values (1, 3, 9);
  insert into t8 values (1, 3, 9);
  insert into t8 values (1, 3, 6);
  insert into t8 values (2, 7, 6);
  insert into t8 values (2, 7, 4);
  insert into t8 values (2, 9, 2);
  insert into t8 values (2, 9, 3);
end;
/


If I were to do

SELECT a,b,c,LAST_VALUE ( c ) OVER ( PARTITION BY a ORDER BY b ) d
--- Both first and Last value will be same for SAME value of B
, FIRST_VALUE ( c ) OVER ( PARTITION BY a ORDER BY b RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING )
l2
, LAST_VALUE ( c ) OVER ( PARTITION BY a ORDER BY b RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING )
m2
FROM t8


Question:

1) Is it right to say that Oracle orders column C as well, in addition to partitioning by A and ordering by B? This observation comes from the column values of both L2 and M2.


Thanks

Ravi
Tom Kyte

Followup  

September 11, 2007 - 8:01 am UTC

no that is not right.

you will break up the data by A (partition by A)
you will order the data by B within each A partition.

that is all you can say - if the data happens to accidentally be sorted by C, that is just that - an accident and nothing you can be assured of.

Use of analytic functions in UPDATE statements

September 06, 2007 - 2:47 pm UTC

Reviewer: Ananth from USA

MERGE INTO a_test
USING ( SELECT rowid rid,
sum(ACTL_CHRG) over (partition by A_DESC order by A_DESC,eff_dt)
new_actl_chrg_ytd_upd
FROM a_test
ORDER BY eff_dt) a
ON ( a_test.rowid = a.rid )
WHEN MATCHED
THEN UPDATE SET actl_chrg_ytd_upd = a.new_actl_chrg_ytd_upd
WHEN NOT MATCHED
THEN INSERT ( actl_chrg_ytd_upd ) VALUES ( null );

Can you pl. help me to convert the above SQL using bulk collect for update.

Thanks,

Tom Kyte

Followup  

September 11, 2007 - 8:25 am UTC

I already did, right above, long time ago.
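Spelled out against the a_test table, the bulk collect approach Tom sketched earlier might look like this (the collection types and variable names are my own):

```sql
declare
  type rid_t is table of rowid;
  type num_t is table of number;
  l_rids rid_t;
  l_sums num_t;
begin
  -- compute the running total once, remembering each row's rowid
  select rowid,
         sum(actl_chrg) over (partition by a_desc order by a_desc, eff_dt)
    bulk collect into l_rids, l_sums
    from a_test;

  -- then update every row by rowid in a single forall
  forall i in 1 .. l_rids.count
    update a_test
       set actl_chrg_ytd_upd = l_sums(i)
     where rowid = l_rids(i);
end;
/
```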

Please help me

September 07, 2007 - 12:48 pm UTC

Reviewer: Kailasanathan Subbaraman from NJ

Tom,

I have been a very big fan of yours. I have solved a lot of problems with analytic functions (truly,
analytics rock and analytics roll). But I have the following situation where I need your help.

I have a table with 2 columns :
Create Table A (F1 VARCHAR2(2), F2 VARCHAR2(3));

INSERT INTO A ( F1, F2 ) VALUES ( 'A1', '1');
INSERT INTO A ( F1, F2 ) VALUES ( 'A1', '2');
INSERT INTO A ( F1, F2 ) VALUES ( 'A1', '3');
INSERT INTO A ( F1, F2 ) VALUES ( 'A1', '4');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '2');
INSERT INTO A ( F1, F2 ) VALUES ( 'A3', '1');
INSERT INTO A ( F1, F2 ) VALUES ( 'A4', '5');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '3');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '4');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '5');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '6');
INSERT INTO A ( F1, F2 ) VALUES ( 'A2', '7');
INSERT INTO A ( F1, F2 ) VALUES ( 'A3', '6');
INSERT INTO A ( F1, F2 ) VALUES ( 'A3', '7');
INSERT INTO A ( F1, F2 ) VALUES ( 'A3', '8');
INSERT INTO A ( F1, F2 ) VALUES ( 'A4', '5');
INSERT INTO A ( F1, F2 ) VALUES ( 'A4', '7');

I need to split these rows into, say, 3 groups (using a function such as NTILE) in such a way that all
records with the same value for column F1 end up in the same group.

e.g

A1 1 1
A1 2 1
A1 3 1
A1 4 1

A2 2 2
A2 3 2
A2 4 2
A2 5 2
A2 6 2
A2 7 2

A3 1 3
A3 6 3
A3 7 3
A3 8 3
A4 5 3
A4 7 3

Thanks

Tom Kyte

Followup  

September 12, 2007 - 10:14 am UTC

insufficient specifications here.

I see no reason logically why A4 is in the group with A3 when A2 is not with A1

what if A3 had 10 rows - what would the output be then?


Need *a lot* more information here as to the precise logic.

Kailasanathan

September 11, 2007 - 9:52 am UTC

Reviewer: A reader

why are
A3 8 3
and
A4 5 3
in the same group "3"
?

September 11, 2007 - 10:46 am UTC

Reviewer: Kailash Subbaraman from NJ

That's because the logic should be as follows :

1. Divide the total number of rows in the table into ranges so that each range, as far as possible, has an equal number of rows.
2. Records with same value in F1 should not be split and should be in the same set.

In the above example, based on these conditions, the first set has 4 rows and the second and third sets have 6 rows each, which satisfies condition 1. If F1 value A4 were in another set, that set would only have 2 records, which violates condition 1.

Hope this helps
Tom Kyte

Followup  

September 15, 2007 - 3:29 pm UTC

... Divide the total number of rows in the table into ranges so that each range
as far as possible has equal number of rows. ...

bin fitting problem - not easy to do in SQL, I do not recommend SQL for this particular solution.

This may be a starting point for you

September 11, 2007 - 2:19 pm UTC

Reviewer: Tyler from Vancouver BC

This could be a starting point for you; it works out that A4 needs to be included in the A3 group based on the requirements you've given...

Hope that helps.
WITH 
   BaseInfo AS
(
   SELECT 
      F1, 
      F2
   FROM A 
   ORDER BY F1 ASC
),
   SecondSet AS
(
   SELECT 
      F1, 
      F2, 
      COUNT(*)       OVER (PARTITION BY F1)                 AS GroupedCount,
      ROW_NUMBER()   OVER (PARTITION BY F1 ORDER BY F1 ASC) AS RowsPerGroup
   FROM BaseInfo
)
   SELECT 
         F1, 
         F2, 
         RowsPerGroup,
         GroupedCount,
         LAG(GroupedCount, RowsPerGroup - 1) OVER (PARTITION BY F1 ORDER BY F1 ASC),      
         CASE 
            WHEN 
                     GroupedCount < MAX(GroupedCount) OVER () 
               AND   (LAG(GroupedCount, RowsPerGroup - 1) OVER (PARTITION BY F1 ORDER BY F1 ASC) + GroupedCount) - MAX(GroupedCount) OVER () < 2
            THEN
               'INCLUDE IN PREVIOUS SET'
            ELSE
               'DON''T COMBINE'
         END           
   FROM SecondSet;

To: Kailash Subbaraman

September 16, 2007 - 2:34 am UTC

Reviewer: Michel Cadot from France


See an example of bin fitting problem and its solution with MODEL clause at:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:3170642805938#55879813219675

Regards
Michel

Analytic question

September 20, 2007 - 7:04 am UTC

Reviewer: Lise from UK

I am sure I can do something smart with analytics here, but I am not sure what...
create table COMP
(
AMNT NUMBER,
MBMR NUMBER
);
create table AMOUNTS
(
AMNT NUMBER,
MBMR NUMBER
);
insert into COMP (AMNT, MBMR)
values (500, 1);
insert into COMP (AMNT, MBMR)
values (100, 1);
insert into COMP (AMNT, MBMR)
values (1000, 1);
insert into AMOUNTS (AMNT, MBMR)
values (100, 1);
insert into AMOUNTS (AMNT, MBMR)
values (500, 1);
insert into AMOUNTS (AMNT, MBMR)
values (250, 1);
insert into AMOUNTS (AMNT, MBMR)
values (250, 1);
insert into AMOUNTS (AMNT, MBMR)
values (250, 1);
insert into AMOUNTS (AMNT, MBMR)
values (250, 1);

I want to be able to match the payments in COMP with payments in AMOUNTS.
Imagine I had 1000 in COMP and wanted to find the rows in AMOUNTS that add up to 1000. It would be 250 * 4, but how can I automate this calculation?
How would I do this please?
Tom Kyte

Followup  

September 24, 2007 - 7:19 am UTC

but if you find some 4 rows that make 1000, you might not be able to find the other X rows to make 100 or 500 - because if you used a row for 1000, it might have been the only row that could answer 100, but if you used it for 1000....

It seems to me that the matching of payments should have been done "during insertion" in real life.

I'm not sure a sql solution here is what you are looking for.

I agree

September 25, 2007 - 5:09 pm UTC

Reviewer: Lise from Scotland

It is associated with payments received without a reference to what they are for, apart from the customer id. I think you are correct in that I would have to compute the sums of all possible combinations. I think the business needs to reconsider their approach, or live with us matching up to 2 payments whilst the remainder is handled as is.

Thanks Tom.

Analytical Help

October 02, 2007 - 6:10 am UTC

Reviewer: Akki from London

Hi Tom,

I am using a function to get the required field output, but I wonder if it can be replaced with analytic functions (with the function, performance is very poor).

This is the sample code for our transaction tables. The data in both tables is inserted using the same sequence.
I am dealing with historic data, so there is little scope for changing the table structures.

create table t_pur_order (id number, advert_id number, area_code varchar2(10), advert_type varchar2(10), seq_id number);

create table t_orders (id number, advert_id number, area_code varchar2(10), advert_type varchar2(10));

INSERT INTO T_PUR_ORDER ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 1, 1001, 'A', 'ONE', 1); 
INSERT INTO T_PUR_ORDER ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 1, 1001, 'A', 'ONE', 2); 
INSERT INTO T_PUR_ORDER ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 2, 1002, 'B', 'TWO', 1); 
INSERT INTO T_PUR_ORDER ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 2, 1002, 'B', 'TWO', 2); 
INSERT INTO T_PUR_ORDER ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 2, 1002, 'B', 'TWO', 3); 
INSERT INTO T_PUR_ORDER ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 3, 1003, 'C', 'THREE', 1); 
INSERT INTO T_PUR_ORDER ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 8, 1001, 'A', 'ONE', 1); 
INSERT INTO T_PUR_ORDER ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE, SEQ_ID ) VALUES ( 9, 1004, 'A', 'ONE', 1); 

COMMIT;

INSERT INTO T_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE ) VALUES ( 4, 1001, 'A', 'ONE'); 
INSERT INTO T_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE ) VALUES ( 5, 1002, 'B', 'TWO'); 
INSERT INTO T_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE ) VALUES ( 10, 1001, 'A', 'ONE'); 
INSERT INTO T_ORDERS ( ID, ADVERT_ID, AREA_CODE, ADVERT_TYPE ) VALUES ( 11, 1004, 'A', 'ONE'); 
COMMIT;

Output
------

T_ORDERS.ID   T_PUR_ORDER.ID   ADVERT_ID     AREA_CODE    ADVERT_TYPE
-----------   --------------   ---------     ----------   ------------ 
4                   1          1001              A            ONE
5                   2          1002              B            TWO
10                  8          1001              A            ONE
11                  9          1004              A            ONE


The requirement is that I need to display all records from T_ORDERS along with the associated ID from T_PUR_ORDER.
For every record in T_ORDERS, there will be one associated record in T_PUR_ORDER. The matching columns in
both tables are ADVERT_ID, AREA_CODE and ADVERT_TYPE.

I need to get the latest ID from T_PUR_ORDER where T_PUR_ORDER.ID is less than or equal to T_ORDERS.ID, joining on the rest of the
columns.

I can achieve this by creating a function to get T_PUR_ORDER.ID, but I wonder if this can be done using ANALYTIC FUNCTIONS.

The Function I am using so far.

Create or replace function f_get_pur_id (p_id number, p_advert_id number, p_area_code varchar2, p_advert_type varchar2) 
return number is
  v_id number;
begin
   select max(id) 
   into   v_id
   from   t_pur_order
   where  id <= p_id
   and    advert_id = p_advert_id
   and    area_code = p_area_code
   and    advert_type = p_advert_type;
   
   return v_id;

end;
/


And the query I am using :
select id, 
       advert_id, 
    area_code, 
    advert_type, 
       f_get_pur_id(id, advert_ID, area_code, advert_type) pur_id 
from t_orders;

Tom Kyte

Followup  

October 03, 2007 - 5:02 pm UTC

you forgot to give us the first query - I'm not sure how to join these tables, no keys defined - no correlation names on the first result set, not clear.


but first - no function - just use sql - use a scalar subquery

select ..., 
      (select max(id) from t_pur_order where id <= X.id and advert_id = X.advert_id and ... )
  from t_orders X;




or just join and aggregate (analytics are cool, but they are used when you DO NOT want to aggregate)

ops$tkyte%ORA11GR1> select a.id, a.advert_id, a.area_code, a.advert_type, max(b.id)
  2    from t_orders a, t_pur_order b
  3   where a.advert_id = b.advert_id
  4     and a.area_code = b.area_code
  5     and a.advert_type = b.advert_type
  6     and a.id >= b.id
  7   group by a.id, a.advert_id, a.area_code, a.advert_type
  8   order by a.id
  9  /

        ID  ADVERT_ID AREA_CODE  ADVERT_TYP  MAX(B.ID)
---------- ---------- ---------- ---------- ----------
         4       1001 A          ONE                 1
         5       1002 B          TWO                 2
        10       1001 A          ONE                 8
        11       1004 A          ONE                 9
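
The "greatest id not exceeding this one, per matching key" rule that both the scalar subquery and the join-and-aggregate express is language-agnostic. A minimal Python analogue, using the sample rows from the thread (dictionary layout is an assumption, column names kept):

```python
def latest_pur_id(order, pur_orders):
    """max(b.id) over pur_orders rows matching on the three key columns,
    restricted to b.id <= order id - the same rule as the scalar subquery."""
    candidates = [p["id"] for p in pur_orders
                  if p["advert_id"] == order["advert_id"]
                  and p["area_code"] == order["area_code"]
                  and p["advert_type"] == order["advert_type"]
                  and p["id"] <= order["id"]]
    return max(candidates) if candidates else None

pur = [
    {"id": 1, "advert_id": 1001, "area_code": "A", "advert_type": "ONE"},
    {"id": 2, "advert_id": 1002, "area_code": "B", "advert_type": "TWO"},
    {"id": 8, "advert_id": 1001, "area_code": "A", "advert_type": "ONE"},
    {"id": 9, "advert_id": 1004, "area_code": "A", "advert_type": "ONE"},
]
orders = [
    {"id": 4, "advert_id": 1001, "area_code": "A", "advert_type": "ONE"},
    {"id": 10, "advert_id": 1001, "area_code": "A", "advert_type": "ONE"},
]
print([latest_pur_id(o, pur) for o in orders])  # [1, 8]
```

The SQL join-and-aggregate does the same thing in one pass, letting the database pick the MAX per group.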



Analytical Help

October 02, 2007 - 9:26 am UTC

Reviewer: Akki from London

Hi Tom,

Please ignore my previous request. I managed to get the output using the following query.

select id, advert_id, area_code, advert_type, lag from
(select typ,id, advert_id, area_code, advert_type, 
       lag(id) over (partition by advert_id, area_code, advert_type order by id) lag
from
(select 'po' typ,id, advert_id, area_code, advert_type from t_pur_order
union all 
select 'o' typ,id, advert_id, area_code, advert_type from t_orders
order by id
))
where typ = 'o'


Thanks

Should I use an analytic function for this example?

October 05, 2007 - 1:51 pm UTC

Reviewer: BM from USA

I have the following need. Do you think I can derive the following column (NEXT_CHNG_PR) using an analytic function?


       TIME   PRICE   NEXT_CHNG_PR (derived)
ROW1   800    2.4     2.5   (next different price value, ordered by time)
ROW2   810    2.4     2.5
ROW3   820    2.5     2.3
ROW4   830    2.5     2.3
ROW5   900    2.3     (nothing - no next record)

Forgot to give version

October 05, 2007 - 2:21 pm UTC

Reviewer: BM from USA

My oracle version is 9.2.0.8.

To BM

October 09, 2007 - 10:01 am UTC

Reviewer: Jay from Herndon, VA

Your question is so INCOMPLETE!! Give your create and inserts and explain your question more clearly.
You can't just give some values and request for a query to arrive at that result. You know what I mean?
Thanks!

October 09, 2007 - 5:31 pm UTC

Reviewer: BM from MD, USA


CREATE TABLE TEST_NEXT_ATRB
(
  TIMEID  VARCHAR2(4 BYTE),
  SEQ_NB  NUMBER,
  PRICE   NUMBER(10,2)
)
/

INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0900', 5, 2.3); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0830', 4, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0820', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0810', 2, 2.4); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0800', 1, 2.5); 


SQL> select * from test_next_atrb
order by timeid, seq_nb

TIME     SEQ_NB      PRICE
---- ---------- ----------
0800          1        2.5
0810          2        2.4
0820          3        2.5
0830          4        2.5
0900          5        2.3


I need to derive 4th columns NEXT_CHNG_PRICE as below - 

SQL> select * from test_next_atrb
order by timeid, seq_nb

TIME     SEQ_NB      PRICE NEXT_CHNG_PRICE
---- ---------- ---------- ---------------
0800          1        2.5   2.4
0810          2        2.4   2.5
0820          3        2.5   2.3
0830          4        2.5   2.3
0900          5        2.3   NULL


Try this out...

October 10, 2007 - 11:38 am UTC

Reviewer: Jay from Herndon, va



select timeid,
seq_nb,
price,
case
when lead_price is null
then lead(lead_price) over (partition by unique_key order by timeid, seq_nb)
else lead_price
end as next_diff
from
(
select unique_key,
timeid,
seq_nb,
price,
case
when lead(price) over (partition by unique_key order by timeid, seq_nb) <> price
then lead(price) over (partition by unique_key order by timeid, seq_nb)
else null
end as lead_price
from

(
select 'Time_Group' as unique_key,
timeid,
seq_nb,
price
from test_next_atrb
order by timeid, seq_nb)
)

Still an Issue

October 11, 2007 - 1:27 pm UTC

Reviewer: BM from MD, USA

Thanks for your time Jay.

Your query works for the data provided, but as soon as I have more than two records with the same price, it fails. For example, insert the following and see what happens to the results: it returns NULL for 0820. The issue is that the data can have any number of rows before there is a change in value.

INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES (
'0845', 4, 2.5);


TIME     SEQ_NB      PRICE  NEXT_DIFF
---- ---------- ---------- ----------
0800          1        2.5        2.4
0810          2        2.4        2.5
0820          3        2.5       NULL
0830          4        2.5        2.3
0845          4        2.5        2.3
0900          5        2.3       NULL

hmm..

October 11, 2007 - 5:33 pm UTC

Reviewer: Jay from Herndon, va

BM,

Sure we can do this. But, you need to have a common grouping indicator for the same prices. Here is an example of what I mean. Let me know if this makes sense.

--create
CREATE TABLE TEST_NEXT_ATRB
(
  TIMEID  VARCHAR2(4 BYTE),
  SEQ_NB  NUMBER,
  PRICE   NUMBER(10,2)
)



--insert
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0800', 1, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0810', 2, 2.4); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0820', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0830', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0845', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0900', 4, 2.3); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0915', 5, 3.2); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0930', 5, 3.2); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0945', 5, 3.2); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1000', 6, 3.8); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1015', 7, 4.0); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1030', 7, 4.0); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1045', 7, 4.0); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'1100', 8, 5.3); 


--query

select timeid,       
       seq_nb,
       price,
       max(lead_price) over (partition by unique_key, seq_nb) as next_diff
       
  from (select unique_key,
               timeid,
               seq_nb,
               price,        
               case 
                when lead(price) over (partition by unique_key order by timeid, seq_nb) <> price
                then lead(price) over (partition by unique_key order by timeid, seq_nb)
                else null
               end as lead_price,
               rownum rn
          from 
                (
                select 'Time_Group' as unique_key,
                       timeid,
                       seq_nb,
                       price        
                  from test_next_atrb   
                order by timeid, seq_nb))                 



Here is the output:

 
TIMEID SEQ_NB PRICE NEXT_DIFF   
0800  1  2.50  2.4
0810  2  2.40  2.5
0820  3  2.50  2.3
0830  3  2.50  2.3
0845  3  2.50  2.3
0900  4  2.30  3.2
0915  5  3.20  3.8
0930  5  3.20  3.8
0945  5  3.20  3.8
1000  6  3.80  4
1015  7  4.00  5.3
1030  7  4.00  5.3
1045  7  4.00  5.3
1100  8  5.30  



Thanks!

Still not resolved

October 15, 2007 - 10:20 am UTC

Reviewer: BM from MD, USA

Cool, but the problem is that we don't have any control over how the data comes. We do know that SEQ_NB is unique. So although the results look like what we want, they do not solve the original problem.

thanks for your time.

what???

October 15, 2007 - 11:39 am UTC

Reviewer: Jay from Herndon, VA

It's your database and it's your data. When you have no control over how the data comes, then you do have a problem. Probably, a good time to talk to your data architect. Others might have a better solution. Good luck!

????

October 18, 2007 - 10:35 am UTC

Reviewer: BM from MD, USA

Well, this is how real-world situations are. If you can't solve the problem, you don't just update the data to get around it.


Tom.. can you help?

October 23, 2007 - 11:06 am UTC

Reviewer: Jay from Herndon, VA

Hi Tom,

This isn't really my question. But, I am wondering if the problem above can be solved by a straight sql or if we need to use cursors etc., to achieve this output?

Thanks a lot for your time!

December 12, 2007 - 2:44 am UTC

Reviewer: A reader

How do I execute a function that has an OUT parameter?
Tom Kyte

Followup  

December 12, 2007 - 8:21 am UTC

variable x <datatype>
variable y <datatype>
exec :y := f(:x)


or

declare
x datatype;
y datatype;
begin
y := f(x);
end;

To 'BM'

December 14, 2007 - 11:14 am UTC

Reviewer: Jay from Herndon, VA

Based on Tom's 'carry forward technique', I think this query will solve your problem.

drop table TEST_NEXT_ATRB

create TABLE TEST_NEXT_ATRB
(
  TIMEID  VARCHAR2(4 BYTE),
  SEQ_NB  NUMBER,
  PRICE   NUMBER(10,2)
)
/

INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0900', 5, 2.3); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0830', 4, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0820', 3, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0810', 2, 2.4); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0800', 1, 2.5); 
INSERT INTO TEST_NEXT_ATRB ( TIMEID, SEQ_NB, PRICE ) VALUES ( 
'0845', 4, 2.5); 

commit;


QUERY:


select timeid,
       seq_nb,
       price,
       substr(max(test) over (order by timeid, seq_nb),12) as next_different_price
  from
(  
select timeid,
       seq_nb,
       price,
       case 
        when lead_price is not null
        then to_char(row_number() over (order by test),'fm00000000000') || lead_price
       end as test
  from       
(      
select timeid,
       seq_nb,
       price,
       test, 
       case 
        when lead(price) over (order by test) <> price
        then lead(price) over (order by test)
        else null
       end as lead_price
  from
(select timeid,
        seq_nb,
        price,
        row_number() over (order by timeid, seq_nb) as test       
   from test_next_atrb)))


Output:
TIMEID SEQ_NB PRICE NEXT_DIFFERENT_PRICE
0800      1   2.50   2.4
0810      2   2.40   2.5
0820      3   2.50   2.5
0830      4   2.50   2.5
0845      4   2.50   2.3
0900      5   2.30   2.3


Thanks,
Jay
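
Stepping back from the SQL, the "next different price" rule is easy to state procedurally: walking the rows in time order, each row takes the first later price that differs from its own. A small Python sketch of just that rule, using the thread's sample prices (illustrative only, not the Oracle answer):

```python
def next_change(prices):
    """For each position, the first later value that differs, else None."""
    out = []
    for i, p in enumerate(prices):
        nxt = next((q for q in prices[i + 1:] if q != p), None)
        out.append(nxt)
    return out

# Prices in TIMEID order: 0800, 0810, 0820, 0830, 0845, 0900
print(next_change([2.5, 2.4, 2.5, 2.5, 2.5, 2.3]))
# [2.4, 2.5, 2.3, 2.3, 2.3, None]
```

The carry-forward SQL technique achieves the same effect set-wise, without the nested scan this naive version performs per row.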

ignore nulls

February 26, 2008 - 1:44 pm UTC

Reviewer: James Su from Toronto

Hello Tom,
I want to use ignore nulls in lag but it's not supported, so what I did is to replace it with:

last_value(... ignore nulls) over(partition by ... order by ... range between unbounded preceding and 1 preceding)

Is this correct or is there a better way?
Thank you.
Tom Kyte

Followup  

February 26, 2008 - 3:12 pm UTC

depends, last_value might be sufficient - depending on your needs, it carries down the last non-null value, if that is what you want - then that is what you want.

If you understand what you've typed in - you'll be able to answer this yourself....

Is last value a replacement for lag - no way, doesn't do the same thing

Is last value what you want - perhaps, if you define your requirements not in terms of LAG (which appears to be not what you want) but in terms of "specification"

If you want the last non-null value in the partition above the current record - looks like you have it
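
To make the "specification" concrete: what last_value(... ignore nulls) over rows up to 1 preceding computes is, for each row, the most recent non-null value among the strictly earlier rows. A hedged Python analogue of that carry-down behaviour (names are mine, not Oracle's):

```python
def last_nonnull_before(values):
    """For each row, the latest non-None value among strictly earlier rows -
    the effect of last_value(... ignore nulls) ending at 1 preceding."""
    out, carried = [], None
    for v in values:
        out.append(carried)   # what was carried in before this row
        if v is not None:
            carried = v       # carry the new non-null value forward
    return out

print(last_nonnull_before([10, None, None, 20, None]))
# [None, 10, 10, 10, 20]
```

Contrast with LAG, which looks back a fixed number of rows whether or not they are null.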

More on Analytics

March 07, 2008 - 6:54 am UTC

Reviewer: Shivdeep Modi from Newcastle Upon Tyne, UK

Hi,

Basically I am trying to find all the roles that have only read-only (SELECT) privileges on tables and no other privileges, among the roles granted to a specific set of users.

Here are the distinct privileges granted to assigned roles(GRANTEE)

GRANTEE                        PRIVILEGE
------------------------------ ---------------
AO_SUPPORT_RO                  SELECT
BASE                           SELECT
DATA_ENTRY_CLERK               DELETE
DATA_ENTRY_CLERK               INSERT
DATA_ENTRY_CLERK               UPDATE
SUPERVISOR                     DELETE
SUPERVISOR                     EXECUTE
SUPERVISOR                     INSERT
SUPERVISOR                     SELECT
SUPERVISOR                     UPDATE
WNS_SUPERVISOR                 DELETE
WNS_SUPERVISOR                 EXECUTE
WNS_SUPERVISOR                 INSERT
WNS_SUPERVISOR                 UPDATE


I've written something like:

SQL > select grantee, count(privilege)
SQL > from (
SQL > select distinct grantee,privilege
SQL > from dba_tab_privs
SQL > where grantee in (
SQL > select distinct granted_role
SQL > from dba_role_privs
SQL > where grantee in (
SQL > select username
SQL > from dba_users
SQL > where replace(translate(username,'0123456789','+'),'+','') in ('N','U') -- Only find users like N122345 or U12345 etc
SQL > )
SQL > and granted_role not like 'IMDD%'
SQL > )
SQL > )
SQL > group by grantee
SQL > having count(privilege) = 1
SQL > /
GRANTEE                        COUNT(PRIVILEGE)
------------------------------ ----------------
AO_SUPPORT_RO                                 1
BASE                                          1

2 rows selected.

SQL >


But this is bound to fail if we have a single distinct privilege apart from SELECT, granted to the role(s). Can we use analytics or is there a better way of doing it?

Regards,
Shivdeep
Tom Kyte

Followup  

March 10, 2008 - 10:19 am UTC

just fyi - your use of the DISTINCT above shows that you don't have a full understanding of how set operations work:

SQL > where grantee in (
SQL > select distinct granted_role
SQL >  from dba_role_privs 


that distinct is superfluous - "IN" does the right thing, distinct is not necessary or desirable there.


start with this, you obviously have requirements above and beyond the stated (you have filters on usernames and what not)

ops$tkyte%ORA10GR2> select grantee,
  2         count( case when privilege = 'SELECT' then 1 end ) cnt_select,
  3             count( case when privilege <> 'SELECT' then 1 end ) cnt_not_select,
  4             count(*)
  5  from dba_tab_privs where grantee in (select role from dba_Roles)
  6  group by grantee
  7  /

GRANTEE                        CNT_SELECT CNT_NOT_SELECT   COUNT(*)
------------------------------ ---------- -------------- ----------
EXP_FULL_DATABASE                      20             19         39
AQ_ADMINISTRATOR_ROLE                  13              8         21
CTXAPP                                  0             10         10
XDBADMIN                                2              8         10
DBA                                     6             32         38
AQ_USER_ROLE                            0              4          4
SELECT_CATALOG_ROLE                  1767              3       1770
OLAP_USER                              11             30         41
HS_ADMIN_ROLE                          14              1         15
GATHER_SYSTEM_STATISTICS                1              3          4
IMP_FULL_DATABASE                       1             17         18
EXECUTE_CATALOG_ROLE                    0             77         77
DELETE_CATALOG_ROLE                     0              2          2
LOGSTDBY_ADMINISTRATOR                  0              2          2
OEM_MONITOR                             6              9         15
WM_ADMIN_ROLE                          12              0         12
MGMT_USER                              84              0         84

17 rows selected.

ops$tkyte%ORA10GR2> select grantee,
  2         count( case when privilege = 'SELECT' then 1 end ) cnt_select,
  3             count( case when privilege <> 'SELECT' then 1 end ) cnt_not_select,
  4             count(*)
  5  from dba_tab_privs where grantee in (select role from dba_Roles)
  6  group by grantee
  7  having count( case when privilege = 'SELECT' then 1 end ) = count(*)
  8  /

GRANTEE                        CNT_SELECT CNT_NOT_SELECT   COUNT(*)
------------------------------ ---------- -------------- ----------
WM_ADMIN_ROLE                          12              0         12
MGMT_USER                              84              0         84
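
The conditional-count idiom above (count the SELECTs in one pass, then keep groups where that count equals count(*)) translates directly outside SQL. A hypothetical Python sketch of the same bookkeeping:

```python
def roles_with_only(privs, wanted="SELECT"):
    """privs: list of (grantee, privilege) pairs. Returns grantees all of
    whose privileges equal `wanted` - the count(case ...) = count(*) trick."""
    totals, matches = {}, {}
    for grantee, priv in privs:
        totals[grantee] = totals.get(grantee, 0) + 1
        if priv == wanted:
            matches[grantee] = matches.get(grantee, 0) + 1
    # keep only grantees whose matching count equals their total count
    return sorted(g for g in totals if matches.get(g, 0) == totals[g])

privs = [("AO_SUPPORT_RO", "SELECT"), ("BASE", "SELECT"),
         ("SUPERVISOR", "SELECT"), ("SUPERVISOR", "DELETE")]
print(roles_with_only(privs))  # ['AO_SUPPORT_RO', 'BASE']
```

This is robust where the original HAVING count(privilege) = 1 is not: a role with one non-SELECT privilege no longer slips through.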



Analytics

March 12, 2008 - 2:45 pm UTC

Reviewer: Prasanth from Germany

hi tom,

I am facing a problem with presenting the row information as a column value.

Create table account
( month number, acct number,subacct number,monvol number,yearvol number);

INSERT INTO account values(8,10,1,100,200);
INSERT INTO account values(8,10,2,20,30);
INSERT INTO account values(8,10,3,15,25);
INSERT INTO account values(9,10,1,50,10);
INSERT INTO account values(9,10,2,21,50);
INSERT INTO account values(9,20,1,2,15);

select * from account;

MONTH  ACCT  SUBACCT  MONVOL  YEARVOL
-----  ----  -------  ------  -------
    8    10        1     100      200
    8    10        2      20       30
    8    10        3      15       25
    9    10        1      50       10
    9    10        2      21       50
    9    20        1       2       15

The expected ouput is.

ACCT  SUBACCT  8_MONVOL  8_YEARVOL  9_MONVOL  9_YEARVOL
----  -------  --------  ---------  --------  ---------
  10        1       100        200        50         10
  10        3        15         25         0          0
  20        1         0          0         2         15

Could you help me get this result? I tried using inline queries but didn't get the expected output.
Can we get the result using an analytic function?

The business logic is:
1. All the account information is required, even if it is not present in one particular month.
2. Only accounts whose MONVOL or YEARVOL difference is greater than 10 are required.

Thank you
Tom Kyte

Followup  

March 12, 2008 - 5:49 pm UTC

search this site for PIVOT


but anyway, are 8 and 9 the only ones or are you expecting this to magically figure out "8,9,10,11,12,...... whatever"

(if 8 and 9 are the only ones - this is easy, search for pivot and you are done. If you want magic - where it figures out how many columns the query will return - that won't be happening, you'd have to run a query to figure out how many columns you are going to have and then build a query that matches that using the same technique as 8, 9 did)
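
As a rough illustration of the fixed-column pivot Tom describes (the month list is known in advance, here 8 and 9; function and layout are hypothetical):

```python
def pivot(rows, months=(8, 9)):
    """rows: (month, acct, subacct, monvol, yearvol) tuples.
    Builds one output row per (acct, subacct) with a (monvol, yearvol)
    pair per known month - missing months default to (0, 0)."""
    out = {}
    for month, acct, sub, monvol, yearvol in rows:
        key = (acct, sub)
        # start every account/subaccount with zeroed columns for each month
        row = out.setdefault(key, {m: (0, 0) for m in months})
        if month in row:
            row[month] = (monvol, yearvol)
    return {k: [row[m] for m in months] for k, row in sorted(out.items())}

rows = [(8, 10, 1, 100, 200), (9, 10, 1, 50, 10), (9, 20, 1, 2, 15)]
print(pivot(rows))
# {(10, 1): [(100, 200), (50, 10)], (20, 1): [(0, 0), (2, 15)]}
```

The key design point matches Tom's caveat: the output "columns" (months) must be fixed up front; no query, SQL or otherwise, can return a result whose shape is unknown until runtime without a first pass to discover it.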

rank fn in 8.1.7.4.0

April 02, 2008 - 1:44 pm UTC

Reviewer: sara from NJ

Hi Tom,

The version I am using is, 8.1.7.4.0
When I run this on sql prompt, it works

SELECT empno from (
SELECT empno, ename, RANK() OVER (PARTITION BY ename ORDER BY JDT DESC ) authorrank
FROM emp
WHERE rownum < 5)
WHERE authorrank IN (1,2) AND ROWNUM < 5

But When i compile that as a procedure, gives me error:

CREATE OR REPLACE PROCEDURE test_anal(
p_emp_id IN NUMBER
, p_generic_cursor OUT report_pkg.cursor_typ )
IS
BEGIN
OPEN p_generic_cursor FOR
SELECT empno from (
SELECT empno, ename, RANK() OVER (PARTITION BY ename ORDER BY JDT DESC ) authorrank
FROM emp
WHERE rownum < 5)
WHERE authorrank IN (1,2) AND ROWNUM < 5;
END test_anal;

Please let me know if this is allowed in procedural calls.

Thanks

Sara
Tom Kyte

Followup  

April 02, 2008 - 2:10 pm UTC

It is - in 9i and above.

in 8i, you can:
CREATE OR REPLACE PROCEDURE test_anal(
          p_emp_id    IN  NUMBER
        , p_generic_cursor  OUT report_pkg.cursor_typ )
IS
BEGIN
   OPEN p_generic_cursor FOR
       'SELECT empno from (
         SELECT empno, ename, 
                RANK() OVER (PARTITION BY ename  ORDER BY JDT DESC ) authorrank
         FROM   emp
         WHERE  rownum < 5)
      WHERE authorrank IN (1,2) AND ROWNUM < 5';
END test_anal;


use "dynamic" sql - that'll hide it from the plsql parser which didn't understand this construct way way way back when.

rank fn in 8.1.7.4.0

April 02, 2008 - 3:20 pm UTC

Reviewer: sara from NJ

Thanks Tom! That helps

Sara

rank fn in 8.1.7.4.0

April 02, 2008 - 3:21 pm UTC

Reviewer: sara from NJ

Thank you Tom.

query

May 13, 2008 - 8:26 pm UTC

Reviewer: Sean from NJ, USA

Hi Tom,

Here is the logic:
(1) when t1.c2 is not null, use both c1 and c2 in join
(2) when t1.c2 is null, use c1 for join only



create table t1(c1 number, c2 number);


create table t2(c1 number, c2 number);


insert into t1(c1, c2) values(1, 10);
insert into t1(c1) values(2);


insert into t2(c1, c2) values(2, 20);
insert into t2(c1, c2) values(1, 10);
insert into t2(c1) values(1);



Here is the result I want:



        C1         C2
---------- ----------
         1         10
         2         20



-- Query is like this one, but this one does not work as I want.
select t2.c1, t2.c2
from t1, t2
where t2.c1 = t1.c1(+) and t2.c2 = t1.c2(+);

Thanks so much for your help. -- Sean


Tom Kyte

Followup  

May 14, 2008 - 10:34 am UTC

that is not logical logic.

it is rather ugly.

but, your join is

t1.c1 = t2.c1

and then you have a filter

a) t2.c2 = t1.c2 OR t1.c2 is null


ops$tkyte%ORA10GR2> select *
  2    from t1, t2
  3   where t1.c1 = t2.c1
  4     and (t1.c2 is null OR t1.c2 = t2.c2)
  5  /

        C1         C2         C1         C2
---------- ---------- ---------- ----------
         2                     2         20
         1         10          1         10

How analytic functions work

May 14, 2008 - 5:34 am UTC

Reviewer: Dibyendu from India

Hi Tom,
I have used a number of analytical functions. But I am not sure how these analytical functions actually work internally. Could you please give an idea?
Thanks a lot.
Tom Kyte

Followup  

May 14, 2008 - 3:27 pm UTC

not sure what you mean by "work internally"??

they sort, cache, sift through data - like order by, group by, whatever would

Ranking Function

May 29, 2008 - 9:42 am UTC

Reviewer: Prasanth from Germany

Hi Tom,
As per the business requirement, I want to compute median-style RANK values for a column. Could you please tell me how to achieve this using analytic functions?

Create table account ( month number(2), acct number(5),amt number(5));

INSERT INTO account values(1,101,90);
INSERT INTO account values(1,101,50);
INSERT INTO account values(1,101,50);
INSERT INTO account values(1,101,50);
INSERT INTO account values(1,101,40);
INSERT INTO account values(1,102,90);
INSERT INTO account values(1,102,50);
INSERT INTO account values(1,102,50);
commit;

select month,acct,amt,rank() over(partition by acct order by amt desc NULLS LAST) as rnk from account

     MONTH       ACCT        AMT        RNK
---------- ---------- ---------- ----------
         1        101         90          1
         1        101         50          2
         1        101         50          2
         1        101         50          2
         1        101         40          5
         1        102         90          1
         1        102         50          2
         1        102         50          2

But I require the RANK output as:

     MONTH       ACCT        AMT        RNK
---------- ---------- ---------- ----------
         1        101         90          1
         1        101         50          3
         1        101         50          3
         1        101         50          3
         1        101         40          5
         1        102         90          1
         1        102         50          3
         1        102         50          3

Thank you.
Tom Kyte

Followup  

May 29, 2008 - 10:00 am UTC

well, I guess you'll have to explain your definition of a "median rank" in the manner you are presenting it.


Ranking Function

May 29, 2008 - 11:06 am UTC

Reviewer: Prasanth from Germany

Hi Tom,
Sorry for the confusing statement.
If there is a tie on the ranking column (in this case, the amount), then instead of assigning the rank number of the tie (i.e., 2 in this case), I want to assign the mean of the tied rows' rank numbers.

For the tied rows, take the rank numbers in sequential order and compute (sum of those rank numbers / number of tied rows).
For this example, for the tied amount 50 of account 101: (2 + 3 + 4) / 3 = 3.

Thank you
Tom Kyte

Followup  

May 29, 2008 - 11:26 am UTC

perhaps something like:

ops$tkyte%ORA9IR2> select month, acct, amt, rnk,
  2         round( avg(rn) over (partition by acct, rnk) ) med_rnk
  3    from (
  4  select month,acct,amt,
  5         row_number() over(partition by acct order by amt desc NULLS LAST) as rn,
  6         rank() over(partition by acct order by amt desc NULLS LAST) as rnk
  7    from account
  8         )
  9  /

     MONTH       ACCT        AMT        RNK    MED_RNK
---------- ---------- ---------- ---------- ----------
         1        101         90          1          1
         1        101         50          2          3
         1        101         50          2          3
         1        101         50          2          3
         1        101         40          5          5
         1        102         90          1          1
         1        102         50          2          3
         1        102         50          2          3

8 rows selected.
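
This row_number-averaged-per-tie-group technique is the "fractional rank" familiar from statistics. A Python sketch of the same rule over one account's amounts (illustrative; unlike Tom's query, this version does not round the result):

```python
def mean_rank(values):
    """Descending rank where tied values share the average of the row
    numbers they occupy - e.g. rows 2, 3, 4 all get (2 + 3 + 4) / 3 = 3."""
    ordered = sorted(values, reverse=True)
    ranks = {}
    for v in set(ordered):
        # 1-based row numbers this value occupies in the sorted order
        positions = [i + 1 for i, x in enumerate(ordered) if x == v]
        ranks[v] = sum(positions) / len(positions)
    return [ranks[v] for v in values]

print(mean_rank([90, 50, 50, 50, 40]))  # [1.0, 3.0, 3.0, 3.0, 5.0]
```

The SQL does it per partition: row_number() supplies the positions, rank() labels the tie group, and avg() over (partition by acct, rnk) averages within it.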

Ranking Function

May 30, 2008 - 5:34 am UTC

Reviewer: Prasanth from Germany

Hi Tom,
Yes, this is exactly what I am looking for. Thank you for the quick response. We are migrating an application from the mainframe to Oracle, and the mainframe has this sort of ranking.

busy day busy hour

June 09, 2008 - 9:09 am UTC

Reviewer: Ramakrishnan from Bangalore

Hi Tom,

My table

CREATE TABLE TRAFFIC
( CO_ID NUMBER NOT NULL ENABLE,
START_TIME DATE NOT NULL ENABLE,
TCH_REQUEST NUMBER,
CONSTRAINT TRAFFIC_PK PRIMARY KEY (CO_ID, START_TIME)
)

I am using the following SQL to find the busy hour of a day: finding when a counter (TCH_request) is at its maximum within a day.

Select co_id, Day, hour, TCH_request FROM
( Select co_id, Day, hour, TCH_request, RANK() OVER (PARTITION BY Day ORDER BY Day,
TCH_request DESC NULLS LAST) as TCH_request_Rank
FROM
( SELECT co_id, TRUNC(START_TIME,'DD') Day, TRUNC(START_TIME,'HH24') Hour,
SUM(TCH_request) TCH_request
FROM omc.Traffic
group by TRUNC(START_TIME,'HH24'), co_id, TRUNC(START_TIME,'DD') ) groupByHr
) hr_Rank
WHERE TCH_request_Rank = 1;

The innermost SQL groups by hour, the next level ranks the hours within each day, and the outermost SQL displays the busy hour of each day.


This SQL is perfectly fine.


1. But is it possible to find the "busy day busy hour", i.e., find the busy day in a week and, on that busy day, find the busy hour?
Is it possible to do in a single SQL statement?

2. Is it possible to find the busy day in every 3 days and, on that busy day, find the busy hour?

Best regards
Ramakrishnan.A

Tom Kyte

Followup  

June 09, 2008 - 1:46 pm UTC

... This SQL is perfectly fine. ...

I think not.

it does not return the "busiest hour"

it returns the busiest hour on the first day for a GIVEN co_id

the inner most sql does not aggregate by hour, it aggregates by co_id and hour of day.

please rephrase this entirely - it doesn't add up.

busy day busy hour

June 10, 2008 - 5:55 am UTC

Reviewer: Ramakrishnan.A from Bangalore

Hi Tom,

Thanks for pointing out in first look itself !!

I corrected my SQL; I have now added "PARTITION BY co_id, Day ORDER BY Day". I want to find the busiest hour for a given co_id, for every day.

Select co_id, to_char( Day, 'dd-mm-yyyy' ), to_char( hour, 'dd-mm-yyyy hh:mi' ), TCH_request FROM
( Select co_id, Day, hour, TCH_request, RANK() OVER ( PARTITION BY co_id, Day ORDER BY Day,
TCH_request DESC NULLS LAST) as TCH_request_Rank
FROM
( SELECT co_id, TRUNC(START_TIME,'DD') Day, TRUNC(START_TIME,'HH24') Hour,
SUM(TCH_request) TCH_request
FROM omc.Traffic
group by TRUNC(START_TIME,'HH24'), co_id, TRUNC(START_TIME,'DD') ) groupByHr
) hr_Rank
WHERE TCH_request_Rank = 1;

So my question again:
1. Is it possible to find the "busy day busy hour", i.e., find the busy day in a week and, on that busy day, find the busy hour - in a single SQL statement?
2. Is it possible to find the busy day in every 3 days and, on that busy day, find the busy hour?


Best regards
Ramakrishnan.A
Tom Kyte

Followup  

June 10, 2008 - 8:08 am UTC

so, now, I need data - interesting data - provide the create table and insert into statements - make sure to have more than one co_id, a busy day that is not the "first" day and so on.
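
Independent of the data, the two-level ranking the question describes can be stated procedurally: total the traffic per day, pick the top day, then pick that day's top hour. A hedged Python sketch for a single co_id (names and the day/hour keys are my assumptions, not the thread's answer):

```python
from collections import defaultdict

def busy_day_busy_hour(rows):
    """rows: (day, hour, tch_request) tuples for one co_id.
    Picks the day with the highest daily total, then that day's
    busiest hour - the nested rank the question asks about."""
    by_day = defaultdict(int)
    by_hour = defaultdict(int)
    for day, hour, tch in rows:
        by_day[day] += tch               # level 1: daily totals
        by_hour[(day, hour)] += tch      # level 2: hourly totals
    busy_day = max(by_day, key=by_day.get)
    busy_hour = max((h for d, h in by_hour if d == busy_day),
                    key=lambda h: by_hour[(busy_day, h)])
    return busy_day, busy_hour

rows = [("09-06", "06", 23), ("09-06", "07", 78), ("10-06", "06", 35)]
print(busy_day_busy_hour(rows))  # ('09-06', '07')
```

In SQL the same shape would presumably be two nested rankings (rank days within the week, then hours within the winning day) rather than one.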

Busy day busy hour

June 10, 2008 - 12:05 pm UTC

Reviewer: Ramakrishnan.A from Bangalore

Hi Tom ,

I didn't provide the data earlier because I thought it would fill the page.


CREATE TABLE "TRAFFIC"
( "CO_ID" NUMBER NOT NULL ENABLE,
"START_TIME" DATE NOT NULL ENABLE,
"TCH_REQUEST" NUMBER,
CONSTRAINT "TRAFFIC_PK" PRIMARY KEY ("CO_ID", "START_TIME"))

data :


insert into traffic values ( 1, to_date('9-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('9-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('9-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 12 );
insert into traffic values ( 1, to_date('9-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 2, to_date('9-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('9-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 12 );

insert into traffic values ( 1, to_date('9-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 65 );
insert into traffic values ( 2, to_date('9-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 30 );
insert into traffic values ( 3, to_date('9-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 9 );
insert into traffic values ( 1, to_date('9-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('9-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('9-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 9 );

insert into traffic values ( 1, to_date('9-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('9-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 70 );
insert into traffic values ( 3, to_date('9-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 35);
insert into traffic values ( 1, to_date('9-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('9-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('9-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 14 );

insert into traffic values ( 1, to_date('10-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 25 );
insert into traffic values ( 2, to_date('10-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('10-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 12 );
insert into traffic values ( 1, to_date('10-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 2, to_date('10-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('10-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 12 );

insert into traffic values ( 1, to_date('10-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 2, to_date('10-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 73 );
insert into traffic values ( 3, to_date('10-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 9 );
insert into traffic values ( 1, to_date('10-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('10-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('10-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 9 );

insert into traffic values ( 1, to_date('10-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('10-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('10-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 35);
insert into traffic values ( 1, to_date('10-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('10-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('10-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 14 );

insert into traffic values ( 1, to_date('11-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 25 );
insert into traffic values ( 2, to_date('11-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('11-06-2008 6:00:00','dd-mm-yyyy hh:mi:ss'), 12 );
insert into traffic values ( 1, to_date('11-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 2, to_date('11-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 11 );
insert into traffic values ( 3, to_date('11-06-2008 6:30:00','dd-mm-yyyy hh:mi:ss'), 12 );

insert into traffic values ( 1, to_date('11-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('11-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 30 );
insert into traffic values ( 3, to_date('11-06-2008 7:00:00','dd-mm-yyyy hh:mi:ss'), 9 );
insert into traffic values ( 1, to_date('11-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('11-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('11-06-2008 7:30:00','dd-mm-yyyy hh:mi:ss'), 9 );

insert into traffic values ( 1, to_date('11-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('11-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('11-06-2008 8:00:00','dd-mm-yyyy hh:mi:ss'), 85);
insert into traffic values ( 1, to_date('11-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 13 );
insert into traffic values ( 2, to_date('11-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 10 );
insert into traffic values ( 3, to_date('11-06-2008 8:30:00','dd-mm-yyyy hh:mi:ss'), 14 );

The busiest day for each CO_ID is shown below; on this day I want the busiest hour:

CO_ID DAY TCH_REQUEST
------ ------ ------------
1 09-06-2008 127
2 09-06-2008 142
3 11-06-2008 141

Tom Kyte

Followup  

June 10, 2008 - 2:47 pm UTC


ops$tkyte%ORA10GR2> select co_id, dt, tch, sum_tch
  2    from (
  3  select co_id, dt, tch, sum_tch, rank() over (partition by co_id order by sum_tch DESC nulls last) big_by_day
  4    from (
  5  select co_id, dt, trunc(dt,'dd'), tch,
  6         rank() over (partition by co_id, trunc(dt,'dd') order by tch DESC nulls last) big_by_hour,
  7             sum(tch) over (partition by co_id, trunc(dt,'dd') ) sum_tch
  8    from (
  9  select co_id, trunc(start_time,'hh') dt, sum(tch_request) tch
 10    from traffic
 11   group by co_id, trunc(start_time,'hh')
 12         )
 13             )
 14   where big_by_hour = 1
 15         )
 16   where big_by_day = 1
 17  /

     CO_ID DT                          TCH    SUM_TCH
---------- -------------------- ---------- ----------
         1 09-jun-2008 07:00:00         78        127
         2 09-jun-2008 08:00:00         80        142
         3 11-jun-2008 08:00:00         99        141
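Tom's layered approach (rank hours within each day, then rank days by their total) ports to any engine with window functions. A minimal sketch via Python's sqlite3 (SQLite 3.25+ for window functions), with shortened, assumed table and column names, and the half-hour-to-hour GROUP BY step omitted because the sample rows here are already hourly:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE traffic (co_id INT, hour TEXT, day TEXT, tch INT);
INSERT INTO traffic VALUES
  (1,'09 06:00','09',23),(1,'09 07:00','09',78),(1,'09 08:00','09',26),
  (1,'10 06:00','10',35),(1,'10 07:00','10',23),(1,'10 08:00','10',26);
""")
rows = con.execute("""
SELECT co_id, hour, tch, sum_tch
  FROM (SELECT co_id, hour, tch, sum_tch,
               -- rank whole days per co_id by their total traffic
               RANK() OVER (PARTITION BY co_id ORDER BY sum_tch DESC) big_by_day
          FROM (SELECT co_id, hour, day, tch,
                       -- rank hours within each (co_id, day)
                       RANK() OVER (PARTITION BY co_id, day ORDER BY tch DESC) big_by_hour,
                       SUM(tch) OVER (PARTITION BY co_id, day) sum_tch
                  FROM traffic)
         WHERE big_by_hour = 1)
 WHERE big_by_day = 1
""").fetchall()
print(rows)  # -> [(1, '09 07:00', 78, 127)]
```

Day '09' totals 127 versus 84 for day '10', so the busiest hour of the busiest day (07:00, tch 78) is the one row that survives both filters.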



busy day busy hour

June 13, 2008 - 12:25 pm UTC

Reviewer: Ramakrishnan.A from Bangalore

Thanks Tom,
I am trying to find the busiest day of a week, and the busiest hour of that busy day. I tested it and it works fine.
How can I find the busiest day within 3 days or 5 days?


Select * from (
select co_id, to_char(dt,'dd-mm-yyyy HH:MI') BDBH, to_char(dt,'IW') week, tch,big_by_hour, tch_day_sum ,big_by_day
from (
select co_id, dt, tch, tch_day_sum, dense_rank() over (partition by co_id, trunc(dt,'IW') order by tch_day_sum DESC nulls last) big_by_day
,dense_rank() over ( partition by co_id, trunc(dt,'dd') order by tch DESC nulls last) big_by_hour
from (
select co_id, dt, trunc(dt,'dd'), tch,
--rank() over (partition by co_id, trunc(dt,'dd') order by tch DESC nulls last) big_by_hour,
sum(tch) over (partition by co_id, trunc(dt,'dd') ) tch_day_sum
from (
select co_id, trunc(start_time,'hh') dt, sum(tch_request) tch
from traffic
group by co_id, trunc(start_time,'hh')
)
)
)
)
where big_by_day = 1 and big_by_hour = 1
order by week, co_id

Tom Kyte

Followup  

June 13, 2008 - 1:26 pm UTC

... How can I find Busy day with in 3 days or 5 days ?
....

define better what you mean. don't post a query that doesn't work and expect us to reverse engineer it to figure out what you really meant - post a textual description detailing precisely the logic you need to employ - like a requirement, like you were describing to your mom how to do this.

Sorting columns

June 16, 2008 - 8:31 am UTC

Reviewer: Bhavesh Ghodasara from Ahmedabad,Gujarat,India

Hi Tom,

I want to sort column values.

e.g.
create table t
(a number,
b number,
c number,
d number,
e number,
f number);

insert into t values (6,3,2,1,4,5);

insert into t values (40,10,60,30,20,50);

select *
from t;

A B C D E F

6 3 2 1 4 5
40 10 60 30 20 50


I want output like :

1#2#3#4#5#6
10#20#30#40#50#60

It is possible that some columns contains null value. In that case we can ignore that value.

Thanks in advance.
Tom Kyte

Followup  

June 16, 2008 - 1:13 pm UTC

one has to wonder what sort of data this would be then, to be stored in a relational database - I'll bet the column names are like "C1, C2, ... C6" - meaning they are a 1:m relationship stored incorrectly IN RECORD.


ops$tkyte%ORA10GR2> select max( decode( rn, 1, column_value ) ) c1,
  2         max( decode( rn, 2, column_value ) ) c2,
  3         max( decode( rn, 3, column_value ) ) c3,
  4         max( decode( rn, 4, column_value ) ) c4,
  5         max( decode( rn, 5, column_value ) ) c5,
  6         max( decode( rn, 6, column_value ) ) c6
  7    from (
  8  select t.rowid rid, c.column_value, row_number() over (partition by t.rowid order by c.column_value) rn
  9    from T, table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c
 10         )
 11   group by rid
 12  /

        C1         C2         C3         C4         C5         C6
---------- ---------- ---------- ---------- ---------- ----------
         1          2          3          4          5          6
        10         20         30         40         50         60
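The TABLE(sys.odcinumberlist(...)) unpivot is Oracle-specific, but the sort-across-columns idea itself is portable: unpivot with UNION ALL, number the values with ROW_NUMBER, then re-pivot with MAX(CASE). A sketch via Python's sqlite3, using a hypothetical id column in place of rowid and three columns instead of six:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (id INT, a INT, b INT, c INT);
INSERT INTO t VALUES (1, 6, 3, 2), (2, 40, 10, 60);
""")
rows = con.execute("""
SELECT id,
       -- re-pivot: value ranked 1st goes to c1, 2nd to c2, ...
       MAX(CASE WHEN rn = 1 THEN v END) c1,
       MAX(CASE WHEN rn = 2 THEN v END) c2,
       MAX(CASE WHEN rn = 3 THEN v END) c3
  FROM (SELECT id, v,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY v) rn
          FROM (SELECT id, a AS v FROM t          -- unpivot the three
                UNION ALL SELECT id, b FROM t     -- columns into rows
                UNION ALL SELECT id, c FROM t))
 GROUP BY id
 ORDER BY id
""").fetchall()
print(rows)  # -> [(1, 2, 3, 6), (2, 10, 40, 60)]
```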


why is odcinumberlist not working for a simple query?

June 18, 2008 - 4:34 pm UTC

Reviewer: A reader

askus@oracle11g> select * from table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c;
select * from table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c
                                                           *
ERROR at line 1:
ORA-00904: "F": invalid identifier


askus@oracle11g> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE    11.1.0.6.0      Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

askus@oracle11g> table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c;
SP2-0734: unknown command beginning "table ( se..." - rest of line ignored.
askus@oracle11g> select * from table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c;
select * from table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c
                                                           *
ERROR at line 1:
ORA-00904: "F": invalid identifier


askus@oracle11g> 

Tom Kyte

Followup  

June 18, 2008 - 4:39 pm UTC

well, you are not selecting from a table that has F, you are selecting from dual which only has dummy

I had their table T

  9    from T, table ( select sys.odcinumberlist( a,b,c,d,e,f ) from dual ) c


a,b,c,d,e,f came from T

How to do "right outer join" using lag/lead

July 09, 2008 - 3:01 pm UTC

Reviewer: Salman Syed from Pittsburgh, PA

Tom,

I have been struggling with this for some time now. Here is a simplified version of my problem.

Suppose the data is as follows:

Month Year Salary
1 2008 500
2 2008 600
1 2007 400
2 2007 450
3 2007 475
4 2008 625

The output I want is:

Month This_Year_Salary Last_Year_Salary
1 500 400
2 600 450
3 (null) 475
4 625 (null)

Is there a way to achieve this using analytics?

If I use lag, I miss the third record in the output. If I use lead, I miss the fourth records in the output.

Thanks a lot for helping me out.

Tom Kyte

Followup  

July 09, 2008 - 3:24 pm UTC

this is a pivot

select month, 
       max( decode( year, 2007, salary ) ) last_year,
       max( decode( year, 2008, salary ) ) this_year
  from t
 group by month
 order by month
/


analytics are for when you want all rows preserved (you do not, you want to squish out some)
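DECODE is Oracle shorthand; CASE is the portable spelling of the same pivot. A sketch of the technique via Python's sqlite3, reusing the month/year salary data from the question:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (month INT, year INT, salary INT);
INSERT INTO t VALUES (1,2008,500),(2,2008,600),(1,2007,400),
                     (2,2007,450),(3,2007,475),(4,2008,625);
""")
rows = con.execute("""
SELECT month,
       -- one output column per year; MAX collapses the group to the
       -- single matching salary, or NULL when that year is absent
       MAX(CASE WHEN year = 2008 THEN salary END) this_year,
       MAX(CASE WHEN year = 2007 THEN salary END) last_year
  FROM t
 GROUP BY month
 ORDER BY month
""").fetchall()
print(rows)  # -> [(1, 500, 400), (2, 600, 450), (3, None, 475), (4, 625, None)]
```

Months 3 and 4 come back with a NULL on the missing side, exactly the "(null)" rows the question asked for; no LAG/LEAD needed.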


August 04, 2008 - 10:35 am UTC

Reviewer: Reader

create table business_data
(dt date
,val varchar2(5)
,price1 number(5,2)
,price2 number(5,2)
);


select to_date('06/26/2008 9:30:20 AM' ,'mm/dd/yyyy HH:MI:SS AM') from dual

insert into business_data
values
(to_date('06/26/2008 9:30:20 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.98,34.03);

insert into business_data
values
(to_date('06/26/2008 9:30:20 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.98,34.02);


insert into business_data
values
(to_date('06/26/2008 09:30:21 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.98,34.06);

insert into business_data
values
(to_date('06/26/2008 09:30:21 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.98,34.07);


insert into business_data
values
(to_date('06/26/2008 09:30:25 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.05,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);

insert into business_data
values
(to_date('06/26/2008 09:30:25 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',33.99,34.09);


insert into business_data
values
(to_date('06/26/2008 09:30:26 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.02,34.1);

insert into business_data
values
(to_date('06/26/2008 09:30:28 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.02,34.1);

insert into business_data
values
(to_date('06/26/2008 09:30:29 AM', 'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.02,34.08);

insert into business_data
values
(to_date('06/26/2008 09:30:29 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.02,34.08);


insert into business_data
values
(to_date('06/26/2008 09:30:39 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.03,34.03);

insert into business_data
values
(to_date('06/26/2008 09:30:39 AM' ,'mm/dd/yyyy HH:MI:SS AM'),'ABC',34.03,34.06);

commit;

DT VAL PRICE1 PRICE2
---------------------- ----- ---------- ----------
06/26/2008 09:30:20 am ABC 33.98 34.03
06/26/2008 09:30:20 am ABC 33.98 34.02
06/26/2008 09:30:21 am ABC 33.98 34.06
06/26/2008 09:30:21 am ABC 33.98 34.07
06/26/2008 09:30:25 am ABC 33.99 34.09
06/26/2008 09:30:25 am ABC 33.99 34.09
06/26/2008 09:30:25 am ABC 34.05 34.09
06/26/2008 09:30:25 am ABC 33.99 34.09
06/26/2008 09:30:25 am ABC 33.99 34.09
06/26/2008 09:30:25 am ABC 33.99 34.09
06/26/2008 09:30:25 am ABC 33.99 34.09
06/26/2008 09:30:26 am ABC 34.02 34.1
06/26/2008 09:30:28 am ABC 34.02 34.1
06/26/2008 09:30:29 am ABC 34.02 34.08
06/26/2008 09:30:29 am ABC 34.02 34.08
06/26/2008 09:30:39 am ABC 34.03 34.03
06/26/2008 09:30:39 am ABC 34.03 34.06

For each minute in the business_data table, I need to get the min of price1 column and max of price2 column and sort the data by val, dt column.
Here is the query I used to get this data.

select distinct val,dt, min_price_1,max_price_2
from(
select val,
dt,
first_value(price1) over (partition by val,dt order by price1) min_price_1,
first_value(price2) over (partition by val,dt order by price2 desc) max_price_2
from business_data
)
order by val,dt;

VAL DT MIN_PRICE_1 MAX_PRICE_2
----- ---------------------- ----------- -----------
ABC 06/26/2008 09:30:20 am 33.98 34.03
ABC 06/26/2008 09:30:21 am 33.98 34.07
ABC 06/26/2008 09:30:25 am 33.99 34.09
ABC 06/26/2008 09:30:26 am 34.02 34.1
ABC 06/26/2008 09:30:28 am 34.02 34.1
ABC 06/26/2008 09:30:29 am 34.02 34.08
ABC 06/26/2008 09:30:39 am 34.03 34.06

7 rows selected.



1st record: If the time difference between the first record and second record is one second,
I need to get the minimum of PRICE_1 from the 1st and 2nd record (i.e min(33.98,33.98) = 33.98)
I need to get the maximum of PRICE_2 from the 1st and 2nd record (i.e max(34.03,34.07) = 34.07)

Result:
val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:20 am, 33.98,34.07


2nd record: If the time difference between the 2nd record and 1st record is one second, and the time difference between the 2nd and 3rd record is one second,
I need to take the min PRICE_1 among the first three records (i.e. min(33.98,33.98,33.99) = 33.98)
I need to take the max PRICE_2 among the first three records (i.e. max(34.03,34.07,34.09) = 34.09)

val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:21 am,33.98,34.09


I do not have data for 06/26/2008 09:30:22 am,06/26/2008 09:30:23 am, 06/26/2008 09:30:24 am. I should get null values for that time

val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:22 am,null,null
ABC,06/26/2008 09:30:23 am,null,null
ABC,06/26/2008 09:30:24 am,null,null

For 09:30:25 am, i should compare it with 09:30:26 am only, since there is no data for 09:30:24 am.
val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:21 am,33.99,34.1

Is this possible to do using analytical functions?
Tom Kyte

Followup  

August 04, 2008 - 2:06 pm UTC

you wrote:


For each minute in the business_data table, I need to get the min of price1
column and max of price2 column
and sort the data by val, dt column.
Here is the query I used to get this data.


but you don't do that - you are breaking the data up by val,dt - ordering by price1 - there is nothing about minutes here at all?


so, tell you what, lose the queries that do not work - and in text, describe what you want.


your description contradicts your query contradicts your subsequent "within one second" talk. You talk of first and next records - we presume you really mean "after partitioning by value" so there are many possible first records (and update the inserts to reflect that, make the data reasonable)

what if 6,000 records all exist within 1 second of each other - does "minute" still come into play? spend a little more time on the problem description - like you were writing a specification for a programmer to write code from - it'll take time to do that (and you might find the answer to your question as you do)


August 04, 2008 - 10:40 am UTC

Reviewer: Reader

Last record should be as below. Apologies for the typo

val,dt,PRICE_1,PRICE_2
ABC,06/26/2008 09:30:25 am,33.99,34.1

can rolling up be done easily ?

August 05, 2008 - 10:59 am UTC

Reviewer: Pranav Parekh from MA, USA

I want to roll up the sum of salaries for each employee to his manager in a hierarchy. For example, I want to add one more column to the query below to show, for each manager, the sum of salaries of the employees who work under them. Can we do it in one query?

SELECT EMPNO, ENAME, SAL, MGR,
SYS_CONNECT_BY_PATH (ENAME, '/') HIER, LEVEL
FROM EMP
START WITH MGR IS NULL
CONNECT BY PRIOR EMPNO = MGR;
Tom Kyte

Followup  

August 05, 2008 - 1:08 pm UTC


ops$tkyte%ORA10GR2> SELECT EMPNO, ENAME, SAL, MGR,  SYS_CONNECT_BY_PATH (ename, '/') HIER, LEVEL,
  2        (select sum(e2.sal) from scott.emp e2 start with e2.empno = e1.empno connect by prior e2.empno = e2.mgr) sum_sal
  3  FROM scott.EMP e1
  4  START WITH MGR IS NULL
  5  CONNECT BY PRIOR EMPNO = MGR;

     EMPNO ENAME             SAL        MGR HIER                                LEVEL    SUM_SAL
---------- ---------- ---------- ---------- ------------------------------ ---------- ----------
      7839 KING             5000            /KING                                   1      26825
      7566 JONES            2975       7839 /KING/JONES                             2       8675
      7788 SCOTT             800       7566 /KING/JONES/SCOTT                       3       1900
      7876 ADAMS            1100       7788 /KING/JONES/SCOTT/ADAMS                 4       1100
      7902 FORD             3000       7566 /KING/JONES/FORD                        3       3800
      7369 SMITH             800       7902 /KING/JONES/FORD/SMITH                  4        800
      7698 BLAKE            2850       7839 /KING/BLAKE                             2       9400
      7499 ALLEN            1600       7698 /KING/BLAKE/ALLEN                       3       1600
      7521 WARD             1250       7698 /KING/BLAKE/WARD                        3       1250
      7654 MARTIN           1250       7698 /KING/BLAKE/MARTIN                      3       1250
      7844 TURNER           1500       7698 /KING/BLAKE/TURNER                      3       1500
      7900 JAMES             950       7698 /KING/BLAKE/JAMES                       3        950
      7782 CLARK            2450       7839 /KING/CLARK                             2       3750
      7934 MILLER           1300       7782 /KING/CLARK/MILLER                      3       1300

14 rows selected.
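CONNECT BY and the scalar-subquery rollup are Oracle-specific; in engines without them the same per-node subtree sum can be sketched with a recursive CTE that pairs every employee with each of its descendants, then groups by the root. A minimal example via Python's sqlite3 on a hypothetical three-row emp table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE emp (empno INT, mgr INT, sal INT);
INSERT INTO emp VALUES (1, NULL, 100), (2, 1, 50), (3, 2, 25);
""")
rows = con.execute("""
WITH RECURSIVE sub(root, empno, sal) AS (
  SELECT empno, empno, sal FROM emp            -- every employee roots its own subtree
  UNION ALL
  SELECT s.root, e.empno, e.sal                -- add direct reports, recursively
    FROM sub s JOIN emp e ON e.mgr = s.empno
)
SELECT root, SUM(sal) AS sum_sal
  FROM sub
 GROUP BY root
 ORDER BY root
""").fetchall()
print(rows)  # -> [(1, 175), (2, 75), (3, 25)]
```

Each employee's sum includes its own salary plus everything below it, matching the SUM_SAL column in Tom's output.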

Thanks !!

August 06, 2008 - 3:38 pm UTC

Reviewer: Pranav Parekh from MA, USA

Thanks Tom. It works.

can i use Analytic to ..

August 07, 2008 - 4:21 am UTC

Reviewer: rain from Taiwan

hi, Tom

select * from (
select empno,ename,job,deptno,
row_number() over(partition by deptno order by empno) as seq
from emp);

EMPNO ENAME JOB DEPTNO SEQ
---------- ---------- ---------- ---------- ----------
7369 SMITH CLERK 20 1
7566 JONES MANAGER 20 2
7788 SCOTT ANALYST 20 3
7876 ADAMS CLERK 20 4
7902 FORD ANALYST 20 5
7499 ALLEN SALESMAN 30 1
7521 WARD SALESMAN 30 2
7654 MARTIN SALESMAN 30 3
7698 BLAKE MANAGER 30 4
7844 TURNER SALESMAN 30 5
7900 JAMES CLERK 30 6


if i want to get:
two of emp in deptno 20, and three of emp in deptno 30

EMPNO ENAME JOB DEPTNO SEQ
---------- ---------- ---------- ---------- ----------
7369 SMITH CLERK 20 1
7566 JONES MANAGER 20 2
7499 ALLEN SALESMAN 30 1
7521 WARD SALESMAN 30 2
7654 MARTIN SALESMAN 30 3

Right now I use two SQL statements with UNION to achieve this, but
can it be achieved with analytic functions in a single SQL statement?

Thanks
Maer
Tom Kyte

Followup  

August 07, 2008 - 1:23 pm UTC

select * from (
select empno,ename,job,deptno,
row_number() over(partition by deptno order by empno) as seq
from emp)
where (deptno=20 and seq <= 2)
or (deptno=30 and seq<=3);
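The same "different top-N per partition" filter, sketched end to end via Python's sqlite3 (window functions need SQLite 3.25+), on a cut-down emp table holding only the empno and deptno columns:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE emp (empno INT, deptno INT);
INSERT INTO emp VALUES (7369,20),(7566,20),(7788,20),(7876,20),(7902,20),
                       (7499,30),(7521,30),(7654,30),(7698,30),(7844,30),(7900,30);
""")
rows = con.execute("""
SELECT empno, deptno, seq FROM (
  SELECT empno, deptno,
         ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY empno) seq
    FROM emp)
 -- each department gets its own cutoff in the outer WHERE
 WHERE (deptno = 20 AND seq <= 2)
    OR (deptno = 30 AND seq <= 3)
 ORDER BY deptno, seq
""").fetchall()
print(rows)  # -> [(7369, 20, 1), (7566, 20, 2), (7499, 30, 1), (7521, 30, 2), (7654, 30, 3)]
```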

thanks~~

August 08, 2008 - 12:09 am UTC

Reviewer: rain from taiwan

YA~YA~YA~ thanks !!
select * from (
select empno,ename,job,deptno,
row_number() over(partition by deptno order by empno) as seq from emp)
where (deptno>=30 and seq <= 3) or (deptno<30 and seq<=2);

I was too focused on the analytic functions and neglected the WHERE condition!
Tom, you are the smartest.

Now I still have a problem.
The deptnos are 10, 20, 30.
There are 3 emps in deptno 10, 5 emps in deptno 20, and 6 emps in deptno 30.
I want to display 5 emps: three from deptno 30, and two from below deptno 30 (deptno 10 or 20).

My original SQL is:

select * from(
select empno,ename,job,deptno from emp where deptno=30 order by sys_guid())
where rownum<4
union
select * from(
select empno,ename,job,deptno from emp where deptno <30 order by sys_guid())
where rownum<3;

 EMPNO ENAME    JOB         DEPTNO
------ -------- --------- --------
  7698 BLAKE    MANAGER         30
  7844 TURNER   SALESMAN        30
  7900 JAMES    CLERK           30
  7902 FORD     ANALYST         20
  7934 MILLER   CLERK           10

(Our production system has some queries like this; I want to tune them for performance.)

I use analytic sql like last question, but how to group deptno 30 , below deptno 30(20,10) ?
Can it use analytic functions and single sql achieve??

thanks

above

August 13, 2008 - 4:50 am UTC

Reviewer: rain from Taiwan

Hi,Tom
Would you give me some hints on my last question?

thanks
Tom Kyte

Followup  

August 18, 2008 - 8:54 am UTC

this:

I use analytic sql like last question, but how to group deptno 30 , below deptno 30(20,10) ?
Can it use analytic functions and single sql achieve??


did not make sense, "how to group deptno 30", what does grouping have to do with anything??



if you really mean:

we need 3 rows from deptno = 30, they are to be RANDOM rows

we need 2 rows not from deptno = 30, they are to be RANDOM rows

and you need this to be "efficient"

you are not going to get there from here easily......

RE

August 19, 2008 - 5:41 am UTC

Reviewer: rain from Taiwan

Tom ,
Thanks for your reply.

"we need 3 rows from deptno = 30, they are to be RANDOM rows "
"we need 2 rows not from deptno = 30, they are to be RANDOM rows "
Yes, we want it.
And do you mean that splitting into two SQL statements, one for deptno 30 and the other for below deptno 30,
then UNIONing them, is the best way in this case?

Performance of Analytic functions

August 28, 2008 - 9:56 am UTC

Reviewer: Tilak from Leicester, UK

Hi Tom,
I am writing a query like

select * from (SELECT c.*,
Row_number() OVER (PARTITION BY customer_number ORDER BY extract_date DESC) rnk
FROM cust_table c ) where rnk=1 ;
I want to extract one record per customer (in case of multiples), ordered by extract_date descending.

The cust_table has around 3 million records and the query takes around 5 minutes to come up with the result. It doesn't seem to be using any indexes. Is there any way to reduce the time? I want to use this in a view, so it needs to be quick.

thanks in advance,
Tom Kyte

Followup  

August 29, 2008 - 10:30 pm UTC

thank goodness it uses no indexes!!!! can you imagine how slow it would be if it did!!! holy cow!!!

if you want to see that in action, do something like:


select * from cust_table c1
where extract_date = ( select max(c2.extract_date) from cust_table c2 where c2.customer_number = c1.customer_number)

and hint the heck out of it to use the index on (customer_number,extract_date) - then sit back and wait.....


You'd need to tell us what this query waits on and does. TKPROF it (with a 10046 level 12 trace - or in 10g use dbms_monitor to trace wait events). Show us what it does and what it waits on.

Lag functions

August 28, 2008 - 2:00 pm UTC

Reviewer: Seetha from CA,USA

Hi Tom,

I want to fetch the previous record whose column value equals one. Note that we have to navigate back through the previous records until we find the record that meets that condition.

For ex:-

ColA ColB ColC

R1 1 0
R2 NULL 10
R3 NULL 10
R4 NULL 10
R5 NULL 10
R6 NULL 10

In this example, for R6, it should fetch R1 as the previous record because its ColB equals 1. We should ignore previous records with NULLs.

ColA ColB ColC

R1 1 0
R2 NULL 10

In this example, for R2, it should fetch R1 as the previous record because its ColB equals 1.

Thanks,
Seetha

Tom Kyte

Followup  

August 29, 2008 - 10:57 pm UTC

I would have enjoyed showing you lag(x ignore nulls) had only you provided

a) a version
b) a create table
c) inserts into table....


but now you know what to look for!
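For engines (and older Oracle releases) without IGNORE NULLS, the "last non-NULL value so far" can be emulated: COUNT(colB) over a running frame increments only on non-NULL rows, so it tags each stretch of NULLs with the group of the preceding non-NULL value, and MAX within that group carries the value forward. A sketch via Python's sqlite3 on a hypothetical table t (seq stands in for the row ordering):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (seq INT, colb INT);
INSERT INTO t VALUES (1,1),(2,NULL),(3,NULL),(4,7),(5,NULL);
""")
rows = con.execute("""
SELECT seq, colb,
       MAX(colb) OVER (PARTITION BY grp) AS last_non_null
  FROM (SELECT seq, colb,
               -- running count of non-NULLs: bumps only when colb is set,
               -- so NULL rows inherit the group of the last non-NULL row
               COUNT(colb) OVER (ORDER BY seq
                                 ROWS UNBOUNDED PRECEDING) AS grp
          FROM t)
 ORDER BY seq
""").fetchall()
print(rows)  # -> [(1, 1, 1), (2, None, 1), (3, None, 1), (4, 7, 7), (5, None, 7)]
```

This returns the last non-NULL value at or before each row; Oracle 11g's LAG(colb IGNORE NULLS) gives the strictly-previous one, which is the same idea shifted by a row.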

sorting in analytical functions

December 03, 2008 - 11:18 am UTC

Reviewer: hash from Pak

create table merit (
emp_id number,
dob date,
rank_code number,
sen_date date
)
/
insert into merit values(1, to_date('03/04/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(1, to_date('03/04/76', 'dd/mm/rr'), 2, to_date('05/06/00', 'dd/mm/rr'));
insert into merit values(1, to_date('03/04/76', 'dd/mm/rr'), 3, to_date('05/06/04', 'dd/mm/rr'));
insert into merit values(1, to_date('03/04/76', 'dd/mm/rr'), 4, to_date('05/06/08', 'dd/mm/rr'));
insert into merit values(2, to_date('03/05/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(2, to_date('03/05/76', 'dd/mm/rr'), 2, to_date('05/06/00', 'dd/mm/rr'));
insert into merit values(2, to_date('03/05/76', 'dd/mm/rr'), 3, to_date('05/06/04', 'dd/mm/rr'));
insert into merit values(2, to_date('03/05/76', 'dd/mm/rr'), 4, to_date('05/06/08', 'dd/mm/rr'));
insert into merit values(3, to_date('03/06/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(3, to_date('03/06/76', 'dd/mm/rr'), 2, to_date('05/06/00', 'dd/mm/rr'));
insert into merit values(3, to_date('03/06/76', 'dd/mm/rr'), 3, to_date('05/06/04', 'dd/mm/rr'));
insert into merit values(3, to_date('03/06/76', 'dd/mm/rr'), 4, to_date('05/06/08', 'dd/mm/rr'));
insert into merit values(4, to_date('03/04/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(4, to_date('03/04/76', 'dd/mm/rr'), 2, to_date('05/06/96', 'dd/mm/rr'));
insert into merit values(5, to_date('03/04/76', 'dd/mm/rr'), 1, to_date('05/06/96', 'dd/mm/rr'));
commit
/

SQL> SELECT * FROM MERIT;

    EMP_ID DOB         RANK_CODE SEN_DATE
---------- ---------- ---------- ----------
         1 03/04/1976          1 05/06/1996
         1 03/04/1976          2 05/06/2000
         1 03/04/1976          3 05/06/2004
         1 03/04/1976          4 05/06/2008
         2 03/05/1976          1 05/06/1996
         2 03/05/1976          2 05/06/2000
         2 03/05/1976          3 05/06/2004
         2 03/05/1976          4 05/06/2008
         3 03/06/1976          1 05/06/1996
         3 03/06/1976          2 05/06/2000
         3 03/06/1976          3 05/06/2004
         3 03/06/1976          4 05/06/2008
         4 03/04/1976          1 05/06/1996
         4 03/04/1976          2 05/06/1996
         5 03/04/1976          1 05/06/1996

15 rows selected.

now the sort order for a given rank_code (say 4) is:

1. oldest sen_date first.
2. if there is a tie on sen_date, then the record with oldest sen_date in previous rank_code.
3. repeat step 2 until all rank_codes for a particular emp_id are exhausted.
4. if there is still a tie, then sort by DOB.

Tom Kyte

Followup  

December 09, 2008 - 10:16 am UTC

what does this have to do with analytics?


#2 makes no sense. If there is a "tie", then you have two dates that are the same - what do you mean by referencing the "oldest of previous rank code" - you have two records already here.

Seems you want to

order by emp_id, sen_date, rank_code, dob;



I couldn't explain it well in the first place

December 11, 2008 - 6:48 am UTC

Reviewer: hash from Pak

This is the original data:

SQL> select * from merit 
  2  order by emp_id, sen_date, rank_code, dob;

    EMP_ID DOB         RANK_CODE SEN_DATE         FLAG
---------- ---------- ---------- ---------- ----------
         1 03/04/1976          1 05/06/1996
         1 03/04/1976          2 05/06/2000
         1 03/04/1976          3 05/06/2004
         1 03/04/1976          4 05/06/2008
         2 03/05/1976          1 05/06/1996
         2 03/05/1976          2 05/06/2000
         2 03/05/1976          3 05/06/2004
         2 03/05/1976          4 05/06/2008
         3 03/06/1976          1 05/06/1996
         3 03/06/1976          2 05/06/2000
         3 03/06/1976          3 05/06/2004
         3 03/06/1976          4 05/06/2008
         4 03/04/1976          1 05/06/1996
         4 03/04/1976          2 05/06/1996
         5 03/04/1976          1 05/06/1996

I want to run the query for a particular rank_code (e.g. 4).

  1  select * from merit
  2  where rank_code = 4
  3* order by emp_id, sen_date, rank_code, dob
SQL> /

    EMP_ID DOB         RANK_CODE SEN_DATE   
---------- ---------- ---------- ---------- 
         1 03/04/1976          4 05/06/2008
         2 03/05/1976          4 05/06/2008
         3 03/06/1976          4 05/06/2008


As you can see, all 3 of the emp_ids above have the same sen_date (seniority date).

  1  select * from merit
  2  where rank_code = 3
  3* order by emp_id, sen_date, rank_code, dob
SQL> /

    EMP_ID DOB         RANK_CODE SEN_DATE       
---------- ---------- ---------- ---------- 
         1 03/04/1976          3 05/06/2004
         2 03/05/1976          3 05/06/2004
         3 03/06/1976          3 05/06/2004

now again they have the same sen_date in the previous rank.

  1  select * from merit
  2  where rank_code = 2
  3* order by emp_id, sen_date, rank_code, dob
SQL> /

    EMP_ID DOB         RANK_CODE SEN_DATE       
---------- ---------- ---------- ---------- 
         1 03/04/1976          2 05/06/2000
         2 03/05/1976          2 05/06/2000
         3 03/06/1976          2 05/06/2000

now again in the previous rank, they have the same sen_date.

  1  select * from merit
  2  where rank_code = 1
  3* order by emp_id, sen_date, rank_code, dob
SQL> /

    EMP_ID DOB         RANK_CODE SEN_DATE       
---------- ---------- ---------- ---------- 
         1 03/04/1976          1 05/06/1996
         2 03/05/1976          1 05/06/1996
         3 03/06/1976          1 05/06/1996 

Up to here, all the rank_codes are exhausted, so the most senior would be emp_id 1 because he was born first. If the tie breaks in any of the previous ranks, the one with the earliest sen_date there would be senior.

I think I have conveyed what I wanted to say.


Tom Kyte

Followup  

December 11, 2008 - 8:12 am UTC

depending on the nature of sen_date - I believe this might do it for you

ops$tkyte%ORA11GR1> select emp_id,
  2         min( case when rank_code = 4 then sen_date end ) sen_date,
  3             max( sen_date ) max_sen_date,
  4             max( dob )
  5    from merit
  6   where rank_code <= 4
  7   group by emp_id
  8  having max( rank_code ) = 4
  9  order by 2, 3, 4
 10  /

    EMP_ID SEN_DATE             MAX_SEN_DATE         MAX(DOB)
---------- -------------------- -------------------- --------------------
         1 05-jun-2008 00:00:00 05-jun-2008 00:00:00 03-apr-1976 00:00:00
         2 05-jun-2008 00:00:00 05-jun-2008 00:00:00 03-may-1976 00:00:00
         3 05-jun-2008 00:00:00 05-jun-2008 00:00:00 03-jun-1976 00:00:00




Thanks a lot

December 11, 2008 - 9:27 am UTC

Reviewer: hash from Pak

It works for me

thanks again
Tom Kyte

Followup  

December 11, 2008 - 9:40 am UTC

make sure that you actually understand what it does - don't just test empirically - understand what I did there!
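For anyone working through Tom's query above, here is the same statement restated with the intent of each piece spelled out in comments (a restatement of the original logic, not a new approach):

```sql
-- Same query as above, annotated (assumes the merit table from the question).
select emp_id,
       -- the date this employee attained the top rank (4): primary sort key
       min( case when rank_code = 4 then sen_date end ) sen_date,
       -- the latest sen_date across all ranks: secondary tie-breaker
       max( sen_date ) max_sen_date,
       -- date of birth as the final tie-breaker (each emp_id has one dob)
       max( dob )
  from merit
 where rank_code <= 4                -- ignore ranks above the one of interest
 group by emp_id
having max( rank_code ) = 4          -- keep only employees who reached rank 4
 order by 2, 3, 4;                   -- seniority: rank-4 date, then dates, then dob
```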

re: sorting by sen-date

December 18, 2008 - 10:05 am UTC

Reviewer: Stew Ashton from Paris, France


I think the test data lacks some boundary conditions. Let's say emp_id 2 got to rank 2 earlier, and emp_id 3 got to rank 3 earlier. In that case, shouldn't emp_id 3 come first and emp_id 2 come second?
update merit set sen_date = sen_date-1 where emp_id in(2,3) and emp_id = rank_code;
/
select emp_id,
       max(case when rank_code = 4 then sen_date end) sen_date_4,
       max(case when rank_code = 3 then sen_date end) sen_date_3,
       max(case when rank_code = 2 then sen_date end) sen_date_2,
       max(case when rank_code = 1 then sen_date end) sen_date_1,
       max(dob)
  from merit
where rank_code <= 4
group by emp_id
order by 2, 3, 4, 5, 6
/
EMP_ID SEN_DATE_4 SEN_DATE_3 SEN_DATE_2 SEN_DATE_1 MAX(DOB) 
------ ---------- ---------- ---------- ---------- ---------
3      05/06/08   04/06/04   05/06/00   05/06/96   03/06/76 
2      05/06/08   05/06/04   04/06/00   05/06/96   03/05/76 
1      05/06/08   05/06/04   05/06/00   05/06/96   03/04/76 
4                            05/06/96   05/06/96   03/04/76 
5                                       05/06/96   03/04/76

December 29, 2008 - 5:49 pm UTC

Reviewer: Reader

Tom,
create table map
(old_sym varchar2(100)
,new_sym varchar2(100)
,Cntry_CODE varchar2(100)
,start_dt date);

insert into map
values
('ABC','ABCE','US',to_Date('11/04/2008','mm/dd/yyyy'));

insert into map
values
('ABCE','ABC','US',to_Date('11/18/2008','mm/dd/yyyy'));

commit;

OLD_sym new_sym Cntry_CODE START_DT
ABC ABCE US 11/04/2008 - 1st record
ABCE ABC US 11/18/2008 - 2nd record

we get 1st record on 11/04/2008 with old_sym = ABC and new_sym = ABCE
this got reversed on the 2nd record that we got on 11/18/2008

If difference between the start_Dt on the 1st and 2nd record is one month,
and if new_sym of 2nd record = old_sym of 1st record, I need to delete 1st record

If difference between the start_Dt on the 2nd and 1st record is one month,
and if old_sym of 2nd record = new_sym of 1st record, I need to delete 2nd record

Can this be done using a merge statement?
Tom Kyte

Followup  

January 05, 2009 - 8:26 am UTC

insufficient data here. How do you know these two records even "go together"?

what does merge have to do with this? Seems you have a table with two rows in it?

what happens if there is a 3rd record? A fourth? And so on?

details are very much lacking here...

Function to list all the combination possibilities

January 19, 2009 - 5:26 pm UTC

Reviewer: Junior from VA, USA

Tom:
How do I write a function that takes a list of values and generates all possible combinations as the final result? E.g., pass a,b,c,d,e,f,g,h and expect it to list any 5 of the data I passed. The result is like:
a,b,c,d,e,f
a,b,c,d,e,g
a,b,c,d,e,h
a,b,c,d,f,g
...
Thanks.
Tom Kyte

Followup  

January 19, 2009 - 9:53 pm UTC

is a,a,a,a,a a valid response?

is 5 the only number? or is that a parameter as well.

will there always be 8 things in the input?

Function to list all the combination possibilities

January 19, 2009 - 10:43 pm UTC

Reviewer: Junior from VA, USA

Tom:
Sorry, I did not make it clear. The number of input values will be between 7 and 10 (and they can be characters or numbers). The output will be fixed at 6 values only. E.g.
input parameter data: a,b,c,d,e,1,2 (up to 10 values)
expecting output result:
a,b,c,d,e,1
a,b,c,d,e,2
a,b,c,d,1,2
a,b,c,e,1,2
...
Thank you very much.
Tom Kyte

Followup  

January 20, 2009 - 6:50 am UTC

you still haven't specified this

Ok, a string of inputs - 7 to 10 of them.

Output - six columns (not five)

Now, is a,a,a,a,a,a valid?

Are you expecting to see b,c,d,e,1,a as well as a,b,c,d,e,1

Function to list all the combination possibilities

January 20, 2009 - 7:28 am UTC

Reviewer: Junior from VA, USA

Tom:
Sorry. :o)
A string of inputs - 7 to 10 of them.
Output - six columns (not five)
The output values are unique; none of them can be repeated, so a,a,a,a,a,a is not allowed.
b,c,d,e,1,a and a,b,c,d,e,1 are the same combination and should only display once.
Thanks.

Tom Kyte

Followup  

January 20, 2009 - 7:47 am UTC

ops$tkyte%ORA11GR1> variable txt varchar2(20)
ops$tkyte%ORA11GR1> exec :txt := 'a,b,c,d,e,f,g,1'

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select
  5    trim( substr (txt,
  6          instr (txt, ',', 1, level  ) + 1,
  7          instr (txt, ',', 1, level+1)
  8             - instr (txt, ',', 1, level) -1 ) )
  9      as token
 10     from (select ','||:txt||',' txt
 11             from dual)
 12   connect by level <=
 13      length(:txt)-length(replace(:txt,',',''))+1
 14   )
 15   select * from data;

TOK
---
a
b
c
d
e
f
g
1

8 rows selected.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> column token format a3
ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select distinct
  5    trim( substr (txt,
  6          instr (txt, ',', 1, level  ) + 1,
  7          instr (txt, ',', 1, level+1)
  8             - instr (txt, ',', 1, level) -1 ) )
  9      as token
 10     from (select ','||:txt||',' txt
 11             from dual)
 12   connect by level <=
 13      length(:txt)-length(replace(:txt,',',''))+1
 14   )
 15  select * from data d1, data d2, data d3, data d4, data d5, data d6
 16  where d1.token < d2.token
 17    and d2.token < d3.token
 18    and d3.token < d4.token
 19    and d4.token < d5.token
 20    and d5.token < d6.token;

TOK TOK TOK TOK TOK TOK
--- --- --- --- --- ---
1   a   b   c   d   e
1   a   b   d   e   g
1   a   c   d   e   g
1   b   c   d   e   g
a   b   c   d   e   g
1   a   b   c   e   g
1   a   b   c   d   g
1   a   d   e   f   g
1   b   d   e   f   g
a   b   d   e   f   g
1   c   d   e   f   g
a   c   d   e   f   g
b   c   d   e   f   g
1   a   b   e   f   g
1   a   c   e   f   g
1   b   c   e   f   g
a   b   c   e   f   g
1   a   b   d   f   g
1   a   c   d   f   g
1   b   c   d   f   g
a   b   c   d   f   g
1   a   b   c   f   g
1   a   b   d   e   f
1   a   c   d   e   f
1   b   c   d   e   f
a   b   c   d   e   f
1   a   b   c   e   f
1   a   b   c   d   f

28 rows selected.
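The inequality joins (d1.token < d2.token, and so on) are what make each combination appear exactly once: any six distinct tokens have exactly one ascending arrangement, so only that one row survives the self-join. A minimal sketch of the same idea, choosing 2 tokens from 3:

```sql
-- Choosing 2 from {a,b,c}: the "<" predicate keeps each pair once, sorted.
with data as (
  select 'a' token from dual union all
  select 'b'       from dual union all
  select 'c'       from dual
)
select d1.token, d2.token
  from data d1, data d2
 where d1.token < d2.token;
-- returns 3 rows: (a,b) (a,c) (b,c) - i.e. C(3,2)
```

The same counting argument explains the result above: 8 distinct tokens taken 6 at a time gives C(8,6) = 28 rows.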

Function to list all the combination possibilities

January 20, 2009 - 8:12 am UTC

Reviewer: Junior from VA, USA

Tom:
Thank you so much. FYI, it works perfectly on Oracle 10g in a UNIX environment. It did not work on Oracle 9.2.0.1.0 on Windows. The error message is:

  1  with data
  2   as
  3    (
  4    select
  5      trim( substr (txt,
  6            instr (txt, ',', 1, level  ) + 1,
  7            instr (txt, ',', 1, level+1)
  8               - instr (txt, ',', 1, level) -1 ) )
  9        as token
 10       from (select ','||:txt||',' txt
 11               from dual)
 12     connect by level <=
 13        length(:txt)-length(replace(:txt,',',''))+1
 14     )
 15*    select * from data;
SQL> /
   select * from data;
                     *
ERROR at line 15:
ORA-00911: invalid character

Tom Kyte

Followup  

January 20, 2009 - 8:25 am UTC

it works fine in 9iR2 - what version of SQL*Plus were you using?

and make sure your script is "windows compliant", e.g. it has the right end-of-line marker for that OS.

Function to list all the combination possibilities

January 20, 2009 - 8:27 am UTC

Reviewer: Junior from VA, USA

Tom:
Correction.
It works on both Oracle 9 and 10 in any environment.

:)

January 20, 2009 - 9:30 am UTC

Reviewer: Neven from SF Bulgaria

--another solution to the problem is the following:


drop type MYTABLEVARCHAR;
drop type MYOBJECTVARCHAR;
drop function parse_it_now;

CREATE OR REPLACE
TYPE MYOBJECTVARCHAR
AS OBJECT
(
A VARCHAR2(32000),
idx NUMBER
)
/


CREATE OR REPLACE
TYPE MYTABLEVARCHAR
AS TABLE OF myObjectVarchar
/


CREATE OR REPLACE FUNCTION parse_it_now(input_str CLOB, separator VARCHAR2)
RETURN mytablevarchar PIPELINED
IS
v PLS_INTEGER := 0;
BEGIN
FOR i IN
(SELECT TRIM
(CASE
WHEN LEVEL = 1
THEN SUBSTR (input_str,
1,
INSTR (input_str, separator, 1, 1) - 1
)
WHEN INSTR (input_str, separator, 1, LEVEL) <> 0
THEN SUBSTR (input_str,
INSTR (input_str,
separator,
1,
LEVEL - 1
)
+ 1,
INSTR (input_str, separator, 1, LEVEL)
- INSTR (input_str,
separator,
1,
LEVEL - 1
)
- 1
)
END) x
FROM DUAL
WHERE INSTR (input_str, separator, 1, 1) <> 0
CONNECT BY LEVEL < 9999999999
AND INSTR (input_str, separator, 1, LEVEL) <> 0
UNION ALL
SELECT TRIM (SUBSTR (input_str, INSTR (input_str, separator, -1, 1) + 1))
FROM DUAL)
LOOP
v := v + 1;
PIPE ROW (myobjectvarchar (i.x,v));
END LOOP;
RETURN;
END;
/

--and the query is :

SELECT x.a col1,y.a col2,z.a col3,
A.a col4,B.a col5
FROM TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) X,
TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) Y,
TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) Z,
TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) A,
TABLE(CAST(parse_it_now('a,b,c,d,e,f,g,h,k',',') AS mytablevarchar)) B

--If you have performance problems with the function, you could change the clause "CONNECT BY LEVEL < 9999999999" to a lower number
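A small follow-up on that tuning note: instead of hard-coding a large constant, the bound could be computed from the input itself, since the token count is the separator count plus one. A sketch of the idea (an assumption about usage, not part of the original function):

```sql
-- Number of tokens = number of separators + 1; here 8 commas -> 9 tokens.
select length('a,b,c,d,e,f,g,h,k')
       - length(replace('a,b,c,d,e,f,g,h,k', ',', '')) + 1 as max_tokens
  from dual;
```

Using that expression as the CONNECT BY limit would stop row generation exactly where the input runs out.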

--HAVE FUN :)


February 09, 2009 - 10:33 am UTC

Reviewer: Reader

create table main_table
(dt_key date,
sym varchar2(20),
vol number
);

create table map_main
(old_sym varchar2(20),
new_sym varchar2(20),
effective_dt date
);

--main_table inserts
insert into main_table
values
(to_date('02/06/2009','mm/dd/yyyy'), 'CCR',1000);

insert into main_table
values
(to_date('02/05/2009','mm/dd/yyyy'), 'CJ',3000);

insert into main_table
values
(to_date('02/04/2009','mm/dd/yyyy'), 'CJ',2000);


insert into main_table
values
(to_date('02/06/2009','mm/dd/yyyy'), 'PHPL',2000);

insert into main_table
values
(to_date('02/05/2009','mm/dd/yyyy'), 'PHP',1000);

insert into main_table
values
(to_date('02/04/2009','mm/dd/yyyy'), 'PHP',4000);

insert into main_table
values
(to_date('02/03/2009','mm/dd/yyyy'), 'PHP',5000);

--map_main table inserts
insert into map_main
values
('CJ','CCR',to_date('02/06/2009','mm/dd/yyyy'));

insert into map_main
values
('PHP','PHPL',to_date('02/06/2009','mm/dd/yyyy'));

commit;
main_table data
DT_KEY SYM VOL
--------- -------------------- ----------
06-FEB-09 CCR 1000
05-FEB-09 CJ 3000
04-FEB-09 CJ 2000
06-FEB-09 PHPL 2000
05-FEB-09 PHP 1000
04-FEB-09 PHP 4000
03-FEB-09 PHP 5000

main_map table data
OLD_SYM NEW_SYM EFFECTIVE
-------------------- -------------------- ---------
CJ CCR 06-FEB-09
PHP PHPL 06-FEB-09

main_table has dt_key, sym and vol columns
main_map contains old_sym and new_sym and effective_dt

I need to get last 20 day average volume for each sym in main_table. I do not store data for holiday and weekends. The thing is, sym may be changed and the effective date will be in main_map table.
For example: CCR is changed to CJ on 2009-02-06 and CJ starts appearing in main_table starting 2009-02-06. If I calculate the 20 day average volume using
main_table, I have to consider sym=CJ for 2009-02-06 and for the data loaded previously, I need to take the SYM=CCR. The last 20D average for CCR=(1000 (loaded 02/06)+ 3000 (loaded 02/05) + 2000 (loaded 02/04))/20

The last 20D average for PHPL=(2000 (loaded 02/06)+ 1000 (loaded 02/05) + 4000 (loaded 02/04)) + 5000 (loaded on 02/03)/20

Can you advise me how to do this?
Tom Kyte

Followup  

February 09, 2009 - 7:27 pm UTC

... For example: CCR is changed to CJ on 2009-02-06 ...

main_map table data
OLD_SYM NEW_SYM EFFECTIVE
-------------------- -------------------- ---------
CJ CCR 06-FEB-09



looks like CJ is changed to CCR???


could CJ change to CCR and CCR change to XYZ - do you need to account for that?

February 09, 2009 - 8:38 pm UTC

Reviewer: Reader

Apologies for the mistake in sending the information. CJ to CCR and PHP to PHPL.
Tom Kyte

Followup  

February 10, 2009 - 7:01 am UTC

answer the other much more important question

February 10, 2009 - 9:29 am UTC

Reviewer: Reader

CJ to CCR and PHP to PHPL on 02/06/2009. CCR may change to something else as you said to XYZ in the future and PHPL may change to ABC in the future
I have to calculate the average volume based on CJ, CCR, XYZ

insert into main_table
values
(to_date('02/09/2009','mm/dd/yyyy'), 'CCR',11111);

insert into main_table
values
(to_date('02/10/2009','mm/dd/yyyy'), 'XYZ',22222);


insert into map_main
values
('CCR','XYZ',to_date('02/10/2009','mm/dd/yyyy'));


For example, on 02/06/2009, I need to calculate as below:
CCR=(1000 (loaded 02/06)+ 3000
(loaded 02/05) + 2000 (loaded 02/04))/20

If CCR is changed to XYZ on 02/10/2009, I need to do the calculation as below on 02/10/2009:
CCR=( 2000 (CJ loaded 02/04) + 3000 (CJ loaded 02/05) + 1000 (CCR loaded 02/06) + 11111 (CCR loaded 02/09) + 22222 (XYZ loaded on 02/10) ) /20

- last 20 days average volume and I do not have data for holidays and weekends.


Tom Kyte

Followup  

February 11, 2009 - 9:17 am UTC

can you pseudo-code out the logic a bit better - I'm not sure if you are changing ccr to cj or remapping cj back to ccr

Maybe you can start with this result set and tell us what the value of the character column should be for each row:

(and I'm always curious why people store data in a way that makes it almost impossible if not impossible to answer their questions?)

February 10, 2009 - 9:30 am UTC

Reviewer: Reader

Apologies for the typo in my previous question
Here is what I need to calculate -

XYZ=( 2000 (CJ loaded 02/04) + 3000 (CJ loaded 02/05) + 1000 (CCR loaded 02/06) + 11111 (CCR
loaded 02/09) + 22222 (XYZ loaded on 02/10) ) /20

February 11, 2009 - 9:44 am UTC

Reviewer: Reader

Tom,
Thank you for your response. I am trying to handle ticker change in the market data. I need to calculate the 20 day average volume for a ticker each day.

CJ is changed to CCR on Feb 06;
Here CJ is old ticker and CCR is the new ticker

CJ and CCR are basically the same tickers for my calculation.
I need to consider the volumes of CJ prior to Feb 06 and the volume of CCR on Feb 06.


If CJ is changed to CCR on 02/06/2009, I need to do the calculation as below on 02/06/2009.
It should be displayed as CCR (new ticker)
CCR=( 2000 (CJ loaded 02/04) + 3000 (CJ loaded 02/05) + 1000 (CCR loaded 02/06)) /20

Ticker date average20d
CCR 02/06/2009 300

CCR will be changed to XYZ on feb 10. In that case CJ, CCR and XYZ are same for my calculation.

If CCR is changed to XYZ on 02/10/2009, I need to do the calculation as below on 02/10/2009:
CCR=( 2000 (CJ loaded 02/04) + 3000 (CJ loaded 02/05) + 1000 (CCR loaded 02/06) + 11111 (CCR
loaded 02/09) + 22222 (XYZ loaded on 02/10) ) /20


Ticker date average20d
XYZ 02/10/2009 1966.65

main_table data
DT_KEY SYM VOL
--------- -------------------- ----------
06-FEB-09 CCR 1000
05-FEB-09 CJ 3000
04-FEB-09 CJ 2000
06-FEB-09 PHPL 2000
05-FEB-09 PHP 1000
04-FEB-09 PHP 4000
03-FEB-09 PHP 5000
09-FEB-09 CCR 11111
10-FEB-09 XYZ 22222

main_map table data
OLD_SYM NEW_SYM EFFECTIVE
-------------------- -------------------- ---------
CJ CCR 06-FEB-09
PHP PHPL 06-FEB-09
CCR XYZ 10-FEB-09

I am not sure if I am clear. Please let me know. Thanks a lot for your response.
Tom Kyte

Followup  

February 11, 2009 - 12:54 pm UTC

this should get you going


ops$tkyte%ORA11GR1> select rpad('*',2*level,'*') || old_sym osym,
  2         old_sym,
  3         new_sym,
  4             connect_by_root old_sym the_orig_sym,
  5             prior effective_dt ped,
  6             effective_dt
  7    from map_main
  8  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
  9  /

OSYM                 OLD_S NEW_S THE_O PED         EFFECTIVE_D
-------------------- ----- ----- ----- ----------- -----------
**CCR                CCR   XYZ   CCR               10-feb-2009
**CJ                 CJ    CCR   CJ                06-feb-2009
****CCR              CCR   XYZ   CJ    06-feb-2009 10-feb-2009
**PHP                PHP   PHPL  PHP               06-feb-2009

<b>fill out the hierarchy - getting the root sym</b>

ops$tkyte%ORA11GR1> select the_orig_sym, new_sym, effective_dt sdate,
  2         nvl(lead(effective_dt) over (partition by the_orig_sym order by
  3             effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
  4    from (
  5  select connect_by_root old_sym the_orig_sym,
  6         new_sym,
  7             effective_dt
  8    from map_main
  9  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
 10          )
 11  order by 1, 3
 12  /

THE_O NEW_S SDATE       EDATE
----- ----- ----------- -----------
CCR   XYZ   10-feb-2009 01-jan-3000
CJ    CCR   06-feb-2009 09-feb-2009
CJ    XYZ   10-feb-2009 01-jan-3000
PHP   PHPL  06-feb-2009 01-jan-3000

<b>make it now a range... start date to end date</b>

ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select the_orig_sym, new_sym, effective_dt sdate,
  5         nvl(lead(effective_dt) over (partition by the_orig_sym order by
  6             effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
  7    from (
  8  select connect_by_root old_sym the_orig_sym,
  9         new_sym,
 10             effective_dt
 11    from map_main
 12  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
 13          )
 14  )
 15  select *
 16    from main_table mt left outer join data d
 17      on (mt.dt_key between d.sdate and d.edate
 18              and mt.sym = d.new_sym)
 19    order by dt_key;

DT_KEY      SYM                         VOL THE_O NEW_S SDATE       EDATE
----------- -------------------- ---------- ----- ----- ----------- -----------
03-feb-2009 PHP                        5000
04-feb-2009 CJ                         2000
04-feb-2009 PHP                        4000
05-feb-2009 CJ                         3000
05-feb-2009 PHP                        1000
06-feb-2009 CCR                        1000 CJ    CCR   06-feb-2009 09-feb-2009
06-feb-2009 PHPL                       2000 PHP   PHPL  06-feb-2009 01-jan-3000
09-feb-2009 CCR                       11111 CJ    CCR   06-feb-2009 09-feb-2009
10-feb-2009 XYZ                       22222 CCR   XYZ   10-feb-2009 01-jan-3000
10-feb-2009 XYZ                       22222 CJ    XYZ   10-feb-2009 01-jan-3000

10 rows selected.



February 11, 2009 - 5:24 pm UTC

Reviewer: Reader

Tom,
Thanks a lot. Trying to understand your queries. I will have questions for you on this. I have not used this kind of SQL. Again, Thanks for all your answers.

Analytics with connect by

February 11, 2009 - 5:35 pm UTC

Reviewer: Dan from MI

Hi Tom,

I know I saw this issue before in the thread around 2005 (and I think you said you filed a bug on it) but I'm running into it now, and I was wondering if you had any more insight.

select * from v$version
where banner like 'Oracle%';

BANNER                                                           
---------------------------------------------------------------- 
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi 

1 rows selected


Here is the base query - I am just trying to get the parent_dept_key name, so I was building my way to it.

with t as (
select 1 dept_key, 'Test' name, 3 parent_dept_key from dual UNION ALL
select 2 dept_key, 'Test1' name, 1 parent_dept_key from dual UNION ALL
select 3 dept_key, 'Test2' name, 3 parent_dept_key from dual )
select dept_key, 
       name, 
       parent_dept_key, 
       connect_by_root(name) parent_name,
       level lvl
from t
connect by nocycle
prior dept_key = parent_dept_key
  and level <= 2
order by dept_key, lvl;

DEPT_KEY               NAME  PARENT_DEPT_KEY        PARENT_NAME LVL                    
---------------------- ----- ---------------------- ----------- ---------------------- 
1                      Test  3                      Test        1                      
1                      Test  3                      Test2       2                      
2                      Test1 1                      Test1       1                      
2                      Test1 1                      Test        2                      
3                      Test2 3                      Test2       1                      

5 rows selected


Now in order to get only the "correct" rows, I figured I would just use analytic max() to know the max level per dept_key (and then wrap this all in an inline view and only pick the rows I wanted):

with t as (
select 1 dept_key, 'Test' name, 3 parent_dept_key from dual UNION ALL
select 2 dept_key, 'Test1' name, 1 parent_dept_key from dual UNION ALL
select 3 dept_key, 'Test2' name, 3 parent_dept_key from dual )
select dept_key, 
       name, 
       parent_dept_key, 
       connect_by_root(name) parent_name,
       level lvl,
       max(level) over (partition by dept_key) m
from t
connect by nocycle
prior dept_key = parent_dept_key
  and level <= 2
order by dept_key, lvl;

DEPT_KEY               NAME  PARENT_DEPT_KEY        PARENT_NAME LVL M   
---------------------- ----- ---------------------- ----------- --- --- 
1                      Test  3                      Test2       1   2   
1                      Test  3                      Test2       2   2   
2                      Test1 1                      Test2       1   2   
2                      Test1 1                      Test2       2   2   
3                      Test2 3                      Test2       1   1   

5 rows selected


But here's the problem - why did the Parent_name change?!

Now, I solved this by putting my original query into an inline view and then running the analytics off of that. But it seems to me that adding a column shouldn't change the result of other columns - at the very least, it's very counter-intuitive.

Any ideas on what's going on?
Tom Kyte

Followup  

February 12, 2009 - 11:58 am UTC

confirmed, bug, will file after seeing if it isn't already filed

thanks!

February 12, 2009 - 12:47 pm UTC

Reviewer: Reader

Tom,
Using the query that you gave, can you please let me know how I should calculate average for last 20 days?

ops$tkyte%ORA11GR1> with data
2 as
3 (
4 select the_orig_sym, new_sym, effective_dt sdate,
5 nvl(lead(effective_dt) over (partition by the_orig_sym order by
6 effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
7 from (
8 select connect_by_root old_sym the_orig_sym,
9 new_sym,
10 effective_dt
11 from map_main
12 connect by prior new_sym = old_sym and prior effective_dt < effective_dt
13 )
14 )
15 select *
16 from main_table mt left outer join data d
17 on (mt.dt_key between d.sdate and d.edate
18 and mt.sym = d.new_sym)
19 order by dt_key;



Tom Kyte

Followup  

February 12, 2009 - 4:30 pm UTC

I was hoping you could get there....

first, identify the data you want:


ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select the_orig_sym, new_sym, effective_dt sdate,
  5         nvl(lead(effective_dt) over (partition by the_orig_sym order by
  6             effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
  7    from (
  8  select connect_by_root old_sym the_orig_sym,
  9         new_sym,
 10             effective_dt
 11    from map_main
 12  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
 13          )
 14  )
 15  select nvl(the_orig_sym,sym) the_sym,
 16         dt_key,
 17         vol
 18    from main_table mt left outer join data d
 19      on (mt.dt_key between d.sdate and d.edate
 20              and mt.sym = d.new_sym)
 21    where (sdate is null and edate is null)
 22       or (dt_key between sdate and edate)
 23    order by the_sym, dt_key;

THE_SYM              DT_KEY           VOL
-------------------- --------- ----------
CCR                  10-FEB-09      22222
CJ                   04-FEB-09       2000
CJ                   05-FEB-09       3000
CJ                   06-FEB-09       1000
CJ                   09-FEB-09      11111
CJ                   10-FEB-09      22222
PHP                  03-FEB-09       5000
PHP                  04-FEB-09       4000
PHP                  05-FEB-09       1000
PHP                  06-FEB-09       2000

10 rows selected.


and then aggregate:

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> with data
  2  as
  3  (
  4  select the_orig_sym, new_sym, effective_dt sdate,
  5         nvl(lead(effective_dt) over (partition by the_orig_sym order by
  6             effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
  7    from (
  8  select connect_by_root old_sym the_orig_sym,
  9         new_sym,
 10             effective_dt
 11    from map_main
 12  connect by prior new_sym = old_sym and prior effective_dt < effective_dt
 13          )
 14  )
 15  select nvl(the_orig_sym,sym) the_sym, avg(vol)
 16    from main_table mt left outer join data d
 17      on (mt.dt_key between d.sdate and d.edate
 18              and mt.sym = d.new_sym)
 19    where ((sdate is null and edate is null)
 20       or (dt_key between sdate and edate))
 21          and dt_key >= trunc(sysdate)-20
 22          and nvl(the_orig_sym,sym) in ('CJ', 'PHP')
 23    group by nvl(the_orig_sym,sym)
 24    order by the_sym;

THE_SYM                AVG(VOL)
-------------------- ----------
CJ                       7866.6
PHP                        3000


I did average, you do whatever calculation you need - you have all of the data, you should be able to get there from here?
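If the requirement is literally the question's formula - the 20-day total divided by a fixed 20, rather than the average over only the days that have rows - then just the aggregate line changes. A sketch built on the query above (same mapping logic, divisor swapped):

```sql
-- Same symbol-mapping logic as above, but sum(vol)/20 per the question.
with data as (
  select the_orig_sym, new_sym, effective_dt sdate,
         nvl(lead(effective_dt) over (partition by the_orig_sym
                                      order by effective_dt) - 1,
             to_date('1-jan-3000','dd-mon-yyyy')) edate
    from (select connect_by_root old_sym the_orig_sym,
                 new_sym, effective_dt
            from map_main
          connect by prior new_sym = old_sym
                 and prior effective_dt < effective_dt)
)
select nvl(d.the_orig_sym, mt.sym) the_sym,
       sum(mt.vol) / 20 as adv_20d       -- fixed divisor of 20
  from main_table mt left outer join data d
    on (mt.dt_key between d.sdate and d.edate and mt.sym = d.new_sym)
 where ((d.sdate is null and d.edate is null)
     or (mt.dt_key between d.sdate and d.edate))
   and mt.dt_key >= trunc(sysdate) - 20
 group by nvl(d.the_orig_sym, mt.sym)
 order by the_sym;
```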

February 13, 2009 - 7:39 am UTC

Reviewer: Reader

Tom,
Thank You for your response.

on 02/06, I need to display

Symbol adv_20d
CCR 300

on 02/10, I need to display
symbol adv_20d
XYZ 1966.65

I understand about getting the data for the date range. Here main_table is basically the driving table, which
has symbols each day and I have to calculate 20d average for each symbol in that table and handle symbol changes if
any in the map_table


Can I use this with data query in a pl/sql cursor to get only the symbol changes and their date ranges and use that
to join to the main_table to get the data I want?


Tom Kyte

Followup  

February 16, 2009 - 10:51 am UTC

I gave you access to *every single column*, you cannot derive from the data what you need?

break the query apart - bit by bit, understand it, everything - everything is there - old symbol, new symbol, everything.

gaps

February 17, 2009 - 3:29 pm UTC

Reviewer: Amm Kh

Dear Tom,
Can I know the gaps from a range of values? I have a table like this:

select * from ss;

col
====
1000
1001
1002
1005
1006
1010
1011
1012
1013
1021
1022

i need the output to be like :

from To
==== =====
1000 - 1002
1005 - 1006
1010 - 1013
1021 - 1022

but without having an inner view which gets all the col values, as I have about 10 million records in the table, which would make it very slow.

thanks in advance.
Tom Kyte

Followup  

February 17, 2009 - 4:48 pm UTC

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:3225867788098#3252061695684

use that concept and modify as you need.

Available

February 18, 2009 - 3:14 am UTC

Reviewer: A reader

Dear Tom,

Thanks for your quick response. The link provided was very useful, but how can I know the "Available Ranges" and not the "Missing Ranges"? It is a bit challenging for me.

Thanks in advance
Tom Kyte

Followup  

February 18, 2009 - 3:10 pm UTC

you are a SQL coder.... come on... did you give it a try? (you know how you learn: you take a problem and you try to solve it - rather than have someone solve it for you.....)


You see the lag/lead trick, you know you can get the prior and the next row married up with every row in the table after sorting....

So, if you know that the row "in front" (lag) of you has an X value that is not your X values minus 1 - there was a gap.... you are the START of a range....

You know if your X value is not equal to the next (lead) X value minus one - you are the END of a range.....


so, something like:
select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
       case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
       case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  from t


this will result in rows with

a) start/end both null - uninteresting, we'll discard them
b) start/end both NOT NULL - interesting, they represent a range unto themselves
c) start is not null, end is null - interesting - we need to find the end
d) start is null, end is not null - and here it is....

ops$tkyte%ORA11GR1> select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
  2         case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
  3         case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  4    from t
  5  /

         X     LAST_X LEAD(X)OVER(ORDERBYX) START_RANGE    END_RANGE
---------- ---------- --------------------- -------------- ------------
         3                                4 <<<<=== start!
         4          3                     5
         5          4                    10                <<<<=== end!
        10          5                    11 <<<<=== start!
        11         10                    13                <<<<=== end!
        13         11                    20 <<<<=== start! <<<<=== end!
        20         13                    25 <<<<=== start! <<<<=== end!
        25         20                    26 <<<<=== start!
        26         25                                      <<<<=== end!

9 rows selected.


that is an example.

so, if we take that query and keep just:

where start_range is not null or end_range is not null


ops$tkyte%ORA11GR1> select *
  2    from (
  3  select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
  4         case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
  5         case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  6    from t
  7         )
  8   where start_range is not null or end_range is not null
  9   order by x;

         X     LAST_X LEAD(X)OVER(ORDERBYX) START_RANGE    END_RANGE
---------- ---------- --------------------- -------------- ------------
         3                                4 <<<<=== start!
         5          4                    10                <<<<=== end!
        10          5                    11 <<<<=== start!
        11         10                    13                <<<<=== end!
        13         11                    20 <<<<=== start! <<<<=== end!
        20         13                    25 <<<<=== start! <<<<=== end!
        25         20                    26 <<<<=== start!
        26         25                                      <<<<=== end!

8 rows selected.



we'll get everything we need. We just need to marry up the start/end rows, collapse them out. lead will do that nicely!



ops$tkyte%ORA11GR1> select x start_x, case when end_range is null then lead(x) over (order by x) else x end end_x, start_range
  2    from (
  3  select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
  4         case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
  5         case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  6    from t
  7         )
  8   where start_range is not null or end_range is not null
  9   order by x;

   START_X      END_X START_RANGE
---------- ---------- --------------
         3          5 <<<<=== start!
         5          5
        10         11 <<<<=== start!
        11         11
        13         13 <<<<=== start!
        20         20 <<<<=== start!
        25         26 <<<<=== start!
        26         26

8 rows selected.



but we only care about rows where start_range is not null, so....

ops$tkyte%ORA11GR1> select start_x, end_x
  2    from (
  3  select x start_x, case when end_range is null then lead(x) over (order by x) else x end end_x, start_range
  4    from (
  5  select x, lag(x) over (order by x) last_x, lead(x) over (order by x),
  6         case when x <> nvl(lag(x) over (order by x),x)+1  then '<<<<=== start!' end start_range,
  7         case when x <> nvl(lead(x) over (order by x),x)-1 then '<<<<=== end!' end end_range
  8    from t
  9         )
 10   where start_range is not null or end_range is not null
 11         )
 12   where start_range is not null
 13   order by start_x;

   START_X      END_X
---------- ----------
         3          5
        10         11
        13         13
        20         20
        25         26


and if you go back to my first query that shows all of the X's in order, you can see that I have the ranges now....
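As a sanity check, the same gaps-and-islands collapse can be sketched procedurally; here is a minimal Python version of the logic above (the function name and sample list are mine, not from the thread):

```python
def to_ranges(xs):
    """Collapse a sorted list of distinct integers into (start, end)
    tuples of contiguous runs -- the same result the nested
    lag()/lead() query produces."""
    ranges = []
    for x in xs:
        if ranges and x == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], x)   # x extends the current run
        else:
            ranges.append((x, x))             # x starts a new run
    return ranges

# the X values from the transcript (note 4 is present, per LEAD(X)=4 above)
print(to_ranges([3, 4, 5, 10, 11, 13, 20, 25, 26]))
# [(3, 5), (10, 11), (13, 13), (20, 20), (25, 26)]
```

The output matches the START_X/END_X pairs the final query returns.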

available ranges

February 18, 2009 - 4:35 am UTC

Reviewer: Ammar Khwaira from Jordan

Dear Tom,

Regarding my last review above, I could write a query to get the available ranges; it looks like this:


select s "from serial",
       nvl(lead(prev_serial) over (order by s), (select max(s) from amm)) "to_serial"
  from (select s, prev_serial
          from (select s, nvl(lag(s) over (order by s), -999) prev_serial
                  from (select s from amm))
         where s - 1 <> prev_serial)
;

I hope it is a good one and works well.

Thanks for your support.
Ammar Khwaira
Tom Kyte

Followup  

February 18, 2009 - 3:27 pm UTC

excellent :)

see above for another way (i posted before I read this...)
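For completeness, the flip side of the problem (the gaps between used serials, i.e. the ranges that are actually free) can be sketched the same way; this is my own illustration, not code from the thread:

```python
def free_ranges(serials):
    """Given the distinct serial numbers already in use, return the
    (start, end) gaps between consecutive serials -- the available
    ranges.  Serials need not arrive sorted."""
    s = sorted(serials)
    # a gap exists wherever two consecutive serials differ by more than 1
    return [(a + 1, b - 1) for a, b in zip(s, s[1:]) if b - a > 1]

print(free_ranges([3, 4, 5, 10, 11, 13, 20, 25, 26]))
# [(6, 9), (12, 12), (14, 19), (21, 24)]
```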




April 28, 2009 - 4:02 pm UTC

Reviewer: Reader

create table tst
(v_id number
,v_val varchar2(20)
);

insert into tst values (1,1000);
insert into tst values (2,2000);
insert into tst values (3,3000);
insert into tst values (4,4000);
insert into tst values (5,5000);
insert into tst values (6,6000);
insert into tst values (7,1000);
insert into tst values (8,2000);
insert into tst values (9,3000);
insert into tst values (10,4000);
insert into tst values (11,5000);
commit;

I want to get the sum of 10 rows starting from the 5th record for each v_id (ordered by v_id in ascending order). Can you advise if this can be done using analytic functions?
Tom Kyte

Followup  

April 28, 2009 - 4:17 pm UTC

I want to get the sum of 10 rows starting from the 5th record for each v_id (ordered by v_id in ascending order).



I didn't follow that - please explain in more detail.

April 28, 2009 - 6:09 pm UTC

Reviewer: Reader

For v_id=1, I need to calculate the sum of v_val starting from the v_id=5 and ending at v_id=10
For v_id=2, I need to calculate the sum of v_val starting from v_id=7 and ending at v_id=10
Basically, I need to skip the first 4 records for each v_id and take the sum of v_val from the 5th record through the 10th record from the beginning.

select v_id,v_val,
sum(v_val) over (order by v_id rows between 5 following and 10 following) sm
from tst

Not sure if I am clear. Please let me know.
Tom Kyte

Followup  

April 29, 2009 - 8:37 am UTC

v_id=1 skip to v_id=5, skip 2,3,4 (three)    - add up 5,6,7,8,9,10 (six)
v_id=2 skip to v_id=7, skip 3,4,5,6 (four)   - add up 7,8,9,10 (four)


interesting, if you ask me that means when we get to v_id=3, we should skip 4,5,6,7,8 (I see a series here)

what pattern do you see???

come on - please....


(and you seem to already KNOW how to do this?? your sum() is picking off some rows and summing them, what is missing that makes it not possible for you to finish?)

ftd, mtd, ytd

May 19, 2009 - 4:09 am UTC

Reviewer: w from APAC

Hi Tom,

Currently in our DWH, we are using a materialized view to store the amount at an aggregate level. When we need to do FTD (for the day), MTD (month to date), and YTD (year to date) comparisons, we end up writing three different queries for the different periods. We are planning to store all three figures precomputed in the MV and have achieved that using PARTITION BY. One issue we are currently facing is that the MV has data rolled up on various dimensions, as below:

CREATE TABLE PROD.TEST_ANALY
(
OFF NUMBER(10),
PROD NUMBER(10),
AGENT NUMBER(10),
SUBAGENT VARCHAR2(50 BYTE),
INDI NUMBER(3),
DATE1 DATE,
MONTH1 VARCHAR2(20 BYTE),
YEAR1 VARCHAR2(20 BYTE),
AMOUNT1 NUMBER
)


SET DEFINE OFF;
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1901, 1002, 113644, '1012-122', NULL,
TO_DATE('04/28/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 6426);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1301, 1802, 196829, NULL, NULL,
TO_DATE('04/06/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 411);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1801, 1802, 2081, NULL, NULL,
TO_DATE('05/04/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'MAY.2009', '2009', 617);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(2401, 1803, 293347, NULL, NULL,
TO_DATE('04/08/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 0);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(2403, 1818, 1799, NULL, NULL,
TO_DATE('04/29/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 7589);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1401, 1818, 197, '0', NULL,
TO_DATE('04/09/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 3006);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1202, 1801, 49587, '90014594', NULL,
TO_DATE('04/24/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 3369);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1901, 8401, 1849, NULL, NULL,
TO_DATE('04/28/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 0);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1901, 1802, 1849, NULL, NULL,
TO_DATE('04/15/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 501);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1901, 9901, 233575, NULL, NULL,
TO_DATE('04/21/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'APR.2009', '2009', 1000);
Insert into TEST_ANALY
(OFF, PROD, AGENT, SUBAGENT, INDI,
DATE1, MONTH1, YEAR1, AMOUNT1)
Values
(1101, 8401, 304173, NULL, NULL,
TO_DATE('05/13/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'MAY.2009', '2009', 7285);
COMMIT;

we are using the below query to get the mtd & ytd data.

select off, prod, agent, subagent, indi, date1, month1, year1, amount1,
       sum(amount1) over (partition by off, agent, subagent, prod, indi, month1 order by date1) mtd,
       sum(amount1) over (partition by off, agent, subagent, prod, indi, year1 order by date1) ytd
  from test_analy
 group by off, prod, agent, subagent, indi, date1, month1, year1, amount1

The issue is: on this aggregated data, when we want to see only office-wise MTD and YTD for a particular date using

select off, sum(mtd), sum(ytd)
  from (select off, prod, agent, subagent, indi, date1, month1, year1, amount1,
               sum(amount1) over (partition by off, agent, subagent, prod, indi, month1 order by date1) mtd,
               sum(amount1) over (partition by off, agent, subagent, prod, indi, year1 order by date1) ytd
          from test_analy
         where date1 = to_date('21-apr-2009', 'dd-mon-yyyy')
         group by off, prod, agent, subagent, indi, date1, month1, year1, amount1)
 group by off

these figures do not match the actual month and year totals. Is there some flaw in the logic?

version 10.2.0.3

mtd

May 19, 2009 - 7:19 am UTC

Reviewer: w from APAC

Figured out that not all of the 6 dimension combinations have a value on the queried date, hence it is giving a lower amount.

problem:

Storing the amount on 6 dimensions and by date: not all the dimension combinations will have a transaction on every date. One option is to fill in the dates for all unique dimension combinations, storing 0 where there is no amount. But that is a problem with a large dataset, since the combinations are huge. Is there a better way to accomplish this?

hope this is not confusing.

generating aggregates - analytic functions

August 09, 2009 - 3:48 am UTC

Reviewer: Dev Madeka from Sydney, Australia

Hi Tom,
Not sure if you consider this a new question or a follow-up.

We have a data warehouse which holds about 15 years' worth of data. One of the materialized views that we maintain in this DWH has 3 kinds of aggregates: Rolling Quarter (QTR), Year to Date (YTD) and Moving Annual Total (MAT). The sales history stores both dollars (_D) and units (_U). Currently this view is built procedurally and takes roughly 12 hrs to build - this is the major hog in our load process. We get data every month for the last 60 months from an external provider (since old data can change, we have to drop the last 59 months of data and load 60 months).

I have tried to use analytic functions to automate the creation of this view. However, since the sales fact is sparse (no history when sales were zero), I have to do a cartesian join on pack, month and geography to make sure I have rows for the QTR, YTD and MAT sums. When I roll this out to production, though, the cartesian join is going to blow up the number of rows in the query to about 4.7 billion+ (1,600 geographies, 180 months, 16,500 packs). A lot of packs haven't sold in years, and all I really need is to ensure that a pack has 12 rows following a sale in any geography for the following months so that the MAT calculations are correct. I could probably add a filter to remove any rows where all the calculations come out to zero, but I am afraid to try this on our prod database. Is there a smarter way to do this?


CREATE TABLE "TEST_DT"
( "DT_CODE" NUMBER(5,0) NOT NULL ENABLE,
"DT" DATE NOT NULL ENABLE,
"DT_CHAR" VARCHAR2(20 BYTE) NOT NULL ENABLE,
CONSTRAINT "TEST_DT_PK" PRIMARY KEY ("DT_CODE")
);



CREATE TABLE "TEST_GEO"
( "GEO_CODE" NUMBER(5,0) NOT NULL ENABLE,
"GEO_NAME" VARCHAR2(20 BYTE) NOT NULL ENABLE,
CONSTRAINT "TEST_GEO_PK" PRIMARY KEY ("GEO_CODE")
);




CREATE TABLE "TEST_PACK"
( "PACK_CODE" NUMBER(5,0) NOT NULL ENABLE,
"PACK_NAME" VARCHAR2(20 BYTE) NOT NULL ENABLE,
CONSTRAINT "PACK_GEO_PK" PRIMARY KEY ("PACK_CODE")
);



CREATE TABLE"TEST_SALES"
( "DT_CODE" NUMBER(5,0) NOT NULL ENABLE,
"GEO_CODE" NUMBER(5,0) NOT NULL ENABLE,
"PACK_CODE" NUMBER(5,0) NOT NULL ENABLE,
"MTH_D" NUMBER,
"MTH_U" NUMBER,
CONSTRAINT "TEST_SALES_PK" PRIMARY KEY ("DT_CODE", "GEO_CODE", "PACK_CODE")
);



REM INSERTING into TEST_DT
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (2,to_timestamp('01/FEB/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Feb-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (3,to_timestamp('01/MAR/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Mar-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (4,to_timestamp('01/APR/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Apr-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (5,to_timestamp('01/MAY/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-May-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (6,to_timestamp('01/JUN/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Jun-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (7,to_timestamp('01/JUL/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Jul-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (8,to_timestamp('01/AUG/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Aug-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (9,to_timestamp('01/SEP/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Sep-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (10,to_timestamp('01/OCT/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Oct-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (11,to_timestamp('01/NOV/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Nov-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (12,to_timestamp('01/DEC/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Dec-2008');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (13,to_timestamp('01/JAN/09','DD/MON/RR HH12:MI:SSXFF AM'),'01-Jan-2009');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (14,to_timestamp('01/FEB/09','DD/MON/RR HH12:MI:SSXFF AM'),'01-Feb-2009');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (15,to_timestamp('01/MAR/09','DD/MON/RR HH12:MI:SSXFF AM'),'01-Mar-2009');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (16,to_timestamp('01/APR/09','DD/MON/RR HH12:MI:SSXFF AM'),'01-Apr-2009');
Insert into TEST_DT (DT_CODE,DT,DT_CHAR) values (1,to_timestamp('01/JAN/08','DD/MON/RR HH12:MI:SSXFF AM'),'01-Jan-2008');


REM INSERTING into TEST_GEO
Insert into TEST_GEO (GEO_CODE,GEO_NAME) values (20000,'SYDNEY');
Insert into TEST_GEO (GEO_CODE,GEO_NAME) values (20001,'MELBOURNE');
Insert into TEST_GEO (GEO_CODE,GEO_NAME) values (20002,'BRISBANE');

REM INSERTING into TEST_PACK
Insert into TEST_PACK (PACK_CODE,PACK_NAME) values (1000,'SONITES');
Insert into TEST_PACK (PACK_CODE,PACK_NAME) values (2000,'VODITES');


REM INSERTING into TEST_SALES
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20000,1000,2250,75);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20001,1000,2310,77);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20002,1000,1140,38);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (2,20000,1000,1650,55);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (2,20001,1000,2400,80);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20000,1000,2460,82);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20001,1000,210,7);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20002,1000,1830,61);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (4,20000,1000,1590,53);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (4,20002,1000,2400,80);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20001,1000,1620,54);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20002,1000,480,16);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20000,1000,180,6);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20001,1000,2220,74);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20002,1000,2400,80);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (7,20000,1000,2070,69);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (7,20002,1000,1230,41);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20000,1000,2850,95);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20001,1000,2430,81);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20002,1000,540,18);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20000,1000,2760,92);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20001,1000,900,30);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20002,1000,270,9);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20000,1000,1080,36);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20002,1000,420,14);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20000,1000,600,20);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20001,1000,2490,83);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20002,1000,2490,83);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (12,20000,1000,330,11);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (12,20002,1000,1980,66);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (13,20000,1000,180,6);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (13,20001,1000,990,33);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20000,1000,1290,43);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20001,1000,360,12);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20002,1000,1950,65);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (15,20000,1000,1140,38);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (15,20002,1000,930,31);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20000,1000,750,25);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20002,1000,1230,41);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20000,2000,250,5);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (1,20001,2000,2010,67);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (2,20000,2000,360,12);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (2,20001,2000,330,11);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20000,2000,2070,69);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20001,2000,2190,73);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (3,20002,2000,870,29);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (4,20000,2000,930,31);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (4,20001,2000,750,25);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20000,2000,120,4);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20001,2000,1860,62);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (5,20002,2000,1260,42);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20000,2000,2940,98);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20001,2000,2790,93);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (6,20002,2000,150,5);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (7,20001,2000,1050,35);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (7,20002,2000,510,17);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20000,2000,1080,36);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20001,2000,2820,94);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (8,20002,2000,960,32);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20000,2000,60,2);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20001,2000,1320,44);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (9,20002,2000,1230,41);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20000,2000,1020,34);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20001,2000,1050,35);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (10,20002,2000,2880,96);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20001,2000,2040,68);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (11,20002,2000,750,25);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (12,20001,2000,2250,75);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (12,20002,2000,2100,70);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (13,20001,2000,390,13);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (13,20002,2000,2700,90);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20000,2000,2910,97);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20001,2000,1770,59);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (14,20002,2000,1410,47);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (15,20001,2000,990,33);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (15,20002,2000,60,2);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20000,2000,1050,35);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20001,2000,2250,75);
Insert into TEST_SALES (DT_CODE,GEO_CODE,PACK_CODE,MTH_D,MTH_U) values (16,20002,2000,1230,41);


--query to generate the MV
select all_combi.dt, all_combi.geo_code, all_combi.pack_code,
nvl(mth_d,0) mth_d,
nvl(mth_u,0) mth_u,
sum(nvl(mth_d,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows 2 preceding) qtr_d,
sum(nvl(mth_u,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows 2 preceding) qtr_u,
sum(nvl(mth_d,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows 11 preceding) mat_d,
sum(nvl(mth_u,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows 11 preceding) mat_u,
sum(nvl(mth_d,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows
to_char(dt,'MM')-1 preceding) ytd_d,
sum(nvl(mth_u,0)) over
(PARTITION BY all_combi.geo_code, all_combi.pack_code order by all_combi.dt rows
to_char(dt,'MM')-1 preceding) ytd_u
from
(select dt_code, dt, geo_code, pack_code
from test_dt, test_geo, test_pack ) all_combi,
test_sales
where
all_combi.dt_code = test_sales.dt_code(+)
and all_combi.geo_code = test_sales.geo_code(+)
and all_combi.pack_code = test_sales.pack_code(+)
order by all_combi.pack_code, all_combi.dt, all_combi.geo_code




Tom Kyte

Followup  

August 11, 2009 - 1:33 pm UTC

how can you use analytics (which do not aggregate rows, they preserve the rows) to generate your aggregates?

and since the analytics would not be incrementally refreshable - would this be a good idea at all?

seems you would just build quarterly rollups which could be used to generate the other rollups?
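For what it's worth, the three rollups are easy to sketch procedurally as a cross-check for any SQL rewrite. This is my own illustration, assuming a dense, sorted monthly series (names are mine, not from the thread):

```python
def rollups(months):
    """months: sorted list of (year, month, amount) with no gaps.
    Returns one (year, month, amount, qtr, mat, ytd) per row:
      qtr = rolling 3-month total (ROWS 2 PRECEDING),
      mat = rolling 12-month total (ROWS 11 PRECEDING),
      ytd = January-to-date within the row's year."""
    out = []
    for i, (y, m, amt) in enumerate(months):
        qtr = sum(a for _, _, a in months[max(0, i - 2):i + 1])
        mat = sum(a for _, _, a in months[max(0, i - 11):i + 1])
        ytd = sum(a for yy, _, a in months[:i + 1] if yy == y)
        out.append((y, m, amt, qtr, mat, ytd))
    return out

rows = rollups([(2008, 11, 5), (2008, 12, 3), (2009, 1, 2), (2009, 2, 4)])
print(rows[-1])  # (2009, 2, 4, 9, 14, 6)
```

Note the YTD resets at the year boundary while QTR and MAT roll straight across it, which is exactly why the YTD window in the MV query needs a frame that varies with the month.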

using Analytics function to calculate sale for last three months.

September 16, 2009 - 6:11 am UTC

Reviewer: Prats

Tom,

I need to write a query to provide the sum of the sale amount for the last three months. I went through your website and tried a few of the options available on this page, but couldn't come up with a suitable solution. Please HELP...

create table sales (saledate date, amount number);

insert into sales values (to_date('01/01/2009','mm/dd/yyyy'), 10);
insert into sales values (to_date('02/01/2009','mm/dd/yyyy'), 1);
insert into sales values (to_date('03/01/2009','mm/dd/yyyy'), NULL);
insert into sales values (to_date('04/01/2009','mm/dd/yyyy'), 5);
insert into sales values (to_date('05/01/2009','mm/dd/yyyy'), 2);
insert into sales values (to_date('06/01/2009','mm/dd/yyyy'), 4);
insert into sales values (to_date('07/01/2009','mm/dd/yyyy'), 7);
insert into sales values (to_date('08/01/2009','mm/dd/yyyy'), 8);
insert into sales values (to_date('09/01/2009','mm/dd/yyyy'), 2);
insert into sales values (to_date('10/01/2009','mm/dd/yyyy'), 4);
insert into sales values (to_date('11/01/2009','mm/dd/yyyy'), 3);
insert into sales values (to_date('12/01/2009','mm/dd/yyyy'), 0);

Desired solution :
12/01/2009 7
11/01/2009 9
10/01/2009 14
09/01/2009 17
08/01/2009 19
......
......
......

Tom Kyte

Followup  

September 16, 2009 - 4:22 pm UTC

ops$tkyte%ORA11GR1> select saledate, amount, add_months(saledate,-2),
  2         sum(amount) over (order by saledate asc range saledate-add_months(saledate,-2) preceding) sum_amt
  3    from sales
  4   order by saledate DESC
  5  /

SALEDATE       AMOUNT ADD_MONTHS    SUM_AMT
---------- ---------- ---------- ----------
12/01/2009          0 10/01/2009          7
11/01/2009          3 09/01/2009          9
10/01/2009          4 08/01/2009         14
09/01/2009          2 07/01/2009         17
08/01/2009          8 06/01/2009         19
07/01/2009          7 05/01/2009         13
06/01/2009          4 04/01/2009         11
05/01/2009          2 03/01/2009          7
04/01/2009          5 02/01/2009          6
03/01/2009            01/01/2009         11
02/01/2009          1 12/01/2008         11
01/01/2009         10 11/01/2008         10

12 rows selected.
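That window (the current month plus the two before it, with NULLs ignored the way SUM ignores them) can be sanity-checked procedurally. A quick Python sketch over the same amounts, January through December (names are mine, not from the thread):

```python
def trailing_sum(vals, window=3):
    """Sum each value together with the window-1 values before it,
    skipping Nones the way SQL's SUM skips NULLs.  A window that is
    all None yields None."""
    out = []
    for i in range(len(vals)):
        w = [v for v in vals[max(0, i - window + 1):i + 1] if v is not None]
        out.append(sum(w) if w else None)
    return out

# Jan..Dec amounts from the sales table above (March is NULL)
amounts = [10, 1, None, 5, 2, 4, 7, 8, 2, 4, 3, 0]
print(trailing_sum(amounts))
# [10, 11, 11, 6, 7, 11, 13, 19, 17, 14, 9, 7]
```

Reading that list left to right matches the SUM_AMT column above read bottom to top.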



September 24, 2009 - 8:41 am UTC

Reviewer: Prats

Thanks for the query Tom.
One small problem I have been stuck with for the last few days is the handling of missing months.
Say I don't have an entry for Apr-09: the query output wouldn't show the record corresponding to Apr-09.
Can we do something about it? Please guide...


SALEDATE       AMOUNT ADD_MONTHS    SUM_AMT
---------- ---------- ---------- ----------
07/01/2009          7 05/01/2009         13
06/01/2009          4 04/01/2009         11
05/01/2009          2 03/01/2009          7
04/01/2009          5 02/01/2009          6   <<< missing
03/01/2009            01/01/2009         11
02/01/2009          1 12/01/2008         11
01/01/2009         10 11/01/2008         10



Tom Kyte

Followup  

September 29, 2009 - 7:55 am UTC

ops$tkyte%ORA11GR1> select * from sales;

SALEDATE       AMOUNT
---------- ----------
01/01/2009         10
02/01/2009          1
03/01/2009
05/01/2009          2
06/01/2009          4
07/01/2009          7
08/01/2009          8
09/01/2009          2
10/01/2009          4
11/01/2009          3
12/01/2009          0

11 rows selected.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> with min_date as (select min(saledate) min_sd from sales),
  2       max_date as (select max(saledate) max_sd from sales),
  3           dates as (select add_months(min_sd,level-1) dt
  4                       from min_date, max_date
  5                            connect by level <= months_between(max_sd,min_sd)+1)
  6  select saledate, amount, add_months(saledate,-2),
  7         sum(amount) over (order by saledate asc range saledate-add_months(saledate,-2) preceding) sum_amt
  8    from (select dates.dt as saledate,
  9                 nvl(sales.amount,0) amount
 10            from dates, sales
 11                   where dates.dt = sales.saledate(+))
 12   order by saledate DESC
 13  /

SALEDATE       AMOUNT ADD_MONTHS    SUM_AMT
---------- ---------- ---------- ----------
12/01/2009          0 10/01/2009          7
11/01/2009          3 09/01/2009          9
10/01/2009          4 08/01/2009         14
09/01/2009          2 07/01/2009         17
08/01/2009          8 06/01/2009         19
07/01/2009          7 05/01/2009         13
06/01/2009          4 04/01/2009          6
05/01/2009          2 03/01/2009          2
04/01/2009          0 02/01/2009          1
03/01/2009          0 01/01/2009         11
02/01/2009          1 12/01/2008         11
01/01/2009         10 11/01/2008         10

12 rows selected.
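The gap-filling step (generate every month between the min and max date, defaulting missing or NULL amounts to 0 like the nvl() over the outer join to the generated dates above) looks like this procedurally; the function and sample data are my own illustration:

```python
def fill_months(series):
    """series: dict {(year, month): amount-or-None}.
    Return a sorted list of ((year, month), amount) with every month
    between the min and max key present; missing months and None
    amounts both become 0, mirroring nvl(sales.amount, 0) over the
    outer join to a generated month list."""
    keys = sorted(series)
    (y, m), last = keys[0], keys[-1]
    out = []
    while (y, m) <= last:
        # .get() returns None for absent months; `or 0` zeroes both cases
        out.append(((y, m), series.get((y, m)) or 0))
        y, m = (y, m + 1) if m < 12 else (y + 1, 1)
    return out

series = {(2009, 1): 10, (2009, 2): 1, (2009, 3): None,
          (2009, 5): 2, (2009, 6): 4}
print(fill_months(series)[3])  # ((2009, 4), 0) -- the missing April
```

With the series dense, the trailing three-month window needs no further special-casing.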


Lag and Lead (analytic function)

September 29, 2009 - 6:03 am UTC

Reviewer: Daphnia from INDIA

Tom,

I need your help.

Kindly see the below data.
-------------------------------------
REC_ID YEAR_MONTH YR VAL
-------------------------------------
10 200101 2001 null --> ignored
10 200102 2001 null --> ignored
10 200103 2001 2356
10 200104 2001 7898 ----> upper value(uv)
10 200105 2001 null -> 1 + (uv+lv)/count = 7239
10 200106 2001 null -> 2 + (uv+lv)/count = 7240
10 200107 2001 6578 -----> lower value(lv)
10 200108 2001 6878 ----> upper value(uv)
10 200109 2001 null
10 200110 2001 null
10 200111 2001 null
10 200112 2001 8078 -----> lower value(lv)
---------------------------------------
I need to calculate the null values by using the upper value (the value just above the run of nulls), the lower value (the value just after the run of nulls), and the total null count. The logic goes like n + (uv + lv)/count, and the result fills the null slots; count is the number of nulls in between uv and lv. Moreover, nulls before the first upper value must be ignored for rec_id 10. How can we achieve this? Kindly help me. The query which I tried is:

SELECT year_month, rec_id, val,
       last_value(val IGNORE NULLS) over (PARTITION BY rec_id ORDER BY year_month) last_val,
       CASE WHEN val IS NOT NULL AND previous_value IS NOT NULL AND next_value IS NULL
            THEN 'BEFORE_NULL'
            WHEN val IS NOT NULL AND previous_value IS NULL AND next_value IS NOT NULL
            THEN 'AFTER_NULL'
       END AS record_position,
       previous_value,
       next_value
  FROM (SELECT rownum row_num,
               rec_id,
               year_month,
               yr,
               val,
               lag(val) over (PARTITION BY rec_id ORDER BY year_month) previous_value,
               lead(val) over (PARTITION BY rec_id ORDER BY year_month) next_value
          FROM table_x
         WHERE yr = 2001
           AND rec_id = 10
         ORDER BY year_month);

But how do I calculate those values and fill them in for the nulls? Kindly help me.
---------------------------------------------------------
scripts
---------------------------------------------------------
CREATE TABLE TABLE_X
(REC_ID NUMBER,
YEAR_MONTH NUMBER,
YR NUMBER,
VAL NUMBER);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200101, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200102, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200103, 2001, 7898);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200104, 2001, 2356);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200105, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200106, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200107, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200108, 2001, 6578);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200109, 2001, 6878);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200110, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200111, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200112, 2001, 8078);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200202, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200203, 2002, 14225);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200204, 2002, 20554);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200205, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200206, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200207, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200208, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200209, 2002, 52199);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200210, 2002, 58528);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200211, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200212, 2002, 71186);

--------------------------------------------------------

Awaiting your reply


Tom Kyte

Followup  

October 02, 2009 - 9:07 am UTC

I hate it when the same question pops up all over the place. Look in the other place you asked the same exact thing.


http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:16389714357623
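For anyone reading along: taking the worked numbers in the question literally (7239 = 1 + (7898 + 6578)/2, 7240 = 2 + (7898 + 6578)/2), the fill rule appears to be n + (uv + lv)/count for the n-th null in a run of count nulls. A procedural sketch of that interpretation follows; this is my reading of the example, not a confirmed spec, and all names are mine:

```python
def fill_nulls(vals):
    """vals: one number-or-None per month, in order.
    Leading Nones (before the first value) are left alone, as the
    question requires.  Each interior run of Nones between an upper
    value uv and a lower value lv becomes n + (uv + lv)/count for
    n = 1..count, where count is the length of the run."""
    out = list(vals)
    i = 0
    while i < len(out) and out[i] is None:
        i += 1                      # skip (ignore) leading Nones
    while i < len(out):
        if out[i] is None:
            j = i
            while j < len(out) and out[j] is None:
                j += 1              # find the end of this None run
            if j < len(out):        # only fill when a lower value exists
                uv, lv, count = out[i - 1], out[j], j - i
                for n in range(1, count + 1):
                    out[i + n - 1] = n + (uv + lv) / count
            i = j
        else:
            i += 1
    return out

vals = [None, None, 2356, 7898, None, None, 6578, 6878, None, None, None, 8078]
print(fill_nulls(vals)[4:6])  # [7239.0, 7240.0] -- matches the question
```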

Analytic functions (sum over partition) with rollup?

November 25, 2009 - 7:59 am UTC

Reviewer: johny from Philippines

Hi Tom,

Consider:

Oracle9i Enterprise Edition Release 9.2.0.6.0

CREATE TABLE reportstg
(
maid VARCHAR2(4),
maprice NUMBER(5,2),
mapart VARCHAR2(5),
irid VARCHAR2(4),
irdateyymm VARCHAR2(4),
irstatus VARCHAR2(4),
iramount NUMBER(5,2)
)

insert into reportstg
values ('ma1', 100.00, 'part1', 'ir1', to_char(sysdate, 'YYMM'), 'DOWN', 100.00);

insert into reportstg
values ('ma1', 100.00, 'part1', 'ir2', to_char(sysdate, 'YYMM'), 'UP', 100.00);

insert into reportstg
values ('ma1', 100.00, 'part1', 'ir3', to_char(add_months(sysdate, -2), 'YYMM'), 'DOWN', 200.00);

insert into reportstg
values ('ma1', 150.00, 'part2', 'ir4', to_char(sysdate,'YYMM'), 'DOWN', 50.00);

insert into reportstg
values ('ma1', 150.00, 'part2', 'ir5', to_char(sysdate,'YYMM'), 'UP', 60.00);

insert into reportstg
values ('ma1', 10.00, 'part3', 'ir6', to_char(add_months(sysdate, -1), 'YYMM'), 'UP', 20.00);

insert into reportstg
values ('ma1', 10.00, 'part4', 'ir7', to_char(sysdate, 'YYMM'), 'DOWN', 20.00);

insert into reportstg
values ('ma1', 10.00, 'part4', 'ir8', to_char(add_months(sysdate, -1), 'YYMM'), 'DOWN', 20.00);

insert into reportstg
values ('ma1', 15.00, 'part5', 'ir9', to_char(sysdate, 'YYMM'), 'DOWN', 10.00);

insert into reportstg
values ('ma2', 100.00, 'part6', 'ir7', to_char(add_months(sysdate, -2), 'YYMM'), 'DOWN', 500.00);

insert into reportstg
values ('ma2', 100.00, 'part6', 'ir8', to_char(sysdate, 'YYMM'), 'UP', 20.00);

insert into reportstg
values ('ma2', 150.00, 'part7', 'ir9', to_char(add_months(sysdate, -1), 'YYMM'), 'DOWN', 110.00);

insert into reportstg
values ('ma2', 10.00, 'part8', 'ir10', to_char(sysdate, 'YYMM'), 'DOWN', 120.00);

insert into reportstg
values ('ma3', 10.00, 'part9', 'ir11', to_char(sysdate, 'YYMM'), 'DOWN', 220.00);

select * from reportstg;

MAID MAPRICE MAPAR IRID IRDA IRST IRAMOUNT
---- ---------- ----- ---- ---- ---- ----------
ma1 100 part1 ir1 0911 DOWN 100
ma1 100 part1 ir2 0911 UP 100
ma1 100 part1 ir3 0909 DOWN 200
ma1 150 part2 ir4 0911 DOWN 50
ma1 150 part2 ir5 0911 UP 60
ma1 10 part3 ir6 0910 UP 20
ma1 10 part4 ir7 0911 DOWN 20
ma1 10 part4 ir8 0910 DOWN 20
ma1 15 part5 ir9 0911 DOWN 10
ma2 100 part6 ir7 0909 DOWN 500
ma2 100 part6 ir8 0911 UP 20

MAID MAPRICE MAPAR IRID IRDA IRST IRAMOUNT
---- ---------- ----- ---- ---- ---- ----------
ma2 150 part7 ir9 0910 DOWN 110
ma3 10 part9 ir11 0911 DOWN 220
ma2 10 part8 ir10 0911 DOWN 120

14 rows selected.

select distinct rpt.maid,
rpt.maprice,
rpt.mapart,
rem.downamt,
rem.upamt,
sum(rpt.iramount) over (partition by rpt.maid, rpt.mapart) yrtotal
from reportstg rpt,
(select maid,
maprice,
mapart,
max(decode(irstatus, 'DOWN', monthtotal, null)) downamt,
max(decode(irstatus, 'UP', monthtotal, null)) upamt
from (
select distinct maid,
maprice,
mapart,
irstatus,
sum(iramount) over (partition by maid, mapart, irdateyymm, irstatus) monthtotal
from reportstg
where irdateyymm = '0911'
)
group by maid, maprice, mapart) rem
where rpt.maid = rem.maid(+)
and rpt.mapart = rem.mapart(+)
and rpt.irstatus != 'UP'

MAID MAPRICE MAPAR DOWNAMT UPAMT YRTOTAL
---- ---------- ----- ---------- ---------- ----------
ma1 100 part1 100 100 300
ma1 150 part2 50 60 50
ma1 10 part4 20 40
ma1 15 part5 10 10
ma2 100 part6 20 500
ma2 150 part7 110
ma2 10 part8 120 120
ma3 10 part9 220 220

8 rows selected.

Basically, I wanted a query to return the total amount of status 'DOWN' and 'UP'
on the most recent irdateyymm ('0911'), pivoted by status, then get the total for the
entire year (excluding the 'UP' status from the yearly total).

Some quick questions:
a. Is my query good? I'm not sure if there's an easier way to do this.
b. Is it possible to have the output something like this:

MAID MAPRICE MAPAR DOWNAMT UPAMT YRTOTAL
---- ---------- ----- ---------- ---------- ----------
ma1 100 part1 100 100 300
ma1 150 part2 50 60 50
ma1 10 part4 20 40
ma1 15 part5 10 10
ma1 total 180 400
ma2 100 part6 20 500
ma2 150 part7 110
ma2 10 part8 120 120
ma2 total 120 730
ma3 10 part9 220 220
ma3 total 220 220

I wanted to rollup the totals based on maid. Thanks in advance.
Tom Kyte

Followup  

November 27, 2009 - 3:43 pm UTC

ops$tkyte%ORA9IR2> select maid, decode( grouping(maprice), 1, 'Total:' ), maprice, mapart, downamt, upamt, sum(yrtotal)
  2    from (
  3  select distinct rpt.maid,
  4         rpt.maprice,
  5         rpt.mapart,
  6         rem.downamt,
  7         rem.upamt,
  8         sum(rpt.iramount) over (partition by rpt.maid, rpt.mapart) yrtotal
  9  from reportstg rpt,
 10  (select maid,
 11         maprice,
 12         mapart,
 13         max(decode(irstatus, 'DOWN', monthtotal, null)) downamt,
 14         max(decode(irstatus, 'UP', monthtotal, null)) upamt
 15    from (
 16         select distinct maid,
 17                maprice,
 18                mapart,
 19                irstatus,
 20                sum(iramount) over (partition by maid, mapart, irdateyymm,
 21  irstatus) monthtotal
 22         from reportstg
 23         where irdateyymm = '0911'
 24         )
 25  group by maid, maprice, mapart) rem
 26  where rpt.maid = rem.maid(+)
 27  and rpt.mapart = rem.mapart(+)
 28  and rpt.irstatus != 'UP'
 29        )
 30   group by grouping sets( (maid,maprice,mapart,downamt,upamt), (maid))
 31  /

MAID DECODE    MAPRICE MAPAR    DOWNAMT      UPAMT SUM(YRTOTAL)
---- ------ ---------- ----- ---------- ---------- ------------
ma1                 10 part4         20                      40
ma1                 15 part5         10                      10
ma1                100 part1        100        100          300
ma1                150 part2         50         60           50
ma1  Total:                                                 400
ma2                 10 part8        120                     120
ma2                100 part6                    20          500
ma2                150 part7                                110
ma2  Total:                                                 730
ma3                 10 part9        220                     220
ma3  Total:                                                 220

11 rows selected.
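For anyone new to GROUPING SETS: the query above asks for two independent grouping levels in a single pass, the detail level (maid, maprice, mapart, downamt, upamt) and the maid level, and GROUPING(maprice) returns 1 only on the maid-level rows, which is what drives the 'Total:' label in the DECODE. A minimal sketch against a hypothetical two-column table (the table and data are made up for illustration):

```sql
-- hypothetical table, for illustration only
create table sales (region varchar2(10), amt number);

insert into sales values ('east', 10);
insert into sales values ('east', 20);
insert into sales values ('west', 5);

-- one grouping set per desired level: detail rows plus one total row per region;
-- grouping(amt) = 1 flags the rows produced by the coarser (region) set
select region,
       decode( grouping(amt), 1, 'Total:' ) label,
       sum(amt)
  from sales
 group by grouping sets ( (region, amt), (region) );
```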

Analytic Functions

November 28, 2009 - 8:36 am UTC

Reviewer: johny from Philippines

Awesome. Thanks a lot! :)

Analytic Functions

December 18, 2009 - 11:54 am UTC

Reviewer: anon from UK

Hi Tom,

If I have table like this:

create table x (a number, b number);

insert into x values (1,1);
insert into x values (1,1);
insert into x values (1,2);
insert into x values (1,3);

Is it possible to select out a running count of distinct occurrences of b:

e.g. a b result
---------------
1 1 1
1 1 1
1 2 2
1 3 3

Tom Kyte

Followup  

December 18, 2009 - 1:06 pm UTC

http://asktom.oracle.com/Misc/comparative-window-functions.html#c6547676037055918655

demonstrates a technique for doing that.
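As a side note, in this particular case the running count is ordered by b itself, so DENSE_RANK alone happens to give the same answer. A sketch, with the caveat that this only works when the order-by column and the counted column are the same; the linked technique is the general solution:

```sql
-- each new distinct value of b bumps the dense rank by exactly one,
-- so the rank doubles as a running count of distinct b values
select a, b,
       dense_rank() over (partition by a order by b) result
  from x;
```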

Ranking date overlaps

December 22, 2009 - 7:55 pm UTC

Reviewer: A reader

Hi Tom,

How can I get ranking from datetime overlaps

CODE START_DAT END_DATE RN
---------- --------- --------- ----------
111 01-JAN-09 08:00 01-JAN-09 10:00 1
111 01-JAN-09 08:45 01-JAN-09 10:05 2
111 01-JAN-09 09:20 01-JAN-09 10:10 3
111 01-JAN-09 10:30 01-JAN-09 11:10 1
111 01-JAN-09 10:25 01-JAN-09 11:12 2
111 01-JAN-09 13:12 01-JAN-09 15:24 1
111 01-JAN-09 18:56 01-JAN-09 22:21 1

222 01-JAN-09 01:56 01-JAN-09 02:28 1
222 01-JAN-09 02:10 01-JAN-09 02:30 2
222 01-JAN-09 02:12 01-JAN-09 02:32 3
222 01-JAN-09 03:22 01-JAN-09 03:23 1
222 01-JAN-09 06:22 01-JAN-09 08:23 1





Tom Kyte

Followup  

December 31, 2009 - 11:12 am UTC

you'd have to start by defining what:

... ranking from datetime overlaps
...


means, it means nothing to me.

Generating max+1

January 19, 2010 - 9:53 pm UTC

Reviewer: AG from NY, USA

Tom,

Needed some help to write a query that would assign/generate max+1 when there is null.

create table test(loc_name varchar2(10), id varchar2(10), lk number);

insert into test values('A','U1',0);
insert into test values('A','U2',0);
insert into test values('A','U3',0);
insert into test values('A','U4',2);
insert into test values('A','U5',3);
insert into test values('A','U6',5);
insert into test values('A','U7',null);
insert into test values('A','U8',null);

select * from test;

loc id lk
---- -- --
A U1 0
A U2 0
A U3 0
A U4 2
A U5 3
A U6 5
A U7
A U8




Result:


loc id lk
---- -- --
A U1 0
A U2 0
A U3 0
A U4 2
A U5 3
A U6 5
A U7 6 <== (5+1)
A U8 7 <== (6+1)


Thanks in Advance for any help.

AG
Tom Kyte

Followup  

January 20, 2010 - 11:28 am UTC

I see no order by. The nulls will come out wherever they feel like it; they won't come out last in many cases.


and what if the data is:

insert into test values('A','U1',0);
insert into test values('A','U2',0);
insert into test values('A','U3',0);
insert into test values('A','U4',2);
insert into test values('A','U5',3);
insert into test values('A','U6',5);
insert into test values('A','U7',null);
insert into test values('A','U8',null);
insert into test values('A','U9',6);

What then? Give all of the details, use an ORDER BY, think about strange situations, and get back to us.

January 20, 2010 - 8:05 am UTC

Reviewer: AG from NY, USA

Tom,

Sorry about not posting the example clearly. The actual requirement is to write a query that assigns/generates, within a group/partition, max+1 when the value is null.

create table test(loc_name varchar2(10), id varchar2(10), lk number);

insert into test values('A','U1',0);
insert into test values('A','U2',0);
insert into test values('A','U3',0);
insert into test values('A','U4',2);
insert into test values('A','U5',3);
insert into test values('A','U6',5);
insert into test values('A','U7',null);
insert into test values('A','U8',null);
insert into test values('B','S1',0);
insert into test values('B','S2',0);
insert into test values('B','S3',null);
insert into test values('B','S4',null);


select * from test;

LOC_NAME ID LK
---------- ---------- ----------
A U1 0
A U2 0
A U3 0
A U4 2
A U5 3
A U6 5
A U7
A U8
B S1 0
B S2 0
B S3
B S4

12 rows selected.




Result:


loc id lk
---- -- --
A U1 0
A U2 0
A U3 0
A U4 2
A U5 3
A U6 5
A U7 6 <== (5+1)
A U8 7 <== (6+1)
B S1 0
B S2 0
B S3 1 <== (0+1)
B S4 2 <== (1+1)


Thanks in Advance for any help.

AG

Tom Kyte

Followup  

January 20, 2010 - 11:38 am UTC

See above, add my U9 to the example and explain how that works.
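One way to sketch the requested fill, assuming ID supplies the ordering and setting aside the U9 caveat above (a non-null value arriving after the nulls would break the max+1 premise entirely):

```sql
-- non-null rows keep their lk; null rows get the partition max
-- plus a running index over just the null rows of that partition
select loc_name, id,
       nvl(lk,
           max(lk) over (partition by loc_name)
           + row_number() over (partition by loc_name,
                                case when lk is null then 1 end
                                order by id)) lk
  from test
 order by loc_name, id;
```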

Analytical function

January 29, 2010 - 1:48 am UTC

Reviewer: Sudhakar from India

Hi Tom,

create table daily_data(id number, status_id number, date_time date);

insert into daily_data values(1,0,to_date('12/4/2009 9:40:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,0,to_date('12/4/2009 9:47:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,1,to_date('12/4/2009 9:57:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,0,to_date('12/4/2009 10:57:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,0,to_date('12/4/2009 11:10:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,1,to_date('12/4/2009 11:17:06','mm/dd/yyyy hh:mi:ss'));
insert into daily_data values(1,1,to_date('12/4/2009 11:47:06','mm/dd/yyyy hh:mi:ss'));

As per the above data, I need to get the output as the time gap between status_id "1" and the next immediate status_id "0".

For example
-----------
1) 12/4/2009 9:57:06 - 12/4/2009 10:57:06 =60 minutes
2) 12/4/2009 11:17:06 - 12/4/2009 11:47:06 = 30 minutes.

Could you please help me with a SQL query to give the output specified above.

Best regards & Thank you,
Sudhakar
Tom Kyte

Followup  

January 29, 2010 - 4:18 pm UTC

your output doesn't compute.


the time 11:47:06 corresponds to a status_id = 1 record, it is not the next 0 record.

Analytic Functions

February 03, 2010 - 11:20 am UTC

Reviewer: Stephan from SN

Hi Tom,
i have below table data

select * from test;
fr_dt      to_dt       amount
01/01/2010 31/12/2010   1000
01/01/2011 31/12/2011   1000
01/01/2012 31/12/2012   1400
01/01/2013 31/12/2013   1400
01/01/2014 31/12/2014   1400
01/01/2015 31/12/2015   1400
01/01/2016 31/12/2016   1000
01/01/2017 31/12/2017   1000
01/01/2018 31/12/2018   1000

How do I write a query to produce the output below?
fr_dt      to_dt       amount
01/01/2010 31/12/2011   1000
01/01/2012 31/12/2015   1400
01/01/2016 31/12/2018   1000

Please help.
Tom Kyte

Followup  

February 03, 2010 - 12:28 pm UTC

... how to write a query to output as below. ...

use sql. I would.


You'll be able to do it because

a) you have the create table ( I don't)
b) you have the inserts ready to go (i don't)

It looks pretty easy. If I had a create table and inserts, I could even demonstrate it for you :)

remember my nls_date_format isn't yours, make it easy to run your test script.

Analytic Functions

February 03, 2010 - 5:19 pm UTC

Reviewer: Stephan from SN

Thanks for your swift reply.

here is insert statements

create table test (fr_dt date,to_dt date,amount number);
insert into test values(to_date('01/01/2010','dd/mm/yyyy'),to_date('31/12/2010','dd/mm/yyyy'),1000);
insert into test values(to_date('01/01/2011','dd/mm/yyyy'),to_date('31/12/2011','dd/mm/yyyy'),1000);
insert into test values(to_date('01/01/2012','dd/mm/yyyy'),to_date('31/12/2012','dd/mm/yyyy'),1400);
insert into test values(to_date('01/01/2013','dd/mm/yyyy'),to_date('31/12/2013','dd/mm/yyyy'),1400);
insert into test values(to_date('01/01/2014','dd/mm/yyyy'),to_date('31/12/2014','dd/mm/yyyy'),1400);
insert into test values(to_date('01/01/2015','dd/mm/yyyy'),to_date('31/12/2015','dd/mm/yyyy'),1400);
insert into test values(to_date('01/01/2016','dd/mm/yyyy'),to_date('31/12/2016','dd/mm/yyyy'),1000);
insert into test values(to_date('01/01/2017','dd/mm/yyyy'),to_date('31/12/2017','dd/mm/yyyy'),1000);
insert into test values(to_date('01/01/2018','dd/mm/yyyy'),to_date('31/12/2018','dd/mm/yyyy'),1000);

Tom Kyte

Followup  

February 04, 2010 - 12:20 pm UTC

ops$tkyte%ORA10GR2> select min(fr_dt), max(to_dt), max(amount)
  2    from (
  3  select fr_dt, to_dt, amount, last_value(grp ignore nulls) over (order by fr_dt) carry_down
  4    from (
  5  select fr_dt, to_dt, amount,
  6         case when nvl(lag(amount) over (order by fr_dt),-amount) <> amount
  7              then row_number() over (order by fr_dt)
  8          end grp
  9    from test
 10         )
 11         )
 12   group by carry_down
 13   order by 1;

MIN(FR_DT MAX(TO_DT MAX(AMOUNT)
--------- --------- -----------
01-JAN-10 31-DEC-11        1000
01-JAN-12 31-DEC-15        1400
01-JAN-16 31-DEC-18        1000

Ugly but works... I think?

February 04, 2010 - 8:47 am UTC

Reviewer: A reader from UK

select fr_dt, to_dt, amount
from
(
select case when start_pos is not null then fr_dt end as fr_dt,
lead(case when end_pos is not null then to_dt end ) over (order by fr_dt) as to_dt,
amount
from
(
select fr_dt, to_dt, amount, start_pos, case when ( start_pos is null and lead(start_pos,1,1) over (order by to_dt) is not null ) then 'End' else null end as end_pos
from
(
select fr_dt, to_dt, amount,
case when amount = lag(amount) over (order by to_dt) then null else 'Start' end as start_pos
from test
)
)
where start_pos is not null or end_pos is not null
)
where fr_dt is not null and to_dt is not null

Analytic Functions

February 04, 2010 - 2:40 pm UTC

Reviewer: Stephan from SN

Thanks Tom, you're the best :)

Only 2 SELECTs

February 05, 2010 - 12:30 pm UTC

Reviewer: CezarN from ROMANIA

select distinct
case when b=e or b>0 then fr_dt else lag(fr_dt) over (order by fr_dt) end as fr_dt,
case when e>0 then to_dt else lead(to_dt) over (order by fr_dt) end as to_dt,
amount
from
(
select case when lag(t.amount,1,-t.amount) over (order by fr_dt) =t.amount then 0 else row_number() over(order by fr_dt) end b,
case when lead(t.amount,1,-t.amount) over (order by fr_dt)=t.amount then 0 else row_number() over(order by fr_dt) end e,
t.*
from test t
)
where b+e<>0
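A third variant of the same collapse uses the difference of two ROW_NUMBERs (often called the Tabibitosan method): within any run of consecutive rows sharing an amount, a global row number and a per-amount row number grow in lockstep, so their difference is constant within the run and differs between separate runs of the same amount. A sketch against the same TEST table:

```sql
-- grp is constant within each consecutive run of equal amounts,
-- so (amount, grp) identifies the run and min/max give its date range
select min(fr_dt), max(to_dt), amount
  from (select fr_dt, to_dt, amount,
               row_number() over (order by fr_dt)
               - row_number() over (partition by amount order by fr_dt) grp
          from test)
 group by amount, grp
 order by 1;
```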

eliminate records through analytics

June 01, 2010 - 12:52 pm UTC

Reviewer: Sanji from Shelton, CT

Tom,

Scenario

create table test1 (party_id varchar2(5), sam_flag varchar2(1));

insert into test1 values('123','N');
insert into test1 values('456','N');
insert into test1 values('789','N');
insert into test1 values('ABC','Y');
insert into test1 values('ABC','Y');
insert into test1 values('ABC','N');
insert into test1 values('DEF','N');
insert into test1 values('DEF','Y');
insert into test1 values('DEF','N');
insert into test1 values('GHI','Y');
insert into test1 values('GHI','Y');
insert into test1 values('GHI','N');

I'm attempting to rewrite the following query through analytics

SELECT * FROM TEST1
WHERE (PARTY_ID, SAM_FLAG) IN
(SELECT PARTY_ID,SAM_FLAG FROM TEST1 WHERE SAM_FLAG='Y')
OR PARTY_ID NOT IN
(SELECT DISTINCT PARTY_ID FROM TEST1 WHERE SAM_FLAG = 'Y' )
ORDER BY PARTY_ID

The result has to look like

PARTY SAM_FLAG
----- -
123 N
456 N
789 N
ABC Y
ABC Y
DEF Y
GHI Y
GHI Y

I tried the following query but need logic to eliminate certain records

SELECT PARTY_ID, SAM_FLAG,
ROW_NUMBER() OVER (PARTITION BY PARTY_ID ORDER BY SAM_FLAG) AS RANK1
FROM TEST1
/

PARTY S RANK1
----- - ----------
123 N 1

456 Y 1

789 N 1

ABC N 1
Y 2
Y 2

DEF N 1
N 1
Y 3

GHI N 1
Y 2
Y 2

Following records should not be a part of the result set
'ABC','N'
'DEF','N'
'DEF','N'
'GHI','N'

Would appreciate any insight.

Thanks
Sanji
Tom Kyte

Followup  

June 08, 2010 - 8:51 am UTC

it would have been really nice to phrase the question as a question, rather than just giving a SQL statement and forcing us to live within the confines of that SQL statement.

we'll make the safe (hopefully) assumption that party_id is NOT NULL in real life.

You want (apparently) every row that is a 'Y' if there exists at least one 'Y' row for a party_id

OR

You want all rows if there are NO 'Y' records.


So, just see if there is a 'Y' record by party_id (easy) and then filter on its existence or lack thereof.




ops$tkyte%ORA11GR2> select party_id, sam_flag, sam_flag2
  2    from (
  3  select party_id, sam_flag,
  4         max(case when sam_flag = 'Y' then sam_flag end) over (partition by party_id) sam_flag2
  5    from test1
  6         )
  7   where (sam_flag = 'Y' and sam_flag2 = 'Y')
  8      or (sam_flag2 is null)
  9   order by party_id
 10  /

PARTY S S
----- - -
123   N
456   N
789   N
ABC   Y Y
ABC   Y Y
DEF   Y Y
GHI   Y Y
GHI   Y Y

8 rows selected.




followup to previous post

June 02, 2010 - 3:15 pm UTC

Reviewer: Sanji from Shelton, CT

Tom,
I rewrote the above query (previous post from me) but am not sure if this is the most efficient way.

OPEN:SANJI:FDCTL@ORADEV1>select party_id,sam_flag, rank2
2 from(
3 select party_id,
4 lead(sam_flag) over (partition by party_id order by sam_flag) as rank2,
5 sam_flag
6 from test1)
7 where rank2 is null
8 or (rank2 = 'Y' and sam_flag=rank2)
9 order by 1
10 /

PARTY_ID SAM_F RANK2
---------- ----- -----
123 N

456 Y

789 N

ABC Y Y
Y