

Question and Answer

Tom Kyte

Thanks for the question, Suresh.

Asked: December 07, 2001 - 8:31 pm UTC

Last updated: March 07, 2011 - 9:42 am UTC

Version: 8.1.7


You Asked

What is the best way to also get the 'other columns' of the rows returned when using aggregate functions?

For eg.

select max(inv_amt), cust_code
from invoices
group by cust_code;

Now, say you want to get the invoice_number and the invoice_date for the respective record as well -- what is the best way to do it?

Thank you,
Suresh.

and Tom said...

There are a couple of ways to do this.

Let's say you are working with the EMP table and want to get the DEPTNO and MAX(sal) by deptno, and would like to get the ENAME and HIREDATE that go with that....

FIRST -- we have to assume that CUST_CODE,INV_AMT is unique or we will get >1 record back for a given cust_code (I mean, if cust_code = 100 has two entries with inv_amt = 1000000 -- then there are 2 invoice_numbers and 2 invoice_dates with the MAX!! so be prepared for that!!)


Ok, we can:



scott@ORA817DEV.US.ORACLE.COM> select deptno, sal, ename, hiredate
2 from emp
3 where sal = ( select max(sal) from emp e2 where e2.deptno = emp.deptno )
4 order by deptno
5 /

DEPTNO SAL ENAME HIREDATE
---------- ---------- ---------- ---------
10 5000 KING 17-NOV-81
20 3000 SCOTT 09-DEC-82
20 3000 FORD 03-DEC-81
30 2850 BLAKE 01-MAY-81

scott@ORA817DEV.US.ORACLE.COM>
scott@ORA817DEV.US.ORACLE.COM>
scott@ORA817DEV.US.ORACLE.COM> select deptno,
2 to_number(substr( max_sal, 1, 10 )) sal,
3 substr( max_sal, 20 ) ename,
4 to_date( substr( max_sal, 11, 9 ) ) hiredate
5 from (
6 select deptno, max( to_char( sal,'fm0000000009') || hiredate || ename ) max_sal
7 from emp
8 group by deptno
9 )
10 /

DEPTNO SAL ENAME HIREDATE
---------- ---------- ----------- ---------
10 5000 KING 17-NOV-81
20 3000 SCOTT 09-DEC-82
30 2850 BLAKE 01-MAY-81

scott@ORA817DEV.US.ORACLE.COM>
scott@ORA817DEV.US.ORACLE.COM>
scott@ORA817DEV.US.ORACLE.COM> select distinct deptno,
2 first_value(sal) over ( partition by deptno order by SAL desc nulls last ) sal,
3 first_value(ename) over ( partition by deptno order by SAL desc nulls last ) ename,
4 first_value(hiredate) over ( partition by deptno order by SAL desc nulls last )
5 hiredate
6 from emp
7 order by deptno
8 /

DEPTNO SAL ENAME HIREDATE
---------- ---------- ---------- ---------
10 5000 KING 17-NOV-81
20 3000 FORD 03-DEC-81
30 2850 BLAKE 01-MAY-81

scott@ORA817DEV.US.ORACLE.COM>

The totally interesting thing here is that all three queries are correct (given your problem statement) but they all return different answers given the same data!!!

(this is one of my favorite problems in SQL, dealing with top-n's, max's and such -- I use questions like this in interviews all of the time to see if people can catch this subtle problem, most do not...)

So, which one is "best"? That's in the eye of the beholder. I would probably use the analytic functions (first_value) as they are by far the cleanest approach and offer immense functionality once you learn them.

The one that uses max( to_char(sal,'fm.......) is sort of a trick. You encode the value you want to MAX in a fixed-length string, glue the rest of the data onto the end of it, and then SUBSTR it back out. Very effective, and it can perform very well (but the analytic functions are typically faster still).

The first one is the traditional set-based SQL approach to the problem. Boring but tried and true. It works and can perform well in the right circumstances (proper indexes, the right amount of data and such)....

There are other approaches as well -- this just gives you a flavor of what you can do.
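To map the rank() flavor of the analytic approach back to the original invoices question, here is a minimal sketch. SQLite through Python is used only so the example is self-contained and runnable (the thread itself is Oracle 8.1.7), and every name beyond cust_code/inv_amt is invented. Note how the tie Tom warns about surfaces as two rows for cust_code 100:

```python
# Sketch only: SQLite (3.25+ window functions) stands in for Oracle here.
# Table and column names beyond the question's cust_code/inv_amt are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table invoices (cust_code text, inv_no text, inv_date text, inv_amt int);
    insert into invoices values
      ('100', 'I-1', '2001-01-05', 500),
      ('100', 'I-2', '2001-03-09', 900),   -- tied maximum for cust 100
      ('100', 'I-3', '2001-06-01', 900),   -- tied maximum for cust 100
      ('200', 'I-4', '2001-02-17', 250);
""")

# rank() keeps BOTH tied rows -- the duplicate-maximum caveat from the answer.
rows = conn.execute("""
    select cust_code, inv_amt, inv_no, inv_date
      from (select i.*,
                   rank() over (partition by cust_code order by inv_amt desc) rnk
              from invoices i)
     where rnk = 1
     order by cust_code, inv_no
""").fetchall()
print(rows)
```

Swapping rank() for row_number() would arbitrarily keep only one of the tied invoices, which may or may not be what you want.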


Comments

using aggregate functions

Suresh Nair, December 08, 2001 - 5:19 pm UTC

The response was definitely very helpful. It was the answer to my last two days of searching for the same. It also told me that there is something called 'analytic functions'. For the time being I will be using the 2nd method (the trick). Before using the 'analytic functions' method, let me learn what they are first.

Tom, Thank you very much.

analytical functions

A reader, December 08, 2001 - 5:34 pm UTC

The second row of the third query:

20 3000 FORD 03-DEC-8

doesn't look right to me. One can deduce that the hiredate for FORD was '03-DEC-8' which is not a valid fact in the database.




Tom Kyte
December 08, 2001 - 6:13 pm UTC

The second row of the 3rd query is (and always was)

DEPTNO SAL ENAME HIREDATE
---------- ---------- ---------- ---------
10 5000 KING 17-NOV-81
20 3000 FORD 03-DEC-81
30 2850 BLAKE 01-MAY-81

scott@ORA817DEV.US.ORACLE.COM>



your browser must have done something funny to it -- there was an 81 there, not just an 8.

immense functionality ?

Mikito Harakiri, December 08, 2001 - 7:46 pm UTC

Tom, can you be a little bit more specific about why (and whether) analytic functions are so powerful? Relational theory tells us that joins are the best thing since sliced bread and that referencing the "previous" row is bad. How would you compare a "sliding window" analytic query with this:

select empno,
(select sum(sal) from
(select rownum rn, sal from
(select sal from emp order by -sal)
)
where center-1<=rn and rn<=center+1 )
from (select rownum center, empno from
(select empno from emp order by -sal)
)

Don't you think that adding new features into the language distract users from being able to leverage existing ones?



Tom Kyte
December 09, 2001 - 10:31 am UTC

Analytic functions make possible what was either

o hard to code (relational theory is awesome but sometimes people have jobs to do, reports to crank out)

o impractical to perform for performance reasons

In "theory" many things are "bad"; in practice, they are necessary.  (I'm not a big fan of "theory", I'm a big fan of getting things done, done well, done fast.)

Try to compute a 5 day moving average sometime using relational theory.  Doable but not practical.
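To make that moving-average point concrete, here is a sketch of a 5-day moving average expressed as a window frame. SQLite through Python stands in for Oracle; the sales table and the one-row-per-day assumption (which makes a ROWS frame equal a 5-day window) are invented for illustration:

```python
# A 5-day moving average via a window frame -- the kind of query that is
# painful in "pure" relational SQL.  Assumes one row per day, so a ROWS
# frame of 4 preceding rows plus the current row spans 5 days.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table sales (day int, amt int);
    insert into sales values (1,10),(2,20),(3,30),(4,40),(5,50),(6,60),(7,70);
""")

avgs = conn.execute("""
    select day,
           avg(amt) over (order by day
                          rows between 4 preceding and current row) mov_avg
      from sales
     order by day
""").fetchall()
print(avgs)   # day 5 -> avg(10..50) = 30.0, day 7 -> avg(30..70) = 50.0
```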

You yourself violate relational theory above: you are using our pseudo column ROWNUM, which assigns order to rows in a result set. That's a big bad no-no. You are relying on the ordering of rows in a set -- another Oracle non-pure extension (ORDER BY in a subquery -- that's a huge no-no)....  You've broken many rules.

Ok, after looking at your query, I deduce that it simply shows the sum of the salary of 

a) the row "in front of" or "preceding" the current row
b) the current row
c) the row "after" or "following" the current row

So, I loaded up 1,000 employees (I tried with 30,000 but got really bored waiting for the answer...) and ran the queries:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table emp ( empno int NOT NULL, sal int NOT NULL );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert /*+ append */
  2  into emp
  3  select rownum, rownum from all_objects
  4  where rownum <= 1000;

1000 rows created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> commit;

Commit complete.

ops$tkyte@ORA817DEV.US.ORACLE.COM> select count(*) from emp;

  COUNT(*)
----------
      1000

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter table emp add constraint emp_pk primary key(empno);

Table altered.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create index emp_idx_one on emp(empno,sal);

Index created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create index emp_idx_two on emp(sal,empno);

Index created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table emp compute statistics
  2  for table
  3  for all indexes
  4  for all indexed columns;

Table analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter session set sql_trace=true;


I overindexed just to give your query every possible index it might want.  Upon running TKPROF to review the result of your query vs the SAME query using analytic functions, we find:

select empno,
       (select sum(sal) from
         (select rownum rn, sal from
           (select sal from emp order by -sal)
         )
       where center-1<=rn and rn<=center+1 ) x
from (select rownum center, empno, sal from
       (select empno, sal from emp order by -sal)
      )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       68     14.22      14.19          0       4004       4004        1000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       70     14.23      14.20          0       4004       4004        1000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 63

Rows     Row Source Operation
-------  ---------------------------------------------------
   1000  VIEW
   1000   COUNT
   1000    VIEW
   1000     SORT ORDER BY
   1000      INDEX FAST FULL SCAN (object id 23779)




select empno,
       sum(sal) over (order by sal desc
                      rows between 1 preceding and 1 following ) x
 from emp

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       68      0.03       0.03          0          4          0        1000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       70      0.03       0.03          0          4          0        1000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 63

Rows     Row Source Operation
-------  ---------------------------------------------------
   1000  WINDOW SORT
   1000   INDEX FULL SCAN (object id 23780)


So, which query would YOU want to use in the real world??


I find the second query using analytic functions not only faster but more intuitive.  No fancy "order and assign a row number, find the row numbers around my rownumber and add them up".  Just a simple "sum the salary of the preceding and following row"  period.

Much, much faster, easier to read -- easier to comprehend and code. That's why I think analytic functions RULE.

Larry Ellison is doing this "America's Cup" boat thing.  When he bought the boat, it ran on SQL Server (the boat is fully instrumented to capture hundreds of thousands of pieces of data, from wind speed and direction to water-related things, the fullness of the sails, everything).  The app they had was so complex and used so many products that there was one guy in New Zealand who knew how to set it up and run it.

Our group moved it into Oracle; we use nothing but SQL and a simple HTML interface, and they get every bit of functionality they had plus tons more.  No need for a complex, external OLAP server -- good old SQL with analytic functions does it all (and more).  I myself find fewer moving parts = greater chance of success.  The more functionality we have in the database, the better off we are.

As for:

"Don't you think that adding new features into the language distract users from 
being able to leverage existing ones?"

I'd say -- NO WAY (very loudly).  Look at languages like Java with J2EE and J2SE and the base Java language and JDK 1.1, 1.2, 1.3, etc.  Should that language "stand still" or should it keep moving (I say move)?  Same with the database -- it's when you stand still that you become obsolete.  Analytic functions allow us to do things that were not feasible before -- things that required dumping the data OUT into some other "server", learning some other environment, programming some other environment.

Here, at least, all the developers have to learn is the database.  They don't have to learn the database plus its APIs, plus some OLAP tool with its APIs and nuances, plus some OTHER tool that does stuff differently, and so on.

Just my 2 cents worth.  I appreciate the functionality the analytic functions give us (so much so that I did an entire chapter on them in my book; I felt they were that important -- there are more benchmarks in there comparing performance)...



 

informative

reader, December 10, 2001 - 3:03 pm UTC

Where can I find more (docs, articles, and different uses) on analytic functions? (Is it specific to Oracle Database products?)

Tom Kyte
December 10, 2001 - 3:52 pm UTC

Well, my book has a good chapter on it.

The data warehousing guide does as well:
http://docs.oracle.com/docs/cd/A81042_01/DOC/server.816/a76994/analysis.htm#11237

afaik, they are Oracle'isms.

Are there limits on analytic functions?

jaromir nemec, December 10, 2001 - 5:58 pm UTC

The answer is very useful and complete.
A small extension - the analytic functions are also a pure SQL alternative to cursor loop processing.
I have a simple question to that topic.

A table stores transaction entries with transaction date (unique timestamp) and some other attributes, not interesting for this case.
Transactions belong to sessions following these rules:

· the first transaction in the session is marked
· all subsequent transactions (ordered by transaction date), up to the start of the next session, belong to the current session

Question: can I assign to all transactions of a session the transaction date of the first transaction of that session?
Something like FIRST_VALUE -- but the problem is that you have no session identification to partition on; you know only the start and end of the session.

With a cursor loop it is a simple if statement, is it possible to solve that with analytic function?

Thanks


Jaromir Nemec
www.db-nemec.com



Tom Kyte
December 10, 2001 - 7:41 pm UTC

You wouldn't use analytic functions for that -- no. You cannot update an "ordered" result set and analytic functions involve ordering.


scott@ORA817DEV.US.ORACLE.COM> update ( select ename,
2 first_value(sal) over ( partition by deptno order by sal ) new_sal,
3 sal
4 from emp )
5 set sal = new_sal
6 /
update ( select ename,
*
ERROR at line 1:
ORA-01732: data manipulation operation not legal on this view

for this tho, i would use a correlated subquery:

update t
set trans_date = ( select max(trans_date)
from t t2
where flag = 1
and t2.trans_date < t.trans_date )
where flag is null
/




once more analytic functions

Jaromir Nemec, December 11, 2001 - 5:11 am UTC

The update is not the problem in the transaction / session assignment.

The question is whether it is possible to perform this task "analytically", i.e. without a self-join and without a cursor loop. (A correlated subquery is a kind of join.)

The necessary feature would be

instead of partitioning by a column e.g.

.... OVER (PARTITION BY session_identification ORDER BY trans_date)
-- I don't know the session identification

partitioning by start and stop boundaries, something like that ....

.... OVER (PARTITION BY all records between start flag and the stop flag
ORDER BY trans_date)

Wouldn't that be a nice feature for a release or two down the road?


Thanks


Jaromir Nemec
www.db-nemec.com


Tom Kyte
December 11, 2001 - 8:20 am UTC

You cannot update ordered relations -- that is the limitation here. The analytic functions naturally impose ORDER, and we do not update "ordered" things.

Sorry for not saying it clearly!

Jaromir Nemec, December 15, 2001 - 8:04 pm UTC

I have a table ...

SQL> SELECT to_char(trans_date,'hh24:mi:ss') trans_date, start_flag 
FROM trans
ORDER BY trans_date;

TRANS_DA S
-------- -
12:00:00 X
12:01:00
12:01:30
12:02:00
12:05:00 X
12:06:00
12:07:30
12:09:00 X
12:09:30

9 rows selected.

The question is: can I do the query below using analytic functions only (omitting joins or cursor loops)?

SQL> SELECT to_char(a.trans_date,'hh24:mi:ss') trans_date, a.start_flag,
  2  to_char(b.trans_date,'hh24:mi:ss') AS session_start
  3  FROM trans a, trans b
  4  WHERE b.trans_date = 
  5   (SELECT MAX(trans_date)
  6    FROM trans c
  7    WHERE c.start_flag = 'X' AND c.trans_date <= a.trans_date)
  8  ORDER BY a.trans_date
  9  ;

TRANS_DA S SESSION_
-------- - --------
12:00:00 X 12:00:00
12:01:00   12:00:00
12:01:30   12:00:00
12:02:00   12:00:00
12:05:00 X 12:05:00
12:06:00   12:05:00
12:07:30   12:05:00
12:09:00 X 12:09:00
12:09:30   12:09:00

9 rows selected.

SQL> 


Thanks


Jaromir 

Tom Kyte
December 16, 2001 - 9:40 am UTC

No, because there is no predictable way to define the window. We cannot say "look backwards until you find the first non-null start flag". Our ranges are based on numeric ranges (all rows before me with a date within 5 days of my date, for example) or exact row counts (5 rows in front of me).
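For readers on later releases: a running aggregate with an "unbounded preceding" frame can carry the most recent start flag forward without a self-join, by taking MAX over a CASE expression. A sketch against Jaromir's sample data (SQLite through Python; the availability of such window frames on your release is the assumption here):

```python
# Sessionization without a self-join: a running MAX over a CASE expression
# carries the most recent flagged start time down to each following row.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table trans (trans_date text, start_flag text);
    insert into trans values
      ('12:00:00','X'), ('12:01:00',null), ('12:01:30',null), ('12:02:00',null),
      ('12:05:00','X'), ('12:06:00',null), ('12:07:30',null),
      ('12:09:00','X'), ('12:09:30',null);
""")

rows = conn.execute("""
    select trans_date, start_flag,
           max(case when start_flag = 'X' then trans_date end)
             over (order by trans_date
                   rows between unbounded preceding and current row) session_start
      from trans
     order by trans_date
""").fetchall()
print(rows)
```

The output matches the self-join query shown further down in the thread.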

Can you expand on the subtle problem you are talking about

A reader, December 17, 2001 - 10:21 am UTC

"(this is one of my favorite problems in SQL, dealing with top-n's, max's and such -- I use questions like this in interviews all of the time to see if people
can catch this subtle problem, most do not...) "

Tom, when you say subtle problem, do you mean how to find the top-n's?



Tom Kyte
December 17, 2001 - 11:47 am UTC

I mean that the question is ambiguous -- look at my answer above: three queries, all technically accurate, all returning the top-n, but they all return DIFFERENT answers....

Is there some difference in SQL Engine of 8.1.5 and 8.1.7 ?

Bhawna Rajesh, December 19, 2001 - 12:01 am UTC

Hi Tom,

Sorry for posting this query here, but I need your help urgently. I have an export dump of a Designer Repository which I took from an 8.1.5 (Win NT) database using the exp80 utility. Right now I am trying to import it into another server, 8.1.7 (AIX), and the import hangs. When I tried to track down the problem, I got to a package, 'cioue_element_type', which uses an IN predicate in one of its procedures. The package hangs indefinitely while compiling. I tried writing a similar query with an IN predicate and got ORA-00920: invalid relational operator.
Here is the query I wrote:
  1  select * from pam_app
  2* where (cod_pam, DES_VAL) in ('QTY_MAX_TLX_ADD', 32)
SQL> /
where (cod_pam, DES_VAL) in ('QTY_MAX_TLX_ADD', 32)
                            *
ERROR at line 2:
ORA-00920: invalid relational operator

And, here is the statement of the package giving problem:
  -- Remove Reserved Names entry and comment
  delete from sdd_reserved_names
   where (rn_res_name, rn_type) in ((upper(view_name), 'VIEW')
                                   ,(upper(pack_name), 'PACKAGE')
                                   ,(upper(pack_name), 'PACKAGE BODY')
                                   ,(upper(app_view_name), 'VIEW BODY'));

This is within procedure DROP_VIEWS_PACKS.

I replaced the IN predicate with suitable OR conditions, and it compiled. Probably the repository is not in a stable state -- I mean, most of the utilities of Designer 2000 are not working. I need to do a clean import of the Designer schema once more. Can you tell me if there is some setting in the parameter file which will allow these IN predicates?
Both Oracle instances have Compatible parameter as 8.1.0
Designer version is 2.1.2.

Thanks in advance.
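One note on the ORA-00920 from the hand-typed test query: with a multi-column IN, each element of the value list must itself be a parenthesized tuple -- in Oracle, where (cod_pam, des_val) in (('QTY_MAX_TLX_ADD', 32)) -- which is why the package's delete compiles while the test query does not. (The hanging import is a separate matter for support.) A runnable sketch of the predicate shape, using SQLite row values through Python:

```python
# Multi-column IN: the right-hand side is a list of row values (tuples),
# not a flat list of scalars.  SQLite's VALUES form stands in for Oracle's
# (('QTY_MAX_TLX_ADD', 32)) tuple-list syntax.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table pam_app (cod_pam text, des_val int);
    insert into pam_app values ('QTY_MAX_TLX_ADD', 32), ('OTHER', 1);
""")

rows = conn.execute("""
    select cod_pam, des_val
      from pam_app
     where (cod_pam, des_val) in (values ('QTY_MAX_TLX_ADD', 32))
""").fetchall()
print(rows)
```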
 

Question on sort

Kapil, September 08, 2002 - 8:59 am UTC

Hi Tom,
I have noticed that the time taken to sort a result set is directly proportional to the number of columns selected, even when the columns on which the sort is performed stay the same. It would have been understandable if the time increased because the number of columns being sorted on increased.
Could you please explain the reason for the above behaviour?


Tom Kyte
September 08, 2002 - 9:51 am UTC

the more columns, the wider the data. Think about it -- you have a set of data like:

A (4bytes) B (10 bytes) C (10 bytes)

each swap (when you sort, you MOVE lots of stuff to get them in order) moves 24 bytes.

Now, you sort


A (4bytes) B (10 bytes) C (10 bytes) D(100 bytes)

you are moving 124bytes. Wider data, longer time to swap places.

And I doubt "directly proportional" truly applies, as at some point you'll hit the sort_area_size threshold and start writing to disk, which will make it less than a linear function of time!

Analytic functions are very useful but.....

A Reader, September 08, 2002 - 12:38 pm UTC

Tom,
Ever since I read your book I have been advocating the use of analytic functions. However, whenever my queries use analytic functions, query time shoots up.
How do I debug this? How do I identify which Oracle parameter is causing poor performance?
Is there any way to estimate the memory required for partitioning and aggregating by analytic functions?

regards


Tom Kyte
September 08, 2002 - 4:33 pm UTC

Tell me, compare the performance of analytic functions to some process that produces the same exact result but doesn't use them.

Then tell me they are slow. In the book I demonstrated a pure "SQL" approach vs analytic functions. Not only are the analytics of course easier to code, they ran slightly (understatement) faster.

Do queries that include analytics run slower than a query without them? Do bears do their stuff in the woods? Of course.

There is no "Oracle parameter" (well, there is that undocumented _fast = true, but I'm not allowed to talk about that ;) causing poor performance. You would really need to give a test case with all of the parameters (I use this table, with this much data, with this distribution over the partition by field, I have these indexes, I use this query, and I get this performance -- the TKPROF report).

The memory required? We use sort_area_size, you let us use as much as you want and we use all that we can -- then we swap to disk.

Really stuck up here (Analytic Functions)

Ik, September 08, 2002 - 11:18 pm UTC

Tom,

First of all, Thanks a zillion for this wonderful site and all the information we get through this.

Sorry, I should have been clearer. On matrix reports (6i), when the number of records returned by the query is large (more than 4000 or so), the report time (time taken at the client end) goes up to 10 minutes. This is due to the presence of summary columns with product orders. These reports need to run in under a minute or else they throw the whole batch process off track.
So I went on to use analytic functions instead of the summary columns, but now the query itself takes nearly 75 seconds. At the reports end the time has been significantly reduced, but I want to cut down on the query time.
I believe query performance could be improved if more memory were available for grouping and sorting.
Can you please guide me on how analytic functions are processed internally and which Oracle parameters matter.

Thanks in advance.


Tom Kyte
September 09, 2002 - 8:12 am UTC

Let's see -- you take a process that took 10 minutes procedurally, get it to 75 seconds in SQL, and still aren't happy.....


So -- give it more memory and try. But then again, I HATE guessing, so you should really check in sqlplus whether you are sorting to disk or not. If you aren't, allowing us to use more memory won't do squat.

Anyway -- I suggest you enable sql_trace, get a trace file, run TKPROF, and analyze what you got. If you have my book, read chapter 10 -- that is what you need, not a search for the holy grail of init.ora parameters.

Thank You

Ik, September 09, 2002 - 9:05 pm UTC

Tom,
Thanks a lot. My problem is that since I am on a client site, I do not have access to TKPROF and other tuning tools. The only method is to keep changing the query and clocking the runtimes.

One last Question.

I have a requirement which can be handled using an analytic function.
I have two choices of implementation, whereby I handle them differently at the report end.
1. Get a COUNT over smaller partitions (sets of data), by having more selective columns in the PARTITION BY clause.
2. Do a COUNT over a larger record set, by having a single column in the PARTITION BY clause.

Which do you think would consume more memory and processing?

regards
ik


Tom Kyte
September 10, 2002 - 7:12 am UTC

You should tell your client that they could save TONS of money by giving you access to the requisite tools necessary to do your job.

You also should have access to AUTOTRACE in sqlplus:
http://asktom.oracle.com/~tkyte/article1/autotrace.html
requires no server access.


only answer to the question is "benchmark it and see"


Question on sort

Kapil, September 14, 2002 - 12:41 am UTC

Hi Tom,
Well, that is an acceptable explanation.
With regard to the sort on disk, I have confirmed through the trace file as well as session statistics that the sort is taking place completely in memory.
Thanks anyway.


analytical functions -- Oracle 9i

john, September 26, 2002 - 11:33 am UTC

tom,
where can I find the info on analytical functions for Oracle 9i?

thanks


Why isn't 4 rows more "correct" here ?

Robert Chin, September 26, 2002 - 2:41 pm UTC

Tom, can an analytic function, like the one you demonstrated, return the same result as the first, "old-fashioned" method?

DEPTNO SAL ENAME HIREDATE
---------- ---------- ---------- ---------
10 5000 KING 17-NOV-81
20 3000 SCOTT 09-DEC-82
20 3000 FORD 03-DEC-81
30 2850 BLAKE 01-MAY-81

Also, can you explain why this result is not *more* correct than the ones returning 3 rows (excluding either SCOTT or FORD) -- or is this just a subjective thing depending on the requirement?
Thanks

Tom Kyte
September 26, 2002 - 2:52 pm UTC

It is all subjective.

Yes, analytics can return the 4 row answer.


scott@ORA920.US.ORACLE.COM> l
1 select *
2 from (
3 select deptno,
4 sal,
5 ename,
6 hiredate,
7 rank() over (partition by deptno order by sal desc ) rnk
8 from emp
9 )
10* where rnk = 1
scott@ORA920.US.ORACLE.COM> /

DEPTNO SAL ENAME HIREDATE RNK
---------- ---------- ---------- --------- ----------
10 5000 KING 17-NOV-81 1
20 3000 SCOTT 19-APR-87 1
20 3000 FORD 03-DEC-81 1
30 2850 BLAKE 01-MAY-81 1

scott@ORA920.US.ORACLE.COM>

Can table size affect analytical functions

A reader, October 15, 2002 - 11:47 am UTC

Hi Tom
I have a query which uses first_value. When I run it against the original table, which is 1.2 gigabytes in size, the query aborts with error ORA-3113; but if I create a table with just the 28 records involved in the logic and run the query against that little table, I get the results.
Is it possible that the size of the table can affect the execution of the query?
Thanks in advance

Tom Kyte
October 15, 2002 - 12:15 pm UTC

3113 = contact support, open a tar, probably got a bug.

A reader, February 25, 2003 - 1:52 am UTC

Hi Tom,
In our DW we have a table which stores currency rates. It stores around a million records and uses an effective date to show the effectiveness of each record.
Table structure:
currencyid char(3)
exchrate number(21,4)
effectivedate date

What I need is to write a query which fetches the effective record as of a particular date. I have made a function, but the function makes the query very slow, as the query returns about 100,000 rows.
Is there a way to speed things up here? I can make a materialized view or a view, but not any procedure to create a table.
To sum up, what I need is the effective record as of a given day; since the day may be any date, we need to retrieve the effective rate from the DW table.
Can we make a view which gives the records from, say, 01-01-1999 till today, one row per day with its corresponding exchange rate?












Tom Kyte
February 25, 2003 - 9:40 am UTC

how slow is slow? how fast is fast?


anyway -- something like:


select *
from (
select a.*,
row_number() over (partition by currencyid order by effectivedate desc) rn
from t a
where effectivedate <= to_date( :your_date, 'mm-dd-yyyy' )
)
where rn = 1;


that'll get the record for the max effective date by currency id.
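To see that pattern run, a small sketch with invented rates data (SQLite through Python stands in for Oracle; the bind variable plays the role of :your_date):

```python
# row_number() per currency, newest effective date first, keep rn = 1:
# the most recent rate at or before the as-of date wins.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table rates (currencyid text, exchrate real, effectivedate text);
    insert into rates values
      ('USD', 1.00, '1999-01-01'), ('USD', 1.10, '1999-06-01'),
      ('EUR', 0.90, '1999-03-01'), ('EUR', 0.95, '1999-09-01');
""")

as_of = '1999-07-15'
rows = conn.execute("""
    select currencyid, exchrate, effectivedate
      from (select r.*,
                   row_number() over (partition by currencyid
                                      order by effectivedate desc) rn
              from rates r
             where effectivedate <= ?)
     where rn = 1
     order by currencyid
""", (as_of,)).fetchall()
print(rows)   # EUR's 0.95 rate is excluded: it only takes effect on 1999-09-01
```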


What is the best (most efficient) way to do that?

j., July 13, 2003 - 3:47 am UTC

hi tom,

I'm looking for a way to get an ordered result set divided into "buckets", each containing a certain (predefined) number of elements, and to have consecutive numbers assigned to the groups generated.

for example: each group should contain 2 entries

GeneratedGroupNumber SomeOrderedValue
1 -10
1 24.3
2 55
2 103
3 234.8

I haven't found any solution to achieve this so far (unfortunately NTILE doesn't fit). Can you please give me a hint?

Tom Kyte
July 13, 2003 - 9:43 am UTC

scott@ORA920.US.ORACLE.COM> select trunc((rownum-0.01)/2), sal
2 from ( select sal from emp order by sal )
3 /

TRUNC((ROWNUM-0.01)/2) SAL
---------------------- ----------
0 800
0 950
1 1100
1 1250
2 1250
2 1300
3 1500
3 1600
4 2450
4 2850
5 2975
5 3000
6 3000
6 5000

14 rows selected.


just use rownum and divide.
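The same rownum-and-divide idea in window-function form, for engines that lack ROWNUM (a sketch; SQLite through Python, with row_number() standing in for rownum and integer division by the bucket size doing the grouping):

```python
# Fixed-size buckets: number the ordered rows, subtract 1, integer-divide
# by the bucket size.  (ntile() is the wrong tool here: it fixes the number
# of groups, not the number of elements per group.)
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (sal real)")
conn.executemany("insert into emp values (?)",
                 [(s,) for s in (800, 950, 1100, 1250, 1250, 1300)])

rows = conn.execute("""
    select (row_number() over (order by sal) - 1) / 2 grp, sal
      from emp
     order by sal
""").fetchall()
print(rows)
```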

j., July 13, 2003 - 10:52 am UTC

Incredible: How could I miss this simple solution? No need for analytical functions at all. Thank you very much!

Query for you

Abhijit, July 18, 2003 - 9:35 am UTC

Hi Tom,
I have a query which is really puzzling me.
I have got three tables.
1) UNIT_INFO_TABLE
UNIT_NUMBER (PK) (VARCHAR2), STATUS, MFG_ID (FOREIGN KEY)
'100' 'AVAILABLE' null
'101' 'AVAILABLE' null
'102' 'AVAILABLE' 201
2) MFG_INFO TABLE
MFG_ID (PK), MFG_NAME (VARCHAR2)
200 VOLVO
201 FORD
3) UNIT_PRICES TABLE
ASSET_NUMBER, UNIT_NUMBER (FOREIGN KEY), DATE_IN_SERVICE, VALUE
1001 '100' 02/05/1997 $200
1001-1 '101' 03/04/1997 $350
1002 '102' 04/05/1998 $450

I wish to display the asset summary for each asset and the date_in_service
column is the cause of my trouble.
I have written the following query so far.

SELECT UI.UNIT_NUMBER, UI.STATUS, MI.MFG_NAME, SUM(UP.VALUE)
FROM UNIT_INFO UI,
MFG_INFO MI,
UNIT_PRICES UP
WHERE UP.UNIT_NUMBER = UI.UNIT_NUMBER (+)
AND UI.MFG_ID = MI.MFG_ID(+)
AND UNIT_NUMBER LIKE '10%'
GROUP BY UI.UNIT_NUMBER, UI.STATUS, MI.MFG_NAME


However, I have to add the date_in_service column to the select list, and it has to display the date_in_service of the main asset only. For example, a value without a "-" in it is a main asset. How should I go about it?

Thanks & Regards,
Abhijit



Tom Kyte
July 28, 2003 - 8:51 am UTC

Doesn't make sense -- what gets displayed for rows where the asset number contains a '-' then?

assigning consecutive numbers to groups

j., July 24, 2003 - 7:57 am UTC

hi tom,

I want to get CONSECUTIVE group numbers assigned (instead of the existing GENERATED numbers) to the elements of an ordered result set.

sample: each group contains at most two elements.

Value Generated GroupNo Wanted GroupNo
A 23 1
B 24 2
B 24 2
C 25 3
C 25 3
D 26 4

... but I have not found the "pattern" to solve this problem so far. can you please help?

Tom Kyte
July 24, 2003 - 8:51 am UTC


ops$tkyte@ORA920LAP> select distinct x, y, dense_rank() over ( order by y )
  2    from t;

X          Y DENSE_RANK()OVER(ORDERBYY)
- ---------- --------------------------
A         22                          1
B         23                          2
C         25                          3
D         26                          4

4 rows selected.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> create global temporary table temp
  2  ( x varchar2(1) primary key, z int )
  3  /

Table created.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> insert into temp
  2  select distinct x, dense_rank() over (order by y)
  3    from t;

4 rows created.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> update ( select t.z old_z, temp.z new_z
  2             from t, temp
  3                    where t.x = temp.x )
  4    set old_z = new_z;

6 rows updated.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> select * from t;

X          Y          Z
- ---------- ----------
A         22          1
B         23          2
B         23          2
C         25          3
C         25          3
D         26          4

6 rows selected.

 

My query has not yet been answered

Abhijit, July 28, 2003 - 7:39 am UTC

Hi Tom,
Why have you missed my query?

Regards,
Abhijit

Tom Kyte
July 28, 2003 - 8:51 am UTC

i don't look at every review/followup - especially when I'm on vacation which I was....

rewrite queries

PINGU_SAN, September 30, 2003 - 6:36 pm UTC

Hi

I have a query which uses a conventional correlated subquery (which is damn slow!). I tried to rewrite it into 2 other queries, and both of these new queries return a different number of rows. The first query takes so long (over 9 hours when I cancelled it!) that I never got the number of rows!

I am not sure, but it seems that I am falling into the category of "queries that do what we want but return a different number of rows".

Could you look and see if the logic differs in any sense...? Thank you!

ORIGINAL QUERY
-----------------
select count(*)
from contrato_detalle_lineas cont_det,
clientes cli,
contratos cont
where cont_det.estado_atis is null
and cont_det.COD_EMPRESA = cli.COD_EMPRESA
and cont_det.ID_CLIENTE = cli.ID_CLIENTE
and cont_det.cod_empresa = cont.COD_EMPRESA
and cont_det.ID_CLIENTE = cont.ID_CLIENTE
and cont_det.COD_CONTRATO = cont.COD_CONTRATO
and CONT.COD_SERVICIO = 'IMAIL'
and cont_det.cod_empresa = 'TSAI'
and cont_det.num_version_detalle = (select max(num_version_detalle)
from contrato_detalle_lineas e
where e.cod_empresa = cont_det.cod_empresa
and e.id_cliente = cont_det.id_cliente
and e.cod_contrato = cont_det.cod_contrato
and e.cod_contrato_detalle = cont_det.cod_contrato_detalle
and e.dom64_operacion_realizada != 'SIN'
and e.fch_registro is not null)
and cont_det.dom64_operacion_realizada != 'BAJ';



SECOND REWRITTEN QUERY
------------------------
select count(*)
from (select cli.id_fiscal,
cont_det.num_version_detalle,
cont_det.cod_empresa,
cont_det.id_cliente,
cont_det.cod_contrato,
cont_det.cod_contrato_detalle,
cont_det.dom64_operacion_realizada,
cont_det.fch_registro
from contrato_detalle_lineas cont_det,
clientes cli,
contratos cont
where cont_det.estado_atis is null
and cont_det.COD_EMPRESA = cli.COD_EMPRESA
and cont_det.ID_CLIENTE = cli.ID_CLIENTE
and cont_det.cod_empresa = cont.COD_EMPRESA
and cont_det.ID_CLIENTE = cont.ID_CLIENTE
and cont_det.COD_CONTRATO = cont.COD_CONTRATO
and cont.cod_servicio = 'IMAIL'
and cont_det.cod_empresa = 'TSAI'
and cont_det.dom64_operacion_realizada != 'BAJ') x,
(select y.cod_empresa, y.id_cliente, y.cod_contrato, y.cod_contrato_detalle,
max(y.num_version_detalle)
over (partition by y.cod_empresa, y.id_cliente,
y.cod_contrato, y.cod_contrato_detalle) maxversion
from contrato_detalle_lineas Y
where y.fch_registro is not null
and y.dom64_operacion_realizada != 'SIN') z
where x.num_version_detalle = z.maxversion
and x.cod_empresa = z.cod_empresa
and x.id_cliente = z.id_cliente
and x.cod_contrato = z.cod_contrato
and x.cod_contrato_detalle = z.cod_contrato_detalle;


THIRD REWRITTEN QUERY
-----------------------
select count(*)
from (select cli.id_fiscal,
num_version_detalle,
cont_det.cod_empresa,
cont_det.id_cliente,
cont_det.cod_contrato,
cont_det.cod_contrato_detalle,
cont_det.dom64_operacion_realizada,
cont_det.fch_registro,
max(case when fch_registro is not null and dom64_operacion_realizada != 'SIN'
then num_version_detalle
else null
end)
over (partition by x.cod_empresa, x.id_cliente,
x.cod_contrato, x.cod_contrato_detalle) maxversion
from contrato_detalle_lineas cont_det,
clientes cli,
contratos cont
where cont_det.estado_atis is null
and cont_det.COD_EMPRESA = cli.COD_EMPRESA
and cont_det.ID_CLIENTE = cli.ID_CLIENTE
and cont_det.cod_empresa = cont.COD_EMPRESA
and cont_det.ID_CLIENTE = cont.ID_CLIENTE
and cont_det.COD_CONTRATO = cont.COD_CONTRATO
and cont.cod_servicio = 'IMAIL'
and cont_det.cod_empresa = 'TSAI'
and cont_det.dom64_operacion_realizada != 'BAJ') x ) z
where num_version_detalle = maxversion;


FOURTH REWRITTEN QUERY
------------------------
select count(*)
from (select
x.*,
max(case when fch_registro is not null and dom64_operacion_realizada != 'SIN'
then x.num_version_detalle
else null
end)
over (partition by x.cod_empresa, x.id_cliente,
x.cod_contrato, x.cod_contrato_detalle) maxversion
from (select cli.id_fiscal,
num_version_detalle,
cont_det.cod_empresa,
cont_det.id_cliente,
cont_det.cod_contrato,
cont_det.cod_contrato_detalle,
cont_det.dom64_operacion_realizada,
cont_det.fch_registro
from contrato_detalle_lineas cont_det,
clientes cli,
contratos cont
where cont_det.estado_atis is null
and cont_det.COD_EMPRESA = cli.COD_EMPRESA
and cont_det.ID_CLIENTE = cli.ID_CLIENTE
and cont_det.cod_empresa = cont.COD_EMPRESA
and cont_det.ID_CLIENTE = cont.ID_CLIENTE
and cont_det.COD_CONTRATO = cont.COD_CONTRATO
and cont.cod_servicio = 'IMAIL'
and cont_det.cod_empresa = 'TSAI'
and cont_det.dom64_operacion_realizada != 'BAJ') x ) z
where num_version_detalle = maxversion;
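For what it's worth, the usual reason such rewrites disagree on row counts is ties: several detail rows can share the group maximum, and joining an un-aggregated inline view back multiplies rows. A schematic sketch of the two forms, using a hypothetical table t(grp, ver, flag) rather than the real schema:

```sql
-- correlated form: one count per qualifying row of t
select count(*)
  from t
 where t.ver = ( select max(ver)
                   from t e
                  where e.grp = t.grp
                    and e.flag = 'Y' );

-- join form: z must be aggregated (GROUP BY or DISTINCT) down to one row
-- per group; otherwise every row of a group carrying the max value joins
-- to every other such row and the count is inflated
select count(*)
  from t x,
       ( select grp, max(ver) maxver
           from t
          where flag = 'Y'
          group by grp ) z
 where x.grp = z.grp
   and x.ver = z.maxver;
```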



The third and the fourth, I suspect, are the same.


Also, in your first reply, I think only the query with analytic functions achieves what we are looking for, no? The first query only returned one row with the 3000 salary, which cannot be correct in the real world...!


about last part of question

A reader, September 30, 2003 - 6:49 pm UTC

sorry I should have said analytic function does not give correct solution!

How would you substr

A reader, October 16, 2003 - 11:46 pm UTC

Tom, in the query below, how would you substr if the query retrieves multiple varchar2 columns whose data lengths vary from row to row?

select deptno,
       to_number(substr( max_sal, 1, 10 )) sal,
       substr( max_sal, 20 ) ename,
       to_date( substr( max_sal, 11, 9 ) ) hiredate
  from (
select deptno, max( to_char( sal,'fm0000000009') || hiredate || ename ) max_sal
  from emp
 group by deptno
)
/


Tom Kyte
October 17, 2003 - 9:47 am UTC

you would encode them using a fixed width format


max( to_char(sal, 'fm000000009' ) ||
     nvl( to_char(hiredate,'yyyymmddhh24miss'), rpad( ' ', 14 ) ) ||
     nvl( rpad( ename, 10 ), rpad( ' ', 10 ) ) ) DATA


(the last column doesn't have to be padded out, obviously)...

now:

to_number( substr( data, 1, 9 ) ) SAL,
to_date( substr( data, 10, 14 ), 'yyyymmddhh24miss' ) hiredate,
substr( data, 24 ) ename


gets your data back (use rtrim when appropriate as well)
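Assembling those fragments into one complete statement (a sketch against the standard SCOTT.EMP demo table):

```sql
-- encode each row into a fixed-width string, take the MAX per group,
-- then decode the pieces back out at known offsets:
--   positions  1- 9: sal (9 zero-padded digits)
--   positions 10-23: hiredate as yyyymmddhh24miss (14 chars)
--   positions 24- : ename
select deptno,
       to_number( substr( data, 1, 9 ) )                     sal,
       to_date( substr( data, 10, 14 ), 'yyyymmddhh24miss' ) hiredate,
       rtrim( substr( data, 24 ) )                           ename
  from ( select deptno,
                max( to_char( sal, 'fm000000009' ) ||
                     nvl( to_char( hiredate, 'yyyymmddhh24miss' ),
                          rpad( ' ', 14 ) ) ||
                     nvl( rpad( ename, 10 ), rpad( ' ', 10 ) ) ) data
           from emp
          group by deptno );
```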



Maximum Records on a Day

Rohit Gulati, Ambala India, November 07, 2003 - 12:41 am UTC

Sir,
We have a table that contains data about workshop jobs opened. We want the day with the highest number of jobs opened, through a single SQL command.
Is it possible to get the two dates with the highest number of jobs opened using a single SQL command?

with kind regards,



Tom Kyte
November 07, 2003 - 8:57 am UTC

select *
from ( select trunc(date_column) theDay, count(*) theCnt
from table
group by trunc(date_column)
order by 2 desc )
where rownum <= 2
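If more than two days tie for the top count, ROWNUM will arbitrarily cut some of them off; RANK keeps all ties (a sketch, assuming a hypothetical jobs(open_date) table):

```sql
-- every day tied for the highest count survives the rnk = 1 filter
select theDay, theCnt
  from ( select trunc(open_date) theDay, count(*) theCnt,
                rank() over (order by count(*) desc) rnk
           from jobs
          group by trunc(open_date) )
 where rnk = 1;
```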

Nice

Michael, February 27, 2004 - 12:23 am UTC

Dear Tom,
Can the analytic functions *lag* or *lead* be used to replace a self join? Is it possible to use them for a self join like
emp e1, emp e2
where e1.mgr = e2.empno?
Do you have any other simple example of that?
Please do reply.
Bye!


Tom Kyte
February 27, 2004 - 7:28 am UTC

Yes and No.

Yes, lag and lead can be used to replace a self join.

No, lag and lead cannot be used to replace a join of EMP to itself to get the MGR's record attached to EMP record.

lag and lead let you look back/forward for a record. Here, you want every record with mgr=5 to be joined to a single record empno=5. not applicable.
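What lag/lead *can* replace is a self join that pairs each row with its neighbor in some ordering, for instance pairing each employee with the adjacent salaries in the department (a sketch against SCOTT.EMP):

```sql
-- prev_sal/next_sal come from the rows immediately before/after the
-- current row in (sal, empno) order within the same deptno -- the kind
-- of "adjacent row" self join lag/lead eliminates
select deptno, ename, sal,
       lag(sal)  over (partition by deptno order by sal, empno) prev_sal,
       lead(sal) over (partition by deptno order by sal, empno) next_sal
  from emp;
```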

Anyway to do an analytical function on another one?

She, March 02, 2004 - 4:12 pm UTC

I want to apply an analytic function on top of another one, i.e. a different partitioning clause on the sum of a sum. I don't think it's possible, but it's just a question. I can't write the inner query as an inline view because a third-party tool that I use doesn't allow inline views.

Ex.

select

sum(
    sum( UPF1.QUANTITY )
    over (partition by HAP.PERIOD_DATE, UPF1.ISSUER_ID, UPF1.ISSUE_ID,
                       decode(UPF1.optiontype, 'C','C','P','P','N'),
                       UPF1.gsp_product_key, UPF1.ENTITY_ID)
   )
over (partition by HAP.PERIOD_DATE, UPF1.ISSUER_ID, UPF1.ISSUE_ID,
      UPF1.ENTITY_ID)

from
UPF UPF1,
HOLDING_AUDIT_POSITION HAP

WHERE
( HAP.MASTER_POSITION_FACT_ID=UPF1.MASTER_POSITION_FACT_ID )
AND ( HAP.REPORTABLE_HOLDING_RULE_ID= 164)
AND HAP.issuer_id = 24178
and HAP.PERIOD_DATE = '5-FEB-2004'


Tom Kyte
March 02, 2004 - 7:15 pm UTC

confusing -- you are asking (apparently) to sum at a HIGHER level of detail, something you've already summed to a lower level of detail?

I'm not getting it. can you provide a concrete example to work from (with fewer columns and all - you should only need two partition keys to demonstrate your intent)?

Anyway to do an analytical function on another one?

She, March 04, 2004 - 2:22 pm UTC

I have a table with 4 columns with values like below:
Issr Region Product cost
------------------------
213 US PROD1 20
213 US PROD1 30
213 US PROD2 50
213 US PROD2 -200
213 US PROD3 -200

213 CAN PROD1 50
213 CAN PROD1 150
213 CAN PROD2 -350

213 UK PROD2 50
213 UK PROD3 120

I want the results to look like this
-------------------------------------
Issr Region Product cost Qty 1 Qty2
------------------------ -------- -----
213 US PROD1 20 1 0
213 US PROD1 30 1 0
213 US PROD2 50 0 0
213 US PROD2 -200 0 0
213 US PROD3 -200 1 0

213 CAN PROD1 -50 1 -50
213 CAN PROD1 150 1 150
213 CAN PROD2 -350 0 0

213 UK PROD2 50 1 50
213 UK PROD3 120 1 120

How I get Qty 1
----------------
If the sum(cost) for a region is negative and product in (1,2), then Qty1 = 0, else Qty1 = 1 (for each row)

select
Issr, region, product,
( case
when
(product = 1 or product = 2) and
sum(cost) over(partition by Issr, Region, PRODUCT) > 0
then
1
when
product = 3
then
1
else
0 end
)Qty
from table1

How I would like to get Qty2
----------------------------
I would like to multiply cost * sum(Qty1) for a region and, if negative, make Qty2 = 0, else Qty2 = cost


How i would like to achieve the above Qty 2
---------------------------------------------
select
Issr,
region,
product,
(case
when
sum(cost * qty1 ) over(partition by Issr,Region ) > 0 THEN
cost
else
0
end) Qty2
from table1

So we are trying to apply aggregation at 2 different levels.
The 3rd-party tool I use doesn't allow inline views, so I can't use the following option:

select
(case
when
sum(cost * Qty1) over(partition by Issr,Region ) > 0
then
cost
else
0
end) Qty2
from table1 ,
(
select Issr, region, product,
( case
when
(product = 1 or product = 2) and
sum(cost) over(partition by Issr, Region, PRODUCT) > 0
then 1
when product = 3
then 1
else 0
end
)Qty1
from table1
) view1
where view1.Issr = table1.Issr
and view1.Region = table1.Region
and view1.PRODUCT = table1.PRODUCT


Tom Kyte
March 04, 2004 - 2:54 pm UTC

i do not see a "qty" to sum anywhere, but you want to look into using CASE

OK

Siva, March 19, 2004 - 1:02 pm UTC

Dear Tom,
I created a function and it is as follows.
SQL>  create function test_func(arg in number,p_sqr out number,p_cube out number,p_sqrt out
  2   number) return number is
  3   begin
  4    p_sqr := arg*arg;
  5    p_cube := p_sqr*arg;
  6    p_sqrt := sqrt(arg);
  7    return arg;
  8   end;
  9  /

Function created.

SQL> declare
  2   x number;
  3   y number;
  4   z number;
  5  begin
  6   dbms_output.put_line('The square,cube and square root values of'||
  7                        test_func(4,x,y,z)||'are'||x||chr(9)||y||chr(9)||z);
  8  end;
  9  /
The square,cube and square root values of4are16    64    2                            

PL/SQL procedure successfully completed.

 But when I use bind variables, it comes out as

SQL> var x number
SQL> var y number
SQL> var z number
SQL> set autop on

SQL> select test_func(10,:x,:y,:z) from dual;
select test_func(10,:x,:y,:z) from dual
       *
ERROR at line 1:
ORA-06572: Function TEST_FUNC has out arguments
 How do I correct this? Can bind variables not be used for OUT parameters?
Please do reply.
Bye!
 

Tom Kyte
March 19, 2004 - 2:07 pm UTC

no, functions with out parameters cannot be called from SQL.


SQL> exec dbms_output.put_line( test_func(10,:x,:y,:z) );

will work dandy. 
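Equivalently, you can call it from an anonymous PL/SQL block, where the SQL*Plus bind variables are legal OUT arguments (a sketch, reusing the test_func and variables defined above):

```sql
-- PL/SQL, unlike SQL, may invoke a function that has OUT parameters
declare
   l_ret number;
begin
   l_ret := test_func( 10, :x, :y, :z );
   dbms_output.put_line( l_ret || ' ' || :x || ' ' || :y || ' ' || :z );
end;
/
```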

OK

Catherine, April 04, 2004 - 10:22 am UTC

Dear Tom,
How do I obtain the maximum value across each row of a table? For example, if I have
a table like

SQL> select * from t;
       a  b  c
       -------
       1  2  3
       4  5  6
I want to obtain a computed column Row_max as 

SQL> select * from t;
      a  b  c  Row_max
      ----------------
      1  2  3    3  <-- Max in the row
      4  5  6    6  <-- Max in the row

 I want to have a result set like this. Is it possible in SQL?
Could you please help with the query?
Please do reply.
Bye!
 

Tom Kyte
April 04, 2004 - 10:39 am UTC

select t.*, greatest( a,b,c) row_max from t;
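One caveat worth noting: GREATEST returns NULL as soon as any of its arguments is NULL. If the columns are nullable, substitute a floor value first (a sketch, assuming the values are non-negative):

```sql
-- nvl each nullable column so one NULL cannot blank out the whole row max
select t.*,
       greatest( nvl(a, 0), nvl(b, 0), nvl(c, 0) ) row_max
  from t;
```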

Just learning Analytical functions..

Muraalee, April 05, 2004 - 9:14 am UTC

select distinct first_value(sal) over(partition by deptno order by sal desc nulls last) max_sal,
first_value(empno) over(partition by deptno order by sal desc nulls last) max_empno,
first_value(deptno) over(partition by deptno order by sal desc nulls last) max_deptno,
first_value(ename) over(partition by deptno order by sal desc nulls last) max_name,
first_value(job) over(partition by deptno order by sal desc nulls last) max_job,
first_value(hiredate) over(partition by deptno order by sal desc nulls last) max_dat
from emp
order by max_deptno
/

SQL> @t2_max_analytical

   MAX_SAL  MAX_EMPNO MAX_DEPTNO MAX_NAME   MAX_JOB    MAX_DAT
---------- ---------- ---------- ---------- ---------- ---------
      5000       7839         10 KING       PRESIDENT  17-NOV-81
      3000       7788         20 FORD       ANALYST    12-DEC-80
      2850       7698         30 BLAKE      MANAGER    01-MAY-81

SQL> l
  1  select * from (
  2     select sal,
  3            empno,
  4            deptno,
  5            ename,
  6            job,
  7            hiredate,
  8            rank() over(partition by deptno order by sal desc nulls last) rk
  9     from emp
 10     order by deptno
 11  )
 12* where rk=1
SQL> 
       SAL      EMPNO     DEPTNO ENAME      JOB        HIREDATE          RK
---------- ---------- ---------- ---------- ---------- --------- ----------
      5000       7839         10 KING       PRESIDENT  17-NOV-81          1
      3000       7788         20 SCOTT      ANALYST    19-APR-87          1
      3000       7813         20 MAX GARD   ANALYST    12-DEC-80          1
      3000       7814         20 MAX SQE    ANALYST    12-APR-87          1
      3000       7812         20 MAX DUPE   ANALYST    12-DEC-81          1
      3000       7902         20 FORD       ANALYST    03-DEC-81          1
      2850       7698         30 BLAKE      MANAGER    01-MAY-81          1

7 rows selected.

>> Just observed: the first query returns a hiredate that is not FORD's, and the empno shown is not FORD's either (FORD's empno is 7902)
>> Maybe this is what someone up the page tried to highlight, but that also looked like his/her browser dropping the '1' in the '81' of the hiredate
 

Tom Kyte
April 05, 2004 - 9:58 am UTC

you are correct. I would have had to sort by sal, rowid or by sal, empno...

otherwise, ordering by sal alone returns one of the N records tied for the highest sal, at random.
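Adding a deterministic tiebreaker (here empno, which is unique) to every ORDER BY makes all the first_value columns agree on one row per partition (a sketch):

```sql
-- ties on sal are broken consistently by empno, so the sal, empno, ename
-- values all come from the same winning row
select distinct
       first_value(sal)   over (partition by deptno
                                order by sal desc, empno) max_sal,
       first_value(empno) over (partition by deptno
                                order by sal desc, empno) max_empno,
       first_value(ename) over (partition by deptno
                                order by sal desc, empno) max_name
  from emp;
```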



A reader, June 07, 2004 - 10:15 am UTC

create table dept( id number(2) primary key, dname varchar2(14), salary number(5));
insert into dept values ( 101, 'accounting', 1000 );
insert into dept values ( 102, 'accounting', 2000 );
insert into dept values ( 103, 'sales', 1010 );
insert into dept values ( 10, 'administration', 1500 );

I need in one select:
'accounting', 1000, 2000




Tom Kyte
June 07, 2004 - 10:29 am UTC

is there some "maximum number of records" for dname = :x ???

A reader, June 07, 2004 - 11:23 am UTC

It's just an example, but there could be up to 1000.

thank you.
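Assuming "1000, 2000" are meant to be the smallest and largest salaries for that department name (an assumption, as the spec is terse), a plain aggregate does it regardless of how many rows per name there are:

```sql
-- one row per dname with its salary extremes;
-- for dname = 'accounting' this yields 1000 and 2000 from the data above
select dname, min(salary) min_sal, max(salary) max_sal
  from dept
 group by dname;
```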

Joining after getting max value

dharma, July 16, 2004 - 8:46 pm UTC

Hi Tom,

I have to get the max value of a sequence for an id and then join the sequence to another table to get all the relevant data. It's something like this:
for i in (select id from bigquery) loop
run the query for i.id
end loop;

And I cannot join this query to the outer loop.

I have to pass the id and other values as parameters to get the max value. Can they be put inside the inline view and then joined?
Is there any other way to do it?
I preferred first_value to rank() because of simplicity.

SELECT cap.*, crem.received_date , crem.postmark_date
FROM
(SELECT DISTINCT
FIRST_VALUE(cap.aagy_code) OVER
(ORDER BY cap.crem_remit_sequence DESC) aagy_code,
FIRST_VALUE(cap.crem_aent_sys_id_payor) OVER
(ORDER BY cap.crem_remit_sequence DESC) crem_aent_sys_id_payor,
FIRST_VALUE(cap.crem_remit_sequence) OVER
(ORDER BY cap.crem_remit_sequence DESC) max_crem_remit_sequence,
FIRST_VALUE(cap.cfee_code) OVER
(ORDER BY cap.crem_remit_sequence DESC) cfee_code
FROM csh_applied_payments cap
WHERE cap.aagy_code='33'
AND cap.crem_aent_sys_id_payor=111
AND cap.cfee_code='K'
) cap,
csh_remittances crem
WHERE crem.aagy_code = cap.aagy_code
AND crem.aent_sys_id_payor = cap.crem_aent_sys_id_payor
AND crem.remit_sequence = cap.max_crem_remit_sequence

Thank You,
-dharma

Tom Kyte
July 16, 2004 - 11:03 pm UTC

<quote>
And I cannnot join this query to the outer loop.
</quote>

sorry -- stopped reading there, don't believe you -- why not?

Expected Answer!

dharma, July 19, 2004 - 2:50 pm UTC

Hi Tom,
That was an expected answer (as I try to follow the principles); that was why I had indicated that the query was big and complicated. This is one of those rare systems where there is no foreign key relationship between any of the tables.
The outer query returns more than one row for each id, and only a few of them can be matched to the query I already mentioned above. The outer query itself takes 13 min to run (it's a batch job).

I have included the query for your review. Could you please suggest how to incorporate the above query (prev question) and properly join with the outer query.

outer Query:
WITH
--Get all candidates from app_appn_reqs whose requirement type is an Exam
--ie. (eletp_code is not null) and whose status is PENDING (status='P')
--and who have their exam license type(eletp_code in app_reqs) in exm_lic_exam_cycles
--And order them by priority.
--Its ordered by priority as only rank= 1 will be retrieved because a candidate cannot
--take an exam of higher priority before clearing a lower priority.
priority_rnk AS
(SELECT app_reqs.aagy_code, app_reqs.aent_sys_id, app_reqs.priority,
app_reqs.eletp_code, app_reqs.alictp_code, app_reqs.qqlfmth_code,
RANK () OVER (PARTITION BY app_reqs.aagy_code, app_reqs.aent_sys_id,
app_reqs.qqlfmth_code, app_reqs.alictp_code
ORDER BY app_reqs.priority) rnk
FROM app_appn_reqs app_reqs,
(SELECT DISTINCT aagy_code, alictp_code, eletp_code
FROM exm_lic_exam_cycles) elec
WHERE app_reqs.status = 'P' --Status is "Pending"
AND app_reqs.eletp_code IS NOT NULL --Requirement type is EXAM
AND elec.aagy_code = app_reqs.aagy_code
AND elec.alictp_code = app_reqs.alictp_code
AND elec.eletp_code = app_reqs.eletp_code),
--Get all candidates from app_appn_reqs (Contains all the requirements for all candidates).
--App_lic_applications table has record for each license type a candidate has applied for.
--This query gets all records from app_appn_reqs where the license status(astatp_code) is
--Approved or Pending with a count of number of Pending records on priority =1.
aar AS
(SELECT aar.ROWID,aar.aagy_code, aar.sys_id, aar.aent_sys_id, aar.qqlfmth_code,
aar.qqlmtrq_seq_num, aar.alictp_code, aar.qrqm_code, aar.sub_type,
aar.priority, aar.status, aar.eletp_code,
aar.valid_for, aar.valid_for_unit_measure,
aar.qualification_date, aar.start_clock_code,
aar.aconstr_seq_constrained_by, aar.constraint_fail_action,
aar.complete_within_period, aar.complete_period_unit_measure,
aar.expiration_date, aar.apkg_code,
COUNT(DECODE(aar.priority,
1,
DECODE(aar.status,'P',1) --Requirement is "Pending"
)
) OVER
(PARTITION BY aar.aagy_code,
aar.aent_sys_id,
aar.qqlfmth_code,
aar.alictp_code
) pending_req_cnt
FROM app_lic_applications ala, app_appn_reqs aar
WHERE --Applicant Status Type "Approved to sit" - A, "Pending" - P
ala.astatp_code IN ('A','P')
AND ala.aagy_code = aar.aagy_code
AND ala.aent_sys_id = aar.aent_sys_id
AND ala.alictp_code = aar.alictp_code
AND ala.qqlfmth_code = aar.qqlfmth_code)
--Start Main Query
SELECT LAG (aar.aent_sys_id) OVER (PARTITION BY aar.aagy_code ORDER BY aar.aagy_code,
aar.aent_sys_id,
aar.qqlfmth_code,
aar.alictp_code,
aar.priority) prev_aent_sys_id,
LAG (aar.alictp_code) OVER (PARTITION BY aar.aagy_code, aar.aent_sys_id, aar.qqlfmth_code
ORDER BY aar.aagy_code,
aar.aent_sys_id,
aar.qqlfmth_code,
aar.alictp_code,
aar.priority) prev_alictp_code,
LAG (aar.priority) OVER (PARTITION BY aar.aagy_code, aar.aent_sys_id, aar.qqlfmth_code, aar.alictp_code
ORDER BY aar.aagy_code,
aar.aent_sys_id,
aar.qqlfmth_code,
aar.alictp_code,
aar.priority) prev_priority,
/* Columns from app_appn_reqs */
aar.ROWID, aar.aagy_code, aar.sys_id, aar.aent_sys_id, aar.qqlfmth_code,
aar.qqlmtrq_seq_num, aar.alictp_code, aar.qrqm_code,
aar.sub_type, aar.priority, aar.status, aar.eletp_code,
aar.valid_for, aar.valid_for_unit_measure,
aar.qualification_date, aar.start_clock_code,
aar.aconstr_seq_constrained_by, aar.constraint_fail_action,
aar.complete_within_period, aar.complete_period_unit_measure,
aar.expiration_date, aar.pending_req_cnt,aar.apkg_code,
--Noted Priority: The least priority for a requirement that is an EXAM which has status PENDING.
priority_rnk.priority noted_priority
FROM
--Get only rank =1 to pick only least priority with pending status.
(SELECT DISTINCT aagy_code, aent_sys_id, qqlfmth_code,
alictp_code, priority
FROM priority_rnk
WHERE priority_rnk.rnk = 1) priority_rnk,
--To get only aent_sys_ids who have fulfilled all priority=1 requirements.
(SELECT aar.*
FROM aar
WHERE pending_req_cnt = 0) aar
WHERE priority_rnk.aagy_code = p_aagy_code
--AND r.aent_sys_id=145143 mary
AND priority_rnk.aagy_code = aar.aagy_code
AND priority_rnk.aent_sys_id = aar.aent_sys_id
AND priority_rnk.qqlfmth_code = aar.qqlfmth_code
AND priority_rnk.alictp_code = aar.alictp_code
--To get all records in app_appn_reqs whose priority is less than the min priority
--with a eletp code and status=P
--OR
--records in app_appn_reqs which have the same min priority and the same eletp_code
AND ( aar.priority < priority_rnk.priority
OR ( aar.status = 'P' --Requirement Pending
AND priority_rnk.priority = aar.priority
--AND priority_rnk.eletp_code = aar.eletp_code
)
);

If any other information is missing, I'll provide it.

Thanks,
-dharma.

Tom Kyte
July 19, 2004 - 4:24 pm UTC

select *
from ( with q1 as (select * from emp),
q2 as (select * from dept)
select empno, ename, dname
from q1, q2
where q2.deptno = q1.deptno ) a
left outer join emp on (a.empno = emp.empno)
/


just keep on joining? if you are running the loop you say above, just join on "id" -- another inline view and join.

Thanks

dharma, July 19, 2004 - 5:21 pm UTC

Thanks, Tom, for the immediate response. I will join the subquery as an inline view. You forgot to answer my original question, though.

Can I have parameters inside the inline view? It will reduce the time of the query a lot.

SELECT e.sal,e.empno,d.*
FROM (SELECT DISTINCT
FIRST_VALUE(sal) OVER (PARTITION BY deptno ORDER BY sal) sal,
FIRST_VALUE(empno) OVER (PARTITION BY deptno ORDER BY sal) empno,
FIRST_VALUE(deptno) OVER (PARTITION BY deptno ORDER BY sal) deptno
FROM emp
WHERE deptno=10 --p_deptno -----> have this as a parameter
) e,
dept d
WHERE e.deptno = d.deptno

Thanks a lot for your invaluable time.
-dharma
ps: I am hoping to meet you at the nocoug :)

Tom Kyte
July 19, 2004 - 7:04 pm UTC

yes, you can bind in an inline view, it is very common.

scott@ORA9IR2> variable b number
scott@ORA9IR2> exec :b := 10

PL/SQL procedure successfully completed.

scott@ORA9IR2> @test
scott@ORA9IR2> SELECT e.sal,e.empno,d.*
2 FROM (SELECT DISTINCT
3 FIRST_VALUE(sal) OVER (PARTITION BY deptno ORDER BY sal) sal,
4 FIRST_VALUE(empno) OVER (PARTITION BY deptno ORDER BY sal) empno,
5 FIRST_VALUE(deptno) OVER (PARTITION BY deptno ORDER BY sal)
6 deptno
7 FROM emp
8 WHERE deptno=:b
9 ) e,
10 dept d
11 WHERE e.deptno = d.deptno
12 /

SAL EMPNO DEPTNO DNAME LOC
---------- ---------- ---------- -------------- -------------
1300 7934 10 ACCOUNTING NEW YORK


saw another 3113 in the thread and thought i would post it

Justin, July 30, 2004 - 12:32 am UTC

SQL> create or replace function loser
  2  return sys_refcursor
  3  as
  4  l_rc1   sys_refcursor;
  5
  6  begin
  7  if 'loser' = 'loser'
  8  then
  9  return l_rc1;
 10  else
 11  open l_rc1 for select '' from dual;
 12  return l_rc1;
 13
 14  end if;
 15  end;
 16  /

Function created.

SQL> select loser from dual;
select loser from dual
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

Returning the unopened ref cursor seems to die a painful death.

Although this seems to work.

Just FYI.

SQL> create or replace function loser
  2  return sys_refcursor
  3  as
  4  l_rc1   sys_refcursor;
  5
  6  begin
  7  if 'loser' = 'losr'
  8  then
  9  return l_rc1;
 10  else
 11  open l_rc1 for select '' from dual;
 12  return l_rc1;
 13
 14  end if;
 15  end;
 16  /

Function created.

SQL> select loser from dual;

LOSER
--------------------
CURSOR STATEMENT : 1

CURSOR STATEMENT : 1

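A defensive variant that sidesteps the crash by always OPENing the cursor before returning it, even when there is nothing to return (a sketch; the ORA-3113 above comes from selecting a never-opened ref cursor):

```sql
-- open an empty-but-valid cursor instead of returning a null one
create or replace function not_a_loser
return sys_refcursor
as
   l_rc1 sys_refcursor;
begin
   open l_rc1 for select null from dual where 1 = 0;
   return l_rc1;
end;
/
```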

Can I do this?

A reader, September 21, 2004 - 6:42 am UTC

Hi Tom,
Can I do this:
SELECT a.client,
a.product,
sum(amnt) over (partition by a.product, a.product_count||b.product_count)
from tablea a, (select client, product, product_count FROM tablea where
product = 'A') b
where a.client = b.client(+)
and a.product = b.product(+)
---
What I am expecting:
There are clients having lots of products with various counts. The sum is to be done based on product and product count, but if a client has product 'A' of a specific count, it should fall into a different group that includes product 'A' and its count.
Thanks

Tom Kyte
September 21, 2004 - 7:59 am UTC

did you set up a test with some data to see if this was even remotely close?

I don't think it is.

I'm having a hard time following the textual description -- and lots of metadata is missing (like what is the primary key of tablea -- it would be very important in reading and changing a query like this - and evaluating if it could be even remotely "correct")

using a.product, a.product_count||b.product_count is pretty dangerous. why ||???

No, that query doesn't look right and the specs are really "too vague" to come to a query.

Suggestion:

set up a table
place data in table, make sure to include every boundary value you can imagine



FIRST_VALUE ANALYTIC WITHIN A WINDOW

Grant in Calgary, October 28, 2004 - 5:30 pm UTC

Hi Tom,

Love your site as always.


I can see how you're using the first_value function to return a column value from the row with the greatest or least value in a different column within a partition, using the order by clause, in one of the above questions.

I'm trying to figure out how to do that when you are using a window in the analytical function where your order by clause is already being used to define the window.

It's probably easier to explain with an example.


create table mytest (
entity_id number(22) not null,
rec_month date not null,
rec_value number(22,10) not null );

insert into mytest values(1, to_date('2003-jan-01','yyyy-mon-dd'), 20);
insert into mytest values(1, to_date('2003-feb-01','yyyy-mon-dd'), 25);
insert into mytest values(1, to_date('2003-mar-01','yyyy-mon-dd'), 15);
insert into mytest values(1, to_date('2003-apr-01','yyyy-mon-dd'), 30);
insert into mytest values(1, to_date('2003-may-01','yyyy-mon-dd'), 45);
insert into mytest values(1, to_date('2003-jun-01','yyyy-mon-dd'), 24);
insert into mytest values(1, to_date('2003-jul-01','yyyy-mon-dd'), 22);
insert into mytest values(1, to_date('2003-aug-01','yyyy-mon-dd'), 26);
insert into mytest values(1, to_date('2003-sep-01','yyyy-mon-dd'), 28);
insert into mytest values(1, to_date('2003-oct-01','yyyy-mon-dd'), 32);
insert into mytest values(1, to_date('2003-nov-01','yyyy-mon-dd'), 36);
insert into mytest values(1, to_date('2003-dec-01','yyyy-mon-dd'), 21);
insert into mytest values(1, to_date('2004-jan-01','yyyy-mon-dd'), 14);
insert into mytest values(1, to_date('2004-feb-01','yyyy-mon-dd'), 23);
insert into mytest values(1, to_date('2004-mar-01','yyyy-mon-dd'), 29);
insert into mytest values(1, to_date('2004-apr-01','yyyy-mon-dd'), 17);
insert into mytest values(1, to_date('2004-may-01','yyyy-mon-dd'), 18);
insert into mytest values(1, to_date('2004-jun-01','yyyy-mon-dd'), 14);
insert into mytest values(1, to_date('2004-jul-01','yyyy-mon-dd'), 31);
insert into mytest values(1, to_date('2004-aug-01','yyyy-mon-dd'), 33);
insert into mytest values(1, to_date('2004-sep-01','yyyy-mon-dd'), 19);
insert into mytest values(1, to_date('2004-oct-01','yyyy-mon-dd'), 11);
insert into mytest values(1, to_date('2004-nov-01','yyyy-mon-dd'), 13);
insert into mytest values(1, to_date('2004-dec-01','yyyy-mon-dd'), 50);


select entity_id,
rec_month,
rec_value,
max(rec_value) over (partition by entity_id ORDER BY rec_month range interval '2' month preceding) max_value_over_last_3,
'?' max_val_over_last_3_date
from mytest;


ENTITY_ID REC_MONTH REC_VALUE MAX_VALUE_OVER_LAST_3 M
---------- --------- ---------- --------------------- -
1 01-JAN-03 20 20 ?
1 01-FEB-03 25 25 ?
1 01-MAR-03 15 25 ?
1 01-APR-03 30 30 ?
1 01-MAY-03 45 45 ?
1 01-JUN-03 24 45 ?
1 01-JUL-03 22 45 ?
1 01-AUG-03 26 26 ?
1 01-SEP-03 28 28 ?
1 01-OCT-03 32 32 ?
1 01-NOV-03 36 36 ?
1 01-DEC-03 21 36 ?
1 01-JAN-04 14 36 ?
1 01-FEB-04 23 23 ?
1 01-MAR-04 29 29 ?
1 01-APR-04 17 29 ?
1 01-MAY-04 18 29 ?
1 01-JUN-04 14 18 ?
1 01-JUL-04 31 31 ?
1 01-AUG-04 33 33 ?
1 01-SEP-04 19 33 ?
1 01-OCT-04 11 33 ?
1 01-NOV-04 13 19 ?
1 01-DEC-04 50 50 ?


The max_value_over_last_3 is the greatest rec_value over the last 2 months plus the current month, for each row.
What I also need is the rec_month that the greatest rec_value came from within the same window. So the results that I need would look like the following example.

ENTITY_ID REC_MONTH REC_VALUE MAX_VALUE_OVER_LAST_3 MAX_VAL_O
---------- --------- ---------- --------------------- ---------
1 01-JAN-03 20 20 01-JAN-03
1 01-FEB-03 25 25 01-FEB-03
1 01-MAR-03 15 25 01-FEB-03
1 01-APR-03 30 30 01-APR-03
1 01-MAY-03 45 45 01-MAY-03
1 01-JUN-03 24 45 01-MAY-03
1 01-JUL-03 22 45 01-MAY-03
1 01-AUG-03 26 26 01-AUG-03
1 01-SEP-03 28 28 01-SEP-03
1 01-OCT-03 32 32 01-OCT-03
1 01-NOV-03 36 36 01-NOV-03
1 01-DEC-03 21 36 01-NOV-03
1 01-JAN-04 14 36 01-NOV-03
1 01-FEB-04 23 23 01-FEB-04
1 01-MAR-04 29 29 01-MAR-04
1 01-APR-04 17 29 01-MAR-04
1 01-MAY-04 18 29 01-MAR-04
1 01-JUN-04 14 18 01-MAY-04
1 01-JUL-04 31 31 01-JUL-04
1 01-AUG-04 33 33 01-AUG-04
1 01-SEP-04 19 33 01-AUG-04
1 01-OCT-04 11 33 01-AUG-04
1 01-NOV-04 13 19 01-SEP-04
1 01-DEC-04 50 50 01-DEC-04

I had tried using
max(rec_month) keep (dense_rank first order by rec_value) over (partition by entity_id ORDER BY rec_month range interval '2' month preceding)
but it appears that you can't have more than one ORDER BY clause in an analytic function
(ORA-30487: ORDER BY not allowed here).

I have tried making 2 inline views, the same as the original query, and then joining between them to pull back the dates that match the max(rec_value), but this is too costly. There can also be multiple rec_value-like columns (rec_value_a, rec_value_b, rec_value_c, ...) that I need to find the max on, plus their related rec_months, within the 3-month window, which means I would have to create multiple copies of the inline view and have multiple joins, compounding the performance issues.

This is the best solution that I have been able to find so far.

Concatenate the formatted rec_value to the rec_month,
apply the analytic max function,
then parse the rec_value and rec_month back out.

select entity_id,
rec_month,
rec_value,
TO_NUMBER(SUBSTR(max(lpad(trim(to_char(rec_value, '999999')), 6, '0')||trim(to_char(rec_month, 'YYYYMMDD'))) over (partition by entity_id ORDER BY rec_month range interval '2' month preceding), 1,6)) max_value_over_last_3,
TO_DATE (SUBSTR(max(lpad(trim(to_char(rec_value, '999999')), 6, '0')||trim(to_char(rec_month, 'YYYYMMDD'))) over (partition by entity_id ORDER BY rec_month range interval '2' month preceding), 7), 'YYYYMMDD') max_val_over_last_3_date
from mytest;


Though this works, I figured there must be a simpler way to do this that would be easier on the eyes and more maintainable.

Thanks in advance Tom.

Tom Kyte
October 28, 2004 - 7:57 pm UTC

first thing I thought of is 


ops$tkyte@ORA9IR2> select entity_id, rec_month, rec_value,
  2         to_number(substr(data,1,23)) rec_value,
  3             to_date(substr(data,24),'yyyymmdd') rec_month
  4    from (
  5  select     entity_id,
  6          rec_month,
  7          rec_value,
  8          max(to_char( rec_value,'fm000000000000.0000000000' ) ||
  9                               to_char(rec_month,'yyyymmdd') ) over
 10          (partition by entity_id ORDER BY rec_month range
 11           interval '2' month preceding) data
 12  from       mytest );
 
 ENTITY_ID REC_MONTH  REC_VALUE  REC_VALUE REC_MONTH
---------- --------- ---------- ---------- ---------
         1 01-JAN-03         20         20 01-JAN-03
         1 01-FEB-03         25         25 01-FEB-03
         1 01-MAR-03         15         25 01-FEB-03
         1 01-APR-03         30         30 01-APR-03
 

Window definition

A reader, October 28, 2004 - 9:05 pm UTC

"range interval '2' month preceding"

I read the docs for analytic functions, but the explanation of window definitions gave me a headache. Can you please explain the above (and window definitions in general) more clearly?

Thanks

Tom Kyte
October 28, 2004 - 10:11 pm UTC

that is 9i using the INTERVAL datatype.

ops$tkyte@ORA9IR2> select sysdate, sysdate + interval '2' month from dual;
 
SYSDATE   SYSDATE+I
--------- ---------
28-OCT-04 28-DEC-04
 


not really "analytics", just INTERVALs 
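
For what it is worth, the windowing clause comes in two flavors: ROWS counts physical rows, while RANGE measures distance along the ORDER BY value (here, months via an INTERVAL). The following is an untested sketch against the mytest table from the question above, contrasting the two:

```sql
-- Sketch only (column names as in the mytest example above).
-- ROWS 2 PRECEDING: the current row plus the two physically preceding rows.
-- RANGE INTERVAL '2' MONTH PRECEDING: all rows whose rec_month falls
-- within the two months before the current row's rec_month.
select entity_id, rec_month, rec_value,
       max(rec_value) over (partition by entity_id
                            order by rec_month
                            rows 2 preceding)                   max_last_3_rows,
       max(rec_value) over (partition by entity_id
                            order by rec_month
                            range interval '2' month preceding) max_last_3_months
  from mytest;
```

With one row per month and no gaps the two windows return the same answer; they diverge as soon as months are missing, which is why the RANGE form is the safer match for "last 3 months".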

Correlated Subquery vs Analytical Functions

Prasad Jayakumar, February 25, 2005 - 7:28 am UTC

Hi Tom,

I have few questions regarding the

"What is the best way to get the 'other columns' also of rows returned when using aggregate functions"

Table - EMP

EMPNO DEPTNO SAL
1 10 10000
2 10 12000
3 20 7000

create table emp(empno char(4), deptno char(2), sal number(10,2));

insert into emp values(1,10,10000);
insert into emp values(2,10,12000);
insert into emp values(3,20,7000);


Required Result

EMPNO DEPTNO SAL DEPT_TOT_SAL COMP_TOT_SAL
1 10 10000 22000 29000
2 10 12000 22000 29000
3 20 7000 7000 29000


DEPT_TOT_SAL - Department Wise total salary
COMP_TOT_SAL - Total salary for the company as a whole

Possible solutions

1. select empno,deptno,sal,
sum(sal) over(partition by deptno) dept_tot_sal,
sum(sal) over() comp_tot_sal
from emp;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=3 Bytes=27)
1 0 WINDOW (SORT) (Cost=4 Card=3 Bytes=27)
2 1 TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=3 Bytes=27)


2. select empno, deptno, sal,
(select sum(sal) from emp where deptno = e.deptno) dept_tot_sal,
(select sum(sal) from emp) comp_tot_sal
from emp e;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=3 Bytes=27)
1 0 TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=3 Bytes=27)


My Questions are

1. Does this show that the correlated subquery is performing equivalently to the analytic function (for this query, of course)?

2. During the explanation of correlated subqueries, we were told that the inner query will be executed once for every row. But the explain plan does not reflect any such thing?

3. I was expecting at least two full table scans on EMP for the second query (the correlated one), but that didn't happen. Could you explain why?

Tom Kyte
February 25, 2005 - 5:20 pm UTC

1) no, it is running the scalar subquery as well -- older plans did not show this.

tkprof it and you'll see the extra work taking place.

2) not in 2005 -- in 1995 when it was all about the RBO that might have been true but the optimizer (CBO ) will rewrite exists as an in or join and cost those out as well.

And the explain plan for the correlated subquery would -- you are using what is known as a SCALAR subquery however. They are different beasts altogether....


3) sure it did -- tkprof it.

Scalar Subquery vs Analytical Functions

Prasad Jayakumar, February 26, 2005 - 10:35 pm UTC

Hi Tom,

Thanks for the follow up.

Sorry for the terminology mistake. The following is an excerpt from the Oracle official study guide, Introduction to 9i SQL:

"The Oracle Server performs a correlated subquery when the subquery references a column from a table in the parent query."

Maybe I misunderstood it. Thanks for the correction.

I am back with the same question

1. Does this show that the scalar subquery is performing equivalently to the analytic function (for this query, of course)?

select empno, deptno, sal,
(select sum(sal) from emp where deptno = e.deptno) dept_tot_sal,
(select sum(sal) from emp) comp_tot_sal
from emp e

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 11470 1.63 1.69 0 12897 60 172032
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 11472 1.63 1.69 0 12897 60 172032

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 149

Rows Row Source Operation
------- ---------------------------------------------------
172032 TABLE ACCESS FULL EMP

********************************************************************************

select empno,deptno,sal,
sum(sal) over(partition by deptno) dept_tot_sal,
sum(sal) over() comp_tot_sal
from emp

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 11470 1.97 6.11 1578 289 35 172032
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 11472 1.97 6.11 1578 289 35 172032

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 149

Rows Row Source Operation
------- ---------------------------------------------------
172032 WINDOW SORT
172032 TABLE ACCESS FULL EMP

********************************************************************************

Am I missing something?

With Regards
Prasad



Tom Kyte
February 27, 2005 - 8:30 am UTC

there are

a) scalar subqueries
b) inline views
c) (correlated|non-correlated) subqueries (where exists, where not exists, where in, where not in for example)


do you see the MASSIVE IO difference between the two? Looks like you posted them out of order here -- the analytics did all of the heavy lifting (physical IO) for you.

Top 10 items --

Reader, March 12, 2005 - 8:55 am UTC

I am trying to pick the best selling items from my sales table .

1. I do this by
SELECT a.*
FROM (SELECT DISTINCT item_id,SUM (qty * price) OVER (PARTITION BY inv_item_id) value FROM sales
WHERE invoice_dt > TRUNC (SYSDATE) - 3
ORDER BY 2 DESC) a
WHERE ROWNUM < 10

Is there a better way ?

2. Now, I want to pick up a corresponding column with each row. How can I use FIRST_VALUE together with SUM()?
I tried your first answer in this post but got confused.

3. Next, I want to find how these items (the top 10) performed in the last 5 days.

I can prepare a sample table if required .

Thanks for your help .



Tom Kyte
March 12, 2005 - 10:14 am UTC

seems a bit strange to me - not knowing your data model.

why sum(qty*price) by inv_item_id -- but distinct on item_id?

one would need a small example to work with and a little bit of explanation on the model.

Contd ..

Reader, March 12, 2005 - 12:10 pm UTC

Sorry about the typo while I was trying to simplify the SQL.

For simplicity ,let us take the orders table from the OE sample schema .

create table test as select * from orders;
alter table test add cust_loc varchar2(10);
update test set cust_loc='DETROIT'
where customer_id=147;
update test set cust_loc = 'MADISON'
where customer_id=149;
update test set cust_loc='CHICAGO'
where customer_id=101;
commit;

select a.* from
(select distinct customer_id,sum(order_total) over(partition by customer_id) qty
from test
order by 2 desc) a
where rownum < 6;
CUSTOMER_ID QTY
----------- ----------
149 403119.7
147 371278.2
150 282694.3
109 265255.6
108 213399.7
5 rows selected

1. If I now need to the customer_loc then how do I get that ?

2.I did the following but the results are totally different --Why ?

select a.* from
(select distinct customer_id,cust_loc,sum(order_total) over(partition by customer_id) qty
from test
order by 2 desc) a
where rownum < 6;
CUSTOMER_ID CUST_LOC QTY
----------- ---------- ----------
102 69211.4
103 20591.4
104 146605.5
105 61376.5
106 36199.5
5 rows selected

3. If I now wish to see, for each customer, all the orders placed in the first 30 days after that customer placed his first order?

THANKS AGAIN

Tom Kyte
March 12, 2005 - 2:17 pm UTC

let's take an example you have - I don't have the OE schema installed right now.

what typo?

if you have a table with multiple records -- why would you need "another column"

eg:

cust_id, cust_loc, order_total

well, CUST_LOC would never be in the orders table.

Let's use EMP as an example: you want to sum(sal) by deptno -- but why would you bring along any other attribute, like ENAME? ENAME cannot be brought along at the DEPT level after aggregation.

So, use EMP, pretend deptno is your item number and sal is your sales -- what attribute could possibly be brought along and still "make sense"?

contd...

reader, March 12, 2005 - 4:12 pm UTC

Is it possible to use first_value with any aggregate function in one sql ?

I am trying to use emp table to frame my three questions from the previous post .
However ,it would have been easier for me if you had the OE schema .

Thanks for your time .

Tom Kyte
March 12, 2005 - 4:48 pm UTC

You see -- I'm trying to point out what I believe to be a fundamental logic error here...

If you sum(price) by inventory_id -- hence there are MULTIPLE rows for each inventory_id -- AND you want some "other fields" -- that does not compute with me.


eg:

INV_ID LOC PRICE
1 a 100
1 b 200

Ok, what is the output here? (1,a,300) or (1,b,200)?????

You see, if you aggregate -- which "row" is to contribute this magic value?

(and it would be easier for me for you to provide a small sample, for you see -- it is I that gets the questions.....)

contd...

reader, March 16, 2005 - 12:00 am UTC

create table testsales (item_id number,description varchar2(20),sale_dt date ,sale_amount number(9,2));

insert into testsales values (1001,'NUTS','11-DEC-2004',50.25);
insert into testsales values (1001,'NUTS','12-DEC-2004',150);
insert into testsales values (1001,'NUTS','15-DEC-2004',250);
insert into testsales values (1001,'NUTS','21-DEC-2004',350);
insert into testsales values (1001,'NUTS','29-DEC-2004',250.25);
insert into testsales values (1001,'NUTS','11-JAN-2005',154);
insert into testsales values (2003,'BOLTS','12-NOV-2004',150.25);
insert into testsales values (2003,'BOLTS','13-DEC-2004',520.35);
insert into testsales values (2003,'BOLTS','11-DEC-2004',350.65);
insert into testsales values (2003,'BOLTS','17-DEC-2004',650.25);
insert into testsales values (7001,'WASHERS','11-JAN-2004',550);
insert into testsales values (7001,'WASHERS','17-DEC-2004',50);
insert into testsales values (7001,'WASHERS','13-FEB-2004',50);
insert into testsales values (7001,'WASHERS','14-MAR-2004',50);

NB: Ideally, description would be in a different table.


select * from (select distinct item_id,description,sum(sale_amount) over(partition by item_id ) tot
from testsales
order by tot desc ) a

ITEM_ID DESCRIPTION TOT
--------- -------------------- ----------
2003 BOLTS 1671.5
1001 NUTS 1204.5
7001 WASHERS 739.5

1. How can I use rank() with an aggregate function?
2. Can I use first_value in the above SQL to get the description?
3. How can I get all the invoices for an item for the first 30 days only, counting from the first sale_date?

ie
1001 NUTS 11-DEC-04 50.25
1001 NUTS 12-DEC-04 150
1001 NUTS 15-DEC-04 250
1001 NUTS 21-DEC-04 350
1001 NUTS 29-DEC-04 250.25
2003 BOLTS 12-NOV-04 150.25
2003 BOLTS 11-DEC-04 350.65
7001 WASHERS 11-JAN-04 550

Thanks in advance




Tom Kyte
March 16, 2005 - 8:17 am UTC

question is obviously "why is description in that table at all"?

If item_id -> description, it never belonged in that table; but since it is there, item_id determines description, so just aggregate by description as well.


select item_id, description, sum(sale_amount)
  from testsales
 group by item_id, description;


Now you can do whatever you want to that result set.


1) to apply an analytic to an analytic you use layers of inline views:

select analytic(c) over (...)
  from (select analytic(x) over (....) c
          from 
        )

2) not necessary -- item_id and description are what you will group by.

3) 

ops$tkyte@ORA9IR2> select item_id, description, sale_dt, sale_dt-min(sale_dt) over (partition by item_id) days
  2    from testsales
  3  /
 
   ITEM_ID DESCRIPTION          SALE_DT         DAYS
---------- -------------------- --------- ----------
      1001 NUTS                 11-DEC-04          0
      1001 NUTS                 12-DEC-04          1
      1001 NUTS                 15-DEC-04          4
      1001 NUTS                 21-DEC-04         10
      1001 NUTS                 29-DEC-04         18
      1001 NUTS                 11-JAN-05         31
      2003 BOLTS                12-NOV-04          0
      2003 BOLTS                13-DEC-04         31
      2003 BOLTS                11-DEC-04         29
      2003 BOLTS                17-DEC-04         35
      7001 WASHERS              11-JAN-04          0
      7001 WASHERS              17-DEC-04        341
      7001 WASHERS              13-FEB-04         33
      7001 WASHERS              14-MAR-04         63
 
14 rows selected.


You can use that query in an inline view, where on it (where days <= 30) and then aggregate. 
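
The step described above can be sketched as follows (an untested sketch against the testsales table; the outer layer also shows the layered-inline-view answer to the earlier rank() question):

```sql
-- Sketch only: filter on the computed "days" column in an inline view,
-- aggregate the surviving rows, then apply an analytic to the aggregate.
select item_id, description, tot,
       rank() over (order by tot desc) rnk   -- rank applied to the aggregated total
  from (
select item_id, description, sum(sale_amount) tot
  from (
select item_id, description, sale_dt, sale_amount,
       sale_dt - min(sale_dt) over (partition by item_id) days
  from testsales
       )
 where days <= 30                            -- keep only the first 30 days per item
 group by item_id, description
       );
```
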

EXCELLENT !

reader, March 17, 2005 - 6:42 am UTC


Deploying a parent attribute to all children in a hierarchy

Naresh, July 11, 2005 - 6:40 am UTC

Hi Tom,

I have a table with a hierarchical structure

Cust_ext (leg_cust_no, leg_parent_no, soc). Leg_cust_no is the customer identifier; leg_parent_no is the identifier for a parent customer, thus setting up a hierarchy.

SQL> select LEG_CUST_NO, LEG_PARENT_NO, soc from cust_ext;

LEG_CUS LEG_PAR SOC
------- ------- ---------------
106     102     SOC106
107     100
108     106
100             SOC100
101             SOC101
102     101     SOC102
103     102
104     101
105

9 rows selected.

The intention is to 'deploy' the soc to all child customers - i.e. I need output rows with a leg_cust_no and a soc, such that for each customer, there is a row for each soc that is against every 'ancestor' customer. 

I have a data set returned by a query such as

  1  select LEG_CUST_NO, /* OPEN_DATE, CUST_TYPE, CUST_SUB_TYPE, */ LEG_PARENT_NO, soc,
  2        decode (level, 1, soc, NULL) as first_soc, level dummy_lvl
  3  from cust_ext
  4* start with soc is not null connect by prior leg_cust_no = leg_parent_no
SQL> /

LEG_CUS LEG_PAR SOC             FIRST_SOC        DUMMY_LVL
------- ------- --------------- --------------- ----------
106     102     SOC106          SOC106                   1
108     106                                              2
100             SOC100          SOC100                   1
107     100                                              2
101             SOC101          SOC101                   1
102     101     SOC102                                   2
106     102     SOC106                                   3
108     106                                              4
103     102                                              3
104     101                                              2
102     101     SOC102          SOC102                   1
106     102     SOC106                                   2
108     106                                              3
103     102                                              2

14 rows selected.

(I put the decode in because I wanted the child soc values to be NULL - that way the output is easier to understand.)

The output of this query can be given to another query - if there is an analytic function that can copy the first_soc column to subsequent rows wherever it is NULL. I cannot use first_value above as there is nothing to partition the output set by. What I ultimately need is below (the soc column shows the data as it is in the table; first_soc is the result required).

LEG_CUS LEG_PAR SOC             FIRST_SOC        DUMMY_LVL
------- ------- --------------- --------------- ----------
106     102     SOC106          SOC106                   1
108     106                     SOC106                   2
100             SOC100          SOC100                   1
107     100                     SOC100                   2
101             SOC101          SOC101                   1
102     101     SOC102          SOC101                   2
106     102     SOC106          SOC101                   3
108     106                     SOC101                   4
103     102                     SOC101                   3
104     101                     SOC101                   2
102     101     SOC102          SOC102                   1
106     102     SOC106          SOC102                   2
108     106                     SOC102                   3
103     102                     SOC102                   2

Is there any way to use an analytic function to help achieve the above?

Thanks. 

Tom Kyte
July 11, 2005 - 11:21 am UTC

it is unfortunate you didn't give a create table and insert into, I cannot test anything...

but it would appear that if the top level has it, sys_connect_by_path can be used to carry that attribute down easily.

and in 10g, there is a function to get any attribute of the "root" node as well.

Deploying parent attribute to all children in hierarchy

naresh, July 11, 2005 - 11:54 am UTC

Hi Tom,

Sorry to miss it - here is the SQl for test data:

create table cust_ext (leg_cust_no varchar2(7), leg_parent_no varchar2(7), soc varchar2(10));

INSERT INTO CUST_EXT VALUES ( '100', NULL, 'SOC100');
INSERT INTO CUST_EXT VALUES ('101', NULL, 'SOC101');
INSERT INTO CUST_EXT VALUES ('102', '101', 'SOC102');
INSERT INTO CUST_EXT VALUES ('103', '102', NULL);
INSERT INTO CUST_EXT VALUES ( '104', '101', NULL);
INSERT INTO CUST_EXT VALUES ( '105', NULL, NULL);
INSERT INTO CUST_EXT VALUES ( '106', '102', 'SOC106');
INSERT INTO CUST_EXT VALUES ('107', '100', NULL);
INSERT INTO CUST_EXT VALUES ('108', '106', NULL);

commit;

thanks.

Tom Kyte
July 11, 2005 - 12:37 pm UTC

did you *even try* sys_connect_by_path??????  hint hint hint.


ops$tkyte@ORA9IR2> select LEG_CUST_NO, LEG_PARENT_NO, soc,
  2  decode (level, 1, soc, NULL) as first_soc, level dummy_lvl,
  3  sys_connect_by_path( soc, ',' ) scbp
  4  from cust_ext
  5  start with soc is not null connect by prior leg_cust_no = leg_parent_no
  6  /
 
LEG_CUS LEG_PAR SOC        FIRST_SOC   DUMMY_LVL SCBP
------- ------- ---------- ---------- ---------- -----------------------
102     101     SOC102     SOC102              1 ,SOC102
103     102                                    2 ,SOC102,
106     102     SOC106                         2 ,SOC102,SOC106
108     106                                    3 ,SOC102,SOC106,
106     102     SOC106     SOC106              1 ,SOC106
108     106                                    2 ,SOC106,
100             SOC100     SOC100              1 ,SOC100
107     100                                    2 ,SOC100,
101             SOC101     SOC101              1 ,SOC101
102     101     SOC102                         2 ,SOC101,SOC102
103     102                                    3 ,SOC101,SOC102,
106     102     SOC106                         3 ,SOC101,SOC102,SOC106
108     106                                    4 ,SOC101,SOC102,SOC106,
104     101                                    2 ,SOC101,
 
14 rows selected.


just substr off the last one (hint: instr can look "backwards" in a string too) 
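
Following that hint, one way to parse the wanted element back out of the path (an untested sketch; note that in the expected output above, first_soc is the root's soc, i.e. the first element of scbp):

```sql
-- Sketch only: extract the root element from the scbp path built above.
-- scbp looks like ',SOC101,SOC102,...', so the root element sits between
-- position 2 and the next comma; appending a ',' guards single-element paths.
select leg_cust_no, leg_parent_no, soc, lvl,
       substr(scbp, 2, instr(scbp || ',', ',', 2) - 2) first_soc
  from (
select leg_cust_no, leg_parent_no, soc, level lvl,
       sys_connect_by_path(soc, ',') scbp
  from cust_ext
 start with soc is not null
connect by prior leg_cust_no = leg_parent_no
       );
```
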

Are analytic functions available in SE of the database? (9iR2)

Oleg Oleander, August 06, 2005 - 9:10 pm UTC

Is it part of the SE's feature set or licensed separately?

Thank you

Tom Kyte
August 07, 2005 - 9:14 am UTC

It is a feature of SE, EE and PE in 9i and up.

An alternative solution

yas, August 08, 2005 - 7:12 am UTC

Connor McDonald has an alternative solution for the original question at http://www.oracledba.co.uk/tips/9i_first_last.htm

DENSE_RANK versus FIRST_VALUE

Eric D Peterson, August 30, 2005 - 5:40 pm UTC

This morning a DBA mentioned using FIRST_VALUE in place of DENSE_RANK. For this query we are trying to get the last amount due and its date for each ID from the bill table. Each ID has many entries. The PK is (id, id_seq_no). This is a 9i database.

I came up with the first query below; then, after learning about FIRST_VALUE, I came up with the second. Both have the same cost and run at about the same speed. I've not yet done a TKPROF of each; that is my next step.

I would like your opinion on which is "better". Personally, I like DENSE_RANK: I have a clear idea in my head of what it is doing and returning. FIRST_VALUE also seems to get what I want. Both have to work as a sub-query to ensure I only get one row per ID.

I'm leaning towards the opinion that both are fairly "equivalent" and it probably doesn't matter which I use in this case, although looking at the stats, the second shows fewer memory and disk sorts. hmmmm...

Thanks for your time.
Eric




SELECT
id,
total_due_amt,
bill_date
FROM
(
SELECT /*+ PARALLEL (bill) */
id,
total_due_amt,
bill_date,
DENSE_RANK () OVER ( PARTITION BY id
ORDER BY id_seq_no DESC ) AS dr
FROM
bill
WHERE
bill_status = 'CONFIRM'
)
WHERE
dr < 2
ORDER BY
id ASC;

---------------------------------------------------------------------------------------------
|Id | Operation | Name |Rows|Bytes|TempSpc|Cost | TQ |IN-OUT|PQ Distrib|
---------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 50M|2307M| |31190| | | |
|* 1| VIEW | | 50M|2307M| |31190|69,01| P->S |QC (ORDER)|
|* 2| WINDOW SORT PUSHED RANK | | 50M|1394M| 4258M|31190|69,01| PCWP | |
| 3| VIEW | | | | | |69,00| P->P |RANGE |
|* 4| WINDOW CHILD PUSHED RANK| | 50M|1394M| |31190|69,00| PCWP | |
|* 5| TABLE ACCESS FULL |BILL_MV1| 50M|1394M| | 7718|69,00| PCWP | |
---------------------------------------------------------------------------------------------
Statistics
---------------------------------------------------
25 recursive calls
287 db block gets
499395 consistent gets
787392 physical reads
920 redo size
56921299 bytes sent via SQL*Net to client
819500 bytes received via SQL*Net from client
117034 SQL*Net roundtrips to/from client
8 sorts (memory)
6 sorts (disk)
2340653 rows processed




SELECT
id,
total_due_amt,
bill_date
FROM
(
SELECT /*+ PARALLEL (bill) */
id,
total_due_amt,
bill_date,
id_seq_no,
FIRST_VALUE (id_seq_no) OVER ( PARTITION BY id
ORDER BY id_seq_no DESC ) AS fv
FROM
bill
WHERE
bill_status = 'CONFIRM'
)
WHERE
id_seq_no = fv
ORDER BY
id ASC;



--------------------------------------------------------------------------------------
| Id| Operation | Name |Rows|Bytes|TempSpc|Cost | TQ |IN-OUT|PQ Distrib|
--------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 50M|2452M| |31190| | | |
|* 1| VIEW | | 50M|2452M| |31190|71,01| P->S |QC (ORDER)|
| 2| WINDOW SORT | | 50M|1394M| 4258M|31190|71,01| PCWP | |
|* 3| TABLE ACCESS FULL |BILL_MV1| 50M|1394M| | 7718|71,00| P->P |RANGE |
--------------------------------------------------------------------------------------
Statistics
---------------------------------------------------
25 recursive calls
214 db block gets
499371 consistent gets
949515 physical reads
928 redo size
57857573 bytes sent via SQL*Net to client
819499 bytes received via SQL*Net from client
117034 SQL*Net roundtrips to/from client
5 sorts (memory)
4 sorts (disk)
2340653 rows processed


Tom Kyte
August 31, 2005 - 5:06 am UTC

PARTITION BY id ORDER BY id_seq_no DESC


as long as id_seq_no is UNIQUE in a given id, they are the same in this case.


I would prefer dense_rank, row_number or rank in this case for "semantic reasons", easy to read (for me)

new query

Tony, September 09, 2005 - 12:54 am UTC

Tom,

I have one query,

I would like to show the cumulative sum of sal, so I used an analytic function,

like select empno,ename,sal,sum(sal) over(order by empno) from emp;

It works fine, but when I use the same query in SAS it does not... I don't know why, but it has a problem with the OVER clause...

So I want to get the result without using any analytic function.

Is it possible, Tom? Can you please help out?

Thx

Tony


Tom Kyte
September 09, 2005 - 7:04 am UTC

What is the error being returned, and how does SAS fit into the picture? Is SAS executing the query, or are we querying SAS?
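
If the client tool truly cannot parse the OVER clause, a running total can also be written without analytics. A sketch using a scalar subquery (assuming empno is unique, as it is in SCOTT.EMP); note this form will generally be far slower than the analytic version:

```sql
-- Sketch only: cumulative sum of sal without analytic functions.
-- For each row, re-sum every row with an empno at or below the current one.
select e.empno, e.ename, e.sal,
       (select sum(e2.sal)
          from emp e2
         where e2.empno <= e.empno) running_sal
  from emp e
 order by e.empno;
```
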

Using SUM

Ajums T T, September 28, 2005 - 5:37 pm UTC

Hi Tom,
I wonder if it would be possible to get a count from two different tables and present the sum in a single row?

for example
select count(*) from table_a
returns 10
select count(*) from table_b
returns 25

I want to see the total as 35 in my output.

How do I go about doing that? any ideas?

Regards,
Ajums TT

Tom Kyte
September 28, 2005 - 5:43 pm UTC

select sum(cnt)
from (select count(*) cnt from a
union all
select count(*) cnt from b );
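
An equivalent form of the same idea, using scalar subqueries instead of UNION ALL (sketch; table names as in the question):

```sql
-- Sketch only: add the two counts directly via scalar subqueries.
select (select count(*) from table_a) +
       (select count(*) from table_b) total
  from dual;
```
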



very useful

Ajums TT, September 30, 2005 - 2:31 pm UTC

Thanks a lot Tom.

Choosing the "best" approach

Narendra, November 11, 2005 - 4:15 am UTC

Tom,

This has proved to be one of the most useful threads for me.
Can you please help me choose (along with good evidence and reasoning) the best approach from those below?
I have following 3 tables
SQL> desc transfers
 Name                Null?    Type
 ------------------- -------- ----------
 AT_ID               NOT NULL NUMBER(10) => Primary Key
 STUD_ID             NOT NULL NUMBER(10)
 TGH_ID                       NUMBER(10)

No. of records => 10000

SQL> desc choices
 Name                Null?    Type
 ------------------- -------- ----------
 CHOICE_ID           NOT NULL NUMBER(10) => PK
 AT_ID               NOT NULL NUMBER(10) => FK(Transfers.AT_ID)
 BASE_ID                      NUMBER(10)
 STATUS                       CHAR(1)

No. of records => 20000

SQL> desc alloc_hist
 Name                Null?    Type
 ------------------- -------- ----------
 ALLOC_ID            NOT NULL NUMBER(10) => PK
 CHOICE_ID           NOT NULL NUMBER(10) => FK(Choices.CHOICE_ID)
 OLD_STATUS                   CHAR(1)
 NEW_STATUS                   CHAR(1)
 LAST_UPDT                    DATE
 TGH_ID                       NUMBER(10)

No. of records => 40000

I want to find the number of records in the CHOICES table whose STATUS does not match the NEW_STATUS of the latest ALLOC_HIST record for that CHOICE_ID, or for which ALLOC_HIST has no record at all.
Below is the TKPROF output for 4 SQLs that I used, of which 2 just repeat the other 2 with a HINT added.
declare
ln_cnt NUMBER;
begin
select count(*) into ln_cnt from choices c, transfers t where c.at_id = t.at_id
and not exists ( select null from alloc_hist a where a.choice_id = c.choice_id
and a.new_status = c.status and a.last_updt = ( select max(last_updt)
from alloc_hist ah where ah.choice_id = a.choice_id ) );
dbms_output.put_line('Without Analytics ' || ln_cnt);
--
select count(*) into ln_cnt from choices c, transfers t,
( select distinct choice_id, first_value(new_status) over (partition by choice_id order by last_updt desc NULLS LAST) old_status
  from alloc_hist) a
where c.at_id = t.at_id
and c.choice_id = a.choice_id(+)
and (c.status <> a.old_status OR a.choice_id is null);
dbms_output.put_line('With Analytics ' || ln_cnt);
--
select /*+ USE_HASH(c t) */ count(*) into ln_cnt from choices c, transfers t,
( select distinct choice_id, first_value(new_status) over (partition by choice_id order by last_updt desc NULLS LAST) old_status
  from alloc_hist) a
where c.at_id = t.at_id
and c.choice_id = a.choice_id(+)
and (c.status <> a.old_status OR a.choice_id is null);
dbms_output.put_line('With Analytics(HASH Hint) ' || ln_cnt);
--
select /*+ USE_HASH(c t) */ count(*) into ln_cnt from choices c, transfers t where c.at_id = t.at_id
and not exists ( select null from alloc_hist a where a.choice_id = c.choice_id
and a.new_status = c.status and a.last_updt = ( select max(last_updt)
from alloc_hist ah where ah.choice_id = a.choice_id ) );
dbms_output.put_line('Without Analytics(HASH Hint) ' || ln_cnt);
--
end;

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.01       0.29          0          0          0           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.03       0.30          0          0          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 64  

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.00          0.00
********************************************************************************

SELECT count(*) from choices c, transfers t where c.at_id = t.at_id
and not exists ( select null from alloc_hist a where a.choice_id = c.choice_id
and a.new_status = c.status and a.last_updt = ( select max(last_updt)
from alloc_hist ah where ah.choice_id = a.choice_id ) )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.01          0          0          0           0
Fetch        1      0.23       1.20        360      19748          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.23       1.21        360      19748          0           1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 64     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE 
  19317   NESTED LOOPS  
  19317    HASH JOIN ANTI 
  20000     TABLE ACCESS FULL CHOICES 
  17305     VIEW  
  17305      HASH JOIN  
  17305       VIEW  
  17305        SORT GROUP BY 
  40000         TABLE ACCESS FULL ALLOC_HIST 
  40000       TABLE ACCESS FULL ALLOC_HIST 
  19317    INDEX UNIQUE SCAN SYS_C003758 (object id 32148)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file scattered read                         40        0.03          0.41
********************************************************************************

SELECT count(*) from choices c, transfers t,
( select distinct choice_id, first_value(new_status) over (partition by choice_id order by last_updt desc NULLS LAST) old_status
  from alloc_hist) a
where c.at_id = t.at_id
and c.choice_id = a.choice_id(+)
and (c.status <> a.old_status OR a.choice_id is null)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.01          0          0          0           0
Fetch        1      0.54       1.10        418      19564          5           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.54       1.11        418      19564          5           1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 64     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE 
  19317   NESTED LOOPS  
  19317    FILTER  
  20000     HASH JOIN OUTER 
  20000      TABLE ACCESS FULL CHOICES 
  17305      VIEW  
  17305       SORT UNIQUE 
  40000        WINDOW SORT 
  40000         TABLE ACCESS FULL ALLOC_HIST 
  19317    INDEX UNIQUE SCAN SYS_C003758 (object id 32148)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file scattered read                         20        0.03          0.16
  direct path write                               4        0.00          0.00
  direct path read                              116        0.00          0.02
********************************************************************************

SELECT /*+ USE_HASH(c t) */ count(*) from choices c, transfers t,
( select distinct choice_id, first_value(new_status) over (partition by choice_id order by last_updt desc NULLS LAST) old_status
  from alloc_hist) a
where c.at_id = t.at_id
and c.choice_id = a.choice_id(+)
and (c.status <> a.old_status OR a.choice_id is null)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.40       0.65        418        269          5           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.40       0.65        418        269          5           1

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 64     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE 
  19317   FILTER  
  20000    HASH JOIN OUTER 
  20000     HASH JOIN  
  10000      INDEX FAST FULL SCAN SYS_C003758 (object id 32148)
  20000      TABLE ACCESS FULL CHOICES 
  17305     VIEW  
  17305      SORT UNIQUE 
  40000       WINDOW SORT 
  40000        TABLE ACCESS FULL ALLOC_HIST 


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file scattered read                         20        0.04          0.17
  direct path write                               4        0.00          0.00
  direct path read                              116        0.01          0.04
********************************************************************************

SELECT /*+ USE_HASH(c t) */ count(*) from choices c, transfers t where c.at_id = t.at_id
and not exists ( select null from alloc_hist a where a.choice_id = c.choice_id
and a.new_status = c.status and a.last_updt = ( select max(last_updt)
from alloc_hist ah where ah.choice_id = a.choice_id ) )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.20       0.43        360        453          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.20       0.43        360        453          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 64     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE 
  19317   HASH JOIN  
  19317    HASH JOIN ANTI 
  20000     TABLE ACCESS FULL CHOICES 
  17305     VIEW  
  17305      HASH JOIN  
  17305       VIEW  
  17305        SORT GROUP BY 
  40000         TABLE ACCESS FULL ALLOC_HIST 
  40000       TABLE ACCESS FULL ALLOC_HIST 
  10000    INDEX FAST FULL SCAN SYS_C003758 (object id 32148)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file scattered read                         40        0.02          0.20
********************************************************************************

Also, why did the optimizer not choose the HASH JOIN approach, even though it resulted in far fewer LIOs?
I have analyzed all 3 tables. Oracle version is 9.2.0.1.0

p.s. I posted a similar issue in another thread but thought this one was more appropriate.

Thanks 

Can you suggest a better approach?

A reader, April 20, 2006 - 12:06 pm UTC

Tom,
Can you please suggest a better approach?

select version_control
from STUDENT_REGION
where version_control_dt||single_time > to_char(sysdate - .0105 ,'YYMMDDHH24MM')
and student_ration != 0
and travel_cost = 'MA'
order by version_control;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=12562 Card=233438
Bytes=7003140)

1 0 SORT (ORDER BY) (Cost=12562 Card=233438 Bytes=7003140)
2 1 TABLE ACCESS (FULL) OF 'STUDENT_REGION' (TABLE) (Co
st=10643 Card=233438 Bytes=7003140)


Tom Kyte
April 20, 2006 - 12:35 pm UTC

yes, fix your data model.

apparently you are not only using a character string to store a date and time (very very very bad) but you are doing it in two separate fields (very very very very bad).



Ami, May 29, 2006 - 9:44 am UTC

Hi Tom,

Is there a way to do this in one statement...

I need to get the cust_code that corresponds to the min cust_rank based on the max recency date. A cust_rank can have many cust_codes.

Thanks in advance for any advice.

Tom Kyte
May 30, 2006 - 8:23 am UTC

who knows, without a table and some explanation - who knows.

I have no clue what the problem is here.

sort of sounds like

select cust_code
from (select * from t order by cust_rank, recency_date DESC)
where rownum = 1;

that would get the "first" cust_code after sorting.

But that is probably wrong, given the problem is somewhat "not really defined too well"

FIRST_VALUE

Girish, September 20, 2006 - 12:11 am UTC

Hi Tom,

I have following records in table BANK

     ACCNO LDATE            BAL
---------- --------- ----------
        10 12-SEP-06       1000
        20 12-SEP-06       2000
        30 13-SEP-06       3000
        10 15-SEP-06       4000
        20 15-SEP-06       1000
I want current balance in each account and when I wrote following query

select accno,ldate,first_value(bal) over (partition by accno order by ldate desc) curr_bal from bank;
     ACCNO LDATE       CURR_BAL
---------- --------- ----------
        10 15-SEP-06       4000
        10 12-SEP-06       4000
        20 15-SEP-06       1000
        20 12-SEP-06       1000
        30 13-SEP-06       3000

Where as I want to see only

        10 15-SEP-06       4000
        20 15-SEP-06       1000
        30 13-SEP-06       3000

Why am I getting all rows? What is missing in the query?

Regards
Girish



Tom Kyte
September 20, 2006 - 2:56 pm UTC

what is missing? some sort of predicate or aggregate.

analytics do NOT aggregate (that is their purpose, to get aggregations over many rows without actually "aggregating" the rows - without removing rows!!!)

sounds like you want the first row after sorting by date desc...


select *
from ( select accno,ldate,
row_number() over (partition by accno order by ldate desc) rn
from bank)
where rn = 1;
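
An equivalent single-pass aggregate form of the same answer uses Oracle's FIRST/LAST ("KEEP DENSE_RANK") aggregates - a sketch against the BANK table above:

```sql
-- For each account, keep the balance of the row with the latest LDATE.
select accno,
       max(ldate) as ldate,
       max(bal) keep (dense_rank first order by ldate desc) as curr_bal
  from bank
 group by accno;
```

Unlike the analytic version, this actually aggregates - one row per ACCNO - so no outer query with a ROW_NUMBER filter is needed.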

New to Analytic Functions

Krishna, June 18, 2008 - 5:56 am UTC

Hi Tom,

I am very new to so called Analytic Functions and I just gone thru some of the posting here but, I am unable to undrstand the stuff. Can u pls, explain me with some very basic examples.

Thx in Advance
Krishna
Tom Kyte
June 18, 2008 - 1:37 pm UTC

"U" is very popular.
"U" is NEVER available.
I do not think "U" actually does database work, in fact, he is deceased http://en.wikipedia.org/wiki/U_of_Goryeo

what do you think if I answer instead?

pls - German postal codes, this always confuses me. I have yet to make that connection. And thx, it sounds really good in the movies, but I cannot connect it to a database question....


But anyway, do you have access to Expert One on One Oracle - my first book? Big chapter on nothing but analytics.

short of that, the data warehousing guide:

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14223/analysis.htm#i1007779

Reader, June 26, 2008 - 6:12 pm UTC

create table tst
(td date
,qty number);


insert into tst values (to_date('31-MAY-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('30-MAY-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('29-MAY-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('28-MAY-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('01-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('01-JUN-2008','DD-MON-YYYY'),10);
insert into tst values (to_date('01-JUN-2008','DD-MON-YYYY'),30);
insert into tst values (to_date('02-JUN-2008','DD-MON-YYYY'),30);
insert into tst values (to_date('03-JUN-2008','DD-MON-YYYY'),40);
insert into tst values (to_date('04-JUN-2008','DD-MON-YYYY'),50);
insert into tst values (to_date('05-JUN-2008','DD-MON-YYYY'),60);
insert into tst values (to_date('06-JUN-2008','DD-MON-YYYY'),70);
insert into tst values (to_date('07-JUN-2008','DD-MON-YYYY'),80);
insert into tst values (to_date('08-JUN-2008','DD-MON-YYYY'),90);
insert into tst values (to_date('09-JUN-2008','DD-MON-YYYY'),100);
insert into tst values (to_date('10-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('11-JUN-2008','DD-MON-YYYY'),40);
insert into tst values (to_date('11-JUN-2008','DD-MON-YYYY'),50);
insert into tst values (to_date('12-JUN-2008','DD-MON-YYYY'),80);
insert into tst values (to_date('13-JUN-2008','DD-MON-YYYY'),90);
insert into tst values (to_date('14-JUN-2008','DD-MON-YYYY'),100);
insert into tst values (to_date('14-JUN-2008','DD-MON-YYYY'),200);
insert into tst values (to_date('15-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('15-JUN-2008','DD-MON-YYYY'),10);
insert into tst values (to_date('16-JUN-2008','DD-MON-YYYY'),40);
insert into tst values (to_date('16-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('16-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('16-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('16-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('17-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('17-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('18-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('18-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('19-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('20-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('20-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('21-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('22-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('23-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('24-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('24-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('25-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('25-JUN-2008','DD-MON-YYYY'),20);
insert into tst values (to_date('25-JUN-2008','DD-MON-YYYY'),20);


select td,sum(qty)
from tst
group by td
order by td desc

I need to get the sum of qty for each day. Whenever I query, I should get the most recent 20 days' records as of the date I pass.

For example
if I run the query today and pass the date as 06/25/2008, I should get records from 06/06/2008 - 06/25/2008
if I query on 06/25/2008 and pass the date as 06/24/2008, I should get records from 06/05/2008 - 06/24/2008

Can you please let me know how to do this?
Tom Kyte
June 26, 2008 - 6:56 pm UTC

is it april 1st?


ops$tkyte%ORA9IR2> variable x varchar2(30)
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> exec :x := '6/25/2008'

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select to_date(:x,'mm/dd/yyyy')-19, to_date(:x,'mm/dd/yyyy')
  2    from dual
  3  /

TO_DATE(: TO_DATE(:
--------- ---------
06-JUN-08 25-JUN-08

ops$tkyte%ORA9IR2> select td,sum(qty)
  2  from tst
  3  WHERE TD BETWEEN TO_DATE(:X,'MM/DD/YYYY')-19 AND TO_DATE(:X,'MM/DD/YYYY')
  4  group by td
  5  order by td desc
  6  /

TD          SUM(QTY)
--------- ----------
25-JUN-08        240
...
06-JUN-08        280

20 rows selected.

Reader, June 27, 2008 - 9:32 am UTC

create table orders
(order_dt date
,quantity number);

insert into orders values (to_date('06/02/2008','mm/dd/yyyy'),2000);
insert into orders values (to_date('05/30/2008','mm/dd/yyyy'),1000);
insert into orders values (to_date('05/30/2008','mm/dd/yyyy'),4000);
insert into orders values (to_date('05/29/2008','mm/dd/yyyy'),100);
insert into orders values (to_date('05/28/2008','mm/dd/yyyy'),2000);
insert into orders values (to_date('05/27/2008','mm/dd/yyyy'),4000);
insert into orders values (to_date('05/23/2008','mm/dd/yyyy'),5000);
insert into orders values (to_date('05/22/2008','mm/dd/yyyy'),6000);
insert into orders values (to_date('05/22/2008','mm/dd/yyyy'),8000);
insert into orders values (to_date('05/21/2008','mm/dd/yyyy'),100);
insert into orders values (to_date('05/20/2008','mm/dd/yyyy'),200);
insert into orders values (to_date('05/19/2008','mm/dd/yyyy'),3000);
insert into orders values (to_date('05/16/2008','mm/dd/yyyy'),5000);

commit;

select * from orders;

I need to get the sum(quantity) for the last 5 working days, including the date passed, and then the average over those 5 days.

If I query on 06/02/2008, I need to get sum and then avg for 06/02,05/30,05/29,05/28,05/27

If I query on 05/30/2008, I need to get sum and then avg for 05/30,05/29,05/28,05/27,05/23

If I query on 05/29/2008, I need to get sum and then avg for 05/29,05/28,05/27,05/23,05/22

This I have to do for the whole year

If I do to_Date('05/30/2008','mm/dd/yyyy')-5, I get 05/25/2008, which is not correct according to the data that I have.

I need to create a view which will have three columns order_Dt, total_qty_5d, average_5d.

Order_Dt    total_qty_5d  average_5d
06/02/2008         13100        2620
05/30/2008         16100        3220


Can you please tell me how to write the query for this?

Tom Kyte
June 27, 2008 - 9:45 am UTC

you'll need to figure out a where clause that gives you the last five business days - search for business days on this site, we've been over that a million times.


You have to define what you mean by business days
you have to then include your definition of holidays
which means you'll have yet another table of holidays (and you'll maintain that)

which all means you'll need to sit down and really design this - it is non-trivial.


this is utterly different from

...
I need to get the sum of qty for each day. Whenever I query, I should get the 20 records as of that
date.

For example
if I run the query today and pass the date as 06/25/2008, I should get records from 06/06/2008 -
06/25/2008
if I query on 06/25/2008 and pass the date as 06/24/2008, I should get records from 06/05/2008 -
06/24/2008

Can you please let me know how to do this?
.....

Reader, June 27, 2008 - 9:56 am UTC

I have a holiday table; I forgot to mention this:

  1  select to_Char(holiday_dt,'mm/dd/yyyy') holiday from holiday
  2* where to_Char(holiday_dt,'yyyy') = '2008'
SQL> /

HOLIDAY
----------
01/01/2008
01/21/2008
02/18/2008
03/21/2008
05/26/2008
07/04/2008
09/01/2008
11/27/2008
12/25/2008

9 rows selected.

Tom Kyte
June 27, 2008 - 10:02 am UTC

Did you do what I requested?


.... search for business days on this site, we've been over that a million times. ....


you'll need to figure out what the date is N business days ago - then you have the date range you need. That is all. I would presume that since you already have a holidays table, you already know how to do this. If not, do what I suggested: search.
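
For what it is worth, that lookup can be sketched in one query - a minimal sketch, assuming the HOLIDAY table shown below, a Monday-Friday work week, and an illustrative 30-day look-back (widen it if holidays could push the range further):

```sql
-- Hypothetical sketch: the date 5 business days back, counting :dt itself.
select min(day) as start_dt
  from (select day
          from (select to_date(:dt,'mm/dd/yyyy') - level + 1 as day
                  from dual
                connect by level <= 30)
         where to_char(day,'DY','NLS_DATE_LANGUAGE=ENGLISH') not in ('SAT','SUN')
           and day not in (select holiday_dt from holiday)
         order by day desc)
 where rownum <= 5;
```

The inner queries generate the last 30 calendar days, discard weekends and holidays, and keep the 5 most recent business days; the MIN of those is the lower bound for a BETWEEN predicate.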

Reader, June 27, 2008 - 4:51 pm UTC

I looked for last business day.

I do not store data for holidays and weekends in the orders table.
This is what I did to get the sum for 5 days. Please let me know if it is fine.

select O.ORDER_DT,
sum(O.qty) over( order by O.ORDER_DT desc rows between current row and 4 following ) sum_5d
from (select order_dt,sum(QUANTITY) qty
from orders
group by order_Dt
order by order_dt desc) o
order by 1 desc

Tom Kyte
June 27, 2008 - 5:20 pm UTC

you would

a) run a query to figure out what day was 5 business days before your current day, your input day

b) you would use the output of that to query with

I don't see how your query would end up with "the last five business days", can you explain your logic?

Unless your data NEVER has any day within the last N days unless it is a business day (unusual - many times, there will be data on dates that are NON-business days)

Reader, July 01, 2008 - 8:31 am UTC

Tom,
I created a view for the below query and this will be run using business objects Mon-Fri at 7:30 pm.

select O.ORDER_DT,
sum(O.qty) over( order by O.ORDER_DT desc rows between current row and 4 following ) sum_5d
from (select order_dt,sum(QUANTITY) qty
from orders
group by order_Dt
order by order_dt desc) o
order by 1 desc


The last five business days are today plus the past four days of data in the table. I have data in the table only for business days.
There will be no data for weekends and holidays, and there will be data for every business day. Hence I used this logic.

If I run this on Monday, I should get data for Monday,last friday, thursday, Wednesday, Tuesday. If Tuesday is a holiday (I do not store this data), I should get Monday's data.

Please let me know if I am missing something.

Reader, August 18, 2008 - 12:19 pm UTC

Tom,
I have a table with 50 million records which is daily partitioned
by dt_key (YYYYMMDD format).
I need to calculate 20 day averages.
I do not store data for holidays and weekends.
I am using the below query to get the 20d average
for each business day, and this takes about 4 mins 30 secs to run.
Each day 200,000 records are loaded.
Not sure if this would cause any issue later on as the number of records increases.


Each day, I will get 60,000 records when I run this query.
Initially, I thought of creating a table. Not sure if this is a good idea, since in the future
I have to calculate 90d avg and 5d avg. Sometimes these records have to be combined with a table with 15 million records to get the volume.

Can you please let me know if there is any better approach than what I am doing?

Also, please let me know if I am missing something in the above explanation

select *
from
(select a.dt_key,
a.sym,
sum(a.volume) over (partition by a.sym order by a.trade_dt_key DESC ROWS BETWEEN CURRENT ROW AND 19 FOLLOWING) as TOTAL_VOLUME,
sum(a.volume) over (partition by a.sym order by a.trade_dt_key DESC ROWS BETWEEN CURRENT ROW AND 19 FOLLOWING)/20 as AVG_DAILY_VOLUME
from tab a
WHERE code = 'US'
)



Tom Kyte
August 20, 2008 - 10:05 am UTC

as the data volume goes up, that query will take longer and longer to run as it processes every single row in the table every single time. more data, longer response time for that one.

and your number of rows will go up, it won't be 60,000 in the future - you are adding data.

as select count(*) from tab where code = 'US' increases, so will the number of rows your query returns.
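
If only a given day's 20-day figures are needed, one hedged workaround is to bound the scan by date before applying the window, so the query stops touching every historical partition. A sketch, in which :the_day and :lower_bound are illustrative binds (:lower_bound must be wide enough in calendar terms to cover 19 prior trading days, e.g. roughly 40 days back):

```sql
select dt_key, sym, total_volume, total_volume/20 as avg_daily_volume
  from (select a.dt_key, a.sym,
               sum(a.volume) over (partition by a.sym
                                   order by a.dt_key desc
                                   rows between current row and 19 following) as total_volume
          from tab a
         where a.code = 'US'
           and a.dt_key >= :lower_bound)   -- prune partitions by date
 where dt_key = :the_day;
```

With daily partitioning on dt_key, the date predicate lets the optimizer prune to the last ~40 partitions instead of processing the whole table each run.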

SQL Aggregates

Krishna, September 24, 2008 - 1:49 pm UTC

Hi Tom,
i'm a Tester and i have a requirement like..
i have the table which has

Year_Quarter | Total |
2008Q1 | 5 |

i want my SQL to give the result LIKE

Year_Quarter | Total | Net_Total |
2008Q1 | 5 | 5 |
2008Q2 | 7 | 12 |
2008Q3 | 4 | 16 |

How can it be achieved? Please help with this.

Cheers!!
Krishna

Tom Kyte
September 24, 2008 - 8:09 pm UTC

I hope this was a joke. Think about this please....

How would one go from the first query with one record and know to MAKE UP the next two rows???

if you want a running total, that is trivial - you have a query that returns year_quarter and total - you would

select year_quarter, total, sum(total) over (order by year_quarter) net_total
  from .....


but that presumes rows 2 and 3 exist in your query as it is now..


Aggregate SQLs

Krishna, September 25, 2008 - 6:17 am UTC

Thanks a ton..

This is what i look for..

Cheers!
Krishna

MAX function ....

Srini, December 30, 2008 - 8:08 pm UTC

Hi Tom,

I've an ITEM table that contains a list of items ordered (the parent row has ITEMPARENTID = 0). Here is the SQL for that:

CREATE TABLE ITEM
(REGION VARCHAR2(15),
ITEMID NUMBER(4),
ITEMPRDID VARCHAR2(25),
SHIPDATE DATE,
ITEMPARENTID NUMBER(4)
);

INSERT INTO ITEM VALUES ('ID','1','PRD001',TRUNC(SYSDATE),'0');
INSERT INTO ITEM VALUES ('ID','2','PRD002',TRUNC(SYSDATE),'1');
INSERT INTO ITEM VALUES ('ID','3','PRD003',TRUNC(SYSDATE),'1');
INSERT INTO ITEM VALUES ('ID','4','PRD004',TRUNC(SYSDATE),'1');
INSERT INTO ITEM VALUES ('ID','5','PRD005',TRUNC(SYSDATE),'1');
INSERT INTO ITEM VALUES ('ID','6','PRD006',TRUNC(SYSDATE),'1');


I also have an ITEMATTR table that stores the attributes of certain items.

CREATE TABLE ITEMATTR
(
REGION VARCHAR2(15),
ITEMID NUMBER(4),
ATTRID VARCHAR2(15),
ATTRDATE DATE
);

INSERT INTO ITEMATTR VALUES ('ID','2','INSDATE',TRUNC(SYSDATE));
INSERT INTO ITEMATTR VALUES ('ID','4','LAUNCHDATE',TRUNC(SYSDATE-4));
INSERT INTO ITEMATTR VALUES ('ID','6','ENDDATE',TRUNC(SYSDATE-6));


I have three Lookup tables for corresponding to the main item table.

CREATE TABLE LOOKUP1
(
PRDID VARCHAR2(25),
PRD_DESCR VARCHAR2(30),
PRD_CONTRACT VARCHAR2(10)
);

INSERT INTO LOOKUP1 VALUES ('PRD002','Product 2',0);

CREATE TABLE LOOKUP2
(
PRDID VARCHAR2(25),
PRD_DESCR VARCHAR2(30),
PRD_CONTRACT VARCHAR2(10)
);

INSERT INTO LOOKUP2 VALUES ('PRD004','Product 4',0);

CREATE TABLE LOOKUP3
(
PRDID VARCHAR2(25),
PRD_DESCR VARCHAR2(30),
PRD_CONTRACT VARCHAR2(10)
);

INSERT INTO LOOKUP3 VALUES ('PRD006','Product 6',0);


Now, I'm running an SQL like this

SELECT 
MAX(B.PRDID),MAX(B.PRD_DESCR),MAX(B.PRD_CONTRACT),
MAX(C.PRDID),MAX(C.PRD_DESCR),MAX(C.PRD_CONTRACT),
MAX(D.PRDID),MAX(D.PRD_DESCR),MAX(D.PRD_CONTRACT),
MAX(DECODE(B.PRDID,NULL,NULL,E.ATTRDATE)),
MAX(DECODE(D.PRDID,NULL,NULL,E.ATTRDATE))
FROM ITEM A
LEFT JOIN LOOKUP1 B ON A.ITEMPRDID=B.PRDID
LEFT JOIN LOOKUP2 C ON A.ITEMPRDID=C.PRDID
LEFT JOIN LOOKUP3 D ON A.ITEMPRDID=D.PRDID
LEFT JOIN ITEMATTR E ON A.REGION=E.REGION AND A.ITEMID=E.ITEMID AND E.ATTRID IN ('INSDATE','ENDDATE')
WHERE
A.ITEMPARENTID = '1'


My aim is to get the matching values for a product from the lookup tables in a single SQL.(max of item id)

In this case it works, because, I'm hitting only one row against each of the lookup tables.

However, if I do

INSERT INTO LOOKUP2 VALUES ('PRD003','Product 3',2);


and execute the same SQL, I get erroneous results.

I could figure out that the problem is due to MAX.

What is the other aggregate function that I can use in this case to get the desired result?

More generally, is it possible to achieve this in a single SQL if I hit multiple rows against the lookup table? One point here is, a same product cannot be present across LOOKUP1, LOOKUP2 and LOOKUP3 tables.It can be in only one of these tables. That is why, I chose MAX but hitting a different problem when doing this way.





Tom Kyte
January 05, 2009 - 10:01 am UTC

I hate your model. ugh. This will never perform, it is so so very very hard to query.


you will need to better define your query.

You give me a query - that does not work and simply say "results are erroneous"

But you never gave us the "question", not really. You say:

... My aim is to get the matching values for a product from the lookup tables in a single SQL.(max of item id) ...

but lookup tables do not even HAVE an item id.


this model is an utter mess. Since you have not really started implementation yet, I would strongly suggest:

DROP THIS MODEL

think about a better design

oracle version

Srini, December 30, 2008 - 8:51 pm UTC

Forgot to provide the Oracle version for my post..

Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
PL/SQL Release 9.2.0.5.0 - Production
CORE 9.2.0.6.0 Production
TNS for Solaris: Version 9.2.0.5.0 - Production
NLSRTL Version 9.2.0.5.0 - Production


Srini, January 05, 2009 - 3:01 pm UTC

Tom,

Yes, I do agree that this MODEL is a mess. But, that is what the OOB CRM implementation has given us :(..

Somehow, I have managed this using a WITH clause (subquery factoring), and without hitting any performance bottlenecks.

Thanks,
Srini

Column order

Sanji, March 04, 2011 - 3:44 pm UTC

Tom,

Oracle 9i rel2, HP-UX 11i.

create table project
(project_number number,
territory1 varchar2(10),
territory2 varchar2(10),
territory3 varchar2(10),
territory4 varchar2(10),
territory5 varchar2(10)
);

insert into project values(1,'ter1','ter2',null,null,null);
insert into project values(1,null,null,null,'ter4',null);
insert into project values(1,'ter1',null,'ter2',null,null);
insert into project values(1,'ter1','ter2','ter3',null,null);
insert into project values(1,'ter1','ter2','ter3','ter4','ter5');

I want to add another column which would give me the number of not null territories for that row.

For instance, TER_ORDER would be like

PROJECT_NUMBER TERRITORY1 TERRITORY2 TERRITORY3 TERRITORY4 TERRITORY5  TER_ORDER
-------------- ---------- ---------- ---------- ---------- ---------- ----------
             1 ter1       ter2                                                 2
             1                                  ter4                           1
             1 ter1                  ter2                                      2
             1 ter1       ter2       ter3                                      3
             1 ter1       ter2       ter3       ter4       ter5                5

I can do that with CASE, but that'd be a lot of CASE statements.
I was wondering if we can utilize analytics here.
Any suggestion is much appreciated.

Thanks
Sanji
Tom Kyte
March 07, 2011 - 9:42 am UTC

No, you don't.

You might want to create a view (in 11g a virtual column) that computes this information:

decode(t1,null,0,1)+decode(t2,null,0,1)+...decode(t5,null,0,1)


Actually, your data model looks probably botched already (do not botch it up more by storing this value physically!). Anytime I see columns ending in "1", "2", "3", ... "N" I say that...
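
A sketch of the view Tom describes (the view name is illustrative):

```sql
create or replace view project_v as
select p.*,
       decode(territory1,null,0,1) +
       decode(territory2,null,0,1) +
       decode(territory3,null,0,1) +
       decode(territory4,null,0,1) +
       decode(territory5,null,0,1) as ter_order
  from project p;
```

The count is computed at query time, so nothing redundant is stored in the table.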

column order

A reader, March 04, 2011 - 5:37 pm UTC

I could think of the following for the above question.
Not sure if it's the most efficient way...

SELECT project_number, territory1, territory2, territory3, territory4, territory5,
ter_order1+ter_order2+ter_order3+ter_order4+ter_order5 as ter_order
FROM(
SELECT project_number, territory1, territory2, territory3, territory4, territory5
,CASE WHEN territory1 is null THEN 0 ELSE 1 END ter_order1
,CASE WHEN territory2 is null THEN 0 ELSE 1 END ter_order2
,CASE WHEN territory3 is null THEN 0 ELSE 1 END ter_order3
,CASE WHEN territory4 is null THEN 0 ELSE 1 END ter_order4
,CASE WHEN territory5 is null THEN 0 ELSE 1 END ter_order5
FROM project) iq


PROJECT_NUMBER TERRITORY1 TERRITORY2 TERRITORY3 TERRITORY4 TERRITORY5  TER_ORDER
-------------- ---------- ---------- ---------- ---------- ---------- ----------
             1 ter1       ter2                                                 2
             1                                  ter4                           1
             1 ter1                  ter2                                      2
             1 ter1       ter2       ter3                                      3
             1 ter1       ter2       ter3       ter4       ter5                5

Thanks
Sanji
