Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Gautam.

Asked: August 20, 2001 - 2:20 pm UTC

Last updated: November 12, 2010 - 8:46 am UTC

Version: 8.1.5

Viewed 10K+ times! This question is

You Asked

Hi..

i want to display the top 5 salaries for each department using single SQL !!! i don't want to use the PL/SQL !!!!

how can i achieve....?.


Your responce will be highly appriciated.

Sorry!!!.

What i want is Top 5 salaries in each department. if department has
repetative salararies [ department no 10 has 8 records having same salary than display only once.

I hope now You may understand my questation.

Your responce will be highly appriciated.


Thanks.






and Tom said...

Ok, given the clarification I can show you a way to get this in 815 and before (slow on a big table) and in 816 and up (fast with analytic functions!)

ops$tkyte@ORA817.US.ORACLE.COM> select * from t;

DEPTNO SAL
---------- ----------
10 100
10 100
10 200
10 300
10 400
10 500
20 100
20 200
20 300
20 400
20 500
20 600
20 700
20 700
30 100
30 200
30 300
30 400

18 rows selected.

Thats my sample data. Now to get it in 815 and before:

ops$tkyte@ORA817.US.ORACLE.COM>
ops$tkyte@ORA817.US.ORACLE.COM>
ops$tkyte@ORA817.US.ORACLE.COM> select distinct *
2 from t t1
3 where 5 >= ( select count(distinct t2.sal)
4 from t t2
5 where t2.deptno = t1.deptno
6 and t2.sal >= t1.sal )
7 /

DEPTNO SAL
---------- ----------
10 100
10 200
10 300
10 400
10 500
20 300
20 400
20 500
20 600
20 700
30 100
30 200
30 300
30 400

14 rows selected.


And now using a new feature of 816:


ops$tkyte@ORA817.US.ORACLE.COM> select distinct *
2 from ( select deptno, sal,
3 dense_rank() over ( partition by deptno order by sal desc ) rank
4 from t )
5 where rank <= 5
6 /

DEPTNO SAL RANK
---------- ---------- ----------
10 100 5
10 200 4
10 300 3
10 400 2
10 500 1
20 300 5
20 400 4
20 500 3
20 600 2
20 700 1
30 100 4
30 200 3
30 300 2
30 400 1

14 rows selected.

Rating

  (42 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

More info

Benny Ching, August 24, 2001 - 6:41 pm UTC

Can you explain this new feature or point me to some
documentation on it. Thanks!

salaries > 5

sree, August 26, 2001 - 5:07 am UTC

hi tom,

i am a java professional. i am new to oracle. can u please explain how the 5>subquery work. like sub query gives only count.
can u suggest good site/book for queries.
i have more doubts.\
1. how to work with tkprof. the command is not working from command prompt. should i set any thing in classpath or path.

2.when i try to create a database i get the following errors. i have followed oracle documentation.
when i try to create database i get the following error and the session terminates.i can find datafiles created in the disk.
ORA-00603: oracke server session terminated by fatal error.

and in alert log file the following error appears

ORA-00376: file 1 cannot be read at this time
ORA-01110: data file 1: 'E:\ORANT\LEASE\LSYS01.ORA'

Can u help me?

thanks
sree

Tom Kyte
August 26, 2001 - 8:09 pm UTC

This query:

ops$tkyte@ORA817.US.ORACLE.COM> select distinct *
  2    from t t1
  3   where 5 >= ( select count(distinct t2.sal)
  4                  from t t2
  5                 where t2.deptno = t1.deptno
  6                   and t2.sal >= t1.sal )
  7  /



is conceptually processed like this:

  for x in ( select * from t t1 )
  loop
      count the number of distinct salaries in t such that the deptno is equal
      to X.DEPTNO and the salary is greater then or equal to X.SAL. 

      if ( that count <= 5 ) 
      then
          keep the record 
      end if;
  end loop

That subquery is technically called a "correlated subquery" and is effectively run once per row in the outer query to get a value.
       

As for TKPROF, it is a simple command line tool.  You need to have your ORACLE_HOME set and ORACLE_HOME/bin should be in the path (not classpath -- tkprof is C).  As you didn't put an error message -- i cannot guess what the error might be.

As for creating a database, I suggest you use the database configuration assistant and let it do the work.  Again, without a tad more information -- answering your specific request isn't really possible. 

Fantastic explantion

umesh, August 26, 2001 - 10:46 am UTC

Good Tom U always remember things for me which I forget down the line ..
Tom I know I should not ask a Question Here
But Still Is there any Indian Edition of your book coming uo soon ( You know I am talking about Expert 1 on 1 !!)


Tom Kyte
August 26, 2001 - 8:11 pm UTC

If by Indian Edition you mean a translated version -- not that I am aware of.

The Article was excellent.

Ganesh. R, August 27, 2001 - 2:34 am UTC

Hi Tom,

This regarding the answer you have given to the last comment by umesh from Banglore.

By Indian Edition we mean a Reprit as the book if got as a US edition Costs us 2000 odd Rs which is too high [ I know that nothing is too much for Tom's book but still... ] So we people in India are eagerly awaiting a Reprint of the book so that we can get from our local book store.

So it would be great if you would pull some strings with wrox for the reprint.

Regards,
Ganesh R



To Find the Nth Max or Min

Kamal.P, August 27, 2001 - 6:44 am UTC

It is most useful to find out the "N"th max or min,exact n value from the given query,please let me know the following result is the same as Mr.TOM's answer..
select a.dept_no,a.salary from t a where 5<=(select count(*) from t
where a.salary<salary );

Tom Kyte
August 27, 2001 - 7:15 am UTC

No, that would be different than:

ops$tkyte@ORA817.US.ORACLE.COM> select distinct *
  2    from t t1
  3   where 5 >= ( select count(distinct t2.sal)
  4                  from t t2
  5                 where t2.deptno = t1.deptno
  6                   and t2.sal >= t1.sal )
  7  /


Yours is very different.  It finds the count of all records with salaries  > the row from the outer query.  My subquery finds the number of DISTINCT ( remove dups ) salaries in the CURRENT DEPTNO that are larger. 

book not available

Harish, August 27, 2001 - 10:04 am UTC

yup,your book is not available in India

top salary in each department but very big emp table

A reader, August 30, 2002 - 5:48 am UTC

hi tom

if we have 100000 rows in table emp then the query (example)

select ename, deptno
from emp a
where sal = (select max(sal) from emp b
where a.deptno = b.deptno
and b.deptno in (10, 30))


would be very very slow, is there a way to rewrite this query?


Tom Kyte
August 30, 2002 - 10:18 am UTC

Why pre-suppose things like that?  Consider:


ops$tkyte@ORA920.US.ORACLE.COM> /*
DOC>drop table emp;
DOC>create table emp as select * from scott.emp where 1=0;
DOC>
DOC>exec gen_data( 'EMP', 100000 );
DOC>
DOC>drop index dept_index;
DOC>create index dept_index on emp( deptno, sal );
DOC>analyze table emp compute statistics for table for all indexes for all indexed columns;
DOC>*/
ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> select deptno, count(*)
  2    from emp
  3   group by deptno;

DEPTNO   COUNT(*)
------ ----------
     1        503
     2       1000
     3        983
     4        984
.....
    97       1002
    98        989
    99        523

99 rows selected.

Elapsed: 00:00:00.13

ops$tkyte@ORA920.US.ORACLE.COM> select ename, deptno
  2  from emp a
  3  where sal = (select max(sal) from emp b
  4               where a.deptno = b.deptno
  5               and b.deptno in (10, 30))
  6  /

ENAME      DEPTNO
---------- ------
xnkHKxWxyr     10
zqZDsVmPjF     30
<b>
Elapsed: 00:00:00.00

That ain't very very slow, is it?</b>

ops$tkyte@ORA920.US.ORACLE.COM> set autotrace on
ops$tkyte@ORA920.US.ORACLE.COM> select ename, deptno
  2  from emp a
  3  where sal = (select max(sal) from emp b
  4               where a.deptno = b.deptno
  5               and b.deptno in (10, 30))
  6  /

ENAME      DEPTNO
---------- ------
xnkHKxWxyr     10
zqZDsVmPjF     30

Elapsed: 00:00:00.13

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=1 Bytes=40)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (Cost=2 Card=1 Bytes=14)
   2    1     NESTED LOOPS (Cost=9 Card=1 Bytes=40)
   3    2       VIEW OF 'VW_SQ_1' (Cost=5 Card=2 Bytes=52)
   4    3         SORT (GROUP BY) (Cost=5 Card=2 Bytes=14)
   5    4           INLIST ITERATOR
   6    5             INDEX (RANGE SCAN) OF 'DEPT_INDEX' (NON-UNIQUE) (Cost=5 Card=1333 By
          tes=9331)

   7    2       INDEX (RANGE SCAN) OF 'DEPT_INDEX' (NON-UNIQUE) (Cost=1 Card=1)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         17  consistent gets
          0  physical reads
          0  redo size
        487  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          2  rows processed

ops$tkyte@ORA920.US.ORACLE.COM> set autotrace off
ops$tkyte@ORA920.US.ORACLE.COM>

<b>It in fact is pretty darn fast</b>
 

analytic functions instead of correlated query

A reader, August 30, 2002 - 1:40 pm UTC

Hi

I have follwing correlated query


Select co_id, ch_seqno, ch_status, userlastmod, entdate
From contract_history ch1
Where (ch1.ch_seqno = ( Select Max(ch_seqno)
From contract_history ch2
Where ch1.co_id = ch2.co_id
And trunc(entdate) <= trunc(sysdate - 1))
or ( ch1.ch_seqno = 1 and trunc(entdate) <= trunc(sysdate - 1)))
order by co_id, ch_seqno;

The plan is contrct_history_pk (compositi index cod_id and cod_seqno) range scan then contract_history full table scan (or the other way round, I dont have the plan handy and cant get it now)


I changed it into

select co_id, ch_seqno, ch_status, userlastmod, entdate
from (select co_id, ch_seqno, ch_status, userlastmod,
entdate,
dense_rank() over (partition by co_id order by
ch_seqno desc) rango
from contract_history
where co_id < 11
and trunc(entdate) <= trunc(sysdate - 1)) iv_ch
where rango = 1
or (ch_seqno = 1 and trunc(entdate) <= trunc(sysdate - 1))
order by co_id, ch_seqno

The plan is simply a full table scan and window sort

I think the second one is better although it runs just 1 minute faster comparing with previous correlated query for 50000 rows. The consistent gets are lower with analytic function as well

Table contract history has around 13 million rows

Do you think I am in the right direction :?

Tom Kyte
August 30, 2002 - 4:51 pm UTC

If you are using the cost based optimizer -- you are.



Hi Umesh

Reddy, August 30, 2002 - 1:42 pm UTC

I am Reddy here,

Yesterday I got book from Gangarams(after Mr. Tom Suggestion). Now only one copy is available from Gangarams. Please get it now it self,with out fail.

Reddy

oops the second query should be this

A reader, August 30, 2002 - 1:45 pm UTC

select co_id, ch_seqno, ch_status, userlastmod, entdate
from (select co_id, ch_seqno, ch_status, userlastmod, entdate,
dense_rank() over (partition by co_id order by ch_seqno desc) rango
from contract_history
where trunc(entdate) <= trunc(sysdate - 1)) iv_ch
where rango = 1
or (ch_seqno = 1 and trunc(entdate) <= trunc(sysdate - 1))
order by co_id, ch_seqno

cod_id < 11 was redundant

cheers

sreekanth gadhiraju, August 30, 2002 - 2:44 pm UTC

following will also give top five salaries

select empno, ename, -minus_sal sal from
(
select minus_sal, empno, ename, rownum top from
(
select -sal minus_sal, empno, ename from emp
union
select to_number(null), to_number(null), null from dual
minus
select to_number(null), to_number(null), null from dual
)
) where top <= 5

how to do this ???

Riaz Shahid, February 07, 2003 - 4:31 am UTC

Hello Tom !

consider:

Table : t
Columns:

seller_Regno char(13)
invoice_no varchar2(20)
...........
..........


I want top 15 seller_regno's( and their counts) who have the greatest no. of records in table t.

For this i did:

select * from (select count(1)cnt,seller_regno form t
group by seller_regno
order by count(1) desc)
where cnt<16;

It gave me correct results but its too much slow. Can analytic function help me for this ???

I am using 8.1.7 EE on Win2000 machine.

Riaz

Tom Kyte
February 07, 2003 - 8:13 am UTC

that query doesn't look right

select *
from (select count(*) cnt,seller_regno
from t
group by seller_regno
order by 1 desc)
where ROWNUM < 16;

but basically -- your question involves scanning and aggregating the entire table. An index on seller_regno and a NOT NULL constraint would help it greatly.



To find the 3rd maximum

Dani, April 28, 2003 - 5:14 am UTC

Hi The great,
I have Written a query to find the emp_id, who's salary is the 3 rd maximum in the company.

Table Name : Emp

Emp_Id number(10)
Salary number(9,2)

create table emp(Emp_id number(10), salary number(9));

insert into emp values(1,3000);
insert into emp values(2,1000);
insert into emp values(3,5000);
insert into emp values(4,2400);


The Query is
----------------
Select emp_id from emp
Where
Salary = (select min(Salary) from (select distinct salary from emp order by

salary desc) where rownum <=3)

-------------------
It is working perfectly.

But my boss is told that you can give better query than the previous one. Is there any other

efficient way to find this 3rd maximum.

Please reply to me


Thanks in advance

Dani

Tom Kyte
April 28, 2003 - 8:05 am UTC

and you could not figure out how to do this given the above examples?

Top N query

Mohan K, June 13, 2003 - 9:15 am UTC

Another query can be

select b.rnum, a.deptno, b.sal from
(select deptno, min(rnum) minrnum from(select rownum rnum, deptno, sal from (select deptno, sal from dept order by deptno, sal)) group by deptno) a,
(select rownum rnum, deptno, sal from (select deptno, sal from dept order by deptno, sal)) b
where a.deptno=b.deptno and b.rnum>=a.minrnum and b.rnum<(a.minrnum+5)
/


Query to find row wise maximum

Praveen, June 25, 2003 - 3:40 am UTC

Hi tom,

    Learned a lot from your site. I am regular reader of your site. I have one small question to ask. suppose I have a students table as described below

SQL> desc students
Name Type         Nullable Default 
---- ------------ -------- ------- 
NAME VARCHAR2(30) Y                
PHY  NUMBER(3)    Y                
CHEM NUMBER(3)    Y                
MATH NUMBER(3)    Y                
BIO  NUMBER(3)    Y

In this table I store all the marks of the students in physics, chemistry, maths and biology. Now I have data as below:
SQL> select * from students;

NAME                            PHY CHEM MATH  BIO
------------------------------ ---- ---- ---- ----
Jack                             45   50   67   35
Jill                             50   76   68   66
Jane                             35   78   78   56
Bill                             56   35   98  100

4 rows selected

I want to display the student wise subjects that he/she is scoring most. From the above example I want to display the result as

NAME        SUBJECT
--------------- ---------------
Jack        MATH
Jill        CHEM
Jane        CHEM, MATH
Bill        BIO

can I achieve this in a single SQL statement? If so then can you explain the method... 

Tom Kyte
June 25, 2003 - 12:02 pm UTC

no version.  bummer.

ok, using the technique at
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:2196162600402

pretend DEPTNO = NAME
pretend SAL=grade
pretend ENAME = subject...

then


ops$tkyte@ORA920LAP> select deptno, stragg(ename) ename
  2    from (select deptno, ename, dense_rank() over (partition by deptno order by sal DESC) dr
  3            from emp)
  4   where dr = 1
  5   group by deptno;

    DEPTNO ENAME
---------- ------------------------------
        10 KING
        20 SCOTT,FORD
        30 BLAKE

 

Nice

Ferdinand Maer, September 24, 2003 - 1:52 am UTC

Dear Sir,
I have some doubts regarding queries.They are as follows
1)Can a column value be tested in queries? For example.
How to make this query work?
select deptno,sal,hiredate from emp where 'KING' = (..TEST
condition );
How to do this?I tried but it throws errors.Could you please tell how to do this?
2)I want to get rows from EMP table who donot belong to
deptno = 10 using * not exists* .The query needs to be like
"select * from emp where not exists(...subquery which tests
for deptno ->10)" .Could you please help?
Thanks a lot in advance.



Tom Kyte
September 24, 2003 - 9:31 am UTC

1) having no idea what ( ...TEST condition ) means -- i cannot say anything about this one....


2) why would you want to do that???? but you could

1* select ename, deptno from emp where not exists ( select * from dual where deptno = 10 )
scott@ORA920LAP> /

ENAME DEPTNO
---------- ----------
SMITH 20
ALLEN 30
WARD 30
JONES 20
MARTIN 30
BLAKE 30
SCOTT 20
TURNER 30
ADAMS 20
JAMES 30
FORD 20

11 rows selected.


it would be about the wrongest way to do it, but you could

Nice

Ferdinand Maer, September 25, 2003 - 2:17 am UTC

Dear Sir,
Regarding the first question ,I tried to get the ename which must match 'KING'.To be clear My intention was
" select deptno,sal from emp where 'KING' = (select ename
from emp).I want the subquery to return the ename as 'KING'
Is this possible?
I have some other doubts also involving queries.
1)What is this "column_value"?I searched in your website but
donot find anything.
2)You sometime use "select max(decode(col_name...))"
Could you please explain why you are doing so with an example?
Thanks in advance


Tom Kyte
September 25, 2003 - 5:20 am UTC

why?  it would be "wrong"


but, it would be;

select deptno, sal from emp where 'KING' = ( select ename from DUAL );


but again - it is wrong wrong wrong, wrong wrong wrong. not anything you want to do.


1) column_value is the default name assigned to the single column of a nested table type.


ops$tkyte@ORA920LAP> create or replace type numArray as table of number
  2  /

Type created.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> create table t
  2  ( x int,
  3    y numArray )
  4  nested table y store as y_tab;

Table created.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> insert into t values ( 1, numArray(1,2,3) );

1 row created.

ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> select y.* from t x, TABLE(x.y) y
  2  /

COLUMN_VALUE
------------
           1
           2
           3

ops$tkyte@ORA920LAP>

2) you'll have to show me a FULL example as there are possibly and infinite set of responses to that question. 

Nice

Ferdinanand Maer, September 25, 2003 - 9:16 am UTC

Dear Sir,
Thanks for you response.I need a query that returns the max.
among the count of jobs.I tried the one like
"select job,count(*),max(count(*)) from job group by job"
But got an error stating that not a single - group group
function.How to correct this?Please specify other formats of
the query.
P.S)-> Can the "select max(decode(job,..." example used here?
Please do reply.
Thanks a lot in advance.

Tom Kyte
September 25, 2003 - 11:15 pm UTC


scott@ORA920LAP> select job, count(*), max(count(*)) over () from emp group by job;

JOB COUNT(*) MAX(COUNT(*))OVER()
--------- ---------- -------------------
ANALYST 2 4
CLERK 4 4
MANAGER 3 4
PRESIDENT 1 4
SALESMAN 4 4


Nice

Ferdinand Maer, September 26, 2003 - 4:36 am UTC

Dear Sir,
Thanks for your response.I have a question for you.
1) Can we have a glimpse of the first question which you have received when this website was started?

Tom Kyte
September 26, 2003 - 6:26 am UTC

i forget the exact first question, one of them that was sort of the genisys of it all was:

</code> https://asktom.oracle.com/Misc/oramag/oracle-availability-options.html <code>

starting to show it's age by now but it was the first print article I had and that question was among the first few.



NIce

Ferdinand Maer, September 27, 2003 - 11:26 am UTC

Dear Sir,
Thanks for your response.I have some questions regarding
queries.They are
1)In a query,can a where clause contain a select stmt.like
"select * from dept where(select ...) = .."
2)suppose if I want to add the sal of first five rows regard
-less of deptno. how can I do it?Can analytic functions be
used here?
3)Can a select like this possible ?
'select(select * from a) from b'
Please provide some examples.
Thanks in advance.
Bye!





Tom Kyte
September 27, 2003 - 1:04 pm UTC

1) that is a statement of fact, not a question...

2) select sum(sal) from ( select sal from t where rownum <= 5 )

3) yes.

select ( select * from dual ) from dual;



Predicate out of scope

Marcio, October 02, 2003 - 12:54 pm UTC

Is it possible?

ops$t_mp00@MRP9I1> select x.*, rank() over ( order by sum_sal ) rank
2 from (
3 select job, sum(sal) sum_sal
4 from emp
5 group by job
6 ) x
7 /

JOB SUM_SAL RANK
--------- ---------- ----------
CLERK 4150 1
PRESIDENT 5000 2
SALESMAN 5600 3
ANALYST 6000 4
MANAGER 8275 5
DBA 10000 6

6 rows selected.

Now I would like selection dynamic with hiredate predicate. I know that query below doesn't work, just ilustrate what I want.

ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> select x.*, rank() over ( order by sum_sal ) rank
2 from (
3 select job, sum(sal) sum_sal
4 from emp
5 group by job
6 ) x
7 where hiredate between to_date('01/01/80', 'dd/mm/yy') and
8 to_date('01/01/99', 'dd/mm/yy')
9 /
where hiredate between to_date('01/01/80', 'dd/mm/yy') and
*
ERROR at line 7:
ORA-00904: "HIREDATE": invalid identifier

Any idea?

Tom Kyte
October 02, 2003 - 4:46 pm UTC

using an illustration that doesn't make any sense to me won't help. I don't know what you want really.

is it as easy as putting the predicate in the inline view where it will work.

View with rank dynamic

Marcio, October 03, 2003 - 8:21 am UTC

I'm helping with BO (Business Inteligence Tool) and need a view that brings to me a rank of sum(sal). The view predicate is -- where hiredate between a range. I know if I put predicate into inline view this will work fine, but how can I create a view to accept a range of hiredate and rank with sum()?



Tom Kyte
October 03, 2003 - 8:44 am UTC


tell me what you can do in BO -- can you make a stored procedure call before running a query? (we can parameterize a view with an application context). Can you call a stored procedure that returns a ref cursor? what can we do -- i don't know this tool.

View

Marcio, October 03, 2003 - 9:57 am UTC

1) can you make a stored procedure call before running a query?
No. This was the first choice offer.

2) Can you call a stored procedure that returns a ref cursor?
No.

We can do view in BO.

Tom Kyte
October 03, 2003 - 10:17 am UTC

sorry, no go then.

you cannot push the predicate down into the view with analytics. it changes the semantic meaning of the query. if you create a view like:


create view v
as
select empno, sal, hiredate, rank() over (order by sal desc) rnk
from emp;


when you query:


select * from v where hiredate between A and B;

that will be like:


select * from (select empno, sal, hiredate, rank() over (order by sal desc) rnk
from emp ) where hiredate between A and B;


the inline view would be materialized fully AND THEN queried -- for the simple reason that

select empno, sal, hiredate, rank() over (order by sal desc) rnk
from emp
where hiredate between A and B

would return a totally different answer.



You'll have to put the query itself into BO.

Complt. last follup

Marcio, October 03, 2003 - 10:18 am UTC

In BO we can select * from view... just it, just queries.

BO

A reader, October 03, 2003 - 12:58 pm UTC

In BO (thick client) you can use stored procedures. You can also use analytical functions in BO Universe Design. For using analytical function go to thier web site and search for "tips and tricks". You will find the document under BO Universe Design.

Tom Kyte
October 03, 2003 - 1:33 pm UTC

thanks so much for the input -- appreciate it!

thanks "reader"

Marcio, October 03, 2003 - 2:51 pm UTC

Thanks, but according my colleague, who is responsible to implement that problem with BO, is -- web intelligence doesn't allow any stored procedure or queries with inline views YET.
Anyway, thank you so much!



Awesome

Jeffrey, October 30, 2003 - 5:43 pm UTC

Hello Tom,

I have learned more on Oracle from this site than any other resource. Thank you.

I have a followup question for this case; and I also wanted your readers to know how well this works. I timed these and the second example was approximately 25% the time of the first.

I started with the following query: (where T1 has (id number, value number, value2 number, and id and value are relevant, value2 is not)

select * from t1 where (id, value) in (
select i1, v1 from (
select id i1, value v1 from t1 T
where id in (select distinct id from t1)
and value = (select max(value) from t1 where id = T.id)
) group by i1, v1 having count(*) = 1
)

the goal here was to find the id,value pairs with the greatest value for a given ID, but only when the case with that value has only one instance. if there are >1 cases with the highest 'value', it should not be included in the result.

I am far from a SQL expert, so the first question is how could I improve that query without analytical functions?

I then turned that into the following:

select * from t1 where (id, value) in (
select i1, v1 from
(select id i1, value v1,
dense_rank() over ( partition by id order by value desc ) rank
from t1 T
where id in (select distinct id from t1)
) where rank = 1
group by i1, v1
having count(*) = 1)

The inner part here returns the ranks, when is selected only the #1 ranks (equivalent to the MAX above), then selecting cases where count=1 to reject the ones where there is more than one case.

2) 2nd question is how could I improve this query?





Top Rows from subquery -- Is it possible?

Dilip Patel, May 26, 2004 - 11:51 am UTC

Tom,
I have query of following type.

Select taba_col1, tabb_col1,
(select sum(top_five_rows_from_some_other_table)
from some_other_table where some_other_table.cola = tabb.cola
order by some_other_table.colb) -- This line is giving error.
from taba, tabb
where taba.col1 = tabb.col1 ;

Basically It is not letting me use order by clause in computed column of select list.

Just want to know if this is possible or syntax'lly wrong.
I am breaking my head since last two days and nothing seems to work.

Please advise. Thanks as always.

-- Dilip

Tom Kyte
May 26, 2004 - 3:23 pm UTC

in a select list like that

select a,b, (query) from t

the query is known as a SCALAR subquery.

It returns zero or one rows (if it returns two, it is an error)

hence -- order by in that context = does not compute

you have at MOST one row there.

Thanks, much clear now.

Dilip Patel, May 26, 2004 - 4:06 pm UTC


Might not be suitable for this thread but...

A reader, July 15, 2004 - 6:53 am UTC

Tom,

I guess this is not a right question to ask in this thread. But couldn't resist myself doing so.
What is the difference (if there is one) between "distinct" & "unique" ??
I remeber reading this somewhere but can't recollect though.

PLease throw some light as usual.

Thanks

Tom Kyte
July 15, 2004 - 12:35 pm UTC

they are synonymous.

top-n subquery in an update

Wil Simpson, August 08, 2004 - 3:20 pm UTC

yet another variation on top-n queries:

what is the best way to calculate: (in the context of an update) the number of transactions that were entered in the same way as the current transaction out of the 5 last transactions by the cardholder (regardless of entry method)

or more generally stated: calculate some aggregate function over a subset of rows from an interval of rows defined around this row, i.e. the one being updated

setup:

CREATE TABLE cg (
cardnumber varchar2(16) not null,
entrytype char(1) not null,
chargedate date not null,
NBANAT5#DD0 numeric(7) null) ;

(i apologize for leaving out inserts of random data, but i think that the problem is pretty clear even from the rbo plans of an empty table)

CREATE INDEX cgcncdet on cg (cardnumber,chargedate,entrytype) ;

CREATE INDEX cgcnetcd on cg (cardnumber,entrytype,chargedate) ;

the set-theory happy version:

(if we assume that no transactions are simultanious)

UPDATE cg
SET NBANAT5#DD0
= NVL(( SELECT COUNT(*)
FROM cg t
WHERE t.cardnumber = cg.cardnumber
AND t.entrytype = cg.entrytype
AND ( SELECT COUNT(*)
FROM cg t1
WHERE t1.cardnumber = cg.cardnumber
AND t1.chargedate >= t.chargedate
AND t1.chargedate < cg.chargedate) <= 5
AND t.chargedate <= cg.chargedate ),0) ;

UPDATE STATEMENT
UPDATE CG
TABLE ACCESS FULL CG
SORT AGGREGATE
FILTER
INDEX RANGE SCAN CGCNETCD "SYS_ALIAS_3"."CARDNUMBER"=:B1 AND "SYS_
ALIAS_3"."ENTRYTYPE"=:B2 AND "SYS_ALIAS_
3"."CHARGEDATE"<=:B3
SORT AGGREGATE
INDEX RANGE SCAN CGCNCDET "T1"."CARDNUMBER"=:B1 AND "T1"."CHARGEDA
TE">=:B2 AND "T1"."CHARGEDATE"<:B3

so, for every single row in the transaction history for the cardholder and entrytype, it calculates how many rows exist between it and the current transctions to determine if it is in the last 5 or not. it works and when each cardholder has a sufficiently small transaction history then the performance is not TOO bad, but when the transaction history becomes long it is grotesque. (well, it's grotesque in the first place, but the response time will change from maybe a minute over a million rows with small transaction histories to being several hours over a million rows with long transaction histories)

if it were legal, then the following would perform better, but it isn't. (no correlated subquery in an in-line view, right?)

UPDATE cg
SET NBANAT5#DD0
= NVL(( SELECT count(*)
FROM ( SELECT * from cg t1
where t1.cardnumber = cg.cardnumber
and t1.chargedate < cg.chargedate
order by t1.chargedate desc) t
WHERE ROWNUM <= 5
and t.entrytype = cg.entrytype),0) ;

conceivably something like

SUM(decode(entrytype, >this<.entrytype, 1, 0) OVER (partition by cardnumber order by chargedate desc
rows between 5 preceding and 1 preceding)

would work as well, but i don't see any means of referring to >this<.entrytype in the analytic function. (and aggregate or analytic functions can't be used in update statements, right?)

any suggestions?

thank you,
Wil


Tom Kyte
August 08, 2004 - 6:04 pm UTC

is the goal here to update "a single row in a large table" or "every single row in a large table"?

answers for each are very very different.

top-n subquery in an update - continued

Wil Simpson, August 08, 2004 - 6:54 pm UTC

every row.

Tom Kyte
August 09, 2004 - 7:20 am UTC

ops$tkyte@ORA9IR2> CREATE TABLE cg ( cardnumber varchar2(16) not null,
  2                     entrytype char(1) not null,
  3                    chargedate date not null,
  4                   NBANAT5#DD0 numeric(7) null) ;
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into cg (cardnumber, entrytype, chargedate )
  2  select 1, 'x', sysdate-rownum
  3    from all_users
  4   where rownum <= 10;
 
10 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into cg (cardnumber, entrytype, chargedate )
  2  select 2, substr(username,1,1), sysdate-rownum
  3    from all_users
  4   where rownum <= 10;
 
10 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into cg (cardnumber, entrytype, chargedate )
  2  select 3, substr(username,1,1), sysdate-rownum
  3    from all_users
  4   where rownum <= 4;
 
4 rows created.
 
ops$tkyte@ORA9IR2> insert into cg (cardnumber, entrytype, chargedate )
  2  select 3, 'x', sysdate-(rownum+4)
  3    from all_users
  4   where rownum <= 6;
 
6 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> merge into cg
  2  using ( select rowid rid,
  3                 decode( lag(entrytype,1) over (partition by cardnumber order by chargedate), entrytype,1,0)+
  4                 decode( lag(entrytype,2) over (partition by cardnumber order by chargedate), entrytype,1,0)+
  5                 decode( lag(entrytype,3) over (partition by cardnumber order by chargedate), entrytype,1,0)+
  6                 decode( lag(entrytype,4) over (partition by cardnumber order by chargedate), entrytype,1,0)+
  7                 decode( lag(entrytype,5) over (partition by cardnumber order by chargedate), entrytype,1,0)+1 cnt
  8            from cg ) cg2
  9  on ( cg.rowid = cg2.rid )
 10  when matched then update set nbanat5#dd0 = cnt
 11  when not matched then insert (cardnumber) values ( null );
 
30 rows merged.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> break on cardnumber skip 1
ops$tkyte@ORA9IR2> select * from cg
  2  order by cardnumber, chargedate;
 
CARDNUMBER       E CHARGEDAT NBANAT5#DD0
---------------- - --------- -----------
1                x 30-JUL-04           0
                 x 31-JUL-04           1
                 x 01-AUG-04           2
                 x 02-AUG-04           3
                 x 03-AUG-04           4
                 x 04-AUG-04           5
                 x 05-AUG-04           5
                 x 06-AUG-04           5
                 x 07-AUG-04           5
                 x 08-AUG-04           5
 
2                X 30-JUL-04           0
                 C 31-JUL-04           0
                 M 01-AUG-04           0
                 O 02-AUG-04           0
                 O 03-AUG-04           1
                 W 04-AUG-04           0
                 D 05-AUG-04           0
                 O 06-AUG-04           2
                 S 07-AUG-04           0
                 S 08-AUG-04           1
 
3                x 30-JUL-04           0
                 x 31-JUL-04           1
                 x 01-AUG-04           2
                 x 02-AUG-04           3
                 x 03-AUG-04           4
                 x 04-AUG-04           5
                 D 05-AUG-04           0
                 O 06-AUG-04           0
                 S 07-AUG-04           0
                 S 08-AUG-04           1
 
 
30 rows selected.


<b>Note, i would expect the CBO to blow off ALL indexes, indexes would be the kiss of death here.</b> 

Query to find row wise maximum

ashish, August 09, 2004 - 4:45 am UTC

hi praveen its for u


select name,decode(greatest(eng,hindi,math,sci),eng,'eng',hindi,'hindi',math,'math',sci,'sci') subject from ll
group by name ,decode(greatest(eng,hindi,math,sci),eng,'eng',hindi,'hindi',math,'math',sci,'sci')
/


need some help.

reader, August 30, 2005 - 6:28 am UTC

Hi Tom,
good day to you, need some help from you, in a table Temp having a column col1 number datatype, if values are inserted in sequence and few of the records are deleted how can we find the min missing number i.e. seq number, sounds bizarre but would like to know how can this be done via a query.

Kind Regards,
your fan

Tom Kyte
August 30, 2005 - 7:21 am UTC

ops$tkyte@ORA10GR1> create table t as select rownum x from all_users;

Table created.

ops$tkyte@ORA10GR1> delete from t where x = 10;

1 row deleted.

ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> select *
  2    from ( select x, lead(x) over (order by x) next_x
  3             from t
  4                    order by x
  5             )
  6   where x+1 <> next_x
  7     and rownum = 1;

         X     NEXT_X
---------- ----------
         9         11

Not sure I'd want to do that on a HUGE table ;) 

thanks a lot for your help

reader, August 30, 2005 - 9:37 am UTC

Hi Tom,
once again thanks a lot for your help, one more thing though could you please tell me where to read about these analytical functions with examples or if you are writing a book on these nothing better than that.

Thanks a lot again.

kind regards,
your fan.

Tom Kyte
August 30, 2005 - 12:18 pm UTC

I have a chapter on them in the old Expert one on one Oracle.

The DATA WAREHOUSING guide has much about them as well.

Getting back to groups, counts etc

Zoe, April 25, 2006 - 8:28 am UTC

Hi there,
Harking back to something similar to the start of this question about distinct and groups etc
I was hoping of some clarity on a piece of code I have to write that is stumping me at the mo,

please provide a count of the number of 'children' currently connected to an 'Adult' . Please provide it in the style of :
1 Child | number of Adults that have 1 child
2 Children | number of Adults that have 2 children
etc...

The table currently has the 2 columns I am interested in like
Adult Child
--------------------
Adult1 Child_1a
Adult1 Child_1b
Adult2 Child_2a
Adult3 Child_3a
Adult3 Child_3b
Adult3 Child_3c
etc... (not in that order necessarily)
I was trying to use distinct, count, group, and it never returns exactly what I want (I have so many different iterations of code I cannot really write them all here and they are all wrong anyway)
Please can you help


Tom Kyte
April 25, 2006 - 10:17 am UTC

first you would query


select count(*) cnt from t group by adult;

that gives you cnts by adults, then you can query:

select cnt, count(*)
from (select count(*) cnt from t group by adult)
group by cnt
order by cnt;


to count how many have 1, 2, 3, ... N children.


Top 5 Salaries by Department

Donal Dee, August 09, 2006 - 11:24 am UTC

Tom

I found the two query examples extremely Helpful
Many thanks

Donal


Total salary Increment rank

your fan, June 18, 2008 - 4:13 am UTC

Total salary Increment rank for the last 2 years or one year etc.

oracle ver is 8.5

Create table sal_raise (eno number (10) , raise_date date, raise_value number(10));

Insert into sal_raise values (101, ¿01-01-2007¿, 1000);
Insert into sal_raise values (102, ¿01-01-2007¿, 500);
Insert into sal_raise values (103, ¿01-02-2007¿, 700);
Insert into sal_raise values (102, ¿01-05-2007¿, 100);
Insert into sal_raise values (106, ¿01-05-2007¿, 100);
Insert into sal_raise values (107, ¿01-05-2007¿, 50);
Insert into sal_raise values (108, ¿01-05-2007¿, 200);
Insert into sal_raise values (109, ¿01-05-2007¿, 200);
Insert into sal_raise values (101, ¿01-01-2008¿, 1000);
Insert into sal_raise values (105, ¿01-01-2008¿, 1500);
Insert into sal_raise values (106, ¿01-05-2008¿, 200);
Insert into sal_raise values (107, ¿01-07-2008¿, 100);

How to see total top 5 increment during 01-01-2007 to 31-12-2008 .. or user may select date¿ and for some employees increment happen twice in same year ¿ but we need to see total increment during specified dates and its rank


Thank you

Tom Kyte
June 18, 2008 - 1:15 pm UTC

... oracle ver is 8.5
....

no, it is not... there was never a version 8.5


ops$tkyte%ORA9IR2> variable x varchar2(30)
ops$tkyte%ORA9IR2> variable y varchar2(30)
ops$tkyte%ORA9IR2> exec :x := '01-jan-2007'; :y := '31-dec-2007'

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> select *
  2    from (
  3  select eno, raise, dense_rank() over (order by raise) rnk
  4    from (
  5  select eno, sum(raise_value) raise
  6    from sal_raise
  7   where raise_date >= to_date(:x,'dd-mon-yyyy')
  8     and raise_date < to_date(:y,'dd-mon-yyyy')+1
  9   group by eno
 10         )
 11         )
 12   where rnk <= 5
 13   order by rnk
 14  /

       ENO      RAISE        RNK
---------- ---------- ----------
       107         50          1
       106        100          2
       108        200          3
       109        200          3
       102        600          4
       103        700          5

6 rows selected.

your fan, June 19, 2008 - 6:21 am UTC

i'm sorry tom i wrote wrong version it is i work on
8.0.4.0.0 and your query not work in my version can you please simplify query

Tom Kyte
June 19, 2008 - 10:19 am UTC

hah, going backwards things get HARDER not EASIER. I won't/can't simplify. We have to make it harder.

and I see I missed a DESC on the order by in the dense_rank - correct that and here are the two queries, one for 8.1.5 and above and one for before that.

ops$tkyte%ORA9IR2> select *
  2    from (
  3  select eno, raise, dense_rank() over (order by raise DESC) rnk
  4    from (
  5  select eno, sum(raise_value) raise
  6    from sal_raise
  7   where raise_date >= to_date(:x,'dd-mon-yyyy')
  8     and raise_date < to_date(:y,'dd-mon-yyyy')+1
  9   group by eno
 10         )
 11         )
 12   where rnk <= 5
 13   order by rnk
 14  /

       ENO      RAISE        RNK
---------- ---------- ----------
       101       1000          1
       103        700          2
       102        600          3
       108        200          4
       109        200          4
       106        100          5

6 rows selected.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select eno, sum(raise_value)
  2    from sal_raise
  3   where raise_date >= to_date(:x,'dd-mon-yyyy')
  4     and raise_date < to_date(:y,'dd-mon-yyyy')+1
  5   group by eno
  6  having sum(raise_value) in ( select *
  7                                 from (
  8                               select distinct sum(raise_value)
  9                                 from sal_raise
 10                                where raise_date >= to_date(:x,'dd-mon-yyyy')
 11                                  and raise_date < to_date(:y,'dd-mon-yyyy')+1
 12                                group by eno
 13                                order by 1 desc
 14                                      )
 15                                where rownum <= 5
 16                                                          )
 17   order by 2 desc
 18  /

       ENO SUM(RAISE_VALUE)
---------- ----------------
       101             1000
       103              700
       102              600
       108              200
       109              200
       106              100

6 rows selected.

A reader, June 21, 2008 - 2:51 am UTC

thank you tom i agree with u to use new version but my comany not update even we have it anyhow.. the second query also give error .. and rank column also missing


SQL> select eno, sum(raise_value)  
  2     from sal_raise
  3    where raise_date >= to_date(:x,'dd-mon-yyyy')
  4        and raise_date < to_date(:y,'dd-mon-yyyy')+1
  5      group by eno
  6     having sum(raise_value) in ( select *
  7                                     from (
  8                                   select distinct sum(raise_value)
  9                                   from sal_raise
 10                                  where raise_date >= to_date(:x,'dd-mon-yyyy')
 11                                    and raise_date < to_date(:y,'dd-mon-yyyy')+1
 12                               group by eno
 13                                  order by 1 desc
 14                                        )
 15                                  where rownum <= 5
 16                                                            )
 17     order by 2 desc
 18   ;

ERROR at line 13:
ORA-00907: missing right parenthesis

Tom Kyte
June 21, 2008 - 8:28 pm UTC

who is "U" and why are you agreeing them with? They are not part of this conversation at all????


oh, that is right... way way way WAY back in 8.0 - order by in a subquery was not supported (I don't have such old stuff sitting around normally)


In that really really really old stuff, this will be horribly inefficient. We have to generate a set of the top 5 raise amounts. It is easy enough to generate the raise amounts

ops$tkyte@ORA806> select distinct sum(raise_value) raise
  2    from sal_raise
  3   where raise_date >= to_date(:x,'dd-mon-yyyy')
  4     and raise_date < to_date(:y,'dd-mon-yyyy')+1
  5   group by eno
  6  /

     RAISE
----------
        50
       100
       200
       600
       700
      1000

6 rows selected.


but getting the top five is an exercise in "do lots of IO"

ops$tkyte@ORA806> select *
  2    from (
  3  select distinct sum(raise_value) outer_raise
  4    from sal_raise
  5   where raise_date >= to_date(:x,'dd-mon-yyyy')
  6     and raise_date < to_date(:y,'dd-mon-yyyy')+1
  7   group by eno
  8             ) x
  9   where 5 >= (select count(distinct inner_raise)
 10                             from ( select sum(raise_value) inner_raise
 11                          from sal_raise
 12                         where raise_date >= to_date(:x,'dd-mon-yyyy')
 13                           and raise_date < to_date(:y,'dd-mon-yyyy')+1
 14                                                   and rownum > 0
 15                         group by eno
 16                                           )
 17                            where inner_raise >= x.outer_raise )
 18  /

OUTER_RAISE
-----------
        100
        200
        600
        700
       1000


and then we can use that to find what we are interested in...

ops$tkyte@ORA806> select eno, outer_raise
  2    from (
  3  select eno, sum(raise_value) outer_raise
  4    from sal_raise
  5   where raise_date >= to_date(:x,'dd-mon-yyyy')
  6     and raise_date < to_date(:y,'dd-mon-yyyy')+1
  7   group by eno
  8             ) x
  9   where 5 >= (select count(distinct inner_raise)
 10                             from ( select sum(raise_value) inner_raise
 11                          from sal_raise
 12                         where raise_date >= to_date(:x,'dd-mon-yyyy')
 13                           and raise_date < to_date(:y,'dd-mon-yyyy')+1
 14                                                   and rownum > 0
 15                         group by eno
 16                                           )
 17                            where inner_raise >= x.outer_raise )
 18   order by 2 desc
 19  /

       ENO OUTER_RAISE
---------- -----------
       101        1000
       103         700
       102         600
       108         200
       109         200
       106         100

6 rows selected.



your fan, June 22, 2008 - 3:05 am UTC

Thank you so much Tom you are really greate.....it is a good example to explan advantage of new version with old and how simple it is in new verson..

but tom is it possible to get the rank column also in the result... like

ENO RAISE RNK
---------- ---------- ----------
101 1000 1
103 700 2
102 600 3
108 200 4
109 200 4
106 100 5

about oracle

teju, November 12, 2010 - 12:32 am UTC

how to get top 5 salaries from emp table..?
Tom Kyte
November 12, 2010 - 8:46 am UTC

are you telling me that after reading this page - you honestly cannot answer that question yourself?

read the original answer to the first question and look at dense_rank, understand it - and it'll be obvious that all you need to do is remove the partition by deptno bit so the entire table is ranked instead of ranking by department.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.