## Question and Answer

## You Asked

Hi, Tom.

Could you please tell me more about how to determine when to use analyatic function dense_rank, and when to use rank?

Thanks.

Bo

Could you please tell me more about how to determine when to use analyatic function dense_rank, and when to use rank?

Thanks.

Bo

## and Tom said...

You need to tell ME when you want to use either/or. This is a short extract from my book where I cover analytic functions in great detail.

dense_rank

This function computes the relative rank of each row returned from a query with respect to the other rows, based on the values of the expressions in the ORDER BY clause. The data within a group is sorted by the ORDER BY clause and then a numeric ranking is assigned to each row in turn starting with 1 and continuing on up. The rank is incremented every time the values of the ORDER BY expressions change. Rows with equal values receive the same rank (NULLS are considered equal in this comparison). A dense rank returns a ranking number without any gaps. This is as compared to RANK below.

...

rank

This function computes the relative rank of each row returned from a query with respect to the other rows, based on the values of the expressions in the ORDER BY clause. The data within a group is sorted by the ORDER BY clause and then a numeric ranking is assigned to each row in turn starting with 1 and continuing on up. Rows with the same exact values of the order by expressions receive the same rank, however if two rows do receive the same rank ? the rank numbers will subsequently "skip". If two rows are #1, there will be no #2 ? rank will assign the value of 3 to the next row in the group. This is in comparison to DENSE_RANK which does not skip values.

.....

We will use the SCOTT.EMP table for these examples. The first question we will answer is "give me the set of sales people making the top 3 salaries in each department"?

scott@TKYTE816> break on deptno skip 1

scott@TKYTE816> select *

2 from ( select deptno, ename, sal,

3 dense_rank() over ( partition by deptno

4 order by sal desc ) dr

5 from emp )

6 where dr <= 3

7 order by deptno, sal desc

8 /

DEPTNO ENAME SAL DR

---------- ---------- ---------- ----------

10 KING 5000 1

CLARK 2450 2

MILLER 1300 3

20 SCOTT 3000 1

FORD 3000 1

JONES 2975 2

ADAMS 1100 3

30 BLAKE 2850 1

ALLEN 1600 2

TURNER 1500 3

10 rows selected.

What we did here was to use the DENSE_RANK() function to get the top three salaries. We assigned the dense rank to the SAL column sorted descending. If you recall from above, a dense rank does not skip numbers and will assign the same number to those rows with the same value. Hence, after the result set is built in the inline view ? we can simply select all of the rows with a dense rank of three or less ? that gives us everyone who makes the top three salaries by department number. Just to show what would happen if we tried to use RANK and encountered these duplicate values:

scott@TKYTE816> select deptno, ename, sal,

2 dense_rank() over ( partition by deptno

3 order by sal desc ) dr,

4 rank() over ( partition by deptno

5 order by sal desc ) r

6 from emp

7 order by deptno, sal desc

8 /

DEPTNO ENAME SAL DR R

---------- ---------- ---------- ---------- ----------

10 KING 5000 1 1

CLARK 2450 2 2

MILLER 1300 3 3

20 SCOTT 3000 1 1

FORD 3000 1 1

JONES 2975 2 3 <<----

ADAMS 1100 3 4 <<----

SMITH 800 4 5

30 BLAKE 2850 1 1

ALLEN 1600 2 2

TURNER 1500 3 3

WARD 1250 4 4

MARTIN 1250 4 4

JAMES 950 5 6

14 rows selected.

If we had used RANK, it would have left ADAMS out of the result set ? but ADAMS is one of the people in department 20 making the top 3 salaries so he belongs in the result. In this case, using RANK over DENSE_RANK would not have answered our specific query.

dense_rank

This function computes the relative rank of each row returned from a query with respect to the other rows, based on the values of the expressions in the ORDER BY clause. The data within a group is sorted by the ORDER BY clause and then a numeric ranking is assigned to each row in turn starting with 1 and continuing on up. The rank is incremented every time the values of the ORDER BY expressions change. Rows with equal values receive the same rank (NULLS are considered equal in this comparison). A dense rank returns a ranking number without any gaps. This is as compared to RANK below.

...

rank

This function computes the relative rank of each row returned from a query with respect to the other rows, based on the values of the expressions in the ORDER BY clause. The data within a group is sorted by the ORDER BY clause and then a numeric ranking is assigned to each row in turn starting with 1 and continuing on up. Rows with the same exact values of the order by expressions receive the same rank, however if two rows do receive the same rank ? the rank numbers will subsequently "skip". If two rows are #1, there will be no #2 ? rank will assign the value of 3 to the next row in the group. This is in comparison to DENSE_RANK which does not skip values.

.....

We will use the SCOTT.EMP table for these examples. The first question we will answer is "give me the set of sales people making the top 3 salaries in each department"?

scott@TKYTE816> break on deptno skip 1

scott@TKYTE816> select *

2 from ( select deptno, ename, sal,

3 dense_rank() over ( partition by deptno

4 order by sal desc ) dr

5 from emp )

6 where dr <= 3

7 order by deptno, sal desc

8 /

DEPTNO ENAME SAL DR

---------- ---------- ---------- ----------

10 KING 5000 1

CLARK 2450 2

MILLER 1300 3

20 SCOTT 3000 1

FORD 3000 1

JONES 2975 2

ADAMS 1100 3

30 BLAKE 2850 1

ALLEN 1600 2

TURNER 1500 3

10 rows selected.

What we did here was to use the DENSE_RANK() function to get the top three salaries. We assigned the dense rank to the SAL column sorted descending. If you recall from above, a dense rank does not skip numbers and will assign the same number to those rows with the same value. Hence, after the result set is built in the inline view ? we can simply select all of the rows with a dense rank of three or less ? that gives us everyone who makes the top three salaries by department number. Just to show what would happen if we tried to use RANK and encountered these duplicate values:

scott@TKYTE816> select deptno, ename, sal,

2 dense_rank() over ( partition by deptno

3 order by sal desc ) dr,

4 rank() over ( partition by deptno

5 order by sal desc ) r

6 from emp

7 order by deptno, sal desc

8 /

DEPTNO ENAME SAL DR R

---------- ---------- ---------- ---------- ----------

10 KING 5000 1 1

CLARK 2450 2 2

MILLER 1300 3 3

20 SCOTT 3000 1 1

FORD 3000 1 1

JONES 2975 2 3 <<----

ADAMS 1100 3 4 <<----

SMITH 800 4 5

30 BLAKE 2850 1 1

ALLEN 1600 2 2

TURNER 1500 3 3

WARD 1250 4 4

MARTIN 1250 4 4

JAMES 950 5 6

14 rows selected.

If we had used RANK, it would have left ADAMS out of the result set ? but ADAMS is one of the people in department 20 making the top 3 salaries so he belongs in the result. In this case, using RANK over DENSE_RANK would not have answered our specific query.

## Rating

Is this answer out of date? If it is, please let us know via a Comment

# Comments

In your last example,

select deptno, ename, sal,

dense_rank() over ( partition by deptno

order by sal desc ) dr,

rank() over ( partition by deptno

order by sal desc ) r

from emp

order by deptno, sal desc

How do I get exactly first N records for all dept

(e.g first 4 for dept 30)?

Thanks you!

select deptno, ename, sal,

dense_rank() over ( partition by deptno

order by sal desc ) dr,

rank() over ( partition by deptno

order by sal desc ) r

from emp

order by deptno, sal desc

How do I get exactly first N records for all dept

(e.g first 4 for dept 30)?

Thanks you!

select *

from ( select a.*,

row_number() over ( partition by DEPTNO

order by SAL DESC ) rn

from emp )

where rn <= 4

order by deptno, sal desc

from ( select a.*,

row_number() over ( partition by DEPTNO

order by SAL DESC ) rn

from emp )

where rn <= 4

order by deptno, sal desc

Hi Tom, I am using the following query to rank the cpu time usage, dont care about the sql which was took too much of cpu etc it gives me the following , is my query wrong or is there something wrong with analsyis function. I am using enterpise edition release 9.0.1.1.1 SQL> SELECT CPU_TIME CPUTIME, 2 RANK() OVER (ORDER BY SUM(CPU_TIME) DESC) AS CPU 3 FROM V$SQL 4 GROUP BY CPU_TIME 5 / CPUTIME CPU ---------- ---------- 20031250 1 14968750 2 13906250 3 13359375 4 6125000 5 468750 6 500000 7 531250 8 3703125 9 546875 10 703125 11 CPUTIME CPU ---------- ---------- 453125 12 515625 13 2546875 14 578125 15 437500 16 406250 17 46875 18 484375 19 1906250 20 31250 21 593750 21 CPUTIME CPU ---------- ---------- 78125 23 562500 24 62500 25 1578125 26 15625 27 625000 28 1250000 28 140625 30 375000 30 125000 32 968750 33 CPUTIME CPU ---------- ---------- 296875 34 93750 35 421875 35 156250 37 390625 37 750000 39 734375 40 234375 41 640625 42 609375 43 109375 44 CPUTIME CPU ---------- ---------- 218750 45 359375 46 171875 47 328125 48 312500 49 281250 50 265625 51 250000 52 203125 53 187500 54 0 55 55 rows selected. SQL> SPOOL OFF

it is your use of the analytic function that is improper here. Looking at your query: SQL> SELECT CPU_TIME CPUTIME, 2 RANK() OVER (ORDER BY SUM(CPU_TIME) DESC) AS CPU 3 FROM V$SQL 4 GROUP BY CPU_TIME 5 / in english, one might say this: show me the CPU time -- removing all duplicates (select cpu_time, group by cpu_time) but, before you remove the duplicates -- sum up the cpu time by cpu time -- order the data by that and rank it. So, if you had 1,000,000 sessions that had 1 cpu second each -- you would have found the report to be: CPUTIME CPU ---------- ---------- 1 1 20031250 2 14968750 3 ... if you throw "sum(cpu_time)" into the select list, that is the number you are ranking! then it'll make more sense why "small" cpu times are at the top of the list, before larger ones. that just means you had a couple of sessions with the SAME cpu time. Now, since I'm not sure what question you are trying to answer, I cannot tell you how to write the query. It could be as simple as "lose the sum and group by"

Hi tom

I am enjoying with this and solving most of the problems.

I want to get TOP3 dept Salaries where SMITH works.

I am trying so many ways. I din't get. Can you tell how.

thanks in advance

Reddy

I am enjoying with this and solving most of the problems.

I want to get TOP3 dept Salaries where SMITH works.

I am trying so many ways. I din't get. Can you tell how.

thanks in advance

Reddy

select *

from ( select distinct sal

from emp

where deptno = ( select deptno from emp where ename = 'SMITH' )

order by 1 desc )

where rownum <= 3;

from ( select distinct sal

from emp

where deptno = ( select deptno from emp where ename = 'SMITH' )

order by 1 desc )

where rownum <= 3;

Hi Tom,

Are these Analytical functions are ANSI SQL standards or Oracle implementing by its own?

Are these Analytical functions are ANSI SQL standards or Oracle implementing by its own?

i *believe* there are draft ansi specs for some of these, yes.

not really following that much anymore. sql92 became so large, with so many levels and options that "ansi sql 92" compliance was utterly meaningless. sql99 -- 10 times more so.

not really following that much anymore. sql92 became so large, with so many levels and options that "ansi sql 92" compliance was utterly meaningless. sql99 -- 10 times more so.

Thanks Tom for all your help. Analyatic functions do Rock!

I got a request from a buisness user wanting to change the ranking to say "(Tie)" for items that are tied. i.e.

10 SCOTT 3000 1(Tie)

10 FORD 3000 1(Tie)

10 JONES 2975 3

10 ADAMS 1100 4

20 ALLEN 1600 1

20 TURNER 1500 2

20 WARD 1250 3(Tie)

20 MARTIN 1250 3(Tie)

20 JAMES 950 5

Know of any clever ways of doing this? I'm at a loss.

Thanks

I got a request from a buisness user wanting to change the ranking to say "(Tie)" for items that are tied. i.e.

10 SCOTT 3000 1(Tie)

10 FORD 3000 1(Tie)

10 JONES 2975 3

10 ADAMS 1100 4

20 ALLEN 1600 1

20 TURNER 1500 2

20 WARD 1250 3(Tie)

20 MARTIN 1250 3(Tie)

20 JAMES 950 5

Know of any clever ways of doing this? I'm at a loss.

Thanks

Tom,

I have a table T which has 1 column and 5 rows in it.

Select Col1 from T

Col1

----

5

6

7

8

5

Case 1

======

select * from

(

select rank() over(order by col1) r,

dense_rank() over(order by col1) dr,

col1

from T

)

Output

------

R DR Col1

=========

1 1 5

1 1 5

3 2 6

4 3 7

5 4 8

In the above case, the Rank() function did a skip when there was a tie whereas Dense_Rank() didn't which is expected.

Case 2

======

select * from

(

select rank() over(order by col1 desc) r,

dense_rank() over(order by col1 desc) dr,

col1

from T

)

Output

------

R DR Col1

=========

1 1 8

2 2 7

3 3 6

4 4 5

4 4 5

Why is Rank() and Dense_Rank() behaving the same in this case. The only thing I changed is to reverse the ordering of COL1 in the RANK() and DENSE_RANK() function.

I expected the R column in the last row to have a 5 instead of 4 because of a tie.

Is my understanding wrong or is it the way RANK() and DENSE_RANK() supposed to work. Please clarify.

I have a table T which has 1 column and 5 rows in it.

Select Col1 from T

Col1

----

5

6

7

8

5

Case 1

======

select * from

(

select rank() over(order by col1) r,

dense_rank() over(order by col1) dr,

col1

from T

)

Output

------

R DR Col1

=========

1 1 5

1 1 5

3 2 6

4 3 7

5 4 8

In the above case, the Rank() function did a skip when there was a tie whereas Dense_Rank() didn't which is expected.

Case 2

======

select * from

(

select rank() over(order by col1 desc) r,

dense_rank() over(order by col1 desc) dr,

col1

from T

)

Output

------

R DR Col1

=========

1 1 8

2 2 7

3 3 6

4 4 5

4 4 5

Why is Rank() and Dense_Rank() behaving the same in this case. The only thing I changed is to reverse the ordering of COL1 in the RANK() and DENSE_RANK() function.

I expected the R column in the last row to have a 5 instead of 4 because of a tie.

Is my understanding wrong or is it the way RANK() and DENSE_RANK() supposed to work. Please clarify.

think about it.

in the second case, you have a first, second, third and fourth place winner.

the difference would come into play when you went for FIFTH place. rank -- no 5th place, dense rank -- 5th place.

in the second case, you have a first, second, third and fourth place winner.

the difference would come into play when you went for FIFTH place. rank -- no 5th place, dense rank -- 5th place.

As Tom explained in his answer. insert into t values (4); SQL> select * from t; COL1 ---------- 5 5 6 7 8 4 6 rows selected. select * from ( select rank() over(order by col1 desc) r, dense_rank() over(order by col1 desc) dr, col1 from t ) / R DR COL1 ---------- ---------- ---------- 1 1 8 2 2 7 3 3 6 4 4 5 4 4 5 6 5 4 6 rows selected.

Hi Mark,

just check this query, I hope this will answer your question :

select ename, deptno, sal, rk,

decode(nvl(ldrk,0)-rk, 0, 'Tie', decode(rk-nvl(lgrk,0), 0 , 'Tie', rk) ) status

from

(

select ename, deptno, sal, rk,

lead(rk) over (partition by deptno order by rk) ldrk,

lag(rk) over (partition by deptno order by rk) lgrk

from (select ename, deptno, sal,

rank() over (partition by deptno order by sal desc) rk

from emp

)

);

ENAME DEPTNO SAL RK STATUS

---------- ---------- ---------- ---------- ---------------

KING 10 5000 1 1

CLARK 10 2450 2 2

MILLER 10 1300 3 3

FORD 20 3000 1 Tie

SCOTT 20 3000 1 Tie

JONES 20 2975 3 3

ADAMS 20 1100 4 4

SMITH 20 800 5 5

BLAKE 30 2850 1 1

ALLEN 30 1600 2 2

TURNER 30 1500 3 3

MARTIN 30 1250 4 Tie

WARD 30 1250 4 Tie

JAMES 30 950 6 6

just check this query, I hope this will answer your question :

select ename, deptno, sal, rk,

decode(nvl(ldrk,0)-rk, 0, 'Tie', decode(rk-nvl(lgrk,0), 0 , 'Tie', rk) ) status

from

(

select ename, deptno, sal, rk,

lead(rk) over (partition by deptno order by rk) ldrk,

lag(rk) over (partition by deptno order by rk) lgrk

from (select ename, deptno, sal,

rank() over (partition by deptno order by sal desc) rk

from emp

)

);

ENAME DEPTNO SAL RK STATUS

---------- ---------- ---------- ---------- ---------------

KING 10 5000 1 1

CLARK 10 2450 2 2

MILLER 10 1300 3 3

FORD 20 3000 1 Tie

SCOTT 20 3000 1 Tie

JONES 20 2975 3 3

ADAMS 20 1100 4 4

SMITH 20 800 5 5

BLAKE 30 2850 1 1

ALLEN 30 1600 2 2

TURNER 30 1500 3 3

MARTIN 30 1250 4 Tie

WARD 30 1250 4 Tie

JAMES 30 950 6 6

Tom,

I have following situation. Need your opinion.

I have a table as follows

TABLE A

Amount Rate type

1000 1.80 A

1000 1.81 A

1100 1.81 A

1000 1.90 B

1100 1.95 B

Now I want result like

Amount Rate type Dense rank Amount allocated

1000 1.81 A 1 1000

1100 1.81 A 1 1100

1000 1.80 A 2 900

1100 1.95 B 1 1100

1000 1.90 B 2 900

p.s. available quantity for type A is 3000 & type B = 2000

I can use dense rank to get rank.

But further depending on available quantity I want to Update "Amount Allocated" depending on certain business logic.

My question is What is a best way to approach??

1. Read A --> use dense rank function to populate "oracle temporary" table B & update it with business logic ?

2. Read A --> use dense rank function to populate "oracle temporary" table B & then load it into associative arrays / PL/SQL table & update that with business logic & then put results in Table C.

3. Or something else

The biggest concern is the time it's going to take .

I have less than 1 sec to process.

Record count is between 100 to 3000.

Using 9iR2.

Thanks in advance

Sam

I have following situation. Need your opinion.

I have a table as follows

TABLE A

Amount Rate type

1000 1.80 A

1000 1.81 A

1100 1.81 A

1000 1.90 B

1100 1.95 B

Now I want result like

Amount Rate type Dense rank Amount allocated

1000 1.81 A 1 1000

1100 1.81 A 1 1100

1000 1.80 A 2 900

1100 1.95 B 1 1100

1000 1.90 B 2 900

p.s. available quantity for type A is 3000 & type B = 2000

I can use dense rank to get rank.

But further depending on available quantity I want to Update "Amount Allocated" depending on certain business logic.

My question is What is a best way to approach??

1. Read A --> use dense rank function to populate "oracle temporary" table B & update it with business logic ?

2. Read A --> use dense rank function to populate "oracle temporary" table B & then load it into associative arrays / PL/SQL table & update that with business logic & then put results in Table C.

3. Or something else

The biggest concern is the time it's going to take .

I have less than 1 sec to process.

Record count is between 100 to 3000.

Using 9iR2.

Thanks in advance

Sam

having this much level of detail - well, not quite possible to comment.

I've no clue as to what amount allocated is, does, is populated or anything.

I've no clue as to what amount allocated is, does, is populated or anything.

Tom,

Thanks for taking time.

The amount allocated is calculated on business logic.

The logic uses few lookup tables apart from the one mentioned..

You can read first column as amount requested.

The amount allocated has to be derived based on business logic which reads rate column from current table & some lookup tables.

Thanks

Sam

Thanks for taking time.

The amount allocated is calculated on business logic.

The logic uses few lookup tables apart from the one mentioned..

You can read first column as amount requested.

The amount allocated has to be derived based on business logic which reads rate column from current table & some lookup tables.

Thanks

Sam

if the logic can be done in sql using case/decode, do it there.

if the logic can be made a plsql function, you may well be able to invoke that from sql so insert into whatever select dense-rank, yourfunction from whereever.

else select bulk collect the dense-rank, call function, forall i insert.

if the logic can be made a plsql function, you may well be able to invoke that from sql so insert into whatever select dense-rank, yourfunction from whereever.

else select bulk collect the dense-rank, call function, forall i insert.

What is best way to code ??

In the case where we need to look at 2 or more rows at a given time. So I guess function may not be available for me.

What I mean is in my above case I need to look at more than one row in table A to arrive at a decision to update column Amount allocated.

Need more than one row from table A to take decision.

The decision logic is business logic which make use of additional lookup tables. Should I also get those in memory ?

What is best way to approach

1. fetch rows one by one & put it in variables & do calculation ??

2. Get all required rows in array & then traverse through array to get info & make decision??

As I said earlier volume is not much.

Atmost 1000 rows to process.

Time is very crucial here. So what is best practice??

In the case where we need to look at 2 or more rows at a given time. So I guess function may not be available for me.

What I mean is in my above case I need to look at more than one row in table A to arrive at a decision to update column Amount allocated.

Need more than one row from table A to take decision.

The decision logic is business logic which make use of additional lookup tables. Should I also get those in memory ?

What is best way to approach

1. fetch rows one by one & put it in variables & do calculation ??

2. Get all required rows in array & then traverse through array to get info & make decision??

As I said earlier volume is not much.

Atmost 1000 rows to process.

Time is very crucial here. So what is best practice??

lag/lead -- scalar subqueries, sql is pretty powerful.

I gave you the three approaches in order of preference above.

I gave you the three approaches in order of preference above.

Hi Tom,

I need to convert following SQL into simple SQL as the database I am using is not having dense_rank and cube function.

Here is my SQL:

select mkt, bnd, brow, part_name, max(u) as UU, sum(p) as PV from

(select decode(grouping(mkt_code),3,'All mkts',mkt_code) as mkt,

decode(grouping(bnd_name),3,'All bnds', bnd_name) as bnd,

decode(grouping(brow_type),3,'All brows',brow_type) as brow,

decode(grouping(part_name),3,'All parts', part_name) as part_name,

DENSE_RANK() OVER (PARTITION BY mkt_code, bnd_name, brow_type, part_name ORDER BY user_key) as u,

count(*) as p from

paperview pf, dateview dd, marketview md, bnd_dimension bd, brow_dimension sd, part_dimension secd

where pf.date_key=dd.date_key

and dd.fulldate='01-FEB-2000'

and pf.mkt_key=md.mkt_key

and pf.bnd_key=bd.bnd_key

and pf.brow_key=sd.brow_key

and pf.part_key=secd.part_key

and pf.spider_key=9

and pf.user_key > 19

group by cube (mkt_code,bnd_name,brow_type,part_name), user_key)

group by mkt, bnd, brow, part_name

Pls help me converting it if possible ?

Thanks

I need to convert following SQL into simple SQL as the database I am using is not having dense_rank and cube function.

Here is my SQL:

select mkt, bnd, brow, part_name, max(u) as UU, sum(p) as PV from

(select decode(grouping(mkt_code),3,'All mkts',mkt_code) as mkt,

decode(grouping(bnd_name),3,'All bnds', bnd_name) as bnd,

decode(grouping(brow_type),3,'All brows',brow_type) as brow,

decode(grouping(part_name),3,'All parts', part_name) as part_name,

DENSE_RANK() OVER (PARTITION BY mkt_code, bnd_name, brow_type, part_name ORDER BY user_key) as u,

count(*) as p from

paperview pf, dateview dd, marketview md, bnd_dimension bd, brow_dimension sd, part_dimension secd

where pf.date_key=dd.date_key

and dd.fulldate='01-FEB-2000'

and pf.mkt_key=md.mkt_key

and pf.bnd_key=bd.bnd_key

and pf.brow_key=sd.brow_key

and pf.part_key=secd.part_key

and pf.spider_key=9

and pf.user_key > 19

group by cube (mkt_code,bnd_name,brow_type,part_name), user_key)

group by mkt, bnd, brow, part_name

Pls help me converting it if possible ?

Thanks

thats too bad. I mean, not everything is possible in "simple sql"

this is one of them.

well, two of them.

you'll have to write a report generator (pretty boring stuff...). you won't do this in "simple sql" whatever that is.

this is one of them.

well, two of them.

you'll have to write a report generator (pretty boring stuff...). you won't do this in "simple sql" whatever that is.

DEPTNO ENAME SAL DR

---------- ---------- ---------- ----------

10 KING 5000 1

CLARK 2450 2

MILLER 1300 3

20 SCOTT 3000 1 <-

FORD 3000 1 <-

JONES 2975 2

ADAMS 1100 3

30 BLAKE 2850 1

ALLEN 1600 2

TURNER 1500 3

Based on your above example of EMP table, how can I make sure that Scott and Ford in DEPTNO 20 get sequential rank even though they are tie. Therefore, Scott would be first if he is first occurrence in table and Ford is second and so on.

We are trying to address a bug by updating 13 million rows assigning sequential rank to items within each group based on its position within each group and there are no other discriminating criteria.

We use Oracle 9.2

Thank you for your time.

---------- ---------- ---------- ----------

10 KING 5000 1

CLARK 2450 2

MILLER 1300 3

20 SCOTT 3000 1 <-

FORD 3000 1 <-

JONES 2975 2

ADAMS 1100 3

30 BLAKE 2850 1

ALLEN 1600 2

TURNER 1500 3

Based on your above example of EMP table, how can I make sure that Scott and Ford in DEPTNO 20 get sequential rank even though they are tie. Therefore, Scott would be first if he is first occurrence in table and Ford is second and so on.

We are trying to address a bug by updating 13 million rows assigning sequential rank to items within each group based on its position within each group and there are no other discriminating criteria.

We use Oracle 9.2

Thank you for your time.

are you looking for row_number()?? it does 1, 2, 3, 4, ....

SELECT deptno, ename, sal, next_sal, prev_sal,

DECODE(sal, next_sal, '(Tie)', prev_sal, '(Tie)') tie

FROM (

SELECT deptno, ename, sal,

LAG(sal) OVER (PARTITION BY deptno

ORDER BY sal DESC) next_sal,

LEAD(sal) OVER (PARTITION BY deptno

ORDER BY sal DESC) prev_sal

FROM scott.emp

);

DEPTNO ENAME SAL NEXT_SAL PREV_SAL TIE

--------- ---------- --------- --------- --------- -----

10 KING 5000 2450

CLARK 2450 5000 1300

MILLER 1300 2450

20 SCOTT 3000 3000 (Tie)

FORD 3000 3000 2975 (Tie)

JONES 2975 3000 1100

ADAMS 1100 2975 800

SMITH 800 1100

30 BLAKE 2850 1600

ALLEN 1600 2850 1500

TURNER 1500 1600 1250

WARD 1250 1500 1250 (Tie)

MARTIN 1250 1250 950 (Tie)

JAMES 950 1250

DECODE(sal, next_sal, '(Tie)', prev_sal, '(Tie)') tie

FROM (

SELECT deptno, ename, sal,

LAG(sal) OVER (PARTITION BY deptno

ORDER BY sal DESC) next_sal,

LEAD(sal) OVER (PARTITION BY deptno

ORDER BY sal DESC) prev_sal

FROM scott.emp

);

DEPTNO ENAME SAL NEXT_SAL PREV_SAL TIE

--------- ---------- --------- --------- --------- -----

10 KING 5000 2450

CLARK 2450 5000 1300

MILLER 1300 2450

20 SCOTT 3000 3000 (Tie)

FORD 3000 3000 2975 (Tie)

JONES 2975 3000 1100

ADAMS 1100 2975 800

SMITH 800 1100

30 BLAKE 2850 1600

ALLEN 1600 2850 1500

TURNER 1500 1600 1250

WARD 1250 1500 1250 (Tie)

MARTIN 1250 1250 950 (Tie)

JAMES 950 1250

hallo tom

how to get top 2 salary from a emp table per each deptno?

how to get top 2 salary from a emp table per each deptno?

sort of an ambigous query - but I'll answer in the spirit of the question:

scott@ORA10G> l

1 select distinct deptno, sal

2 from (

3 select deptno, sal, dense_rank() over (partition by deptno order by sal DESC nulls last) dr

4 from emp

5 )

6* where dr <= 2

scott@ORA10G> /

DEPTNO SAL

---------- ----------

10 5000

10 2450

20 3000

20 2975

30 2850

30 1600

6 rows selected.

Those are the top two salaries by deptno...

scott@ORA10G> l

1 select distinct deptno, sal

2 from (

3 select deptno, sal, dense_rank() over (partition by deptno order by sal DESC nulls last) dr

4 from emp

5 )

6* where dr <= 2

scott@ORA10G> /

DEPTNO SAL

---------- ----------

10 5000

10 2450

20 3000

20 2975

30 2850

30 1600

6 rows selected.

Those are the top two salaries by deptno...

Hi Tom,

I need to create a report from a table(10million recs) that has a "house_points" column. Beginning with the highest house_points value (order by house_points desc), I need to create groups of 100 records each and sum(house_points) and get the max(house_points), min(house_points) for each group. Any ideas?

Any advice is greatly appreciated.

Thanks,

Ami

I need to create a report from a table(10million recs) that has a "house_points" column. Beginning with the highest house_points value (order by house_points desc), I need to create groups of 100 records each and sum(house_points) and get the max(house_points), min(house_points) for each group. Any ideas?

Any advice is greatly appreciated.

Thanks,

Ami

select min(..) , max(...)

from (

select a.*, trunc((rownum-1)/100) grp

from (select * from t order by house_points desc)

)

group by grp;

from (

select a.*, trunc((rownum-1)/100) grp

from (select * from t order by house_points desc)

)

group by grp;

I found a strange behavior of dense_rank function. I believe it's a bug and is only reproducible on Oracle 11g.

SQL> SQL> SELECT * FROM v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production PL/SQL Release 11.2.0.1.0 - Production CORE 11.2.0.1.0 Production TNS for 64-bit Windows: Version 11.2.0.1.0 - Production NLSRTL Version 11.2.0.1.0 - Production SQL> SQL> DROP TABLE b; Table dropped. SQL> DROP TABLE a; Table dropped. SQL> SQL> CREATE TABLE a ( 2 a_id INTEGER, 3 Name VARCHAR2(5) 4 ); Table created. SQL> SQL> CREATE TABLE b ( 2 b_id INTEGER, 3 a_id INTEGER, 4 Status INTEGER 5 ); Table created. SQL> SQL> INSERT INTO a 2 SELECT level, 'A' || level 3 FROM dual 4 CONNECT BY level <= 10; 10 rows created. SQL> SQL> -- --------------------------------------------------------------------------- SQL> -- Procedure for populating table b. SQL> -- SQL> -- The input parameter determines the ratio of the number of records in b to SQL> -- that of table a. SQL> -- --------------------------------------------------------------------------- SQL> CREATE OR REPLACE PROCEDURE Populate_b ( 2 p_num_of_b_per_a INTEGER 3 ) 4 AS 5 l_cnt INTEGER := 0; 6 7 CURSOR c IS 8 SELECT a_id 9 FROM a; 10 11 BEGIN 12 DELETE FROM b; 13 14 FOR x IN c LOOP 15 FOR i IN 1..p_num_of_b_per_a LOOP 16 l_cnt := l_cnt + 1; 17 18 INSERT INTO b VALUES (l_cnt, x.a_id, i); 19 END LOOP; 20 END LOOP; 21 22 END; 23 / Procedure created. SQL> SQL> -- --------------------------------------------------------------------------- SQL> -- correct result SQL> -- --------------------------------------------------------------------------- SQL> exec Populate_b(1); PL/SQL procedure successfully completed. SQL> SQL> SELECT name, dr 2 FROM ( 3 SELECT Name, 4 row_number() OVER (PARTITION BY a.Name ORDER BY b.Status) rn, 5 dense_rank() OVER (ORDER BY a.Name) dr 6 FROM a, b 7 WHERE a.a_id = b.a_id 8 ) 9 WHERE dr <= 20 AND 10 rn = 1; NAME DR ----- ---------- A1 1 A10 2 A2 3 A3 4 A4 5 A5 6 A6 7 A7 8 A8 9 A9 10 10 rows selected. SQL> SQL> -- --------------------------------------------------------------------------- SQL> -- Expect to get 10 records, but get 7 instead. Gets worse as we increase SQL> -- the ratio of records between table a and b. SQL> -- --------------------------------------------------------------------------- SQL> exec Populate_b(3); PL/SQL procedure successfully completed. SQL> SQL> SELECT name, dr 2 FROM ( 3 SELECT Name, 4 row_number() OVER (PARTITION BY a.Name ORDER BY b.Status) rn, 5 dense_rank() OVER (ORDER BY a.Name) dr 6 FROM a, b 7 WHERE a.a_id = b.a_id 8 ) 9 WHERE dr <= 20 AND 10 rn = 1; NAME DR ----- ---------- A1 1 A10 2 A2 3 A3 4 A4 5 A5 6 A6 7 7 rows selected. SQL> SQL> exec Populate_b(5); PL/SQL procedure successfully completed. SQL> SQL> SELECT name, dr 2 FROM ( 3 SELECT Name, 4 row_number() OVER (PARTITION BY a.Name ORDER BY b.Status) rn, 5 dense_rank() OVER (ORDER BY a.Name) dr 6 FROM a, b 7 WHERE a.a_id = b.a_id 8 ) 9 WHERE dr <= 20 AND 10 rn = 1; NAME DR ----- ---------- A1 1 A10 2 A2 3 A3 4 A4 5 SQL> SQL> exec Populate_b(10); PL/SQL procedure successfully completed. SQL> SQL> SELECT name, dr 2 FROM ( 3 SELECT Name, 4 row_number() OVER (PARTITION BY a.Name ORDER BY b.Status) rn, 5 dense_rank() OVER (ORDER BY a.Name) dr 6 FROM a, b 7 WHERE a.a_id = b.a_id 8 ) 9 WHERE dr <= 20 AND 10 rn = 1; NAME DR ----- ---------- A1 1 A10 2 A2 3 SQL> SQL> exec Populate_b(20); PL/SQL procedure successfully completed. SQL> SQL> SELECT name, dr 2 FROM ( 3 SELECT Name, 4 row_number() OVER (PARTITION BY a.Name ORDER BY b.Status) rn, 5 dense_rank() OVER (ORDER BY a.Name) dr 6 FROM a, b 7 WHERE a.a_id = b.a_id 8 ) 9 WHERE dr <= 20 AND 10 rn = 1; NAME DR ----- ---------- A1 1 A10 2 SQL> SQL> SQL> -- --------------------------------------------------------------------------- SQL> -- Convert the dense_rank to number and the result is correct. This is very SQL> -- bizarre! SQL> -- --------------------------------------------------------------------------- SQL> SELECT name, dr 2 FROM ( 3 SELECT Name, 4 row_number() OVER (PARTITION BY a.Name ORDER BY b.Status) rn, 5 dense_rank() OVER (ORDER BY a.Name) dr 6 FROM a, b 7 WHERE a.a_id = b.a_id 8 ) 9 WHERE TO_NUMBER(dr) <= 20 AND 10 rn = 1; NAME DR ----- ---------- A1 1 A10 2 A2 3 A3 4 A4 5 A5 6 A6 7 A7 8 A8 9 A9 10 10 rows selected. SQL> SQL> SQL> spool off

definitely a wrong answer.

do you have access to support? If not, let me know and I'll file the bug.

do you have access to support? If not, let me know and I'll file the bug.

Cannot reproduce in 11.1.0.7 (32-bit Windows).

I reproduced in 11.2.0.2 on Linux

SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production PL/SQL Release 11.2.0.2.0 - Production CORE 11.2.0.2.0 Production TNS for IBM/AIX RISC System/6000: Version 11.2.0.2.0 - Production NLSRTL Version 11.2.0.2.0 - Production SQL> create table test (c number, n number); Table created. SQL> insert into test values (1,1); 1 row created. SQL> insert into test values (1,2); 1 row created. SQL> insert into test values (2,1); 1 row created. SQL> insert into test values (2,2); 1 row created. SQL> commit; Commit complete. SQL> -- Correct SQL> select * 2 from (select t.c, 3 t.n, 4 dense_rank() over(order by t.n) dr 5 from test t) 6 where dr <= 2; C N DR ---------- ---------- ---------- 1 1 1 2 1 1 1 2 2 2 2 2 SQL> -- Wrong SQL> select * 2 from (select t.c, 3 t.n, 4 dense_rank() over(partition by t.n order by t.c) rn, 5 dense_rank() over(order by t.n) dr 6 from test t) 7 where dr <= 2; C N RN DR ---------- ---------- ---------- ---------- 1 1 1 1 2 1 2 1 1 2 1 2 SQL> select * 2 from (select t.c, 3 t.n, 4 row_number() over(partition by t.n order by t.c) rn, 5 dense_rank() over(order by t.n) dr 6 from test t) 7 where dr <= 2; C N RN DR ---------- ---------- ---------- ---------- 1 1 1 1 2 1 2 1 1 2 1 2 SQL> select * 2 from (select t.c, 3 t.n, 4 lag(t.n) over(partition by t.n order by t.c) rn, 5 dense_rank() over(order by t.n) dr 6 from test t) 7 where dr <= 2; C N RN DR ---------- ---------- ---------- ---------- 1 1 1 2 1 1 1 1 2 2 SQL> select * 2 from (select t.c, 3 t.n, 4 rank() over(partition by t.n order by t.c) rn, 5 dense_rank() over(order by t.n) dr 6 from test t) 7 where dr <= 2; C N RN DR ---------- ---------- ---------- ---------- 1 1 1 1 2 1 2 1 1 2 1 2

I went ahead and filed bug 12670181 for this

Hello Tom,

create table t1_objs as select object_type, object_name from all_objects

create index idx_t1_objs on t1_objs (object_type )

Query 1 :

select decode(row_number() over (partition by object_type order by object_name),1,object_type,'') object_type,object_name

from t1_objs

where object_type in ('TABLE','TRIGGER','INDEX')

order by object_name, object_type

Query 2 :

select object_type, object_name,

dense_rank() over ( partition by object_type

order by object_name desc ) dr,

rank() over ( partition by object_type

order by object_name desc ) r

from t1_objs

where object_type in ('TABLE','TRIGGER','INDEX')

order by object_type, object_name

Using Query 1, I can avoid displaying of object_type if it's gets reapeatd. How can I acheieve the same through Query 2?

I have seen you have already given the example but that can only be run in SQLPlus. I need to execute in the PL/SQL which I'll be sending as ref cursor.

create table t1_objs as select object_type, object_name from all_objects

create index idx_t1_objs on t1_objs (object_type )

Query 1 :

select decode(row_number() over (partition by object_type order by object_name),1,object_type,'') object_type,object_name

from t1_objs

where object_type in ('TABLE','TRIGGER','INDEX')

order by object_name, object_type

Query 2 :

select object_type, object_name,

dense_rank() over ( partition by object_type

order by object_name desc ) dr,

rank() over ( partition by object_type

order by object_name desc ) r

from t1_objs

where object_type in ('TABLE','TRIGGER','INDEX')

order by object_type, object_name

Using Query 1, I can avoid displaying of object_type if it's gets reapeatd. How can I acheieve the same through Query 2?

I have seen you have already given the example but that can only be run in SQLPlus. I need to execute in the PL/SQL which I'll be sending as ref cursor.

I am trying to understand analytics, to confirm my understandings are correct I want to know that:-

If OVER (.... ORDER BY.) have no ties and have only unique rows then there is no differrence between the results of ROWNUMBER, RANK and DENS_RANK functions, correct?

Thanks

If OVER (.... ORDER BY.) have no ties and have only unique rows then there is no differrence between the results of ROWNUMBER, RANK and DENS_RANK functions, correct?

Thanks

Correct - if sorting on unique values these all give the same results

Thanks for your reply.

Now if the above is correct , wouldnt it be better to have one function with 1 parameter which can have 3 values only, instead of 3 seperate functions. ie

ROW_NUMBER(ROW_NUMBER|RANK|DENSE_RANK) ,and ROW_NUBER being the default)

Row_number = row_number() = row_number(row_number)

Rank = row_number(RANK)

DENSE_RANK = row_number(DENSE_RSNK)

Now if the above is correct , wouldnt it be better to have one function with 1 parameter which can have 3 values only, instead of 3 seperate functions. ie

ROW_NUMBER(ROW_NUMBER|RANK|DENSE_RANK) ,and ROW_NUBER being the default)

Row_number = row_number() = row_number(row_number)

Rank = row_number(RANK)

DENSE_RANK = row_number(DENSE_RSNK)

How is this better than the current approach with different functions?

Hmmmm, it doesnt make much diffference and I think none of them is better than the other, but for some people, they like seperate functions, and some other people like one function with 1 parameter of 3 values.

Again, this is standard SQL functionality. You'll have to come up with a

**very**compelling reason to add a parameter that affects the behaviour of these for it to happen.