Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Shalu.

Asked: November 08, 2002 - 12:53 pm UTC

Last updated: April 13, 2004 - 8:13 am UTC

Version: 9.2.0.1.0

Viewed 1000+ times

You Asked

Hi,

I need to calculate Weighted Percentile as one of the columns in a SQL Query. I am using this query to generate a report
using Reports Builder.


INSTR_ID UPB CUSIP OAD1 OAD2 DIFF
10 500 cusip1 .03 .01 .02
20 100 cusip2 .09 .04 .05
30 200 cusip3 .03 .00 .03
40 200 cusip4 .05 .02 .03
50 200 cusip5 .09 .07 .02
60 400 cusip6 .05 .03 .02

where DIFF is the difference in OAD1 and OAD2.

I am using Oracle 9i Rel 2, I was going thru the Analytic Functions provided by this release, I found two functions very
close but still I feel that they won't serve my purpose. They are PERCENTILE_CONT( ) and PERCENTILE_DISC( ), because I need
to calculate the Weighted Percentile.

For the above data the Weighted Percentile should be:

INSTR_ID UPB DIFF Weighted_Percentile
======================================
10 500 .02 0 ==>(0/1600), numerator is 0 because there is no instr having diff <.02
20 400 .02 0 ==>(0/1600), numerator is 0 because there is no instr having diff <.02
30 100 .02 0 ==>(0/1600), numerator is 0 because there is no instr having diff <.02
40 200 .03 .625 ==>(500+400+100/1600), numerator is sum of UPBS of instr havng diff < .03
50 300 .03 .625 ==>(500+400+100/1600), numerator is sum of UPBS of instr havng diff < .03
60 100 .05 .9375 ==>(500+400+100+200+300/1600), numer is sum of all UPBs having diff < .05

Where denominator is always sum of all the UPBs, which is 1600 in this case. And Numerator is the sum of all the UPBs of
all those instruments which have DIFF less than the DIFF of the current Instrument. So, to do this we should sort the records
on DIFF and use some function to calculate Weighted Percentile.

Thanks!!
Shalu

and Tom said...

your exmaple is wrong -- it should have 500+400+200 when diff < .03.

Here is one way to get it (i assume that diff isn't really stored but derived so I did it that way)

ps$tkyte@ORA920.US.ORACLE.COM> select instr_id, upb, (oad1-oad2) diff,
2 (sum(upb) over ( order by oad1-oad2 )
3 -sum(upb) over ( order by oad1-oad2 range between 0 preceding and 0 following )) /
4 sum(upb) over () weighted_pct
5 from t;

INSTR_ID UPB DIFF WEIGHTED_PCT
---------- ---------- ---------- ------------
10 500 .02 0
50 200 .02 0
60 400 .02 0
30 200 .03 .6875
40 200 .03 .6875
20 100 .05 .9375

6 rows selected.



Rating

  (6 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Shalu Aggarwal, November 08, 2002 - 2:03 pm UTC

Hi, When I tried running my query after making the changes I am getting the following error:

ORA-30483 Window functions are not allowed here.

I am using this query in Reports Builder.
In case you want to take a look at my query it is here, you can just check the WEIGHTED_PERCENTILE column.

Select
c.name,
nvl(c.upb,0) upb,
nvl(abs(c.upb),0) abs_upb,
Power( nvl(c.upb,0) , 2) sqr_upb,
nvl(c.upb,0) * ( NVL(&p_col1,0) - NVL(&p_col2,0)) mean_diff,
ABS( nvl(c.upb,0) * ( NVL(&p_col1,0) - NVL(&p_col2,0) ) ) mean_abs_diff,
POWER ( nvl(c.upb,0) * ( NVL(&p_col1,0) - NVL(&p_col2,0) ), 2) root_mean_sqr_diff,
nvl(c.upb,0) * CASE WHEN nvl(&p_col1,0) = 0 Then 0 ELSE round(((&p_col1 - nvl(&p_col2,0) )/&p_col1),6) END mean_diff_per,
ABS( nvl(c.upb,0) * CASE WHEN nvl(&p_col1,0) = 0 Then 0 ELSE round(((&p_col1 - nvl(&p_col2,0) )/&p_col1),6) END ) mean_abs_diff_per,
POWER ( nvl(c.upb,0) * ( CASE WHEN nvl(&p_col1,0) = 0 Then 0 ELSE round(((&p_col1 - nvl(&p_col2,0) )/&p_col1),6) END ), 2) root_mean_sqr_diff_per,
c.product_type product_type,
c.ORIGINAL_TERM original_term,
c.wam wam,
c.wac wac ,
c.coupon coupon,
c.cusip,
( Sum(nvl(c.upb,0)) OVER ( ORDER BY ABS( CASE WHEN nvl(&p_col1,0) = 0 Then 0 ELSE ((&p_col1 - nvl(&p_col2,0) )/&p_col1) END) ) -
Sum(nvl(c.upb,0) OVER ( ORDER BY ABS( CASE WHEN nvl(&p_col1,0) = 0 Then 0 ELSE ((&p_col1 - nvl(&p_col2,0) )/&p_col1) END) RANGE BETWEEN 0 PRECEDING AND 0 FOLLOWING) ) /SUM(nvl(c.upb,0)) WEIGHTED_PERCENTILE,
NVL(&p_col1,0) - NVL(&p_col2,0) diff,
ABS( NVL(&p_col1,0) - NVL(&p_col2,0) ) abs_diff,
ABS( CASE WHEN nvl(&p_col1,0) = 0 Then 0 ELSE ((&p_col1 - nvl(&p_col2,0) )/&p_col1) END ) abs_diff_per,
CASE WHEN nvl(&p_col1,0) = 0 Then 0 ELSE round(((&p_col1 - nvl(&p_col2,0) )/&p_col1),6) END diff_per,
decode(&p_col1, 0, 'NaN', null, 'Missing', &p_col1) ref_col,
decode(&p_col2 ,0, 'NaN', null, 'Missing', &p_col2) test_col
From VN_INSTRUMENT c, VN_RESULT a, VN_RESULT b
Where
c.instr_id = a.instr_id and
a.instr_id = b.instr_id and
a.group_id = 645 and
b.group_id = 655
Order By ABS( CASE WHEN nvl(&p_col1,0) = 0 Then 0 ELSE ((&p_col1 - nvl(&p_col2,0) )/&p_col1) END )

Tom Kyte
November 08, 2002 - 2:20 pm UTC

did you try it in sqlplus -- you would need to give me a test case (create table statement and as SMALL of a query as humanly possible).

be glad to try and help you -- however, you have to make it easy for me to do so.

Shalu Aggarwal, November 08, 2002 - 3:28 pm UTC

Hi,

I figured it out. It was not working at my end because I had NVLs in my query. When i removed all the NVLs it started working, otherwise on SQL Plus also I was getting the same ORA error.

Thanks so much for your help, it is really great. But I would like to know few things:
1. Does it take care of NULLs ?
2. What will it do if sum(upb) in the denominator for some reason is null or zero ?
3. Can I use CASE statement inside it ?

Thanks,
Shalu

Tom Kyte
November 08, 2002 - 3:38 pm UTC

you can "order by whatever NULLS LAST" or "NULLS FIRST" to control where they might pop up. In general -- the functions ignore nulls -- sum() will for example.

If the sum(upb) is NULL or ZERO -- you'll get an error -- so just decode or CASE around that problem.

I don't know -- inside "what" exactly?

Shalu Aggarwal, November 08, 2002 - 3:41 pm UTC

Hi,

I am getting the same ORA error if I use CASE statement or NVL inside it. I need to use CASE or NVL to make sure that 1.) denominator is never zero
2.) a.oad is never zero since I am dividing (a.oad-b.oad) by a.oad.

Look at the following query sample: WEIGHT column works fine here but not WEIGHT1, actually I want to use it as I have used for WEIGHT1 column to handle nulls and zeros.


Select
c.name,
nvl(c.upb,0) upb,
( Sum(c.upb) OVER ( ORDER BY ABS( (a.oad - b.oad)/a.oad) ) -
Sum(c.upb) OVER ( ORDER BY ABS( (a.oad - b.oad)/a.oad)
RANGE BETWEEN 0 PRECEDING AND 0 FOLLOWING) ) /SUM(c.upb) OVER ( ) WEIGHT,
( Sum(c.upb) OVER ( ORDER BY ABS( CASE WHEN a.oad = 0 Then 0 ELSE ((a.oad - b.oad )/a.oad) END )
Sum(c.upb) OVER ( ORDER BY ABS( CASE WHEN a.oad = 0 Then 0 ELSE ((a.oad - b.oad )/a.oad) END )
RANGE BETWEEN 0 PRECEDING AND 0 FOLLOWING) ) /SUM(c.upb) OVER ( ) WEIGHT1,
ABS( (a.oad - b.oad)/a.oad) abs_diff_per
From VN_INSTRUMENT c, VN_RESULT a, VN_RESULT b
Where
c.instr_id = a.instr_id and
a.instr_id = b.instr_id and
a.group_id = 654 and
b.group_id = 655
Order By ABS( CASE WHEN nvl(a.oad,0) = 0 Then 0 ELSE ((a.oad - nvl(b.oad,0) )/a.oad) END )

Shall await your reply.

Thanks!
Shalu

Tom Kyte
November 08, 2002 - 4:57 pm UTC

you () are totally messed up -- but I have no problems using decode with analytics:



ops$tkyte@ORA920.US.ORACLE.COM> select * from t;

  INSTR_ID        UPB CUSIP            OAD1       OAD2
---------- ---------- ---------- ---------- ----------
        10        500 cusip1            .03        .01
        20        100 cusip2            .09        .04
        30        200 cusip3            .03          0
        40        200 cusip4            .05        .02
        50        200 cusip5            .09        .07
        60        400 cusip6            .05        .03
        60        400 cusip6              0          0

7 rows selected.

ops$tkyte@ORA920.US.ORACLE.COM> Select
  2   instr_id,
  3   nvl(upb,0) upb,
  4   (Sum(upb) OVER ( ORDER BY  ABS( decode( oad1, 0, 0, (oad1 - oad2)/oad1)) )  -
  5             Sum(upb) OVER ( ORDER BY  ABS( decode(oad1, 0, 0, (oad1 - oad2)/oad1))
  6                             RANGE BETWEEN 0 PRECEDING AND 0 FOLLOWING) ) /
  7             decode(SUM(upb) OVER (), 0, 0, sum(upb) over()) WEIGHT1,
  8           ABS( decode(oad1, 0, 0, (oad1 - oad2)/oad1)) abs_diff_per
  9   From  t
 10   Order By ABS( CASE WHEN nvl(oad1,0) = 0 Then 0 ELSE ((oad1 - nvl(oad2,0) )/oad1) END )
 11  /

  INSTR_ID        UPB    WEIGHT1 ABS_DIFF_PER
---------- ---------- ---------- ------------
        60        400          0            0
        50        200         .2   .222222222
        60        400         .3           .4
        20        100         .5   .555555556
        40        200        .55           .6
        10        500        .65   .666666667
        30        200         .9            1

7 rows selected.
 

A reader, November 08, 2002 - 3:59 pm UTC

I tried using decode, NVL and CASE statement, it gives same ORA error.

See if use decode as follows, I get the same error:

Select
c.name,
nvl(c.upb,0) upb,
( Sum(c.upb) OVER ( ORDER BY ABS( decode(a.oad,0,0, (a.oad - b.oad )/a.oad ) -
Sum(c.upb) OVER ( ORDER BY ABS( decode(a.oad,0,0, (a.oad - b.oad )/a.oad )
RANGE BETWEEN 0 PRECEDING AND 0 FOLLOWING) ) /SUM(c.upb) OVER ( ) WEIGHT
From VN_INSTRUMENT c, VN_RESULT a, VN_RESULT b
Where
c.instr_id = a.instr_id and
a.instr_id = b.instr_id and
a.group_id = 654 and
b.group_id = 655
Order By ABS( CASE WHEN nvl(a.oad,0) = 0 Then 0 ELSE ((a.oad - nvl(b.oad,0) )/a.oad) END )
/

It is not allowing any CASE/NVL/decode, is there any other alternative ??



A reader, November 08, 2002 - 5:36 pm UTC

Really. It's working fine now.

Thanks, so much!!
Shalu

Similar usage of percentile_disc in oracle 8i

Vivek Sharma, April 13, 2004 - 7:02 am UTC

Dear Tom,

I was going through your book "Effective Oracle by Design" and in Chapter 3 you had demonstrated a usage of percentile_disc package available in Oracle 9i. Is this funcationality available in Oacle 8i. We have some big batch processes which can be parallelized using the logic as demonstrated by you but we have an Oracle 8i Database. Is there any package similar to percentile_disc available in Oracle 8i ? if yes, can you please demonstrate the usage. It will definetely help me improve the response time of some big processes.

Thanks and Regards
Vivek Sharma

Tom Kyte
April 13, 2004 - 8:13 am UTC

read down a paragraph where it states "in chapter 8, we'll take another look at DIY parallelism. You'll see we can accomplish the same goal using ROWID ranges which are even more efficient than the index key approach shown here"...


look at chapter 8

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.