Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Jack.

Asked: April 12, 2004 - 1:14 pm UTC

Last updated: September 29, 2022 - 7:40 am UTC

Version: 9.2.0

Viewed 50K+ times! This question is

You Asked

Tom,

I am trying to calculate mean, median, and mode of salary within each department, and display it on all rows (corresponding to the department). It's fine to ignore rows with a null salary. I believe I've got mean and median, but I need help with mode. By "mode", I mean the most frequently occurring value for salary within each department. I know I can do this with a self join, but I'm looking for an analytic solution (for performance evaluation). Pretty basic, but I haven't been able to find the answer - the word "mode" is too common. Please assist. Thanks.

- Jack



SQL> -----------------------
SQL> -- Create sample table.
SQL> -----------------------
SQL>
SQL> create table temp_01_tb
2 as
3 select 1 dept, 1 salary from dual
4 union all
5 select 1 dept, 1 salary from dual
6 union all
7 select 1 dept, 1 salary from dual
8 union all
9 select 1 dept, 2 salary from dual
10 union all
11 select 1 dept, 2 salary from dual
12 union all
13 select 1 dept, 8 salary from dual
14 union all
15 select 2 dept, 1 salary from dual
16 union all
17 select 2 dept, 1 salary from dual
18 union all
19 select 2 dept, 2 salary from dual
20 union all
21 select 2 dept, 2 salary from dual
22 union all
23 select 2 dept, null salary from dual
24 union all
25 select 2 dept, 3 salary from dual
26 ;

Table created.

SQL>
SQL> ---------------------
SQL> -- View sample table.
SQL> ---------------------
SQL>
SQL> select * from temp_01_tb;

DEPT SALARY
---------- ----------
1 1
1 1
1 1
1 2
1 2
1 8
2 1
2 1
2 2
2 2
2
2 3

12 rows selected.

SQL>
SQL> -----------------------------------
SQL> -- Show all rows, plus mean, median
SQL> -- and mode of salary within dept.
SQL> -----------------------------------
SQL>
SQL> select dept,
2 salary,
3 avg(salary) over (partition by dept) mean_sal,
4 percentile_disc (0.5) within group (order by salary)
5 over (partition by dept) median_sal,
6 ' ?' mode_sal
7 from temp_01_tb
8 ;

DEPT SALARY MEAN_SAL MEDIAN_SAL MODE_SAL
---------- ---------- ---------- ---------- --------
1 1 2.5 1 ?
1 1 2.5 1 ?
1 1 2.5 1 ?
1 2 2.5 1 ?
1 2 2.5 1 ?
1 8 2.5 1 ?
2 1 1.8 2 ?
2 1 1.8 2 ?
2 2 1.8 2 ?
2 2 1.8 2 ?
2 3 1.8 2 ?
2 1.8 2 ?

12 rows selected.



and Tom said...

well, what if there are two (or more) "most frequently occurring salaries"? You'll have to figure out what you want to do, but....


ops$tkyte@ORA9IR2> select dept, salary, mean_sal, median_sal,
2 first_value(case when cnt_sal > 1 then salary end)
3 over (partition by dept order by cnt_sal desc) mode_sal
4 from (
5 select dept,
6 salary,
7 avg(salary) over (partition by dept) mean_sal,
8 percentile_disc (0.5) within group
9 (order by salary) over (partition by dept) median_sal,
10 count(salary) over ( partition by dept, salary) cnt_sal
11 from temp_01_tb
12 )
13 ;

DEPT SALARY MEAN_SAL MEDIAN_SAL MODE_SAL
---------- ---------- ---------- ---------- ----------
1 1 2.5 1 1
1 1 2.5 1 1
1 1 2.5 1 1
1 2 2.5 1 1
1 2 2.5 1 1
1 8 2.5 1 1
2 1 1.8 2 1
2 1 1.8 2 1
2 2 1.8 2 1
2 2 1.8 2 1
2 3 1.8 2 1
2 1.8 2 1

12 rows selected.

This shows why i have the case -- technically -- if the count is 1 for all, the set has no "mode". you can change that to suite your needs. Here deptno 10 has no "mode" value


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select dept, salary, mean_sal, median_sal,
2 first_value(case when cnt_sal > 1 then salary end)
3 over (partition by dept order by cnt_sal desc) mode_sal
4 from (
5 select dept,
6 salary,
7 avg(salary) over (partition by dept) mean_sal,
8 percentile_disc (0.5) within group
9 (order by salary) over (partition by dept) median_sal,
10 count(salary) over ( partition by dept, salary) cnt_sal
11 from (select deptno dept, sal salary from emp) temp_01_tb
12 )
13 ;

DEPT SALARY MEAN_SAL MEDIAN_SAL MODE_SAL
---------- ---------- ---------- ---------- ----------
10 1300 2916.66667 2450
10 2450 2916.66667 2450
10 5000 2916.66667 2450
20 3000 2175 2975 3000
20 3000 2175 2975 3000
20 800 2175 2975 3000
20 1100 2175 2975 3000
20 2975 2175 2975 3000
30 1250 1566.66667 1250 1250
30 1250 1566.66667 1250 1250
30 950 1566.66667 1250 1250
30 1500 1566.66667 1250 1250
30 1600 1566.66667 1250 1250
30 2850 1566.66667 1250 1250

14 rows selected.


Rating

  (38 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Perfect, thanks!

Jack, April 13, 2004 - 11:45 am UTC

Tom,

That's perfect, thanks.  What I need to do here is fill in a few missing salaries (with values that would not be unusual).  Now that I know how to get mode, here's what I came up with:  Fill in missing salaries with a value not unusual for the dept.  If salary is null for a whole dept, fill with a value not unusual for the whole report.  I chose to break ties for mode by further ordering by rowid.

Thanks for your help.

- Jack


SQL> select dept,
  2          salary,
  3          decode(salary,
  4             null,nvl(mode_sal_dept,mode_sal_report)
  5            ) fill
  6  from (
  7        select dept,
  8           salary,
  9           first_value(salary) over
 10              (partition by dept
 11              order by cnt_dept desc, rowid) mode_sal_dept,
 12           first_value(salary) over
 13              (order by cnt_report desc, rowid) mode_sal_report
 14        from (
 15             select dept,
 16                salary,
 17                count(salary) over (partition by dept, salary) cnt_dept,
 18                count(salary) over (partition by salary) cnt_report
 19             from temp_01_tb
 20             )
 21        )
 22  order by 1,2
 23  ;

      DEPT     SALARY       FILL                  
---------- ---------- ----------                  
        10          2                             
        10          2                             
        10          8                             
        10          9                             
        10          9                             
        10          9                             
        20          2                             
        20          2                             
        20          3                             
        20          4                             
        20          4                             
        20                     4                  
        30                     2                  
        30                     2                  

14 rows selected.
 

median

mini, April 29, 2004 - 6:41 pm UTC

This works with Oracle 9 and up. I want to do calculate median in a single select statement in 8i. Can you help me do that.

Thx

Tom Kyte
April 30, 2004 - 6:58 am UTC

select deptno,
avg(sal),
cnt
from (
select deptno,
sal,
count(*) over (partition by deptno) cnt,
row_number() over(partition by deptno order by sal) rn
from emp
)
where (mod(cnt,2) = 1 and rn = ceil(cnt/2) )
or (mod(cnt,2) = 0 and rn in ( cnt/2, cnt/2+1 ) )
group by deptno, cnt
/


Calculate Median, Avg, Min and Max

Mini, May 13, 2004 - 6:03 pm UTC

This works for calculating the median. Thanks. Now I need to write a query which calculates average, minimum, max and median and count in a single select statement.This is in Oracle 8i.
Thanks

Tom Kyte
May 14, 2004 - 9:59 am UTC

scott@ORA9IR2> select deptno,
2 avg(sal),
3 avg_sal,
4 min_sal,
5 max_sal
6 cnt
7 from (
8 select deptno,
9 sal,
10 count(*) over (partition by deptno) cnt,
11 row_number() over(partition by deptno order by sal) rn,
12 avg(sal) over (partition by deptno) avg_sal,
13 min(sal) over (partition by deptno) min_sal,
14 max(sal) over (partition by deptno) max_sal
15 from emp
16 )
17 where (mod(cnt,2) = 1 and rn = ceil(cnt/2) )
18 or (mod(cnt,2) = 0 and rn in ( cnt/2, cnt/2+1 ) )
19 group by deptno, cnt, avg_sal, min_sal, max_sal
20 /

DEPTNO AVG(SAL) AVG_SAL MIN_SAL CNT
---------- ---------- ---------- ---------- ----------
10 2450 2916.66667 1300 5000
20 2975 2175 800 3000
30 1375 1566.66667 950 2850



is one method

mean, median, mode

Mini, May 14, 2004 - 12:54 pm UTC

Thanks for the quick response. Now I have another problem. I have to compute avg, min, max, median for two differnt columns from the same table in one single select.

Tom Kyte
May 15, 2004 - 11:57 am UTC

scott@ORA9IR2> select deptno,
2 count(*),
3 percentile_cont(0.5) within group (order by sal) med_sal,
4 avg(sal),
5 min(sal),
6 max(sal),
7 percentile_cont(0.5) within group (order by nvl(comm,0)) med_comm,
8 avg(nvl(comm,0)),
9 min(nvl(comm,0)),
10 max(nvl(comm,0))
11 from emp
12 group by deptno;

DEPTNO COUNT(*) MED_SAL AVG(SAL) MIN(SAL) MAX(SAL) MED_COMM AVG(NVL(COMM,0)) MIN(NVL(COMM,0)) MAX(NVL(COMM,0))
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- ---------------- ----------------
10 3 2450 2916.66667 1300 5000 0 0 0 0
20 5 2975 2175 800 3000 0 0 0 0
30 6 1375 1566.66667 950 2850 150 366.666667 0 1400

scott@ORA9IR2>
scott@ORA9IR2> select deptno,
2 avg(
3 case when (mod(cnt,2) = 1 and rn = ceil(cnt/2) )
4 or (mod(cnt,2) = 0 and rn in ( cnt/2, cnt/2+1 ) )
5 then sal
6 end
7 ) med_sal,
8 avg_sal,
9 min_sal,
10 max_sal,
11 avg(
12 case when (mod(cnt,2) = 1 and rn2 = ceil(cnt/2) )
13 or (mod(cnt,2) = 0 and rn2 in ( cnt/2, cnt/2+1 ) )
14 then comm
15 end
16 ) med_comm,
17 avg_comm,
18 min_comm,
19 max_comm
20 cnt
21 from (
22 select deptno,
23 sal, nvl(comm,0) comm,
24 count(*) over (partition by deptno) cnt,
25 avg(sal) over (partition by deptno) avg_sal,
26 min(sal) over (partition by deptno) min_sal,
27 max(sal) over (partition by deptno) max_sal,
28 row_number() over(partition by deptno order by sal) rn,
29 row_number() over(partition by deptno order by nvl(comm,0)) rn2,
30 avg(nvl(comm,0)) over (partition by deptno) avg_comm,
31 min(nvl(comm,0)) over (partition by deptno) min_comm,
32 max(nvl(comm,0)) over (partition by deptno) max_comm
33 from emp
34 )
35 group by deptno, cnt, avg_sal, min_sal, max_sal, avg_comm, min_comm, max_comm
36 /

DEPTNO MED_SAL AVG_SAL MIN_SAL MAX_SAL MED_COMM AVG_COMM MIN_COMM CNT
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
10 2450 2916.66667 1300 5000 0 0 0 0
20 2975 2175 800 3000 0 0 0 0
30 1375 1566.66667 950 2850 150 366.666667 0 1400


Brilliant!

VA, January 19, 2005 - 11:51 am UTC


not useful

Dylan, March 18, 2005 - 3:26 pm UTC

this was no use to me and my work!!

Tom Kyte
March 18, 2005 - 7:57 pm UTC

brilliant, guess you don't need a mean, median or mode.

so, why'd you read the page in the first place one might ask....

How to calculate overall median.

sekhar, April 19, 2005 - 7:48 pm UTC

Tom,

You have shown that median, avg are calculated for each department.

How do I calculate the median for all the departments together instead of calculating median for each department.

thanks
sekhar


Tom Kyte
April 19, 2005 - 8:06 pm UTC

don't partition by anything.

multiple analytics at once

Dave, May 24, 2006 - 10:58 am UTC

I see that you are able to use multiple analytic functions in your examples but I have one that is not working for me and I don't understand why not. I dumbed it down to just pull from dual and it is still not giving me numbers in all the columns. I used this same code last year when we were running on 9i instead of 10g and I think it worked then. Any ideas?

SELECT institution_id, school_level, subject_area_code, mean_score,
(TRUNC(cume_dist() OVER (PARTITION BY school_level, subject_area_code ORDER BY mean_score) * 100,2)) AS percentile,
(percentile_disc (0.6) WITHIN GROUP
(ORDER BY mean_score) OVER (PARTITION BY school_level, subject_area_code)) AS per60d,
1
FROM (
SELECT institution_id, school_level, subject_area_code, AVG(mean_score) mean_score
FROM
(
SELECT 1 institution_id, 1 school_level, 1 subject_area_code, 1 mean_score
FROM dual
UNION
SELECT 1 institution_id, 1 school_level, 1 subject_area_code, 2 mean_score
FROM dual)
GROUP BY institution_id, school_level, subject_area_code
)

Tom Kyte
May 25, 2006 - 7:05 am UTC

you have a perfect example for support, please file a bug on this one - all they need to do is run it in 9i and 10g and they'll see something is wrong.

Wow!

Kevin Shidler, May 25, 2006 - 3:26 pm UTC

You know, I don't use mean, modes, and medians at work either, but I'm still amazed!! Excellent!

multiple analytics at once

Dave, May 26, 2006 - 9:38 am UTC

Thanks Tom. I thought maybe I was goofing it up.

Mean,mode,median

Dave Fowler, September 12, 2006 - 6:55 pm UTC

Greate examples, As always....Analytics rule.

Perfect for needs as well.

Display count (*)

Cari, September 18, 2006 - 10:19 pm UTC

Hi

I want to display tha with sql:

ID CODE MESSAGE COUNT(CODE)
1 ORA-02289 Sequence does not exist 3
ORA-00054 Resource busy 14
ORA-00942 Table or view does't exist 5
2 ORA-01001 Invalid cursor 1
.. .. ..

I do it sql query

SELECT decode(rn1,1,id)ID,
decode(rn2,1,code)CODE,message,cnt
FROM (
SELECT id,code,message,count(code) over (partition by code)cnt,
row_number() over (partition by id order by code,message,count(code)) rn1,
row_number() over (partition by id,code order by code,message,count(code)) rn2
from error
group by id,code,message);


But did not show really count.

Would you please showme where is the error ?

Thanks in advanced

Tom Kyte
September 19, 2006 - 2:25 pm UTC

well, given that I don't have your tables installed for some reason, I cannot really do much here...

Reply to Dylan

REJI, September 19, 2006 - 2:36 pm UTC

Tom - Excellent technical tips. I spend 30 minutes everyday to visit this site to learn something new and never logged out without learning something new.

Dylan - I don't think Tom's job is to do your job. This is a place where people like us ask some advise or help and Tom with his busy schedule find the time to address them. So, if it doesn't help you then keep quite or learn something from the posting.

Display count (*)

Cari, September 20, 2006 - 9:25 pm UTC

Hi

Sorry Tom.

That is the table and many inserts.

create table err
(ID number,
CODE varchar2(10),
MESSAGE varchar2(200),
fdate date);

insert into err values(1,'ORA-02289','Sequence does not exist',sysdate);
insert into err values(1,'ORA-017','Invalid username/password',sysdate);
insert into err values(1,'ORA-00942','Table or view exist',sysdate);
insert into err values(2,'ORA-017','Invalid username/password',sysdate);
insert into err values(2,'ORA-017','Invalid username/password',sysdate);
insert into err values(3,'ORA-0056','User invalid',sysdate);

Would you please help how deploy when count (*) ?


Tom Kyte
September 21, 2006 - 2:09 am UTC

well, I have no idea how this output:

ID CODE MESSAGE COUNT(CODE)
1 ORA-02289 Sequence does not exist 3
ORA-00054 Resource busy 14
ORA-00942 Table or view does't exist 5
2 ORA-01001 Invalid cursor 1

it supposed to be generated from that input. count(code) is what exactly????

Here's a guess

Jack Mulhollan, September 21, 2006 - 6:29 pm UTC

SQL> col id for 99
SQL> col message for a30
SQL> break on report
SQL> compute sum of count(*) on report
SQL>
SQL> select id, code, message, count(*)
  2  from err
  3  group by id, code, message
  4  order by 1
  5  ;

 ID CODE       MESSAGE                        COUNT(*)
--- ---------- ------------------------------ --------
  1 ORA-00942  Table or view exist                   1
  1 ORA-017    Invalid username/password             1
  1 ORA-02289  Sequence does not exist               1
  2 ORA-017    Invalid username/password             2
  3 ORA-0056   User invalid                          1
                                              --------
sum                                                  6
 

Running Median

Mark Campbell, September 26, 2007 - 11:56 am UTC

I am trying to figure out how to determine a running median. I have Oracle Enterprise 10.2.0.1.0.

Each day I record how long it took to perform a particular procedure. I am able to calculate the overall average and a running average over the past 5 days and 10 days. I can also calculate the overall median, but I am not able to figure out how to do a running median over the past 5 and 10 days.

I have tried the median analytic function in 10gR2, but it seems it will only let me perform the median over a partition, but what I really want is the median over a window like I can do for the average.

Below is sample data and the SQL and results for the calculations that I am able to perform.

Any suggestions?
Thanks

create table t1 (
run_date date,
elapsed_time number);
insert into t1 values (to_date('09/26/2007','MM/DD/YYYY'),8.23);
insert into t1 values (to_date('09/25/2007','MM/DD/YYYY'),7.41);
insert into t1 values (to_date('09/24/2007','MM/DD/YYYY'),8.66);
insert into t1 values (to_date('09/23/2007','MM/DD/YYYY'),6.58);
insert into t1 values (to_date('09/22/2007','MM/DD/YYYY'),6.18);
insert into t1 values (to_date('09/21/2007','MM/DD/YYYY'),7.33);
insert into t1 values (to_date('09/20/2007','MM/DD/YYYY'),6.23);
insert into t1 values (to_date('09/19/2007','MM/DD/YYYY'),7.02);
insert into t1 values (to_date('09/18/2007','MM/DD/YYYY'),6.28);
insert into t1 values (to_date('09/17/2007','MM/DD/YYYY'),7.23);
insert into t1 values (to_date('09/16/2007','MM/DD/YYYY'),6.36);
insert into t1 values (to_date('09/15/2007','MM/DD/YYYY'),6.36);
insert into t1 values (to_date('09/14/2007','MM/DD/YYYY'),7.18);
insert into t1 values (to_date('09/13/2007','MM/DD/YYYY'),6.28);
insert into t1 values (to_date('09/12/2007','MM/DD/YYYY'),7.21);
insert into t1 values (to_date('09/11/2007','MM/DD/YYYY'),6.33);
insert into t1 values (to_date('09/10/2007','MM/DD/YYYY'),6.50);
insert into t1 values (to_date('09/9/2007','MM/DD/YYYY'), 6.28);
insert into t1 values (to_date('09/8/2007','MM/DD/YYYY'), 6.26);
insert into t1 values (to_date('09/7/2007','MM/DD/YYYY'), 6.33);
insert into t1 values (to_date('09/6/2007','MM/DD/YYYY'), 5.26);
insert into t1 values (to_date('09/5/2007','MM/DD/YYYY'), 6.15);
insert into t1 values (to_date('09/4/2007','MM/DD/YYYY'), 5.11);
insert into t1 values (to_date('09/3/2007','MM/DD/YYYY'), 6.13);
insert into t1 values (to_date('09/2/2007','MM/DD/YYYY'), 5.23);
commit;

select run_date, 
       elapsed_time,        
       PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY elapsed_time) OVER () median_perc_disc_method,
       median(elapsed_time) over () median_method, 
       AVG (elapsed_time) OVER () avg_time,
       AVG (elapsed_time)   OVER (ORDER BY run_date DESC RANGE BETWEEN CURRENT ROW AND 4 FOLLOWING)     avg_time5,
       AVG (elapsed_time)   OVER (ORDER BY run_date DESC RANGE BETWEEN CURRENT ROW AND 9 FOLLOWING)     avg_time10
from t1
order by 1 desc

run_date  elapsed_time median_perc_disc_method  median_method  avg_time   avg_time5  avg_time10
9/26/2007 8.23   6.33   6.33   6.5648   7.412  7.115    
9/25/2007 7.41   6.33   6.33   6.5648   7.232  6.928    
9/24/2007 8.66   6.33   6.33   6.5648   6.996  6.823    
9/23/2007 6.58   6.33   6.33   6.5648   6.668  6.675    
9/22/2007 6.18   6.33   6.33   6.5648   6.608  6.645    
9/21/2007 7.33   6.33   6.33   6.5648   6.818  6.748    
9/20/2007 6.23   6.33   6.33   6.5648   6.624  6.648    
9/19/2007 7.02   6.33   6.33   6.5648   6.65   6.675    
9/18/2007 6.28   6.33   6.33   6.5648   6.682  6.601    
9/17/2007 7.23   6.33   6.33   6.5648   6.682  6.599    
9/16/2007 6.36   6.33   6.33   6.5648   6.678  6.509    
9/15/2007 6.36   6.33   6.33   6.5648   6.672  6.399    
9/14/2007 7.18   6.33   6.33   6.5648   6.7    6.378    
9/13/2007 6.28   6.33   6.33   6.5648   6.52   6.171    
9/12/2007 7.21   6.33   6.33   6.5648   6.516  6.156    
9/11/2007 6.33   6.33   6.33   6.5648   6.34   5.958    
9/10/2007  6.5   6.33   6.33   6.5648   6.126  5.9167   
9/9/2007  6.28   6.33   6.33   6.5648   6.056  5.845    
9/8/2007  6.26   6.33   6.33   6.5648   5.822  5.7814   
9/7/2007  6.33   6.33   6.33   6.5648   5.796  5.7016   
9/6/2007  5.26   6.33   6.33   6.5648   5.576  5.576    
9/5/2007  6.15   6.33   6.33   6.5648   5.655  5.655    
9/4/2007  5.11   6.33   6.33   6.5648   5.49   5.49     
9/3/2007  6.13   6.33   6.33   6.5648   5.68   5.68     
9/2/2007  5.23   6.33   6.33   6.5648   5.23   5.23  

Running Median

Mark Campbell, September 27, 2007 - 10:20 am UTC

I figured out how to "derive" a partition (by using a truncated row_number) for each group of 5 records (except for the first group where I only have 4) for which I can calculate the median.

So now I can answer a similar question, but not quite exactly what I want.
What I can answer - "For the 5 most recent 5-day periods , what is the median elapsed time for that 5-day period?"
What I want to answer - "For each day, what is the median elapsed time for that day and the 4 previous days?"

Not sure why I am getting different results for the last two columns - it seems the median function is giving me the true median for the set of 4 records where the truncated row_number is 0.

with a as (
select run_date, 
       elapsed_time,  
       row_number() over (order by run_date desc) rn,       
       trunc((row_number() over (order by run_date desc))/5) rn2,
       PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY elapsed_time) OVER () perc_disc_method,
       median(elapsed_time) over () median_method
from t1
order by 1 desc
)
select a.*, 
       PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY elapsed_time) OVER (partition by rn2 ) run_med,
       median(elapsed_time) over (partition by rn2) run_med1
from a
order by run_date desc

run_date  elapsed_time rn  rn2 perc_disc_method  median_method run_med  run_med1  
9/26/2007  8.23   1  0  6.33  6.33  7.41  7.82 
9/25/2007  7.41   2  0  6.33  6.33  7.41  7.82 
9/24/2007  8.66   3  0  6.33  6.33  7.41  7.82 
9/23/2007  6.58   4  0  6.33  6.33  7.41  7.82 
9/22/2007  6.18   5  1  6.33  6.33  6.28  6.28 
9/21/2007  7.33   6  1  6.33  6.33  6.28  6.28 
9/20/2007  6.23   7  1  6.33  6.33  6.28  6.28 
9/19/2007  7.02   8  1  6.33  6.33  6.28  6.28 
9/18/2007  6.28   9  1  6.33  6.33  6.28  6.28 
9/17/2007  7.23  10  2  6.33  6.33  6.36  6.36 
9/16/2007  6.36  11  2  6.33  6.33  6.36  6.36 
9/15/2007  6.36  12  2  6.33  6.33  6.36  6.36 
9/14/2007  7.18  13  2  6.33  6.33  6.36  6.36 
9/13/2007  6.28  14  2  6.33  6.33  6.36  6.36 
9/12/2007  7.21  15  3  6.33  6.33  6.33  6.33 
9/11/2007  6.33  16  3  6.33  6.33  6.33  6.33 
9/10/2007  6.5   17  3  6.33  6.33  6.33  6.33 
9/9/2007   6.28  18  3  6.33  6.33  6.33  6.33 
9/8/2007   6.26  19  3  6.33  6.33  6.33  6.33 
9/7/2007   6.33  20  4  6.33  6.33  6.13  6.13 
9/6/2007   5.26  21  4  6.33  6.33  6.13  6.13 
9/5/2007   6.15  22  4  6.33  6.33  6.13  6.13 
9/4/2007   5.11  23  4  6.33  6.33  6.13  6.13 
9/3/2007   6.13  24  4  6.33  6.33  6.13  6.13 
9/2/2007   5.23  25  5  6.33  6.33  5.23  5.23 


Tom Kyte
September 28, 2007 - 3:46 pm UTC

you are getting a different median from percentile_disc because disc is short for discrete - it returns an actual observed value, percentile_cont would return the same as median.

Here is one possible approach:

ops$tkyte%ORA10GR2> create or replace type myType as table of number
  2  /

Type created.

ops$tkyte%ORA10GR2> column data format a50
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with d
  2  as
  3  (
  4  select run_date,
  5         myType( elapsed_time,
  6                     lag(elapsed_time,1) over (order by run_date),
  7                     lag(elapsed_time,2) over (order by run_date),
  8                     lag(elapsed_time,3) over (order by run_date),
  9                     lag(elapsed_time,4) over (order by run_date) )  data
 10    from t1
 11  )
 12  select run_date, data,
 13         (select percentile_cont(0.5) within group (order by column_value)
 14                from TABLE( data ) ) x
 15   from d
 16  /

RUN_DATE  DATA                                                        X
--------- -------------------------------------------------- ----------
02-SEP-07 MYTYPE(5.23, NULL, NULL, NULL, NULL)                     5.23
03-SEP-07 MYTYPE(6.13, 5.23, NULL, NULL, NULL)                     5.68
04-SEP-07 MYTYPE(5.11, 6.13, 5.23, NULL, NULL)                     5.23
05-SEP-07 MYTYPE(6.15, 5.11, 6.13, 5.23, NULL)                     5.68
06-SEP-07 MYTYPE(5.26, 6.15, 5.11, 6.13, 5.23)                     5.26
07-SEP-07 MYTYPE(6.33, 5.26, 6.15, 5.11, 6.13)                     6.13
08-SEP-07 MYTYPE(6.26, 6.33, 5.26, 6.15, 5.11)                     6.15
09-SEP-07 MYTYPE(6.28, 6.26, 6.33, 5.26, 6.15)                     6.26
10-SEP-07 MYTYPE(6.5, 6.28, 6.26, 6.33, 5.26)                      6.28
11-SEP-07 MYTYPE(6.33, 6.5, 6.28, 6.26, 6.33)                      6.33
12-SEP-07 MYTYPE(7.21, 6.33, 6.5, 6.28, 6.26)                      6.33
13-SEP-07 MYTYPE(6.28, 7.21, 6.33, 6.5, 6.28)                      6.33
14-SEP-07 MYTYPE(7.18, 6.28, 7.21, 6.33, 6.5)                       6.5
15-SEP-07 MYTYPE(6.36, 7.18, 6.28, 7.21, 6.33)                     6.36
16-SEP-07 MYTYPE(6.36, 6.36, 7.18, 6.28, 7.21)                     6.36
17-SEP-07 MYTYPE(7.23, 6.36, 6.36, 7.18, 6.28)                     6.36
18-SEP-07 MYTYPE(6.28, 7.23, 6.36, 6.36, 7.18)                     6.36
19-SEP-07 MYTYPE(7.02, 6.28, 7.23, 6.36, 6.36)                     6.36
20-SEP-07 MYTYPE(6.23, 7.02, 6.28, 7.23, 6.36)                     6.36
21-SEP-07 MYTYPE(7.33, 6.23, 7.02, 6.28, 7.23)                     7.02
22-SEP-07 MYTYPE(6.18, 7.33, 6.23, 7.02, 6.28)                     6.28
23-SEP-07 MYTYPE(6.58, 6.18, 7.33, 6.23, 7.02)                     6.58
24-SEP-07 MYTYPE(8.66, 6.58, 6.18, 7.33, 6.23)                     6.58
25-SEP-07 MYTYPE(7.41, 8.66, 6.58, 6.18, 7.33)                     7.33
26-SEP-07 MYTYPE(8.23, 7.41, 8.66, 6.58, 6.18)                     7.41

25 rows selected.

Running Median

Mark, October 01, 2007 - 11:49 am UTC

Thank you very much.

question: moving median

MannyJ, September 25, 2008 - 4:27 pm UTC

Hi Tom,
As always, Thank you! for the wonderful stuff you do on asktom.

Do you know of a way to calculate "moving median" simliar to hyperion/essbases' MovingMed function, using Oracle 10g analytic functions?

Here is the definition and data set example (in hyperion's doc):
http://download.oracle.com/docs/cd/E12032_01/doc/epm.921/html_ir_web_client/frameset.htm?/docs/cd/E12032_01/doc/epm.921/html_ir_web_client/irwebclient-15-77.html

Thanks,
Manny

Tom Kyte
September 25, 2008 - 4:54 pm UTC

how big of a window do you want or truly need?

Please disregard the "moving medium" question.

MannyJ, September 25, 2008 - 4:33 pm UTC

I think the "running medium" question gave me what I need.

6 months median

Balachandra, July 23, 2010 - 9:45 am UTC

Tom,
I am trying to solve a question where the requirement is to derive a running 180 day median .The data is colected multiple time in a day in short multiple records for same day. So how to modify the query?
Tom Kyte
July 23, 2010 - 9:51 am UTC

I don't see a create table or inserts to test with, one would sort of need that.

Fair Enough

Balachandra, July 23, 2010 - 11:05 am UTC

Tom,
I am only posting a section of data since the actual data is 6K.
CREATE TABLE T2(CCOMPONENT VARCHAR2(50),SAMPLED_DATE DATE ,OBSERVED_VALUE NUMBER(30,5));
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/03/2009 08:00','MM/DD/YYYY HH24:MI'), .08);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/03/2009 14:02','MM/DD/YYYY HH24:MI'), .06);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/03/2009 20:00','MM/DD/YYYY HH24:MI'), .07);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/04/2009 08:30','MM/DD/YYYY HH24:MI'), .2);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/04/2009 14:20','MM/DD/YYYY HH24:MI'), .15);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/04/2009 20:00','MM/DD/YYYY HH24:MI'), .05);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/05/2009 07:57','MM/DD/YYYY HH24:MI'), .05);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/05/2009 13:30','MM/DD/YYYY HH24:MI'), .17);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/05/2009 20:00','MM/DD/YYYY HH24:MI'), .07);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/06/2009 07:54','MM/DD/YYYY HH24:MI'), .2);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/06/2009 13:30','MM/DD/YYYY HH24:MI'), .31);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/06/2009 20:00','MM/DD/YYYY HH24:MI'), .15);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/07/2009 08:15','MM/DD/YYYY HH24:MI'), .29);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/07/2009 13:55','MM/DD/YYYY HH24:MI'), .27);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/07/2009 20:00','MM/DD/YYYY HH24:MI'), .2);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/08/2009 08:12','MM/DD/YYYY HH24:MI'), 0);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/08/2009 13:42','MM/DD/YYYY HH24:MI'), .37);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/08/2009 20:00','MM/DD/YYYY HH24:MI'), .15);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/09/2009 08:10','MM/DD/YYYY HH24:MI'), .4);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/09/2009 13:26','MM/DD/YYYY HH24:MI'), .29);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/09/2009 20:00','MM/DD/YYYY HH24:MI'), .2);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/10/2009 08:05','MM/DD/YYYY HH24:MI'), .29);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/10/2009 13:45','MM/DD/YYYY HH24:MI'), .11);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/10/2009 20:00','MM/DD/YYYY HH24:MI'), .11);
........................................

INSERT INTO T2 VALUES ('Hg', TO_DATE('03/28/2010 07:53','MM/DD/YYYY HH24:MI'), 1.3);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/28/2010 13:34','MM/DD/YYYY HH24:MI'), .5);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/28/2010 20:00','MM/DD/YYYY HH24:MI'), .55);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/29/2010 08:22','MM/DD/YYYY HH24:MI'), 1.38);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/29/2010 14:12','MM/DD/YYYY HH24:MI'), .77);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/29/2010 20:00','MM/DD/YYYY HH24:MI'), .21);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/30/2010 08:29','MM/DD/YYYY HH24:MI'), .82);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/30/2010 14:03','MM/DD/YYYY HH24:MI'), .33);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/30/2010 20:00','MM/DD/YYYY HH24:MI'), .18);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/31/2010 08:15','MM/DD/YYYY HH24:MI'), 2.04);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/31/2010 13:56','MM/DD/YYYY HH24:MI'), .57);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/31/2010 20:00','MM/DD/YYYY HH24:MI'), .3);
COMMIT;


Tom Kyte
July 23, 2010 - 12:34 pm UTC

now we need details on what you expect to see from this data and a reason why you expect to see that - like a specification.

not saying we'll be able to do it - but one needs a test case and a detailed explanation of what needs to be done. Your data doesn't resemble the original question at all.

Specification

Balachandra, July 23, 2010 - 3:09 pm UTC

The specification is a compliance determination
Please report the maximum of 6-month median for the reporting month (which is March 2010 in our case).
The 6-month median shall apply as a moving median of daily values for any 180 day period in which daily values represent concentrations within a 24 hour period.

I was trying to adapt your solution to Mark Campbell where you presented 5 day median

Tom Kyte
July 23, 2010 - 4:18 pm UTC

yeah, but his example had an observation per day - so we could group things easily - you do not. And the median/percentile functions do not support range windows so I don't think we can do this in straight analytic SQL.


Your problem appears very different from theirs, but it is hard to say because your specification isn't really a specification. I think you want to take each row in the input set - find all of the records within +/- 180 days of that row, compute the median value for that set of rows. We could do that with correlated subqueries, is this the result you were looking for?

ops$tkyte%ORA10GR2> select ccomponent, sampled_date, observed_value,
  2         (select median(observed_value)
  3                from t2 x
  4                   where x.sampled_date between A.trunc_sd-180 and A.trunc_sd+180
  5                     and x.ccomponent = A.ccomponent ) med
  6    from (select t2.*, trunc(sampled_date) trunc_sd from t2) a
  7  /

CC SAMPLED_D OBSERVED_VALUE        MED
-- --------- -------------- ----------
Hg 03-SEP-09            .08        .16
Hg 03-SEP-09            .06        .16
Hg 03-SEP-09            .07        .16
Hg 04-SEP-09             .2        .16
Hg 04-SEP-09            .15        .16
Hg 04-SEP-09            .05        .16
Hg 05-SEP-09            .05        .16
Hg 05-SEP-09            .17        .16
Hg 05-SEP-09            .07        .16
Hg 06-SEP-09             .2        .16
Hg 06-SEP-09            .31        .16
Hg 06-SEP-09            .15        .16
Hg 07-SEP-09            .29        .16
Hg 07-SEP-09            .27        .16
Hg 07-SEP-09             .2        .16
Hg 08-SEP-09              0        .16
Hg 08-SEP-09            .37        .16
Hg 08-SEP-09            .15        .16
Hg 09-SEP-09             .4        .16
Hg 09-SEP-09            .29        .16
Hg 09-SEP-09             .2        .16
Hg 10-SEP-09            .29        .16
Hg 10-SEP-09            .11        .16
Hg 10-SEP-09            .11        .16
Hg 28-MAR-10            1.3        .56
Hg 28-MAR-10             .5        .56
Hg 28-MAR-10            .55        .56
Hg 29-MAR-10           1.38        .56
Hg 29-MAR-10            .77        .56
Hg 29-MAR-10            .21        .56
Hg 30-MAR-10            .82        .56
Hg 30-MAR-10            .33        .56
Hg 30-MAR-10            .18        .56
Hg 31-MAR-10           2.04        .56
Hg 31-MAR-10            .57        .56
Hg 31-MAR-10             .3        .56

36 rows selected.

Excellent

Balachandra, July 23, 2010 - 5:39 pm UTC

I tried this query

SELECT DISTINCT w.ccomponent,
b.ss,
MEDIAN(w.formatted_entry) OVER (PARTITION BY b.ss) mdn
FROM t2 w,
(SELECT DISTINCT TRUNC(w1.sampled_date) ss
FROM t2 w1
WHERE w1.ccomponent = 'Hg'
AND TRUNC(w1.sampled_date) BETWEEN TO_DATE('03/1/2010', 'mm/dd/yyyy') AND TO_DATE('03/31/2010', 'mm/dd/yyyy')) b
WHERE w.ccomponent = 'Hg'
AND TRUNC(w.sampled_date) BETWEEN b.ss - 179 AND b.ss
ORDER BY 2

Hg 3/1/2010 0.15
Hg 3/2/2010 0.15
Hg 3/3/2010 0.15
Hg 3/4/2010 0.15
Hg 3/5/2010 0.14
Hg 3/6/2010 0.14
Hg 3/7/2010 0.14
Hg 3/8/2010 0.14
Hg 3/9/2010 0.14
Hg 3/10/2010 0.14
Hg 3/11/2010 0.14
Hg 3/12/2010 0.14
Hg 3/13/2010 0.14
Hg 3/14/2010 0.14
Hg 3/15/2010 0.14
Hg 3/16/2010 0.14
Hg 3/17/2010 0.145
Hg 3/18/2010 0.15
Hg 3/19/2010 0.15
Hg 3/20/2010 0.15
Hg 3/21/2010 0.15
Hg 3/22/2010 0.15
Hg 3/23/2010 0.16
Hg 3/24/2010 0.16
Hg 3/25/2010 0.16
Hg 3/26/2010 0.16
Hg 3/27/2010 0.17
Hg 3/28/2010 0.17
Hg 3/29/2010 0.18
Hg 3/30/2010 0.18
Hg 3/31/2010 0.18

A reader, April 26, 2012 - 1:08 pm UTC

I am find median of three columns ( in my case qty1 , qty2 , qty3 ) . Is there any easier way to accommodation this ?. Looking at the documentation , we do have median function ; but that returns value across the records ( interrecord not intra record) .

I am on 10g release 2 .


SQL> drop table test_med purge ;

Table dropped.

SQL>
SQL> create table test_med (
  2  col1 integer ,
  3  qty1 number ,
  4  qty2 number ,
  5  qty3 number )
  6  ;

Table created.

SQL>
SQL> insert into test_med values ( 1 , 1 , 2 , 3) ;

1 row created.

SQL> insert into test_med values ( 2 , 9 , 4 , 3) ;

1 row created.

SQL> insert into test_med values ( 3 , 1 , 3 , 2) ;

1 row created.

SQL> insert into test_med values ( 5 , 5 , 5 , 5) ;

1 row created.

SQL> insert into test_med values ( 6 , 5 , 6 , 6) ;

1 row created.

SQL>
SQL> Select col1 , qty1 , qty2 , qty3
  2  ,
  3  Case
  4  When least( qty1 , qty2 , qty3)  = greatest( qty1 , qty2 , qty3) then qty1
  5  When   least( qty1 , qty2 , qty3)  < qty1 and   qty1 < greatest( qty1 , qty2 , qty3) then qty1
  6  When least( qty1 , qty2 , qty3)  < qty2  and   qty2 < greatest( qty1 , qty2 , qty3) then qty2
  7  When least( qty1 , qty2 , qty3)  < qty3  and   qty3 < greatest( qty1 , qty2 , qty3) then qty3
  8  else null end as median
  9  from test_med;

      COL1       QTY1       QTY2       QTY3     MEDIAN
---------- ---------- ---------- ---------- ----------
         1          1          2          3          2
         2          9          4          3          4
         3          1          3          2          2
         5          5          5          5          5
         6          5          6          6

SQL>


Note : The last record returns null . So the above case is not 100% working.
Tom Kyte
April 26, 2012 - 1:36 pm UTC

ops$tkyte%ORA10GR2> select col1, median(column_value)
  2    from test_med, table(
  3                   cast(
  4                   multiset( select decode(level,1,qty1,2,qty2,3,qty3)
  5                               from dual
  6                            connect by level <= 3)
  7                     as sys.odciNumberList ) )
  8   group by col1
  9  /

      COL1 MEDIAN(COLUMN_VALUE)
---------- --------------------
         1                    2
         2                    4
         3                    2
         5                    5
         6                    6

ops$tkyte%ORA10GR2>


we use table unnesting to get the data in the format it should have been in in the first place :)
is one approach...

A reader, August 20, 2012 - 9:56 am UTC

I am extending above example to fulfil my use case. 

Here  the logic is 

a) col1 is the Primary key 
b) qty1 , qty2 , qty3 are actually values
c) qty4 , qty5 , qty6 are derived values 

qty4 = median ( qty1 , qty2 ,qty3) 
qty5 = median ( qty2 ,qty3 , qty4) 
qty6 = median ( qty4 , qty5 , qty6) 


Code :
------

drop table test_med purge ;
create table test_med (
    col1 integer ,
    qty1 number ,
    qty2 number ,
    qty3 number ,
    qty4 number null,
    qty5 number null
    qty6 number null )
    ;


insert into test_med values ( 1 , 1 , 2 , 3, null , null , null  ) ;
insert into test_med values ( 2 , 9 , 4 , 3 , null , null  , null ) ;
insert into test_med values ( 3 , 1 , 3 , 2 , null , null  , null ) ;
insert into test_med values ( 5 , 5 , 5 , 5 , null , null  , null ) ;
insert into test_med values ( 6 , 5 , 6 , 6 , null , null  , null ) ;
insert into test_med values ( 7 , 5 , 7 , 6 , null , null  , null ) ;
insert into test_med values ( 8 , 6 , 6 , 5 , null , null , null ) ;
insert into test_med values ( 9 , 6 , 5 , 6 , null , null , null ) ;

WITH v_med1 AS
  (SELECT col1, qty1 , qty2 , qty3,
    median(column_value) AS qty4
  FROM test_med,
    TABLE( CAST( multiset
    ( SELECT DECODE(level,1,qty1,2,qty2,3,qty3) FROM dual CONNECT BY level <= 3
    ) AS sys.odciNumberList ) )
  GROUP BY col1 ,qty1 , qty2 , qty3
  )
, v_med2 as ( 
SELECT col1 , qty1 , qty2 , qty3,qty4,
  median(column_value) AS qty5
FROM v_med1,
  TABLE( CAST( multiset
  ( SELECT DECODE(level,1,qty2,2,qty3,3,qty4) FROM dual CONNECT BY level <= 4
  ) AS sys.odciNumberList ) )
GROUP BY col1 ,qty1 , qty2 , qty3,qty4
)
,
v_med3 as ( 
SELECT col1 , qty1 , qty2 , qty3,qty4, qty5,
  median(column_value) AS qty6
FROM v_med2,
  TABLE( CAST( multiset
  ( SELECT DECODE(level,1,qty3,2,qty4,3,qty5) FROM dual CONNECT BY level <= 5
  ) AS sys.odciNumberList ) )
GROUP BY col1 ,qty1 , qty2 , qty3,qty4 , qty5 
) 
Select * from v_med3
;

COL1                                    QTY1                                    QTY2                                    QTY3                                    QTY4                                    QTY5                                    QTY6                                    
--------------------------------------- --------------------------------------- --------------------------------------- --------------------------------------- --------------------------------------- --------------------------------------- --------------------------------------- 
1                                       1                                       2                                       3                                       2                                       2                                       2                                       
2                                       9                                       4                                       3                                       4                                       4                                       4                                       
3                                       1                                       3                                       2                                       2                                       2                                       2                                       
5                                       5                                       5                                       5                                       5                                       5                                       5                                       
6                                       5                                       6                                       6                                       6                                       6                                       6                                       
7                                       5                                       7                                       6                                       6                                       6                                       6                                       
8                                       6                                       6                                       5                                       6                                       6                                       6                                       
9                                       6                                       5                                       6                                       6                                       6                                       6                                       
 8 rows selected 



Is there any other better way to accomplish this ? . The base table has around 10 million records. 





I tried with MODEL clause .. was getting invalid cell reference.


Any better solution

Asim, August 31, 2022 - 9:50 am UTC

The first original qiestion was answered in year 2004. Version 9i, 8i

Is there any better way now using new features, if yes please mention the version frim which that features started
Connor McDonald
September 01, 2022 - 3:58 am UTC

There is a native MEDIAN function, eg

SQL> select deptno, median(sal)
  2  from   scott.emp
  3  group by deptno;

    DEPTNO MEDIAN(SAL)
---------- -----------
        10        2450
        20        2975
        30        1375

Mediaan analytic

Asim, September 01, 2022 - 12:12 pm UTC


Ok, and for mode stats_mode is available now, right?

Median, its shown in the aggregate functions list here in the 21c documentation.

https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Aggregate-Functions.html#GUID-62BE676B-AF18-4E63-BD14-25206FEA0848


Median also has an analytic syntax but its not listed in the list of analytic functions in the 21c documentation here

https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Analytic-Functions.html#GUID-527832F7-63C0-4445-8C16-307FA5084056

Is this documentation error?


Connor McDonald
September 05, 2022 - 5:45 am UTC

stats_mode is available now

for a long time (back to 10g I think)


Median

Asim, September 05, 2022 - 11:53 am UTC

What about that documentation error?

Seems another documentation error

Asim, September 07, 2022 - 5:32 pm UTC

The syntax of MEDIAN here

https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/MEDIAN.html#GUID-DE15705A-AC18-4416-8487-B9E1D70CE01A

says OVER({querypartionclause})
whereas its working as
OVER([qierypartitionclause])

Seems another documentation error?



Seems another documentation error

Asim, September 07, 2022 - 5:32 pm UTC

The syntax of MEDIAN here

https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/MEDIAN.html#GUID-DE15705A-AC18-4416-8487-B9E1D70CE01A

says OVER({querypartionclause})
whereas its working as
OVER([qierypartitionclause])

Seems another documentation error?



Connor McDonald
September 12, 2022 - 3:11 am UTC

says OVER({querypartionclause})
whereas its working as
OVER([qierypartitionclause])


what does this mean?

Seems another documentation error

Asim, September 07, 2022 - 5:32 pm UTC

The syntax of MEDIAN here

https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/MEDIAN.html#GUID-DE15705A-AC18-4416-8487-B9E1D70CE01A

says OVER({querypartionclause})
whereas its working as
OVER([qierypartitionclause])

Seems another documentation error?



Sorry

Asim, September 08, 2022 - 4:00 am UTC

Sorry dont know how above post got submitted 3 times.

Means

Asim, September 12, 2022 - 4:59 am UTC

Means doc says querypartitionclause is mandatory in median analytic function, whereas queries are running fine without mentioning querypartitionclause.


Chris Saxon
September 27, 2022 - 3:28 pm UTC

Thanks for flagging this, I've raised it with the doc team.

2 errors

Asim, September 27, 2022 - 7:23 pm UTC

Have you raised both doc errors or only 1.

Just to summarize the above discussion.
There seems to be 2 doc errors in 21c

1. querypartitionclause is mentioned as mandatory in median analytic syntax diagram whereas it should be optional.

2.
MEDIAN is also an analytic function but is not shown in the list of analytic functions, its only listed in aggregate function.

Connor McDonald
September 29, 2022 - 1:52 am UTC

1) no - because the diagram says its optional

2) yes

Asim, September 29, 2022 - 3:58 am UTC

1.
I just rechecked and also refreshed that doc page, but diagram is still showing it mandatory

Because there is an arrow after OVER(-> that is pointing to the left end of the elipse mentioning query partition clause and from the right end of the elipse an arrow is originating and pointing to the closing bracket of OVER
ie. OVER(->qurypartclause->) , there is no second line/path in which arrow originating from first bracket of OVER and going directly to the closing bracket of OVER bypassing the query partition elipse. Even the

Description of the illustration median.eps
MEDIAN(expr) [ OVER (query_partition_clause) ]

is wrong.
It should have been

MEDIAN(expr) [ OVER ([query_partition_clause]) ]

notice the opening square bracket before q and closing square bracket after e

The same wrong diagram is of listagg analytic function.

To understand what I am saying and to see the correct diagram/example, just see the analytic syntax diagram of FIRST/LAST (not FIRST_VALUE) analytic function here

https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/FIRST.html#GUID-85AB9246-0E0A-44A1-A7E6-4E57502E9238

. There are two paths after OVER( the upper path, which is going through qurypartclause elipse, and there is lower path beneath the elipse which is directly going to closing bracket of OVER.

Description of the illustration first.eps
aggregate_function KEEP
( DENSE_RANK FIRST ORDER BY
expr [ DESC | ASC ]
[ NULLS { FIRST | LAST } ]
[, expr [ DESC | ASC ]
[ NULLS { FIRST | LAST } ]
]...
)
[ OVER ( [query_partition_clause] ) ]

Note the opening square bracket before q and closing square bracket after e.

Hope, I am able to convince you that its doc. Error.








In other words

Asim, September 29, 2022 - 4:14 am UTC

In other words the doc. Is saying that you can write
select median(expr) OVER(PARTITION BY col1) from....

But you cannot write

Select median(expr) OVER() from....

Whereas I am able to execute both above queries without any error. Which means either the software is not working according to the docs or the doc are not according the softtware.

Either its a software error or its a doc. error.

Median

Asim, September 29, 2022 - 4:54 am UTC


I think I got where you are mistaking. Note that I am NOT saying that the diagram is not showing the whole analytical syntax optional, I am saying that the diagram is not showing the qurypartion clause optional. ie the diagram says

you CAN

write this

SELECT MEDIAN(expr)
/* OVER(PARTITION BY col1) */ FROM.... ---- ------aggregate syntax

and you CAN write this

SELECT MEDIAN(expr)
OVER(PARTITION BY col1) FROM.... --- analytic ---with querypartition clause syntax

BUT CANNOT write this

SELECT MEDIAN(expr)
OVER( /* PARTITION BY col1 */ ) FROM....---
---- analytic without qurypart clause

Whereas, I am executing all the above 3 queries without any error in the oracle software.


Chris Saxon
September 29, 2022 - 7:40 am UTC

You're correct, I've raised this with doc

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.