Perfect, thanks!
Jack, April 13, 2004 - 11:45 am UTC
Tom,
That's perfect, thanks. What I need to do here is fill in a few missing salaries (with values that would not be unusual). Now that I know how to get mode, here's what I came up with: Fill in missing salaries with a value not unusual for the dept. If salary is null for a whole dept, fill with a value not unusual for the whole report. I chose to break ties for mode by further ordering by rowid.
Thanks for your help.
- Jack
SQL> select dept,
2 salary,
3 decode(salary,
4 null,nvl(mode_sal_dept,mode_sal_report)
5 ) fill
6 from (
7 select dept,
8 salary,
9 first_value(salary) over
10 (partition by dept
11 order by cnt_dept desc, rowid) mode_sal_dept,
12 first_value(salary) over
13 (order by cnt_report desc, rowid) mode_sal_report
14 from (
15 select dept,
16 salary,
17 count(salary) over (partition by dept, salary) cnt_dept,
18 count(salary) over (partition by salary) cnt_report
19 from temp_01_tb
20 )
21 )
22 order by 1,2
23 ;
DEPT SALARY FILL
---------- ---------- ----------
10 2
10 2
10 8
10 9
10 9
10 9
20 2
20 2
20 3
20 4
20 4
20 4
30 2
30 2
14 rows selected.
median
mini, April 29, 2004 - 6:41 pm UTC
This works with Oracle 9 and up. I want to do calculate median in a single select statement in 8i. Can you help me do that.
Thx
April 30, 2004 - 6:58 am UTC
select deptno,
avg(sal),
cnt
from (
select deptno,
sal,
count(*) over (partition by deptno) cnt,
row_number() over(partition by deptno order by sal) rn
from emp
)
where (mod(cnt,2) = 1 and rn = ceil(cnt/2) )
or (mod(cnt,2) = 0 and rn in ( cnt/2, cnt/2+1 ) )
group by deptno, cnt
/
Calculate Median, Avg, Min and Max
Mini, May 13, 2004 - 6:03 pm UTC
This works for calculating the median. Thanks. Now I need to write a query which calculates average, minimum, max and median and count in a single select statement.This is in Oracle 8i.
Thanks
May 14, 2004 - 9:59 am UTC
scott@ORA9IR2> select deptno,
2 avg(sal),
3 avg_sal,
4 min_sal,
5 max_sal
6 cnt
7 from (
8 select deptno,
9 sal,
10 count(*) over (partition by deptno) cnt,
11 row_number() over(partition by deptno order by sal) rn,
12 avg(sal) over (partition by deptno) avg_sal,
13 min(sal) over (partition by deptno) min_sal,
14 max(sal) over (partition by deptno) max_sal
15 from emp
16 )
17 where (mod(cnt,2) = 1 and rn = ceil(cnt/2) )
18 or (mod(cnt,2) = 0 and rn in ( cnt/2, cnt/2+1 ) )
19 group by deptno, cnt, avg_sal, min_sal, max_sal
20 /
DEPTNO AVG(SAL) AVG_SAL MIN_SAL CNT
---------- ---------- ---------- ---------- ----------
10 2450 2916.66667 1300 5000
20 2975 2175 800 3000
30 1375 1566.66667 950 2850
is one method
mean, median, mode
Mini, May 14, 2004 - 12:54 pm UTC
Thanks for the quick response. Now I have another problem. I have to compute avg, min, max, median for two differnt columns from the same table in one single select.
May 15, 2004 - 11:57 am UTC
scott@ORA9IR2> select deptno,
2 count(*),
3 percentile_cont(0.5) within group (order by sal) med_sal,
4 avg(sal),
5 min(sal),
6 max(sal),
7 percentile_cont(0.5) within group (order by nvl(comm,0)) med_comm,
8 avg(nvl(comm,0)),
9 min(nvl(comm,0)),
10 max(nvl(comm,0))
11 from emp
12 group by deptno;
DEPTNO COUNT(*) MED_SAL AVG(SAL) MIN(SAL) MAX(SAL) MED_COMM AVG(NVL(COMM,0)) MIN(NVL(COMM,0)) MAX(NVL(COMM,0))
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- ---------------- ----------------
10 3 2450 2916.66667 1300 5000 0 0 0 0
20 5 2975 2175 800 3000 0 0 0 0
30 6 1375 1566.66667 950 2850 150 366.666667 0 1400
scott@ORA9IR2>
scott@ORA9IR2> select deptno,
2 avg(
3 case when (mod(cnt,2) = 1 and rn = ceil(cnt/2) )
4 or (mod(cnt,2) = 0 and rn in ( cnt/2, cnt/2+1 ) )
5 then sal
6 end
7 ) med_sal,
8 avg_sal,
9 min_sal,
10 max_sal,
11 avg(
12 case when (mod(cnt,2) = 1 and rn2 = ceil(cnt/2) )
13 or (mod(cnt,2) = 0 and rn2 in ( cnt/2, cnt/2+1 ) )
14 then comm
15 end
16 ) med_comm,
17 avg_comm,
18 min_comm,
19 max_comm
20 cnt
21 from (
22 select deptno,
23 sal, nvl(comm,0) comm,
24 count(*) over (partition by deptno) cnt,
25 avg(sal) over (partition by deptno) avg_sal,
26 min(sal) over (partition by deptno) min_sal,
27 max(sal) over (partition by deptno) max_sal,
28 row_number() over(partition by deptno order by sal) rn,
29 row_number() over(partition by deptno order by nvl(comm,0)) rn2,
30 avg(nvl(comm,0)) over (partition by deptno) avg_comm,
31 min(nvl(comm,0)) over (partition by deptno) min_comm,
32 max(nvl(comm,0)) over (partition by deptno) max_comm
33 from emp
34 )
35 group by deptno, cnt, avg_sal, min_sal, max_sal, avg_comm, min_comm, max_comm
36 /
DEPTNO MED_SAL AVG_SAL MIN_SAL MAX_SAL MED_COMM AVG_COMM MIN_COMM CNT
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
10 2450 2916.66667 1300 5000 0 0 0 0
20 2975 2175 800 3000 0 0 0 0
30 1375 1566.66667 950 2850 150 366.666667 0 1400
Brilliant!
VA, January 19, 2005 - 11:51 am UTC
not useful
Dylan, March 18, 2005 - 3:26 pm UTC
this was no use to me and my work!!
March 18, 2005 - 7:57 pm UTC
brilliant, guess you don't need a mean, median or mode.
so, why'd you read the page in the first place one might ask....
How to calculate overall median.
sekhar, April 19, 2005 - 7:48 pm UTC
Tom,
You have shown that median, avg are calculated for each department.
How do I calculate the median for all the departments together instead of calculating median for each department.
thanks
sekhar
April 19, 2005 - 8:06 pm UTC
don't partition by anything.
multiple analytics at once
Dave, May 24, 2006 - 10:58 am UTC
I see that you are able to use multiple analytic functions in your examples but I have one that is not working for me and I don't understand why not. I dumbed it down to just pull from dual and it is still not giving me numbers in all the columns. I used this same code last year when we were running on 9i instead of 10g and I think it worked then. Any ideas?
SELECT institution_id, school_level, subject_area_code, mean_score,
(TRUNC(cume_dist() OVER (PARTITION BY school_level, subject_area_code ORDER BY mean_score) * 100,2)) AS percentile,
(percentile_disc (0.6) WITHIN GROUP
(ORDER BY mean_score) OVER (PARTITION BY school_level, subject_area_code)) AS per60d,
1
FROM (
SELECT institution_id, school_level, subject_area_code, AVG(mean_score) mean_score
FROM
(
SELECT 1 institution_id, 1 school_level, 1 subject_area_code, 1 mean_score
FROM dual
UNION
SELECT 1 institution_id, 1 school_level, 1 subject_area_code, 2 mean_score
FROM dual)
GROUP BY institution_id, school_level, subject_area_code
)
May 25, 2006 - 7:05 am UTC
you have a perfect example for support, please file a bug on this one - all they need to do is run it in 9i and 10g and they'll see something is wrong.
Wow!
Kevin Shidler, May 25, 2006 - 3:26 pm UTC
You know, I don't use mean, modes, and medians at work either, but I'm still amazed!! Excellent!
multiple analytics at once
Dave, May 26, 2006 - 9:38 am UTC
Thanks Tom. I thought maybe I was goofing it up.
Mean,mode,median
Dave Fowler, September 12, 2006 - 6:55 pm UTC
Greate examples, As always....Analytics rule.
Perfect for needs as well.
Display count (*)
Cari, September 18, 2006 - 10:19 pm UTC
Hi
I want to display tha with sql:
ID CODE MESSAGE COUNT(CODE)
1 ORA-02289 Sequence does not exist 3
ORA-00054 Resource busy 14
ORA-00942 Table or view does't exist 5
2 ORA-01001 Invalid cursor 1
.. .. ..
I do it sql query
SELECT decode(rn1,1,id)ID,
decode(rn2,1,code)CODE,message,cnt
FROM (
SELECT id,code,message,count(code) over (partition by code)cnt,
row_number() over (partition by id order by code,message,count(code)) rn1,
row_number() over (partition by id,code order by code,message,count(code)) rn2
from error
group by id,code,message);
But did not show really count.
Would you please showme where is the error ?
Thanks in advanced
September 19, 2006 - 2:25 pm UTC
well, given that I don't have your tables installed for some reason, I cannot really do much here...
Reply to Dylan
REJI, September 19, 2006 - 2:36 pm UTC
Tom - Excellent technical tips. I spend 30 minutes everyday to visit this site to learn something new and never logged out without learning something new.
Dylan - I don't think Tom's job is to do your job. This is a place where people like us ask some advise or help and Tom with his busy schedule find the time to address them. So, if it doesn't help you then keep quite or learn something from the posting.
Display count (*)
Cari, September 20, 2006 - 9:25 pm UTC
Hi
Sorry Tom.
That is the table and many inserts.
create table err
(ID number,
CODE varchar2(10),
MESSAGE varchar2(200),
fdate date);
insert into err values(1,'ORA-02289','Sequence does not exist',sysdate);
insert into err values(1,'ORA-017','Invalid username/password',sysdate);
insert into err values(1,'ORA-00942','Table or view exist',sysdate);
insert into err values(2,'ORA-017','Invalid username/password',sysdate);
insert into err values(2,'ORA-017','Invalid username/password',sysdate);
insert into err values(3,'ORA-0056','User invalid',sysdate);
Would you please help how deploy when count (*) ?
September 21, 2006 - 2:09 am UTC
well, I have no idea how this output:
ID CODE MESSAGE COUNT(CODE)
1 ORA-02289 Sequence does not exist 3
ORA-00054 Resource busy 14
ORA-00942 Table or view does't exist 5
2 ORA-01001 Invalid cursor 1
it supposed to be generated from that input. count(code) is what exactly????
Here's a guess
Jack Mulhollan, September 21, 2006 - 6:29 pm UTC
SQL> col id for 99
SQL> col message for a30
SQL> break on report
SQL> compute sum of count(*) on report
SQL>
SQL> select id, code, message, count(*)
2 from err
3 group by id, code, message
4 order by 1
5 ;
ID CODE MESSAGE COUNT(*)
--- ---------- ------------------------------ --------
1 ORA-00942 Table or view exist 1
1 ORA-017 Invalid username/password 1
1 ORA-02289 Sequence does not exist 1
2 ORA-017 Invalid username/password 2
3 ORA-0056 User invalid 1
--------
sum 6
Running Median
Mark Campbell, September 26, 2007 - 11:56 am UTC
I am trying to figure out how to determine a running median. I have Oracle Enterprise 10.2.0.1.0.
Each day I record how long it took to perform a particular procedure. I am able to calculate the overall average and a running average over the past 5 days and 10 days. I can also calculate the overall median, but I am not able to figure out how to do a
running median over the past 5 and 10 days.
I have tried the median analytic function in 10gR2, but it seems it will only let me perform the median over a partition, but what I really want is the median over a window like I can do for the average.
Below is sample data and the SQL and results for the calculations that I am able to perform.
Any suggestions?
Thanks
create table t1 (
run_date date,
elapsed_time number);
insert into t1 values (to_date('09/26/2007','MM/DD/YYYY'),8.23);
insert into t1 values (to_date('09/25/2007','MM/DD/YYYY'),7.41);
insert into t1 values (to_date('09/24/2007','MM/DD/YYYY'),8.66);
insert into t1 values (to_date('09/23/2007','MM/DD/YYYY'),6.58);
insert into t1 values (to_date('09/22/2007','MM/DD/YYYY'),6.18);
insert into t1 values (to_date('09/21/2007','MM/DD/YYYY'),7.33);
insert into t1 values (to_date('09/20/2007','MM/DD/YYYY'),6.23);
insert into t1 values (to_date('09/19/2007','MM/DD/YYYY'),7.02);
insert into t1 values (to_date('09/18/2007','MM/DD/YYYY'),6.28);
insert into t1 values (to_date('09/17/2007','MM/DD/YYYY'),7.23);
insert into t1 values (to_date('09/16/2007','MM/DD/YYYY'),6.36);
insert into t1 values (to_date('09/15/2007','MM/DD/YYYY'),6.36);
insert into t1 values (to_date('09/14/2007','MM/DD/YYYY'),7.18);
insert into t1 values (to_date('09/13/2007','MM/DD/YYYY'),6.28);
insert into t1 values (to_date('09/12/2007','MM/DD/YYYY'),7.21);
insert into t1 values (to_date('09/11/2007','MM/DD/YYYY'),6.33);
insert into t1 values (to_date('09/10/2007','MM/DD/YYYY'),6.50);
insert into t1 values (to_date('09/9/2007','MM/DD/YYYY'), 6.28);
insert into t1 values (to_date('09/8/2007','MM/DD/YYYY'), 6.26);
insert into t1 values (to_date('09/7/2007','MM/DD/YYYY'), 6.33);
insert into t1 values (to_date('09/6/2007','MM/DD/YYYY'), 5.26);
insert into t1 values (to_date('09/5/2007','MM/DD/YYYY'), 6.15);
insert into t1 values (to_date('09/4/2007','MM/DD/YYYY'), 5.11);
insert into t1 values (to_date('09/3/2007','MM/DD/YYYY'), 6.13);
insert into t1 values (to_date('09/2/2007','MM/DD/YYYY'), 5.23);
commit;
select run_date,
elapsed_time,
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY elapsed_time) OVER () median_perc_disc_method,
median(elapsed_time) over () median_method,
AVG (elapsed_time) OVER () avg_time,
AVG (elapsed_time) OVER (ORDER BY run_date DESC RANGE BETWEEN CURRENT ROW AND 4 FOLLOWING) avg_time5,
AVG (elapsed_time) OVER (ORDER BY run_date DESC RANGE BETWEEN CURRENT ROW AND 9 FOLLOWING) avg_time10
from t1
order by 1 desc
run_date elapsed_time median_perc_disc_method median_method avg_time avg_time5 avg_time10
9/26/2007 8.23 6.33 6.33 6.5648 7.412 7.115
9/25/2007 7.41 6.33 6.33 6.5648 7.232 6.928
9/24/2007 8.66 6.33 6.33 6.5648 6.996 6.823
9/23/2007 6.58 6.33 6.33 6.5648 6.668 6.675
9/22/2007 6.18 6.33 6.33 6.5648 6.608 6.645
9/21/2007 7.33 6.33 6.33 6.5648 6.818 6.748
9/20/2007 6.23 6.33 6.33 6.5648 6.624 6.648
9/19/2007 7.02 6.33 6.33 6.5648 6.65 6.675
9/18/2007 6.28 6.33 6.33 6.5648 6.682 6.601
9/17/2007 7.23 6.33 6.33 6.5648 6.682 6.599
9/16/2007 6.36 6.33 6.33 6.5648 6.678 6.509
9/15/2007 6.36 6.33 6.33 6.5648 6.672 6.399
9/14/2007 7.18 6.33 6.33 6.5648 6.7 6.378
9/13/2007 6.28 6.33 6.33 6.5648 6.52 6.171
9/12/2007 7.21 6.33 6.33 6.5648 6.516 6.156
9/11/2007 6.33 6.33 6.33 6.5648 6.34 5.958
9/10/2007 6.5 6.33 6.33 6.5648 6.126 5.9167
9/9/2007 6.28 6.33 6.33 6.5648 6.056 5.845
9/8/2007 6.26 6.33 6.33 6.5648 5.822 5.7814
9/7/2007 6.33 6.33 6.33 6.5648 5.796 5.7016
9/6/2007 5.26 6.33 6.33 6.5648 5.576 5.576
9/5/2007 6.15 6.33 6.33 6.5648 5.655 5.655
9/4/2007 5.11 6.33 6.33 6.5648 5.49 5.49
9/3/2007 6.13 6.33 6.33 6.5648 5.68 5.68
9/2/2007 5.23 6.33 6.33 6.5648 5.23 5.23
Running Median
Mark Campbell, September 27, 2007 - 10:20 am UTC
I figured out how to "derive" a partition (by using a truncated row_number) for each group of 5 records (except for the first group where I only have 4) for which I can calculate the median.
So now I can answer a similar question, but not quite exactly what I want.
What I can answer - "For the 5 most recent 5-day periods , what is the median elapsed time for that 5-day period?"
What I want to answer - "For each day, what is the median elapsed time for that day and the 4 previous days?"
Not sure why I am getting different results for the last two columns - it seems the median function is giving me the true median for the set of 4 records where the truncated row_number is 0.
with a as (
select run_date,
elapsed_time,
row_number() over (order by run_date desc) rn,
trunc((row_number() over (order by run_date desc))/5) rn2,
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY elapsed_time) OVER () perc_disc_method,
median(elapsed_time) over () median_method
from t1
order by 1 desc
)
select a.*,
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY elapsed_time) OVER (partition by rn2 ) run_med,
median(elapsed_time) over (partition by rn2) run_med1
from a
order by run_date desc
run_date elapsed_time rn rn2 perc_disc_method median_method run_med run_med1
9/26/2007 8.23 1 0 6.33 6.33 7.41 7.82
9/25/2007 7.41 2 0 6.33 6.33 7.41 7.82
9/24/2007 8.66 3 0 6.33 6.33 7.41 7.82
9/23/2007 6.58 4 0 6.33 6.33 7.41 7.82
9/22/2007 6.18 5 1 6.33 6.33 6.28 6.28
9/21/2007 7.33 6 1 6.33 6.33 6.28 6.28
9/20/2007 6.23 7 1 6.33 6.33 6.28 6.28
9/19/2007 7.02 8 1 6.33 6.33 6.28 6.28
9/18/2007 6.28 9 1 6.33 6.33 6.28 6.28
9/17/2007 7.23 10 2 6.33 6.33 6.36 6.36
9/16/2007 6.36 11 2 6.33 6.33 6.36 6.36
9/15/2007 6.36 12 2 6.33 6.33 6.36 6.36
9/14/2007 7.18 13 2 6.33 6.33 6.36 6.36
9/13/2007 6.28 14 2 6.33 6.33 6.36 6.36
9/12/2007 7.21 15 3 6.33 6.33 6.33 6.33
9/11/2007 6.33 16 3 6.33 6.33 6.33 6.33
9/10/2007 6.5 17 3 6.33 6.33 6.33 6.33
9/9/2007 6.28 18 3 6.33 6.33 6.33 6.33
9/8/2007 6.26 19 3 6.33 6.33 6.33 6.33
9/7/2007 6.33 20 4 6.33 6.33 6.13 6.13
9/6/2007 5.26 21 4 6.33 6.33 6.13 6.13
9/5/2007 6.15 22 4 6.33 6.33 6.13 6.13
9/4/2007 5.11 23 4 6.33 6.33 6.13 6.13
9/3/2007 6.13 24 4 6.33 6.33 6.13 6.13
9/2/2007 5.23 25 5 6.33 6.33 5.23 5.23
September 28, 2007 - 3:46 pm UTC
you are getting a different median from percentile_disc because disc is short for discrete - it returns an actual observed value, percentile_cont would return the same as median.
Here is one possible approach:
ops$tkyte%ORA10GR2> create or replace type myType as table of number
2 /
Type created.
ops$tkyte%ORA10GR2> column data format a50
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with d
2 as
3 (
4 select run_date,
5 myType( elapsed_time,
6 lag(elapsed_time,1) over (order by run_date),
7 lag(elapsed_time,2) over (order by run_date),
8 lag(elapsed_time,3) over (order by run_date),
9 lag(elapsed_time,4) over (order by run_date) ) data
10 from t1
11 )
12 select run_date, data,
13 (select percentile_cont(0.5) within group (order by column_value)
14 from TABLE( data ) ) x
15 from d
16 /
RUN_DATE DATA X
--------- -------------------------------------------------- ----------
02-SEP-07 MYTYPE(5.23, NULL, NULL, NULL, NULL) 5.23
03-SEP-07 MYTYPE(6.13, 5.23, NULL, NULL, NULL) 5.68
04-SEP-07 MYTYPE(5.11, 6.13, 5.23, NULL, NULL) 5.23
05-SEP-07 MYTYPE(6.15, 5.11, 6.13, 5.23, NULL) 5.68
06-SEP-07 MYTYPE(5.26, 6.15, 5.11, 6.13, 5.23) 5.26
07-SEP-07 MYTYPE(6.33, 5.26, 6.15, 5.11, 6.13) 6.13
08-SEP-07 MYTYPE(6.26, 6.33, 5.26, 6.15, 5.11) 6.15
09-SEP-07 MYTYPE(6.28, 6.26, 6.33, 5.26, 6.15) 6.26
10-SEP-07 MYTYPE(6.5, 6.28, 6.26, 6.33, 5.26) 6.28
11-SEP-07 MYTYPE(6.33, 6.5, 6.28, 6.26, 6.33) 6.33
12-SEP-07 MYTYPE(7.21, 6.33, 6.5, 6.28, 6.26) 6.33
13-SEP-07 MYTYPE(6.28, 7.21, 6.33, 6.5, 6.28) 6.33
14-SEP-07 MYTYPE(7.18, 6.28, 7.21, 6.33, 6.5) 6.5
15-SEP-07 MYTYPE(6.36, 7.18, 6.28, 7.21, 6.33) 6.36
16-SEP-07 MYTYPE(6.36, 6.36, 7.18, 6.28, 7.21) 6.36
17-SEP-07 MYTYPE(7.23, 6.36, 6.36, 7.18, 6.28) 6.36
18-SEP-07 MYTYPE(6.28, 7.23, 6.36, 6.36, 7.18) 6.36
19-SEP-07 MYTYPE(7.02, 6.28, 7.23, 6.36, 6.36) 6.36
20-SEP-07 MYTYPE(6.23, 7.02, 6.28, 7.23, 6.36) 6.36
21-SEP-07 MYTYPE(7.33, 6.23, 7.02, 6.28, 7.23) 7.02
22-SEP-07 MYTYPE(6.18, 7.33, 6.23, 7.02, 6.28) 6.28
23-SEP-07 MYTYPE(6.58, 6.18, 7.33, 6.23, 7.02) 6.58
24-SEP-07 MYTYPE(8.66, 6.58, 6.18, 7.33, 6.23) 6.58
25-SEP-07 MYTYPE(7.41, 8.66, 6.58, 6.18, 7.33) 7.33
26-SEP-07 MYTYPE(8.23, 7.41, 8.66, 6.58, 6.18) 7.41
25 rows selected.
Running Median
Mark, October 01, 2007 - 11:49 am UTC
Thank you very much.
question: moving median
MannyJ, September 25, 2008 - 4:27 pm UTC
September 25, 2008 - 4:54 pm UTC
how big of a window do you want or truly need?
Please disregard the "moving medium" question.
MannyJ, September 25, 2008 - 4:33 pm UTC
I think the "running medium" question gave me what I need.
6 months median
Balachandra, July 23, 2010 - 9:45 am UTC
Tom,
I am trying to solve a question where the requirement is to derive a running 180 day median .The data is colected multiple time in a day in short multiple records for same day. So how to modify the query?
July 23, 2010 - 9:51 am UTC
I don't see a create table or inserts to test with, one would sort of need that.
Fair Enough
Balachandra, July 23, 2010 - 11:05 am UTC
Tom,
I am only posting a section of data since the actual data is 6K.
CREATE TABLE T2(CCOMPONENT VARCHAR2(50),SAMPLED_DATE DATE ,OBSERVED_VALUE NUMBER(30,5));
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/03/2009 08:00','MM/DD/YYYY HH24:MI'), .08);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/03/2009 14:02','MM/DD/YYYY HH24:MI'), .06);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/03/2009 20:00','MM/DD/YYYY HH24:MI'), .07);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/04/2009 08:30','MM/DD/YYYY HH24:MI'), .2);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/04/2009 14:20','MM/DD/YYYY HH24:MI'), .15);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/04/2009 20:00','MM/DD/YYYY HH24:MI'), .05);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/05/2009 07:57','MM/DD/YYYY HH24:MI'), .05);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/05/2009 13:30','MM/DD/YYYY HH24:MI'), .17);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/05/2009 20:00','MM/DD/YYYY HH24:MI'), .07);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/06/2009 07:54','MM/DD/YYYY HH24:MI'), .2);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/06/2009 13:30','MM/DD/YYYY HH24:MI'), .31);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/06/2009 20:00','MM/DD/YYYY HH24:MI'), .15);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/07/2009 08:15','MM/DD/YYYY HH24:MI'), .29);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/07/2009 13:55','MM/DD/YYYY HH24:MI'), .27);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/07/2009 20:00','MM/DD/YYYY HH24:MI'), .2);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/08/2009 08:12','MM/DD/YYYY HH24:MI'), 0);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/08/2009 13:42','MM/DD/YYYY HH24:MI'), .37);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/08/2009 20:00','MM/DD/YYYY HH24:MI'), .15);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/09/2009 08:10','MM/DD/YYYY HH24:MI'), .4);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/09/2009 13:26','MM/DD/YYYY HH24:MI'), .29);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/09/2009 20:00','MM/DD/YYYY HH24:MI'), .2);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/10/2009 08:05','MM/DD/YYYY HH24:MI'), .29);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/10/2009 13:45','MM/DD/YYYY HH24:MI'), .11);
INSERT INTO T2 VALUES ('Hg', TO_DATE('09/10/2009 20:00','MM/DD/YYYY HH24:MI'), .11);
........................................
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/28/2010 07:53','MM/DD/YYYY HH24:MI'), 1.3);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/28/2010 13:34','MM/DD/YYYY HH24:MI'), .5);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/28/2010 20:00','MM/DD/YYYY HH24:MI'), .55);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/29/2010 08:22','MM/DD/YYYY HH24:MI'), 1.38);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/29/2010 14:12','MM/DD/YYYY HH24:MI'), .77);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/29/2010 20:00','MM/DD/YYYY HH24:MI'), .21);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/30/2010 08:29','MM/DD/YYYY HH24:MI'), .82);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/30/2010 14:03','MM/DD/YYYY HH24:MI'), .33);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/30/2010 20:00','MM/DD/YYYY HH24:MI'), .18);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/31/2010 08:15','MM/DD/YYYY HH24:MI'), 2.04);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/31/2010 13:56','MM/DD/YYYY HH24:MI'), .57);
INSERT INTO T2 VALUES ('Hg', TO_DATE('03/31/2010 20:00','MM/DD/YYYY HH24:MI'), .3);
COMMIT;
July 23, 2010 - 12:34 pm UTC
now we need details on what you expect to see from this data and a reason why you expect to see that - like a specification.
not saying we'll be able to do it - but one needs a test case and a detailed explanation of what needs to be done. Your data doesn't resemble the original question at all.
Specification
Balachandra, July 23, 2010 - 3:09 pm UTC
The specification is a compliance determination
Please report the maximum of 6-month median for the reporting month (which is March 2010 in our case).
The 6-month median shall apply as a moving median of daily values for any 180 day period in which daily values represent concentrations within a 24 hour period.
I was trying to adapt your solution to Mark Campbell where you presented 5 day median
July 23, 2010 - 4:18 pm UTC
yeah, but his example had an observation per day - so we could group things easily - you do not. And the median/percentile functions do not support range windows so I don't think we can do this in straight analytic SQL.
Your problem appears very different from theirs, but it is hard to say because your specification isn't really a specification. I think you want to take each row in the input set - find all of the records within +/- 180 days of that row, compute the median value for that set of rows. We could do that with correlated subqueries, is this the result you were looking for?
ops$tkyte%ORA10GR2> select ccomponent, sampled_date, observed_value,
2 (select median(observed_value)
3 from t2 x
4 where x.sampled_date between A.trunc_sd-180 and A.trunc_sd+180
5 and x.ccomponent = A.ccomponent ) med
6 from (select t2.*, trunc(sampled_date) trunc_sd from t2) a
7 /
CC SAMPLED_D OBSERVED_VALUE MED
-- --------- -------------- ----------
Hg 03-SEP-09 .08 .16
Hg 03-SEP-09 .06 .16
Hg 03-SEP-09 .07 .16
Hg 04-SEP-09 .2 .16
Hg 04-SEP-09 .15 .16
Hg 04-SEP-09 .05 .16
Hg 05-SEP-09 .05 .16
Hg 05-SEP-09 .17 .16
Hg 05-SEP-09 .07 .16
Hg 06-SEP-09 .2 .16
Hg 06-SEP-09 .31 .16
Hg 06-SEP-09 .15 .16
Hg 07-SEP-09 .29 .16
Hg 07-SEP-09 .27 .16
Hg 07-SEP-09 .2 .16
Hg 08-SEP-09 0 .16
Hg 08-SEP-09 .37 .16
Hg 08-SEP-09 .15 .16
Hg 09-SEP-09 .4 .16
Hg 09-SEP-09 .29 .16
Hg 09-SEP-09 .2 .16
Hg 10-SEP-09 .29 .16
Hg 10-SEP-09 .11 .16
Hg 10-SEP-09 .11 .16
Hg 28-MAR-10 1.3 .56
Hg 28-MAR-10 .5 .56
Hg 28-MAR-10 .55 .56
Hg 29-MAR-10 1.38 .56
Hg 29-MAR-10 .77 .56
Hg 29-MAR-10 .21 .56
Hg 30-MAR-10 .82 .56
Hg 30-MAR-10 .33 .56
Hg 30-MAR-10 .18 .56
Hg 31-MAR-10 2.04 .56
Hg 31-MAR-10 .57 .56
Hg 31-MAR-10 .3 .56
36 rows selected.
Excellent
Balachandra, July 23, 2010 - 5:39 pm UTC
I tried this query
SELECT DISTINCT w.ccomponent,
b.ss,
MEDIAN(w.formatted_entry) OVER (PARTITION BY b.ss) mdn
FROM t2 w,
(SELECT DISTINCT TRUNC(w1.sampled_date) ss
FROM t2 w1
WHERE w1.ccomponent = 'Hg'
AND TRUNC(w1.sampled_date) BETWEEN TO_DATE('03/1/2010', 'mm/dd/yyyy') AND TO_DATE('03/31/2010', 'mm/dd/yyyy')) b
WHERE w.ccomponent = 'Hg'
AND TRUNC(w.sampled_date) BETWEEN b.ss - 179 AND b.ss
ORDER BY 2
Hg 3/1/2010 0.15
Hg 3/2/2010 0.15
Hg 3/3/2010 0.15
Hg 3/4/2010 0.15
Hg 3/5/2010 0.14
Hg 3/6/2010 0.14
Hg 3/7/2010 0.14
Hg 3/8/2010 0.14
Hg 3/9/2010 0.14
Hg 3/10/2010 0.14
Hg 3/11/2010 0.14
Hg 3/12/2010 0.14
Hg 3/13/2010 0.14
Hg 3/14/2010 0.14
Hg 3/15/2010 0.14
Hg 3/16/2010 0.14
Hg 3/17/2010 0.145
Hg 3/18/2010 0.15
Hg 3/19/2010 0.15
Hg 3/20/2010 0.15
Hg 3/21/2010 0.15
Hg 3/22/2010 0.15
Hg 3/23/2010 0.16
Hg 3/24/2010 0.16
Hg 3/25/2010 0.16
Hg 3/26/2010 0.16
Hg 3/27/2010 0.17
Hg 3/28/2010 0.17
Hg 3/29/2010 0.18
Hg 3/30/2010 0.18
Hg 3/31/2010 0.18
A reader, April 26, 2012 - 1:08 pm UTC
I am find median of three columns ( in my case qty1 , qty2 , qty3 ) . Is there any easier way to accommodation this ?. Looking at the documentation , we do have median function ; but that returns value across the records ( interrecord not intra record) .
I am on 10g release 2 .
SQL> drop table test_med purge ;
Table dropped.
SQL>
SQL> create table test_med (
2 col1 integer ,
3 qty1 number ,
4 qty2 number ,
5 qty3 number )
6 ;
Table created.
SQL>
SQL> insert into test_med values ( 1 , 1 , 2 , 3) ;
1 row created.
SQL> insert into test_med values ( 2 , 9 , 4 , 3) ;
1 row created.
SQL> insert into test_med values ( 3 , 1 , 3 , 2) ;
1 row created.
SQL> insert into test_med values ( 5 , 5 , 5 , 5) ;
1 row created.
SQL> insert into test_med values ( 6 , 5 , 6 , 6) ;
1 row created.
SQL>
SQL> Select col1 , qty1 , qty2 , qty3
2 ,
3 Case
4 When least( qty1 , qty2 , qty3) = greatest( qty1 , qty2 , qty3) then qty1
5 When least( qty1 , qty2 , qty3) < qty1 and qty1 < greatest( qty1 , qty2 , qty3) then qty1
6 When least( qty1 , qty2 , qty3) < qty2 and qty2 < greatest( qty1 , qty2 , qty3) then qty2
7 When least( qty1 , qty2 , qty3) < qty3 and qty3 < greatest( qty1 , qty2 , qty3) then qty3
8 else null end as median
9 from test_med;
COL1 QTY1 QTY2 QTY3 MEDIAN
---------- ---------- ---------- ---------- ----------
1 1 2 3 2
2 9 4 3 4
3 1 3 2 2
5 5 5 5 5
6 5 6 6
SQL>
Note : The last record returns null . So the above case is not 100% working.
April 26, 2012 - 1:36 pm UTC
ops$tkyte%ORA10GR2> select col1, median(column_value)
2 from test_med, table(
3 cast(
4 multiset( select decode(level,1,qty1,2,qty2,3,qty3)
5 from dual
6 connect by level <= 3)
7 as sys.odciNumberList ) )
8 group by col1
9 /
COL1 MEDIAN(COLUMN_VALUE)
---------- --------------------
1 2
2 4
3 2
5 5
6 6
ops$tkyte%ORA10GR2>
we use table unnesting to get the data in the format it should have been in in the first place :)
is one approach...
A reader, August 20, 2012 - 9:56 am UTC
I am extending above example to fulfil my use case.
Here the logic is
a) col1 is the Primary key
b) qty1 , qty2 , qty3 are actually values
c) qty4 , qty5 , qty6 are derived values
qty4 = median ( qty1 , qty2 ,qty3)
qty5 = median ( qty2 ,qty3 , qty4)
qty6 = median ( qty4 , qty5 , qty6)
Code :
------
drop table test_med purge ;
create table test_med (
col1 integer ,
qty1 number ,
qty2 number ,
qty3 number ,
qty4 number null,
qty5 number null
qty6 number null )
;
insert into test_med values ( 1 , 1 , 2 , 3, null , null , null ) ;
insert into test_med values ( 2 , 9 , 4 , 3 , null , null , null ) ;
insert into test_med values ( 3 , 1 , 3 , 2 , null , null , null ) ;
insert into test_med values ( 5 , 5 , 5 , 5 , null , null , null ) ;
insert into test_med values ( 6 , 5 , 6 , 6 , null , null , null ) ;
insert into test_med values ( 7 , 5 , 7 , 6 , null , null , null ) ;
insert into test_med values ( 8 , 6 , 6 , 5 , null , null , null ) ;
insert into test_med values ( 9 , 6 , 5 , 6 , null , null , null ) ;
WITH v_med1 AS
(SELECT col1, qty1 , qty2 , qty3,
median(column_value) AS qty4
FROM test_med,
TABLE( CAST( multiset
( SELECT DECODE(level,1,qty1,2,qty2,3,qty3) FROM dual CONNECT BY level <= 3
) AS sys.odciNumberList ) )
GROUP BY col1 ,qty1 , qty2 , qty3
)
, v_med2 as (
SELECT col1 , qty1 , qty2 , qty3,qty4,
median(column_value) AS qty5
FROM v_med1,
TABLE( CAST( multiset
( SELECT DECODE(level,1,qty2,2,qty3,3,qty4) FROM dual CONNECT BY level <= 4
) AS sys.odciNumberList ) )
GROUP BY col1 ,qty1 , qty2 , qty3,qty4
)
,
v_med3 as (
SELECT col1 , qty1 , qty2 , qty3,qty4, qty5,
median(column_value) AS qty6
FROM v_med2,
TABLE( CAST( multiset
( SELECT DECODE(level,1,qty3,2,qty4,3,qty5) FROM dual CONNECT BY level <= 5
) AS sys.odciNumberList ) )
GROUP BY col1 ,qty1 , qty2 , qty3,qty4 , qty5
)
Select * from v_med3
;
COL1 QTY1 QTY2 QTY3 QTY4 QTY5 QTY6
--------------------------------------- --------------------------------------- --------------------------------------- --------------------------------------- --------------------------------------- --------------------------------------- ---------------------------------------
1 1 2 3 2 2 2
2 9 4 3 4 4 4
3 1 3 2 2 2 2
5 5 5 5 5 5 5
6 5 6 6 6 6 6
7 5 7 6 6 6 6
8 6 6 5 6 6 6
9 6 5 6 6 6 6
8 rows selected
Is there any other better way to accomplish this ? . The base table has around 10 million records.
I tried with MODEL clause .. was getting invalid cell reference.
Any better solution
Asim, August 31, 2022 - 9:50 am UTC
The first original qiestion was answered in year 2004. Version 9i, 8i
Is there any better way now using new features, if yes please mention the version frim which that features started
September 01, 2022 - 3:58 am UTC
There is a native MEDIAN function, eg
SQL> select deptno, median(sal)
2 from scott.emp
3 group by deptno;
DEPTNO MEDIAN(SAL)
---------- -----------
10 2450
20 2975
30 1375
Mediaan analytic
Asim, September 01, 2022 - 12:12 pm UTC
September 05, 2022 - 5:45 am UTC
stats_mode is available now
for a long time (back to 10g I think)
Median
Asim, September 05, 2022 - 11:53 am UTC
What about that documentation error?
Seems another documentation error
Asim, September 07, 2022 - 5:32 pm UTC
Seems another documentation error
Asim, September 07, 2022 - 5:32 pm UTC
September 12, 2022 - 3:11 am UTC
says OVER({querypartionclause})
whereas its working as
OVER([qierypartitionclause])
what does this mean?
Seems another documentation error
Asim, September 07, 2022 - 5:32 pm UTC
Sorry
Asim, September 08, 2022 - 4:00 am UTC
Sorry dont know how above post got submitted 3 times.
Means
Asim, September 12, 2022 - 4:59 am UTC
Means doc says querypartitionclause is mandatory in median analytic function, whereas queries are running fine without mentioning querypartitionclause.
September 27, 2022 - 3:28 pm UTC
Thanks for flagging this, I've raised it with the doc team.
2 errors
Asim, September 27, 2022 - 7:23 pm UTC
Have you raised both doc errors or only 1.
Just to summarize the above discussion.
There seems to be 2 doc errors in 21c
1. querypartitionclause is mentioned as mandatory in median analytic syntax diagram whereas it should be optional.
2.
MEDIAN is also an analytic function but is not shown in the list of analytic functions, its only listed in aggregate function.
September 29, 2022 - 1:52 am UTC
1) no - because the diagram says its optional
2) yes
Asim, September 29, 2022 - 3:58 am UTC
1.
I just rechecked and also refreshed that doc page, but diagram is still showing it mandatory
Because there is an arrow after OVER(-> that is pointing to the left end of the elipse mentioning query partition clause and from the right end of the elipse an arrow is originating and pointing to the closing bracket of OVER
ie. OVER(->qurypartclause->) , there is no second line/path in which arrow originating from first bracket of OVER and going directly to the closing bracket of OVER bypassing the query partition elipse. Even the
Description of the illustration median.eps
MEDIAN(expr) [ OVER (query_partition_clause) ]
is wrong.
It should have been
MEDIAN(expr) [ OVER ([query_partition_clause]) ]
notice the opening square bracket before q and closing square bracket after e
The same wrong diagram is of listagg analytic function.
To understand what I am saying and to see the correct diagram/example, just see the analytic syntax diagram of FIRST/LAST (not FIRST_VALUE) analytic function here
https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/FIRST.html#GUID-85AB9246-0E0A-44A1-A7E6-4E57502E9238 . There are two paths after OVER( the upper path, which is going through qurypartclause elipse, and there is lower path beneath the elipse which is directly going to closing bracket of OVER.
Description of the illustration first.eps
aggregate_function KEEP
( DENSE_RANK FIRST ORDER BY
expr [ DESC | ASC ]
[ NULLS { FIRST | LAST } ]
[, expr [ DESC | ASC ]
[ NULLS { FIRST | LAST } ]
]...
)
[ OVER ( [query_partition_clause] ) ]
Note the opening square bracket before q and closing square bracket after e.
Hope, I am able to convince you that its doc. Error.
In other words
Asim, September 29, 2022 - 4:14 am UTC
In other words the doc. Is saying that you can write
select median(expr) OVER(PARTITION BY col1) from....
But you cannot write
Select median(expr) OVER() from....
Whereas I am able to execute both above queries without any error. Which means either the software is not working according to the docs or the doc are not according the softtware.
Either its a software error or its a doc. error.
Median
Asim, September 29, 2022 - 4:54 am UTC
I think I got where you are mistaking. Note that I am NOT saying that the diagram is not showing the whole analytical syntax optional, I am saying that the diagram is not showing the qurypartion clause optional. ie the diagram says
you CAN
write this
SELECT MEDIAN(expr)
/* OVER(PARTITION BY col1) */ FROM.... ---- ------aggregate syntax
and you CAN write this
SELECT MEDIAN(expr)
OVER(PARTITION BY col1) FROM.... --- analytic ---with querypartition clause syntax
BUT CANNOT write this
SELECT MEDIAN(expr)
OVER( /* PARTITION BY col1 */ ) FROM....---
---- analytic without qurypart clause
Whereas, I am executing all the above 3 queries without any error in the oracle software.
September 29, 2022 - 7:40 am UTC
You're correct, I've raised this with doc