I need a slightly different result
Prince, October 03, 2002 - 3:07 pm UTC
Maybe I was not clear when I explained the case.
Assume the SAL column is what I have to pay, and I have 10K on hand. I need to pay them off in order of empno and show the remaining amount each employee has yet to receive.
In fact, the emp table is not a good example; it is just the table I had handy.
ENAME SAL SUM_SAL BAL ToReceive
---------- ---------- ---------- ---------- --------------
SMITH 800 800 9200 0
ALLEN 1600 2400 7600 0
WARD 1250 3650 6350 0
JONES 2975 6625 3375 0
MARTIN 1250 7875 2125 0
BLAKE 2850 10725 2850-2125 = 725
CLARK 2450 13175 2450
SCOTT 3000 16175 3000
KING 5000 21175 5000
TURNER 1500 22675 ..
ADAMS 1100 23775 ..
JAMES 950 24725 ..
FORD 3000 27725 ..
MILLER 1300 29025 ..
What I had was something like:
select *
  from (select empno, sal, ssum,
               lag(ssum) over (order by empno) lag_s,
               nvl(10000 - lag(ssum) over (order by empno), 10000) diff,
               sal - nvl(10000 - lag(ssum) over (order by empno), 10000) rem
          from (select empno, sal, sum(sal) over (order by empno) ssum
                  from emp) t1)
I can now apply a case on "diff":
if diff >= 0
display "0" if sign(sal - diff) = -1
else display "sal - diff"
else display SAL as "ToReceive"
But the one I have has so many levels of subquery. I want it to be simple and efficient. Also, can I stop the sum() once it reaches a value greater than 10K? I guess I need another level of nesting for this to work, since "window functions are not allowed" in the where clause.
Thanks for the case statement. I wasn't thinking about case at all. What I had before was just the SQL I pasted above.
October 03, 2002 - 6:46 pm UTC
Using inline views makes it simple.
Using inline views does not mean "inefficient"
Use inline views, use case....
I am not concerned about the inline views, but
Prince., October 04, 2002 - 12:17 am UTC
Thanks Tom,
I am not concerned much about the inline views. But I am wondering whether I am using too many of them and whether there is a better way to do this.
It looks to me like I am following a convoluted approach!
Do you see any simpler way of doing this? If not, I will continue with what I have right now.
October 04, 2002 - 8:17 am UTC
use analytics, use case and you are going down the right path IMO
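For illustration, the pay-off logic the question describes can be sketched procedurally. This is a hypothetical Python rendering of the same case-on-diff rule (the function name and `budget` parameter are my own, not from the thread):

```python
def to_receive(salaries, budget=10000):
    """For each (name, sal) in payment order, return how much each
    employee still has to receive after paying out of a fixed budget."""
    paid_so_far = 0
    owed = []
    for name, sal in salaries:
        remaining = budget - paid_so_far        # budget left before this row
        if remaining >= sal:
            owed.append((name, 0))              # fully paid
        elif remaining > 0:
            owed.append((name, sal - remaining))  # partially paid
        else:
            owed.append((name, sal))            # budget exhausted
        paid_so_far += sal
    return owed
```

This mirrors the SQL's `diff = 10000 - lag(ssum)`: MARTIN is still inside the budget, BLAKE is the partial row (2850 - 2125 = 725), and everyone after is owed their full salary.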
How can I do this?
Parag Jayant Patankar, January 03, 2005 - 5:04 am UTC
Hi Tom,
I am trying to do average using analytical function.
1. The transaction table is parag, with column a (a date) and column b (a number).
15:12:47 SQL> select * from parag;
A B
-------- ----------
20041130 5000
20041210 6000
20041229 10000
20041230 7000
I want to take an average for a financial calculation by doing the following:
select a, b, x, lag(b) over ( order by (a)) y, x *lag(b) over ( order by (a)) z
from
(
select a, b, to_date(a, 'RRRRMMDD') - to_date(lag(a) over (order by(a)), 'RRRRMMDD') x
from parag
)
/
A B X Y Z
-------- ---------- ---------- ---------- ----------
20041130 5000
20041210 6000 10 5000 50000
20041229 10000 19 6000 114000
20041230 7000 2 10000 20000
which is nothing but multiplying the amount by the number of days between transactions.
I have following questions
1. How do I add a last row to this same calculation, where the number of days is the month's last day minus the date of the last transaction? E.g., how do I add the 7000 to this output?
So my output should be
A B X Y Z
-------- ---------- ---------- ---------- ----------
20041130 5000
20041210 6000 10 5000 50000
20041229 10000 19 6000 114000
20041230 7000 2 10000 20000
1 7000 7000
Suppose my table is having following data
A B
-------- ----------
20041130 5000
20041210 6000
20041229 10000
A B X Y Z
-------- ---------- ---------- ---------- ----------
20041130 5000
20041210 6000 10 5000 50000
20041229 10000 19 6000 114000
2 10000 20000
arriving at 2 days (column x) by subtracting 29/12 from 31/12.
2. By doing this I am arriving at the average balance for the month. Is there any simpler method for arriving at this kind of calculation?
Kindly guide me for analytical functions.
regards & thanks
pjp
January 03, 2005 - 8:54 am UTC
too hard for me to turn "select * from's" into insert intos
How can I do this ? ( weighted average )
Parag Jayant Patankar, January 03, 2005 - 9:11 am UTC
Hi Tom,
I think I was not able to explain the question properly. I will try to put my question again.
I have a transaction table "parag" having following details
A B
-------- ----------
20041130 5000
20041210 6000
20041228 10000
20041229 7000
I want the output of the select statement to be as follows:
A B X Y Z
-------- ---------- ---------- ---------- ----------
20041130 5000
20041210 6000 10 5000 50000
20041228 10000 18 6000 108000
20041229 7000 1 10000 10000
2 7000 14000 <= in this last row, the number of days (column x) should be arrived at by subtracting the last transaction date from the last day of the month; here the value 2 comes from '20041231' - '20041229'.
How can I write a SQL statement to arrive at this, possibly using an analytic function?
regards & thanks
pjp
January 03, 2005 - 9:14 am UTC
in the future -- what I mean is that it is too hard for me to convert your stuff above into insert intos with create tables...
You do that. Give me something to cut and paste and run in sqlplus to get going.
But here -- just think "outer join". You need a table of dates that should be in your output. Outer join parag to that table of dates, and you have therefore filled in the blanks.
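The arithmetic being asked for, including the extra end-of-month row, can be sketched outside SQL. This is a hypothetical Python illustration of the days-times-balance computation (the function name and date handling are my own):

```python
from datetime import date

def weighted_balances(txns, month_end):
    """txns: list of (date, balance) sorted ascending.
    Returns (days, balance, days * balance) rows: one per gap between
    consecutive transactions, plus a final row covering the span from
    the last transaction to month_end."""
    rows = []
    for (d, bal), (next_d, _) in zip(txns, txns[1:]):
        days = (next_d - d).days
        rows.append((days, bal, days * bal))
    last_d, last_bal = txns[-1]
    days = (month_end - last_d).days            # e.g. 31-Dec minus 29-Dec = 2
    rows.append((days, last_bal, days * last_bal))
    return rows
```

For the sample data (5000 on 30-Nov, 6000 on 10-Dec, 10000 on 28-Dec, 7000 on 29-Dec) this reproduces the z column in the desired output, including the trailing 2 * 7000 = 14000 row.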
Pl find table creation and insert statements
Parag Jayant Patankar, January 03, 2005 - 9:22 am UTC
Hi Tom,
Please find below the SQL statements you required
create table parag ( a varchar2(8), b number );
insert into parag values ('20041130', 5000);
insert into parag values ('20041210', 6000);
insert into parag values ('20041228', 10000);
insert into parag values ('20041229', 7000);
My SQL, which is not giving the desired output, is:
select a, b, x, lag(b) over ( order by (a)) y, x *lag(b) over ( order by (a)) z
from
(
select a, b, to_date(a, 'RRRRMMDD') - to_date(lag(a) over (order by(a)),
'RRRRMMDD') x
from parag
);
thanks & regards
pjp
January 03, 2005 - 9:54 am UTC
did you even try the suggestion above? outer join to a table with all of your needed dates -- eg:
select to_date( :x, 'dd-mon-yyyy') + rownum-1 dt
from all_objects
where rownum <= to_char( last_day( to_date(:x,'dd-mon-yyyy') ), 'dd' )
will generate all of the days in a month (put 01-dec-2004 into :x)
outer join parag to that.
then analyze.
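Tom's rownum trick generates one row per calendar day of the month. The same date list can be sketched in Python (a hypothetical equivalent of the generator query, not part of the answer itself):

```python
from calendar import monthrange
from datetime import date, timedelta

def month_days(year, month):
    """One entry per calendar day of the month, like the
    rownum <= last_day(...) generator against all_objects."""
    first = date(year, month, 1)
    ndays = monthrange(year, month)[1]   # plays the role of last_day()
    return [first + timedelta(days=i) for i in range(ndays)]
```

Outer-joining the transactions to a list like this fills in a row for every day, which is what makes the carry-forward step possible.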
How to proceed ?
Parag Jayant Patankar, January 04, 2005 - 4:13 am UTC
Hi Tom,
Thanks for your tip. After adding the outer join with the following command:
select dt, l, m
from
(
select dt, nvl(b, 0) l,
dt - lag(dt) over (order by dt) m
from parag,
(
select to_date( '30-NOV-2004', 'dd-mon-yyyy') + rownum-1 dt
from all_objects
where rownum <= to_char(last_day(to_date('30-NOV-2004','dd-mon-yyyy') ),'dd') +2
order by 1
) x
where to_date(parag.a(+), 'RRRRMMDD') = dt
order by 1, 2, 3
);
I am having following output
DT L M
--------- ---------- ----------
30-NOV-04 5000
01-DEC-04 0 1
02-DEC-04 0 1
03-DEC-04 0 1
04-DEC-04 0 1
05-DEC-04 0 1
06-DEC-04 0 1
07-DEC-04 0 1
08-DEC-04 0 1
09-DEC-04 0 1
10-DEC-04 6000 1
11-DEC-04 0 1
12-DEC-04 0 1
13-DEC-04 0 1
14-DEC-04 0 1
15-DEC-04 0 1
16-DEC-04 0 1
17-DEC-04 0 1
18-DEC-04 0 1
19-DEC-04 0 1
20-DEC-04 0 1
21-DEC-04 0 1
22-DEC-04 0 1
23-DEC-04 0 1
24-DEC-04 0 1
25-DEC-04 0 1
26-DEC-04 0 1
27-DEC-04 0 1
28-DEC-04 10000 1
29-DEC-04 7000 1
30-DEC-04 0 1
31-DEC-04 0 1
32 rows selected.
Kindly guide me on how to proceed now. Sorry to trouble you with such a small problem, but I am not clear on how to proceed, and I am trying to learn from you how to tackle such situations in SQL.
regards & thanks
pjp
January 04, 2005 - 8:33 am UTC
how to proceed with what?
Required help
Parag Jayant Patankar, January 04, 2005 - 8:45 am UTC
Hi Tom,
from following output
DT L M
--------- ---------- ----------
30-NOV-04 5000
01-DEC-04 0 1
02-DEC-04 0 1
03-DEC-04 0 1
04-DEC-04 0 1
05-DEC-04 0 1
06-DEC-04 0 1
07-DEC-04 0 1
08-DEC-04 0 1
09-DEC-04 0 1
10-DEC-04 6000 1
11-DEC-04 0 1
12-DEC-04 0 1
13-DEC-04 0 1
14-DEC-04 0 1
15-DEC-04 0 1
16-DEC-04 0 1
17-DEC-04 0 1
18-DEC-04 0 1
19-DEC-04 0 1
20-DEC-04 0 1
21-DEC-04 0 1
22-DEC-04 0 1
23-DEC-04 0 1
24-DEC-04 0 1
25-DEC-04 0 1
26-DEC-04 0 1
27-DEC-04 0 1
28-DEC-04 10000 1
29-DEC-04 7000 1
30-DEC-04 0 1
31-DEC-04 0 1
I want a output like this
DT L M N
--------- ---------- ----------------------
30-NOV-04 5000 5000
01-DEC-04 0 1 5000 <= ( 5000 * 1, i.e. the value of column m times the last non-zero value of column l )
02-DEC-04 0 1 5000 <= ( 5000 * 1 )
03-DEC-04 0 1 5000 <= ( 5000 * 1 )
04-DEC-04 0 1 5000 <= ( 5000 * 1 )
05-DEC-04 0 1 5000 <= ( 5000 * 1 )
06-DEC-04 0 1 5000 <= ( 5000 * 1 )
07-DEC-04 0 1 5000 <= ( 5000 * 1 )
08-DEC-04 0 1 5000 <= ( 5000 * 1 )
09-DEC-04 0 1 5000 <= ( 5000 * 1 )
10-DEC-04 6000 1 6000 <= ( 6000 * 1 )
11-DEC-04 0 1 6000 <= ( 6000 * 1 )
12-DEC-04 0 1 6000 <= ( 6000 * 1 )
13-DEC-04 0 1 6000 <= ( 6000 * 1 )
14-DEC-04 0 1 6000 <= ( 6000 * 1 )
15-DEC-04 0 1 6000 <= ( 6000 * 1 )
16-DEC-04 0 1 6000 <= ( 6000 * 1 )
17-DEC-04 0 1 6000 <= ( 6000 * 1 )
18-DEC-04 0 1 6000 <= ( 6000 * 1 )
19-DEC-04 0 1 6000 <= ( 6000 * 1 )
20-DEC-04 0 1 6000 <= ( 6000 * 1 )
21-DEC-04 0 1 6000 <= ( 6000 * 1 )
22-DEC-04 0 1 6000 <= ( 6000 * 1 )
23-DEC-04 0 1 6000 <= ( 6000 * 1 )
24-DEC-04 0 1 6000 <= ( 6000 * 1 )
25-DEC-04 0 1 6000 <= ( 6000 * 1 )
26-DEC-04 0 1 6000 <= ( 6000 * 1 )
27-DEC-04 0 1 6000 <= ( 6000 * 1 )
28-DEC-04 10000 1 10000 <= (10000 * 1 )
29-DEC-04 7000 1 7000 <= ( 7000 * 1 )
30-DEC-04 0 1 7000 <= ( 7000 * 1 )
31-DEC-04 0 1 7000 <= ( 7000 * 1 )
How can I write SQL to achieve this, preferably using analytics?
thanks for helping me.
regards & thanks
pjp
January 04, 2005 - 8:58 am UTC
variation on a theme
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html Analytics to the rescue...
same concept, a "carry forward"
ops$tkyte@ORA9IR2> select dt, l,
2 to_number(substr(max(last_l) over (order by dt),11)) last_l, m
3 from (
4 select dt, l,
5 case when l is not null then to_char(row_number() over (order by dt),'fm0000000000')||l end last_l,
6 m
7 from
8 (
9 select dt, b l,
10 dt - lag(dt) over (order by dt) m
11 from parag,
12 (
13 select to_date( '30-NOV-2004', 'dd-mon-yyyy') + rownum-1 dt
14 from all_objects
15 where rownum <= to_char(last_day(to_date('30-NOV-2004','dd-mon-yyyy') ),'dd') +2
16 order by 1
17 ) x
18 where to_date(parag.a(+), 'RRRRMMDD') = dt
19 )
20 )
21 order by 1, 3, 4
22 /
DT L LAST_L M
--------- ---------- ---------- ----------
30-NOV-04 5000 5000
01-DEC-04 5000 1
02-DEC-04 5000 1
03-DEC-04 5000 1
04-DEC-04 5000 1
05-DEC-04 5000 1
06-DEC-04 5000 1
07-DEC-04 5000 1
08-DEC-04 5000 1
09-DEC-04 5000 1
10-DEC-04 6000 6000 1
11-DEC-04 6000 1
12-DEC-04 6000 1
13-DEC-04 6000 1
14-DEC-04 6000 1
15-DEC-04 6000 1
16-DEC-04 6000 1
17-DEC-04 6000 1
18-DEC-04 6000 1
19-DEC-04 6000 1
20-DEC-04 6000 1
21-DEC-04 6000 1
22-DEC-04 6000 1
23-DEC-04 6000 1
24-DEC-04 6000 1
25-DEC-04 6000 1
26-DEC-04 6000 1
27-DEC-04 6000 1
28-DEC-04 10000 10000 1
29-DEC-04 7000 7000 1
30-DEC-04 7000 1
31-DEC-04 7000 1
32 rows selected.
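The substr(max(...)) over (order by dt) construction is a "carry the last non-null value forward" idiom (later Oracle releases can express the same thing with LAST_VALUE ... IGNORE NULLS, though that is not shown in the thread). The underlying idea, sketched in Python (names are my own):

```python
def carry_forward(values):
    """Replace each None with the most recent non-None value seen so far."""
    last = None
    out = []
    for v in values:
        if v is not None:
            last = v        # remember the latest real observation
        out.append(last)    # emit it for every subsequent gap row
    return out
```

Prefixing each value with a zero-padded row number before taking max() is what lets the SQL version find "the latest non-null value so far" with a running window, since max() alone would compare the balances, not their positions.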
A reader, January 04, 2005 - 10:16 am UTC
create table txn
(
txn_id int primary key,
load_seq_no int
);
I want to find out
1. Min, max and average no. of txns per load seq_no
2. Number of unique load_seq_nos
3. Total number of records in the table
All in one query, with one full table scan -- this is a large table (50 million rows).
Help? Thanks
January 05, 2005 - 8:19 am UTC
1) is ambiguous -- you put "min", "max", and "avg" number of txns per load_seq_no together. I know what "min", "max" txn_id per load_seq_no is:
select min(txn_id), max(txn_id) from t group by load_seq_no;
but an "average" number of transactions per load_seq_no is
select avg( count(txn_id) ) from t group by load_seq_no (an aggregate of an aggregate)
so, I'll assume you want:
by load_seq_no, the min/max txn id, the cnt of txn_ids
and then, the avg number of cnt of txn_ids
the count of all records
the count of distinct load_seq_no's
ops$tkyte@ORA9IR2> select * from t;
TXN_ID LOAD_SEQ_NO
---------- -----------
1 1
2 1
3 1
4 2
5 2
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace on explain
ops$tkyte@ORA9IR2> select load_seq_no,
2 min_tid,
3 max_tid,
4 avg(cnt_tid) over () avg_txn,
5 sum(cnt_tid) over () tot_records,
6 count(*) over () num_unq_load_seq
7 from (
8 select load_seq_no, min(txn_id) min_tid, max(txn_id) max_tid, count(txn_id) cnt_tid
9 from t
10 group by load_seq_no
11 )
12 /
LOAD_SEQ_NO MIN_TID MAX_TID AVG_TXN TOT_RECORDS NUM_UNQ_LOAD_SEQ
----------- ---------- ---------- ---------- ----------- ----------------
1 1 3 2.5 5 2
2 4 5 2.5 5 2
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 WINDOW (BUFFER)
2 1 SORT (GROUP BY)
3 2 TABLE ACCESS (FULL) OF 'T'
ops$tkyte@ORA9IR2> set autotrace off
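The shape of that query -- aggregate once per group, then apply window aggregates over the grouped rows -- can be sketched in Python against the same sample data (a hypothetical illustration, not Oracle output):

```python
from collections import defaultdict

rows = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2)]  # (txn_id, load_seq_no)

# inner query: group by load_seq_no
groups = defaultdict(list)
for txn_id, seq in rows:
    groups[seq].append(txn_id)
per_group = {seq: (min(t), max(t), len(t)) for seq, t in groups.items()}

# outer query: window aggregates over the already-grouped rows
counts = [cnt for _, _, cnt in per_group.values()]
avg_txn = sum(counts) / len(counts)   # avg(cnt_tid) over ()
tot_records = sum(counts)             # sum(cnt_tid) over ()
num_unq_load_seq = len(per_group)     # count(*) over ()
```

One pass builds the per-group figures; the "window" step is then just arithmetic over those few grouped rows, which is why a single full scan suffices.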
what is wrong with analytics?
mikito, January 04, 2005 - 8:54 pm UTC
Zillions of extra unnecessary concepts.
* lag with an offset parameter -- I was naively assuming that lag(lag) must work
* confusing window definition with preceding, unbounded, following
How should analytical SQL have been designed? Well, let's see what concepts are really required.
First, we have to be able to introduce a counter on an ordered result set. Any query block that has an "order by" clause should have an implicit row_num() pseudocolumn. This is IMO one of the major weaknesses of current SQL: they have underutilized the "order by" clause. Now, once we have row numbers, we can define windows any way we like with a scalar subquery within the select clause. When leveraging subqueries we specify a window with predicates, while analytic ANSI SQL uses COBOL-style descriptions like "RANGE BETWEEN INTERVAL '1' DAY PRECEDING AND INTERVAL '1' DAY FOLLOWING".
Admittedly,
with ordemp as (
select e.*, rownum rn from (
select * from emp e order by deptno, sal
) e
)
select oe.*,
(select oe.rn - min(rn) from ordemp oee
where oee.deptno = oe.deptno)
from ordemp oe;
looks a little bit less succinct than
select e.*,
row_number()
over (partition by deptno order by sal)
from emp e;
mostly because the row_number() functionality that I have described is not there. On the brighter side, when writing the first query I don't have to consult the reference manual every 5 minutes.
January 05, 2005 - 9:24 am UTC
yawn, boring...
we know you don't understand "why they are cool"
we know you don't like them
we know you don't like aggregates and count(*) either
and frankly..... :) well -- you can fill in the blanks.
I don't have to consult the manuals every 5 minutes for them. And on the bright side, my stuff runs tons faster -- that is a bonus. I won't even comment on the readability or understandability of your convoluted solution with the "with" clause.
I truly and utterly fail to see how:
row_number() over (PARTITION BY DEPTNO
ORDER BY SAL)
is "hard" -- either to remember, or to type. If you do top-n processing lots, guess what, that statement just falls out of your hands and into the keyboard.
What I do have to reference the manual for is "how to create a partitioned table" -- why? because I don't do that 50 times a day.
So, if you are reaching for the manual every 5 minutes to remember the syntax of something you do every 5 minutes -- well, that might speak to your short-term memory more than anything else. If you use analytics, you'll know the syntax. It takes very little time to "get them": you have partition (break the result set up), you have order by (you should already know that stuff), and then you have the less frequently used RANGE/ROWS clause -- which is the "hard part" conceptually, but also the "hardly used" part.
There ain't nothing wrong with analytics.
There is something wrong with not learning them.
Stats
A reader, January 05, 2005 - 9:45 am UTC
I had come up with the following before I read your solution
SELECT MIN(cnt),MAX(cnt),MIN(dist),MAX(dist),MIN(total),AVG(cnt),
percentile_cont(0.5) within GROUP (ORDER BY cnt) median1,
percentile_disc(0.5) within GROUP (ORDER BY cnt) median2
FROM (
SELECT
COUNT(*) over (PARTITION BY load_sequence_number) cnt,
COUNT(DISTINCT load_sequence_number) over () dist,
COUNT(*) over () total
FROM t
)
Worked fine, but took 20 minutes for a 50 million row table. Isn't that excessive? If I take out the percentile_* functions, it goes down to 10 minutes. Those functions double the run-time!? Why?
Thanks
January 05, 2005 - 10:51 am UTC
use sql_trace and tkprof to see what things are happening.
think about the work that needs to be performed. You are applying analytics to analytics -- it is virtually like running the query twice: the inner query to temp, probably, then the outer query from temp to temp, and then you get the answer.
larger pga_aggregate_targets or sort_area_sizes might be called for but use a 10046 level 12 trace, tkprof it and see what you see.
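As a side note, the two percentile functions in the query compute different things. A rough Python sketch of their semantics (my own approximation of the definitions, not Oracle source):

```python
import math

def percentile_cont(values, p):
    """Continuous percentile: linear interpolation between the two
    closest ranks, like PERCENTILE_CONT."""
    s = sorted(values)
    k = (len(s) - 1) * p            # fractional rank
    lo = math.floor(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (k - lo) * (s[hi] - s[lo])

def percentile_disc(values, p):
    """Discrete percentile: the smallest actual value whose cumulative
    distribution is >= p, like PERCENTILE_DISC."""
    s = sorted(values)
    idx = max(math.ceil(len(s) * p) - 1, 0)
    return s[idx]
```

With an even number of rows the two medians differ: PERCENTILE_CONT interpolates between the middle pair, while PERCENTILE_DISC picks an actual row value -- which is why the original query computes both.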
Analytics and ANSI SQL
Su Baba, October 03, 2006 - 4:54 pm UTC
In chapter 12, p 545 of your book "Expert Oracle One-On-One", you mentioned that "these extensions [analytic functions] are currently under review by the ANSI SQL committee for inclusion in the SQL specification."
Do you know if this has already happened or will happen soon?
October 03, 2006 - 4:59 pm UTC
I believe it has, but so have a billion other things. Anyone that says "we are fully ansi compliant" is lying to you :) there are so many things in the spec that I seriously doubt anyone will ever be compliant again.