"How to get max and sum from two different tables in the same select",
Prasanna, November 06, 2002 - 5:19 pm UTC
Is there a different way to approach this problem possibly looking at the data model aspect?
I do not know the full details of this database, but seems to me that between invoice payments and invoice accounting the relationship is one to one.
In that case, going back to Tom's example, tables t2 and t3 would need an additional column "id2".
Then the query becomes simpler
ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> select t1.id,
2 max(payment_date) max_pay_date,
3 to_number(substr( max(to_char(t2.payment_date,'yyyymmddhh24miss') ||
t2.amt), 15 )) max_t2_amt,
4 sum(t3.amt) sum_t3_amt --/count(distinct t2.rowid) --No need for Count
5 from t1, t2, t3
6 where t1.id = t2.id and --t2.id = t3.id
t2.id2 = t3.id2
7 group by t1.id
8 /
November 06, 2002 - 6:32 pm UTC
that would be true if and only if it was one to one -- but they said it wasn't -- it was 1:n and 1:m -- hence you need to divide to get the sum back right.
Not sure...
Kashif, November 06, 2002 - 5:49 pm UTC
Hi Tom,
I'm a little unclear on your second query, in which you use the count (distinct t2.rowid). How does it work? I know the count (distinct t2.rowid) will give the number of distinct rowids per id, but that's where I get lost. If we don't divide by it, we'll get the sum of amts in t3 times the number of rows in t2. But why? And why distinct? I guess my understanding of joins and groups is a little lacking here... Any feedback would be useful. Thanks.
Kashif
November 06, 2002 - 6:38 pm UTC
Ok, let's say there are
1 row in t1 with id = 1
2 rows in t2 with id = 1
3 rows in t3 with id = 1
when we join:
from t1, t2, t3
where t1.id = t2.id and t1.id = t3.id
we'll get 6 rows -- (1*2 from joining t1 to t2) * 3 from joining that to t3
We apparently are counting each T3.amt 2 times -- once for each row in T2. That is the nature of the join. Look:
ops$tkyte@ORA920.US.ORACLE.COM> create table t1 ( id int );
Table created.
ops$tkyte@ORA920.US.ORACLE.COM> create table t2 ( id int, msg varchar2(10) );
Table created.
ops$tkyte@ORA920.US.ORACLE.COM> create table t3 ( id int, amt int );
Table created.
ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> insert into t1 values ( 1 );
1 row created.
ops$tkyte@ORA920.US.ORACLE.COM> insert into t2 values ( 1, 'row1' );
1 row created.
ops$tkyte@ORA920.US.ORACLE.COM> insert into t2 values ( 1, 'row2' );
1 row created.
ops$tkyte@ORA920.US.ORACLE.COM> insert into t3 values ( 1, 2 );
1 row created.
ops$tkyte@ORA920.US.ORACLE.COM> insert into t3 values ( 1, 4 );
1 row created.
ops$tkyte@ORA920.US.ORACLE.COM> insert into t3 values ( 1, 6 );
1 row created.
ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> select t1.id, t2.msg, t3.amt
2 from t1, t2, t3
3 where t1.id = t2.id
4 and t1.id = t3.id
5 /
ID MSG AMT
---------- ---------- ----------
1 row1 2
1 row2 2
1 row1 4
1 row2 4
1 row1 6
1 row2 6
6 rows selected.
So -- each row in T3 is "amplified" or "multiplied" by each row in T2. two rows in T2 -- then T3's rows will appear in the join twice. Three rows in T2 -- then T3's rows will appear in the join three times. and so on.
So, when we sum them up:
ops$tkyte@ORA920.US.ORACLE.COM> select t1.id, sum(t3.amt), count(distinct t2.rowid)
2 from t1, t2, t3
3 where t1.id = t2.id
4 and t1.id = t3.id
5 group by t1.id
6 /
ID SUM(T3.AMT) COUNT(DISTINCTT2.ROWID)
---------- ----------- -----------------------
1 24 2
ops$tkyte@ORA920.US.ORACLE.COM>
we double counted (in this example) everything. Dividing the SUM by the number of rows in T2 that contributed to this group will fix that -- 24/2 = 12 = 6+4+2...
Does it not have to be count(distinct t2.rowid) always?
Ramakrishna, November 07, 2002 - 3:34 am UTC
Dear Tom,
Thanks for your detailed explanation. You have said in
your response that "count(distinct rowid) could be count(distinct t2.id) as well". But they return different results! The count(distinct t2.id) for a given t1.id always returns 1, unlike count(distinct t2.rowid) which will return the factor to "de-amplify" by. Is that right?
Also, the Discoverer and Business Objects documentation
seem to call this 'Fan Trap' or 'Chasm Trap'.
It would be great if you could list the generic conditions
under which such connection traps can occur in SQL so that we will know what to watch out for.
thanks,
Ramakrishna
November 07, 2002 - 7:42 am UTC
DOH -- sorry about that -- you are correct of course, needs to be count(distinct t2.rowid) -- we need to know the number of rows.......
I've never heard of "Fan Trap" or "Chasm Trap" (I can see the "fan" fitting in -- it "fans out" the result set -- chasm-- that one doesn't click with me)
Anyway -- here are the rules:
If
a) you are using aggregation
b) and you have more than 2 tables in the query (you have N tables resulting in N-1 joins)
c) and there are NOT at least "N-2" one to one relationships involved
you have this issue
Eg: suppose you have:
select ...
  from t1, t2, t3
 where t1.x = t2.x
   and t1.x = t3.x          -- N=3, there are 2 joins
t1 1:m t2
t1 1:m t3                   -- there are 0 1:1 conditions; 0 < 3-2, so this "fans out"
Suppose you have:
select ...
  from t1, t2, t3
 where t1.x = t2.x
   and t1.x = t3.x          -- N=3, there are 2 joins
t1 1:1 t2
t1 1:m t3                   -- there is 1 1:1 condition; 1 >= 3-2, so this does NOT fan out
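Applying the rule to the first case, one way to get the sum back right (a sketch against the t1/t2/t3 sample tables created earlier in this thread) is to divide out the fan factor:

```sql
-- t2 fans out t3's rows, so divide the sum by the number of
-- distinct t2 rows that contributed to each group
select t1.id,
       sum(t3.amt) / count(distinct t2.rowid) sum_t3_amt
  from t1, t2, t3
 where t1.id = t2.id
   and t1.id = t3.id
 group by t1.id;
-- with the sample data above: 24 / 2 = 12 = 2+4+6
```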
How about a new book: "Cool Little SQL Tricks"
Chuck Jolley, November 07, 2002 - 10:18 am UTC
to_number(substr( max(to_char( t2.payment_date, 'yyyymmddhh24miss') ||
t2.amt), 15 )) max_t2_amt
Clever!
Thanks,
Chuck
November 07, 2002 - 10:45 am UTC
Well, actually, that is what my next book is mutating into -- more of a best practices book, rather than a formal "performance tuning text book".
Similar to:
</code>
http://www.amazon.com/exec/obidos/ASIN/0201310058 <code>
curious -- what do people think of that format for an Oracle best practices book? there are online samples of the content so you can see what it would look like. (i have about 100 items in my list)
Thanks.
Kashif, November 07, 2002 - 11:57 am UTC
Thanks Tom, the explanation was fantastic. When do you think you might have a revision of "Expert One on One" out for 9i, by the way? The best practices, bullet-point style is perfect for recommending tips and tricks, especially for intermediate to advanced level programmers, which is probably the audience interested in such a text. It prevents us readers from having to sift through tons of text, dog-earing each page we're interested in, or even worse highlighting lines in the text with fluorescent markers. To get what we're looking for, we can just zoom in on the topic we're interested in using the TOC. I've read a best practices book by Feuerstein, and while I don't agree with some of his points, it was really hands-down the most effective tips book I have read yet, primarily as a result of its concise, bullet-point style.
Kashif
P.S. Is there a way to get in line to order a copy of the book now?
November 07, 2002 - 12:36 pm UTC
I'm not planning on doing an Expert one on one for 9i - for a couple of reasons.
One is that 90% of the current one applies 100% to 9i (there is just more STUFF in 9i, lots more). So, the guts of the book is relevant.
Second is -- well, time is marching on and the next release would come out before I could do it -- so, I'm thinking about targeting that one ;)
I don't even have an ISBN for the other book as yet.
Interesting...
jan van mourik, November 07, 2002 - 12:02 pm UTC
That "Best Practices" books looks interesting, well, if I were a Java developer :-) I'd definitely be interested in your Oracle version...
Sufficient condition for fan trap?
Ramakrishna, November 08, 2002 - 6:18 am UTC
Dear Tom,
Thanks for your answer. Chasm trap and Fan trap seem to
be used to mean much the same thing. I went through your
conditions when this occurs and wanted to know if it is
correct to restate them as follows:
The NECESSARY conditions for a 'fan trap' to occur in a
query are:
1. Aggregation is being done
2. Involves more than one 1:m relationship
However, I am unable to formulate the SUFFICIENT
conditions. For example, there is one case where a query
meets both the above criteria but has no fan trap which is:
Queries on the classic 'star schema' used in data
warehouses involve many 1:m relationships and aggregations
are done on the central fact table and there is no problem.
For example, you could have a Day table, a Product table and a Region table all linked via 1:m relationships to a
central Sales table but a query joining all these 4 tables
to find total sales by, say, Month, by Category and by
Geography does not face this problem.
I want to know if it will be possible to state the SUFFICIENT condition for this in plain English so that I can
use it to check any multi-table queries easily.
thanks,
ramakrishna
November 08, 2002 - 8:06 am UTC
Queries on star schemas DO NOT involve more than one 1:m -- they are all 1:1 by definition!!!
You have a series of dimension tables, you have a single FACT table.
The relationship from FACT to DIMENSION is always 1:1 (the relationship from DIMENSION to FACT is generally 1:M but it is not relevant here -- you join FACT to a SINGLE DIMENSION row -- you join from FACT to a dimensions primary key)
So, with a star schema -- you have a whole bunch of 1 to 1 relationships.
You were looking at the relationship the wrong way....
You stated the sufficient conditions.
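For instance (a sketch with made-up dimension and fact table names, not tables from this thread):

```sql
-- each SALES row joins to exactly one DAY_DIM row and one
-- PRODUCT_DIM row (fact -> dimension primary key), so the
-- aggregate does not fan out
select d.month, p.category, sum(s.amount) total_sales
  from sales s, day_dim d, product_dim p
 where s.day_id     = d.day_id
   and s.product_id = p.product_id
 group by d.month, p.category;
```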
group functions
Amit, November 12, 2002 - 11:30 pm UTC
Dear Tom,
My requirement is I 've to find the sum of quantity and the date corresponding to it but display only the max of the summed up value and its corresponding date.I am using the following query :
select * from (select to_char(docdt,'DD-MON-YYYY'),sum(val/100000) from t
where trunc(dt) >= to_date(:fdt,'dd-mon-yyyy')
and trunc(dt) < to_date(:tdt,'dd-mon-yyyy')+1
group by to_char(dt,'DD-MON-YYYY') order by sum(val/100000) desc) where rownum=1
If my from and to dates are far apart, the query takes more time to execute.
Any improvement is greately appreciated.
TIA
November 13, 2002 - 1:21 pm UTC
Hmm, the more data I have to scan and process the longer the query takes.... seems reasonable to me?
But anyway there is definite room for improvement. for example:
select * from (select to_char(docdt,'DD-MON-YYYY'),sum(val/100000) from t
where trunc(docdt) >= to_date(:fdt,'dd-mon-yyyy')
and trunc(docdt) < to_date(:tdt,'dd-mon-yyyy')+1
group by to_char(docdt,'DD-MON-YYYY') order by sum(val/100000) desc) where rownum=1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.46 0.48 72 75 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.46 0.48 72 75 0 1
********************************************************************************
select *
from (
select trunc(docdt), sum(val/100000)
from t
where docdt >= to_date(:fdt,'dd-mon-yyyy')
and docdt < to_date(:tdt,'dd-mon-yyyy')+1
group by trunc(docdt)
order by 2 desc
)
where rownum = 1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.22 0.24 72 75 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.22 0.24 72 75 0 1
those two queries are the same but by removing many of the unnecessary function calls you are making -- we can cut the CPU needs in half (my table t was created by select created, object_id from all_objects by the way).
I looked at:
where trunc(dt) >= to_date(:fdt,'dd-mon-yyyy')
Now, the trunc(dt) isn't needed since DT >= trunc(dt) always and if trunc(dt) >= to_date(:fdt) then DT by definition is >= to_date(:fdt)
Looking at
trunc(dt) < to_date(:tdt,'dd-mon-yyyy')+1
we can also say: if DT = 01-jan-2002 23:59:59, then trunc(dt) = 01-jan-2002 00:00:00. And if
01-jan-2002 00:00:00 < to_date(:tdt)+1
then to_date(:tdt)+1 must be 02-jan-2002 OR GREATER (the minimum it could be is 02-jan-2002).
Hence, if trunc(dt) < to_date(:tdt)+1 in this case, dt is ALSO < to_date(:tdt)+1 (because trunc just whacks off the time component here)
So, we can remove lots of trunc calls. I also changed the TO_CHARS (lots of overhead there to string-ize a date) to TRUNC -- achieving the same results but at a lower CPU cost:
select to_char(docdt,'dd-mon-yyyy')
from
t group by to_char(docdt,'dd-mon-yyyy')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 3 0.16 0.19 72 75 0 17
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 0.16 0.19 72 75 0 17
select trunc(docdt)
from
t group by trunc(docdt)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 3 0.08 0.11 72 75 0 17
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 0.08 0.11 72 75 0 17
Now, the removal of the functions on docdt in the where clause will also permit this to use an index on docdt when/if appropriate -- if you happen to have an index on DOCDT,VAL -- you would get the biggest bang for the buck here as well.
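For example (the index name here is made up; this assumes the table t(docdt, val) from Tom's test):

```sql
create index t_docdt_val_idx on t(docdt, val);
-- with the trunc() calls gone from the where clause, the range
-- predicate on docdt can use this index -- and because VAL is in
-- the index too, the query can be answered from the index alone
```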
group functions
Amit, November 12, 2002 - 11:39 pm UTC
Dear Tom,
Sorry.I just muffed up with the above query the correct one is :
select * from (select to_char(dt,'DD-MON-YYYY'),sum(val/100000) from t
where trunc(dt) >= to_date(:fdt,'dd-mon-yyyy')
and trunc(dt) < to_date(:tdt,'dd-mon-yyyy')+1
group by to_char(dt,'DD-MON-YYYY') order by sum(val/100000) desc) where rownum=1
TIA
November 13, 2002 - 1:22 pm UTC
see above -- i figured dt and docdt were one and the same...
what im waiting for
Bill, November 13, 2002 - 1:59 pm UTC
is a line of cool clothes called BIND VARIABLES
the tag on the clothes could be :TOM
Explanation of the terms "fan trap" and "chasm trap"
Dan Kefford, November 13, 2002 - 2:46 pm UTC
Randy, November 10, 2004 - 11:21 am UTC
Tom,
I have a query that takes the sum from two tables then divides the sum to give % of program. The query below is an example, however, the output is not correct.
select i.appn, p.program_name, i.i_pron, i.ssn,
sum(i.APPN_DOLLARS - i.WITHHELD + i.REPROGRAM - i.TAXES) "current_program",
sum(l.commitment), sum(l.obligations), sum(l.obligations) / sum(i.APPN_DOLLARS - i.WITHHELD + i.REPROGRAM - i.TAXES) "program_percent"
from program p, i_pron i, ledger l
where p.dispid = i.program
and i.dispid = l.i_pron
group by i.ssn, p.program_name, i.appn, i.i_pron
The sum of "current_program" and "program_percent" does not add up correctly. However, the sum of commitment and obligations all show correct totals. Where am I going wrong with this?
Thanks in advance,
Randy
November 10, 2004 - 12:24 pm UTC
show me, give me a "for example"
Randy, November 10, 2004 - 2:46 pm UTC
Ok, I have a query shown below...
1 select i.dispid, p.program_name, i.appn, i.ssn,
2 sum(i.APPN_DOLLARS - i.WITHHELD + i.REPROGRAM - i.TAXES) "current_program"
3 from i_pron i, program p
4 where i.program = p.dispid
5* group by i.dispid, i.ssn, i.appn, p.program_name
SQL> /
DISPID PROGRAM_NAME APPN SSN current_program
---------- ---------------------------------------- --------------- -------------- ---------------
41 Test Mod Line WTCV GZ2400 41565000
42 Carrier MODs WTCV GB1930 40770000
68 Reimb Customer-OMA Customer - OMA 123207.NC000 18958418.9
"current_program" show's the correct total. However, when I add the third table to the from line all the current_program totals change.
1 select i.dispid, p.program_name, i.appn, i.ssn,
2 sum(i.APPN_DOLLARS - i.WITHHELD + i.REPROGRAM - i.TAXES) "current_program"
3 from i_pron i, program p, ledger l
4 where i.program = p.dispid
5* group by i.dispid, i.ssn, i.appn, p.program_name
SQL> /
DISPID PROGRAM_NAME APPN SSN current_program
---------- ---------------------------------------- --------------- -------------- ---------------
41 Test Mod Line WTCV GZ2400 8687085000
42 Carrier MOD WTCV GB1930 8520930000
68 Reimb Customer-OMA Customer - OMA 123207.NC000 3962309548
November 10, 2004 - 8:14 pm UTC
yes, well, you did a cartesian product to ledger there.
where is the join to L?
Group functions
Peg, December 29, 2004 - 10:34 am UTC
Thanks so much! I've been looking for a technique to solve this type of problem (group functions on multiple tables)for days!
How to sum the dimension table columns?
Michael, January 30, 2005 - 6:33 am UTC
Hi Tom,
I have very similar problem,
I have 2 table: dimension and fact.
For example:
My dimension table stores book. The structure is: book_id, shelf_id, num_of_pages
In the fact table I have sales data: book_id, date, price.
The tables contains the following records:
Books:
book_id shelf_id num_of_pages
----------------------------------------------------------
1 1 100
2 1 200
sales:
book_id date price
-------------------------------------------------------
1 d1 10
1 d2 11
2 d1 20
2 d2 21
2 d3 22
2 d4 23
I want to create a query that group the data by shelfs and present the total sales (according to price) and the sum of pages for all the books that were sold. The problem is the number of pages is multiplied by the number of sales.
For example the following SQL:
Select sum(num_of_pages) num_of_pages
, sum(price) total_price
from books, sales
where books.book_id = sales.book_id
group by books.shelf_id;
I'll get num_of_pages = 1000 (100 * 2 + 200 * 4), total_price = 107.
In this case, even dividing by a count will not work, since some of the records are multiplied by 2 and some by 4.
Can you help?
January 30, 2005 - 9:37 am UTC
if i only had a create table and some inserts... i can see how to do this with analytics. do you?
here are the scripts
Michael, January 30, 2005 - 10:59 am UTC
CREATE TABLE books
( book_id NUMBER,
shelf_id NUMBER,
num_of_pages NUMBER);
Insert into books
(BOOK_ID, SHELF_ID, NUM_OF_PAGES)
Values
(1, 1, 100);
Insert into books
(BOOK_ID, SHELF_ID, NUM_OF_PAGES)
Values
(2, 1, 200);
CREATE TABLE books_sales
( book_id NUMBER,
sales_date DATE,
price NUMBER
);
Insert into books_sales
(BOOK_ID, SALES_DATE, PRICE)
Values
(1, TO_DATE('01/30/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 10);
Insert into books_sales
(BOOK_ID, SALES_DATE, PRICE)
Values
(1, TO_DATE('01/31/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 11);
Insert into books_sales
(BOOK_ID, SALES_DATE, PRICE)
Values
(2, TO_DATE('01/30/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 20);
Insert into books_sales
(BOOK_ID, SALES_DATE, PRICE)
Values
(2, TO_DATE('01/31/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 21);
Insert into books_sales
(BOOK_ID, SALES_DATE, PRICE)
Values
(2, TO_DATE('02/01/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 22);
Insert into books_sales
(BOOK_ID, SALES_DATE, PRICE)
Values
(2, TO_DATE('02/02/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 23);
January 30, 2005 - 12:33 pm UTC
actually, don't even need analytics -- just roll up the sales to the book level, join, then aggregate up to the shelf level.
select a.shelf_id, sum(a.num_of_pages), sum(b.price)
from books a,
(select sum(price) price, book_id
from books_sales
group by book_id ) b
where a.book_id = b.book_id
group by a.shelf_id;
what's the solution you have?
Gabe, January 30, 2005 - 11:00 am UTC
create table book
( book_id number(9) primary key
,shelf_id number(9) not null
,num_of_pages number(9) not null
);
insert into book values (1,1,100);
insert into book values (2,1,200);
insert into book values (3,2,300);
insert into book values (4,3,120);
insert into book values (5,3,120);
create table sale
( book_id number(9) not null references book(book_id)
,sale_dt date not null
,price number(9) not null
);
insert into sale values (1, sysdate-9, 10);
insert into sale values (1, sysdate-8, 11);
insert into sale values (2, sysdate-7, 20);
insert into sale values (2, sysdate-6, 21);
insert into sale values (2, sysdate-5, 22);
insert into sale values (2, sysdate-4, 23);
insert into sale values (3, sysdate-3, 30);
insert into sale values (3, sysdate-2, 40);
insert into sale values (3, sysdate-1, 50);
insert into sale values (4, sysdate+1, 30);
insert into sale values (4, sysdate+2, 40);
insert into sale values (5, sysdate+3, 50);
SQL> select shelf_id, sum(pages), sum(sales)
2 from ( select b.shelf_id
3 ,b.book_id
4 ,b.num_of_pages pages
5 ,sum(s.price) sales
6 from sale s, book b
7 where s.book_id = b.book_id
8 group by b.shelf_id, b.book_id, b.num_of_pages
9 )
10 group by shelf_id
11 ;
SHELF_ID SUM(PAGES) SUM(SALES)
---------- ---------- ----------
1 300 107
2 300 120
3 240 120
I can do some analytics as well ... but is it worth it (performance, elegance)?
Unless I'm missing something here!
The thing that is missing...
Michael, January 31, 2005 - 2:43 am UTC
That is great but my problem is that the format of the SQL is strict since it is going into a template:
The template is:
Select shelf_id,
<expression that will sum pages>,
<expression that will sum sales>
from sale s, book b
where s.book_id = b.book_id
group by b.shelf_id
so I cant add sub selects or additional group by. I need to add this into one expression that is grouped by shelf.
January 31, 2005 - 8:15 am UTC
The thing that is missing is you have cut out the power of SQL in the name of "generic code"
That is the thing that is missing. If you lobotomize SQL, you get -- well, brain dead SQL.
What is this "template" thing all about (besides sounding like a bad idea(tm))...
It would take two levels of aggregation to do this. I don't believe you'll "get there from here" with that template (without inline views, at the very least)
not for nothing
Michael, January 31, 2005 - 8:49 am UTC
We have a generic system that allows the user to configure what data he wants to look at.
The user configures parts of the select statement and also on what columns to do the group by.
The application combines this into a whole SQL. So what we have is some kind of an SQL generator.
I know that we can get this data by running different SQL, but enabling the application to run this kind of SQL will force us to make the application's SQL generator even more complex.
This is why I thought that we can achieve this data by using analytic functions.
My logic is: Oracle multiplies the records in the books table by the records in the sales table as a result of the join between the 2 tables. Maybe there is a function that can sum this data before the multiplication.
January 31, 2005 - 9:16 am UTC
I know they are "not for nothing", but they always -- repeat always -- come at a cost, a pretty high cost if you ask me after seeing so many variations of them over many many many years.
can "expression" be a "scalar subquery"
If so
ops$tkyte@ORA10GR1> select A.shelf_id,
2 sum(A.num_of_pages) pages,
3 (select sum(price)
4 from sale, book
5 where sale.book_id = book.book_id
6 and book.shelf_id = A.shelf_id) price
7 from book A
8 group by A.shelf_id;
SHELF_ID PAGES PRICE
---------- ---------- ----------
1 300 107
2 300 120
3 240 120
almost
Michael, January 31, 2005 - 9:20 am UTC
It can be a subselect but the main select must join the books and sales tables.
January 31, 2005 - 9:45 am UTC
"why" he says. you just need to use the template against "book" instead of "book and sales"
anyway, what we'd need is the factor to divide "num_of_pages" by (the number of rows it was multiplied by). that would be CNT:
ops$tkyte@ORA10GR1> select a.shelf_id,
2 a.num_of_pages,
3 count(distinct b.rowid) over (partition by a.rowid) cnt,
4 b.price
5 from book a, sale b
6 where a.book_id = b.book_id
7 /
SHELF_ID NUM_OF_PAGES CNT PRICE
---------- ------------ ---------- ----------
1 100 2 10
1 100 2 11
1 200 4 20
1 200 4 21
1 200 4 22
1 200 4 23
2 300 3 30
2 300 3 40
2 300 3 50
3 120 2 30
3 120 2 40
3 120 1 50
12 rows selected.
<b>so, using an inline view, we could:</b>
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> select shelf_id, sum(num_of_pages/cnt), sum(price)
2 from (
3 select a.shelf_id,
4 a.num_of_pages,
5 count(distinct b.rowid) over (partition by a.rowid) cnt,
6 b.price
7 from book a, sale b
8 where a.book_id = b.book_id
9 )
10 group by shelf_id
11 /
SHELF_ID SUM(NUM_OF_PAGES/CNT) SUM(PRICE)
---------- --------------------- ----------
1 300 107
2 300 120
3 240 120
<b>but it won't work as an analytic without the inline view:</b>
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> select a.shelf_id,
2 sum(a.num_of_pages/
3 count(distinct b.rowid) over (partition by a.rowid)) pages,
4 sum(b.price)
5 from book a, sale b
6 where a.book_id = b.book_id
7 group by shelf_id
8 /
count(distinct b.rowid) over (partition by a.rowid)) pages,
*
ERROR at line 3:
ORA-30483: window functions are not allowed here
<b>so that is out, but now you know what you need -- the count of rows in sale for that book id</b>
ops$tkyte@ORA10GR1> select a.shelf_id,<b>
2 sum(a.num_of_pages/
3 (select count(*) from sale where book_id = a.book_id)) pages,</b>
4 sum(b.price) price
5 from book a, sale b
6 where a.book_id = b.book_id
7 group by shelf_id
8 /
SHELF_ID PAGES PRICE
---------- ---------- ----------
1 300 107
2 300 120
3 240 120
ops$tkyte@ORA10GR1>
<b>there, a way underperforming query that fits the template</b>
I am for some reason reminded of a childhood show -- the Brady Bunch (you might not know it at all). There was an Episode "Johnny Bravo"
http://www.bradyresidence.com/bravo99.html
Greg Brady got to be a "rock star" just because the "suit fit" -- not for any other reason. Seems to fit here for some reason :)
You are the KING
Michael, January 31, 2005 - 10:43 am UTC
Thanks,
This is great.
Fits like a glove.
In addition, it will not be so bad performance-wise, since we have an index on sales.book_id, so the subquery performs a fast scan on the index.
Helped a lot.
January 31, 2005 - 11:33 am UTC
over and over and over again. anything done a lot adds up, remember....
ok
Siva, January 31, 2005 - 11:42 am UTC
Hi Tom,
What is the difference between Inline view and subquery?
Can they replace each other?
Please reply.
January 31, 2005 - 11:56 am UTC
inline view can be used where a view could be used:
select * from VIEW
select * from ( select * from t );
a subquery is a set used in a predicate
where x in ( subquery );
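Side by side (a sketch using the classic SCOTT emp/dept demo tables, which are an assumption here):

```sql
-- inline view: a query in the FROM clause, treated like a table
select deptno, cnt
  from ( select deptno, count(*) cnt
           from emp
          group by deptno );

-- subquery: a query producing a set used in a predicate
select dname
  from dept
 where deptno in ( select deptno from emp );
```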
What is (tm)?
Menon, January 31, 2005 - 12:10 pm UTC
"What is this "template" thing all about (besides sounding like a bad idea(tm)).."
E.g. in the above sentence you have "idea (tm)."
I keep seeing it and thought it stood for trademark.
However, that does not fit in the above statement?
January 31, 2005 - 12:20 pm UTC
it is a "joke"
that's a really good idea "(tm)"
sum of same fields in two different tables
Sonali Sherma, February 10, 2005 - 1:23 pm UTC
Hi Tom,
I am facing a problem, hope you would help
I have two tables.. here is the sample data..
create table t1
(item_ID NUMBER(2),
book_ID NUMBER(2),
price NUMBER(2))
/
insert into t1 (item_ID, book_ID, price) values (1,1,60);
insert into t1 (item_ID, book_ID, price) values (1,2,70);
insert into t1 (item_ID, book_ID, price) values (1,3,80);
insert into t1 (item_ID, book_ID, price) values (2,1,10);
insert into t1 (item_ID, book_ID, price) values (2,1,20);
insert into t1 (item_ID, book_ID, price) values (1,3,40);
create table t2
(item_ID NUMBER(2),
book_ID NUMBER(2),
price NUMBER(2))
/
insert into t2 (item_ID, book_ID, price) values (1,1,89);
insert into t2 (item_ID, book_ID, price) values (1,2,29);
insert into t2 (item_ID, book_ID, price) values (1,3,99);
insert into t2 (item_ID, book_ID, price) values (2,1,55);
insert into t2 (item_ID, book_ID, price) values (2,1,90);
insert into t2 (item_ID, book_ID, price) values (1,3,10);
as you can see, both tables have the same column names but different data. What I need is to calculate the SUM of price grouped by book_id across both tables, in a single SQL statement.
so my required output is this:
BOOK_ID SUM(PRICE)
-------- ----------
1 323
2 99
3 229
where the SUM(price) for each book_id is for all the books in both the tables where book_id = 1 and then 2 in the second row and then 3 in the third row.
Would you please help, as I am unable to write the query for this.?
Thanks,
Sonali.
February 11, 2005 - 7:01 pm UTC
select t1.book_id, t1.price + t2.price total_price
from ( select book_id, sum(price) price from t1 group by book_id ) t1,
     ( select book_id, sum(price) price from t2 group by book_id ) t2
where t1.book_id = t2.book_id;
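A union-based alternative (a sketch; unlike a join of the two per-table sums, it would also pick up a book_id present in only one of the two tables):

```sql
-- combine the rows first, then aggregate once
select book_id, sum(price) total_price
  from ( select book_id, price from t1
         union all
         select book_id, price from t2 )
 group by book_id
 order by book_id;
-- with the sample data: book 1 -> 323, book 2 -> 99, book 3 -> 229
```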
Date function
A reader, August 16, 2005 - 1:34 pm UTC
Tom, is having a fuction like this a good practice?
CREATE OR REPLACE function customer_sum_FUNC(date_IN IN DATE)
return number
is
v_sum number := 0 ;
begin
select count(*) into v_sum
from his_customer
where trunc(completed_date) = trunc(date_IN);
return(v_sum);
end ;
/
August 17, 2005 - 12:01 pm UTC
I doubt it.
why would you not just query that out with the data in the first place?
but at the very least,
where completed_date >= trunc(date_in) and completed_date < trunc(date_in)+1;
to let you at least use an index on completed_date.
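Folding that predicate back into the original function might look like this (a sketch; same table and column names as in the question):

```sql
create or replace function customer_sum_func( date_in in date )
return number
is
  v_cnt number := 0;
begin
  -- compare the raw column to a truncated range so that an
  -- index on completed_date can be used
  select count(*)
    into v_cnt
    from his_customer
   where completed_date >= trunc(date_in)
     and completed_date <  trunc(date_in) + 1;
  return v_cnt;
end;
/
```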
A reader, August 17, 2005 - 2:24 pm UTC
This will use use the index. Correct me if I am wrong Tom.
select count(*)
from his_customer
where completed_date between to_date( completed_date,'dd-mon-yyyy hh24:mi:ss' )
and to_date(:DATE_IN,'dd-mon-yyyy hh24:mi:ss' )
August 17, 2005 - 2:44 pm UTC
sorry, that doesn't even compute.
where completed_date between to_date( completed_date,'dd-mon-yyyy hh24:mi:ss' )
????????
you are comparing completed_date to completed_date using an IMPLICIT conversion of completed date to string and then explicitly back into a date?????
??
A reader, August 17, 2005 - 3:02 pm UTC
Was trying to use between...just another way to trying use the index.
August 17, 2005 - 5:11 pm UTC
but you betweened the column with ITSELF
where x between x and y
that would just be where x <= y
A reader, August 17, 2005 - 4:13 pm UTC
Tom,
Can you give us an example of using between,in this
case?
August 17, 2005 - 5:16 pm UTC
why? it would be much "harder".
where completed_date between trunc(date_in)
and completed < trunc(date_in)+1-1/24/60/60;
Correcting Tom's typo
A reader, August 17, 2005 - 6:36 pm UTC
It should be
where completed_date between trunc(date_in)
and trunc(date_in)+1-1/24/60/60;
instead of
where completed_date between trunc(date_in)
and completed < trunc(date_in)+1-1/24/60/60;
August 18, 2005 - 3:27 pm UTC
yup, correct, thanks!
Dynamic Condition
Mita, August 31, 2005 - 4:10 pm UTC
I have a reporting requirement which needs the query conditions to be defined dynamically.
section column value incl_excl
A id 1 I
A id 2 E
A catg A I
A catg B I
B type 3 I
C cntry ABC I
The SECTION column represents a section of the report, which should display data satisfying all include and exclude conditions. Each section can have a different number of include/exclude conditions. Data should meet all the include conditions for all the different columns (it would be like an IN list for include conditions on the same column).
what's the best way to do it ??
September 01, 2005 - 1:25 am UTC
not sure I understand.
More Info
Mita, September 01, 2005 - 11:34 am UTC
Let me explain you in simple terms.
I have a report which shows products under different header/sections based on the attributes of the product. we have to add new headers/sections every now and then based on new products so we decided to make this query metadata driven where we will store each section/header and attributes of the product which will fall under that.
what will be the best way to achieve that ?? One way would be generating whole SQLs dynamically based on metadata.
September 01, 2005 - 3:51 pm UTC
sorry, not quite simple enough.
metadata is too big a word maybe.
remember, I don't work with you, I haven't been looking at your problem with you for hours or days. I know nothing about it.
(and remember, metadata sounds really cool, it is good for documentation, but to run a real system - you might rethink that if scalability, performance, or ease of understanding is among your goals)
Aggregate Function - Multiplication
A reader, November 18, 2005 - 9:22 am UTC
Hi Tom,
Is there any aggregate function for multiplication which is the counterpart of the sum() function. Currently I am using a for loop to iterate over all the rows which is not elegant.
e.g.
select product(number) from mytable;
November 18, 2005 - 3:20 pm UTC
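Tom's reply is cut off in this excerpt. A commonly used workaround (an assumption on my part, not taken from this thread) is to turn the product into a sum of logarithms:

```sql
-- product of strictly positive values: exp(sum(ln(x)))
select exp(sum(ln(num))) product
  from mytable;
-- zeros and negatives need extra handling, e.g. short-circuit to 0
-- on any zero and count the negative values to fix the sign
```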