Count(*)
Parminder Garcha, October 02, 2001 - 9:20 am UTC
The use of 'exists' was certainly useful and was something I had forgotten about, however after generating two SQL statements one using the 'Exists' and the other doing a straight forward count as in the original question I found :-
The explain plan output showed more 'consistent gets' with the use of 'Exists' meaning Oracle used more data blocks to process the query. So if the Query is used very frequently in whatever scenario does the 'Exists' mean less effiency ? Why else would it retrieve more 'consistent gets' ?
Thanks.
October 02, 2001 - 9:32 am UTC
does not seem probable to me. You would need to supply a test case such as I am following.
Here I create a table based on all objects. I then find the name of the "last object" and the name of the "first object" as they exist in this newly created table (a row from the first block that is scanned and one from the last block).
I then run both styles of queries. It shows that the "where exists" will stop scanning as soon as it finds a row -- so when we search for the first row -- it does SIGNIFICANTLY less work. When we look for the last row -- it does about the same amount of work (tiny bit more, it has the "overhead" of scanning DUAL as well).
On average -- the where exists will significantly outperform the count(*). You would need to provide a counter case (with explain plans and all) for us to look at.
Note that in a heavily modified database -- consistent gets will go up and down with respect to the amount of modifications due to consistent reads to the rollback segment. Make sure that is not affecting your tests!
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t as select * from all_objects;
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> column first_oname new_val f
ops$tkyte@ORA817DEV.US.ORACLE.COM> column last_oname new_val l
ops$tkyte@ORA817DEV.US.ORACLE.COM> select object_name first_oname from t where rownum = 1;
FIRST_ONAME
------------------------------
/1005bd30_LnkdConstant
ops$tkyte@ORA817DEV.US.ORACLE.COM> select object_name last_oname
2 from ( select object_name
3 from ( select object_name, rownum r from t )
4 order by r DESC
5 )
6 where rownum = 1
7 /
LAST_ONAME
------------------------------
B
ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace traceonly statistics
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select count(*)
2 from dual
3 where exists ( select NULL
4 from t
5 where object_name = '&f' )
6 /
old 5: where object_name = '&f' )
new 5: where object_name = '/1005bd30_LnkdConstant' )
Statistics
----------------------------------------------------------
0 recursive calls
16 db block gets
6 consistent gets
0 physical reads
0 redo size
367 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select count(*)
2 from t
3 where object_name = '&f'
4 /
old 3: where object_name = '&f'
new 3: where object_name = '/1005bd30_LnkdConstant'
Statistics
----------------------------------------------------------
0 recursive calls
15 db block gets
236 consistent gets
0 physical reads
0 redo size
367 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select count(*)
2 from dual
3 where exists ( select NULL
4 from t
5 where object_name = '&l' )
6 /
old 5: where object_name = '&l' )
new 5: where object_name = 'B' )
Statistics
----------------------------------------------------------
0 recursive calls
19 db block gets
237 consistent gets
0 physical reads
0 redo size
367 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select count(*)
2 from t
3 where object_name = '&l'
4 /
old 3: where object_name = '&l'
new 3: where object_name = 'B'
Statistics
----------------------------------------------------------
0 recursive calls
15 db block gets
236 consistent gets
0 physical reads
0 redo size
367 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace off
ops$tkyte@ORA817DEV.US.ORACLE.COM>
Count(*)
Parminder Garcha, October 02, 2001 - 10:42 am UTC
Yes I agree, apologies for my earlier inaccuracies, putting in another test similar to your example my results replicate yours.........conclusion 'where exists' is the speed king....well over count(*) anyway.
Count(*)
Parminder Garcha, October 02, 2001 - 10:51 am UTC
Yes I agree, apologies for my earlier inaccuracies, putting in another test similar to your example my results replicate yours.........conclusion 'where exists' is the speed king....well over count(*) anyway.
Sorry to tread old ground but....
A reader, September 24, 2002 - 3:50 am UTC
Tom, on the current thread, but i'm trying to work out given that I agree wholeheartedly that a) the count(*) from dual... is better than a straight check, why the following code is returning exactly the same times in execution?
SQL> desc big_table
Name Null? Type
----------------------------------------- -------- ----------------------------
A VARCHAR2(10)
SQL> select count(*) from big_table;
COUNT(*)
----------
100000
no indexes etc, and table not analyzed...
SQL> set timing on
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is select NULL from big_table where a = 'B';
3 l1 VARCHAR2(1);
4 begin
5 for i in 1..1000 loop
6 open c1;
7 fetch c1 into l1;
8 IF c1%FOUND THEN
9 null;
10 END IF;
11 close c1;
12 end loop;
13* end;
SQL> /
PL/SQL procedure successfully completed.
Elapsed: 00:00:33.04
SQL> declare
2 cursor c1 is select count(*)
3 from dual
4 WHERE EXISTS (SELECT NULL
5 FROM big_table
6 where a = 'B'
7 );
8 l1 NUMBER;
9 begin
10 for i in 1..1000 loop
11 open c1;
12 fetch c1 into l1;
13 IF c1%FOUND THEN
14 null;
15 END IF;
16 close c1;
17 end loop;
18 end;
19 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:33.05
It's probably something very straightforward, maybe not enough data / iterations to show any significant increase, but I expected the second code to be quicker than the first using you're previous comments.
Thanks (as always) for your response
September 24, 2002 - 7:44 am UTC
why? they do the SAME exact thing.
where exists -- stops after the first hit.
open
fetch
close -- stops after the first hit
what you want to time:
SQL> declare
2 cursor c1 is select count(*)
3 from dual
4 WHERE EXISTS (SELECT NULL
5 FROM big_table
6 where a = 'B'
7 );
8 l1 NUMBER;
9 begin
10 for i in 1..1000 loop
11 open c1;
12 fetch c1 into l1;
13 IF c1%FOUND THEN
14 null;
15 END IF;
16 close c1;
17 end loop;
18 end;
19 /
vs
SQL> declare
8 l1 NUMBER;
9 begin
10 for i in 1..1000 loop
select count(*) into l1
3 from dual
4 WHERE EXISTS (SELECT NULL
5 FROM big_table
6 where a = 'B'
7 );
17 end loop;
18 end;
19 /
Great info on consistent gets vs. rollback segments!
Robert, September 24, 2002 - 9:30 am UTC
Tom,
Great info!!
"...consistent gets [may] go up and down with respect to the amount of modifications due to consistent reads to the rollback segment."
This is an important fact to remember and valuable information for analyzing our query statistics.
Thanks,
Robert.
select count(*)
john feng, September 24, 2002 - 4:56 pm UTC
I appreciate this tip to detect if there is any record regardless the record count.
Tell me at least something....
Robert, December 19, 2002 - 10:20 am UTC
Tom,
Using EMP as example with one extra column - COMPANY - that has the SAME value "MSOFT"
EMPNO NUMBER(4)
ENAME VARCHAR2(10) Y
JOB VARCHAR2(9) Y
MGR NUMBER(4) Y
HIREDATE DATE Y
SAL NUMBER(7,2) Y
COMM NUMBER(7,2) Y
DEPTNO NUMBER(2) Y
COMPANY VARCHAR2(10) Y
How can I return this:
COMPANY COUNT(*)
---------- ----------
MSOFT 0
with a query like :
SQL> select max(company), count(*) from emp where job = 'ACTOR' ;
...when there is NO such job as "ACTOR" ?
Thanks
December 19, 2002 - 10:29 am UTC
It will not.
scott@ORA920> select max(ename), count(*) from emp where 1=0;
MAX(ENAME) COUNT(*)
---------- ----------
0
scott@ORA920> select max(ename), count(*) from emp where job = 'NOT_A_JOB';
MAX(ENAME) COUNT(*)
---------- ----------
0
scott@ORA920>
You would need to post a full example if you think otherwise...
Helena Markova, February 12, 2003 - 3:32 am UTC
rownum is better
A reader, December 11, 2003 - 7:42 am UTC
create table x as select * from dba_objects order by object_id;
select count(*) from x where object_id = 20000;
COUNT(*)
----------
1
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'X'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
380 consistent gets
240 physical reads
0 redo size
379 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
select count(*) from x where object_id = 20000 and rownum = 1;
COUNT(*)
----------
1
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 COUNT (STOPKEY)
3 2 TABLE ACCESS (FULL) OF 'X'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
267 consistent gets
156 physical reads
0 redo size
379 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
select count(*) from dual where
exists (select null from x where object_id = 20000)
COUNT(*)
----------
1
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'DUAL'
4 2 TABLE ACCESS (FULL) OF 'X'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
270 consistent gets
170 physical reads
0 redo size
379 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
rownum = 1 is better than exists
OK
Catherine, February 09, 2004 - 10:09 am UTC
Hi Tom,
Nice to meet you.
Regarding the Exists operator,Can it *refer* the same table
in the outer and inner query?This is the thing I want to
know and could you please provide a simple example?
Please do reply.
February 09, 2004 - 10:30 am UTC
sure, it matters not what 'tables' are referenced -- SQL doesn't care if you reference the same table or different tables -- they are all "different" tables to it.
here we get a list of all the objects owned by people that have at least one materialized view:
ops$tkyte@ORA920PC> select owner, object_name, object_type
2 from all_objects
3 where exists ( select null from all_objects a2
4 where a2.owner = all_objects.owner
5 and a2.object_type = 'MATERIALIZED VIEW' )
6 /
OWNER OBJECT_NAME OBJECT_TYPE
------------------------------ ------------------------------ ------------------
SH CAL_MONTH_SALES_MV TABLE
SH CAL_MONTH_SALES_MV MATERIALIZED VIEW
SH CHANNELS TABLE
SH CHAN_PK INDEX
SH COSTS TABLE PARTITION
SH COSTS TABLE PARTITION
SH COSTS TABLE PARTITION
Nice
Catherine, February 11, 2004 - 10:54 am UTC
Dear Tom,
Thanks for your reply.Could you please show some examples
involving the traditional EMP and DEPT Tables?I would be
very happy to see that.Please do reply.
Thanks.
February 11, 2004 - 12:34 pm UTC
programming assignment for you:
show me the list of emp who work in a department that has at least one sales person.
result set will be:
DEPTNO ENAME JOB DECODE(
---------- ---------- --------- -------
30 ALLEN SALESMAN <<<====
30 WARD SALESMAN <<<====
30 MARTIN SALESMAN <<<====
30 BLAKE MANAGER
30 TURNER SALESMAN <<<====
30 JAMES CLERK
6 rows selected.
as this is a variation on the
<quote>
here we get a list of all the objects owned by people that have at least one
materialized view:
</quote>
from above but PEOPLE will be deptno's and MATERIALIZED VIEW will be SALESMAN
Count(*) response time
Karthik, February 11, 2004 - 6:09 pm UTC
Tom
Thanks for your nice response. Appreciate your time on this forum.
Is there any best way to get good response time for
COUNT(*) against table, which has 50Million recs.
Using COUNT(primarykey_column)will give good response time?
Please suggest me any techique to get good response time for Select count(*) from table. I am using Oracle8.1.7
Thanks very much.
February 11, 2004 - 6:53 pm UTC
analyze the table and the count(*) will fast full scan any index on a NOT NULLable column -that'll be as fast as it can be, reads least amount of data.
About your programming assingnment
MV From India, February 12, 2004 - 7:00 am UTC
Hi Tom
I've tested u'r programming assignment with "Where Exists" as well as with "Where deptno in". but the second one is running faster than first one. Why?
(Testing purpose, i've inserted some records in emp table, since those records are deleted from the emp table by other users.)
SQL> select deptno, ename, job, decode(job, 'SALESMAN','<<<<====')
2 from emp a
3 where exists (select null from emp b where b.deptno = a.deptno
4* and b.job = 'SALESMAN')
SQL> /
DEPTNO ENAME JOB DECODE(J
---------- ---------- --------- --------
30 fff SALESMAN <<<<====
30 hhh
Elapsed: 00:00:04.10
SQL> select deptno, ename, job, decode(job, 'SALESMAN','<<<<====')
2 from emp a
3 where deptno in (select deptno from emp where job = 'SALESMAN');
DEPTNO ENAME JOB DECODE(J
---------- ---------- --------- --------
30 fff SALESMAN <<<<====
30 hhh
Elapsed: 00:00:01.60
Thanks in advance
MV from India
February 12, 2004 - 8:54 am UTC
you are probably using the brain dead RBO is why - -but it is hard to say since you don't share a tkprof with us and using timing in sqlplus isn't very reliable. tkprof is what you want (need) to use.
OK
Ben, February 15, 2004 - 8:35 am UTC
Dear sir,
I'm well and wish the same from you.I would like to transform the following query
sql>select deptno,ename,sal from emp where
sal = any(select max(sal) from emp group by deptno);
using *Exists * operator to get the same output?Could you
please help with the query?
Awaiting your reply
Bye!
February 15, 2004 - 11:50 am UTC
homework?
Sad
Ben, February 16, 2004 - 12:05 am UTC
Dear Sir,
Is that difficult to arrive at?I felt to be stupid at this
moment.Can you give a try?
Bye!
February 16, 2004 - 7:04 am UTC
it is easy, it is basic SQL. Give it a try -- show us what you've tried. Gotta master the fundementals and this is one of the more basic ones here.
Can this do?
Ben, February 17, 2004 - 7:10 am UTC
Dear Sir,
Thanks for your advice.Can the following query do?
sql> select deptno,ename,sal from emp e where
exists(select 'x' from emp where deptno = e.deptno
and /* How to coordinate the max(sal) with (outer) sal*/
I am confused with this and FYI My system crashed two
days back and I am in no position to test with statements.
Hope you understand my problem.
Bye!
February 17, 2004 - 9:55 am UTC
read about the HAVING clause
look at your first query:
select deptno,ename,sal from emp where
sal = any(select max(sal) from emp group by deptno);
the deptno isn't something you want to correlate on. you want to generate the set of max(sals) having the max(sal) the same as the current rows sal.....
there where deptno = e.deptno isn't part of your solution.
OK
Ben, February 18, 2004 - 12:22 am UTC
Dear sir,
Can this work?
sql>select deptno,ename,sal from emp e where
exists(select 'x' from emp group by deptno
having max(sal) = e.sal);
I think this can work and Is there any refinement for
the query?
Bye!
February 18, 2004 - 8:34 pm UTC
there you go, you got it.
scott@ORA10G> select deptno,ename,sal from emp where
sal = any(select max(sal) from emp group by deptno); 2
DEPTNO ENAME SAL
---------- ---------- ----------
10 KING 5000
20 SCOTT 7781.24
30 BLAKE 2850
scott@ORA10G> select deptno,ename,sal from emp e where
exists(select 'x' from emp group by deptno
having max(sal) = e.sal); 2 3
DEPTNO ENAME SAL
---------- ---------- ----------
30 BLAKE 2850
10 KING 5000
20 SCOTT 7781.24
the refinement is to go back to the = any or "in"
I am very confused about this
Nick, March 29, 2004 - 2:23 pm UTC
Hi Tom,
Can you clear this perky confusion.
when I run this query
select count(*),wrawardcat
from member group by wrawardcat
176 C
929 G
63 H
208 J
91 L
144 R
471418 <----total
However if I run this
when I run this query
select count(wrawardcat),wrawardcat
from member group by wrawardcat
176 C
929 G
63 H
208 J
91 L
144 R
0 <----without total
March 29, 2004 - 3:11 pm UTC
thats not a total.
You have nulls.
scott@ORA9IR2> select comm, count(*), count(comm), decode(comm,null, 'IS NULL', 'IS NOT NULL' )
2 from emp
3 group by comm;
COMM COUNT(*) COUNT(COMM) DECODE(COMM
---------- ---------- ----------- -----------
0 1 1 IS NOT NULL
300 1 1 IS NOT NULL
500 1 1 IS NOT NULL
1400 1 1 IS NOT NULL
10 0 IS NULL
you have 471,418 null wrawardcat's in there. count(*) counts NULLs, count(expression) counts NON-NULL occurences of expression....
Thanks!!!
A reader, March 29, 2004 - 3:58 pm UTC
count, exists and in
Marvin, August 26, 2004 - 5:37 am UTC
Hi
I have this query
select count(*)
from CONTEO_ACCIONES_DM
where id_accion in (select id_accion
from CONTEO_ACCIONES_ODS
where CR_BAJA_REG = 'S');
is this the same as
select count(*)
from CONTEO_ACCIONES_DM
where exists(select null
from CONTEO_ACCIONES_ODS
where CR_BAJA_REG = 'S');
if id_accion can have NULL can we still use IN?
Cheers
August 26, 2004 - 10:00 am UTC
in and exists are semantically interchangeable,
not in and not exists are not interchangeable in general.
however, the rewrite you did isn't done:
select count(*)
from conteo_acciones_dm t1
where exists ( select null
from conteo_acciones_ods t2
where cr_baja_reg = 'S'
AND t2.id_accion = t1.id_accion ) <<<=== very important.
oops I should have used NOT
Marvin, August 28, 2004 - 2:21 pm UTC
Hi again and sorry for the late follow up,
I should have used NOT IN which was my purpose
so
select count(*)
from CONTEO_ACCIONES_DM
where id_accion not in (select id_accion
from CONTEO_ACCIONES_ODS
where CR_BAJA_REG = 'S');
is this the same as
select count(*)
from CONTEO_ACCIONES_DM a
where not exists(select null
from CONTEO_ACCIONES_ODS b
where CR_BAJA_REG = 'S'
and a.id_Accion = b.id_accion);
If id_accion can have NULL can we still use NOT IN? I guess not since and a.id_Accion = b.id_accion will not be true for NULLs right?
thx
August 28, 2004 - 3:57 pm UTC
if id_accion is NULLABLE, these two queries are not equivalent.
More than X rows
Tri, February 14, 2005 - 2:54 pm UTC
How about selecting whether or not we have more than, say, 6 employees making over 5000 salary? This is slightly different than selecting whether we have only 1 such employee or not. I have tried a group by/having w/o any better performance (still scans the entire index instead of stopping after getting 6).
February 14, 2005 - 5:54 pm UTC
not following you -- how about a small example.
to see if we had 6 over 5000, I would:
select count(*)
from ( select 1
from t
where salary > 5000
and rownum <= 7 )
if answer is 7, then yes, you do.
More than X rows
Tri, February 14, 2005 - 3:46 pm UTC
Knowing that the limit is a small number, the best hack I can come up with so far is:
select count(*)
from dual
where (:limit-1) in (select rownum
from emp
where sal = 5000
and rownum < :limit)
Is there a better way?
February 14, 2005 - 6:01 pm UTC
see above, it is easier than that.
Using decode
Alex, February 15, 2005 - 11:57 am UTC
Tom,
I'm pretty sure this way to check the existence of records is worse, would you mind commenting on how bad and why:
SELECT decode (COUNT (*), 0, 'NO', 'YES')
FROM dual
WHERE EXISTS (
SELECT col1
FROM tab1
WHERE col2 = param1
AND col1 = param2
AND col3 = param3
AND col4 = param4);
I found it in code of ours, I'm looking to change it for the better. I'd test it myself if it wasn't part of a huge script that takes hours to run and the parameters are unknown to me.
February 15, 2005 - 3:39 pm UTC
my favorite way to do this:
<quote>
this space intentionally left blank
</quote>
but your query is basically what i coded above?
Great Tips
Dave Conrad, February 15, 2005 - 12:49 pm UTC
Great tips - really helped.
decode
Alex, February 15, 2005 - 4:04 pm UTC
I don't get what you wrote, are you implying what I posted is a waste of space?
I'm guessing then the decode doesn't effect performance much then. That wasn't what I was expecting to hear because awhile back, before I knew count(column) excluded NULLs, I replaced the statement above with one like this:
select (col1)
from tab1
where col2 = param1.....
And that made the script run much faster. So I guess that leads me to one more question; Are the above statement with the decode and this one equivalent IF col1 is not nullable?
February 16, 2005 - 7:36 am UTC
I was implying that logic that
a) checks for a row and then
b) processes
COULD instead
a) just process, if you find nothing to process, so be it. in between the time you a) check and b) process -- who is to say the row you found in a) still exists anyway?
apply decode once to a single row isn't going to materially affect the overall performance of an application.
select count(col1) could use a fast full index scan on col1 in order to count how many rows -- count(*) might have to do a full TABLE scan -- that would account for the differences in that one.
Very clear
Alex, February 16, 2005 - 9:23 am UTC
That was very helpful thank you. I knew I saw you express how you felt about checking, then processing before that's why I was surprised to see your first response at the top wasn't "don't do that, it's a waste of effort."
In my case, it's to check to see if the record already exists to avoid constraint errors (at least that's what I think the original coder's intentions where). How would you handle that? I was thinking there's probably a way to catch the exceptions and say "that's ok, go on to the next row". If I was writing the script from scratch I would just include the selection criteria in the insert.
February 16, 2005 - 9:30 am UTC
well, why not say "i expect there might be a constraint violated if i do this, so lets catch it and do what is appropriate if it happens (ignore it perhaps)"
for you see -- just because you checked at point A) in time, does not mean at all by the time you get to point B) that what you checked at point A) hasn't been changed!
Does stop key make a differnece?
A reader, March 14, 2005 - 4:39 pm UTC
Does useing a stopkey make any differnece?
SQL> set autotrace traceonly statistics
select count(*) from trade where trade_id > 1 and rownum <=1;
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
2 consistent gets
0 physical reads
0 redo size
189 bytes sent via SQL*Net to client
204 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
SELECT STATEMENT Optimizer Mode=CHOOSE 1 1
SORT AGGREGATE 1 3
COUNT STOPKEY
INDEX FAST FULL SCAN FXBLOTTER.PK_TRADE 1 K 3 K 1
SQL> select count(*)
2 from dual
3 where exists ( select NULL
4 from trade
5 where trade_id > 1 );
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
3 consistent gets
0 physical reads
0 redo size
189 bytes sent via SQL*Net to client
285 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
SELECT STATEMENT Optimizer Mode=CHOOSE 1 1
SORT AGGREGATE 1
FILTER
TABLE ACCESS FULL SYS.DUAL 82 1
INDEX RANGE SCAN FXBLOTTER.PK_TRADE 1 K 3 K 4
March 14, 2005 - 7:56 pm UTC
look above, we've discussed using rownum in the past.
Why is it so ?
culonbu, March 15, 2005 - 6:36 pm UTC
create table dataowner
(owner varchar2(12)
,a2 varchar2(32)
,t1 varchar2(32)
)
;
Insert into dataowner
(OWNER, A2, T1)
Values
('gia.dinh', 'ACFA1', '0002S');
Insert into dataowner
(OWNER, A2, T1)
Values
('culonbu', 'ACFA1', '0025S');
Insert into dataowner
(OWNER, A2, T1)
Values
('gia.dinh', 'ACFA1', '0005S');
Insert into dataowner
(OWNER, A2, T1)
Values
('gia.dinh', 'ACFA1', '0001S');
Insert into dataowner
(OWNER, A2, T1)
Values
('gia.dinh', 'ACFA1', '0006S');
Insert into dataowner
(OWNER, A2, T1)
Values
('gia.dinh', 'ACFA2', '0001S');
Insert into dataowner
(OWNER, A2, T1)
Values
('gia.dinh', 'ACFA2', '0002S');
Insert into dataowner
(OWNER, A2, T1)
Values
('rdw', 'ACFA1', '0021S');
Insert into dataowner
(OWNER, A2, T1)
Values
('culonbu12', 'ACFA1', '0020M');
Insert into dataowner
(OWNER, A2, T1)
Values
('culonbu1', 'ACFA1', '0020B');
Insert into dataowner
(OWNER, A2, T1)
Values
('culonbu1', 'ACFA1', '0020S');
Insert into dataowner
(OWNER, A2, T1)
Values
('rdw', 'ACFA1', '0020P');
Insert into dataowner
(OWNER, A2, T1)
Values
('culonbu', 'ACFA1', '0004S');
COMMIT;
SQL> SELECT * FROM dataowner
2 WHERE owner = 'culonbu';
OWNER A2 T1
------------ -------------------------------- --------------------------------
culonbu ACFA1 0025S
culonbu ACFA1 0004S
SQL> SELECT * FROM dataowner
2 WHERE EXISTS( SELECT NULL FROM dataowner
3 WHERE owner = 'culonbu')
4 ;
OWNER A2 T1
------------ -------------------------------- --------------------------------
gia.dinh ACFA1 0002S
culonbu ACFA1 0025S
gia.dinh ACFA1 0005S
gia.dinh ACFA1 0001S
gia.dinh ACFA1 0006S
gia.dinh ACFA2 0001S
gia.dinh ACFA2 0002S
rdw ACFA1 0021S
culonbu12 ACFA1 0020M
culonbu1 ACFA1 0020B
culonbu1 ACFA1 0020S
rdw ACFA1 0020P
culonbu ACFA1 0004S
13 rows selected.
Can you please explain why second SQL statement ( WHERE EXISTS) returns 13 records ?
Thanks
March 15, 2005 - 9:21 pm UTC
because
SQL> SELECT * FROM dataowner
2 WHERE EXISTS( SELECT NULL FROM dataowner
3 WHERE owner = 'culonbu')
4
is conceptually just like:
for x in ( select * from dataowner )
loop
select count(*) into l_cnt from dataowner where owner = 'culonbu';
if ( l_cnt > 0 )
then
OUTPUT X;
end if;
end loop;
so, if there is one record in dataowner where owner = 'culonbu', every record comes out in the result set.
Please disregard my last request
A reader, March 15, 2005 - 7:06 pm UTC
Sorry Tom.
I just have a mental block .
Does stop key make a differnece?
A reader, March 16, 2005 - 3:37 pm UTC
Quote: "look above, we've discussed using rownum in the past.
"
I reread the post, I do not see the specific answer. (If I am blind, forgive me.) The stats look better when I do a "select count(*)... where rownum <=1" and the code is easiery for me to read. You are saying I am not correct. My stats says it is faster. Can you help?
March 17, 2005 - 7:52 am UTC
ctl-f for
rownum is better
where did I say you are not correct?
No WHERE Clause
karthick pattabiraman, September 24, 2007 - 5:46 am UTC
T1 is a big table. I just want to see if there is any data in it. Which means there is no WHERE Clause. Following are the different things that i tried.
SQL> select count(1) from t1;
Elapsed: 00:00:05.07
Statistics
----------------------------------------------------------
0 recursive calls
42 db block gets
9589 consistent gets
7148 physical reads
0 redo size
371 bytes sent via SQL*Net to client
300 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
The above is the worst way to do it i believe :-)
SQL> select count(1) from t1 where rownum = 1;
Elapsed: 00:00:00.01
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
1 consistent gets
0 physical reads
0 redo size
368 bytes sent via SQL*Net to client
317 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
Above one is prity good..
SQL> select count(1) from dual where exists(select * from t1);
Elapsed: 00:00:00.01
Statistics
----------------------------------------------------------
0 recursive calls
8 db block gets
2 consistent gets
0 physical reads
0 redo size
368 bytes sent via SQL*Net to client
333 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
This is also ok..
SQL> select count(1) from dual where exists(select * from t1 where rownum = 1);
Elapsed: 00:00:00.01
Statistics
----------------------------------------------------------
0 recursive calls
8 db block gets
2 consistent gets
0 physical reads
0 redo size
368 bytes sent via SQL*Net to client
353 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
This is also ok..
But nothing is as good as 2nd sql.
My Question is so if i need to check a table for existance of records and i have no WHERE clause to apply can i just user rownum and check for first row.
But i know rownum is not a physical column and it will be populated only after generating the result set. But why the second SQL is better than the rest.
for your information
SQL> select count(1) from t1;
COUNT(1)
----------
3502560
September 26, 2007 - 1:35 pm UTC
here is the best way:
[this space intentionally left blank]
FACT: the millisecond after you look, there could be a row, so your approach of
if there is data to process
then
process
should just be:
process anything we find
I hate the act of "is there a row", seems so wasteful.
if you have to do it,
select count(*) from t where rownum = 1;
and pray for an index on a non-nullable column to make that fast.
some more light on this.
reader, January 30, 2008 - 1:43 pm UTC
Dear Tom,
good day to you as always, if you can please throw some more light on this.
As per your trick using "exists" to find if atleast one row is available will be better than having count(*) on the table with rownum=1, but in case one is aware that the table on which query is to be performed has a primary key (that mean unique and not nullable) will it be better to use count(*) with rownum=1 over "exists" or will exists still be better.
Now the situation where I have to know if atleast one row exists, with a given condition is as below.
Dept table (parent), Emp table (child) A department can't be deactivated (status marked deactive) till the time atleast one active employee exists belonging to the department.
In this situation empolyee table has primary key and column will be not nullable, so will it be better to use count(*) with rownum=1 or "exists" will still perform better.
Thanking you not just for this post, but for all your time and efforts you take to help people and share your knowledge.
Regards,
your fan.
January 30, 2008 - 2:09 pm UTC
I still think that this query is best:
[This space left intentionally blank]
As i get older, I see things more clearly. I fail to see why we have logic like:
see if row exists
if row exists
then
process
and not just:
process (and if a row didn't exist, process returns right away)
so, show me the logic in full that demands you KNOW a record exists (because as I am sure you are fully aware, that the instant you ask the question and get an answer, the answer could change - you might ask "is there a row" and be told "yes" - but when you go back to process it - it isn't there anymore. You might ask "is there a row" and be told no, but if you tried to process this non-existent row, it would be there)......
so, why even bother LOOKING, just try to process the darn data and if there isn't any, return. period.
the fastest way to do something is to NOT do it at all
example on situation that requires to check if row exists
reader, January 30, 2008 - 2:49 pm UTC
Dear Tom,
While I totally agree with you about the approach which should be adopted, I am giving it a shot to expalin you the situation/example where it requires for existence of row to be checked.
Same set of tables Dept (parent table), Emp (child table), A dept can only be deactivated if there are no active employees in the department. Two ways to do it, first deactivate all the employees of the department one by one and then deactivate the department, second in deactivate of department, deactivate all the active employees then deactivate the department itself. Here the deactivation request can be raised but unless it is approved by the approver the records are not marked deactivaed, i.e. the status will not change to 'Deactive'.
User u1 raises a deactive request (req1) on the department say dept no 40 after checking that no active employee exists for dept no 40, before his request gets approved another user u2 raises request(req2) for activation of employee e1 of department no 40 and his requests (req2) gets approved before deactivation request(req1), now when the approver is going to approve the request(req1) of user u1 there is an active employee(e1) so the request should not be approved, i.e. the system should say active employee exists.
In this situation I need to find out if atleasts one active employee exists to give the approve a message stating active employee exists, request(req1) should be rejected.
Taking this into consideration which query will comparitively perform better, the one which is using "where exists" or the one with "where rownum=1" on a table that has atleast one not nullable column and has index on the column.
I hope my example is making sense to you,and you will be able to guide me, I am as of now using "where exists" but after your last post on this got a little confussed. Can we have a demarcation as below
1. if one is sure that the table is having atleast one not nullable column with index he/she can use
select count(*) from xyz where rownum=1;
2. if one is not sure of a not nullable column with index it's better to use
select count(*) from dual where exists(select null from xyz where col1='abc');
Once again thanks for your time on this.
Regards,
your fan.
January 30, 2008 - 4:15 pm UTC
I read this far:
I am
giving it a shot to expalin you the situation/example where it requires for
existence of row to be checked.
and I can already say "you must not have read my response..."
...
(because as I am sure you are fully aware, that the instant you ask the question and get an answer, the answer could change - you might ask "is there a row" and be told "yes" - but when you go back to process it - it isn't there anymore. You might ask "is there a row" and be told no, but if you tried to process this non-existent row, it would be there)......
....
you wrote:
...Same set of tables Dept (parent table), Emp (child table), A dept can only be
deactivated if there are no active employees in the department. ...
do you realize that in Oracle, you cannot do this using APPLICATION LOGIC unless you employee a LOCK TABLE command (did you know that such a thing exists?) You have to serialize at a very high level. You see - reads are not blocked by writes and vice versa. If you think you can enforce integrity
a) ACROSS TABLES (your example)
b) over rows within a table
without massive locking and serialization - you are very wrong.
Your rule:
... A dept can only be
deactivated if there are no active employees in the department. ...
is implemented very easily:
delete from dept where deptno = 10 returing <columns> into <host variables>
insert into inactive_dept values ( <host variables> );
and you have DECLARITIVE integrity on DEPT so that you cannot delete a dept with emp records
If you believe you can enforce integrity across tables or across rows using application logic
WITHOUT LOCKING TABLES OR LARGE, VERY LARGE, RANGES OF ROWS
you are very wrong.
so, your example here, totally wrong, not a good case to look for a row....
because in a highly concurrent database like Oracle, you cannot see changes made by other transactions - NOR ARE YOU BLOCKED BY THEM.
hence, counting rows is an UTTER WASTE OF TIME, because by the time you count them, the answer may well have CHANGED
(please say - I see..... I got it)
things I (dis)like the most
A reader, February 04, 2008 - 3:35 pm UTC
a) comments that count(1) is better than count(*)
b) frequent index rebuilds help performance
c) raising no error with "when others" expception
d) ppl typing short forms esp. u
anything else to be added to this list ?
February 04, 2008 - 5:06 pm UTC
e) triggers
f) autonomous transactions
g) plsql being able to commit and rollback
of course.
A reader, February 05, 2008 - 9:41 am UTC
What is the reason for disliking (d) and (e) ?
A reader, February 05, 2008 - 9:42 am UTC
sorry
I meant - (f) and (g)
February 05, 2008 - 10:24 am UTC
see above, same reasons as triggers...
A reader, February 05, 2008 - 4:50 pm UTC
thanks, Tom
An alternative to querying dual
Sameer Kumar, August 30, 2012 - 3:02 am UTC
Hi Tom,
Nice post and the followup reply on reviews are even better.
For sometime I have been trying a few alternatives for checking for existence of records.
I found the below statement better (even better than using EXISTS and dual).
select 1 from (select 1 rn from test <where clause> order by 1) tmp where rownum =1;
The cost was 2 where I did not use any where clause.
The cost was 4 with
select 1 from dual where exists (select 1 from test);
The cost on a two int column table with 1048K records
September 10, 2012 - 6:02 pm UTC
cost smost.
what about real world performance?
if you just want to find out if a row exists
select 1 from test where rownum=1;
why even bother with dual in what you are doing????
EXISTS and ROWNUM
A reader, January 09, 2013 - 3:23 am UTC
<code>
select *
from t1
where exists (select *
from t2
where t2.id = t1.id
)
You say that EXISTS will stop after it finds one row, so I guess adding the filter "and ROWNUM = 1" inside the EXISTS query is superfluous ?
The only difference I notice in the EXPLAIN PLAN was an additional "COUNT STOPKEY" step.
Is it "cleaner" to code in the ROWNUM clause?</code>
January 14, 2013 - 12:48 pm UTC
... so I guess adding the filter "and ROWNUM = 1" inside the EXISTS query is superfluous ? ...
yes. and potentially dangerous from a performance perspective since the count stopkey in the plan will reduce the possible alternative plans we can generate.
see how the semantics of the "rownum=1" stuff precluded the rewrite of the query as a join:
ops$tkyte%ORA11GR2> create table t1 ( x int, y int );
Table created.
ops$tkyte%ORA11GR2> create table t2 ( x int, y int );
Table created.
ops$tkyte%ORA11GR2> create index t1_idx on t1(x);
Index created.
ops$tkyte%ORA11GR2> create index t2_idx on t2(x);
Index created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T1', numrows => 1000000, numblks => 100000 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T2', numrows => 100 , numblks => 1 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t1 where exists ( select null from t2 where t1.x = t2.x );
Execution Plan
----------------------------------------------------------
Plan hash value: 2334785590
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 37M| 1 (100)| 00:00:01 |
| 1 | MERGE JOIN SEMI | | 1000K| 37M| 1 (100)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1000K| 24M| 0 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | T1_IDX | 1 | | 0 (0)| 00:00:01 |
|* 4 | SORT UNIQUE | | 100 | 1300 | 1 (100)| 00:00:01 |
| 5 | INDEX FULL SCAN | T2_IDX | 100 | 1300 | 0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."X"="T2"."X")
filter("T1"."X"="T2"."X")
ops$tkyte%ORA11GR2> select * from t1 where exists ( select null from t2 where t1.x = t2.x and rownum = 1);
Execution Plan
----------------------------------------------------------
Plan hash value: 1536002388
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 32 | 832 | 882K (1)| 02:56:25 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T1 | 1000K| 24M| 27123 (1)| 00:05:26 |
|* 3 | COUNT STOPKEY | | | | | |
|* 4 | INDEX RANGE SCAN| T2_IDX | 1 | 13 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT 0 FROM "T2" "T2" WHERE ROWNUM=1 AND
"T2"."X"=:B1))
3 - filter(ROWNUM=1)
4 - access("T2"."X"=:B1)
ops$tkyte%ORA11GR2> select * from t2 where exists ( select null from t1 where t1.x = t2.x );
Execution Plan
----------------------------------------------------------
Plan hash value: 2690981463
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 3900 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 100 | 3900 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T2 | 100 | 2600 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T1_IDX | 1000K| 12M| 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T1"."X"="T2"."X")
ops$tkyte%ORA11GR2> select * from t2 where exists ( select null from t1 where t1.x = t2.x and rownum = 1);
Execution Plan
----------------------------------------------------------
Plan hash value: 2142150987
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 32 | 832 | 5 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T2 | 100 | 2600 | 2 (0)| 00:00:01 |
|* 3 | COUNT STOPKEY | | | | | |
|* 4 | INDEX RANGE SCAN| T1_IDX | 1 | 13 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT 0 FROM "T1" "T1" WHERE ROWNUM=1 AND
"T1"."X"=:B1))
3 - filter(ROWNUM=1)
4 - access("T1"."X"=:B1)
ops$tkyte%ORA11GR2> set autotrace off
EXISTS and ROWNUM
A reader, January 15, 2013 - 6:42 am UTC
That was really good as I thought it was "cleaner" code, but you demonstrated it may actually have a negative impact
do something
A reader, June 27, 2013 - 8:12 am UTC
You wrote
so, show me the logic in full that demands you KNOW a record exists (because as I am sure you are fully aware, that the instant you ask the question and get an answer, the answer could change - you might ask "is there a row" and be told "yes" - but when you go back to process it - it isn't there anymore. You might ask "is there a row" and be told no, but if you tried to process this non-existent row, it would be there)......
We have code which test if a database ROLE has been assigned to a user (exists on DBA_ROLE_PRIVS) and if it hasn't then we GRANT the ROLE to the user. In most cases, when this code is run, the ROLE has been granted already.
So I thought I'd try out what you said, removed the check if record exists and just GRANT the ROLE.
What I found was that it was 50% slower by NOT doing the existence check. I guess GRANT ROLE on the fly is an "expensive" operation and hence doing the check was worthwhile is this case.
July 01, 2013 - 9:06 pm UTC
in this case - since you don't expect anyone else to be doing a grant I presume - doing the check would cost less than running the grant role code, yes.
but then again, since this would be a one time operation, and you wouldn't be doing this hundreds or thousands of times in a set (seems safe to assume), a 50% slower bit of code that runs faster than we blink doesn't seem to be an issue either.