Tom,
Thanks again for all the support you give to the Oracle community.
I have a somewhat simple issue with the CBO choosing a nested loops join over a hash join. This whole issue came about while we were implementing rls, but I am able to reproduce it in a much smaller, simpler fashion. Here goes:
create cluster hash_cluster
(hash_key varchar2(20))
hashkeys 1000
size 8192
/
create table t1 (
username varchar2(20) not null,
countrycode varchar2(7)
)
cluster hash_cluster(username)
/
insert into t1 select 'TOM', to_char(level * 2) from dual connect by level <= 10;
create table t2 (
countrycode varchar2(7),
attempts number
)
/
insert /*+ append */ into t2
select to_char(mod(level, 150)),
mod(level, 5000)
from dual
connect by level <= 500000
/
exec dbms_stats.gather_table_stats(user, 'T1');
exec dbms_stats.gather_table_stats(user, 'T2');
set autotrace traceonly
select sum(attempts)
from t2
where countrycode in (select countrycode from t1 where username = 'TOM')
/
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 165 (8)| 00:00:02 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | NESTED LOOPS SEMI | | 32791 | 416K| 165 (8)| 00:00:02 |
| 3 | TABLE ACCESS FULL| T2 | 495K| 3384K| 165 (8)| 00:00:02 |
|* 4 | TABLE ACCESS HASH| T1 | 1 | 6 | | |
----------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
64315 consistent gets
Now, if I hint the correlated subquery, we get what we expect:
where countrycode in (select /*+ USE_HASH(t) */ countrycode from t1 t where username = 'TOM')
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 173 (13)| 00:00:03 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | HASH JOIN RIGHT SEMI| | 32791 | 416K| 173 (13)| 00:00:03 |
|* 3 | TABLE ACCESS HASH | T1 | 10 | 60 | | |
| 4 | TABLE ACCESS FULL | T2 | 495K| 3384K| 165 (8)| 00:00:02 |
------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
856 consistent gets
Is it that I am gathering stats wrong on the cluster? If we replace the cluster table with a normal heap table, we get the expected results without the hint:
create table t as select * from t1;
exec dbms_stats.gather_table_stats(user, 'T');
select sum(attempts)
from t2
where countrycode in (select countrycode from t where username = 'TOM')
/
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
856 consistent gets
Thanks,
Chris
It is missing information on the hash cluster itself.set
ops$tkyte%ORA10GR2> set autotrace traceonly
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select sum(attempts)
2 from t2
3 where countrycode in (select countrycode from t1 where username = 'TOM')
4 /
Execution Plan
----------------------------------------------------------
Plan hash value: 3679504953
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 299 (11)| 00:00:02 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | NESTED LOOPS SEMI | | 33155 | 420K| 299 (11)| 00:00:02 |
| 3 | TABLE ACCESS FULL| T2 | 500K| 3422K| 299 (11)| 00:00:02 |
|* 4 | TABLE ACCESS HASH| T1 | 1 | 6 | | |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("USERNAME"='TOM')
filter("COUNTRYCODE"="COUNTRYCODE")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
21025 consistent gets
0 physical reads
0 redo size
419 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte%ORA10GR2> analyze cluster hash_cluster compute statistics;
Cluster analyzed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select sum(attempts)
2 from t2
3 where countrycode in (select countrycode from t1 where username = 'TOM')
4 /
Execution Plan
----------------------------------------------------------
Plan hash value: 3556062935
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 317 (15)| 00:00:02 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
|* 2 | HASH JOIN RIGHT SEMI| | 33155 | 388K| 317 (15)| 00:00:02 |
|* 3 | TABLE ACCESS HASH | T1 | 10 | 50 | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T2 | 500K| 3422K| 299 (11)| 00:00:02 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COUNTRYCODE"="COUNTRYCODE")
3 - access("USERNAME"='TOM')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
882 consistent gets
0 physical reads
0 redo size
419 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed