I am trying to join (outer join) a number of IOT's all with the same key in Parallel Mode; and am expecting the optimizer to use a Merge Join in order to avoid sorts (as the documentation clearly says it should); but it does not. - I'm tracing the problem and found that the IOT's which have been created with default Degree of Parallelism, the IOT Indexes themselves are only Single, i.e. No Parallelism. - Using the NO_PARALLEL hint is interesting, but it still avoids Merge Joins. I get the same results with 12.2.0.1.0 - My question are:-
1.) How can I ensure that the IOT Primary Indexes are in Parallel Mode?
2.) Why does the CBO still avoid the Merge Join, when the sources are pre-sorted?
Thanks, Elwyn
CREATE TABLE TEST1 (
MyKey INTEGER,
MyData VARCHAR2(20),
CONSTRAINT X1 PRIMARY KEY (MyKey)
) ORGANIZATION INDEX PARALLEL NOLOGGING;
/
CREATE TABLE TEST2 (
MyKey INTEGER,
MyData VARCHAR2(20),
CONSTRAINT X2 PRIMARY KEY (MyKey)
) ORGANIZATION INDEX PARALLEL NOLOGGING;
/
select INDEX_NAME, DEGREE, INSTANCES from USER_INDEXES where INDEX_NAME like 'X%';
/
select /*+ NO_PARALLEL USE_MERGE(t1 t2) */ *
from TEST1 t1, TEST2 t2
where t1.MyKey=t2.MyKey(+);
/
I feel like you're approaching this backwards. It's not the case that joining two IOTs guarantees a merge join. More that reading data from an index means the database doesn't have to sort.
As the docs say:
Because of sorts required by other operations, the optimizer finds it cheaper to use a sort merge.
If an index exists, then the database can avoid sorting the first data set. However, the database always sorts the second data set, regardless of indexes. https://docs.oracle.com/en/database/oracle/oracle-database/18/tgsql/joins.html#GUID-3F935776-FE28-4350-9FA4-E6B47489156E Given your example, the database chooses a nested loop because it has a lower cost:
CREATE TABLE TEST1 (
MyKey INTEGER,
MyData VARCHAR2(20),
CONSTRAINT X1 PRIMARY KEY (MyKey)
) ORGANIZATION INDEX PARALLEL NOLOGGING;
CREATE TABLE TEST2 (
MyKey INTEGER,
MyData VARCHAR2(20),
CONSTRAINT X2 PRIMARY KEY (MyKey)
) ORGANIZATION INDEX PARALLEL NOLOGGING;
set serveroutput off
select /*+ USE_MERGE(t1 t2) */ *
from TEST1 t1, TEST2 t2
where t1.MyKey=t2.MyKey(+);
select *
from table(dbms_xplan.display_cursor(null, null, 'BASIC +COST LAST'));
PLAN_TABLE_OUTPUT
EXPLAINED SQL STATEMENT:
------------------------
select /*+ USE_MERGE(t1 t2) */ * from TEST1 t1, TEST2 t2 where
t1.MyKey=t2.MyKey(+)
Plan hash value: 786864782
----------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
----------------------------------------------------
| 0 | SELECT STATEMENT | | 5 (100)|
| 1 | MERGE JOIN OUTER | | 5 (20)|
| 2 | SORT JOIN | | 2 (0)|
| 3 | INDEX FULL SCAN | X1 | 2 (0)|
| 4 | SORT JOIN | | 3 (34)|
| 5 | INDEX FAST FULL SCAN| X2 | 2 (0)|
----------------------------------------------------
select /*+ NO_PARALLEL */ *
from TEST1 t1, TEST2 t2
where t1.MyKey=t2.MyKey(+);
select *
from table(dbms_xplan.display_cursor(null, null, 'BASIC +COST LAST'));
PLAN_TABLE_OUTPUT
EXPLAINED SQL STATEMENT:
------------------------
select /*+ NO_PARALLEL */ * from TEST1 t1, TEST2 t2 where
t1.MyKey=t2.MyKey(+)
Plan hash value: 2370096979
---------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
---------------------------------------------------
| 0 | SELECT STATEMENT | | 2 (100)|
| 1 | NESTED LOOPS OUTER | | 2 (0)|
| 2 | INDEX FAST FULL SCAN| X1 | 2 (0)|
| 3 | INDEX UNIQUE SCAN | X2 | 0 (0)|
---------------------------------------------------
So the real question is:
Why do you
need a merge join? What's the problem with the nested loops/hash join you currently have? And how does a merge join "fix" this?
On the parallel point: if you want to ensure the query runs in parallel, add the parallel hint. If you've enabled Automatic Degree of Parallelism (Auto DOP) the database will determine whether or not to use parallel execution.
Read more about this at:
https://blogs.oracle.com/datawarehousing/what-is-auto-dop