Hi,
I have a query that fetches data from a remote database and this data is not what i was expecting (less data or no data). I did try to understand what is causing the issue and noticed the query was being transformed while run on the remote db.
Based on what i see, either this is a bug or my misunderstanding of left outer joins.
So, i will provide my question on left outer join and as well as my observation on the query transformation.
Understanding on left outer join:
The filter/predicate on the driving table in the outer joins will be applied only while doing the join. That will not be applied on the driving table as a whole.
Example:
WITH rawdata AS
(SELECT rownum nbr FROM dual CONNECT BY LEVEL < 11)
SELECT f.nbr f_nbr,s.nbr s_nbr
FROM rawdata f
LEFT OUTER JOIN rawdata s
ON (MOD(f.nbr, 2) = 0 AND s.nbr = f.nbr)
ORDER BY 1 ASC;
F_NBR S_NBR
1
2 2
3
4 4
5
6 6
7
8 8
9
10 10
Here the condition MOD(f.nbr, 2) = 0 is applied only while joining the "rawdata s" and hence we are able to get all the 10 rows from the first table.
I think my understanding here is good. Let me know if i am wrong.
Now coming to the problem i encountered:
The below question is based on my understanding above.
I have a query that queries data from the remote database. The results i get are not as expected (less rows or no rows).
Main DB (Where the sql is initiated): 12.1.0.2.0
Remote DB : 10.2.0.3.0
Query from Main db:
SELECT rh.code AS cd
,rh.effecdate AS effdate
FROM test_tab1@test_dbl rh
LEFT JOIN test_tab2@test_dbl price
ON rh.code = price.code
AND rh.re_date = price.re_date
AND rh.re_do = 'Y'
WHERE rh.e_type IN ('CD', 'SS','AB')
AND to_char(rh.effecdate, 'YYYY-MM-DD') =
to_char(sysdate, 'YYYY-MM-DD')
AND nvl(rh.d_type, 'XX') != 'DD'
AND rh.date_delet IS NULL
ORDER BY effdate, cd;
Query from remote db: (This i got from v$sql)
SELECT "A2"."CODE", "A2"."EFFECDATE"
FROM "TEST_TAB1" "A2", "TEST_TAB2" "A1"
WHERE ("A2"."E_TYPE" = 'AB' OR "A2"."E_TYPE" = 'CD' OR "A2"."E_TYPE" = 'SS')
AND to_char("A2"."EFFECDATE", 'YYYY-MM-DD') = to_char(SYSDATE@ !, 'YYYY-MM-DD')
AND nvl("A2"."D_TYPE", 'XX') <> 'DD'
AND "A2"."DATE_DELET" IS NULL
AND "A2"."RE_DATE" = "A1"."RE_DATE"(+)
AND "A2"."CODE" = "A1"."CODE"(+)
AND "A2"."RE_DO" = CASE
WHEN ("A1"."CODE"(+) IS NOT NULL) THEN
'Y'
ELSE
'Y'
END
ORDER BY "A2"."EFFECDATE", "A2"."CODE"
The problem i see here is with the case statement. I think it should have been as below.
AND "A2"."RE_DO" = CASE
WHEN ("A1"."CODE"(+) IS NOT NULL) THEN
'Y'
ELSE
"A2"."RE_DO"
END
Is this a bug?
Predicates specified on the left table in an ON clause of an ANSI JOIN are evaluated as part of the join. You can actually see this in note section underneath the execution plan, as these predicates appear as an ACCESS predicate rather than a FILTER predicate.
Let’s take your original example:
WITH rawdata AS
(SELECT rownum nbr FROM dual CONNECT BY LEVEL < 11)
SELECT f.nbr f_nbr,s.nbr s_nbr
FROM rawdata f LEFTOUTER JOIN rawdata s
ON (MOD(f.nbr, 2) = 0 AND s.nbr = f.nbr)
ORDER BY 1 ASC;
F_NBR S_NBR
---------- ----------
1
2 2
3
4 4
5
6 6
7
8 8
9
10 10
10 rows selected.
SQL> select * from table(dbms_xplan.display_cursor());
-----------------------------------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TEMP TABLE TRANSFORMATION | |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)|SYS_TEMP_0FD9D669A_57D102 |
| 3 | COUNT | |
| 4 | CONNECT BY WITHOUT FILTERING | |
| 5 | FAST DUAL | |
| 6 | SORT ORDER BY | |
|* 7 | HASH JOIN OUTER | |
| 8 | VIEW | |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D669A_57D102|
| 10 | VIEW | |
| 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D669A_57D102|
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - access("S"."NBR"="F"."NBR" AND MOD("F"."NBR",2)=CASE
WHEN ("S"."NBR" IS NOT NULL) THEN 0 ELSE 0 END )
If we look at the predicate information under the plan, we see an access predicate for the LEFT OUTER JOIN that joins S and F on NBR and also the conversion of the MOD(f.nbr,2)=0 predicate to a CASE statement, where both branches of the CASE have 0 as the value.
In Oracle, ANSI left outer joins are internally expressed in terms of Oracle’s left outer join syntax either directly or via a left outer joined lateral views.
We convert any single-table filter predicate on the left table which appears in the ON clause of a left outer join into a pseudo-outer-join predicate using CASE. Although it is a filter predicate, it appears in the ON clause of left outer join; and hence, it must be treated like an outer-join predicate. For outer-joins, it matters if a predicate is applied before the outer-join on the left table or with the outer-join evaluation, since left outer joins return all the rows of the left table irrespective of whether join predicates evaluates to TRUE or FALSE.
This is the reason we convert this type of filter predicate into pseudo-outer-join predicates, thereby forcing the filter/pseudo-outer-join predicate to be evaluated with other outer join predicates.
So, your example query will actually be rewritten as
WITH rawdata AS
(SELECT rownum nbr FROM dual CONNECT BY LEVEL < 11)
SELECT f.nbr f_nbr,s.nbr s_nbr
FROM rawdata f, rawdata s
WHERE f.nbr = s.nbr (+)
AND MOD(f.nbr,2)= (CASE WHEN (s.nbr(+) IS NOT NULL)
THEN 0
ELSE 0);
If the filter predicate MOD(f.nbr,2)=0 was applied before the left outer join, it would produce only 5 rows, but the left outer join must return all the rows from the left table hence the CASE statement.
Let’s now move on to your actual workload query.
When you execute a left outer join over a DB Link the transformation will happen on the database where the query is executed, and the transformed query will be sent over the DB Link, which is why you see the Oracle outer join syntax on the remote database. In your case the query is executed on the 12.1.0.2 database. In Oracle Database 12.1.0.2 we re-architected the ANSI JOIN optimization to support multiple outer join and improve performance.
The transformed SQL statement you saw, doesn’t look unexpected but that doesn’t mean you haven't encountered a bug. Have you tried running the original SQL statement directly on the remote 10g system to see if the result is different?