>So given that millions of record on both the table,
> does a scalar query scale up?
>Scalar subquery caching are good for response time
> and not for through put.
Of course, you're right that often hash join is much better than SSC, but from my experience:
1) The number of transactions is much more than number of merchants;
2) Usually just 5% of merchants generates 95% of transactions
PS. btw, your example is not quite relevant to the question:
1) your query is a best subject for groupby placement;
2) on versions> 12.1 your subquery from second query could be unnested and group-by placement can move it into group-by just by second table,
so CBO can rewrite your query as the first one and both of them will have the same plan.
Check it:
SQL> explain plan for
2 select t1.object_id, ( select max(t2.object_id)
3 from t2
4 where t2.object_id = t1.object_id ) max_id
5 from t1
6 ;
Explained.
SQL> @xplan
P_FORMAT
----------------
typical
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
Plan hash value: 3931102600
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 75299 | 2279K| | 745 (1)| 00:00:01 |
|* 1 | HASH JOIN OUTER | | 75299 | 2279K| 1256K| 745 (1)| 00:00:01 |
| 2 | INDEX FAST FULL SCAN| T1_PK | 75299 | 367K| | 44 (0)| 00:00:01 |
| 3 | VIEW | VW_SSQ_1 | 76296 | 1937K| | 502 (1)| 00:00:01 |
| 4 | HASH GROUP BY | | 76296 | 372K| 2672K| 502 (1)| 00:00:01 |
| 5 | INDEX FULL SCAN | T2_IDX | 225K| 1103K| | 502 (1)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ITEM_1"(+)="T1"."OBJECT_ID")
As you see, group-by was done just on T2 table and after that it was outer-joined to T1.
You can also check trace 10053 to see how it looks after transformations:
set echo on;
alter session set MAX_DUMP_FILE_SIZE = unlimited;
alter session set tracefile_identifier='grby';
alter session set events '10053 trace name context forever, level 1';
select t1.object_id, ( select max(t2.object_id) from t2 where t2.object_id = t1.object_id ) max_id from t1;
set echo off;
disc;
--after that from the trace:
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "T1"."OBJECT_ID" "OBJECT_ID","VW_SSQ_1"."MAX(T2.OBJECT_ID)" "MAX_ID"
FROM (SELECT MAX("T2"."OBJECT_ID") "MAX(T2.OBJECT_ID)","T2"."OBJECT_ID" "ITEM_1"
FROM "XTENDER"."T2" "T2" GROUP BY "T2"."OBJECT_ID") "VW_SSQ_1","XTENDER"."T1" "T1"
WHERE "VW_SSQ_1"."ITEM_1"(+)="T1"."OBJECT_ID"
The same transformed query you would get with first your query.