With first_rows_N, you're instructing the database to optimize the query as if it only fetches N rows. Regardless of how many it really gets. This will tend the optimizer to choose a nested loop over a hash or merge join, as this example shows:
alter session set optimizer_mode = first_rows_10;
set feed only
select /*+ gather_plan_statistics */* from hr.employees e
join hr.departments d
on d.department_id = e.department_id;
set feed on
select *
from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID gu93rxp7dc7uz, child number 0
-------------------------------------
select /*+ gather_plan_statistics */* from hr.employees e join
hr.departments d on d.department_id = e.department_id
Plan hash value: 1021246405
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 106 |00:00:00.01 | 25 |
| 1 | NESTED LOOPS | | 1 | 25 | 106 |00:00:00.01 | 25 |
| 2 | NESTED LOOPS | | 1 | 25 | 106 |00:00:00.01 | 14 |
| 3 | TABLE ACCESS FULL | DEPARTMENTS | 1 | 27 | 27 |00:00:00.01 | 8 |
|* 4 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 27 | 8 | 106 |00:00:00.01 | 6 |
| 5 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 106 | 8 | 106 |00:00:00.01 | 11 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
alter session set optimizer_mode = all_rows;
set feed only
select /*+ gather_plan_statistics */* from hr.employees e
join hr.departments d
on d.department_id = e.department_id;
set feed on
select *
from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID c5rucucazuum7, child number 1
-------------------------------------
select /*+ gather_plan_statistics */* from hr.employees e join
hr.departments d on d.department_id = e.department_id
Plan hash value: 1343509718
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 106 |00:00:00.01 | 13 | | | |
| 1 | MERGE JOIN | | 1 | 106 | 106 |00:00:00.01 | 13 | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPARTMENTS | 1 | 27 | 27 |00:00:00.01 | 6 | | | |
| 3 | INDEX FULL SCAN | DEPT_ID_PK | 1 | 27 | 27 |00:00:00.01 | 3 | | | |
|* 4 | SORT JOIN | | 27 | 107 | 106 |00:00:00.01 | 7 | 22528 | 22528 |20480 (0)|
| 5 | TABLE ACCESS FULL | EMPLOYEES | 1 | 107 | 107 |00:00:00.01 | 7 | | | |
---------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
filter("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
This is because a nested loop can pass each row up to the next stage in the plan as soon as it's processed. So the query can stop once it's joined N rows. Whereas a hash join must access all the rows in the driving table before accessing the other table.
But, if the join returns no rows, it still must process the whole driving table. As you've found if the data sets are "large" this can be significantly slower than other join methods.