Skip to Main Content
  • Questions
  • Difference between Correlated and Non-Correlated Subqueries

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Santhosh.

Asked: March 20, 2017 - 6:41 am UTC

Last updated: March 29, 2017 - 11:14 am UTC

Version: 11G

Viewed 10K+ times! This question is

You Asked

Hi,

In Many Website and Question answer communities like Quora etc i read about difference between Non and Co-related Sub queries, the basic difference is Co-relate execute outer query first then sub query
Example

select * from departments dep 
where employee_id in (select employee_id from employees emp where emp.employee_id = dep.employee_id)


but non co-related query executes inner query first then outer, for example

select * from departments 
where employee_id in (select employee_id from employees);


So my question are
1.how does Oracle have intelligence to judge whether the query is a Co-related or Non Co-realted?
2.Does it Follow the Same select statement Execution process for all types of Sub-queries like From cluse excution then Where clause execution then Group by,Having, Order By?

and Chris said...

Correlated vs. non-correlated isn't to do with how Oracle Database processes the SQL. It's just whether or not you include a column from the outer query in the subquery.

A subquery is correlated if you have a column from one or more parent tables in the subquery. So:

select * from departments dep 
where employee_id in (select employee_id from employees emp where emp.employee_id = dep.employee_id)


is correlated because dep.employee is in the nested query. Whereas:

select * from departments 
where employee_id in (select employee_id from employees);


is uncorrelated because there is no reference to departments in the subquery.

Logically the database executes the outer query then finds the matching rows in the inner query. But in practice the optimizer can rewrite it to process them in a different way.

In general the optimizer attempts to access the table with the smallest result set first. For example, below I create two tables. One with 10 rows, one with 1,000. It doesn't matter which table is the parent and which is the subquery. The optimizer chooses to access the smaller table first:

create table t1 as
  select rownum x, 'xxxx' y from dual connect by level <= 1000;

create table t2 as
  select rownum x, 'yyyy' y from dual connect by level <= 10;
 
exec dbms_stats.gather_table_stats(user, 't1');
exec dbms_stats.gather_table_stats(user, 't2');

set serveroutput off
alter session set statistics_level = all;

select * from t1
where  t1.x in (select t2.x from t2);

select * 
from   table(dbms_xplan.display_cursor(null, null, 'BASIC LAST'));

PLAN_TABLE_OUTPUT                                      
EXPLAINED SQL STATEMENT:                               
------------------------                               
select * from t1 where  t1.x in (select t2.x from t2)  
                                                       
Plan hash value: 1167709942                            
                                                       
-------------------------------------                  
| Id  | Operation            | Name |                  
-------------------------------------                  
|   0 | SELECT STATEMENT     |      |                  
|   1 |  HASH JOIN RIGHT SEMI|      |                  
|   2 |   TABLE ACCESS FULL  | T2   |                  
|   3 |   TABLE ACCESS FULL  | T1   |                  
-------------------------------------   

select * from t2
where  t2.x in (select t1.x from t1);

select * 
from   table(dbms_xplan.display_cursor(null, null, 'BASIC LAST'));

PLAN_TABLE_OUTPUT                                      
EXPLAINED SQL STATEMENT:                               
------------------------                               
select * from t2 where  t2.x in (select t1.x from t1)  
                                                       
Plan hash value: 4118005679                            
                                                       
-----------------------------------                    
| Id  | Operation          | Name |                    
-----------------------------------                    
|   0 | SELECT STATEMENT   |      |                    
|   1 |  HASH JOIN SEMI    |      |                    
|   2 |   TABLE ACCESS FULL| T2   |                    
|   3 |   TABLE ACCESS FULL| T1   |                    
-----------------------------------


In both cases it's turned the subquery into a join. Due to this rewriting it may access the tables in a different order than you expect. The statement as a whole will still process from clauses first, then where, etc. But you can't say for sure how the subquery slots into this just by looking at it.

Further reading: Using Subqueries:

http://docs.oracle.com/database/122/SQLRF/Using-Subqueries.htm#SQLRF52357

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Process of Execution of sub queries

santhoshreddy podduturi, March 28, 2017 - 6:07 am UTC

is the difference they explained not correct always?
Chris Saxon
March 29, 2017 - 11:14 am UTC

Correlated vs non-correlated is nothing to do with the order of execution. Just whether the inner query references table(s) in the outer one.

A reader, November 07, 2017 - 7:12 am UTC


More to Explore

Performance

Get all the information about database performance in the Database Performance guide.