Skip to Main Content
  • Questions
  • Issue with OR expanding and subquery

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Kim.

Asked: January 18, 2017 - 2:10 pm UTC

Last updated: January 18, 2017 - 11:48 pm UTC

Version: 11.2.0.4 EE

Viewed 1000+ times

You Asked

Hi Tom 

I have an issue with OR expansion. Using literals, the OR conditions are expanded neatly.
However, I am unable to make the optimizer come up with a plan that uses the index, when I use a subquery.

drop table t1;
drop table t2;

create table t1 as 
select object_id id1, data_object_id id2 
from   all_objects where object_id is not null;

alter table t1 add constraint t1_pk primary key (id1);
create index t1_i1 on t1 (id2);

create table t2 as 
select id1, id2  
from   t1;

create index t2_i1 on t2 (id2);

select * 
from t1
where  id1 in ( 1, 2, 3)
or     id2 in ( 4, 5 ); 

select * from table(dbms_xplan.display_cursor);

SQL_ID  3nrmyb6ku7cqa, child number 0
-------------------------------------
select *  from t1 where  id1 in ( 1, 2, 3) or     id2 in ( 4, 5 )
 
Plan hash value: 3986143333
 
---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |       |       |     8 (100)|          |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   INLIST ITERATOR             |       |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| T1    |     1 |    26 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T1_I1 |    41 |       |     2   (0)| 00:00:01 |
|   5 |   INLIST ITERATOR             |       |       |       |            |          |
|*  6 |    TABLE ACCESS BY INDEX ROWID| T1    |   819 | 21294 |     5   (0)| 00:00:01 |
|*  7 |     INDEX UNIQUE SCAN         | T1_PK |    41 |       |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access(("ID2"=4 OR "ID2"=5))
   6 - filter((LNNVL("ID2"=4) AND LNNVL("ID2"=5)))
   7 - access(("ID1"=1 OR "ID1"=2 OR "ID1"=3))
 
Note
-----
   - dynamic sampling used for this statement (level=2)
 


select * 
from t1
where  id1 in (select id1 from t2 where id2 = 3)
or     id2 in ( 2, 3 ); 

select * from table(dbms_xplan.display_cursor);

SQL_ID  316pgp8n8jmup, child number 0
-------------------------------------
select *  from t1 where  id1 in (select id1 from t2 where id2 = 3) or   
  id2 in ( 2, 3 )
 
Plan hash value: 3689097399
 
--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |       |       |    37 (100)|          |
|*  1 |  FILTER                      |       |       |       |            |          |
|   2 |   TABLE ACCESS FULL          | T1    | 83567 |  2121K|    37   (3)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| T2    |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | T2_I1 |   318 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter((INTERNAL_FUNCTION("ID2") OR  IS NOT NULL))
   3 - filter("ID1"=:B1)
   4 - access("ID2"=3)
 
Note
-----
   - dynamic sampling used for this statement (level=2)



with x as ( select /*+ mateialize */ id1 from t2 where id2 = 3 )
select * 
from t1
where  id1 in ( select id1 from x )
or     id2 in ( 2, 3 ); 

select * from table(dbms_xplan.display_cursor);

SQL_ID  0fwy3zm22fmuy, child number 0
-------------------------------------
with x as ( select /*+ mateialize */ id1 from t2 where id2 = 3 ) select 
*  from t1 where  id1 in ( select id1 from x ) or     id2 in ( 2, 3 )
 
Plan hash value: 3689097399
 
--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |       |       |    37 (100)|          |
|*  1 |  FILTER                      |       |       |       |            |          |
|   2 |   TABLE ACCESS FULL          | T1    | 83567 |  2121K|    37   (3)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| T2    |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | T2_I1 |   318 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter((INTERNAL_FUNCTION("ID2") OR  IS NOT NULL))
   3 - filter("ID1"=:B1)
   4 - access("ID2"=3)
 
Note
-----
   - dynamic sampling used for this statement (level=2)
 


Why is the optimizer insisting on full table scanning T1 ?


Br,
Kim 


with LiveSQL Test Case:

and we said...

In Oracle Database 11g Release 2 OR-Expansion is a heuristic base query transformation and we don’t consider statements with subqueries for OR-Expansion since we can not unnest the subquery that would end up in one of the branching.

Starting in Oracle Database 12c R2, OR-Expansion is a cost-based transformation, which means both OR-Expansion and subquery unnesting could take place in the same query. Making it possible to be used in your example.

More information on the cost-based OR expansion transformation can be found in the "What to expect from the Optimizer when upgrading from 11g to 12c" whitepaper, http://www.oracle.com/technetwork/database/bi-datawarehousing/twp-optimizer-with-oracledb-12c-1963236.pdf

Without OR-Expansion, the Optimizer selects a full table scan of T1 in your second and third query, because of the cost. An index scan of T1_PK followed by a table access by ROWID is a lot more expensive.

You can see the costs if you generate a 10053 trace or if you force the index plan with a hint.

select /*+ index(t1) */ *
  2  from t1
  3  where  id1 in (select id1 from t2 where id2 = 3)
  4  or     id2 in ( 2, 3 );

  select * from table(dbms_xplan.display_cursor);

----------------------------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      |      |   49 (100)|      |
|*  1 |  FILTER         |      |      |      |    |      |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    | 8192 |  208K|   49   (7)| 00:00:01 |
|   3 |    INDEX FULL SCAN       | T1_PK | 8192 |      |   20   (5)| 00:00:01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID BATCHED| T2    |    1 |   26 |    2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN       | T2_I1 |    1 |      |    1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------


==== Abstract from 10053 Trace File ============================
  Table: T1  Alias: T1
    Card: Original: 8193.000000  Rounded: 8193  Computed: 8193.000000  Non Adjusted: 8193.000000
  Scan IO  Cost (Disk) =   7.000000
  Scan CPU Cost (Disk) =   1542360.240000
  Cost of predicates:
    io = 0.000000, cpu = 69.800000, sel = 0.059500 flag = 2048  (OR chain)
      io = NOCOST, cpu = 50.000000, sel = 0.010000 flag = 2049  (Inlist)
        io = NOCOST, cpu = NOCOST, sel = NOSEL flag = 0  ("T1"."ID2"=2)
        io = NOCOST, cpu = NOCOST, sel = NOSEL flag = 0  ("T1"."ID2"=3)
      io = NOCOST, cpu = 20.000000, sel = 0.050000 flag = 2562  ( EXISTS (SELECT 0 FROM "T2" "T2" WHERE "T2"."ID2"=3 AND "T2"."ID1"=:B1))
  Total Scan IO  Cost  =   7.000000 (scan (Disk))
                         + 0.000000 (io filter eval) (= 0.000000 (per row) * 8193.000000 (#rows))
                       =   7.000000
  Total Scan CPU  Cost =   1542360.240000 (scan (Disk))
                         + 571871.400000 (cpu filter eval) (= 69.800000 (per row) * 8193.000000 (#rows))
                       =   2114231.640000
  Access Path: TableScan
    Cost:  8.761860  Resp: 8.761860  Degree: 0
      Cost_io: 7.000000  Cost_cpu: 2114232
      Resp_io: 7.000000  Resp_cpu: 2114232
  ****** trying bitmap/domain indexes ******
 
****** Costing Index T1_PK
  Access Path: index (FullScan)
    Index: T1_PK
    resc_io: 19.000000  resc_cpu: 1773907
    ix_sel: 1.000000  ix_sel_with_filters: 1.000000
    Cost: 20.478256  Resp: 20.478256  Degree: 0
  ****** finished trying bitmap/domain indexes ******
  Best:: AccessPath: TableScan
         Cost: 8.761860  Degree: 1  Resp: 8.761860  Card: 8193.000000  Bytes: 0.000000


Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

from Tom

Rajeshwaran Jeyabal, January 19, 2017 - 5:46 am UTC

Thank You

Kim, January 19, 2017 - 7:27 am UTC

Thank You Maria, thank was very helpful.

Of course, our real issue is with large tables, where the full table scan is very costly. So when we get to 12c, the optimizer should choose the index based OR expansion.

Br,
Kim

Query optimization for View predicate pushing

Rajeshwaran Jeyabal, December 09, 2023 - 2:21 pm UTC

The below example is modelled like our application scenario.

create table t1 as select * from all_users;
create table t2 as select * from all_users;
create table t3 as select * from all_users;
create table t4 as select * from all_users;
alter table t1 add constraint t1_pk primary key(user_id);
alter table t2 add constraint t2_pk primary key(user_id);
alter table t3 add constraint t3_pk primary key(user_id);
alter table t4 add constraint t4_pk primary key(user_id);
create unique index t4_idx on t4(username);
create or replace view v 
as 
select 'A' as id ,
    user_id as x,
    to_number(null) as y,
    to_number(null) as z
from t1 
union all     
select 'B' as id ,
    to_number(null) as x,
    user_id as y,
    to_number(null) as z
from t2 
union all     
select 'C' as id ,
    to_number(null) as x,
    to_number(null) as y,
    user_id as z
from t3;

when we do filter on view using literals/bind variables - it produces range scan on the required query blocks within union all structure and it looks optimal.


rajesh@DECSDEV> select * from v where x = 55;
no rows selected
rajesh@DECSDEV> select * from table( dbms_xplan.display_cursor(format=>'allstats last') );
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  3abhs08c4sztr, child number 0
-------------------------------------
select * from v where x = 55
Plan hash value: 580289717
---------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |      1 |        |      0 |00:00:00.01 |       1 |
|   1 |  VIEW               | V     |      1 |      3 |      0 |00:00:00.01 |       1 |
|   2 |   UNION-ALL         |       |      1 |        |      0 |00:00:00.01 |       1 |
|*  3 |    INDEX UNIQUE SCAN| T1_PK |      1 |      1 |      0 |00:00:00.01 |       1 |
|*  4 |    FILTER           |       |      1 |        |      0 |00:00:00.01 |       0 |
|   5 |     INDEX FULL SCAN | T2_PK |      0 |     81 |      0 |00:00:00.01 |       0 |
|*  6 |    FILTER           |       |      1 |        |      0 |00:00:00.01 |       0 |
|   7 |     INDEX FULL SCAN | T3_PK |      0 |     81 |      0 |00:00:00.01 |       0 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - access("USER_ID"=55)
  4 - filter(NULL IS NOT NULL)
  6 - filter(NULL IS NOT NULL)


but when we supply a subquery in-place of literal/bind variables, the optimizer is not pushing the results of the subquery (or merging the subquery) to the required query block to produced range-scan but instead it ended up with INDEX-Full scan

rajesh@DECSDEV> select * from v where x in ( select user_id from t4 where username ='SYS' ) and x is not null;
I          X          Y          Z
- ---------- ---------- ----------
A          0
rajesh@DECSDEV> select * from table( dbms_xplan.display_cursor(format=>'allstats last') );
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  c2x0xtd892bsd, child number 0
-------------------------------------
select * from v where x in ( select user_id from t4 where username
='SYS' ) and x is not null
Plan hash value: 2986897572
-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |      1 |00:00:00.01 |       4 |
|   1 |  NESTED LOOPS                |        |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T4     |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  3 |    INDEX UNIQUE SCAN         | T4_IDX |      1 |      1 |      1 |00:00:00.01 |       1 |
|*  4 |   VIEW                       | V      |      1 |      1 |      1 |00:00:00.01 |       2 |
|   5 |    UNION-ALL                 |        |      1 |        |     81 |00:00:00.01 |       2 |
|   6 |     INDEX FULL SCAN          | T1_PK  |      1 |     81 |     81 |00:00:00.01 |       2 |
|*  7 |     FILTER                   |        |      1 |        |      0 |00:00:00.01 |       0 |
|   8 |      INDEX FULL SCAN         | T2_PK  |      0 |     81 |      0 |00:00:00.01 |       0 |
|*  9 |     FILTER                   |        |      1 |        |      0 |00:00:00.01 |       0 |
|  10 |      INDEX FULL SCAN         | T3_PK  |      0 |     81 |      0 |00:00:00.01 |       0 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - access("USERNAME"='SYS')
  4 - filter("X"="USER_ID")
  7 - filter(NULL IS NOT NULL)
  9 - filter(NULL IS NOT NULL)

31 rows selected.

anything can be done/modified to that above query to turn Index full scan into Index range scan?

the above demo was from 19c ATP-S instance.

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.