Skip to Main Content
  • Questions
  • Check Constraints and Explain Plan Filter Predicates

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, Patrick.

Asked: November 28, 2019 - 1:59 pm UTC

Last updated: June 22, 2021 - 9:35 am UTC

Version: 12c

Viewed 1000+ times

You Asked

Why does the Oracle SQL Optimizer include a filter predicate for a Check Constraint when generating an execution plan for a SELECT statement? If the constraint is valid (according to DBA_CONSTRAINTS), then the table rows are all compliant with the constraint, so why generate a corresponding filter predicate?

Here is an example:
CREATE TABLE TESTER 
(
   PARENTID NUMBER(19,0) NOT NULL, 
   DATAID NUMBER(19,0) NOT NULL,
   SOMEVALUE NUMBER (19,0)
   CONSTRAINT LOOP_CHECK CHECK (DATAID <> PARENTID) ENABLE
);

CREATE INDEX TESTER_DATAID ON TESTER (DATAID);

INSERT INTO TESTER(PARENTID< DATAID< SOMEVALUE) VALUES (0,100,1);
INSERT INTO TESTER(PARENTID< DATAID< SOMEVALUE) VALUES (0,200,2);
INSERT INTO TESTER(PARENTID< DATAID< SOMEVALUE) VALUES (100,1001,123);
INSERT INTO TESTER(PARENTID< DATAID< SOMEVALUE) VALUES (100,1002,456);
INSERT INTO TESTER(PARENTID< DATAID< SOMEVALUE) VALUES (100,1003,789);
INSERT INTO TESTER(PARENTID< DATAID< SOMEVALUE) VALUES (200,2001,234);
INSERT INTO TESTER(PARENTID< DATAID< SOMEVALUE) VALUES (200,2002,567);
INSERT INTO TESTER(PARENTID< DATAID< SOMEVALUE) VALUES (200,2003,890);

SELECT * FROM TESTER WHERE DATAID IN (1001, 2001);


If you check the execution plan for the final SELECT statement, you will see an Access Predicate, with an OR condition and the terms DATAID=1001 and DATAID=2001. This is expected.

You will also see a Filter Predicate resulting from the Check Constraint. It contains an OR condition, and the terms PARENTID<>1000 and PARENTID<>2001.

1) Why is the filter predicate being generated? If the check is consistent then it is known that no records will fail the filter.

2) The filter is doubly redundant. Given that it represents the condition (PARENTID<>1001) OR (PARENTID<>2001), it is logically always TRUE.

3) Is there any significant overhead in the query processing resulting from the filter predicate?

This example is trivial, but it is based on an actual case at one of my clients, where the table being queried contains > 100 million rows, and there are system-generated queries against the table where the "DATAID IN(...)" clause contains hundreds of values. It would depend on how Oracle evaluates the resulting (PARENTID <> m) OR (PARENTID <> n) OR.. inequalities. If it stops after the first condition that evaluates to TRUE, then it will never need to check beyond the first two inequalities. If it evaluates all of the conditions regardless, then it is performing a potentially large number of unnecessary check comparisons for every row returned by the associated access predicate.

and Chris said...

If you look at the "Final query after transformations" - which you can get from the lovely 10053 trace - you'll see something like:

select "TESTER"."PARENTID"    "PARENTID",
       "TESTER"."DATAID"      "DATAID",
       "TESTER"."SOMEVALUE"   "SOMEVALUE"
from   "CHRIS"."TESTER" "TESTER"
where  ("TESTER"."DATAID" = 1001 or "TESTER"."DATAID" = 2001)
and    (1001 <> "TESTER"."PARENTID" or 2001 <> "TESTER"."PARENTID")


The plan filters are in the query. So they become part of the plan.

These come from the transitive predicate generation steps.

So why bother with this?

Well in some cases it enables you to get better plans.

Consider this:

create table t (
  c1 int not null
    check ( c1 > 0 ),
  c2 int not null,
  check ( c1 = c2 )
);

create index i1 
  on t ( c1 );

insert into t 
  with rws as (
    select level x from dual
    connect by level <= 10
  )
    select x, x from rws;

commit;
  
exec dbms_stats.gather_table_stats ( user, 't' ) ;

select * from t
where  c2 = 0;


From the constraints, we know that:

- C1 = C2
- C2 > 0
=> the query returns no rows

Also, because the columns are equal, in theory the optimizer could use the index on C1 instead of full scanning the table. Even though only C2 appears in the where clause.

So what happens?

The plan shows:

--------------------------------------------------------------------------------    
| Id  | Operation                            | Name | Starts | E-Rows | A-Rows |    
--------------------------------------------------------------------------------    
|   0 | SELECT STATEMENT                     |      |      1 |        |      0 |    
|*  1 |  FILTER                              |      |      1 |        |      0 |    
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T    |      0 |      1 |      0 |    
|*  3 |    INDEX RANGE SCAN                  | I1   |      0 |      1 |      0 |    
--------------------------------------------------------------------------------    
                                                                                    
Predicate Information (identified by operation id):                                 
---------------------------------------------------                                 
                                                                                    
   1 - filter(NULL IS NOT NULL)                                                     
   2 - filter(("C2"=0 AND "C2">0))                                                  
   3 - access("C1"=0) 


It has indeed used the index on c1. Even though we queried c2! :)

Even better, the optimizer has identified the query returns no rows. So it doesn't even bother to access the index.

You can see this from the NULL IS NOT NULL filter at step 1. And zero starts for operations 2 & 3.

Of course, there will always be cases where this makes no difference or even ends up with a worse plan. Such is the consequence of a general case solution.

With respect to your scenario:

Simple inequalities are cheap to run. But as you have so many of them it's possible you have a noticeable overhead. So really the task for you is:

TEST!

Try running the queries with and without the constraint enabled. See if this makes a measurable difference to performance.

If it does, raise an enhancement request showing the effect. Describe the business impact this has. We can then prioritise this against other enhancements.

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

One thing still unclear

Patrick McManus, December 17, 2019 - 12:53 pm UTC

Hi Chris, and thanks for the response, it explains the situation clearly. Unfortunately, the database is provided by a third party vendor and I have no control over the presence of the check constraint in question.

There is still one thing that isn't clear to me. When Oracle is evaluating the filter predicate (PARENTID<>1001) OR (PARENTID<>2000) OR... (PARENTID<>nnnn), does it stop as soon as it reaches the first expression that evaluates to TRUE, or does it continue to evaluate every inequality in the list? I am assuming that it first parses the entire query to determine that there are only 'OR' conditions between the inequalities, and no others like 'AND' or 'NOT'.

If it stops at the first inequality that evaluates to TRUE, then it doesn't matter how many inequalities there are in the clause, it will never need to check beyond the first two, and there shouldn't be any performance impact to speak of.
Chris Saxon
December 17, 2019 - 4:27 pm UTC

does it stop as soon as it reaches the first expression that evaluates to TRUE, or does it continue to evaluate every inequality in the list?

It certainly can use short-circuit evaluation.

You can verify with something like this:

create or replace function f ( p int ) 
  return int as
  retval int;
begin
  dbms_output.put_line ( 'Called: ' || p );
  return p + 1;
end f;
/

create table t (
  c1 int
);

insert into t ( c1 ) 
with rws as (
  select level x from dual
  connect by level <= 10
)
  select * from rws;
  
commit;

set serveroutput on
select c1
from   t
where  ( f ( c1 ) <> 2 or f ( c1 ) <> 3 );


The function adds one to the input. And displays "Called: X" where X is the current C1 value.

So the database only needs to evaluate the second expression (f <> 3) when C1 = 1. So you would expected "Called: 1" twice in the output. And every other value once.

Which is exactly what I see:

C1   
    1 
    2 
    3 
    4 
    5 
    6 
    7 
    8 
    9 
   10 


10 rows selected. 

Called: 1
Called: 1
Called: 2
Called: 3
Called: 4
Called: 5
Called: 6
Called: 7
Called: 8
Called: 9
Called: 10


So I'd only look into this further if the query is "too slow".

What are exact requirements on transitive predicates generation?

Edgar, June 22, 2021 - 6:44 am UTC

I have the opposite problem.
I am trying to add a check constraint to improve the execution plan.
But transitive predicates are not generated.


create table basketitem
(   
    basketitemid NUMBER(16,0) NOT NULL,
    basketid NUMBER(16,0) NOT NULL,
 -- many columns
    attrs clob,
    constraint pk_basketitem primary key (basketitemid) using index local,
 constraint ck1_basketitem check( basketitemid >=  trunc(basketid/17592186044416) * 17592186044416 ),
 constraint ck2_basketitem check( basketitemid <   trunc(basketid/17592186044416 + 1) * 17592186044416 )
)
 row archival
 partition by range (basketitemid) interval (17592186044416) -- 2 ** 44
 (partition p0 values less than (17592186044416))
;
 
create index i1_basketitem on basketitem (basketid) compress 1 local;
 
select * from basketitem where basketid = :b;
 
----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |               |     1 |  4030 |     1   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL                       |               |     1 |  4030 |     1   (0)| 00:00:01 |     1 |1048575|
|*  2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| BASKETITEM    |     1 |  4030 |     1   (0)| 00:00:01 |     1 |1048575|
|*  3 |    INDEX RANGE SCAN                        | I1_BASKETITEM |     1 |       |     1   (0)| 00:00:01 |     1 |1048575|
----------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("BASKETITEM"."ORA_ARCHIVE_STATE"='0')
   3 - access("BASKETID"=TO_NUMBER(:B))


No luck, partition range all = bad plan.

At the same time, while adding predicates derived from constraints manually, I can get better plan

select * from basketitem where basketid = :b
 and basketitemid >=  trunc(:b/17592186044416) * 17592186044416
 and basketitemid <   trunc(:b/17592186044416 + 1) * 17592186044416;
 
-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |               |     1 |  4030 |     1   (0)| 00:00:01 |       |       |
|*  1 |  FILTER                                     |               |       |       |            |          |       |       |
|   2 |   PARTITION RANGE ITERATOR                  |               |     1 |  4030 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  3 |    TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| BASKETITEM    |     1 |  4030 |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  4 |     INDEX RANGE SCAN                        | I1_BASKETITEM |     1 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
-----------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(TRUNC(TO_NUMBER(:B)/17592186044416+1)*17592186044416>TRUNC(TO_NUMBER(:B)/17592186044416)*17592186044416
              )
   3 - filter("BASKETITEM"."ORA_ARCHIVE_STATE"='0' AND "BASKETITEMID">=TRUNC(TO_NUMBER(:B)/17592186044416)*1759218604
              4416 AND "BASKETITEMID"<TRUNC(TO_NUMBER(:B)/17592186044416+1)*17592186044416)
   4 - access("BASKETID"=TO_NUMBER(:B))


The query below does not produce good plan either:

select * from basketitem where basketid = :b
 and basketitemid >=  trunc(basketid/17592186044416) * 17592186044416
 and basketitemid <   trunc(basketid/17592186044416 + 1) * 17592186044416;


So my question is: what are exact requirements on transitive predicates generation?
Is it documented somewhere?

Chris Saxon
June 22, 2021 - 9:35 am UTC

The optimizer needs a constant to generate transitive predicates. As MOS note 68979.1 says:

In 11g there are the following rules for generating predicate using transitivity:

Standard Predicate:

a = b and a = 10
then b = 10


So if you add a constraint limiting basketitemid in terms of some constant, you'll find this predicate added to the plan. Though this still isn't enough for the optimizer to limit the partition selection!

create table basketitem (   
  basketitemid NUMBER(16,0) NOT NULL,
  basketid NUMBER(16,0) NOT NULL,
  constraint pk_basketitem primary key (basketitemid) using index local,
  constraint ck1_basketitem check( basketitemid >= trunc(basketid/17592186044416) * 17592186044416 ),
  constraint ck2_basketitem check( basketitemid < trunc(basketid/17592186044416 + 1) * 17592186044416 ),
  constraint ck3_basketitem check( basketitemid < 17592186044416 )
) partition by range (basketitemid) interval (17592186044416) -- 2 ** 44
 (partition p0 values less than (17592186044416))
;

create index i1_basketitem on basketitem (basketid) compress 1 local;
 
var b number

select * from basketitem where basketid = :b;

select * 
from   table(dbms_xplan.display_cursor(null, null, 'TYPICAL LAST +PARTITION'));

----------------------------------------------------------------------------------------------------------------------------    
| Id  | Operation                                  | Name          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    
----------------------------------------------------------------------------------------------------------------------------    
|   0 | SELECT STATEMENT                           |               |       |       |     1 (100)|          |       |       |    
|   1 |  PARTITION RANGE ALL                       |               |     1 |    26 |     1   (0)| 00:00:01 |     1 |1048575|    
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| BASKETITEM    |     1 |    26 |     1   (0)| 00:00:01 |     1 |1048575|    
|*  3 |    INDEX RANGE SCAN                        | I1_BASKETITEM |     1 |       |     1   (0)| 00:00:01 |     1 |1048575|    
----------------------------------------------------------------------------------------------------------------------------    
                                                                                                                                
Predicate Information (identified by operation id):                                                                             
---------------------------------------------------                                                                             
                                                                                                                                
   3 - access("BASKETID"=:B)                                                                                                    
       filter(TRUNC("BASKETID"/17592186044416)*17592186044416<17592186044416) 


No luck, partition range all = bad plan.


Is the plan without the partition filter really that bad in practice?

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.