Skip to Main Content
  • Questions
  • Timestamp comparison against literal value in case statement - Performance Issue

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Paul.

Asked: February 11, 2021 - 4:01 am UTC

Last updated: February 11, 2021 - 4:47 pm UTC

Version: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

Viewed 1000+ times

You Asked

Hi Tom,

I have a large database table that I am trying to pull data from in summary format, which involves summarizing the data with a variable start date. For the sake of example, I have framed the question as a much simpler version of the code than I actually have in production. For the purposes of this example, assume I have a table of orders that I want to pull from, which contains order information by chain, store, and date (date is a timestamp column).

The orders table contains orders from the time the stores came into being and started stocking the shelves, but I only want data from the time the individual store actually opened its doors to the public. For that reason, I have built a case statement which has hard-coded the store opening date (yes, I know this is not good design, but assume the database does not have that information, and I have been asked to create an ad-hoc report that will be used once and thrown away ;) ).

My issue is this. If I hard-code a single start date, the query returns in a matter of seconds. However, this does not work for me because each individual store has a unique opening date, so I have hard-coded the opening date using date literals for each store into a CASE statment. Although I would expect an increase in processing time to execute the CASE statement, it results in an EXPONENTIAL increase in processing time.

I have tried changing the TO-DATE in the case literals to TIMESTAMP, thinking that might avoid an implicit conversion, but it does not seem to make a difference.

So, I am looking for a better way to do this. My explain plan shows indexes are being used, there is an index by chainid, storeid, and ordertmst.

It is obviously a monkey wrench thrown into the works by the CASE statement, but I cannot figure out why the disparity is so very large.

select vtc.chainid, vtc.storeid, vtc.orderid, vtc.ordertmst, count(*) as itemcount
             from orders vtc 
             where vtc.backlog = 'N'
               and vtc.ordertmst >= --to_date ( '20200101', 'yyyymmdd' )  -- <<< if I use this instead of the case, results come back almost immediately
               ----- this case statment causes the query to return in HOURS instead of seconds.
               case vtc.chainid                                              
                        when 501 then to_date ( '20200101', 'yyyymmdd' ) 
                        when 504 then 
                           case
                           when vtc.storeid in ( 2 ) then to_date ( '2020021400', 'yyyymmddhh24' ) 
                           when vtc.storeid in ( 12 ) then to_date ( '2020020310', 'yyyymmddhh24' ) 
                           when vtc.storeid in ( 9, 10 ) then to_date ( '2020022506', 'yyyymmddhh24' ) 
                           when vtc.storeid in ( 5, 6 ) then to_date ( '202003251355', 'yyyymmddhh24mi' ) 
                           when vtc.storeid in ( 7, 8 ) then to_date ( '202003121920', 'yyyymmddhh24mi' )  
                           when vtc.storeid in ( 3, 4 ) then to_date ( '202004101355', 'yyyymmddhh24mi' ) 
                           else to_date ( '20200101', 'yyyymmdd' )  
                           end 
                        when 507 then 
                           case
                             when vtc.storeid in ( 1, 2 ) then to_date ( '202004160950', 'yyyymmddhh24mi' )
                             when vtc.storeid in ( 3, 4 ) then to_date ( '202005051400', 'yyyymmddhh24mi' )
                             else to_date ( '47120101', 'yyyymmdd' )  
                           end 
                        when 502 then 
                           case
                           when vtc.storeid in ( 11 ) then to_date ( '202004031800', 'yyyymmddhh24mi' ) 
                           when vtc.storeid in ( 14 ) then to_date ( '202004231200', 'yyyymmddhh24mi' ) 
                           else to_date ( '20200101', 'yyyymmdd' )  
                           end 
                        when 701 then to_date ( '2020011403', 'yyyymmddhh24' )  
                        when 702 then to_date ( '2020010600', 'yyyymmddhh24' )  
                        when 608 then to_date ( '202001290235', 'yyyymmddhh24mi' )  
                        when 607 then to_date ( '202002030349', 'yyyymmddhh24mi' )  
                        when 601 then to_date ( '202002110021', 'yyyymmddhh24mi' )  
                        when 602 then to_date ( '202002162335', 'yyyymmddhh24mi' )  
                        when 606 then to_date ( '202002240142', 'yyyymmddhh24mi' )  
                        when 618 then to_date ( '202002262130', 'yyyymmddhh24mi' )  
                        when 614 then to_date ( '202003012227', 'yyyymmddhh24mi' )  
                        when 605 then to_date ( '202003230300', 'yyyymmddhh24mi' )   
                        when 610 then to_date ( '202003200900', 'yyyymmddhh24mi' )    
                        when 609 then to_date ( '202003292340', 'yyyymmddhh24mi' )   
                        when 613 then to_date ( '202004052000', 'yyyymmddhh24mi' )  
                        when 617 then to_date ( '202004072300', 'yyyymmddhh24mi' )  
                        when 506 then to_date ( '202005100908', 'yyyymmddhh24mi' )  
                        else to_date ( '20200101', 'yyyymmdd' )
                            end 
group by vtc.chainid, vtc.storeid, vtc.orderid, vtc.ordertmst


and Chris said...

The optimizer is unable to use an index on the timestamp column when you compare it to the case expression. So I suspect you're seeing the query go from a nice, fast index access to less efficient access method. If you share the two plans we can comment further.

Here's an example similar to your situation. The first query compares the date column to a date literal and uses the index. But put the same dates in a complex case expression and it switches to a full table scan:

create table t (
  c1 int, c2 date, c3 int, c4 varchar2(100)
);

insert into t 
with rws as (
  select level x from dual
  connect by level <= 1000
)
  select x, date'2020-01-01' + ( x / 2 ), 
         mod ( x, 4 ), lpad ('x', 100, 'x' ) 
  from   rws;
commit;

create index i on t ( c2 );

set serveroutput off

select * from t
where  c2 >= date'2021-05-01';

select * 
from   dbms_xplan.display_cursor( format => 'BASIC LAST +PREDICATE');

----------------------------------------------------                 
| Id  | Operation                           | Name |                 
----------------------------------------------------                 
|   0 | SELECT STATEMENT                    |      |                 
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |                 
|*  2 |   INDEX RANGE SCAN                  | I    |                 
----------------------------------------------------                 
                                                                     
Predicate Information (identified by operation id):                  
---------------------------------------------------                  
                                                                     
   2 - access("C2">=TO_DATE(' 2021-05-01 00:00:00', 'syyyy-mm-dd     
              hh24:mi:ss'))

select * from t
where  c2 >= case c3
  when 0 then date'2021-05-01'
  when 1 then date'2021-05-01'
  when 2 then date'2021-05-01'
  when 3 then date'2021-05-01'
end;

select * 
from   dbms_xplan.display_cursor( format => 'BASIC LAST +PREDICATE');

----------------------------------                                                      
| Id  | Operation         | Name |                                                      
----------------------------------                                                      
|   0 | SELECT STATEMENT  |      |                                                      
|*  1 |  TABLE ACCESS FULL| T    |                                                      
----------------------------------                                                      
                                                                                        
Predicate Information (identified by operation id):                                     
---------------------------------------------------                                     
                                                                                        
   1 - filter("C2">=CASE "C3" WHEN 0 THEN TO_DATE(' 2021-05-01                          
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') WHEN 1 THEN TO_DATE(' 2021-05-01     
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') WHEN 2 THEN TO_DATE(' 2021-05-01     
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') WHEN 3 THEN TO_DATE(' 2021-05-01     
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') END )   


In your example it looks to me like all the dates in the expression are on or after the original date you used. So an easy workaround (for now) is:

Do both!

Keep the simple literal and the complex case expression.

You will almost certainly need to come up with a better solution long-term though. Over time your data will grow, and the (presumably) index will no longer be effective, meaning the optimizer switches to a different access method, which may lead you back to your current situation.

Is this answer out of date? If it is, please let us know via a Comment

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.