Skip to Main Content
  • Questions
  • Indexing strategy for dates in a query

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Geraldo.

Asked: February 13, 2020 - 1:53 pm UTC

Answered by: Chris Saxon - Last updated: February 18, 2020 - 6:03 pm UTC

Category: Database Development - Version: 18.6

Viewed 100+ times

You Asked

Hello, Ask Tom team.

I have the following query:

SELECT guid, sender_id,  doc, status, arrived_date, register_date, last_updated_date 
FROM user1.table1 
WHERE (sender_id=:SENDER OR :SENDER IS NULL ) AND (status=:STATUS OR :status IS NULL) 
AND TO_DATE(register_date, 'DD/MM/RRRR') 
BETWEEN TO_DATE(:FIRST_DATE, 'DD/MM/RRRR') AND TO_DATE(:SECOND_DATE, 'DD/MM/RRRR') 
ORDER BY arrived_date DESC


Explain Plan:

SQL_ID  8pubuxwscr06u, child number 1
-------------------------------------
SELECT guid, sender_id,  doc, status, arrived_date, register_date, last_updated_date 
FROM user1.table1 
WHERE (sender_id=:SENDER OR :SENDER IS NULL ) AND (status=:STATUS OR :status IS NULL) 
AND TO_DATE(register_date, 'DD/MM/RRRR') 
BETWEEN TO_DATE(:FIRST_DATE, 'DD/MM/RRRR') AND TO_DATE(:SECOND_DATE, 'DD/MM/RRRR') 
ORDER BY arrived_date DESC
 
Plan hash value: 1199727206
 
--------------------------------------------------------------------------------
| Id  | Operation           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |          |       |       |  4188 (100)|          |
|   1 |  SORT ORDER BY      |          |     7 |   616 |  4188   (1)| 00:00:01 |
|*  2 |   FILTER            |          |       |       |            |          |
|*  3 |    TABLE ACCESS FULL| TABLE1 |     7 |   616 |  4187   (1)| 00:00:01 |
--------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(TO_DATE(:FIRST_DATE,'DD/MM/RRRR')>=TO_DATE(:FIRST_DATE,'DD/MM/RRRR'))
   3 - filter(((:SENDER IS NULL OR "sender_id"=:SENDER) AND (:STATUS IS NULL 
              OR "status"=TO_NUMBER(:STATUS)) AND 
              TO_DATE(INTERNAL_FUNCTION("register_date"),'DD/MM/RRRR')>=TO_DATE(:FIRST_DATE,'DD/MM/RRRR') AND TO_DATE(INTERNAL_FUNCTION("register_date"),'DD/MM/RRRR')<=TO_DATE(:SECOND_DATE,'DD/MM/RRRR')))


1. register_date column has the time component but in my filter I do not care about it.
2. status column can have the following values: (0,1,2,3,4,5,6).
3. sender_id can be repeated in the table
4. doc column values can be repeated once per sender_id (here sender_id,doc is a unique key).

I think the problem is that in the query the user can omit sender_id or status (can be null). And is hard to index this. What do you think ?

I have other specific questions:

1. What indexes would help in this query?
2. Do I need to index all columns being used here?
3. When indexing date columns, how implement a index in order it can be used well no matter the time component?
4. Basically, when using this query is more important a specific date range because all the other values remains, most of the time, null (this means it bring them all (sender_id and status) for the specific date range).

Thanks in advanced.

Regards,

and we said...

Yikes, TO_DATEing DATE! You have implicit conversions going on!

This is why there's the INTERNAL_FUNCTION("register_date") nonsense in the predicates section of the plan.

Avoid!

Either TRUNC() the date. Or better still, use:

AND register_date >= TO_DATE(:FIRST_DATE, 'DD/MM/RRRR') 
AND register_date < TO_DATE(:SECOND_DATE, 'DD/MM/RRRR') + 1


This allows the optimizer to use an index on just register_date most effectively.

1. Assuming you rewrite the date filter as shown above, likely some combination of

register_date, status, sender


Possibly with arrived_date at the end to help avoid a sort.

Though the OR IS NULL conditions on the variables makes it harder for the optimizer to use those indexes.

2. No. Indexing all the columns in a query usually gives the best performance.

But if you do this for every query you'll end up with a mass of indexes.

Usually creating indexes on the most selective columns (those that return the fewest) rows is good enough.

3. Change the where clause as discussed above. Avoid functions on the column!

4. Is this a question?

and you rated our response

  (3 ratings)

Reviews

Follow Up

February 14, 2020 - 6:53 pm UTC

Reviewer: Geraldo

Possibly with arrived_date at the end to help avoid a sort.
Do you mean arrived_date at the end of the column list in SELECT statement?

4. Basically, when using this query is more important a specific date range because all the other values remains, most of the time, null (this means it bring them all (sender_id and status) for the specific date range). This was a kind of note. I mean that this query basically is filtered by register_date most of the time.

I am applying your recommendations and then I will post the explain plan.

Chris Saxon

Followup  

February 18, 2020 - 10:44 am UTC

Do you mean arrived_date at the end of the column list in SELECT statement?

I'm mean arrived date at the end of the column list in the index.

Follow Up

February 18, 2020 - 11:20 am UTC

Reviewer: Geraldo

Thanks for answering, Chris.

In your above comment there is not an index with arrived_date column.

You just wrote:
register_date, status, sender

I thought the index would be on those three columns. Please, can you explain this again?

Thanks in advanced.
Chris Saxon

Followup  

February 18, 2020 - 6:03 pm UTC

register_date, status, sender Possibly with arrived_date at the end

i.e.

register_date, status, sender, arrived_date

I'm not sure including arrived_date is worth it though. You'll need to test.

Follow Up

February 18, 2020 - 6:09 pm UTC

Reviewer: Geraldo

Ok. Thanks.

I will try and post the results here.

Regards,

More to Explore

Design

New to good database design? Check out Chris Saxon's full fundamentals class.