Hi,
I have the following query:
SELECT
sum(SAL_DXC.CEXREALPROG + SAL_DXC.CEXNONPROG),
CM_T_PERIODO_DI.CODPERIODO,
CM_T_PERIODO_DI.ANO,
CM_T_PERIODO_DI.NUMMES
FROM
CM_T_PERIODO_DI,
CM_T_DXC_HE SAL_DXC,
CM_T_HOSPITAL_DI CM_T_HOSPITAL_DI_IDHOSP
WHERE
( CM_T_HOSPITAL_DI_IDHOSP.IDHOSPITAL=SAL_DXC.IDHOSPITAL )
AND ( SAL_DXC.IDDATASALTROZO between CM_T_PERIODO_DI.IDDATADESDE and CM_T_PERIODO_DI.IDDATAHASTA and SAL_DXC.TROZOFINMOV=1 )
AND ( CM_T_PERIODO_DI.TIPOPERIODO = 'Meses'
)
AND ( CM_T_PERIODO_DI.TIPOPERIODO = 'Meses' )
AND ( CM_T_PERIODO_DI.TIPOPERIODO = 'Meses' )
AND
(
CM_T_HOSPITAL_DI_IDHOSP.NOMHOSP = 'C.H. DE OURENSE'
AND
CM_T_PERIODO_DI.CODPERIODO BETWEEN '2019/01' AND '2019/10'
AND (CM_T_PERIODO_DI.TIPOPERIODO = 'Meses'
)
)
GROUP BY
CM_T_PERIODO_DI.CODPERIODO,
CM_T_PERIODO_DI.ANO,
CM_T_PERIODO_DI.NUMMES
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 1118 | | 198 (1)| 00:00:03 |
| 1 | HASH GROUP BY | | 13 | 1118 | 1104K| 198 (1)| 00:00:03 |
| 2 | NESTED LOOPS | | 10922 | 917K| | 197 (0)| 00:00:03 |
| 3 | NESTED LOOPS | | 10922 | 917K| | 197 (0)| 00:00:03 |
| 4 | MERGE JOIN CARTESIAN | | 14 | 966 | | 10 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID | CM_T_HOSPITAL_DI | 1 | 35 | | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | CM_T_HOSPITAL_DI_NOMHOSP | 1 | | | 1 (0)| 00:00:01 |
| 7 | BUFFER SORT | | 14 | 476 | | 8 (0)| 00:00:01 |
|* 8 | TABLE ACCESS BY INDEX ROWID| CM_T_PERIODO_DI | 14 | 476 | | 8 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN | CM_IX_PERIODO_TIPOPERIODO | 456 | | | 1 (0)| 00:00:01 |
|* 10 | INDEX RANGE SCAN | CM_IX_DXC_IDDATATROZO | 109 | | | 0 (0)| 00:00:01 |
|* 11 | TABLE ACCESS BY INDEX ROWID | CM_T_DXC_HE | 760 | 12920 | | 13 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("CM_T_HOSPITAL_DI_IDHOSP"."NOMHOSP"='C.H. DE OURENSE')
8 - filter("CM_T_PERIODO_DI"."CODPERIODO">='2019/01' AND "CM_T_PERIODO_DI"."CODPERIODO"<='2019/10')
9 - access("CM_T_PERIODO_DI"."TIPOPERIODO"='Meses')
10 - access("SAL_DXC"."IDDATASALTROZO">="CM_T_PERIODO_DI"."IDDATADESDE" AND
"SAL_DXC"."IDDATASALTROZO"<="CM_T_PERIODO_DI"."IDDATAHASTA")
11 - filter("SAL_DXC"."TROZOFINMOV"=1 AND "CM_T_HOSPITAL_DI_IDHOSP"."IDHOSPITAL"="SAL_DXC"."IDHOSPITAL
")
The query runs well for small date range, but more than a year it runs very slowly and goes better with a SAL_DXC full scan. How can I make the optimizer to know the date range is too big to use the index?
Many thanks,
VĂctor
The query has a couple of range searches:
SAL_DXC.IDDATASALTROZO between CM_T_PERIODO_DI.IDDATADESDE and CM_T_PERIODO_DI.IDDATAHASTA
AND
CM_T_PERIODO_DI.CODPERIODO BETWEEN '2019/01' AND '2019/10'
As discussed in yesterday's SQL Office Hours, it's hard for the optimizer to get good estimates for these.
When the recording is available you can view it at:
https://asktom.oracle.com/pls/apex/f?p=100:551:::NO:RP,551:P551_CLASS_ID:6623 For the TL;DR in the meantime...
There are a few strategies you can use. Starting from roughly easiest/least impact to hardest/most impact:
Increase Dynamic Stats LevelAdding the
/*+ dynamic_sampling ( 11 ) */
hint or setting OPTIMIZER_DYNAMIC_SAMPLING in the session enables the optimizer to capture extra stats about your tables when choosing a plan.
This alone may be enough for it to get better estimates for the range predicates. And thus choose a better plan.
Create IndexesThe query uses an index to locate rows for CM_T_PERIODO_DI & CM_T_DXC_HE. But then does further filtering of the rows on the table. It's more efficient to find exactly the rows you want using the index. Then read only those rows from the table.
So I'd try creating indexes on:
CM_T_PERIODO_DI ( TIPOPERIODO, CODPERIODO )
CM_T_DXC_HE ( TROZOFINMOV, IDHOSPITAL, IDDATASALTROZO )
After you've created these you may end up with the same plan "shape". But the optimizer will be able to do the filtering at steps 8 & 10 in the index. So it accesses fewer rows from the tables.
So even if using dynamic stats "fixes" your plan, I'd still try these out too.
Check the StatsLine 10 estimates the index will locate 109 rows. But then on step 11 it estimates 760 rows.
So somehow it thinks that it'll use the index to find ~100 rows. But then return ~7 times that from the table!
Which is... suspicious. Check your table and index stats are up-to-date.
NB - increasing the dynamic_sampling should already address this in your query.
Upgrade!From 12c the optimizer can use adaptive plans. These enable the query to choose between nested loops and hash joins, based on the number of rows the first table returns. It's possible this will kick in, enabling the optimizer to full scan CM_T_DXC_HE and hash join it to the other tables.
Of course upgrading is a lot of effort, has other implications, and may not improve your query! But 11g is
over a decade old now. So is something you should have on your roadmap to happen "soon" anyway.