You Asked

Hi,

I have the following query:

SELECT
  sum(SAL_DXC.CEXREALPROG + SAL_DXC.CEXNONPROG),
  CM_T_PERIODO_DI.CODPERIODO,
  CM_T_PERIODO_DI.ANO,
  CM_T_PERIODO_DI.NUMMES
FROM
  CM_T_PERIODO_DI,
  CM_T_DXC_HE  SAL_DXC,
  CM_T_HOSPITAL_DI  CM_T_HOSPITAL_DI_IDHOSP
WHERE
  ( CM_T_HOSPITAL_DI_IDHOSP.IDHOSPITAL=SAL_DXC.IDHOSPITAL  )
  AND  ( SAL_DXC.IDDATASALTROZO between CM_T_PERIODO_DI.IDDATADESDE and  CM_T_PERIODO_DI.IDDATAHASTA and SAL_DXC.TROZOFINMOV=1  )
  AND  ( CM_T_PERIODO_DI.TIPOPERIODO  =  'Meses'
  )
  AND  ( CM_T_PERIODO_DI.TIPOPERIODO  =  'Meses'  )
  AND  ( CM_T_PERIODO_DI.TIPOPERIODO  =  'Meses'  )
  AND  
  (
   CM_T_HOSPITAL_DI_IDHOSP.NOMHOSP  =  'C.H. DE OURENSE'
   AND
   CM_T_PERIODO_DI.CODPERIODO  BETWEEN  '2019/01'  AND  '2019/10'
  AND  (CM_T_PERIODO_DI.TIPOPERIODO  =  'Meses'
)
  )
GROUP BY
  CM_T_PERIODO_DI.CODPERIODO,
  CM_T_PERIODO_DI.ANO,
  CM_T_PERIODO_DI.NUMMES


----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                           |    13 |  1118 |       |   198   (1)| 00:00:03 |
|   1 |  HASH GROUP BY                   |                           |    13 |  1118 |  1104K|   198   (1)| 00:00:03 |
|   2 |   NESTED LOOPS                   |                           | 10922 |   917K|       |   197   (0)| 00:00:03 |
|   3 |    NESTED LOOPS                  |                           | 10922 |   917K|       |   197   (0)| 00:00:03 |
|   4 |     MERGE JOIN CARTESIAN         |                           |    14 |   966 |       |    10   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID | CM_T_HOSPITAL_DI          |     1 |    35 |       |     2   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN           | CM_T_HOSPITAL_DI_NOMHOSP  |     1 |       |       |     1   (0)| 00:00:01 |
|   7 |      BUFFER SORT                 |                           |    14 |   476 |       |     8   (0)| 00:00:01 |
|*  8 |       TABLE ACCESS BY INDEX ROWID| CM_T_PERIODO_DI           |    14 |   476 |       |     8   (0)| 00:00:01 |
|*  9 |        INDEX RANGE SCAN          | CM_IX_PERIODO_TIPOPERIODO |   456 |       |       |     1   (0)| 00:00:01 |
|* 10 |     INDEX RANGE SCAN             | CM_IX_DXC_IDDATATROZO     |   109 |       |       |     0   (0)| 00:00:01 |
|* 11 |    TABLE ACCESS BY INDEX ROWID   | CM_T_DXC_HE               |   760 | 12920 |       |    13   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   6 - access("CM_T_HOSPITAL_DI_IDHOSP"."NOMHOSP"='C.H. DE OURENSE')
   8 - filter("CM_T_PERIODO_DI"."CODPERIODO">='2019/01' AND "CM_T_PERIODO_DI"."CODPERIODO"<='2019/10')
   9 - access("CM_T_PERIODO_DI"."TIPOPERIODO"='Meses')
  10 - access("SAL_DXC"."IDDATASALTROZO">="CM_T_PERIODO_DI"."IDDATADESDE" AND 
              "SAL_DXC"."IDDATASALTROZO"<="CM_T_PERIODO_DI"."IDDATAHASTA")
  11 - filter("SAL_DXC"."TROZOFINMOV"=1 AND "CM_T_HOSPITAL_DI_IDHOSP"."IDHOSPITAL"="SAL_DXC"."IDHOSPITAL
              ")

The query runs well for small date range, but more than a year it runs very slowly and goes better with a SAL_DXC full scan. How can I make the optimizer to know the date range is too big to use the index?

Many thanks,
Víctor

and Chris said...

The query has a couple of range searches:

SAL_DXC.IDDATASALTROZO between CM_T_PERIODO_DI.IDDATADESDE and  CM_T_PERIODO_DI.IDDATAHASTA 
AND
CM_T_PERIODO_DI.CODPERIODO  BETWEEN  '2019/01'  AND  '2019/10'

As discussed in yesterday's SQL Office Hours, it's hard for the optimizer to get good estimates for these.

When the recording is available you can view it at:

https://asktom.oracle.com/pls/apex/f?p=100:551:::NO:RP,551:P551_CLASS_ID:6623

For the TL;DR in the meantime...

There are a few strategies you can use. Starting from roughly easiest/least impact to hardest/most impact:

Increase Dynamic Stats Level

Adding the

/*+ dynamic_sampling ( 11 ) */

hint or setting OPTIMIZER_DYNAMIC_SAMPLING in the session enables the optimizer to capture extra stats about your tables when choosing a plan.

This alone may be enough for it to get better estimates for the range predicates. And thus choose a better plan.

Create Indexes

The query uses an index to locate rows for CM_T_PERIODO_DI & CM_T_DXC_HE. But then does further filtering of the rows on the table. It's more efficient to find exactly the rows you want using the index. Then read only those rows from the table.

So I'd try creating indexes on:

CM_T_PERIODO_DI ( TIPOPERIODO, CODPERIODO )
CM_T_DXC_HE ( TROZOFINMOV, IDHOSPITAL, IDDATASALTROZO )

After you've created these you may end up with the same plan "shape". But the optimizer will be able to do the filtering at steps 8 & 10 in the index. So it accesses fewer rows from the tables.

So even if using dynamic stats "fixes" your plan, I'd still try these out too.

Check the Stats

Line 10 estimates the index will locate 109 rows. But then on step 11 it estimates 760 rows.

So somehow it thinks that it'll use the index to find ~100 rows. But then return ~7 times that from the table!

Which is... suspicious. Check your table and index stats are up-to-date.

NB - increasing the dynamic_sampling should already address this in your query.

Upgrade!

From 12c the optimizer can use adaptive plans. These enable the query to choose between nested loops and hash joins, based on the number of rows the first table returns. It's possible this will kick in, enabling the optimizer to full scan CM_T_DXC_HE and hash join it to the other tables.

Of course upgrading is a lot of effort, has other implications, and may not improve your query! But 11g is over a decade old now. So is something you should have on your roadmap to happen "soon" anyway.

Rating

(1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

MERGE JOIN CARTESIAN

A reader, December 02, 2019 - 6:28 am UTC

Could not help notice "MERGE JOIN CARTESIAN" in the execution plan.

Apparently, "CM_T_PERIODO_DI" is not joined to either of CM_T_DXC_HE and/or CM_T_HOSPITAL_DI.

Not sure if it is intentional but thought of highlighting.

Also "CM_T_PERIODO_DI.CODPERIODO  BETWEEN  '2019/01'  AND  '2019/10'" should be using TO_DATE for conversion.

Cheers!

December 05, 2019 - 9:28 am UTC

Could not help notice "MERGE JOIN CARTESIAN" in the execution plan.

The optimizer expects just one row from CM_T_HOSPITAL_DI. And few from CM_T_PERIODO_DI.

Provided these estimates are correct, this is can be a good option. A Cartesian product of 1 row with anything returns the number of rows in the second data set. So this can be the most efficient way to do this.

Also "CM_T_PERIODO_DI.CODPERIODO BETWEEN '2019/01' AND '2019/10'" should be using TO_DATE for conversion.

There are no data type conversions in the predicates. So I suspect CODPERIODO stores string values like '2019/01', '2019/02', etc. Meaning no conversion is needed!

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.

Breadcrumb

Question and Answer

You Asked

and Chris said...

Rating

Comments

MERGE JOIN CARTESIAN

More to Explore

Performance

Date ranges SQL tuning

Breadcrumb

Question and Answer

You Asked

and Chris said...

Rating

Comments

MERGE JOIN CARTESIAN

More to Explore

Performance