Skip to Main Content
  • Questions
  • Slow query because the cardinality estimate is wrong for joins on foreign keys

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, Valer.

Asked: July 01, 2016 - 11:21 pm UTC

Last updated: November 06, 2020 - 9:16 am UTC

Version: 12c

Viewed 10K+ times! This question is

You Asked

While investigating a very slow query in our OLTP db, I noticed that Oracle severly under estimates the cardinality for joins that are on foreign key. The following script replicates the issue.

create table A (part number not null, rec number not null, d varchar2(50), primary key (part, rec));
create table B (part number not null, rec number not null, t1_rec number not null, d varchar2(50), primary key (part, rec));
alter table B add constraint fkB_A foreign key (part, t1_rec) references A (part, rec);

begin
for i in 1..170 loop
insert into A values (i, mod(i, 17), 'a' || i);
insert into A values (i, mod(i, 17) + 20, 'aa' || i);
insert into A values (i, mod(i, 17) + 40, 'a' || i);
insert into A values (i, mod(i, 17) + 60, 'a' || i);
end loop;
end;
/

begin
for i in 1..17*230 loop
insert into B values (mod(i, 170) + 1, i, mod(mod(i, 170) + 1, 17), 'a' || i);
end loop;
end;
/

commit;

exec dbms_stats.gather_table_stats(null, 'A');
exec dbms_stats.gather_table_stats(null, 'B');

explain plan for select * from A, B where A.part = 5 and A.part = B.part and A.rec = B.t1_rec;


My script creates a parent (A) and child (B) table which are joined through a complex foreign key on 2 columns.
Next I insert some data into both tables, such that the data is balanced. After that I update the table stats.
Finally I run a query that joins A & B on the foreign key columns and filters one of the column by a value.

Oracle comes up with the plan below. The estimate on Ids 3, 4 & 5 is right on, but then, IMO, the cardinality estimate on the HASH JOIN is wrong, because the tables are in parent to child relationship the actual number of rows is equal to the number of rows fetched from the child table. That is 23.

------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 189 | 17 (0)| 00:00:01 |
|* 1 | HASH JOIN | | 7 | 189 | 17 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID BATCHED| A | 4 | 44 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | SYS_C00403559 | 4 | | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID BATCHED| B | 23 | 368 | 14 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | SYS_C00403563 | 23 | | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."PART"="B"."PART" AND "A"."REC"="B"."T1_REC")
3 - access("A"."PART"=5)
5 - access("B"."PART"=5)



Why is this an issue for us? In our OLTP system we have multiple transaction tables and multiple dimension/satellite tables. Often the reporting queries join 2+ transaction tables and a larger number of dimensions. Because of this under-estimate of cardinality, Oracle chooses plans that would do Nested Loop joins in the order: Trx 1 -> dimension 1 -> dimmension 2 -> ... dimension n -> Trx2 -> dimension n+1 -> ...
That's on the assumption that even though it fetches a large number of rows from Trx 1 table, the joins with its dimensions will reduce the rows to a small number (usually the plans show 2 and 1 rows) so then it is OK to do a Nested Loop of that branch with the second Trx table. Of course, the reality is different, the joins with dimensions don't reduce the data at all!

Is there anything we could do at DB layer to improve these estimates?



and Chris said...

The number of rows returned from the child gives an upper bound for the total rows. But in the general case it's not guaranteed that num rows returned from hash join = num rows returned from child table.

Filtering on the parent could result in fewer rows than this!

For example, using your setup:

- For part = 5, there is one row in A where d = 'aa5'
- But there are no matching rows in B for this parent!

select /*+ gather_plan_statistics */* from A, B 
where  A.part = 5 and A.part = B.part and A.rec = B.t1_rec
and    a.d = 'aa5'; 

select * from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));

-------------------------------------------------------------------------------------------------------                                                         
| Id  | Operation                    | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |                                                         
-------------------------------------------------------------------------------------------------------                                                         
|   0 | SELECT STATEMENT             |              |      1 |        |      0 |00:00:00.01 |      19 |                                                         
|*  1 |  HASH JOIN                   |              |      1 |      1 |      0 |00:00:00.01 |      19 |                                                         
|*  2 |   TABLE ACCESS BY INDEX ROWID| A            |      1 |      1 |      1 |00:00:00.01 |       3 |                                                         
|*  3 |    INDEX RANGE SCAN          | SYS_C0024994 |      1 |      4 |      4 |00:00:00.01 |       2 |                                                         
|*  4 |   TABLE ACCESS FULL          | B            |      1 |     23 |     23 |00:00:00.01 |      16 |                                                         
-------------------------------------------------------------------------------------------------------

There's 23 rows returned from B. But these don't match to any rows in A! So the whole query returns nothing. 23 rows for the hash join in this case is an overestimate.

In cases like this you could look at SQL profiles to improve the estimates:

https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:61313086268493
https://docs.oracle.com/database/121/TGSQL/tgsql_profiles.htm

Or SQL Plan Management (baselines) to lock queries to a particular execution plan:

http://www.oracle.com/technetwork/database/bi-datawarehousing/twp-sql-plan-management-11gr2-133099.pdf
https://oracle-base.com/articles/11g/sql-plan-management-11gr1

Rating

  (5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, July 06, 2016 - 7:03 pm UTC

I understand that the estimate should be different if there are extra predicates on the parent table. At the same time your example is an extreme, caused by the fact that my load scripts had a defect, so the data was skewed in table B instead of being balanced.

Let's correct the skewing by running this script for table B instead:
begin
for i in 1..17*230 loop
insert into B values (mod(i, 170) + 1, i, mod(mod(i, 170) + 1, 17), 'a' || i);
insert into B values (mod(i, 170) + 1, i + 4600, mod(mod(i, 170) + 1, 17) + 20, 'a' || i);
insert into B values (mod(i, 170) + 1, i + 2 * 4600, mod(mod(i, 170) + 1, 17) + 40, 'a' || i);
insert into B values (mod(i, 170) + 1, i + 4 * 4600, mod(mod(i, 170) + 1, 17) + 60, 'a' || i);
end loop;
end;
/

Now, it is obvious that the estimate is wrong.

SQL> select * from A, B where A.part = 5 and A.part = B.part and A.rec = B.t1_rec and a.d = 'aa5'; 

23 rows selected.
-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                |     1 |    27 |    20   (0)| 00:00:01 |
|*  1 |  HASH JOIN                           |                |     1 |    27 |    20   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| A              |     1 |    11 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | SYS_C004190833 |     4 |       |     2   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL                  | B              |    92 |  1472 |    17   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Chris Saxon
July 06, 2016 - 11:50 pm UTC

Even with balanced data, the moment you have multiple columns/predicates in play, there are challenges for the optimizer

consider two columns

BIRTHDAY_MONTH
STAR_SIGN

Based on the standard population, the columns are balanced, each with 12 distinct values, and each value containing 100/12 =8.5% of the data.

But the predicate:

where birthday_month = 'January' and star_sign = 'Aquarius'

returns 8% of the data, whereas

where birthday_month = 'March' and star_sign = 'Aquarius'

returns 0% of the data.

So should the optimizer estimate 0 or 8 % ? In this simple example, we can use column grouping statistics (extended statistics) to help out, but you can see the challenge faced when (say) those columns are part of join conditions.

Wrong cardinality estimate

John, November 03, 2020 - 6:25 am UTC

Greetings!

I am seeing huge cardinality difference in Indexes in the execution plan( added below) . Its way off in A-Rows Vs E-Rows. Stats are uptodate on the table and indexes.

1) Also, tried with Extended stats. no difference.
Note that the underlying queries are built using views.

Can you help with what could be the cause of wrong cardinality estimates.


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 14kc8hadyw0t6, child number 1
-------------------------------------
SELECT /*+ gather_plan_statistics */ "SFi"."INSTRUMENT_ID" AS
"SFi_INSTRUMENT_ID", "SFi"."COUPON_RATE" AS
"SFi_COUPON_RATE", "SFi"."MATURITY_DATE" AS
"SFi_MATURITY_DATE", "SFi"."ORG_NAME" AS "SFi_ORG_NAME",
"SFi"."INST_NAME" AS "SFi_INST_NAME", "SFi"."ASSET_TYPE"
AS "SFi_ASSET_TYPE", "SFi"."DEAD_DATE" AS
"SFi_DEAD_DATE", "SFi"."ORGANIZATION_ID" AS
"SFi_ORGANIZATION_ID", "SFi"."BOND_TICKER" AS
"SFi_BOND_TICKER", "iic"."INSTRUMENT_ID" AS "iic_INSTRUMENT_ID",
"iic"."IDENTIFIER" AS "iic_IDENTIFIER", "iic"."IDENTIFIER_TYPE" AS
"iic_IDENTIFIER_TYPE" FROM "BTGSM"."SEARCH_FI" "SFi" LEFT JOIN
"BT"."I_XREF_CUR" "iic" ON "iic"."INSTRUMENT_ID" =
"SFi"."INSTRUMENT_ID" WHERE ("SFi"."BOND_TICKER" =
UPPER(:tkr) AND "SFi"."DEAD_DATE" IS NULL) ORDER BY
"SFi"."INSTRUMENT_ID" ASC, "iic"."IDENTIFIER_TYPE" ASC FETCH NEXT
500 ROWS ONLY

Plan hash value: 2797543829

--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 500 |00:00:20.97 | 12M| | | |
|* 1 | VIEW | | 1 | 500 | 500 |00:00:20.97 | 12M| | | |
|* 2 | WINDOW SORT PUSHED RANK | | 1 | 111K| 500 |00:00:20.97 | 12M| 478K| 478K| 424K (0)|
| 3 | NESTED LOOPS OUTER | | 1 | 111K| 6001K|00:00:17.93 | 12M| | | |
| 4 | NESTED LOOPS OUTER | | 1 | 71912 | 2583K|00:00:05.76 | 1986K| | | |
| 5 | NESTED LOOPS | | 1 | 71912 | 2583K|00:00:02.53 | 438K| | | |
|* 6 | TABLE ACCESS STORAGE FULL | ORG_MASTER | 1 | 623 | 472 |00:00:01.74 | 41973 | 1025K| 1025K| |
|* 7 | TABLE ACCESS BY INDEX ROWID BATCHED| I_MASTER | 472 | 115 | 2583K|00:00:02.16 | 396K| | | |
|* 8 | INDEX RANGE SCAN | I_MASTER_IDX2 | 472 | 135 | 2744K|00:00:00.55 | 7217 | 1025K| 1025K| |
| 9 | TABLE ACCESS BY INDEX ROWID | F_INCOME | 2583K| 1 | 69395 |00:00:02.43 | 1548K| | | |
|* 10 | INDEX UNIQUE SCAN | F_INCOME_PK | 2583K| 1 | 69395 |00:00:01.73 | 1473K| 1025K| 1025K| |
|* 11 | TABLE ACCESS BY INDEX ROWID BATCHED | I_XREF | 2583K| 2 | 5998K|00:00:11.42 | 10M| | | |
|* 12 | INDEX RANGE SCAN | I_XREF_PK | 2583K| 2 | 6077K|00:00:05.58 | 6198K| 1025K| 1025K| |
--------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("from$_subquery$_004"."rowlimit_$$_rownumber"<=500)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "INSTRUMENT_ID","IDENTIFIER_TYPE")<=500)
6 - storage("BOND_TICKER"=UPPER(:TKR))
filter("BOND_TICKER"=UPPER(:TKR))
7 - filter("DEAD_DATE" IS NULL)
8 - access("ORGANIZATION_ID"="ORGANIZATION_ID")
10 - access("INSTRUMENT_ID"="INSTRUMENT_ID")
11 - filter(("RECORDED_DATE_END">=SYS_EXTRACT_UTC(SYSTIMESTAMP(6)) AND "EFFECTIVE_DATE_END">=SYS_EXTRACT_UTC(SYSTIMESTAMP(6))))
12 - access("INSTRUMENT_ID"="INSTRUMENT_ID" AND "RECORDED_DATE_START"<=SYS_EXTRACT_UTC(SYSTIMESTAMP(6)) AND
"EFFECTIVE_DATE_START"<=SYS_EXTRACT_UTC(SYSTIMESTAMP(6)))
filter(("RECORDED_DATE_START"<=SYS_EXTRACT_UTC(SYSTIMESTAMP(6)) AND "EFFECTIVE_DATE_START"<=SYS_EXTRACT_UTC(SYSTIMESTAMP(6))))

Chris Saxon
November 03, 2020 - 5:28 pm UTC

The only estimate that's significantly under is on line 8:

I_MASTER_IDX2 |  472 |   135 | 2744K


472 * 135 ~ 63,000, much lower than the actuals there. The nested loops mis-estimates are a consequence of this.

This suggests there may be a big skew in the values for ORGANIZATION_ID; because you're joining on this column, the optimizer doesn't know what value(s) you're searching for. So histograms have minimal benefit.

A few things to consider:

- Create an index on I_MASTER ( ORGANIZATION_ID, DEAD_DATE ), so the database doesn't have to go to the table to filter out the null dead dates
- Using /*+ dynamic_sampling(4) */ or higher, this may help the optimizer find a better plan
- Create a SQL profile to help the optimizer find a better plan, then lock this in a baseline if it's good.

using Dynamic sampling = 11

Rajeshwaran, Jeyabal, November 04, 2020 - 8:01 am UTC

was reading the initial question - that came from Oracle 12c - we got dynamic sampling level =11 new in 12c and above to fix these kind of join cardinality estimates

Any specific reason this "dynamic sampling" option was not provided/mentioned there in the initial response ?

https://docs.oracle.com/database/121/NEWFT/chapter12101.htm#FEATURENO09674

demo@XEPDB1> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
SQL_ID  37axu62fzkmkj, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from A, B where A.part = 5 and
A.part = B.part and A.rec = B.t1_rec

Plan hash value: 2824785590

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |              |      1 |        |     23 |00:00:00.01 |      19 |
|*  1 |  HASH JOIN                           |              |      1 |      1 |     23 |00:00:00.01 |      19 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| A            |      1 |      4 |      4 |00:00:00.01 |       3 |
|*  3 |    INDEX RANGE SCAN                  | SYS_C0014404 |      1 |      4 |      4 |00:00:00.01 |       2 |
|*  4 |   TABLE ACCESS FULL                  | B            |      1 |     23 |     23 |00:00:00.01 |      16 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."PART"="B"."PART" AND "A"."REC"="B"."T1_REC")
   3 - access("A"."PART"=5)
   4 - filter("B"."PART"=5)


24 rows selected.


with Dynamic sampling =1 the join cardinality estimates are getting fixed.

demo@XEPDB1> select * from table( dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------
SQL_ID  5wjqt128tgdy8, child number 0
-------------------------------------
select /*+ dynamic_sampling(11) gather_plan_statistics */ * from A, B
where A.part = 5 and A.part = B.part and A.rec = B.t1_rec

Plan hash value: 2824785590

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |              |      1 |        |     23 |00:00:00.01 |      19 |
|*  1 |  HASH JOIN                           |              |      1 |     23 |     23 |00:00:00.01 |      19 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| A            |      1 |      4 |      4 |00:00:00.01 |       3 |
|*  3 |    INDEX RANGE SCAN                  | SYS_C0014404 |      1 |      4 |      4 |00:00:00.01 |       2 |
|*  4 |   TABLE ACCESS FULL                  | B            |      1 |     23 |     23 |00:00:00.01 |      16 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."PART"="B"."PART" AND "A"."REC"="B"."T1_REC")
   3 - access("A"."PART"=5)
   4 - filter("B"."PART"=5)

Note
-----
   - dynamic statistics used: dynamic sampling (level=0)


Chris Saxon
November 04, 2020 - 9:24 am UTC

This is a four-year-old question! I can't remember my thought process at the time - I probably just overlooked this

But it's worth remembering that getting accurate row estimates is just a means to an end - getting the "right" plan.

As adjusting the row estimate has no effect on the join (it's still hash join reading the tables in the same order) for this query, so perhaps the better question is "does it matter the row estimate is wrong in this plan?"

Wrong cardinality estimate

John, November 05, 2020 - 6:28 am UTC

Hi Chris,

1) Tested with creating index on I_MASTER(organization_id, dead_date) but did not see much difference in the query performance or query plan.

2) Tested with setting optimizer_dynamic_sampling=11 at session level. This query is built from views.

Here is the plan. It looks like cardinality estimates is much better than before. But the plan is doing full table scan.

Any better way to write this query ?


QL> alter session set optimizer_dynamic_sampling=11 ;

Session altered.

..
500 rows selected.

Elapsed: 00:00:55.43

SQL> select * from TABLE(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'))
2 /

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 14kc8hadyw0t6, child number 1
-------------------------------------
SELECT /*+ gather_plan_statistics */ "SFi"."INSTRUMENT_ID" AS
"SFi_INSTRUMENT_ID", "SFi"."COUPON_RATE" AS
"SFi_COUPON_RATE", "SFi"."MATURITY_DATE" AS
"SFi_MATURITY_DATE", "SFi"."ORG_NAME" AS "SFi_ORG_NAME",
"SFi"."INST_NAME" AS "SFi_INST_NAME", "SFi"."ASSET_TYPE"
AS "SFi_ASSET_TYPE", "SFi"."DEAD_DATE" AS
"SFi_DEAD_DATE", "SFi"."ORGANIZATION_ID" AS
"SFi_ORGANIZATION_ID", "SFi"."BOND_TICKER" AS
"SFi_BOND_TICKER", "iic"."INSTRUMENT_ID" AS "iic_INSTRUMENT_ID",
"iic"."IDENTIFIER" AS "iic_IDENTIFIER", "iic"."IDENTIFIER_TYPE" AS
"iic_IDENTIFIER_TYPE" FROM "BTGSM"."SEARCH_FI" "SFi" LEFT JOIN
"BT"."I_XREF_CUR" "iic" ON "iic"."INSTRUMENT_ID" =
"SFi"."INSTRUMENT_ID" WHERE ("SFi"."BOND_TICKER" =
UPPER(:tkr) AND "SFi"."DEAD_DATE" IS NULL) ORDER BY
"SFi"."INSTRUMENT_ID" ASC, "iic"."IDENTIFIER_TYPE" ASC FETCH NEXT
500 ROWS ONLY


Plan hash value: 3622608554

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 500 |00:00:55.35 | 13M| 2074K| 107K| | | | |
|* 1 | VIEW | | 1 | 500 | 500 |00:00:55.35 | 13M| 2074K| 107K| | | | |
|* 2 | WINDOW SORT PUSHED RANK | | 1 | 66266 | 500 |00:00:55.35 | 13M| 2074K| 107K| 974K| 960K| 865K (0)| |
| 3 | NESTED LOOPS OUTER | | 1 | 66266 | 6078K|00:00:54.71 | 13M| 2074K| 107K| | | | |
|* 4 | HASH JOIN | | 1 | 39450 | 2617K|00:00:40.84 | 2008K| 2074K| 107K| 2326K| 2326K| 1278K (0)| |
|* 5 | TABLE ACCESS STORAGE FULL | ORG_MASTER | 1 | 323 | 475 |00:00:01.90 | 42423 | 0 | 0 | 1025K| 1025K| | |
|* 6 | HASH JOIN RIGHT OUTER | | 1 | 49M| 47M|00:00:33.86 | 1966K| 2074K| 107K| 1143M| 62M| 776M (1)| 878K|
| 7 | TABLE ACCESS STORAGE FULL | F_INCOME | 1 | 20M| 20M|00:00:01.59 | 797K| 797K| 0 | 1025K| 1025K| 10M (0)| |
|* 8 | TABLE ACCESS STORAGE FULL | I_MASTER | 1 | 49M| 47M|00:00:05.95 | 1169K| 1169K| 0 | 1025K| 1025K| 10M (0)| |
|* 9 | TABLE ACCESS BY INDEX ROWID BATCHED| I_XREF | 2617K| 2 | 6076K|00:00:12.77 | 11M| 0 | 0 | | | | |
|* 10 | INDEX RANGE SCAN | I_XREF_PK | 2617K| 2 | 6160K|00:00:06.00 | 6279K| 0 | 0 | 1025K| 1025K| | |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("from$_subquery$_004"."rowlimit_$$_rownumber"<=500)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "INSTRUMENT_ID","IDENTIFIER_TYPE")<=500)
4 - access("ORGANIZATION_ID"="ORGANIZATION_ID")
5 - storage("BOND_TICKER"=UPPER(:TKR))
filter("BOND_TICKER"=UPPER(:TKR))
6 - access("INSTRUMENT_ID"="INSTRUMENT_ID")
8 - storage("DEAD_DATE" IS NULL)
filter("DEAD_DATE" IS NULL)
9 - filter(("RECORDED_DATE_END">=SYS_EXTRACT_UTC(SYSTIMESTAMP(6)) AND "EFFECTIVE_DATE_END">=SYS_EXTRACT_UTC(SYSTIMESTAMP(6))))
10 - access("INSTRUMENT_ID"="INSTRUMENT_ID" AND "RECORDED_DATE_START"<=SYS_EXTRACT_UTC(SYSTIMESTAMP(6)) AND "EFFECTIVE_DATE_START"<=SYS_EXTRACT_UTC(SYSTIMESTAMP(6)))
filter(("RECORDED_DATE_START"<=SYS_EXTRACT_UTC(SYSTIMESTAMP(6)) AND "EFFECTIVE_DATE_START"<=SYS_EXTRACT_UTC(SYSTIMESTAMP(6))))

Note
-----
- dynamic statistics used: dynamic sampling (level=AUTO)


56 rows selected.


BT.I_XREF_CUR view code is as below- Any tuning can be done to this query ?

CREATE OR REPLACE FORCE EDITIONABLE VIEW "BT"."I_XREF_CUR" ("EFFECTIVE_DATE_START", "IDENTIFIER", "IDENTIFIER_TYPE", "INSTRUMENT_ID", "LOADED_BY") AS
SELECT
EFFECTIVE_DATE_START,
IDENTIFIER,
IDENTIFIER_TYPE,
INSTRUMENT_ID,
LOADED_BY
FROM BT.I_XREF
WHERE sys_extract_utc(systimestamp) between RECORDED_DATE_START and RECORDED_DATE_END
AND sys_extract_utc(systimestamp) between EFFECTIVE_DATE_START and EFFECTIVE_DATE_END

Chris Saxon
November 05, 2020 - 9:34 am UTC

The hash join for lines 6-8 returns 47 million rows! No wonder it's taking a while.

It looks to me like SEARCH_FI is a view too - I can't see the ORG_MASTER, F_INCOME, and I_MASTER in those queries.

What's it's definition?

Wrong cardinality estimates

John, November 06, 2020 - 2:23 am UTC

Hi Chris,

SEARCH_FI is a view.
Here is the view definition .

CREATE OR REPLACE FORCE EDITIONABLE VIEW "BT"."SEARCH_FI" ("INSTRUMENT_ID", "COUPON_RATE", "MATURITY_DATE", "ORG_NAME", "INST_NAME", "BOND_TICKER", "ASSET_TYPE", "DEAD_DATE",
"ORGANIZATION_ID") AS
SELECT IM.INSTRUMENT_ID, FI.CURRENT_COUPON_RATE, FI.MATURITY_DATE, OM.PRIMARY_NAME, IM.PRIMARY_NAME, OM.BOND_TICKER, IM.ASSET_TYPE, IM.DEAD_DATE, OM.ORGANIZATION_ID
FROM
BT.I_MASTER_CUR IM
INNER JOIN BT.ORG_MASTER_CUR OM ON (IM.ORGANIZATION_ID = OM.ORGANIZATION_ID)
LEFT OUTER JOIN BT.F_INCOME_CUR FI ON (IM.INSTRUMENT_ID = FI.INSTRUMENT_ID)
;
CREATE OR REPLACE FORCE EDITIONABLE VIEW "BT"."I_MASTER_CUR" ("ASSET_TYPE", "CHILD_ISSUE_IND", "COMMENTS", "COUNTRY_OF_RISK", "CREATION_DATE", "DEAD_DATE", "DELIVERY_DATE", "DTC_ELIGIBILITY",
"EVAL_QUOTATION_BASIS", "HYBRID_TYPE", "IDC_ASSET_TYPE", "INITIAL_AUCTION_DATE", "INSTRUMENT_ID", "INSTRUMENT_SETUP_STATUS", "INSTRUMENT_SETUP_STATUS_DATE", "INSTRUMENT_STATUS",
"INSTRUMENT_STAT
US_REASON", "INSTRUMENT_TYPE", "ISSUE_DATE", "ISSUE_PRICE", "ISSUE_PRICE_FIXED_IND", "ISSUE_PRICE_PERCENT", "ISSUE_PRICE_TYPE", "LOADED_BY", "ORGANIZATION_ID", "OUTSTANDING_AMOUNT",
"OUTSTANDING_IND", "OVERALLOTMENT_IND", "PRIMARY_CURRENCY_CODE", "PRIMARY_EXCHANGE", "PRIMARY_MARKET_CODE", "PRIMARY_NAME", "REGISTRATION_DAYS", "REGISTRATION_TYPE", "RIGHTS_IND", "SECTOR",
"SEC_REGISTRATION_FLAG", "SE
TTLEMENT_DATE", "SOURCE_CODE", "TRADING_RESTRICTIONS_TYPE", "TRADING_STATUS", "UNIT_IND") AS
SELECT
ASSET_TYPE,
CHILD_ISSUE_IND,
COMMENTS,
COUNTRY_OF_RISK,
CREATION_DATE,
DEAD_DATE,
DELIVERY_DATE,
DTC_ELIGIBILITY,
EVAL_QUOTATION_BASIS,
HYBRID_TYPE,
IDC_ASSET_TYPE,
INITIAL_AUCTION_DATE,
INSTRUMENT_ID,
INSTRUMENT_SETUP_STATUS,
INSTRUMENT_SETUP_STATUS_DATE,
INSTRUMENT_STATUS,
INSTRUMENT_STATUS_REASON,
INSTRUMENT_TYPE,
ISSUE_DATE,
ISSUE_PRICE,
ISSUE_PRICE_FIXED_IND,
ISSUE_PRICE_PERCENT,
ISSUE_PRICE_TYPE,
LOADED_BY,
ORGANIZATION_ID,
OUTSTANDING_AMOUNT,
OUTSTANDING_IND,
OVERALLOTMENT_IND,
PRIMARY_CURRENCY_CODE,
PRIMARY_EXCHANGE,
PRIMARY_MARKET_CODE,
PRIMARY_NAME,
REGISTRATION_DAYS,
REGISTRATION_TYPE,
RIGHTS_IND,
SECTOR,
SEC_REGISTRATION_FLAG,
SETTLEMENT_DATE,
SOURCE_CODE,
TRADING_RESTRICTIONS_TYPE,
TRADING_STATUS,
UNIT_IND
FROM IC.I_MASTER
;
CREATE OR REPLACE FORCE EDITIONABLE VIEW "BT"."ORG_MASTER_CUR" ("BOND_TICKER", "INCORPORATED_STATE_CODE", "LOADED_BY", "ORGANIZATION_COUNTRY_CODE", "ORGANIZATION_ID", "ORGANIZATION_STATUS",
"ORGANIZATION_TYPE", "PRIMARY_NAME", "PARENT", "ULT_PARENT", "IS_LEGAL_ENTITY_ROOT") AS
SELECT
BOND_TICKER,
INCORPORATED_STATE_CODE,
LOADED_BY,
ORGANIZATION_COUNTRY_CODE,
ORGANIZATION_ID,
ORGANIZATION_STATUS,
ORGANIZATION_TYPE,
PRIMARY_NAME,
PARENT,
ult_parent,
IS_LEGAL_ENTITY_ROOT
FROM IC.ORG_MASTER
;
CREATE OR REPLACE FORCE EDITIONABLE VIEW "BT"."F_INCOME_CUR" ("ACCRETING_IND", "ACCRETION_BASIS", "ACCRETION_COMPOUNDING_METHOD", "ACCRETION_START_DATE", "ACCRUAL_COMPOUNDING_METHOD",
"ACCRUAL_DATE", "ACCRUAL_DAYS_PRIOR", "ACCRUED_INTEREST_IND", "ACCRUED_ROLL_CONVENTION", "AGENCY_DISC_NOTE_IND", "AMOUNT_OUTSTANDING_RATIO", "AMOUNT_OUTSTANDING_RATIO_DATE", "AVERAGE_LIFE",
"BOND_FORM", "BON
D_INSURER", "BUSINESS_DAY_CONVENTION_CODE", "CALL_IND", "CAPITALIZED_FLAG", "CHILD_IND", "CHILD_ISSUE_INSURANCE_TYPE", "CHILD_ISSUE_INSURED_DATE", "CLASS", "COMPOUND_INTEREST_IND",
"CONTINGENT_INTEREST_IND", "CONVERTIBLE_IND", "COUPON_SUMMARY", "COUPON_TYPE", "COUPON_TYPE_METHOD", "CREDIT_ENHANCEMENT_TYPE", "CREDIT_SENSITIVE_IND", "CREDIT_SENSITIVE_RATE_ADJ_TYPE",
"CURRENT_COUPON_DETERM_DATE", "CU
RRENT_COUPON_NEXT_RESET_DATE", "CURRENT_COUPON_RATE", "CURRENT_COUPON_RESET_DATE", "CURRENT_CPN_NEXT_DETERM_DATE", "DATED_DATE", "DAYCOUNT_BASIS_TYPE", "DEBT_RANK_TYPE", "DEBT_TYPE",
"DEFAULT_IND", "DEFAULT_PERIOD_END_DATE", "DEFAULT_PERIOD_START_DATE", "DEFEASANCE_IND", "DEFERRABLE_IND", "DEFERRED_INTEREST_PERIOD", "DEFERRED_INTEREST_PERIOD_UNITS",
"DEFERRED_INTEREST_TYPE", "DELAYED_FIRST_PAYMENT
_AMOUNT", "DEPOSITORY_TYPE", "DETERM_DATE_OFFSET", "DTC_IND", "EQUITY_LINKED_NOTE_IND", "EUSD_TYPE", "FINAL_COUPON_PAYMENT_DATE", "FIRST_COUPON_DATE", "FIRST_END_ACCRUAL_DATE",
"FIRST_PAYMENT_DATE", "FITCH_DEBT_LEVEL", "GRACE_PERIOD", "GRACE_PERIOD_UNITS", "GREEN_BOND_IND", "GREEN_BOND_THIRD_PARTY", "GROSS_UP_IND", "GUARANTEED_TYPE", "INDEX_LINKED_FIXING_DATE",
"INFLATION_PROTECTED_INDEX", "INFLAT
ION_PROTECTED_INDEX_BASE", "INSTRUMENT_ID", "INTEREST_CALCULATION_METHOD", "INTEREST_PAYMENT_DATE_CODE", "INTEREST_PAYMENT_FREQ", "ISSUE_AMOUNT", "IS_CONVERT_IND", "LAST_ACCRUAL_DATE",
"LAST_COUPON_PERIOD_TYPE", "LAST_DAY_OF_MONTH_PAYMENT_IND", "LOADED_BY", "MATURITY_DATE", "MATURITY_REDEMPTION_PCT", "MATURITY_YEARS", "MOODY_SENIORITY_TYPE", "MORTGAGE_TYPE",
"NEXT_COUPON_CHANGE_DATE", "NEXT_COUPON
_PAYMENT_DATE", "NOMINAL_VALUE", "NO_UNDERWRITER_TYPE", "ODD_FIRST_COUPON_DATE", "OFFERING_TYPE", "OID_IND", "ORIGINAL_MATURITY_DATE", "ORIG_AVERAGE_LIFE_DATE", "ORIG_COUPON_RATE",
"ORIG_COUPON_RATE_IND", "ORIG_INTEREST_PAYMENT_FREQ", "ORIG_PRINCIPAL_AMOUNT", "ORIG_YIELD", "OTHER_ACCRUAL_DATE", "OTHER_TAX_EXEMPT_IND", "OUTSTANDING_AMOUNT", "PAYMENT_MONTHDAY",
"PAYMENT_ROLL_CONVENTION", "PERMANENT_
INTEREST_IND", "PFANDBRIEF_TYPE", "PIK_COUPON_RATE", "PIK_DESC_TYPE", "PIK_END_DATE", "PIK_IND", "PIK_NOTICE", "PIK_NOTICE_UNITS", "PIK_START_DATE", "PRE_REDENOMINATION_AMOUNT",
"PROGRAM_TYPE", "PROSPECTUS_DATE", "PUT_IND", "RANGE_BOND_IND", "RECORD_DATE_FORMULA", "RECORD_DATE_HOLIDAY_RULES", "RECORD_DATE_OFFSET", "RECORD_DATE_RULES", "REMARKETED_COUPON_RATE",
"REMARKETED_IND", "REMARKETED_INTERES
T_ACCRUAL_ST", "SALE_TYPE", "SERIES", "SETTLEMENT_TYPE", "SINK_IND", "SNP_DEBT_TYPE", "SNP_DERIVED_DEBT_TYPE", "SPECIAL_FIRST_PAYMENT", "STATED_FIRST_COUPON_DATE", "STILL_CALLABLE_IND",
"STRIPPABLE_IND", "STRIP_TYPE", "SUBORDINATION_TYPE", "SYNTHETIC_IND", "TAXABLE_IND", "TENDER_EXCHANGE_OFFER_IND", "TIPS_IND", "TRACE_LAST_TRADE_SIZE", "TRACE_LAST_TRADE_TIME", "TRANCHE_IND",
"USTREAS_OTR_TERM", "U
STREAS_OTR_TYPE", "US_TREASURY_TYPE", "VARIABLE_DELAY_RULE", "WARRANTS_IND", "WARRANT_EXPIRATION_DATE") AS
SELECT
ACCRETING_IND,
ACCRETION_BASIS,
ACCRETION_COMPOUNDING_METHOD,
ACCRETION_START_DATE,
ACCRUAL_COMPOUNDING_METHOD,
ACCRUAL_DATE,
ACCRUAL_DAYS_PRIOR,
ACCRUED_INTEREST_IND,
ACCRUED_ROLL_CONVENTION,
AGENCY_DISC_NOTE_IND,
AMOUNT_OUTSTANDING_RATIO,
AMOUNT_OUTSTANDING_RATIO_DATE,
AVERAGE_LIFE,
BOND_FORM,
BOND_INSURER,
BUSINESS_DAY_CONVENTION_CODE,
CALL_IND,
CAPITALIZED_FLAG,
CHILD_IND,
CHILD_ISSUE_INSURANCE_TYPE,
CHILD_ISSUE_INSURED_DATE,
CLASS,
COMPOUND_INTEREST_IND,
CONTINGENT_INTEREST_IND,
CONVERTIBLE_IND,
COUPON_SUMMARY,
COUPON_TYPE,
COUPON_TYPE_METHOD,
CREDIT_ENHANCEMENT_TYPE,
CREDIT_SENSITIVE_IND,
CREDIT_SENSITIVE_RATE_ADJ_TYPE,
CURRENT_COUPON_DETERM_DATE,
CURRENT_COUPON_NEXT_RESET_DATE,
CURRENT_COUPON_RATE,
CURRENT_COUPON_RESET_DATE,
CURRENT_CPN_NEXT_DETERM_DATE,
DATED_DATE,
DAYCOUNT_BASIS_TYPE,
DEBT_RANK_TYPE,
DEBT_TYPE,
DEFAULT_IND,
DEFAULT_PERIOD_END_DATE,
DEFAULT_PERIOD_START_DATE,
DEFEASANCE_IND,
DEFERRABLE_IND,
DEFERRED_INTEREST_PERIOD,
DEFERRED_INTEREST_PERIOD_UNITS,
DEFERRED_INTEREST_TYPE,
DELAYED_FIRST_PAYMENT_AMOUNT,
DEPOSITORY_TYPE,
DETERM_DATE_OFFSET,
DTC_IND,
EQUITY_LINKED_NOTE_IND,
EUSD_TYPE,
FINAL_COUPON_PAYMENT_DATE,
FIRST_COUPON_DATE,
FIRST_END_ACCRUAL_DATE,
FIRST_PAYMENT_DATE,
FITCH_DEBT_LEVEL,
GRACE_PERIOD,
GRACE_PERIOD_UNITS,
GREEN_BOND_IND,
GREEN_BOND_THIRD_PARTY,
GROSS_UP_IND,
GUARANTEED_TYPE,
INDEX_LINKED_FIXING_DATE,
INFLATION_PROTECTED_INDEX,
INFLATION_PROTECTED_INDEX_BASE,
INSTRUMENT_ID,
INTEREST_CALCULATION_METHOD,
INTEREST_PAYMENT_DATE_CODE,
INTEREST_PAYMENT_FREQ,
ISSUE_AMOUNT,
IS_CONVERT_IND,
LAST_ACCRUAL_DATE,
LAST_COUPON_PERIOD_TYPE,
LAST_DAY_OF_MONTH_PAYMENT_IND,
LOADED_BY,
MATURITY_DATE,
MATURITY_REDEMPTION_PCT,
MATURITY_YEARS,
MOODY_SENIORITY_TYPE,
MORTGAGE_TYPE,
NEXT_COUPON_CHANGE_DATE,
NEXT_COUPON_PAYMENT_DATE,
NOMINAL_VALUE,
NO_UNDERWRITER_TYPE,
ODD_FIRST_COUPON_DATE,
OFFERING_TYPE,
OID_IND,
ORIGINAL_MATURITY_DATE,
ORIG_AVERAGE_LIFE_DATE,
ORIG_COUPON_RATE,
ORIG_COUPON_RATE_IND,
ORIG_INTEREST_PAYMENT_FREQ,
ORIG_PRINCIPAL_AMOUNT,
ORIG_YIELD,
OTHER_ACCRUAL_DATE,
OTHER_TAX_EXEMPT_IND,
OUTSTANDING_AMOUNT,
PAYMENT_MONTHDAY,
PAYMENT_ROLL_CONVENTION,
PERMANENT_INTEREST_IND,
PFANDBRIEF_TYPE,
PIK_COUPON_RATE,
PIK_DESC_TYPE,
PIK_END_DATE,
PIK_IND,
PIK_NOTICE,
PIK_NOTICE_UNITS,
PIK_START_DATE,
PRE_REDENOMINATION_AMOUNT,
PROGRAM_TYPE,
PROSPECTUS_DATE,
PUT_IND,
RANGE_BOND_IND,
RECORD_DATE_FORMULA,
RECORD_DATE_HOLIDAY_RULES,
RECORD_DATE_OFFSET,
RECORD_DATE_RULES,
REMARKETED_COUPON_RATE,
REMARKETED_IND,
REMARKETED_INTEREST_ACCRUAL_ST,
SALE_TYPE,
SERIES,
SETTLEMENT_TYPE,
SINK_IND,
SNP_DEBT_TYPE,
SNP_DERIVED_DEBT_TYPE,
SPECIAL_FIRST_PAYMENT,
STATED_FIRST_COUPON_DATE,
STILL_CALLABLE_IND,
STRIPPABLE_IND,
STRIP_TYPE,
SUBORDINATION_TYPE,
SYNTHETIC_IND,
TAXABLE_IND,
TENDER_EXCHANGE_OFFER_IND,
TIPS_IND,
TRACE_LAST_TRADE_SIZE,
TRACE_LAST_TRADE_TIME,
TRANCHE_IND,
USTREAS_OTR_TERM,
USTREAS_OTR_TYPE,
US_TREASURY_TYPE,
VARIABLE_DELAY_RULE,
WARRANTS_IND,
WARRANT_EXPIRATION_DATE
FROM IC.F_INCOME
;
CREATE OR REPLACE FORCE EDITIONABLE VIEW "BT"."I_XREF_CUR" ("EFFECTIVE_DATE_START", "IDENTIFIER", "IDENTIFIER_TYPE", "INSTRUMENT_ID", "LOADED_BY") AS
SELECT
EFFECTIVE_DATE_START,
IDENTIFIER,
IDENTIFIER_TYPE,
INSTRUMENT_ID,
LOADED_BY
FROM BT.I_XREF
WHERE sys_extract_utc(systimestamp) between RECORDED_DATE_START and RECORDED_DATE_END
AND sys_extract_utc(systimestamp) between EFFECTIVE_DATE_START and EFFECTIVE_DATE_END
;
Chris Saxon
November 06, 2020 - 9:16 am UTC

Sadly I think this may be a case of "you're reading lots of data, it's going to take a while"

The only real filtering is bond_ticker = upper (:tkr), everything else is joins. In both plans the query is returning over 6 million rows at the step before the top-N filtering.

I doubt this will make much difference, but it's worth checking that all the outer joins do need to be outer. Switching to inner may help a little.

Can you add any other filters to the query?

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.