Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Luis Gustavo.

Asked: July 14, 2016 - 2:45 pm UTC

Last updated: May 17, 2017 - 3:40 pm UTC

Version: 10.1.2

Viewed 10K+ times! This question is

You Asked

I would like to understand why my sql instruction doesn't use an index;

I have a table that has 6 million rows. And I'm trying to extract some rows and Oracle optmizer are not using the index existent in column.

TABLE ESTRUCTURE:
COLUMN_NAME  DATA_TYPE

ID_PRODUTO_FILIAL NUMBER(18,0)
ID_PRODUTO  NUMBER(10,0)
ID_FILIAL  NUMBER(10,0)
QT_ESTOQUE_MINIMO NUMBER(18,6)
DATA_CADASTRO  DATE
ID_USUARIO_CADASTRO NUMBER(18,0)
DATA_ALTERACAO  DATE
DT_ULTIMA_VENDA  DATE

 SELECT * FROM CAM_PRODUTO_FILIAL
----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |  6.841K|   443M|  1157  (16)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CAM_PRODUTO_FILIAL |  6.841K|   443M|  1157  (16)| 00:00:01 |
----------------------------------------------------------------------------------------

Using a filter by a not indexed column, the number of rows have an expressive reduction but the cost doesn't.

SELECT * FROM CAM_PRODUTO_FILIAL WHERE ID_FILIAL = 1
----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    | 53.373 |  3544K|  1054   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| CAM_PRODUTO_FILIAL | 53.373 |  3544K|  1054   (8)| 00:00:01 |
----------------------------------------------------------------------------------------

So, I decided to create a index ID_FILIAL column.

CREATE INDEX PRODUTOFILIAL_ID_FILIAL_IDX ON CAM_PRODUTO_FILIAL(ID_FILIAL);


Extracting query again...
----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    | 53373 |  3544K|  1054   (8)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| CAM_PRODUTO_FILIAL | 53373 |  3544K|  1054   (8)| 00:00:01 |
----------------------------------------------------------------------------------------

Could anyone tell me why ?

Thanks.

and Chris said...

There are many reasons why this may be the case. But first:

Are you sure using an index is faster than a full tablescan?

53,373 rows is a lot! Assuming this is accurate, there's a good chance that the full tablescan is faster.

https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:9422487749968

To figure this out, you need to look at the execution plans for the indexed query and the full tablescan. Hint these if necessary. To find out how to create an execution plan, see:

https://blogs.oracle.com/sql/entry/how_to_create_an_execution

If the index is better, somethings to check:

- Are the table stats up to date?
- Has anyone changed system settings such as OPTIMIZER_INDEX_COST_ADJ?
- Are there SQL profiles or plan baselines for the query?

If you want further help, please post the execution plans for the full tablescan and indexed queries.

Rating

  (5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

few more to check

Rajeshwaran, Jeyabal, July 15, 2016 - 2:15 am UTC

- Are the table stats up to date?
- Has anyone changed system settings such as OPTIMIZER_INDEX_COST_ADJ?
- Are there SQL profiles or plan baselines for the query?


In addition to the above, few more to check,
-How about the Clustering factor of the index? (is that close to the num_rows or blocks? - aka Organization of data in disk)
-How about the data distribution in this column "ID_FILIAL", do we have some massively skewed data values in it?

Is 53K rows a lot for a 6M rows table?

Leandro Martins de Lima, May 15, 2017 - 4:52 pm UTC

Hello, Chris.

Could you please elaborate why 53,373 rows is a lot for a 6 million rows table? I'm failing to see how <1% of total data volume can be considered a lot in this scenario.
Connor McDonald
May 15, 2017 - 7:23 pm UTC

53,373 rows could each be on a different block.

So that could conceivably be 53,373 individual I/O calls to get each of those blocks.

Your table has some number and date columns - let's say each row is about 50 bytes long. That means around 160 rows per block (lets say 140 with pctfree=10), so with an 8k block size, 7million rows would be ... surprise surprise ... around 400meg.

Now...depending on your init.ora, we can read up to 1meg in a single call to the storage layer. So that would be 400 calls.

Which would *you* prefer - 400 calls or 53,000 calls ?

Leandro Martins de Lima, May 16, 2017 - 1:15 pm UTC

That's an interesting explanation, but I still find hard to see such situation in real life.

Somebody would have to put much effort to have about 53k rows evenly split on 53k differents blocks, like almost a manual job. Although not really impossible, to combine this with the limit per read one would have to manually change it to about 1meg as stated on the doc for the parameter DB_FILE_MULTIBLOCK_READ_COUNT :

"Even though the default value may be a large value, the optimizer will not favor large plans if you do not set this parameter. It would do so only if you explicitly set this parameter to a large value"

I'm not questioning the explanation, in fact I'm just delving into the subject because I'm experiencing a similar situation as the original question: a 18 million rows table that refuses to use an optimal index when returning more than 500K rows.

When hinting the index (for a more than 600k rows predicate) I see the cost higher than a FTS. But it's curious to see it using the index as expected if the query returns less than or about 500k rows.

I just tried copying the table to another schema (with a CTAS) and adding the index; a fresh table to avoid the "one row per block" situation, updated statistics and all. Still getting the same problem, when running the query for more than 600K rows the index is skipped in favor of a FTS.

What else could explain the bad performance for what is, in theory, an optimal index for my query?
Connor McDonald
May 17, 2017 - 3:40 pm UTC

A friend Wolfgang did a great writeup on the basics of how the costing will be done. Have a look at his whitepaper

http://www.centrexcc.com/A%20Look%20under%20the%20Hood%20of%20CBO%20-%20the%2010053%20Event.pdf


To - Leandro Martins de Lima

Rajeshwaran, Jeyabal, May 16, 2017 - 6:31 pm UTC


Somebody would have to put much effort to have about 53k rows evenly split on 53k differents blocks,


think about it, an index on the employee_name column, the values would come randomly over the period of time ( and employee_name is not something monotonically increasing like a sequence values )

the first row to the emp table could have employee_name as ADAM.
the second row could have employee_name as CHRIS
so these two values would be sitting next to each other on the same leaf block.
upon third row insert could have employee_name as BOB, then this value should lie between ADAM and CHRIS in the leaf block, as over the period of time more values come in and lies between the existing values, could lead to leaf block split and the values could easily spread across the different leaf blocks.

that could be the case here.


I'm experiencing a similar situation as the original question: a 18 million rows table that refuses to use an optimal index when returning more than 500K rows.


Perhaps that could be due to Clustering Factor of the index.

demo@ORA12C> create table t1
  2  parallel 4
  3  as
  4  select *
  5  from big_table
  6  order by id ;

Table created.

demo@ORA12C>
demo@ORA12C> create table t2
  2  parallel 4
  3  as
  4  select *
  5  from big_table
  6  order by dbms_random.random;

Table created.

demo@ORA12C> alter table t1 noparallel;

Table altered.

demo@ORA12C> alter table t2 noparallel;

Table altered.

demo@ORA12C>
demo@ORA12C> create index t1_idx on t1(id) parallel 4 nologging;

Index created.

demo@ORA12C> create index t2_idx on t2(id) parallel 4 nologging;

Index created.

demo@ORA12C> alter index t1_idx noparallel;

Index altered.

demo@ORA12C> alter index t2_idx noparallel;

Index altered.

demo@ORA12C> select i.index_name,i.clustering_factor,t.blocks,t.num_rows,i.degree
  2  from user_indexes i,
  3       user_tables t
  4  where i.table_name = t.table_name
  5  and t.table_name in ('T1','T2');

INDEX_NAME CLUSTERING_FACTOR     BLOCKS   NUM_ROWS DEGREE
---------- ----------------- ---------- ---------- -----------------------------------
T1_IDX                187987     188656   10000000 1
T2_IDX               9999962     188640   10000000 1

demo@ORA12C>


Same set of data exists in both T1 and T2 but their data organization is different.

T1_IDX has the clustering factor close to the number of blocks
T2_IDX has the clustering factor close to the number of rows.

When retrieving 1% of rows from both the table, T1 uses index where as T2 uses Full table scan.

demo@ORA12C> set serveroutput off
demo@ORA12C> set feedback only
demo@ORA12C> select /*+ gather_plan_statistics */ * from t1 where id <=100000;

100000 rows selected.

demo@ORA12C> set feedback on
demo@ORA12C> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
SQL_ID  4yatghqu001ph, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from t1 where id <=100000

Plan hash value: 1775246573

--------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |      1 |        |    100K|00:00:01.49 |    3546 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |      1 |    100K|    100K|00:00:01.49 |    3546 |
|*  2 |   INDEX RANGE SCAN                  | T1_IDX |      1 |    100K|    100K|00:00:00.49 |     888 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"<=100000)


19 rows selected.

demo@ORA12C> set feedback only
demo@ORA12C> select /*+ gather_plan_statistics */ * from t2 where id <=100000;

100000 rows selected.

demo@ORA12C> set feedback on
demo@ORA12C> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
SQL_ID  0n41wz4nwc6p1, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from t2 where id <=100000

Plan hash value: 1513984157

---------------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |    100K|00:00:02.46 |     188K|    187K|
|*  1 |  TABLE ACCESS FULL| T2   |      1 |    100K|    100K|00:00:02.46 |     188K|    187K|
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID"<=100000)


18 rows selected.

demo@ORA12C> 

Leandro Martins de Lima, May 16, 2017 - 7:06 pm UTC

Yeah, it's awful to accept defeat but you guys were right, we have a bad case of high clustering factor.

Although I got my answer and found the root of the problem thinking about this situation made me ask if its an evolution of the old days of fragmentation: rebuilding a table when it grows too big. I also found this entry here on AskTOM which states the same technique to solve it:

https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:9524251800346726054#9524263800346897661

So, rebuilding the table in a better order is the only approach? And it isn't a very useful one since this may help some specifics indexes while impacting negatively others.
Connor McDonald
May 17, 2017 - 3:37 pm UTC

One way is - dbms_redefinition lets you reorg a table nice and simply and specify an ordering sequence.

Another option is attribute clustering. Richard Foote has a nice set of posts on this

https://richardfoote.wordpress.com/category/attribute-clustering/

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.