Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Rauf.

Asked: July 06, 2018 - 12:07 pm UTC

Last updated: May 10, 2019 - 2:01 pm UTC

Version: 11g 11.2.0.4.0

Viewed 1000+ times

You Asked

Hi Tom,
I have table with 1 755 001 rows and functional-based index as follows:

create table OC_YKB.TAB_TEST3 (guid number(16), pan varchar2(19 byte));

create sequence oc_ykb.sq_test3;

INSERT /*+ APPEND */ INTO OC_YKB.TAB_TEST3 select oc_ykb.sq_test3.NEXTVAL, 536238||lpad(oc_ykb.sq_test3.NEXTVAL,10,0) from dual connect by level <= 1000000;
INSERT /*+ APPEND */ INTO OC_YKB.TAB_TEST3 select oc_ykb.sq_test3.NEXTVAL, 516823||lpad(oc_ykb.sq_test3.NEXTVAL,10,0) from dual connect by level <= 500000;
INSERT /*+ APPEND */ INTO OC_YKB.TAB_TEST3 select oc_ykb.sq_test3.NEXTVAL, 550222||lpad(oc_ykb.sq_test3.NEXTVAL,10,0) from dual connect by level <= 200000;
INSERT /*+ APPEND */ INTO OC_YKB.TAB_TEST3 select oc_ykb.sq_test3.NEXTVAL, 550358||lpad(oc_ykb.sq_test3.NEXTVAL,10,0) from dual connect by level <= 50000;
INSERT /*+ APPEND */ INTO OC_YKB.TAB_TEST3 select oc_ykb.sq_test3.NEXTVAL, 558649||lpad(oc_ykb.sq_test3.NEXTVAL,10,0) from dual connect by level <= 5000;
INSERT /*+ APPEND */ INTO OC_YKB.TAB_TEST3 select oc_ykb.sq_test3.NEXTVAL, 539638||lpad(oc_ykb.sq_test3.NEXTVAL,10,0) from dual connect by level <= 1;

CREATE INDEX OC_YKB.IX_TAB_TEST_FUNC_1 ON OC_YKB.TAB_TEST3
(SUBSTR(pan, 1, 6));

exec DBMS_STATS.GATHER_TABLE_STATS(ownname => 'OC_YKB' , tabname => 'TAB_TEST3', cascade => true);


Index works for this queries:

SELECT SUBSTR(PAN, 1, 6) FROM OC_YKB.TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638';

| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |       |       |     7 (100)|          |
|*  1 |  INDEX RANGE SCAN| IX_TAB_TEST_FUNC_1 |  1449 | 10143 |     7   (0)| 00:00:01 |


SELECT *
  FROM OC_YKB.TAB_TEST3
 WHERE SUBSTR(PAN, 1, 6) = 
  (   SELECT '539638' FROM OC_YKB.TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  );

| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |       |       |    17 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |  1449 | 42021 |    17   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |  1450 |       |     7   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN         | IX_TAB_TEST_FUNC_1 |     2 |    14 |     3   (0)| 00:00:01 |


SELECT *
  FROM OC_YKB.TAB_TEST3
 WHERE SUBSTR(PAN, 1, 6) in 
  (   SELECT SUBSTR(PAN, 1, 6) FROM OC_YKB.TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  );

| Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |       |       |    24 (100)|          |
|*  1 |  HASH JOIN RIGHT SEMI        |                    |  1449 | 52164 |    24   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN           | IX_TAB_TEST_FUNC_1 |  1449 | 10143 |     7   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |  1449 | 42021 |    17   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |  1450 |       |     7   (0)| 00:00:01 |



but does not work for this queries:

SELECT *
  FROM OC_YKB.TAB_TEST3
 WHERE SUBSTR(PAN, 1, 6) = 
  (   SELECT SUBSTR(PAN, 1, 6) FROM OC_YKB.TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  )

| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |       |       |  1914 (100)|          |
|*  1 |  TABLE ACCESS FULL| TAB_TEST3          |   292K|  8283K|  1911   (4)| 00:00:23 |
|*  2 |   INDEX RANGE SCAN| IX_TAB_TEST_FUNC_1 |     2 |    14 |     3   (0)| 00:00:01 |


SELECT *
  FROM OC_YKB.TAB_TEST3
 WHERE SUBSTR(PAN, 1, 6) in 
  (   SELECT '539638' FROM OC_YKB.TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  );

| Id  | Operation          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                    |       |       |  1901 (100)|          |
|*  1 |  FILTER            |                    |       |       |            |          |
|   2 |   TABLE ACCESS FULL| TAB_TEST3          |  1755K|    48M|  1883   (3)| 00:00:23 |
|*  3 |   FILTER           |                    |       |       |            |          |
|*  4 |    INDEX RANGE SCAN| IX_TAB_TEST_FUNC_1 |     2 |    14 |     3   (0)| 00:00:01 |



Can you explain why index does not work in second examples. Thanks

and Connor said...

The index *can* work, but it is a costing decision by the optimizer. Here's the intial run

SQL>
SQL> create table TAB_TEST3 (guid number(16), pan varchar2(19 byte));

Table created.

SQL>
SQL> create sequence sq_test3 cache 1000;

Sequence created.

SQL>
SQL> INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 536238||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 1000000;


1000000 rows created.

SQL> commit;

Commit complete.

SQL> INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 516823||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 500000;

500000 rows created.

SQL> commit;

Commit complete.

SQL> INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 550222||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 200000;

200000 rows created.

SQL> commit;

Commit complete.

SQL> INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 550358||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 50000;

50000 rows created.

SQL> commit;

Commit complete.

SQL> INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 558649||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 5000;

5000 rows created.

SQL> commit;

Commit complete.

SQL> INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 539638||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 1;

1 row created.

SQL> commit;

Commit complete.

SQL>
SQL> CREATE INDEX IX_TAB_TEST_FUNC_1 ON TAB_TEST3
  2  (SUBSTR(pan, 1, 6));

Index created.

SQL> exec DBMS_STATS.delete_TABLE_STATS(ownname => '' , tabname => 'TAB_TEST3');

PL/SQL procedure successfully completed.

SQL>
SQL> set autotrace traceonly explain
SQL> SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638';

Execution Plan
----------------------------------------------------------
Plan hash value: 1490009622

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |   292K|  1999K|   738   (1)| 00:00:09 |
|*  1 |  INDEX RANGE SCAN| IX_TAB_TEST_FUNC_1 |   292K|  1999K|   738   (1)| 00:00:09 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) =
  4    (   SELECT '539638' FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 2195403411

--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |   292K|  8283K|  1849   (1)| 00:00:23 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |   292K|  8283K|  1849   (1)| 00:00:23 |
|*  2 |   INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |   292K|       |   738   (1)| 00:00:09 |
|*  3 |    INDEX RANGE SCAN         | IX_TAB_TEST_FUNC_1 |     3 |    21 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("PAN",1,6)= (SELECT '539638' FROM "TAB_TEST3" "TAB_TEST3" WHERE
              SUBSTR("PAN",1,6)='539638'))
   3 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) in
  4    (   SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 16876070

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name               | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |   292K|    10M|       |  3421   (1)| 00:00:42 |
|*  1 |  HASH JOIN RIGHT SEMI        |                    |   292K|    10M|  5432K|  3421   (1)| 00:00:42 |
|*  2 |   INDEX RANGE SCAN           | IX_TAB_TEST_FUNC_1 |   292K|  1999K|       |   738   (1)| 00:00:09 |
|   3 |   TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |   292K|  8283K|       |  1849   (1)| 00:00:23 |
|*  4 |    INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |   292K|       |       |   738   (1)| 00:00:09 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(SUBSTR("PAN",1,6)=SUBSTR("PAN",1,6))
   2 - access(SUBSTR("PAN",1,6)='539638')
   4 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) =
  4    (   SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 2195403411

--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |   292K|  8283K|  1849   (1)| 00:00:23 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |   292K|  8283K|  1849   (1)| 00:00:23 |
|*  2 |   INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |   292K|       |   738   (1)| 00:00:09 |
|*  3 |    INDEX RANGE SCAN         | IX_TAB_TEST_FUNC_1 |     3 |    21 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("PAN",1,6)= (SELECT SUBSTR("PAN",1,6) FROM "TAB_TEST3" "TAB_TEST3"
              WHERE SUBSTR("PAN",1,6)='539638'))
   3 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) in
  4    (   SELECT '539638' FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 772303409

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                    |    85G|  2310G|  1865   (1)| 00:00:23 |
|*  1 |  FILTER            |                    |       |       |            |          |
|   2 |   TABLE ACCESS FULL| TAB_TEST3          |  1755K|    48M|  1847   (1)| 00:00:23 |
|*  3 |   FILTER           |                    |       |       |            |          |
|*  4 |    INDEX RANGE SCAN| IX_TAB_TEST_FUNC_1 |     3 |    21 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT 0 FROM "TAB_TEST3" "TAB_TEST3" WHERE :B1='539638'
              AND SUBSTR("PAN",1,6)='539638'))
   3 - filter(:B1='539638')
   4 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> set autotrace off



And then I repeated the exercise with a histogram because we have skew in the data


SQL>
SQL> exec DBMS_STATS.GATHER_TABLE_STATS(ownname => '' , tabname => 'TAB_TEST3', method_opt=>'for all columns size auto');

PL/SQL procedure successfully completed.

SQL>
SQL> set autotrace traceonly explain
SQL> SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638';

Execution Plan
----------------------------------------------------------
Plan hash value: 1490009622

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |  1886 | 13202 |     7   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IX_TAB_TEST_FUNC_1 |  1886 | 13202 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) =
  4    (   SELECT '539638' FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 2195403411

--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |  1886 | 54694 |    15   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |  1886 | 54694 |    15   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |  1886 |       |     7   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN         | IX_TAB_TEST_FUNC_1 |     2 |    14 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("PAN",1,6)= (SELECT '539638' FROM "TAB_TEST3" "TAB_TEST3" WHERE
              SUBSTR("PAN",1,6)='539638'))
   3 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) in
  4    (   SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 16876070

---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |  1886 | 67896 |    22   (0)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT SEMI        |                    |  1886 | 67896 |    22   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN           | IX_TAB_TEST_FUNC_1 |  1886 | 13202 |     7   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |  1886 | 54694 |    15   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |  1886 |       |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(SUBSTR("PAN",1,6)=SUBSTR("PAN",1,6))
   2 - access(SUBSTR("PAN",1,6)='539638')
   4 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) =
  4    (   SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 2195403411

--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |   292K|  8283K|  1849   (1)| 00:00:23 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |   292K|  8283K|  1849   (1)| 00:00:23 |
|*  2 |   INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |   292K|       |   738   (1)| 00:00:09 |
|*  3 |    INDEX RANGE SCAN         | IX_TAB_TEST_FUNC_1 |     2 |    14 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("PAN",1,6)= (SELECT SUBSTR("PAN",1,6) FROM "TAB_TEST3" "TAB_TEST3"
              WHERE SUBSTR("PAN",1,6)='539638'))
   3 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) in
  4    (   SELECT '539638' FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 772303409

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                    |   551M|    14G|  1865   (1)| 00:00:23 |
|*  1 |  FILTER            |                    |       |       |            |          |
|   2 |   TABLE ACCESS FULL| TAB_TEST3          |  1755K|    48M|  1847   (1)| 00:00:23 |
|*  3 |   FILTER           |                    |       |       |            |          |
|*  4 |    INDEX RANGE SCAN| IX_TAB_TEST_FUNC_1 |     2 |    14 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT 0 FROM "TAB_TEST3" "TAB_TEST3" WHERE :B1='539638'
              AND SUBSTR("PAN",1,6)='539638'))
   3 - filter(:B1='539638')
   4 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> set autotrace off



So all bar the last one could potentially use the index. The last is simple a costing decision - if I force the index you can see the cost came out similar but slightly higher than the full scan.


SQL>
SQL> set autotrace traceonly explain
SQL> SELECT /*+ index(TAB_TEST3) */ *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) =
  4    (   SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 2195403411

--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |   292K|  8283K|  1849   (1)| 00:00:23 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |   292K|  8283K|  1849   (1)| 00:00:23 |
|*  2 |   INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |   292K|       |   738   (1)| 00:00:09 |
|*  3 |    INDEX RANGE SCAN         | IX_TAB_TEST_FUNC_1 |     2 |    14 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("PAN",1,6)= (SELECT SUBSTR("PAN",1,6) FROM "TAB_TEST3" "TAB_TEST3"
              WHERE SUBSTR("PAN",1,6)='539638'))
   3 - access(SUBSTR("PAN",1,6)='539638')

SQL> set autotrace off
SQL>
SQL>



Rating

  (10 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

plan with low cost

Rajeshwaran, Jeyabal, July 30, 2018 - 2:27 pm UTC

....
So all bar the last one could potentially use the index. The last is simple a costing decision - if I force the index you can see the cost came out similar but slightly higher than the full scan.
....


But what we see is the other way around, index cost is 1849, where as the full scan cost is 1865.
since the cost of using index is low, why don't the optimizer pick the plan with the lowest cost here? please advice.
Connor McDonald
August 20, 2018 - 5:58 am UTC

My mistake - I edited them, so they are in fact *different* queries, and you can't compare costs between 2 different queries

SQL> set autotrace traceonly explain
SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) =
  4    (   SELECT '539638' FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 2195403411

--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |   292K|  8283K|  1849   (1)| 00:00:23 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |   292K|  8283K|  1849   (1)| 00:00:23 |
|*  2 |   INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |   292K|       |   738   (1)| 00:00:09 |
|*  3 |    INDEX RANGE SCAN         | IX_TAB_TEST_FUNC_1 |     3 |    21 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("PAN",1,6)= (SELECT '539638' FROM "TAB_TEST3" "TAB_TEST3" WHERE
              SUBSTR("PAN",1,6)='539638'))
   3 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT /*+ index(TAB_TEST3) */ *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) in
  4    (   SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 16876070

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name               | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |   292K|    10M|       |  3421   (1)| 00:00:42 |
|*  1 |  HASH JOIN RIGHT SEMI        |                    |   292K|    10M|  5432K|  3421   (1)| 00:00:42 |
|*  2 |   INDEX RANGE SCAN           | IX_TAB_TEST_FUNC_1 |   292K|  1999K|       |   738   (1)| 00:00:09 |
|   3 |   TABLE ACCESS BY INDEX ROWID| TAB_TEST3          |   292K|  8283K|       |  1849   (1)| 00:00:23 |
|*  4 |    INDEX RANGE SCAN          | IX_TAB_TEST_FUNC_1 |   292K|       |       |   738   (1)| 00:00:09 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(SUBSTR("PAN",1,6)=SUBSTR("PAN",1,6))
   2 - access(SUBSTR("PAN",1,6)='539638')
   4 - access(SUBSTR("PAN",1,6)='539638')



plan with low cost

Rajeshwaran, Jeyabal, August 13, 2018 - 1:26 am UTC

Team,

are we looking for more information/something missing in the above followup to answer?

question persists :)

Rauf Huseynzade, August 16, 2018 - 5:01 pm UTC

Hi, thanks for the answer, but question persists. It is a core question - why in ( my opinion ) similar queries index costs just 15-22 in one query and 1800-1900 in similar query :)

question persists :)

Rauf Huseynzade, August 16, 2018 - 5:10 pm UTC

Thanks for the answer, but core question persists, why index helps in one query ( makes cost 22 ) and doesn't help in very similar query ( and cost remains near 1850 )
Connor McDonald
August 20, 2018 - 6:01 am UTC

That is a limitation on the optimizer. We have to make estimate about how many rows will come back from each step in the operation. If the estimate is out (and by definition, there will always be that potential because we don't hold statistics on every single permutation of query/predicate/data distribution), then that can easily magnify into subsequent steps.


initial question remains open

Rauf, August 27, 2018 - 9:19 am UTC

Thank you for the answer, but initial question is still open, why index helps in one query and not help in very similar one ? :)

initial question remains open

Rauf, August 27, 2018 - 9:21 am UTC

Thank you for the answer, but initial question is still open, why index helps in one query and not help in very similar one ? :)
Connor McDonald
August 28, 2018 - 1:38 am UTC

As I said, it is a "costing decision by the optimizer"

The optimizer cannot get the perfect answer *every* time, because would need statistics on every single permutation of query. You can see the costs were very close (1849/1865) so it was only "just" wrong...but wrong nonetheless.

intial question still remains

Rauf Huseynzade, August 29, 2018 - 10:45 am UTC

Thanks for the answer, but initial question still remains. Why index helps in one query and does not help in similar and even more simple query. Thanks
Connor McDonald
September 04, 2018 - 5:08 am UTC

The initial answer remains :-) The optimizer got it wrong. Every release of Oracle the optimizer evolves. For example, in 12.2 we opted for the index in all cases

SQL> create table TAB_TEST3 (guid number(16), pan varchar2(19 byte));

Table created.

SQL>
SQL> create sequence sq_test3 cache 1000;

Sequence created.

SQL>
SQL> begin
  2  INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 536238||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 1000000; commit;
  3  INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 516823||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 500000;commit;
  4  INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 550222||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 200000;commit;
  5  INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 550358||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 50000;commit;
  6  INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 558649||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 5000;commit;
  7  INSERT /*+ APPEND */ INTO TAB_TEST3 select sq_test3.NEXTVAL, 539638||lpad(sq_test3.NEXTVAL,10,0) from dual connect by level <= 1;commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

SQL>
SQL> CREATE INDEX IX_TAB_TEST_FUNC_1 ON TAB_TEST3
  2  (SUBSTR(pan, 1, 6));

Index created.

SQL>
SQL> exec DBMS_STATS.GATHER_TABLE_STATS(ownname => '' , tabname => 'TAB_TEST3', cascade => true, method_opt=> 'for all columns size 254');

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly explain
SQL>
SQL> SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638';

Execution Plan
----------------------------------------------------------
Plan hash value: 1490009622

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |     1 |     7 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| IX_TAB_TEST_FUNC_1 |     1 |     7 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) =
  4    (   SELECT '539638' FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 1092833001

----------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                    |     1 |    29 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TAB_TEST3          |     1 |    29 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | IX_TAB_TEST_FUNC_1 |     1 |       |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                 | IX_TAB_TEST_FUNC_1 |     1 |     7 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("PAN",1,6)= (SELECT '539638' FROM "TAB_TEST3" "TAB_TEST3" WHERE
              SUBSTR("PAN",1,6)='539638'))
   3 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) in
  4    (   SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 3797178570

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                    |     1 |    36 |     6   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI                   |                    |     1 |    36 |     6   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| TAB_TEST3          |     1 |    29 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | IX_TAB_TEST_FUNC_1 |     1 |       |     3   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN                   | IX_TAB_TEST_FUNC_1 |     1 |     7 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(SUBSTR("PAN",1,6)='539638')
   4 - access(SUBSTR("PAN",1,6)='539638')
       filter(SUBSTR("PAN",1,6)=SUBSTR("PAN",1,6))

SQL>
SQL> SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) in
  4    (   SELECT '539638' FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 2001827591

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                    |     1 |    24 |     7  (15)| 00:00:01 |
|   1 |  VIEW                                   | VM_NWVW_2          |     1 |    24 |     7  (15)| 00:00:01 |
|   2 |   HASH UNIQUE                           |                    |     1 |    36 |     7  (15)| 00:00:01 |
|   3 |    MERGE JOIN CARTESIAN                 |                    |     1 |    36 |     6   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                    | IX_TAB_TEST_FUNC_1 |     1 |     7 |     3   (0)| 00:00:01 |
|   5 |     BUFFER SORT                         |                    |     1 |    29 |     3   (0)| 00:00:01 |
|   6 |      TABLE ACCESS BY INDEX ROWID BATCHED| TAB_TEST3          |     1 |    29 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN                  | IX_TAB_TEST_FUNC_1 |     1 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access(SUBSTR("PAN",1,6)='539638')
   7 - access(SUBSTR("PAN",1,6)='539638')

SQL>
SQL>   SELECT *
  2    FROM TAB_TEST3
  3   WHERE SUBSTR(PAN, 1, 6) =
  4    (   SELECT SUBSTR(PAN, 1, 6) FROM TAB_TEST3 WHERE SUBSTR(PAN, 1, 6) = '539638'
  5    );

Execution Plan
----------------------------------------------------------
Plan hash value: 1092833001

----------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                    |   292K|  8283K|  1852   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TAB_TEST3          |   292K|  8283K|  1852   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | IX_TAB_TEST_FUNC_1 |   292K|       |   739   (1)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                 | IX_TAB_TEST_FUNC_1 |     1 |     7 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SUBSTR("PAN",1,6)= (SELECT SUBSTR("PAN",1,6) FROM "TAB_TEST3" "TAB_TEST3" WHERE
              SUBSTR("PAN",1,6)='539638'))
   3 - access(SUBSTR("PAN",1,6)='539638')

SQL>


Sorry for duplicate reviews :)

Rauf Huseynzade, September 04, 2018 - 7:32 am UTC

Sorry for duplicate reviews, i was not able to post them , so tried again from time to time and they appeared all together :) thanks for answer, I will try to investigate more details in trace stack

solution to the issue

Rauf Huseynzade, May 07, 2019 - 1:26 pm UTC

Hi, looks like we found correct explanation for main question - "why cost is so big when index is used?".

So, we have about 2 millions of rows, but just 6 distinct values. As index is a balanced tree it makes just a few leafs (maybe just 6) and every leaf has lets say 2 000 000 / 6 = nearly 330 000 rows (just rough example). So, it does not matter that we have only 1 row with value '539638', Oracle will quickly locate leaf where this value is located, but will go through all rows in this final leaf (that is 330 000 in our rough example, cardinality) to find desired row. That is why cardinality is so big and hence cost is so high.

We checked vice versa example, created another table with 2 millions of rows and all 2 millions of rows have distinct values. So, as we understand, index created on that field has about 2 millions of leaves, and that is why while running the same query - cardinality has value 1 and cost is also very low.

In conclusion, the more is rate of distinct values in column compared to number of rows - the better profit of using B-tree index.
Chris Saxon
May 10, 2019 - 10:11 am UTC

That's not quite right...

Function-based indexes are also balanced B-tree indexes. Just like regular, non-function-based ones.

Also the index has a leaf entry for every non-null row it covers. So if the table has 2 million rows, there will be 2 million leaf entries. Regardless of how many distinct values there are.

But yes, the more distinct values there are in the columns you index, the more effective it's likely to be.

Comment on follow up

Rauf Huseynzade, May 10, 2019 - 12:20 pm UTC

Regarding this:
"If the table has 2 million rows, there will be 2 million leaf entries. Regardless of how many distinct values there are."

I think, if all 2 million values are the same value, then it will be only one index chunk(leaf) with all these values in it. That is why cardinality for index of course will be equal to 2 millions, as full table scan
Chris Saxon
May 10, 2019 - 2:01 pm UTC

No. Every indexed row has its own entry in the index. Even if it's the same value as all the other rows.

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.