Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Brigitte .

Asked: March 17, 2002 - 3:47 pm UTC

Last updated: October 17, 2012 - 3:12 pm UTC

Version: 8.1.7

Viewed 10K+ times! This question is

You Asked

Hi Tom

I have 4 questions of performance tuning for Oracle 8.1.7

1. I read DISTINCT implies a internal sort. I have a query like

select distinct attribut1 from tab1 order by attribut1;

Can I ommit the order by clause?


2. I have a concatenated index of 3 columns. In the select statements the columns were always queried together. The access to 2 columns is with "=" and to the 3. column with "in". Where should the column in the index positioned which is queried with "in" , first or last?


3. Can I use the INDEX_COMBINE hint in a query of more then 1 table? I want to change the full table scan of a growing table to 1/3 with a bitmap index (there are only 3 different values in the column). I hadn't success with it. I had only success in a query with 1 table.


4. How can I alter tables to "organisation index" in a productive database?
Which problems can occur?

Thanks

Brigitte


and Tom said...

1) only if you want the data to come out in any order we like. If you want data ordered, order it. If you don't care about the order, don't order it.

The ONLY way to assure ordered data is to order by it. Don't rely on the data coming out sorted under any other circumstances.

2) won't much matter. A query like:

select * from t where a = 5 and b = 6 and c in ( 1, 2 );

will be processed much like:

select * from t where a = 5 and b = 6 and c = 1
union all
select * from t where a = 5 and b = 6 and c = 2
/

So, I might choose to put the LEAST selective columns first and use index key compression (eg: if you:

select count(distinct a)/count(*), count(distinct b)/ count(*), count(distinct c)/count(*)
fromt t;

and find a's ratio to be the smallest, c's ratio to be the next smallest and b's ratio the largest -- create the index on "t(a,c,b) compress 2"



3) well, typically people would use PARTITIONING for this. Using a bitmap index to do this is the wrong approach (i pray you are not in an oltp system, I've seen people try this "approach" in those systems only to have it fall over dead).

This is a partitioning thing, not a bitmap index solution.

but yes, index combine works with more then one table, if the combined indexes can be used to resolve all of the columns and the optimizer agrees it makes sense (remember, a hint is a HINT -- not an order. It is a suggestion, not a directive)


ops$tkyte@ORA817DEV.US.ORACLE.COM> create table emp as select * from scott.emp;
Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table dept as select * from scott.dept;
Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create bitmap index emp_bmi_1 on emp(job);
Index created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create bitmap index emp_bmi_2 on emp(deptno);
Index created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create index dept_idx on dept(deptno);
Index created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace on
ops$tkyte@ORA817DEV.US.ORACLE.COM> select /*+ index_combine( emp emp_bmi_1 emp_bmi_2 ) */
2 emp.job, emp.deptno, dept.loc
3 from emp, dept
4 where emp.deptno = dept.deptno
5 and emp.job = 'CLERK'
6 and emp.deptno = 10
7 /

JOB DEPTNO LOC
--------- ---------- -------------
CLERK 10 NEW YORK


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=40)
1 0 MERGE JOIN (CARTESIAN) (Cost=4 Card=1 Bytes=40)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (Cost=1 Card=1 Bytes=21)
3 2 INDEX (RANGE SCAN) OF 'DEPT_IDX' (NON-UNIQUE) (Cost=1 Card=1)
4 1 SORT (JOIN) (Cost=3 Card=1 Bytes=19)
5 4 BITMAP CONVERSION (TO ROWIDS)
6 5 BITMAP AND
7 6 BITMAP INDEX (SINGLE VALUE) OF 'EMP_BMI_1'
8 6 BITMAP INDEX (SINGLE VALUE) OF 'EMP_BMI_2'

4) do you mean "production" or "productive" -- probably production.

In 8i, you'll need some downtime, as you have to rebuild the object. In 9i, you can do this online with dbms_redefinition. However, I'm not really 100% sure what you are after here.

If you are interested in the structure and uses of IOTs (index organized tables) I have a pretty extensive chapter on them and all table types in my book.

By all means BENCHMARK you use of them in a test environment prior to doing anything. (that advice is also all over the book)...



Rating

  (64 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Can I do this

Pawan, March 26, 2002 - 9:19 am UTC

Tom,
I have a table which has primary key (sequence number). Can I alter this table to be INDEX ORGANIZED.This is in a Test environment. Basically can I create an IOT with " Create table AS Select " clause.

Thanks

Tom Kyte
March 26, 2002 - 9:37 am UTC

yes, you can.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table dept as select * from scott.dept;

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> alter table dept add constraint dept_pk primary key(deptno);

Table altered.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table dept_iot (deptno primary key, dname, loc )
  2  organization index
  3  as
  4  select * from dept;

Table created.
 

Thanks

pawan, March 26, 2002 - 9:56 am UTC


What is happening?

Pawan, March 26, 2002 - 10:54 am UTC

Tom,
I was just trying to see the improvement that I would get by running the same query on two tables - one index organized and the other not. I did the following and then "SET TIMING ON" - The results are baffling me. There no Indexes created on either table expect for the primary key.I donot have any manual or yr book with me presently (have it at home) and hence and am not able to go deep.Can u PLEASE help me decipher the results and point what I am doing wrong. Also why one trace shows the Cost and other does not.
Thanks in advance
--------------
ADHOC@SF028I> create table indorg_iot (SEQ_ID Primary key,
2 OWNER,
3 OBJECT_NAME,
4 SUBOBJECT_NAME,
5 OBJECT_ID,
6 DATA_OBJECT_ID,
7 OBJECT_TYPE,
8 CREATED,
9 LAST_DDL_TIME,
10 TIMESTAMP,
11 STATUS,
12 TEMPORARY,
13 GENERATED,
14 SECONDARY)
15 organization index
16 as select * from indorg;

Table created.


ADHOC@SF028I> @count
Enter table name:indorg_iot
old 1: select count(1) from &TABLE
new 1: select count(1) from indorg_iot

COUNT(1)
---------
1077312

ADHOC@SF028I> @count
Enter table name:not_indorg
old 1: select count(1) from &TABLE
new 1: select count(1) from not_indorg

COUNT(1)
---------
1077312

ADHOC@SF028I> ADHOC@SF028I> /

TABLE_NAME INDEX_NAME
-------------------- ------------------------------
INDORG_IOT SYS_IOT_TOP_30637

ADHOC@SF028I> set timing on
ADHOC@SF028I> select count(*) from not_indorg
2 where status='VALID';

COUNT(*)
---------
1069952

Elapsed: 00:00:13.01
ADHOC@SF028I> select count(*) from indorg_iot
2 where status='VALID';

COUNT(*)
---------
1069952

Elapsed: 00:00:27.26


ADHOC@SF028I> set autotrace traceonly explain
ADHOC@SF028I> select count(*) from not_indorg
2 where status='VALID';
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'NOT_INDORG'



ADHOC@SF028I> select count(*) from indorg_iot
2 where status='VALID';
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=5)
1 0 SORT (AGGREGATE)
2 1 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_30637' (UNIQUE) (
Cost=4 Card=1 Bytes=5


Harish harbham, March 26, 2002 - 11:23 am UTC

Can you please clarify why the least selective column should be used first ? Is this something related to index compression ?

Tom Kyte
March 26, 2002 - 1:41 pm UTC

I did not say "should be" rather I said:

So, I might choose to put the LEAST selective columns first and use index key
compression (eg: if you:



so yes -- it was in regards to key compression. There is nothing to the story that most selective items should be first in the index. they items in the index should be ordered by HOW they are used, not their selectivity.

Which Order of Colums is Better?

Jan, September 12, 2002 - 9:21 pm UTC

I have a look up IOT table with compress=2 with fallowing columns:

CATEGORY VARCHAR2(10) pk
CODE VARCHAR2(10) pk
LANG VARCHAR2(3) pk
CODE_DESC VARCHAR2(100)

In our Environment, we will have only one language. What is better, if the order of PK is:

1) LANG,CATEGORY,CODE or
2) CATEGORY,LANG,CODE ?

Solution 1) is more compressible, but probably Index Scanning does more jobs here because all records has the same value in LANG column. Is it true?


Another question, if I have IOT with e.g. 5 columns and only 2 of them are PK (col1,col2), how are the rows ordered in IOT?

It is by col1,col2,col3,col4,col5? It means that is the order of columns in CREATE TABLE statment important?
ORDER BY

Thanks,

joined table

Chris, October 28, 2002 - 12:31 am UTC

Dear Tom,

I use TKPROF to analyze the query below. But I get the slower query result when I index the nmfile. Why was that.
In other SQL statement, by index nmfile I got faster query.
#### 1. without index on nmfile

SELECT a.*
FROM t_cdr200 a, t_profile_anumber b, t_profile_yn c
WHERE
(SUBSTR(a.bnumber,1,1) != '0' AND a.anumber = b.anumber
AND a.nmfile = c.batchNo AND c.batchID >= 1 AND c.batchID <= 1
AND a.durasi > (b.dcrphlocal+(0.01 * b.dcrphlocal)))
AND c.batchNo not in
(select d.batchNo from t_fraudullent_yn d, t_profile_yn e where d.batchNo = e.
batchNo and d.fraudcodeid = 1)
order by a.nmfile, a.anumber

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 13842 14541 15 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 13842 14541 15 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 37 (DBA1)

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT ORDER BY
0 FILTER
1 MERGE JOIN
2 SORT JOIN
1 NESTED LOOPS
2 TABLE ACCESS FULL T_PROFILE_YN
1 TABLE ACCESS BY INDEX ROWID T_CDR200
201 INDEX RANGE SCAN (object id 9985)
1 FILTER
1 SORT JOIN
633772 TABLE ACCESS FULL T_PROFILE_ANUMBER
0 NESTED LOOPS
0 TABLE ACCESS FULL T_PROFILE_YN
0 INDEX UNIQUE SCAN (object id 9654)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (ORDER BY)
0 FILTER
1 MERGE JOIN
2 SORT (JOIN)
1 NESTED LOOPS
2 TABLE ACCESS (FULL) OF 'T_PROFILE_YN'
1 TABLE ACCESS (BY INDEX ROWID) OF 'T_CDR200'
201 INDEX (RANGE SCAN) OF 'NMFILE_NDX' (NON-UNIQUE)
1 FILTER
1 SORT (JOIN)
633772 TABLE ACCESS (FULL) OF 'T_PROFILE_ANUMBER'
0 NESTED LOOPS
0 TABLE ACCESS (FULL) OF 'T_PROFILE_YN'
0 INDEX (UNIQUE SCAN) OF 'PK_FRAUDULLENT_YN' (UNIQUE)


#### 2. with index on nmfile

SELECT a.*
FROM t_cdr200 a, t_profile_anumber b, t_profile_yn c
WHERE
(SUBSTR(a.bnumber,1,1) != '0' AND a.anumber = b.anumber
AND a.nmfile = c.batchNo AND c.batchID >= 1 AND c.batchID <= 1
AND a.durasi > (b.dcrphlocal+(0.01 * b.dcrphlocal)))
AND c.batchNo not in
(select d.batchNo from t_fraudullent_yn d, t_profile_yn e where d.batchNo = e.
batchNo and d.fraudcodeid = 1) order by a.nmfile, a.anumber

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 0 0 0 0
Execute 4 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 27684 29082 30 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 8 0.00 0.00 27684 29082 30 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 37 (DBA1)

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT ORDER BY
0 FILTER
1 MERGE JOIN
2 SORT JOIN
1 NESTED LOOPS
2 TABLE ACCESS FULL T_PROFILE_YN
1 TABLE ACCESS BY INDEX ROWID T_CDR200
201 INDEX RANGE SCAN (object id 9985)
1 FILTER
1 SORT JOIN
633772 TABLE ACCESS FULL T_PROFILE_ANUMBER
0 NESTED LOOPS
0 TABLE ACCESS FULL T_PROFILE_YN
0 INDEX UNIQUE SCAN (object id 9654)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (ORDER BY)
0 FILTER
1 MERGE JOIN
2 SORT (JOIN)
1 NESTED LOOPS
2 TABLE ACCESS (FULL) OF 'T_PROFILE_YN'
1 TABLE ACCESS (BY INDEX ROWID) OF 'T_CDR200'
201 INDEX (RANGE SCAN) OF 'NMFILE_NDX' (NON-UNIQUE)
1 FILTER
1 SORT (JOIN)
633772 TABLE ACCESS (FULL) OF 'T_PROFILE_ANUMBER'
0 NESTED LOOPS
0 TABLE ACCESS (FULL) OF 'T_PROFILE_YN'
0 INDEX (UNIQUE SCAN) OF 'PK_FRAUDULLENT_YN' (UNIQUE)

Tom Kyte
October 28, 2002 - 7:21 am UTC

geez -- they both have an index on nmfile_ndx apparently -- they are the same query.

did you know that indexes do not

o imply fast=true
o go faster then non-index paths
o assure you the answer in less time



I would suggest using the CBO as well here.

Sorry wrong attachment

Chris, October 28, 2002 - 1:03 am UTC

Sorry Tom...I put wrong report attachment. here is the right one :

######### without index on nmfile

SELECT a.*
FROM t_cdr200 a, t_profile_anumber b, t_profile_yn c
WHERE
(SUBSTR(a.bnumber,1,1) != '0' AND a.anumber = b.anumber
AND a.nmfile = c.batchNo AND c.batchID >= 1 AND c.batchID <= 1
AND a.durasi > (b.dcrphlocal+(0.01 * b.dcrphlocal)))
AND c.batchNo not in
(select d.batchNo from t_fraudullent_yn d, t_profile_yn e where d.batchNo = e.
batchNo and d.fraudcodeid = 1)
order by a.nmfile, a.anumber

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 3 1 6 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 17680 14543 220 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 17683 14544 226 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 37 (DBA1)

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT ORDER BY
0 FILTER
1 NESTED LOOPS
1 MERGE JOIN
633773 SORT JOIN
633772 TABLE ACCESS FULL T_PROFILE_ANUMBER
633772 FILTER
1 SORT JOIN
1 TABLE ACCESS FULL T_CDR200
0 TABLE ACCESS BY INDEX ROWID T_PROFILE_YN
0 INDEX UNIQUE SCAN (object id 9652)
0 NESTED LOOPS
0 TABLE ACCESS FULL T_PROFILE_YN
0 INDEX UNIQUE SCAN (object id 9654)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (ORDER BY)
0 FILTER
1 NESTED LOOPS
1 MERGE JOIN
633773 SORT (JOIN)
633772 TABLE ACCESS (FULL) OF 'T_PROFILE_ANUMBER'
633772 FILTER
1 SORT (JOIN)
1 TABLE ACCESS (FULL) OF 'T_CDR200'
0 TABLE ACCESS (BY INDEX ROWID) OF 'T_PROFILE_YN'
0 INDEX (UNIQUE SCAN) OF 'PK_BATCHNO' (UNIQUE)
0 NESTED LOOPS
0 TABLE ACCESS (FULL) OF 'T_PROFILE_YN'
0 INDEX (UNIQUE SCAN) OF 'PK_FRAUDULLENT_YN' (UNIQUE)

********************************************************************************

######## with index on nmfile

SELECT a.*
FROM t_cdr200 a, t_profile_anumber b, t_profile_yn c
WHERE
(SUBSTR(a.bnumber,1,1) != '0' AND a.anumber = b.anumber
AND a.nmfile = c.batchNo AND c.batchID >= 1 AND c.batchID <= 1
AND a.durasi > (b.dcrphlocal+(0.01 * b.dcrphlocal)))
AND c.batchNo not in
(select d.batchNo from t_fraudullent_yn d, t_profile_yn e where d.batchNo = e.
batchNo and d.fraudcodeid = 1)
order by a.nmfile, a.anumber

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 3 1 6 0
Execute 4 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 32033 29084 235 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 8 0.00 0.00 32036 29085 241 0

Misses in library cache during parse: 2
Optimizer goal: CHOOSE
Parsing user id: 37 (DBA1)

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT ORDER BY
0 FILTER
1 NESTED LOOPS
1 MERGE JOIN
633773 SORT JOIN
633772 TABLE ACCESS FULL T_PROFILE_ANUMBER
633772 FILTER
1 SORT JOIN
1 TABLE ACCESS FULL T_CDR200
0 TABLE ACCESS BY INDEX ROWID T_PROFILE_YN
0 INDEX UNIQUE SCAN (object id 9652)
0 NESTED LOOPS
0 TABLE ACCESS FULL T_PROFILE_YN
0 INDEX UNIQUE SCAN (object id 9654)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (ORDER BY)
0 FILTER
1 MERGE JOIN
1 SORT (JOIN)
633773 NESTED LOOPS
633772 TABLE ACCESS (FULL) OF 'T_PROFILE_YN'
633772 TABLE ACCESS (BY INDEX ROWID) OF 'T_CDR200'
1 INDEX (RANGE SCAN) OF 'NMFILE_NDX' (NON-UNIQUE)
1 FILTER
0 SORT (JOIN)
0 TABLE ACCESS (FULL) OF 'T_PROFILE_ANUMBER'
0 NESTED LOOPS
0 TABLE ACCESS (FULL) OF 'T_PROFILE_YN'
0 INDEX (UNIQUE SCAN) OF 'PK_FRAUDULLENT_YN' (UNIQUE)

********************************************************************************





Tom Kyte
October 28, 2002 - 7:23 am UTC

the answer is simple:

the plans are different.
indexes do not mean "fast=true"
the RBO is index happy.
the CBO is what you should be using...

Great

Chris, October 29, 2002 - 4:53 am UTC

Thanks Tom for your explanation :)
I use optimizer hints all rows and it goes faster :)


Nested Loops

A Reader, June 27, 2003 - 4:51 pm UTC

Hi Tom,

Simple Question. There is one query joining two tables 
site => contact uses hash join. site has one index and its on on site_id, contact doesn't have any index. It takes 7 minutes to finish. 

SELECT Count(*) AS ForAvg , COUNT(*) AS Expr1
  FROM SITE SITE, CONTACT CONTACT  
   WHERE  SITE.SITE_ID = CONTACT.SITE_ID
     and CONTACT.MAIL_CONT In ('1', ' ')
     AND SITE.MAIL_SITE = '1'
     AND SITE.DSF_INDEX In ('1', '2')
     AND SITE.COUNTRY_CD = 'US'
     AND SITE.SS_GROUP = 'EDU'
     AND SITE.SS_SUBGRP =  'ELE'
     AND CONTACT.LIST_PRI = '191'


SQL> set autotrace traceonly
SQL> @qry3.test

1 row selected.

Elapsed: 00:07:09.33

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=28695 Card=1 Bytes=3
          9)

   1    0   SORT (AGGREGATE)
   2    1     HASH JOIN (Cost=28695 Card=243 Bytes=9477)
   3    2       TABLE ACCESS (FULL) OF 'SITE' (Cost=15205 Card=241 Byt
          es=5543)

   4    2       TABLE ACCESS (FULL) OF 'CONTACT' (Cost=13490 Card=1013
          3 Bytes=162128)


Statistics
----------------------------------------------------------
          0  recursive calls
       1998  db block gets
     188388  consistent gets
     187326  physical reads
        120  redo size
        443  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 



If another condition ("SITE.LST_ORD_SS Between '20020101' AND '20021231'" ) is added, the optimzer uses NESTED LOOPS join and it takes over 2 hrs to complete. 


SQL> @qry3.test

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=28695 Card=1 Bytes=4
          7)

   1    0   SORT (AGGREGATE)
   2    1     NESTED LOOPS (Cost=28695 Card=1 Bytes=47)
   3    2       TABLE ACCESS (FULL) OF 'SITE' (Cost=15205 Card=1 Bytes
          =31)

   4    2       TABLE ACCESS (FULL) OF 'CONTACT' (Cost=13490 Card=1013
          3 Bytes=162128)


Both the joining columns are not null and SITE.SITE_ID has an index on it.


I would appreciate any comment on this.


Thanks, 

Tom Kyte
June 27, 2003 - 5:17 pm UTC

It believes that that predicate will return 1 row (see the CARD)

how many rows from SITE does that actually return? If the optimizer is so far off -- why is it so far off? are the stats current? or really stale?

Nested Loops

A Reader, June 27, 2003 - 4:58 pm UTC

Hi Tom,

Simple Question. There is one query joining two tables 
site => contact uses hash join. site has one index and its on on site_id, contact doesn't have any index. It takes 7 minutes to finish. 

SELECT Count(*) AS ForAvg , COUNT(*) AS Expr1
  FROM SITE SITE, CONTACT CONTACT  
   WHERE  SITE.SITE_ID = CONTACT.SITE_ID
     and CONTACT.MAIL_CONT In ('1', ' ')
     AND SITE.MAIL_SITE = '1'
     AND SITE.DSF_INDEX In ('1', '2')
     AND SITE.COUNTRY_CD = 'US'
     AND SITE.SS_GROUP = 'EDU'
     AND SITE.SS_SUBGRP =  'ELE'
     AND CONTACT.LIST_PRI = '191'


SQL> set autotrace traceonly
SQL> @qry3.test

1 row selected.

Elapsed: 00:07:09.33

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=28695 Card=1 Bytes=3
          9)

   1    0   SORT (AGGREGATE)
   2    1     HASH JOIN (Cost=28695 Card=243 Bytes=9477)
   3    2       TABLE ACCESS (FULL) OF 'SITE' (Cost=15205 Card=241 Byt
          es=5543)

   4    2       TABLE ACCESS (FULL) OF 'CONTACT' (Cost=13490 Card=1013
          3 Bytes=162128)


Statistics
----------------------------------------------------------
          0  recursive calls
       1998  db block gets
     188388  consistent gets
     187326  physical reads
        120  redo size
        443  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 



If another condition ("SITE.LST_ORD_SS Between '20020101' AND '20021231'" ) is added, the optimzer uses NESTED LOOPS join and it takes over 2 hrs to complete. 


SQL> @qry3.test

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=28695 Card=1 Bytes=4
          7)

   1    0   SORT (AGGREGATE)
   2    1     NESTED LOOPS (Cost=28695 Card=1 Bytes=47)
   3    2       TABLE ACCESS (FULL) OF 'SITE' (Cost=15205 Card=1 Bytes
          =31)

   4    2       TABLE ACCESS (FULL) OF 'CONTACT' (Cost=13490 Card=1013
          3 Bytes=162128)


Both the joining columns are not null and SITE.SITE_ID has an index on it.


I would appreciate any comment on this.


Thanks, 

A reader, June 30, 2003 - 11:17 am UTC

Hi Tom, Thanks for you reply.

The site table returned 765 rows. I have gathered the stats of site again, though they weren't stale.  I still get the same plan for the query. What's am I missing ?. 

SQL> @qry3.test

    FORAVG      EXPR1
---------- ----------
       765        765

1 row selected.

Elapsed: 3:42:48.09

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=28695 Card=1 Bytes=4
          7)

   1    0   SORT (AGGREGATE)
   2    1     NESTED LOOPS (Cost=28695 Card=1 Bytes=47)
   3    2       TABLE ACCESS (FULL) OF 'SITE' (Cost=15205 Card=1 Bytes
          =31)

   4    2       TABLE ACCESS (FULL) OF 'CONTACT' (Cost=13490 Card=1013
          3 Bytes=162128)





Statistics
----------------------------------------------------------
          0  recursive calls
    1013821  db block gets
   99628913  consistent gets
   96030025  physical reads
          0  redo size
        443  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 


I would appreciate any comments.


Thanks, 

Tom Kyte
June 30, 2003 - 1:17 pm UTC

how do you GATHER the stats, exact command.

what does

SELECT Count(*) AS ForAvg , COUNT(*) AS Expr1
FROM SITE SITE
WHERE SITE.MAIL_SITE = '1'
AND SITE.DSF_INDEX In ('1', '2')
AND SITE.COUNTRY_CD = 'US'
AND SITE.SS_GROUP = 'EDU'
AND SITE.SS_SUBGRP = 'ELE'
and SITE.LST_ORD_SS Between '20020101' AND '20021231'
/


A Reader, June 30, 2003 - 10:32 pm UTC

Hi Tom, Thanks for your reply.

I am analyzing SITE using "analyze table site compute statistics"


Optimizer choses nested loops join for most of the queries if
statistics are gathered using

"execute dbms_stats.gather_table_stats('SCOTT','SITE') "


Thanks,

Tom Kyte
July 01, 2003 - 8:01 am UTC

so, what is the output of the query from right above.

A Reader, July 01, 2003 - 9:50 am UTC

Thanks Tom. 

Here is the output of the query :

SQL> @qry4.test

    FORAVG      EXPR1
---------- ----------
      1127       1127

Elapsed: 00:01:48.09

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=15205 Card=1 Bytes=1
          9)

   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'SITE' (Cost=15205 Card=1 Bytes=1
          9)



Statistics
----------------------------------------------------------
          0  recursive calls
       1097  db block gets
      99838  consistent gets
      99956  physical reads
         60  redo size
        443  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 



Thanks, 

Tom Kyte
July 01, 2003 - 10:08 am UTC

are you sure you use 'analyze table site compute statistics' as that should get histograms (do you have rows in user_tab_histograms for this table?)

Obviously, the statistics are leading the optimizer to the wrong conclusion. It firmly believes "one row" -- and is getting 1,127.

A Reader, July 01, 2003 - 4:20 pm UTC

Thanks Tom,

I have confirmed that site has had
"analyze table site compute statistics", also, user_Tab_histograms do have information of site.

I would appreciate any comments on this.


Thanks,



Tom Kyte
July 01, 2003 - 4:30 pm UTC

I would file it as a bug with support.

That the cardinality is being computed so wrong is not good. that is the cause of the bad plan.

given that the table appears to be a gig in size, I'm not going to ask you to email it to me or anything like that ;)

A Reader, July 02, 2003 - 9:59 pm UTC

Thanks Tom for helping me understand the problem. This site has helped me a lot for lot of other issues also.

Just want to ask one more question regarding the problem.

Are there any other factors(init parameters, db_block_size etc.) which can affect analyze to gather wrong stats or optimzer to make wrong plans ?.

I would apppreciate any comment on this.



Thanks,

Tom Kyte
July 03, 2003 - 9:01 am UTC

there are lots of parameters -- search this site for 10053

then, enable that event, explain a query.

read the resulting trace file. You'll see all of the parameters that affect the CBO (in my book due out in august, i go through many of them with examples of the effects they have)

correlated predicates

Owen Jones, December 16, 2004 - 6:14 am UTC

The problem here is probably due to one of two things, one easily fixable, one not.

1) You may have skewed columns and cursor sharing going on (i.e. system/session paraemter Cursor_sharing set to something other than exact). If you then issue this statement more than once, with different values, oracle may attempt to speed things up by avoiding the repeated parsing of it, by replacing literals with bind variables. (This is generally a GOOD idea in OLTP environments where parsing often takes much more time than executing it.)

The problem with this is of course that it then has no idea what the actual values are in subsequent uses, and just sticks to the first execution plan it came up with, despite the fact that if it reparsed the query with these values it would be able to pick a much better execution plan. Up to date statistics and perfect histograms won't help here, because it is deliberately ignoring the actual values given. (And so can go hideously wrong sometimes).

Fortunately there is an easy solution to this, use the hint /* cursor_sharing_exact */ in the query.

2) The much nastier problem, correlated predicates. This is a drawback in the way the Oracle optimiser estimates the selective power of multiple predicates. The optimiser (AFAIK) is based on the simplest assumption, that the values in one column are completely unrelated to those in another.

Based on this you can get the following kinds of problems: if you have a column with 50 fairly evenly distributed values (say state), then the optimiser believes that where "state = 'NEW YORK'" should cut the number of rows down by about a factor of 50. If you have another column in the table with say 1000 evenly distributed values (e.g. city) then the optimiser believes that "where city = 'NEW YORK'" should reduce the number of rows by about a factor of 1000.

The problem then becomes clear. Although there may be new york's in other states (and there certainly are some cities with the same name in more than one state), I'm guessing that the vast majority of addresses involving new york are 'NEW YORK, NEW YORK'. The optimiser however, presented with "where state='NEW YORK' and city = 'NEW YORK'" will assume that the columns are independant (i.e. there are 1000 cities in each state) and that this will reduce the number of rows by a factor of about 50,000! Histograms won't help much here, because although it will spot that NEW YORK is a really big city (only reduces by a factor of 50), and NEW YORK is a bigish state (maybe by a factor of 40), it'll still multiply the two together and get it massively wrong. The more columns you add in that aren't actually very independent, the worse things get, and you can quickly get to the optimiser being suprised it even gets one row back from a multi-million row table, when it actually often gets tens of thousands..

So:
SELECT Count(*) AS ForAvg , COUNT(*) AS Expr1
FROM SITE SITE
WHERE SITE.MAIL_SITE = '1'
AND SITE.DSF_INDEX In ('1', '2')
AND SITE.COUNTRY_CD = 'US'
AND SITE.SS_GROUP = 'EDU'
AND SITE.SS_SUBGRP = 'ELE'
and SITE.LST_ORD_SS Between '20020101' AND '20021231'

I'm guessing that 'ELE' stands for elementary school, and EDU for education, and so 'EDU' doesn't actually cut things down once you're given 'ELE'. Now if mail_site, dsf_index or country are related to each other to any degree (bet they are), or if more 'edu' or 'ele' records fall into the period in question than average (say you've recently started that kind of business), or any of a whole array of possible problems, the optimiser is always going to be wildly wrong about the cardinality it expects.

Note that even without the predicate on SITE.LST_ORD_SS the cardinality estimate was wrong (i.e. there were definitely more than 241 rows, I'm guessing way more), you just got lucky and it was still high enough to cause the same choice that would have been made had it been right. The extra column cut the number of rows expected down low enough to mean the wrong estimate led to Nested Loops, whereas the real number was still well into hash join territory.

This problem is much harder to solve. Short of a major overhaul of the oracle optimser to take account of joint distibution of columns (and god help the poor sap who gets that job!), our only option is to litter our SQL with combinations of cardinality, selectivity, use_hash and other such hints, and play around until the optimiser takes some notice.

Your best bet here is probably simply to avoid the really slow cases by forceing a hash join with a hint and take the smaller hit in those cases where nested loops might really be quicker than a hash join, if there actually are any (hash joins tend to not be very bad).

Note if each SS_SUBGRP really actually only appears in one SS_GROUP, the you should move SS_GROUP to a separate lookup table, i.e. normalise your data. But if there is actually some SS_SUBGRP's in more than one SS_GROUP (despite the names of the columns), you'll have this problem.

To estimate the degree to which the optimiser will overestimate joint selectivity, you can try

SELECT
(SELECT count(DISTINCT A) * count(DISTINCT B) FROM T)
/ (SELECT COUNT(*) FROM (SELECT DISTINCT A, B FROM T))
row_count_underestimate_factor
FROM dual

where A and B are column names, and T is your table (or in the real world, a copy of a sample of it!).

The closer this number is to 1, the more independant the columns are. For larger numbers, the optimiser will tend to expect this times too few rows when both columns are used together as predicates.

Gather Table Stats and Oracle Restart

Jim Cox, December 07, 2011 - 4:52 pm UTC

Hi Tom

I know this will sound kind of goofy, but here goes

I have two windows 2003 servers running 10.2.0.5.12

Server A is Production and has 8gb of RAM and 16 cpus using a SAN
Server B is Test and has 8gb of RAM with 2 cpus using a SAN

I have some sql that takes forever to run on the schema on Server A (6-7 hours) as the schema owner
I exported the schema from Server A to Server B with the schema stats
I ran the sql on Server B as the schema owner and as figured it took forever
I decided to gather the stats again on one of the tables that i thought was the problem

execute dbms_stats.gather_table_stats(ownname=>'LLTEST',tabname=>'DTREE',-
estimate_percent=>NULL,cascade=>TRUE,degree=>DBMS_STATS.DEFAULT_DEGREE,method_opt=>'FOR ALL INDEXED COLUMNS');

After it finished i ran the sql again hoping to see a difference in time
No good, took just as long

I then decided to restart Oracle and try the sql again.
For some reason the sql now responded within a couple of minutes

I tried this twice following the same steps as above
Both times the sql took too a long to finish UNTIL I restarted Oracle
Then the repsonse was with 1-2 minutes

So I am at a loss as to why the Oracle restart seems to have that affect

Have you seen or heard of this before by any chance ?

What am I missing ?


Thanks
Jim
Tom Kyte
December 08, 2011 - 11:06 am UTC

could be a number of reasons, rather than hypothesize - what we'd like to see would be a tkprof of the two situations.

do this:

exec dbms_monitor.session_trace_enable( binds => true, waits => true );
select ....
exit

in sqlplus, then run tkprof against the result trace files and post the relevant bits of the report

Gather Table Stats and Oracle Restart -Followup

Jim Cox, December 13, 2011 - 12:41 pm UTC

Hi Tom

that sql is going to take about 5-6 hours to run.

i can only execute this as sysdba
exec dbms_monitor.session_trace_enable( binds => true, waits => true );

do you want me to run the sql as the schema owner then or does it have to be run as the sysdba ?

Thanks
Jim
Tom Kyte
December 13, 2011 - 12:56 pm UTC

you can run that as any user that has execute on dbms_monitor. grant execute on that package to the appropriate people.

How to Tune this case

A reader, December 14, 2011 - 3:15 am UTC

Dear Tom,

I have a range partitioned table say t(id, part_dat, dat1, dat2, x, y, z,) in which part_dat is the partition key. I have a complex query in which the access to this table represents the most consuming operation. Query against this table is

select *
  from t
where
    id = :b1
and nvl(trunc(dat1), to_date('10112011 00:00:00','ddmmrrrr hh24:mi:ss'))         
        BETWEEN to_date('10112011 00:00:00','ddmmrrrr hh24:mi:ss') AND to_date('10112011 23:59:59','ddmmrrrr hh24:mi:ss')
and nvl(trunc(dat2), to_date('10112011 00:00:00','ddmmrrrr hh24:mi:ss'))         
        BETWEEN to_date('10112011 00:00:00','ddmmrrrr hh24:mi:ss') AND to_date('10112011 23:59:59','ddmmrrrr hh24:mi:ss')
What I did first is to rewrite this query to be

select *
  from t
 where
    id = :b1
 and trunc(dat1)= to_date('10112011','ddmmrrrr'))         
 and trunc(dat2)= to_date('10112011','ddmmrrrr'))         
union all
select *
  from t
 where
    id = :b1
 and trunc(dat1)= to_date('10112011','ddmmrrrr'))         
 and dat2 is null
union all
select *
  from t
 where
    id = :b1
 and trunc(dat2)= to_date('10112011','ddmmrrrr'))         
 and dat1 is null;


I would like to cover those 3 unioned selects by a single and precise function based index . I created the following index

create index ind_1_fbi on t(id1, trunc(dat1), trunc(dat2)) compress;

But this will not cover null dates

How you will handle this case if you were asked to tune it?

Thanks a lot in advance


Tom Kyte
December 15, 2011 - 7:30 am UTC

ops$tkyte%ORA11GR2> create index t_idx on t(id,dat1,dat2);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> variable x number;
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select *
  2    from t
  3   where id = :x
  4     and (
  5          (
  6                   dat1 BETWEEN to_date('10112011','ddmmrrrr') AND to_date('10113011','ddmmrrrr')-1/24/60/60
  7           and
  8                   dat2 BETWEEN to_date('10112011','ddmmrrrr') AND to_date('10113011','ddmmrrrr')-1/24/60/60
  9                  )
 10             OR
 11             (
 12              dat1 is null
 13          and
 14                  dat2 BETWEEN to_date('10112011','ddmmrrrr') AND to_date('10113011','ddmmrrrr')-1/24/60/60
 15             )
 16             OR
 17             (
 18               dat2 is null
 19                   and
 20                   dat1 BETWEEN to_date('10112011','ddmmrrrr') AND to_date('10113011','ddmmrrrr')-1/24/60/60
 21             )
 22           )
 23  /

Execution Plan
----------------------------------------------------------
Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |    26 |  2054 |     0   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T     |    26 |  2054 |     0   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DAT1" IS NULL AND "DAT2">=TO_DATE('10112011','ddmmrrrr') AND
              "DAT2"<=TO_DATE('10113011','ddmmrrrr')-.0000115740740740740740740740740740740
              7407407 OR "DAT2" IS NULL AND "DAT1">=TO_DATE('10112011','ddmmrrrr') AND
              "DAT1"<=TO_DATE('10113011','ddmmrrrr')-.0000115740740740740740740740740740740
              7407407 OR "DAT1">=TO_DATE('10112011','ddmmrrrr') AND
              "DAT2">=TO_DATE('10112011','ddmmrrrr') AND
              "DAT1"<=TO_DATE('10113011','ddmmrrrr')-.0000115740740740740740740740740740740
              7407407 AND "DAT2"<=TO_DATE('10113011','ddmmrrrr')-.0000115740740740740740740
              7407407407407407407)
   2 - access("ID"=TO_NUMBER(:X))

ops$tkyte%ORA11GR2> set autotrace off


presuming a full scan wasn't the right approach in the first place. I don't know

a) how many rows in the table
b) how many rows you expect to get back.

If you expect to get back thousands or less rows - go for the index.

If you expect hundreds of thousands, millions or more - look at full scanning and perhaps a different partitioning scheme to permit for partition pruning.

Gather Table Stats and Oracle Restart-How Do I

Jim Cox, December 15, 2011 - 9:42 am UTC

Hi Tom

well, it took me three days to get the tests done.

it says i can only upload 1000 words how can i get you all the output from the tkprof files ?

Thanks
Jim
Tom Kyte
December 16, 2011 - 6:21 am UTC

it is 32k, edit down the information so we have just what we need. I cannot use a really big thing, make it digestable, easily readable, terse - yet complete.

How to Tune this case

A reader, December 15, 2011 - 1:22 pm UTC

Thanks very much for your help. My query is returning 1540 rows and using the index is appropriate in this case

Best regards
Tom Kyte
December 16, 2011 - 5:10 pm UTC

I don't have any context here - who are you and what are you referring to?

A reader, December 19, 2011 - 4:52 am UTC

It's about my question entitled

How to Tune this case December 14, 2011
Tom Kyte
December 19, 2011 - 8:01 am UTC

oh, then I already answered you??? Not sure what you are asking now.


I showed you how to get an index range scan, assuming that is what you really wanted. You seem to want it.

did you try it?

Gather Table Stats and Oracle Restart-How Do I-Followup

Jim Cox, December 20, 2011 - 12:59 pm UTC

Hi Tom

I have looked at the tkptof files (3) and I am not knowledgeable enough to provide you with the data that you would need to assist me. To me, it all looks important. Each file has stats that are different, so I am totaly in over my head.

Thanks
Jim
Tom Kyte
December 20, 2011 - 2:12 pm UTC

you just need to post the one query from the different runs. Just need the three sections that would look only like this:

********************************************************************************

SQL ID: a5ks9fhw2v9s1 Plan Hash: 272002086

select *
from
 dual


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          2          2          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          2          2          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 489
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  TABLE ACCESS FULL DUAL (cr=2 pr=2 pw=0 time=2149 us cost=2 size=2 card=1)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file sequential read                         2        0.00          0.00
  SQL*Net message from client                     2        0.00          0.00
********************************************************************************



except we'd see your query, the little call/count/cpu etc report section, the row source operation and the wait event section.

Just three of them - one from each of the tkprofs, just for your query.

Gather Table Stats and Oracle Restart-Results

Jim Cox, December 21, 2011 - 12:52 pm UTC

Hi Tom

Well I am just about going nuts here on this slow sql issue
This is the second slow sql that would not complete and now they both complete on my test server (LLTEST)

I have been running tests on the test server and database for over a week now and for some strange reason after I imported the schema from 12-6-11) and ran the two queries again believe it or not, they now run like they should.

I did not gather stats on the DTree table this time

So I went back to the production server where this schema came from and ran the slow sql on that server and it does not finish. The DTree stats were gathered on 12-8-11

I am now thinking that because I gatherd stats on the DTREE table for LLPROD and not in LLTEST that that is
why the sql will not finish.

I am including tkprofs from the second query I ran

My tkprof files do not have some of the items you mentioned. Am I running it incorrectly ?

i.e. (Rows (1st) Rows (avg) Rows (max) Row Source Operation)

tkprof K:/oracle/admin/LLTEST/udump/lltest_ora_2604.trc K:/oracle/admin/LLTEST/udump/lltest_ora_2604 explain=LLTEST/xxxxx sys=no
tkprof K:/oracle/admin/LLPROD/udump/llprod_ora_6048.trc K:/oracle/admin/LLPROD/udump/llprod_ora_6048 explain=LLPROD/xxxxx sys=no


Thanks for Your Patience on This One
Jim


Query:

select count (*)
from Recd_OperationSummary ROT
where exists
(select 1
from
DTree DTreeItems,
DTree DTreeOwners
where
ROT.DataID = DTreeItems.DataID
and
DTreeItems.OwnerID = DTreeOwners.OwnerID
and
DTreeOwners.SubType IN ( 148, 161, 162 ))



********************
Server 1 (runs fine LLTEST)
********************

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 5.71 5.77 0 65698 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 5.73 5.79 0 65698 0 1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 209 (LLTEST)

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=65698 pr=0 pw=0 time=5779518 us)
8 HASH JOIN SEMI (cr=65698 pr=0 pw=0 time=370055 us)
267964 INDEX FULL SCAN GITYPEACTUSERDATAREFCNT (cr=1445 pr=0 pw=0 time=536018 us)(object id 223835)
148270 VIEW VW_SQ_1 (cr=64253 pr=0 pw=0 time=11417990 us)
148270 NESTED LOOPS (cr=64253 pr=0 pw=0 time=10824904 us)
8454 INDEX FULL SCAN DTREE_RECD_OPINIONSUMMARY (cr=46912 pr=0 pw=0 time=3644448 us)(object id 224148)
148270 INDEX RANGE SCAN DTREE_INDEX1 (cr=17341 pr=0 pw=0 time=599956 us)(object id 224136)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
1 SORT (AGGREGATE)
8 HASH JOIN (SEMI)
267964 INDEX MODE: ANALYZED (FULL SCAN) OF
'GITYPEACTUSERDATAREFCNT' (INDEX)
148270 VIEW OF 'VW_SQ_1' (VIEW)
148270 NESTED LOOPS
8454 INDEX MODE: ANALYZED (FULL SCAN) OF
'DTREE_RECD_OPINIONSUMMARY' (INDEX)
148270 INDEX MODE: ANALYZED (RANGE SCAN) OF 'DTREE_INDEX1'
(INDEX (UNIQUE))


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
SQL*Net message from client 2 2.87 2.87



********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 2 0.00 0.00 0 0 0 1
Fetch 2 5.71 5.77 0 65698 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 5.73 5.79 0 65698 0 2

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 3 0.00 0.00
SQL*Net message from client 3 8.27 11.14





********************
Server 2 (does not finish, had to abort it after 30 minutes LLPROD)
********************

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.02 0 0 0 0
Execute 1 0.00 0.01 0 0 0 0
Fetch 1 1757.56 1762.82 3485 144809233 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 1757.56 1762.86 3485 144809233 0 0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 157

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT AGGREGATE (cr=0 pr=0 pw=0 time=24 us)
0 INDEX FULL SCAN GITYPEACTUSERDATAREFCNT (cr=0 pr=0 pw=0 time=4 us)(object id 297249)
0 TABLE ACCESS BY INDEX ROWID DTREE (cr=143941443 pr=3484 pw=0 time=1746014743 us)
786837686 NESTED LOOPS (cr=2207919 pr=531 pw=0 time=1583529341 us)
124 INDEX RANGE SCAN DTREE_RECD_OPINIONSUMMARY (cr=985 pr=1 pw=0 time=22187 us)(object id 297562)
786837236 INDEX RANGE SCAN DTREE_INDEX1 (cr=2206934 pr=530 pw=0 time=795856648 us)(object id 297550)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 1 0.00 0.00
db file sequential read 3485 0.06 4.18



********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.02 0 0 0 0
Execute 2 0.00 0.08 2 58 0 1
Fetch 1 1757.56 1762.82 3485 144809233 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 1757.56 1762.93 3487 144809291 0 1

Misses in library cache during parse: 1
Misses in library cache during execute: 1

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
SQL*Net message from client 1 9.63 9.63
db file sequential read 3485 0.06 4.18


********************************************************************************
Tom Kyte
December 21, 2011 - 3:03 pm UTC

well, it would appear you have completely different sets of data - on the "working" machine the index range scan on dtree_index1 found 148,270 rows.

Before you killed it, the other server had found 786,837,236 via the same index.

Are we comparing apples with apples?


I am now thinking that because I gatherd stats on the DTREE table for LLPROD
and not in LLTEST that that is
why the sql will not finish.


I'm confused. Are you gathering stats on the table being queried or are you importing them from somewhere else?

This looks like a statistics problem here - we would not use an index to get 3/4ths of a billion rows, we shouldn't anyway.


Gather Table Stats and Oracle Restart-Followup

Jim Cox, December 22, 2011 - 12:35 pm UTC

Hi Tom

well the LLTEST data is back from 12-1-11 and the LLPROD data is from 12-21-11 and is more current and growing daily. That is why there is a difference.

I started my tests again. I imported a schema from LLPROD taken on 12-1-11 into LLTEST, including indexes. I did NO stats gathering this time. I excuted the first query different then the one above and it took over 16 hours to complete. I got the results just now as i just got into work and would appreciate you looking at them. Bottom line is the two querirs are not working again on LLTEST, so I am confused, again.

Thanks
Jim


Misses in library cache during parse: 0
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 210 (LLTEST)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 59.13 59.13
********************************************************************************

SELECT
dt.DataID,
dt.CreateDate,
dt.SubType
FROM
DTree dt
INNER JOIN DTree parent ON dt.ParentID = parent.DataID
INNER JOIN DTree owner ON dt.OwnerID = -owner.DataID
left join DVersData dvd on dt.DataID = dvd.DocID
WHERE
dt.CreateDate >= TO_DATE( '2011/12/01', 'YYYY/MM/DD' )
and
parent.SubType != 154
AND
owner.SubType NOT IN ( 148, 161, 162 )
AND
dt.SubType in (144)
AND dvd.MimeType not in
(
'application/gzip-compressed',
'application/octet-stream',
'application/outlook-pst',
'application/vnd.ms-access',
'application/vnd.ms-binder',
'application/vnd.xfdl',
'application/x-dbf',
'application/x-dpa',
'application/x-gzip',
'application/x-java-class',
'application/x-ldb',
'application/x-lnk',
'application/x-max',
'application/x-mpx',
'application/x-ms-wmz',
'application/x-pps',
'application/x-python-bytecode',
'application/x-shs',
'application/x-trn',
'application/x-zip-compressed',
'audio/basic',
'audio/mp3',
'audio/mpeg',
'audio/x-aiff',
'audio/x-wav',
'drawing/dwg',
'image/bmp',
'image/gif',
'image/ief',
'image/jpeg',
'image/pcx',
'image/png',
'image/psd',
'image/tga',
'image/tiff',
'image/x-bmp',
'image/x-cmu-raster',
'image/x-pcx',
'image/x-pic',
'image/x-pict',
'image/x-portable-anymap',
'image/x-portable-graymap',
'image/x-portable-pixmap',
'image/x-rgb',
'image/x-wmf',
'image/x-xbitmap',
'image/x-xpixmap',
'image/x-xwindowdump',
'video/asx',
'video/mpeg',
'video/quicktime',
'video/x-ms-wmv',
'video/x-msvideo',
'video/x-sgi-movie'
)
ORDER BY
dt.CreateDate DESC,
dt.DataID DESC

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.23 0.21 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 795 55997.14 56258.23 49489 558177258 0 11897
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 797 55997.37 56258.45 49489 558177258 0 11897

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 210 (LLTEST)

Rows Row Source Operation
------- ---------------------------------------------------
11897 SORT ORDER BY (cr=558177258 pr=49489 pw=0 time=423596876 us)
11897 NESTED LOOPS (cr=558177258 pr=49489 pw=0 time=60944179729 us)
11897 NESTED LOOPS (cr=77089 pr=2500 pw=0 time=41136018 us)
14748 NESTED LOOPS (cr=47489 pr=2033 pw=0 time=24563802 us)
14748 TABLE ACCESS BY INDEX ROWID DTREE (cr=17884 pr=1989 pw=0 time=23796881 us)
20929 INDEX RANGE SCAN RECD_DTREE_INDEX1 (cr=59 pr=59 pw=0 time=886018 us)(object id 225615)
14748 INDEX RANGE SCAN DTREE_RECD_OPINIONSUMMARY (cr=29605 pr=44 pw=0 time=1089697 us)(object id 225618)
11897 INDEX RANGE SCAN RECD_DOCID_MIMETYPE (cr=29600 pr=467 pw=0 time=2890663 us)(object id 225168)
11897 INDEX FULL SCAN DTREE_RECD_OPINIONSUMMARY (cr=558100169 pr=46989 pw=0 time=56216030240 us)(object id 225618)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
11897 SORT (ORDER BY)
11897 HASH JOIN
11897 NESTED LOOPS
14748 NESTED LOOPS
14748 TABLE ACCESS MODE: ANALYZED (BY INDEX ROWID) OF
'DTREE' (TABLE)
20929 INDEX MODE: ANALYZED (RANGE SCAN) OF
'RECD_DTREE_INDEX1' (INDEX)
14748 INDEX MODE: ANALYZED (RANGE SCAN) OF
'DTREE_RECD_OPINIONSUMMARY' (INDEX)
11897 INDEX MODE: ANALYZED (RANGE SCAN) OF
'RECD_DOCID_MIMETYPE' (INDEX)
11897 INDEX MODE: ANALYZED (FULL SCAN) OF
'DTREE_RECD_OPINIONSUMMARY' (INDEX)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 795 0.00 0.00
db file sequential read 4601 0.14 17.42
db file scattered read 3086 0.09 12.86
latch: cache buffers chains 4 0.00 0.00
latch free 5 0.00 0.00
SQL*Net message from client 795 580.83 597.54



********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.23 0.21 0 0 0 0
Execute 2 0.00 0.00 0 49 0 1
Fetch 795 55997.14 56258.23 49489 558177258 0 11897
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 798 55997.37 56258.45 49489 558177307 0 11898

Misses in library cache during parse: 1
Misses in library cache during execute: 1

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 796 0.00 0.00
SQL*Net message from client 796 580.83 656.67
db file sequential read 4601 0.14 17.42
db file scattered read 3086 0.09 12.86
latch: cache buffers chains 4 0.00 0.00
latch free 5 0.00 0.00


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 43 0.00 0.02 0 0 0 0
Execute 477 0.98 0.88 0 0 0 0
Fetch 712 0.14 0.60 86 1610 0 4124
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1232 1.12 1.51 86 1610 0 4124

Misses in library cache during parse: 21
Misses in library cache during execute: 21

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 86 0.04 0.49

********************************************************************************




Tom Kyte
December 22, 2011 - 2:12 pm UTC

I am guessing bad plan from bad stats.


if you were to gather stats - just using

gather_table_stats( user, 'tablename' )

does the plan change - it was doing just fine until it hit this:

11897 INDEX FULL SCAN DTREE_RECD_OPINIONSUMMARY (cr=558100169 pr=46989
pw=0 time=56216030240 us)(object id 225618)

Really Confused Now

Jim Cox, December 22, 2011 - 3:03 pm UTC

Hi Tom

I can try your suggestion, but here is what I do not get
I started this query around 13:44 yesterday. It finished at 0522 this morning. I sent you out the TKPROF results. After I did that. I logged back into the database and ran the same sql. It took less than 2 minutes to complete. This is why I am out of my wits.

Thanks
Jim
Tom Kyte
December 22, 2011 - 3:25 pm UTC

but - where is the tkprof for the "fast" one - that is what we need - I don't think I've seen it.

Hey, wait a minute...........


do you have the default stats job running in the database? What is the last analyzed time on those objects?

By default that job runs in the middle of the night, it would not affect an already running query - but it could certainly affect queries after that.

So.... when were these objects last analyzed according to the dictionary?

Stats-TKPROF Info

Jim Cox, December 22, 2011 - 3:40 pm UTC

Hi Tom

when i started the sql the stats were from 12-1-11 from the import as i checked at 12:34 and started the query at 13:44. The stats right now say 12-21-11, so they got run sometime yesterday after the query started

The problem is I cannot get a clean tkprof for you

Another odd thing I am seeing:

running the query and NOT executing the below statement it works quickly
exec dbms_monitor.session_trace_enable( binds => true, waits => true );

if i execute the above statement and then the query, it just keeps chugging and I have to abort it (runing since 13:21 right now

Thanks
Jim
Tom Kyte
December 22, 2011 - 3:54 pm UTC

.. The stats right now say 12-21-11, so
they got run sometime yesterday after the query started ..


that is part of the issue - you have entirely different stats due to the automatic job kicking in and gathering them. the imported statistics are likely "not representative of the data"


Can you look in v$sql and see how many copies of the sql statement you see in the shared pool? I'll guess there is more than one - the bad plan and the good plan. You can use dbms_xplan.display_cursor to display the query plans used by each child cursor - see if they are not different. for the sql trace query - it might be attached to a bad plan (we have to use different child cursors for sql traced queries versus non sql traced queries, they can potentially have different plans)


But I think we've answered the "it runs slow - until later" - the stats were completely different. The "overnight" information was what I needed to hear to figure that out.

v$SQL Info

Jim Cox, December 22, 2011 - 4:28 pm UTC

Hi Tom

well you lost me on just what i need to do to solve my problem

and i still do not know why the query will run fine if i do not execute this statement first: exec dbms_monitor.session_trace_enable( binds => true, waits => true );

are you saying that one query with the dbms_monitor is using one plan and the one with dbms_monitor is using another ?

I pulled this info from v$sql:
select
sql_id,
address,
hash_value,
child_address,
child_number,
plan_hash_value,
SUBSTR (sql_text,1,40) sql_text
--SUBSTR (sql_fulltext,1,40) sql_fulltext
from v$sql
where
sql_text like 'SELECT dt.DataID%';


SQL_ID ADDRESS HASH_VALUE CHILD_ADDRESS CHILD_NUMBER PLAN_HASH_VALUE SQL_TEXT
------------- ---------------- ---------- ---------------- ------------ --------------- ----------------------------------------
csjx4r3r3bsrf 000000027E7275D0 3996508910 000000027E7273A0 0 4269733020 SELECT dt.DataID, dt.CreateDate, dt.SubT
csjx4r3r3bsrf 000000027E7275D0 3996508910 000000026DF4FB78 1 570399632 SELECT dt.DataID, dt.CreateDate, dt.SubT
Tom Kyte
December 22, 2011 - 5:13 pm UTC

are you saying that one query with the dbms_monitor is using one plan and the
one with dbms_monitor is using another ?


if you enable trace, we have to create a child cursor with its own plan which may or may not be the same as the plan the other instance is using.


use

select * from table( dbms_xplan.display_cursor( 'csjx4r3r3bsrf', 0 ) );

select * from table( dbms_xplan.display_cursor( 'csjx4r3r3bsrf', 1 ) );


to see if those two child cursors have the same or different plans associated withthem...

Jim Cox, December 22, 2011 - 5:57 pm UTC

Hi Tom

I think by redoing the stats on the table for today it got cleaned up

select * from table( dbms_xplan.display_cursor( 'csjx4r3r3bsrf', 0 ) );

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------
SQL_ID csjx4r3r3bsrf, child number 0
-------------------------------------
SELECT dt.DataID, dt.CreateDate, dt.SubType FROM DTree dt INNER JOIN DTree parent ON dt.ParentID =
parent.DataID INNER JOIN DTree owner ON dt.OwnerID = -owner.DataID left join DVersData dvd on dt.DataID =
dvd.DocID WHERE dt.CreateDate >= TO_DATE( '2011/12/01', 'YYYY/MM/DD' ) and parent.SubType != 154 AND
owner.SubType NOT IN ( 148, 161, 162 ) AND dt.SubType in (144) AND dvd.MimeType not in (
'application/gzip-compressed', 'application/octet-stream', 'application/outlook-pst',
'application/vnd.ms-access', 'application/vnd.ms-binder', 'application/vnd.xfdl', 'application/x-dbf',
'application/x-dpa', 'application/x-gzip', 'application/x-java-class', 'application/x-ldb',
'application/x-lnk', 'application/x-max', 'application/x-mpx', 'application/x-ms-wmz', 'application/x-pps',
'application/x-python-bytecode', 'application/x-shs', 'application/x-trn', 'application/x-zip-compressed',
'audio/basic', 'audio/mp3', 'audio/mpeg', 'audio/x-aiff', 'audio/x-wav', 'drawing/dwg', 'image/bmp', 'image/g

Plan hash value: 570399632

---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 8462K(100)| |
| 1 | SORT ORDER BY | | 452M| 32G| 37G| 8462K (4)| 09:31:22 |
|* 2 | HASH JOIN | | 452M| 32G| | 23525 (66)| 00:01:36 |
| 3 | NESTED LOOPS | | 12163 | 807K| | 3532 (1)| 00:00:15 |
| 4 | NESTED LOOPS | | 11784 | 448K| | 2350 (1)| 00:00:10 |
|* 5 | TABLE ACCESS BY INDEX ROWID| DTREE | 11767 | 333K| | 1170 (1)| 00:00:05 |
|* 6 | INDEX RANGE SCAN | RECD_DTREE_INDEX1 | 14749 | | | 4 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | DTREE_RECD_OPINIONSUMMARY | 1 | 10 | | 1 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | RECD_DOCID_MIMETYPE | 1 | 29 | | 1 (0)| 00:00:01 |
|* 9 | INDEX FULL SCAN | DTREE_RECD_OPINIONSUMMARY | 8591K| 81M| | 4695 (3)| 00:00:20 |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("DT"."OWNERID"=(-"OWNER"."DATAID"))
5 - filter("DT"."SUBTYPE"=144)
6 - access("DT"."CREATEDATE">=TO_DATE(' 2011-12-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
7 - access("DT"."PARENTID"="PARENT"."DATAID")
filter("PARENT"."SUBTYPE"<>154)
8 - access("DT"."DATAID"="DVD"."DOCID")
filter(("DVD"."MIMETYPE"<>'image/jpeg' AND "DVD"."MIMETYPE"<>'application/octet-stream' AND
"DVD"."MIMETYPE"<>'image/gif' AND "DVD"."MIMETYPE"<>'application/vnd.xfdl' AND
"DVD"."MIMETYPE"<>'application/x-zip-compressed' AND "DVD"."MIMETYPE"<>'image/tiff' AND
"DVD"."MIMETYPE"<>'image/png' AND "DVD"."MIMETYPE"<>'image/bmp' AND
"DVD"."MIMETYPE"<>'application/vnd.ms-binder' AND "DVD"."MIMETYPE"<>'application/x-dbf' AND
"DVD"."MIMETYPE"<>'application/gzip-compressed' AND "DVD"."MIMETYPE"<>'application/vnd.ms-access' AND
"DVD"."MIMETYPE"<>'drawing/dwg' AND "DVD"."MIMETYPE"<>'audio/mp3' AND "DVD"."MIMETYPE"<>'image/x-wmf' AND
"DVD"."MIMETYPE"<>'video/quicktime' AND "DVD"."MIMETYPE"<>'video/x-msvideo' AND
"DVD"."MIMETYPE"<>'audio/x-wav' AND "DVD"."MIMETYPE"<>'application/x-ms-wmz' AND
"DVD"."MIMETYPE"<>'image/psd' AND "DVD"."MIMETYPE"<>'image/x-portable-graymap' AND
"DVD"."MIMETYPE"<>'video/mpeg' AND "DVD"."MIMETYPE"<>'video/x-ms-wmv' AND
"DVD"."MIMETYPE"<>'application/outlook-pst' AND "DVD"."MIMETYPE"<>'application/x-dpa' AND
"DVD"."MIMETYPE"<>'application/x-gzip' AND "DVD"."MIMETYPE"<>'application/x-java-class' AND
"DVD"."MIMETYPE"<>'application/x-ldb' AND "DVD"."MIMETYPE"<>'application/x-lnk' AND
"DVD"."MIMETYPE"<>'application/x-max' AND "DVD"."MIMETYPE"<>'application/x-mpx' AND
"DVD"."MIMETYPE"<>'application/x-pps' AND "DVD"."MIMETYPE"<>'application/x-python-bytecode' AND
"DVD"."MIMETYPE"<>'application/x-shs' AND "DVD"."MIMETYPE"<>'application/x-trn' AND
"DVD"."MIMETYPE"<>'audio/basic' AND "DVD"."MIMETYPE"<>'audio/mpeg' AND "DVD"."MIMETYPE"<>'audio/x-aiff' AND
"DVD"."MIMETYPE"<>'image/ief' AND "DVD"."MIMETYPE"<>'image/pcx' AND "DVD"."MIMETYPE"<>'image/tga' AND
"DVD"."MIMETYPE"<>'image/x-bmp' AND "DVD"."MIMETYPE"<>'image/x-cmu-raster' AND
"DVD"."MIMETYPE"<>'image/x-pcx' AND "DVD"."MIMETYPE"<>'image/x-pic' AND "DVD"."MIMETYPE"<>'image/x-pict' AND
"DVD"."MIMETYPE"<>'image/x-portable-anymap' AND "DVD"."MIMETYPE"<>'image/x-portable-pixmap' AND
"DVD"."MIMETYPE"<>'image/x-rgb' AND "DVD"."MIMETYPE"<>'image/x-xbitmap' AND
"DVD"."MIMETYPE"<>'image/x-xpixmap' AND "DVD"."MIMETYPE"<>'image/x-xwindowdump' AND
"DVD"."MIMETYPE"<>'video/asx' AND "DVD"."MIMETYPE"<>'video/x-sgi-movie'))
9 - filter(("OWNER"."SUBTYPE"<>162 AND "OWNER"."SUBTYPE"<>148 AND "OWNER"."SUBTYPE"<>161))


66 rows selected.

select * from table( dbms_xplan.display_cursor( 'csjx4r3r3bsrf', 1 ) );

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------
SQL_ID: csjx4r3r3bsrf, child number: 1 cannot be found

Thanks
Jim

Query Does Not Finish

JIm Cox, December 27, 2011 - 10:10 am UTC

Hi Tom

I have another query that is causing me grief on production database so I am testing it on a test database.
This was a "delete from" but i changed it to a select count (*) so i could test it

I started this query on 12-23-11 at 1:10am
I checked it this morning (12-26-11) and it is still running and I am not sure why
it has not completed at this time adn this is what is in the trace file so far

there are 267964 records in Recd_OperationSummary
there are 8599372 in DTree

Any info you can give me as to why this is not completing is appreciated


Thanks
Jim

##############


select count (*)
from Recd_OperationSummary ROT
where exists
( select 1
from
DTree DTreeItems,
DTree DTreeOwners
where
ROT.DataID = DTreeItems.DataID
and
DTreeItems.OwnerID = DTreeOwners.OwnerID
and
DTreeOwners.SubType IN ( 148, 161, 162 ))

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.04 0.08 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.04 0.08 0 0 0 0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 210 (LLTEST)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
0 SORT (AGGREGATE)
0 INDEX MODE: ANALYZED (FULL SCAN) OF
'GITYPEACTUSERDATAREFCNT' (INDEX)
0 TABLE ACCESS MODE: ANALYZED (BY INDEX ROWID) OF 'DTREE'
(TABLE)
0 NESTED LOOPS
0 INDEX MODE: ANALYZED (RANGE SCAN) OF
'DTREE_RECD_OPINIONSUMMARY' (INDEX)
0 INDEX MODE: ANALYZED (RANGE SCAN) OF 'DTREE_INDEX1'
(INDEX (UNIQUE))


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 1 0.00 0.00
db file sequential read 19655 0.56 95.45
db file scattered read 22531 0.20 129.92
latch: shared pool 6 0.00 0.00
latch: cache buffers chains 3 0.00 0.00



********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.04 0.08 0 0 0 0
Execute 2 0.01 0.00 0 49 0 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.06 0.09 0 49 0 1

Misses in library cache during parse: 1
Misses in library cache during execute: 1

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
SQL*Net message from client 1 0.00 0.00
db file sequential read 19655 0.56 95.45
db file scattered read 22531 0.20 129.92
latch: shared pool 6 0.00 0.00
latch: cache buffers chains 3 0.00 0.00


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 23 0.04 0.01 0 0 0 0
Execute 193 0.35 0.34 0 0 0 0
Fetch 256 0.01 0.06 2 662 0 1017
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 472 0.42 0.42 2 662 0 1017

Misses in library cache during parse: 14
Misses in library cache during execute: 14

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 2 0.03 0.04

2 user SQL statements in session.
193 internal SQL statements in session.
195 SQL statements in session.
1 statement EXPLAINed in this session.
Tom Kyte
December 27, 2011 - 10:44 am UTC

... This was a "delete from" but i changed it to a select count (*) so i could test
it ..

that is a waste of time.

count(*) will result in entirely different plans.


delete from t where x = 5;

say X was indexed, that would be an index range scan followed by a table access by index rowid to get the row.

select count(*) from t where x = 5;

would be a simple index range scan - nothing more, nothing less.


never 'test' with count(*) - it is a complete and utter waste of time and will get you nowhere.




use v$session_wait and v$session_event to see what it is currently doing. Do you have any of the tuning/diagnostic packs?

Query Does Not Finish - Followup

Jim Cox, December 27, 2011 - 4:05 pm UTC

Hi Tom

uh oh.. Sorry about that. Did not realize that. Lession Learned

here is some sql that I have for v$session and v$session_wait

The only tuning/diagnostic packs i have are what comes with OEM
I am runnig DBConsole. Do Not have Grid Control


Thanks
Jim

######################
select
-- saddr,
-- paddr,
machine,
program,
osuser,
sid,
-- serial#,
username,
status,
server,
process,
P1,
P1TEXT,
P2,
P2TEXT,
P3,
P3TEXT
-- LOCKWAIT,
-- ROW_WAIT_OBJ#,
-- ROW_WAIT_FILE#,
-- ROW_WAIT_BLOCK#,
-- ROW_WAIT_ROW#,
-- CURRENT_QUEUE_DURATION,
-- SQL_ADDRESS,
-- SQL_HASH_VALUE
from
v$session
--where
-- LOWER (program) = 'llserver.exe'
--and
-- status = 'ACTIVE'
order by
machine;


output:

Server Program Logon SID Oracle Name Status Connection Process P1 P1TEXT P2 P2TEXT P3 P3TEXT
-------------------------------- -------------------- ------------------------------ ------ ------------- -------- ------------ ---------- ---------- ------------------ ---------- ------------------ ---------- ------------------
LOSANGELES-2K\NSAB-SS68-ORA-N sqlplus.exe NSAB-SS68-ORA-N\LLAdmin 82 LLTEST ACTIVE DEDICATED 5044:5052 1413697536 driver id 1 #bytes 0
sqlplus.exe NSAB-SS68-ORA-N\LLAdmin 91 LLTEST ACTIVE DEDICATED 2040:3608 16 file# 28645 block# 1 blocks

NSAB-SS68-ORA-N ORACLE.EXE (q000) SYSTEM 94 ACTIVE DEDICATED 1100 0 0 0
ORACLE.EXE (QMNC) SYSTEM 97 ACTIVE DEDICATED 3488 0 0 0
ORACLE.EXE (ARC1) SYSTEM 100 ACTIVE DEDICATED 3476 30000 timeout 0 0
ORACLE.EXE (ARC0) SYSTEM 101 ACTIVE DEDICATED 2368 6000 timeout 0 0
ORACLE.EXE (MMNL) SYSTEM 105 ACTIVE DEDICATED 3504 100 timeout 0 0
ORACLE.EXE (MMON) SYSTEM 106 ACTIVE DEDICATED 1260 300 timeout 0 0
ORACLE.EXE (CJQ0) SYSTEM 107 ACTIVE DEDICATED 984 500 timeout 0 0
ORACLE.EXE (RECO) SYSTEM 108 ACTIVE DEDICATED 3372 180000 timeout 0 0
ORACLE.EXE (SMON) SYSTEM 109 ACTIVE DEDICATED 164 300 sleep time 0 failed 0
ORACLE.EXE (CKPT) SYSTEM 110 ACTIVE DEDICATED 1380 300 timeout 0 0
ORACLE.EXE (LGWR) SYSTEM 111 ACTIVE DEDICATED 1592 300 timeout 0 0
ORACLE.EXE (DBW0) SYSTEM 112 ACTIVE DEDICATED 516 300 timeout 0 0
ORACLE.EXE (MMAN) SYSTEM 113 ACTIVE DEDICATED 160 300 timeout 0 0
ORACLE.EXE (PSP0) SYSTEM 114 ACTIVE DEDICATED 2036 300 timeout 0 0
ORACLE.EXE (J000) SYSTEM 92 ACTIVE DEDICATED 2032 0 0 0
ORACLE.EXE (PMON) SYSTEM 115 ACTIVE DEDICATED 2500 300 duration 0 0
ORACLE.EXE (q001) SYSTEM 85 ACTIVE DEDICATED 3860 0 0 0




select
sid,
event,
p1,
p2,
p3,
-- p1raw,
seconds_in_wait,
wait_time,
state
from
v$session_wait
--where event = 'virtual circuit status'
--where state = 'WAITING'
--and event not like '%SQL%'
--where event not like '%rdbms%'
--and event not like '%timer%'
order by
seconds_in_wait desc;



prompt
prompt Checking Buffer Busy Waits
prompt

select
sid,
event,
p1,
p2,
p3,
-- p1raw,
seconds_in_wait,
wait_time,
state
from
v$session_wait
where event = 'buffer busy waits'
order by
event;



prompt
prompt Identify Segment with P1 and P2 Parms
prompt

select
owner,
partition_name,
segment_name,
segment_type,
tablespace_name,
file_id,
relative_fno
from
dba_extents
where
file_id = &P1
and
&P2 between block_id and block_id + blocks -1;


output:


SID Event Parm 1 (P1) Parm 2 (P2) Parm 3 (P3) Seconds in Wait Wait Time State
---------- ------------------------------------ ----------- ----------- ----------- --------------- ---------- -------------------
115 pmon timer 300 0 0 392106 0 WAITING
113 rdbms ipc message 300 0 0 392103 0 WAITING
105 rdbms ipc message 100 0 0 392100 0 WAITING
97 Streams AQ: qmn coordinator idle wai 0 0 0 392089 0 WAITING
108 rdbms ipc message 180000 0 0 392054 0 WAITING
109 smon timer 300 0 0 132514 0 WAITING
85 Streams AQ: qmn slave idle wait 0 0 0 92165 0 WAITING
94 Streams AQ: waiting for time managem 0 0 0 1871 0 WAITING
91 db file sequential read 16 28645 1 1725 -1 WAITED SHORT TIME
100 rdbms ipc message 30000 0 0 228 0 WAITING
106 rdbms ipc message 300 0 0 150 0 WAITING
114 rdbms ipc message 300 0 0 69 0 WAITING
107 rdbms ipc message 500 0 0 69 0 WAITING
101 rdbms ipc message 6000 0 0 39 0 WAITING
92 jobq slave wait 0 0 0 9 0 WAITING
112 rdbms ipc message 300 0 0 9 0 WAITING
111 rdbms ipc message 300 0 0 3 0 WAITING
110 rdbms ipc message 300 0 0 0 0 WAITING
82 SQL*Net message to client 1413697536 1 0 0 -1 WAITED SHORT TIME



Checking Buffer Busy Waits


no rows selected


Identify Segment with P1 and P2 Parms

Enter value for p1: 16
Enter value for p2: 28645

OWNER PARTITION_NAME SEGMENT_NAME SEGMENT_TYPE TABLESPACE_NAME FILE_ID RELATIVE_FNO
------------------------------ ------------------------------ --------------------------------------------------------------------------------- ------------------ ------------------------------ ---------- --------
LLTEST GITYPEACTUSERDATAREFCNT INDEX LIVELINK_INDEX 16 16



Tom Kyte
December 27, 2011 - 4:10 pm UTC

I cannot really read this, needs a bit of processing by you to extract just relevant information - and only for the session in question please - the backgrounds and other sessions are just not relevant.


you'll want to monitor the session over some period of time, peeking at v$session_wait and v$session_event - just for that session - and try to see what it is doing, what it is stuck on (if anything)

Query Does Not Finish - Followup

Jim Cox, December 27, 2011 - 4:56 pm UTC

Hi Tom

sorry about that

is this any better ?

looks like it is hung up on an index called GITYPEACTUSERDATAREFCNT

Thanks
Jim


Session:
Program Logon SID Oracle Name Status Connection Process P1 P1TEXT P2 P2TEXT P3 P3TEXT
-------------------------------- -------------------- ------------------------------ ------ ------------- -------- ------------ ---------- ---------- --- ----------
sqlplus.exe NSAB-SS68-ORA-N\LLAdmin 91 LLTEST ACTIVE DEDICATED 2040:3608 16 file# 28647 block# 1 blocks


Sessions_Events:

SID Event Parm 1 (P1) Parm 2 (P2) Parm 3 (P3) Seconds in Wait Wait Time State
---------- ------------------------------------ ------------- ------------- ----------- --------------- ---------- -------------------
115 pmon timer 300 0 0 394907 0 WAITING
113 rdbms ipc message 300 0 0 394904 0 WAITING
105 rdbms ipc message 100 0 0 394901 0 WAITING
97 Streams AQ: qmn coordinator idle wai 0 0 0 394890 0 WAITING
108 rdbms ipc message 180000 0 0 394855 0 WAITING
109 smon timer 300 0 0 135315 0 WAITING
85 Streams AQ: qmn slave idle wait 0 0 0 94966 0 WAITING
94 Streams AQ: waiting for time managem 0 0 0 2428 0 WAITING
99 SQL*Net message from client 675562835 1 0 391 0 WAITING
79 SQL*Net message from client 675562835 1 0 389 0 WAITING
93 SQL*Net message from client 675562835 1 0 352 0 WAITING
80 SQL*Net message from client 1952673792 1 0 334 0 WAITING
106 rdbms ipc message 300 0 0 305 0 WAITING
91 db file sequential read 16 28647 1 259 1 WAITED KNOWN TIME
76 SQL*Net message from client 675562835 1 0 191 0 WAITING
107 rdbms ipc message 500 0 0 108 0 WAITING
114 rdbms ipc message 300 0 0 105 0 WAITING
89 SQL*Net message from client 675562835 1 0 84 0 WAITING
88 SQL*Net message from client 675562835 1 0 51 0 WAITING
82 SQL*Net message from client 675562835 1 0 48 0 WAITING
75 SQL*Net message from client 1413697536 1 0 48 0 WAITING
92 jobq slave wait 0 0 0 42 0 WAITING
100 rdbms ipc message 30000 0 0 27 0 WAITING
84 wait for unread message on broadcast 10725920136 10725879600 0 24 0 WAITING
101 rdbms ipc message 6000 0 0 18 0 WAITING
83 SQL*Net message from client 675562835 1 0 6 0 WAITING
112 rdbms ipc message 300 0 0 3 0 WAITING
74 Streams AQ: waiting for messages in 8880 10723317280 5 0 0 WAITING
111 rdbms ipc message 300 0 0 0 0 WAITING
73 SQL*Net message to client 1413697536 1 0 0 -1 WAITED SHORT TIME
87 SQL*Net message from client 675562835 1 0 0 0 WAITING
110 rdbms ipc message 300 0 0 0 0 WAITING

Identify Segment with P1 and P2 Parms

Enter value for p1: 16
Enter value for p2: 28647

OWNER SEGMENT_NAME SEGMENT_TYPE TABLESPACE_NAME FILE_ID
------------------------------ ---------------------------- ---------------- --------------- -------
LLTEST GITYPEACTUSERDATAREFCNT INDEX LIVELINK_INDEX 16



FILE_ID FILE_NAME
---------- ----------------------------------------------------------

16 M:\ORACLE\ORADATA\LLTEST\LIVELINK_INDEX_LLTEST_02.DBF

Tom Kyte
December 27, 2011 - 5:24 pm UTC

can you post *just the single sid* having the problem, everything else is just noise.

where clause

Ahmad Al-Sallal, December 27, 2011 - 11:47 pm UTC

Hi Tom,
1)
Regarding tuning,please consider the following:
where ( condition1 or condition2 ) and ...
for performance issues i need to till Oracle to go through condition1 first (which is most likely TRUE and consumes less) to tune my query, how can i do that ?
2)
Does the ordering in where clause (joins first ! or conditions with indexes ! or ... ) effect the performace ?
3)
which one is better (using between) or (outer query with >= and the inner one with <= ) ?
4)
i need a favour, a Tom Kyte special link regarding everything about set autotrace in details, "StopKey","Merge Join",....
i think it's the first step in tuning for a beginer guy, isn't it?
Tom Kyte
December 29, 2011 - 10:44 am UTC

1) in general that is the job of the optimizer - it'll look at estimated cardinality, costs associated with functions and so on and just do that.

You could force the issue using CASE

where ( case condition1 
        then 1
        else case condition2
             then 1
         end ) = 1  and ....


if condition1 is true, the case immediately returns 1, else it evaluates condition2 and then if true, returns 1 else returns null.


I am not a fan of this approach in general, it will preclude access paths that a simple "where (c1 or c2) and ...." would permit - hiding the conditions in a CASE statement would in general be a bad idea.


2) no, the CBO doesn't care about the order of predicates, it will move them around as it sees fit.

3) doesn't matter, they are the same.

ops$tkyte%ORA11GR2> select * from dual where dummy between 'A' and 'Z';

Execution Plan
----------------------------------------------------------
Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     2 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DUMMY">='A' AND "DUMMY"<='Z')



see the predicate section? between is just shorthand for >=, <=



4) I don't have a link - much of it is something I wrote about in Effective Oracle by Design - which, even though it was written just as 10gr1 came out - is still up to date.

Query Does Not Finish - Followup

Jim Cox, December 28, 2011 - 10:35 am UTC

Hi Tom

data for the SID

Thanks
Jim


Program SID Process P1 P1TEXT P2 P2TEXT P3 P3TEXT
------------ ------ --------- ---- -------- ------ -------- ----- --------
sqlplus.exe 91 2040:3608 18 file# 330109 block# 1 blocks




SID Event (P1) (P2) (P3) Seconds in Wait Wait Time State
---- ------------------------ ------ ------ ---- --------------- ---------- -------------------
91 db file sequential read 18 330109 1 105 4 WAITED KNOWN TIME



Identify Segment with P1 and P2 Parms

Enter value for p1: 18
Enter value for p2: 330109

SEGMENT_NAME SEGMENT_TYPE TABLESPACE_NAME FILE_ID
------------------------------ ---------------- ------------------ --------
GITYPEACTUSERDATAREFCNT INDEX LIVELINK_INDEX 18
Tom Kyte
December 29, 2011 - 10:55 am UTC

I would say there is something wrong with that disk - look at the wait time for a single block read. 105 seconds.

Query Does Not Finish - Thanks

Jim Cox, December 30, 2011 - 5:13 pm UTC

Hi Tom

Was hoping it was not a disk issue. I will have to get with sys admins for assistance. Thanks very much for all your help this past couple of weeks. Hope you have a great coming year !

Regards
Jim

unique index -- optimizor

A reader, January 03, 2012 - 9:23 pm UTC

create table test(id int, a int, b int);
create unique index ind1 on test(id,a);
set autotrace on
select * from test where id=1 and a=1;
select * from test where id=1 and a=1 and b=1;
-- both index unique scan ind1
create index ind2 on test(id,a,b);
select * from test where id=1 and a=1;
-- index range scan ind2, wrong, should be index unique scan
select * from test where id=1 and a=1 and b=1;
-- index unique scan on ind1, wrong, should be index unique scan on ind2
drop index ind1;
select * from test where id=1 and a=1;
select * from test where id=1 and a=1 and b=1;
-- both index range scan on ind2

my thought is if ind1(id,a) is unique, then ind2(id,a,b) should be also unique. so above bond wrong indicated is what i am expecting, could you pls throw some light on this. Also would like to know whether it is possible for a 3 column composite index be not unique while its 2 leading column composite index is unique.
Tom Kyte
January 04, 2012 - 9:24 am UTC

you can only get an index unique scan on a unique index.

if we use an index that is not unique, it will NEVER be a unique scan - it cannot be.

It doesn't matter if some other unique index exists - if we decide to use a non-unique index, it will NEVER be a unique index scan.



Your example is so tiny (no data) that is really doesn't matter WHAT index we use - does it? Everything is empty.


Let us look at this step by step. First - just understand and realize that - unless the index used is unique - it will NOT be a unique scan, it cannot be - it is not a unique index (unique indexes are different - stored differently on the block - from a non-unique index).


The first pair of queries - when only ind1 exists - use the unique index. It is the only game in town.

After you create ind2 - and run the query:


select * from test where id=1 and a=1; 


It correctly uses ind2 - why correctly? because it very intelligently realized "I don't have to go to the table if I use the index, I can get id, a, b right from the index". It takes less IO to get it from ind2 - we skip the table access by index rowid.

If you add a column or two to the table (eg:


ops$tkyte%ORA11GR2> create table t(id int, a int, b int, data char(40) );

Table created.

ops$tkyte%ORA11GR2> insert into t
  2  select rownum, -rownum, rownum, 'x'
  3    from all_objects;

72189 rows created.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' )

PL/SQL procedure successfully completed.


you'll find the plan to be:

ops$tkyte%ORA11GR2> create index ind2 on t(id,a,b);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from t where id=1 and a=1;

Execution Plan
----------------------------------------------------------
Plan hash value: 316242928

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    57 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |     1 |    57 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | IND1 |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=1 AND "A"=1)


Here, since it has to go to the table to pick up another column - it will use the unique index.


So, when you say "wrong", it is you that is wrong :) We should use the non-unique index because we skip going to the table!



On to
select * from test where id=1 and a=1 and b=1; 
-- index unique scan on ind1, wrong, should be index unique scan on ind2 



Again - wrong. What if id=1, a=1 was found in the index - but then when we go to the table b was 42, not 1? We'd have gone to the table for nothing. So, it makes much much more sense to verify from the index that b=1 as well - BEFORE going to the table to check.

Since you had only id,a,b in the table - and id,a,b, existed in the index - it is correct to use ind2 for the reason outlined above.

But if you add data to the table - you'll find we STILL use ind2 - so that we can verify id, a AND b in the index before wasting our time to go to the table.

Also would like to know whether it is possible for a 3 column composite index be not unique while its 2 leading column composite index is unique.

obviously - it is, you just proved it! You created a non-unique index.

Now, maybe your question is:

is it possible for a 3 column non-unique index to actually contain a non-unique value if the leading 2 columns are in a unique index by themselves already


the answer to that is "no". But you can still define the index as non-unique. And if you do - you'll never "UNIQUE" scan it.

thank you very much

A reader, January 04, 2012 - 8:18 pm UTC

Hi Tom,

Thank you very much.

Answered my question perfectly and i know what is going wrong with myself:)

Your highlighted is exactly what's my question. Got it now.


Join of table

A reader, January 05, 2012 - 12:23 am UTC

Hi Tom,

IS it better to join as much as column as possible in query which is join of 2 or more table.

Suppose i have table a,b

select * from a,b where a.col1 = b.col1

have better execution plan

select * from a,b where a.col1 = b.col1 and b.col2 = a.col2
Tom Kyte
January 05, 2012 - 9:33 am UTC

this doesn't make any sense.

those are entirely different queries that return potentially different result sets.

You run the query that actually retrieves the data you need.

A reader, January 05, 2012 - 9:19 pm UTC

Hi Tom,

Both the query will get the same result.AS both column 1 and 2 will be same.

col1 is primary key in both the table.And found in some case where execution was worst after joining col2.
Tom Kyte
January 10, 2012 - 9:21 pm UTC

you wrote:

Suppose i have table a,b

select * from a,b where a.col1 = b.col1

have better execution plan

select * from a,b where a.col1 = b.col1 and b.col2 = a.col2



I wrote that they cannot be compared, they are entirely different queries. Here is evidence that what I said was true and that what you are currently saying is very wrong:

ops$tkyte%ORA11GR2> create table a ( col1 int primary key, col2 int );

Table created.

ops$tkyte%ORA11GR2> create table b ( col1 int primary key, col2 int );

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into a values (1, 1 );

1 row created.

ops$tkyte%ORA11GR2> insert into b values (1, 2 );

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from a,b where a.col1 = b.col1;

      COL1       COL2       COL1       COL2
---------- ---------- ---------- ----------
         1          1          1          2

ops$tkyte%ORA11GR2> select * from a,b where a.col1 = b.col1 and b.col2 = a.col2;

no rows selected

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 

A reader, January 11, 2012 - 10:50 am UTC

Hi Tom,

Maybe i have not asked question correctly.Sorry.

Let say i have 3 column in table a and b.And now in join query col1 of both table is same and joined in where clause.Also the col2 value will be always same for both table as per our business logic.If i join this col2 in some case i find good execution plan but in some case not.can you explain why ?? sorry if have i missed something.

create table a ( col1 int primary key, col2 int ,col3 int);

create table b ( col1 int primary key, col2 int,col3int );

insert into a values (1, 2,1 );

insert into b values (1, 2,2 );

Tom Kyte
January 11, 2012 - 3:13 pm UTC

can you give me an example to work with - give me a "for example, take this table, and that table - and notice these two query plans - see how they are different and one performs better than the other"

give me a concrete example to work with - just like I always give you guys.


Also the col2 value will be always
same for both table as per our business logic


what or how do you enforce that? and why wouldn't these things be the SAME TABLE - why the heck is there even a join here at all???

Query Does Not Finish - Again

Jim Cox, January 13, 2012 - 2:16 pm UTC

Hi Tom

I have a very perplexing problem that i want to run by you with regards to what you helped me with last time

I have imported a copy of my database to test again
I have locked down the schema stats so they cannot be changed
I run the same sql and the only difference being is the line: WHERE dt.CreateDate >=
I have selected 2012/01/09 and several dates before that date and the queries run within less than a minute
I then change the date to 2012/01/10 and for some reason the query will not finish

I have looked at the explain plans and the only thing i see different is that the one that does not finish
does not do a HASH JOIN for some reason. It does three NESTED LOOPS and i am assuming that is the cause

Any advice is appreciated
I am not understanding why the date change makes a difference

Thanks
Jim


sql that is being run:

SELECT
dt.DataID,
dt.CreateDate,
dt.SubType
FROM DTree dt
INNER JOIN DTree parent ON dt.ParentID = parent.DataID
INNER JOIN DTree owner ON dt.OwnerID = -owner.DataID
left join DVersData dvd on dt.DataID = dvd.DocID
WHERE dt.CreateDate >= TO_DATE( '2012/01/10', 'YYYY/MM/DD' )
and parent.SubType != 154
AND owner.SubType NOT IN ( 148, 161, 162 )
AND dt.SubType in (144)
AND dvd.MimeType not in
(
'application/gzip-compressed',
'application/octet-stream',
'application/outlook-pst',
'application/vnd.ms-access',
'application/vnd.ms-binder',
'application/vnd.xfdl',
'application/x-dbf',
'application/x-dpa',
'application/x-gzip',
'application/x-java-class',
'application/x-ldb',
'application/x-lnk',
'application/x-max',
'application/x-mpx',
'application/x-ms-wmz',
'application/x-pps',
'application/x-python-bytecode',
'application/x-shs',
'application/x-trn',
'application/x-zip-compressed',
'audio/basic',
'audio/mp3',
'audio/mpeg',
'audio/x-aiff',
'audio/x-wav',
'drawing/dwg',
'image/bmp',
'image/gif',
'image/ief',
'image/jpeg',
'image/pcx',
'image/png',
'image/psd',
'image/tga',
'image/tiff',
'image/x-bmp',
'image/x-cmu-raster',
'image/x-pcx',
'image/x-pic',
'image/x-pict',
'image/x-portable-anymap',
'image/x-portable-graymap',
'image/x-portable-pixmap',
'image/x-rgb',
'image/x-wmf',
'image/x-xbitmap',
'image/x-xpixmap',
'image/x-xwindowdump',
'video/asx',
'video/mpeg',
'video/quicktime',
'video/x-ms-wmv',
'video/x-msvideo',
'video/x-sgi-movie'
)
ORDER BY dt.CreateDate DESC, dt.DataID DESC ;


check for plans:

select
sql_id,
address,
hash_value,
child_address,
child_number,
plan_hash_value,
SUBSTR (sql_text,1,40) sql_text
--SUBSTR (sql_fulltext,1,40) sql_fulltext
from v$sql
where
sql_text like 'SELECT dt.DataID%';


SQL_ID ADDRESS HASH_VALUE CHILD_ADDRESS CHILD_NUMBER PLAN_HASH_VALUE SQL_TEXT
------------- ---------------- ---------- ---------------- ------------ --------------- ----------------------------------------
118680kxcjcbp 000000027E909D08 3133714805 000000027E909B80 0 570399632 SELECT dt.DataID, dt.CreateDate, dt.SubT
0artw48qap06b 000000027E6A5498 749371595 000000027E6A5268 0 4269733020 SELECT dt.DataID, dt.CreateDate, dt.SubT






This one works:

select * from table( dbms_xplan.display_cursor( '118680kxcjcbp', 0 ) );

Plan hash value: 570399632

---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 1602K(100)| |
| 1 | SORT ORDER BY | | 85M| 6369M| 7271M| 1602K (3)| 01:48:14 |
|* 2 | HASH JOIN | | 85M| 6369M| | 8726 (35)| 00:00:36 |
| 3 | NESTED LOOPS | | 2803 | 186K| | 809 (1)| 00:00:04 |
| 4 | NESTED LOOPS | | 2699 | 102K| | 538 (1)| 00:00:03 |
|* 5 | TABLE ACCESS BY INDEX ROWID| DTREE | 2712 | 78648 | | 266 (1)| 00:00:02 |
|* 6 | INDEX RANGE SCAN | RECD_DTREE_INDEX1 | 3397 | | | 1 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | DTREE_RECD_OPINIONSUMMARY | 1 | 10 | | 1 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | RECD_DOCID_MIMETYPE | 1 | 29 | | 1 (0)| 00:00:01 |
|* 9 | INDEX FULL SCAN | DTREE_RECD_OPINIONSUMMARY | 8676K| 82M| | 5022 (3)| 00:00:21 |
---------------------------------------------------------------------------------------------------------------------

This one does not work:

select * from table( dbms_xplan.display_cursor( '0artw48qap06b', 0 ) );

Plan hash value: 4269733020

---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 2088 (100)| |
| 1 | SORT ORDER BY | | 48021 | 3657K| 4184K| 2088 (11)| 00:00:09 |
| 2 | NESTED LOOPS | | 48021 | 3657K| | 1198 (18)| 00:00:05 |
| 3 | NESTED LOOPS | | 2 | 136 | | 3 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 2 | 78 | | 2 (0)| 00:00:01 |
|* 5 | TABLE ACCESS BY INDEX ROWID| DTREE | 2 | 58 | | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | RECD_DTREE_INDEX1 | 2 | | | 1 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | DTREE_RECD_OPINIONSUMMARY | 1 | 10 | | 1 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | RECD_DOCID_MIMETYPE | 1 | 29 | | 1 (0)| 00:00:01 |
|* 9 | INDEX FULL SCAN | DTREE_RECD_OPINIONSUMMARY | 30550 | 298K| | 597 (18)| 00:00:03 |
---------------------------------------------------------------------------------------------------------------------


Tom Kyte
January 17, 2012 - 11:04 am UTC

in looking at the plan that isn't working so well, are the estimated cardinalities correct or way off - use your knowledge of the data to answer that - look at the filter/access predicates and ask yourself "would that filter return that many rows or many many more"

I am going to guess that the cardinality estimate is WAY off - and it is centered around your date column.

Locking statistics, if they are not representative of the data, would not be the way to go about fixing it.

Query Does Not Finish - Again (follow-up)

Jim Cox, January 17, 2012 - 4:14 pm UTC

Hi Tom

the reason i locked the stats down was because the last time i ran the sql it was taking so long that the stats changed over night and that was a problem, so this time i locked them down so they would not change.

the sql that ran fast using a date 0f 2012/01/09 took less than a minute and returned 10186 rows

the sql that took 8 hours using a date 0f 2012/01/10 returned 7790 rows


is that what you wanted to know ?

if this is a cardinality problem, how would i go about fixing it ?

I am not up to speed on cardinality

Thanks
Jim
Tom Kyte
January 17, 2012 - 4:32 pm UTC

No, i want to know if the rows column in the above plans are

a) close to reality, it will really be getting those numbers of rows from each step in the plan

b) far off from reality, the number of rows ESTIMATED to be retrieved by each and every step of the plan is so far away from what really is going to be happening.


for example, this part of the plan:

|*  5 |      TABLE ACCESS BY INDEX ROWID| DTREE                     |     2 |
|*  6 |       INDEX RANGE SCAN          | RECD_DTREE_INDEX1         |     2 |



we think we are going to get 2 rows. Does that match your expectation - if you look at the plan (you have the filters and access predicates - I cannot see them) would you think that where clause would return only 2 rows? or 2,000,000 rows?



Query Does Not Finish - Again (follow-up)

Jim Cox, January 18, 2012 - 10:45 am UTC

Hi Tom

i am not understanding what you want me to do to get you
what you need to help me. I am going to have to plead stupid on this one. My apologies.
Does this info help at all:

Thanks for yor Patience
Jim

Predicate Information (identified by operation id):
---------------------------------------------------

5 - filter("DT"."SUBTYPE"=144)
6 - access("DT"."CREATEDATE">=TO_DATE(' 2012-01-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
7 - access("DT"."PARENTID"="PARENT"."DATAID")
filter("PARENT"."SUBTYPE"<>154)
8 - access("DT"."DATAID"="DVD"."DOCID")
filter(("DVD"."MIMETYPE"<>'image/jpeg' AND "DVD"."MIMETYPE"<>'application/octet-stream' AND
"DVD"."MIMETYPE"<>'image/gif' AND "DVD"."MIMETYPE"<>'application/vnd.xfdl' AND
"DVD"."MIMETYPE"<>'application/x-zip-compressed' AND "DVD"."MIMETYPE"<>'image/png' AND
"DVD"."MIMETYPE"<>'image/tiff' AND "DVD"."MIMETYPE"<>'image/bmp' AND
"DVD"."MIMETYPE"<>'application/vnd.ms-binder' AND "DVD"."MIMETYPE"<>'drawing/dwg' AND
"DVD"."MIMETYPE"<>'video/x-ms-wmv' AND "DVD"."MIMETYPE"<>'application/gzip-compressed' AND
"DVD"."MIMETYPE"<>'application/x-dbf' AND "DVD"."MIMETYPE"<>'audio/x-wav' AND
"DVD"."MIMETYPE"<>'application/vnd.ms-access' AND "DVD"."MIMETYPE"<>'audio/mp3' AND
"DVD"."MIMETYPE"<>'image/x-wmf' AND "DVD"."MIMETYPE"<>'video/quicktime' AND
"DVD"."MIMETYPE"<>'video/x-msvideo' AND "DVD"."MIMETYPE"<>'application/x-ms-wmz' AND
"DVD"."MIMETYPE"<>'image/psd' AND "DVD"."MIMETYPE"<>'image/tga' AND "DVD"."MIMETYPE"<>'video/mpeg' AND
"DVD"."MIMETYPE"<>'application/outlook-pst' AND "DVD"."MIMETYPE"<>'application/x-dpa' AND
"DVD"."MIMETYPE"<>'application/x-gzip' AND "DVD"."MIMETYPE"<>'application/x-java-class' AND
"DVD"."MIMETYPE"<>'application/x-ldb' AND "DVD"."MIMETYPE"<>'application/x-lnk' AND
"DVD"."MIMETYPE"<>'application/x-max' AND "DVD"."MIMETYPE"<>'application/x-mpx' AND
"DVD"."MIMETYPE"<>'application/x-pps' AND "DVD"."MIMETYPE"<>'application/x-python-bytecode' AND
"DVD"."MIMETYPE"<>'application/x-shs' AND "DVD"."MIMETYPE"<>'application/x-trn' AND
"DVD"."MIMETYPE"<>'audio/basic' AND "DVD"."MIMETYPE"<>'audio/mpeg' AND "DVD"."MIMETYPE"<>'audio/x-aiff' AND
"DVD"."MIMETYPE"<>'image/ief' AND "DVD"."MIMETYPE"<>'image/pcx' AND "DVD"."MIMETYPE"<>'image/x-bmp' AND
"DVD"."MIMETYPE"<>'image/x-cmu-raster' AND "DVD"."MIMETYPE"<>'image/x-pcx' AND
"DVD"."MIMETYPE"<>'image/x-pic' AND "DVD"."MIMETYPE"<>'image/x-pict' AND
"DVD"."MIMETYPE"<>'image/x-portable-anymap' AND "DVD"."MIMETYPE"<>'image/x-portable-graymap' AND
"DVD"."MIMETYPE"<>'image/x-portable-pixmap' AND "DVD"."MIMETYPE"<>'image/x-rgb' AND
"DVD"."MIMETYPE"<>'image/x-xbitmap' AND "DVD"."MIMETYPE"<>'image/x-xpixmap' AND
"DVD"."MIMETYPE"<>'image/x-xwindowdump' AND "DVD"."MIMETYPE"<>'video/asx' AND
"DVD"."MIMETYPE"<>'video/x-sgi-movie'))
9 - filter(("OWNER"."SUBTYPE"<>162 AND "OWNER"."SUBTYPE"<>148 AND "OWNER"."SUBTYPE"<>161 AND
"DT"."OWNERID"=(-"OWNER"."DATAID")))
Tom Kyte
January 18, 2012 - 12:25 pm UTC

does the where clause:

5 - filter("DT"."SUBTYPE"=144)
6 - access("DT"."CREATEDATE">=TO_DATE(' 2012-01-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

(and those together) against that table return 2 rows, that is what we are assuming it will do.



Query Does Not Finish - Again (follow-up)

Jim Cox, January 18, 2012 - 1:19 pm UTC

No Sir.
it does not



SQL> select
  2  count (*)
  3  from
  4  dtree
  5  where
  6  subtype=144
  7  and
  8  CREATEDATE >=TO_DATE('2012-01-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss');

  COUNT(*)
----------
      8285


SQL> select
  2  count (*)
  3  from
  4  dtree
  5  where
  6  subtype=144;

  COUNT(*)
----------
   6951135


SQL> select
  2  count (*)
  3  from
  4  dtree
  5  where
  6  CREATEDATE >=TO_DATE('2012-01-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss');

  COUNT(*)
----------
     10913

Tom Kyte
January 18, 2012 - 1:45 pm UTC

so, here I'm going to have to blame the statistics.

If you run this with dynamic sampling set to 10, what happens.

alter session set OPTIMIZER_DYNAMIC_SAMPLING = 10;

what do the estimated cardinalities and the plan become.

I've a feeling that your date is falling off of the end of the statistics, we are thinking "very very very few dates are greater than jan-10th", when in fact many are.

We our guess (2 in this case) is way off from reality (8,285) - good plans go bad.



Query Finishes Now

Jim Cox, January 18, 2012 - 2:58 pm UTC

Hi Tom

came back in about 1-2 minutes this time

explanantion would be greatly appreciated why it now works as advertised


Thanks
Jim

alter session set OPTIMIZER_DYNAMIC_SAMPLING = 10;

Execution Plan
----------------------------------------------------------
Plan hash value: 570399632

---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 261M| 19G| | 4892K (3)| 05:30:22 |
| 1 | SORT ORDER BY | | 261M| 19G| 21G| 4892K (3)| 05:30:22 |
|* 2 | HASH JOIN | | 261M| 19G| | 15528 (58)| 00:01:03 |
| 3 | NESTED LOOPS | | 8564 | 568K| | 1659 (1)| 00:00:07 |
| 4 | NESTED LOOPS | | 8246 | 314K| | 832 (1)| 00:00:04 |
|* 5 | TABLE ACCESS BY INDEX ROWID| DTREE | 8285 | 234K| | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | RECD_DTREE_INDEX1 | 2 | | | 1 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | DTREE_RECD_OPINIONSUMMARY | 1 | 10 | | 1 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | RECD_DOCID_MIMETYPE | 1 | 29 | | 1 (0)| 00:00:01 |
|* 9 | INDEX FULL SCAN | DTREE_RECD_OPINIONSUMMARY | 8676K| 82M| | 5022 (3)| 00:00:21 |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("DT"."OWNERID"=(-"OWNER"."DATAID"))
5 - filter("DT"."SUBTYPE"=144)
6 - access("DT"."CREATEDATE">=TO_DATE(' 2012-01-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
7 - access("DT"."PARENTID"="PARENT"."DATAID")
filter("PARENT"."SUBTYPE"<>154)
8 - access("DT"."DATAID"="DVD"."DOCID")
filter("DVD"."MIMETYPE"<>'image/jpeg' AND "DVD"."MIMETYPE"<>'application/octet-stream' AND
"DVD"."MIMETYPE"<>'image/gif' AND "DVD"."MIMETYPE"<>'application/vnd.xfdl' AND
"DVD"."MIMETYPE"<>'application/x-zip-compressed' AND "DVD"."MIMETYPE"<>'image/png' AND
"DVD"."MIMETYPE"<>'image/tiff' AND "DVD"."MIMETYPE"<>'image/bmp' AND
"DVD"."MIMETYPE"<>'application/vnd.ms-binder' AND "DVD"."MIMETYPE"<>'drawing/dwg' AND
"DVD"."MIMETYPE"<>'video/x-ms-wmv' AND "DVD"."MIMETYPE"<>'application/gzip-compressed' AND
"DVD"."MIMETYPE"<>'application/x-dbf' AND "DVD"."MIMETYPE"<>'audio/x-wav' AND
"DVD"."MIMETYPE"<>'application/vnd.ms-access' AND "DVD"."MIMETYPE"<>'audio/mp3' AND
"DVD"."MIMETYPE"<>'image/x-wmf' AND "DVD"."MIMETYPE"<>'video/quicktime' AND
"DVD"."MIMETYPE"<>'video/x-msvideo' AND "DVD"."MIMETYPE"<>'application/x-ms-wmz' AND
"DVD"."MIMETYPE"<>'image/psd' AND "DVD"."MIMETYPE"<>'image/tga' AND "DVD"."MIMETYPE"<>'video/mpeg' AND
"DVD"."MIMETYPE"<>'application/outlook-pst' AND "DVD"."MIMETYPE"<>'application/x-dpa' AND
"DVD"."MIMETYPE"<>'application/x-gzip' AND "DVD"."MIMETYPE"<>'application/x-java-class' AND
"DVD"."MIMETYPE"<>'application/x-ldb' AND "DVD"."MIMETYPE"<>'application/x-lnk' AND
"DVD"."MIMETYPE"<>'application/x-max' AND "DVD"."MIMETYPE"<>'application/x-mpx' AND
"DVD"."MIMETYPE"<>'application/x-pps' AND "DVD"."MIMETYPE"<>'application/x-python-bytecode' AND
"DVD"."MIMETYPE"<>'application/x-shs' AND "DVD"."MIMETYPE"<>'application/x-trn' AND
"DVD"."MIMETYPE"<>'audio/basic' AND "DVD"."MIMETYPE"<>'audio/mpeg' AND "DVD"."MIMETYPE"<>'audio/x-aiff' AND
"DVD"."MIMETYPE"<>'image/ief' AND "DVD"."MIMETYPE"<>'image/pcx' AND "DVD"."MIMETYPE"<>'image/x-bmp' AND
"DVD"."MIMETYPE"<>'image/x-cmu-raster' AND "DVD"."MIMETYPE"<>'image/x-pcx' AND
"DVD"."MIMETYPE"<>'image/x-pic' AND "DVD"."MIMETYPE"<>'image/x-pict' AND
"DVD"."MIMETYPE"<>'image/x-portable-anymap' AND "DVD"."MIMETYPE"<>'image/x-portable-graymap' AND
"DVD"."MIMETYPE"<>'image/x-portable-pixmap' AND "DVD"."MIMETYPE"<>'image/x-rgb' AND
"DVD"."MIMETYPE"<>'image/x-xbitmap' AND "DVD"."MIMETYPE"<>'image/x-xpixmap' AND
"DVD"."MIMETYPE"<>'image/x-xwindowdump' AND "DVD"."MIMETYPE"<>'video/asx' AND
"DVD"."MIMETYPE"<>'video/x-sgi-movie')
9 - filter("OWNER"."SUBTYPE"<>162 AND "OWNER"."SUBTYPE"<>148 AND "OWNER"."SUBTYPE"<>161)

Note
-----
- dynamic sampling used for this statement









Tom Kyte
January 18, 2012 - 3:40 pm UTC

it has to do with the stats - your stats *were not correct*, dynamic sampling accidentally fixed that for you when you ran the query.


read this article:

http://www.oracle.com/technetwork/issue-archive/2009/09-jan/o19asktom-086775.html


Query Finishes Now - Followup

Jim Cox, January 18, 2012 - 5:03 pm UTC

Hi Tom


i read the article you sent me to so I guess my question is what are my options

it seems that the only way this sql will finish in 1-2 minutes
is if the dynamic sampling is set to 9 or 10 according to my tests.

currently dynamic sampling uses the default of 2 for 10g.

what's baffling is the fact that this sql had a very slow completion time starting on the 12th
and then just last night around 12am this morning the sql started responding within 1-2 minutes
again and is running fine right now, so far

i am waiting for the next slow down time right now

this is sql that the vendor runs though their app and it runs every 5 minutes (except when it does not finish)
so it is not something that is run via sqlplus, which would allow me to set the sampling before it would be run

Thanks
Jim

Tom Kyte
January 18, 2012 - 6:04 pm UTC

it seems that the only way this sql will finish in 1-2 minutes
is if the dynamic sampling is set to 9 or 10 according to my tests.


or if you gathered statistics that included the current set of data - tell me, when the stats were gathered on this table - was 10-jan-2012 in there yet? Does 10-jan-2012 *exist* in your statistics. I know it is your data, but when the stats where gathered - was it there?

dynamic sampling is a very good thing however when you have literals in your sql, at level 3 and above (the levels are described in that article), it'll validate "guesses" - guesses made when you have predicates involving more than one column. So, it isn't necessarily a bad thing.


dynamic sampling can be set

o statement by statement - perhaps not applicable for you.

o session by session (you could have a logon trigger that alters the session of the sessions you want to have this on, the application doesn't have to do it)

o at the database level, I'm not a huge fan of this one, I'd prefer to isolate it - to keep the database defaults mostly at the defaults. But push come to shove, this is a parameter that is settable at the instance level as well. In certain reporting/warehouse situations - it would even be advisable to set it, spend a little extra time parsing to get the best plan possible.

Query Finishes Now - Followup

Jim Cox, January 19, 2012 - 2:16 pm UTC

Hi Tom

Question:
was 10-jan-2012 in there yet? Does 10-jan-2012 *exist* in your statistics. I know it is your data, but when the stats where gathered - was it there?


Answer:
the answer to your question is that the stats on the table were updated on 01-09 at 23:07 and the tests were run using the date of 01-10; so no sir

i locked the stats when i did the import for the tests so they would not effect my query since it took 8 hours to complete.

so on my production database right now the stats were updated on 01-17-2012 at 22:35 and the query date is 01-17-2012 and it is running at 5 minute intervals like it should

are you thinking that once the sql uses 01-18-2012 or 01-19-2012 that the query will run slow again because the stats will have not been updated to include the future date(s)?

OEM job is controlling when stats are gathered, so does that mean i should be updating the stats on this table and indexes more often ?


Thanks
Jim
Tom Kyte
January 19, 2012 - 2:26 pm UTC

the answer to your question is that the stats on the table were updated on
01-09 at 23:07 and the tests were run using the date of 01-10; so no sir,


and that is the root cause.

i locked the stats when i did the import for the tests so they would not effect
my query since it took 8 hours to complete.


and that is an underlying reason.... your stats were not representative of the data contained in your table.


and it is running at 5 minute intervals
like it should


what is running? the query or stats gathering?



are you thinking that once the sql uses 01-18-2012 or 01-19-2012 that the query
will run slow again because the stats will have not been updated to include the
future date(s)?


if your stats fail to be representative of the data, yes, you'll get bad cardinality estimates and potentially the wrong plan due to that.


OEM job is controlling when stats are gathered, so does that mean i should be
updating the stats on this table and indexes more often ?


maybe.

you could

a) gather stats more frequently, but that is resource and labor intensive

b) recognize this column high value is constantly changing - and use dbms_stats.set_column_stats to let us know that - set the new high value to today, or tomorrow. That way, you don't have to physically gather stats too often. The relative size of the table might not be changing much - but the high value is, just let us know that.




Query Finishes Now - Followup

Jim Cox, January 19, 2012 - 3:46 pm UTC

Hi Tom

Remark:
i locked the stats when i did the import for the tests so they would not effect my query since it took 8 hours to complete.

and that is an underlying reason.... your stats were not representative of the data contained in your table.

Comment:
but Tom, my stats will only be representative of the data for the query when the stats happened to be updated on the same date that is used in the query


Remark:
and it is running at 5 minute intervals like it should

what is running? the query or stats gathering?

Comment:
the query runs at 5 minute intervals

the stats do not seem to get updated for days (last few times were 12-27-11, 1-9-12, 1-17-12)
so the date used in the select will not be incorporated in the stats at times causing the problem



Recommendation:
b) recognize this column high value is constantly changing - and use dbms_stats.set_column_stats to let us know that - set the new high value to today, or tomorrow.
That way, you don't have to physically gather stats too often.
The relative size of the table might not be changing much - but the high value is, just let us know that.

Request:
can you provide me example on how to go about this or a link

thanks
jim
Tom Kyte
January 20, 2012 - 9:33 am UTC

Query Finishes Now - Followup

Jim Cox, January 20, 2012 - 11:17 am UTC

Hi Tom

i will take a look at it

just an fyi,

Confirmation:
as expected last night at midnight when the sql ran with the date not in the stats, it bogged down again.
stats were updated on the 17th and the date in the sql is the 18th.

Thanks for all your help on this one
Jim

Query Finishes Now - Question

Jim Cox, January 20, 2012 - 3:21 pm UTC

Hi Tom

I came across this statement on a web site
would you agree or not agree with what the writer says ?

"Remember, the dbms_stats.set_column_stats is ONLY used for tuning testing and the set_column_stats statement should never be used in a production environment".

Thanks
Jim
Tom Kyte
January 23, 2012 - 3:50 pm UTC

that would be very short sighted.

WE (Oracle) use dbms_stats.set* routines!!! (in production no less! in every production database 10g and above actually...) AWR uses it, when they create a new partition it is empty - that would be bad for the optimizer to create plans when it is empty and run those as they fill up. We'd rather use plans that were generated by the optimizer thinking the partition is FULL (a plan generated by the optimizer thinking a segment is 'big' generally works ok when it is small. a plan generated by the optimizer thinking a segment is 'small' generally does NOT work ok when it is big)

we use copy table stats to move over the last partitions stats to use as a starting point and then use the set routines to adjust high/low values for things like dates and such.

we come back and gather the real stats later, when the data is there to be gathered against.


so, to say "never" is "not smart".


Never say never
Never say always
I always say


if you google up that quote you gave me, it is very easy to find the source. Beware of articles that

a) don't have any examples
b) are rife with obvious typos (distant?)
c) have code samples that would never compile (hence have never been run)
d) make blanket statements without a shred of evidence that what is being stated is actually true
e) give no reasoning behind opinions
f) ====>>>> DO NOT GIVE YOU THE ABILITY TO QUESTION THE CONTENT <<<<=======

be very very very afraid of articles like that.

The bit about "blanket statements without a shred of evidence" is perhaps the most important part. How often do you see me say "you should do X". How often do you see me say "you should do X because.... and here are some numbers/scenarios that bear this out......."

Or, on second thought, maybe the reason (f) is the most important to be scared of some information. No comment ability, no feedback ability - nothing. No way to say "yeah, but what about....", or "I don't agree - for these reasons....". Commenting is probably the most important part. eh, they are both relevant I guess.

It takes more work to do what I do - but I think you get much more reliable advice out of it.


dbms_stats.set_column_stats would be an *excellent* procedure to use in production, YOU ALREADY ARE!!!! Everyone of you (running 10g and above anyway....)


The optimizer does not care one whit were the statistics come from, the optimizer only cares that the statistics are REPRESENTATIVE of the data in the table.



If the author of that article wants to suggest that you need to gather stats on your table to bump a high value for a given column, fine - so be it. I think that would be rather short sighted and inefficient myself.


So, to answer your question

a) No, I do not agree even a tiny bit with that author
b) I think that advice is short sighted and wrong

Query Finishes Now - Question Followup

Jim Cox, January 23, 2012 - 2:25 pm UTC

Hi Tom


thanks for your words of wisdom on my question and for your time
appreciate your point of view

after i ented the question i found an note and copied this code from 1276174.1
ran the tests and they worked, but when i go to
substitute my owner, table, and column i get the below errors


SQL> DECLARE
  2     srec               DBMS_STATS.STATREC;
  3     v_distcnt          NUMBER;
  4     v_density          NUMBER;
  5     v_nullcnt          NUMBER;
  6     v_avgclen          NUMBER;
  7     numvals            DBMS_STATS.NUMARRAY;
  8     charvals           DBMS_STATS.CHARARRAY;
  9     datevals           DBMS_STATS.DATEARRAY;
 10
 11  BEGIN
 12
 13     -- Date Type
 14
 15      DBMS_STATS.get_column_stats (ownname      => 'LLTEST',
 16                                   tabname      => 'DTREE',
 17                                   colname      => 'CREATEDATE',
 18                                   distcnt      => v_distcnt,
 19                                   density      => v_density,
 20                                   nullcnt      => v_nullcnt,
 21                                   srec         => srec,
 22                                   avgclen      => v_avgclen
 23                                  );
 24
 25
 26     datevals := DBMS_STATS.datearray (sysdate, sysdate);
 27  --   datevals := DBMS_STATS.datearray (to_date('1601-01-01','yyyy-mm-dd'),sysdate);
 28
 29     DBMS_STATS.prepare_column_values (srec, datevals);
 30
 31     DBMS_STATS.set_column_stats (ownname      => 'LLTEST',
 32                                  tabname      => 'DTREE',
 33                                  colname      => 'CREATEDATE',
 34                                  distcnt      => v_distcnt,
 35                                  density      => v_density,
 36                                  nullcnt      => v_nullcnt,
 37                                  srec         => srec,
 38                                  avgclen      => v_avgclen
 39                                 );
 40
 41     COMMIT;
 42  END;
 43  /
DECLARE
*
ERROR at line 1:
ORA-20001: Invalid or inconsistent input values
ORA-06512: at "SYS.DBMS_STATS", line 5012
ORA-06512: at line 29 


it is having issues with 
DBMS_STATS.prepare_column_values (srec, datevals);

any idea on what could be the issue
my date low value is 01-01-1601 for some reason so at first
i thought that was the issue, but tried using it and the same issue
that is why it is commented out

SELECT 
low_value, 
to_char (raw_to_date(low_value),'yyyy-mm-dd') low_value_actual, 
high_value, 
to_char (raw_to_date(high_value),'yyyy-mm-dd')  high_value_actual
FROM 
user_tab_col_statistics
WHERE 
table_name = 'DTREE'
AND 
column_name = 'CREATEDATE';

LOW_VALUE          LOW_VALUE_ACTUAL       HIGH_VALUE        HIGH_VALUE_ACTUAL
------------------ ---------------------- ----------------- --------------------
74650101010101     1601-01-01             78700109171C19    2012-01-09



Thanks
Jim

is storing calculative columns in plsql is better than writed decode clause in sql

GR, January 25, 2012 - 5:13 am UTC

Hi Tom,

I've requirement to convert row data to column after some calculation. Like I've a set of rows which has different units for 106 rows. I've written a sql code using dcode and max with group by to get these 106 records in one row and 106 units in columns defined in the target table. The sql codes shows as below with a group by on common cols. The statement takes a lot of time to complete. I would like to know if I can write it in a pl/sql rather in sql and store all the calculative col values in variables and make one insert statement at the end of the loop will help me to reduce the processing time .I think declaring so many variables in pl/sql block and running in a loop for 106 times will take more time than below sql. Please suggest the best approch.

MAX(DECODE(gf.date,0,gf.UNS,0)) uns0,
MAX(DECODE(gf.date,0,gf.UNS,0)) uns1,
...
MAX(DECODE(gf.date,0,gf.UNS,0)) uns105



Regards,
GR
Tom Kyte
January 26, 2012 - 10:06 am UTC

My mantra is:

if you can do it in a single sql statement - do it
if you cannot, use as little plsql as possible to do it
if plsql doesn't work for some sound technical reason, use as little java/c as possible

so, stick with sql.

Point for Jim Cox from El Segundo, CA USA

Ian, January 25, 2012 - 9:17 am UTC

Hi Jim

One thing you should be aware of is that your low value of 01-Jan-1610 is going to seriously affect your cardinality estimates and could result in bad plans.

Consider you have 1 years worth of data in your table evenly distributed at the rate of 1 row every minute of every hour of every day from 01-Jan-2011 to 31-Dec-2011. That's 525,600 rows. Suppose you have an index on CREATEDATE.

If you ran the following query, which will return 50% of the table...

select *
from dtree
where createdate between to_date('01-jan-2011','dd-mon-yyyy')
                     and to_date('01-jun-2011','dd-mon-yyyy')-1;


...The CBO will estimate a cardinality of 262,800 and go (correctly) for a Full Table Scan.

Now insert 1 row with a createdate of 01-Jan-1610 and re-gather stats.

Now for the same query the CBO will estimate the cardinality as 539 and use (incorrectly) the index.

Why 539? Simplified - to get the estimated cardinality the CBO figures out how many days between the low and high values, divides the total number of rows by that, then multiplies by how many days we want. The number of days between the low and high values should be 365 - but our single 01-Jan-1610 changes that to 146,826 days!

Beware false low values!

Regards

Ian

Point for Jim Cox from El Segundo, CA USA - Reply

Jim Cox, January 25, 2012 - 11:35 am UTC

Hi Ian

i understand your point but the data was loaded years ago and there was no verification on the date validity. I have dates in there from 1601 to 1956. So that is a problem. I guess I can determine a correct date for those records, but if i cannot, i am stuck with the date in the column.

Thanks
Jim

Hi Jin

Ian, January 26, 2012 - 4:55 am UTC

Hi Jim

I guess you could figure out the first "genuine" date and then use set_column_stats to set that as the low value.

Cheers

Ian

Hi Jim - Followup

Jim Cox, January 26, 2012 - 1:21 pm UTC

Hi Ian

i fixed all the bogus dates and gathered stats on the table but still getting same error when i try to adjust column in table:

It has to be something with the column because if i run a test using Note 1276174.1, it works

Only thing I am changing is the table and column


Thanks
Jim

TABLE_NAME COLUMN_NAME LOW_VAL HIGH_VAL
------------------------------ ------------------------------ --------- ---------
DTREE CREATEDATE 01-JAN-85 24-JAN-12


DECLARE
srec DBMS_STATS.STATREC;
v_distcnt NUMBER;
v_density NUMBER;
v_nullcnt NUMBER;
v_avgclen NUMBER;
numvals DBMS_STATS.NUMARRAY;
charvals DBMS_STATS.CHARARRAY;
datevals DBMS_STATS.DATEARRAY;

BEGIN

-- Date Type

DBMS_STATS.get_column_stats (ownname => 'LLTEST',
tabname => 'DTREE',
colname => 'CREATEDATE',
distcnt => v_distcnt,
density => v_density,
nullcnt => v_nullcnt,
srec => srec,
avgclen => v_avgclen
);


datevals := DBMS_STATS.datearray (sysdate-365, sysdate);

DBMS_STATS.prepare_column_values (srec, datevals);

DBMS_STATS.set_column_stats (ownname => 'LLTEST',
tabname => 'DTREE',
colname => 'CREATEDATE',
distcnt => v_distcnt,
density => v_density,
nullcnt => v_nullcnt,
srec => srec,
avgclen => v_avgclen
);

COMMIT;
END;
/
DECLARE
*
ERROR at line 1:
ORA-20001: Invalid or inconsistent input values
ORA-06512: at "SYS.DBMS_STATS", line 5012
ORA-06512: at line 28

SQlplan

Anand, July 04, 2012 - 4:32 am UTC

Hi Tom,

Currently i am using below query to find out the query having high CPU cost in production .But for some sql_id where CPU cost is very high(in top 5 cpu cost) is coming quickly.Is there any thing wrong on this.Please suggest.

select distinct a.sql_id,
b.module,
operation,
options,
cost,
cardinality,
cpu_cost,
io_cost,
object_name,
timestamp
from v$sql_plan a, gv$sql b
where a.sql_id = b.sql_id
and parsing_schema_name = 'NCBSHOST'
and module not in ('PL/SQL Developer')
and object_owner = 'NCBSHOST'
and cpu_cost > 0
order by cpu_cost desc


Tom Kyte
July 05, 2012 - 7:07 am UTC

you do know the cost is a guess right? An estimate?

you should be looking at ACTUAL real world response times, not the guesstimate.

Anand, July 08, 2012 - 10:10 pm UTC

Hi Tom,

But the sql_plan is the actual plan capture while executing the query.Sorry if i am wrong .

Tom Kyte
July 11, 2012 - 10:50 am UTC

the plan is a guess, it is developed way before we ever run the query.

it is a *guess*, an estimate.

look at real world performance.

Bulk Insert

anand, July 10, 2012 - 1:18 pm UTC

Hi Tom

We have GTT table having 3 column lets say a,b,c
And c is session id having default value userenv('sessionid').

We insert lots of rows by reading another table(test).

insert into gtt(a,b)
select a,b from test

but when i changed my query with below i got 10 min of benefit on my batch

var_l_session_id NUMBER := userenv('sessionid')
insert into gtt (a,b,c)
select a,b,var_l_session_id from test.


Can u explain me the reason why we got such a gud benefit with this change.
Tom Kyte
July 11, 2012 - 5:20 pm UTC

if you call a function a lot, what do you think would happen if you don't call it a lot?

It would take less time.


since global temporary tables are visible only to the current session anyway, you could make this faster by dropping column c and never calling userenv - why are you doing that since it always has a single value in that table - never never would a query see more than one value.

Anand, October 13, 2012 - 10:05 am UTC

Hi Tom,

I have a table xyz having 1548854 as below :

Name Type Nullable Default Comments
-------------------- ------------ -------- -------------------------------------------------- --------
COD_CC_BRN NUMBER(5)
COD_PROD NUMBER(5)
COD_ACCT_NO CHAR(16)
COD_CUST NUMBER(10)
COD_DORM_XFER_UNIT CHAR(1)
CTR_DORM_XFER_VALUE NUMBER(5)
CTR_UNCLM_XFER_VALUE NUMBER(5)
DAT_NEXT_PROC_CR DATE Y
DAT_NEXT_PROC_DR DATE Y
DAT_TO_CLOSE_AC DATE Y
CTR_DORMANT NUMBER(5)
COD_OFFICR_ID VARCHAR2(36) Y
NAM_CUST_SHRT VARCHAR2(60) Y
NAM_CCY_SHORT VARCHAR2(9) Y
BAL_AVAILABLE NUMBER Y 0
DAT_LAST_TXN DATE Y
COD_ENTITY_VPD NUMBER(5) NVL(sys_context('CLIENTCONTEXT','entity_code'),11)

and index on below column :

create unique index IN_XYZ_1 on XYZ (COD_ACCT_NO, COD_ENTITY_VPD) ;

create index IN_XYZ_2 on XYZ (COD_PROD, COD_CC_BRN, COD_ENTITY_VPD) ;

Now While going through AWR found that below select query on
batch have high CPU cost :

SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE
((dat_next_proc_cr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_dr <= dat_next_proc_cr )
OR ( dat_next_proc_dr BETWEEN var_dat_last_eod AND var_dat_process


so now decide to create index on it and change the query as below :

(SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE
(dat_next_proc_cr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_dr <= dat_next_proc_cr )
UNION
SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE ( dat_next_proc_dr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_cr <= dat_next_proc_dr));

CREATE INDEX IN_xyz_3 ON xyz (DAT_NEXT_PROC_DR,DAT_NEXT_PROC_CR,COD_ENTITY_VPD)
/
CREATE INDEX IN_xyz_4 ON xyz (DAT_TO_CLOSE_AC,COD_ENTITY_VPD)
/


Is it i am doing in correct way.Please suggest.

Thanks alot.
Tom Kyte
October 15, 2012 - 8:30 am UTC

since you do not have your unique key in the select list, you may be removing records with the UNION (union does a distinct),

ops$tkyte%ORA11GR2> create table t ( x int, y int, z int );

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into t values ( 1, 1, 1 );

1 row created.

ops$tkyte%ORA11GR2> insert into t values ( 1, 1, 1 );

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from t where (y=1) or (z=1);

         X          Y          Z
---------- ---------- ----------
         1          1          1
         1          1          1

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from t where (y=1)
  2  union
  3  select * from t where (z=1);

         X          Y          Z
---------- ---------- ----------
         1          1          1


on the other hand, if you turn it into a union all - you might get duplicate records you didn't want....

ops$tkyte%ORA11GR2> create table t ( x int, y int, z int );

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into t values ( 1, 1, 1 );

1 row created.

ops$tkyte%ORA11GR2> insert into t values ( 1, 2, 1 );

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from t where (y=1) or (z=1);

         X          Y          Z
---------- ---------- ----------
         1          1          1
         1          2          1

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from t where (y=1)
  2  union all
  3  select * from t where (z=1);

         X          Y          Z
---------- ---------- ----------
         1          1          1
         1          1          1
         1          2          1


so, you tell me - does that second query answer the question you are interested in asking?

Index

Anand, October 15, 2012 - 6:15 am UTC

Hi Tom,

Please ignore the above post.

I have a table xyz having 1548854 as below :

Name Type Nullable Default
Comments
-------------------- ------------ -------- --------------------------------------------------
--------
COD_CC_BRN NUMBER(5)

COD_PROD NUMBER(5)

COD_ACCT_NO CHAR(16)

COD_CUST NUMBER(10)

COD_DORM_XFER_UNIT CHAR(1)

CTR_DORM_XFER_VALUE NUMBER(5)

CTR_UNCLM_XFER_VALUE NUMBER(5)

DAT_NEXT_PROC_CR DATE Y

DAT_NEXT_PROC_DR DATE Y

DAT_TO_CLOSE_AC DATE Y

CTR_DORMANT NUMBER(5)

COD_OFFICR_ID VARCHAR2(36) Y

NAM_CUST_SHRT VARCHAR2(60) Y

NAM_CCY_SHORT VARCHAR2(9) Y

BAL_AVAILABLE NUMBER Y 0

DAT_LAST_TXN DATE Y

COD_ENTITY_VPD NUMBER(5) NVL(sys_context('CLIENTCONTEXT','entity_code'),11)


and index on below column :

create unique index IN_XYZ_1 on XYZ (COD_ACCT_NO, COD_ENTITY_VPD) ;

create index IN_XYZ_2 on XYZ (COD_PROD, COD_CC_BRN, COD_ENTITY_VPD) ;

Now While going through AWR found that below select query on
batch have high CPU cost :

SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE
((dat_next_proc_cr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_dr <= dat_next_proc_cr )
OR ( dat_next_proc_dr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_cr <= dat_next_proc_dr));


so now decide to create index on it and change the query as below :

(SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE
(dat_next_proc_cr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_dr <= dat_next_proc_cr )
UNION
SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE ( dat_next_proc_dr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_cr <= dat_next_proc_dr));

CREATE INDEX IN_xyz_3 ON xyz (DAT_NEXT_PROC_DR,DAT_NEXT_PROC_CR,COD_ENTITY_VPD)
/
CREATE INDEX IN_xyz_4 ON xyz (DAT_TO_CLOSE_AC,COD_ENTITY_VPD)
/


Is it i am doing in correct way.Please suggest.

Thanks alot.

Tom Kyte
October 15, 2012 - 10:34 am UTC

see my answer above, it is unchanged

anand, October 15, 2012 - 12:24 pm UTC

Hi Tom,

I want distinct reord that's why I use union.
Do you think its correct way to do ?

As you always say if possible do it in single query and here I have splited into 2 query with union.

Please suggest.my concern is just to pickup proper index which I will create.

"The best thing happen to oracle is that we have tom"
Tom Kyte
October 15, 2012 - 2:11 pm UTC

Do you think its correct way to do ?


you tell me - is it the correct answer to the question you are asking? only you - truly, only you, can answer that.

You have a single query. I don't see two queries?


there is something missing here - why is that column that ends with VPD in there, I presume there might be a VPD policy on the table? how about all of the details here...

Anand, October 15, 2012 - 11:32 pm UTC

Hi Tom,

Below is the old query :

SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE
((dat_next_proc_cr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_dr <= dat_next_proc_cr )
OR ( dat_next_proc_dr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_cr <= dat_next_proc_dr));


And now new query splited into two query by using union :

(SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE
(dat_next_proc_cr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_dr <= dat_next_proc_cr )
UNION
SELECT
COD_CC_BRN ,
COD_PROD ,
COD_ACCT_NO ,
COD_CUST ,
COD_DORM_XFER_UNIT ,
CTR_DORM_XFER_VALUE ,
CTR_UNCLM_XFER_VALUE ,
DAT_NEXT_PROC_CR ,
DAT_NEXT_PROC_DR ,
DAT_TO_CLOSE_AC ,
CTR_DORMANT ,
COD_OFFICR_ID ,
NAM_CUST_SHRT ,
NAM_CCY_SHORT ,
BAL_AVAILABLE ,
DAT_LAST_TXN
FROM
xyz
WHERE ( dat_next_proc_dr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_cr <= dat_next_proc_dr));


Yes Tom.That VPD is policy.

Can You tell what is missing ? I am sorry if i have asked a sily question.

My concerned is that when i create new index as below query should come out quickly :

CREATE INDEX IN_xyz_3 ON xyz (DAT_NEXT_PROC_DR,DAT_NEXT_PROC_CR,COD_ENTITY_VPD)
Tom Kyte
October 16, 2012 - 9:33 am UTC

what is the vpd policy - that is, what does the query REALLY look like after the query rewrite due to vpd would take place. that is what you need to tune.

I'm almost positively sure that the *_vpd column should be FIRST in the index index you are probably doing something like adding a predcate

"cod_entity_vpd = SOMETHING"

things that use equals should always go first in the index, things that range (>, <, like, between, etc) should go second, things that can help us avoid hitting the table altogether last.

for the where clause of the second part of the union:

WHERE ( dat_next_proc_dr BETWEEN var_dat_last_eod AND var_dat_process
AND dat_next_proc_cr <= dat_next_proc_dr));


the index should *probably* be:

cod_entity_vpd, dat_next_proc_dr, dat_next_proc_cr

you want cod_entity_vpd first because you use equals on it.
you'd want dat_next_proc_dr second so that we can process the between on it
and dat_next_proc_cr last so we can do that filter

and only rows that match on all three conditions would have us go to the table and pick up the row.

Anand, October 16, 2012 - 11:21 pm UTC

Hi Tom,

Thanks alot....

Can you tell me or give some reference site how to use "filter predict" while tuning some query ?
Tom Kyte
October 17, 2012 - 3:12 pm UTC

I don't know what you mean - there is a filter predicate and access predicates (access => index access typically, filter means we got the row maybe via a full scan, maybe via an index and will filter it more)

it is just a label, not a tuning tool.

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.