tune these max statements with subquery
Question and Answer

Connor McDonald

Thanks for the question, josh.

Asked: May 16, 2003 - 7:27 pm UTC

Last updated: December 09, 2022 - 4:13 am UTC

Version: 8.1.7

Viewed 50K+ times.

You Asked

How can I convince my developers there is a better way to do this?


SELECT MAX (weight) weight
FROM client.pet_weight
WHERE pet_id = :pet_id
AND setup_date =
(SELECT MAX (setup_date)
FROM client.pet_weight
WHERE pet_id = :pet_id)

and Tom said...

the answer is always -- BENCHMARK IT....

ops$tkyte@ORA920> create table t ( weight int, pet_id int, setup_date date );

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> insert into t ( pet_id, weight, setup_date )
2 select mod(rownum,1000), dbms_random.random, sysdate-1000+dbms_random.value(0,1000)
3 from all_objects
4 /

31611 rows created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> create index t_idx on t( pet_id, setup_date, weight );

Index created.

ops$tkyte@ORA920> analyze table t compute statistics
2 for table
3 for all indexes
4 for all indexed columns;

Table analyzed.



SELECT max(weight) from t where pet_id = :b1
and setup_date = (select max(setup_date)
from t
where pet_id = :b1 )

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1000 0.42 0.45 0 2000 0 0
Fetch 1000 0.06 0.04 0 2005 0 1000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2001 0.48 0.50 0 4005 0 1000

Rows Row Source Operation
------- ---------------------------------------------------
1000 SORT AGGREGATE (cr=2005 r=0 w=0 time=33826 us)
1000 INDEX RANGE SCAN T_IDX (cr=2005 r=0 w=0 time=24134 us)(object id 49798)

that is the current way -- 1/2 cpu second, 4,000 logical IOs..

********************************************************************************
SELECT weight from (select weight from t where pet_id = :b1
order by pet_id DESC, setup_date DESC, weight DESC )
where rownum = 1

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1000 0.21 0.20 0 0 0 0
Fetch 1000 0.06 0.07 0 2000 0 1000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2001 0.28 0.28 0 2000 0 1000

Rows Row Source Operation
------- ---------------------------------------------------
1000 COUNT STOPKEY (cr=2000 r=0 w=0 time=66046 us)
1000 VIEW (cr=2000 r=0 w=0 time=52862 us)
1000 INDEX RANGE SCAN DESCENDING T_IDX (cr=2000 r=0 w=0 time=46646 us)(object id 49798)

that gets the same results -- in about 1/2 the cpu time with 1/2 the LIO's..

and if they cannot use order by in a subquery, just declare a cursor that looks like the following and open/fetch 1 row/close it:
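A sketch of that open/fetch one row/close pattern in PL/SQL (the cursor and variable names here are illustrative, reusing the `t` table and `:b1` bind from the benchmark above):

```sql
declare
  cursor c is
    select weight
      from t
     where pet_id = :b1
     order by pet_id desc, setup_date desc, weight desc;
  l_weight t.weight%type;
begin
  open c;
  fetch c into l_weight;  -- take only the first (most recent) row
  close c;
end;
/
```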



********************************************************************************
SELECT weight from t where pet_id = :b1
order by pet_id DESC, setup_date DESC, weight DESC

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1000 0.25 0.19 0 0 0 0
Fetch 1000 0.05 0.05 0 2000 0 1000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2001 0.31 0.25 0 2000 0 1000

Rows Row Source Operation
------- ---------------------------------------------------
1000 INDEX RANGE SCAN DESCENDING T_IDX (cr=2000 r=0 w=0 time=46575 us)(object id 49798)

for similar results



Comments

tuning

lakshmi, May 19, 2003 - 12:26 am UTC

Excellent

One minor question

Alla, May 19, 2003 - 7:27 am UTC

Tom;

You re-wrote the statement the following way:

SELECT weight from (select weight from t where pet_id = :b1
order by pet_id DESC, setup_date DESC, weight DESC )
where rownum = 1


Why did you say "order by pet_id", when your pet_id is a single value?

Thanks

Alla

Tom Kyte
May 19, 2003 - 7:55 am UTC

just helps ensure the index is used. doesn't hurt.

Too good

Arun Gupta, May 19, 2003 - 9:44 pm UTC

Tom
You need to share with us the secret of how you think of these clever ways of rewriting a query...

Tom Kyte
May 20, 2003 - 7:32 am UTC

think in sets, know what is available to you, understand SQL. it all comes from that.

Too Good

umesh, May 20, 2003 - 1:38 am UTC

That was great Tom, you keep me in "MY JOB". Thanks Tom,
keep up the good work

Too Good

Aijaz, May 20, 2003 - 8:23 am UTC

Excellent workaround.

Vipin, May 20, 2003 - 9:40 am UTC

Excellent Tom

inline view

A reader, May 20, 2003 - 11:07 am UTC

Just wondering, why is the predicate not pushed (or the view merged) into the inline view?

SELECT weight from (select weight from t where pet_id = :b1
order by pet_id DESC, setup_date DESC, weight DESC )
where rownum = 1


pushed to

SELECT weight from (select weight from t where pet_id = :b1
and rownum = 1
order by pet_id DESC, setup_date DESC, weight DESC )


Tom Kyte
May 20, 2003 - 1:36 pm UTC

because rownum is special, like aggregates, order by, set operations, distinct, etc.

you can actually use rownum to stop merging from happening
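Because ROWNUM blocks view merging, an always-true `rownum > 0` predicate inside an inline view is a common (if informal) way to force the view to be evaluated as written, for example:

```sql
-- the inline view below cannot be merged into the outer query
-- because it references ROWNUM (here via an always-true predicate)
select *
  from (select weight
          from t
         where pet_id = :b1
           and rownum > 0);
```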

CONVERT SQL

Clab, May 20, 2003 - 11:41 am UTC

CONVERT SQL

SELECT /*+ INDEX_DESC(T T_IDX) */
WEIGHT
FROM T
WHERE PET_ID = :B1
AND ROWNUM = 1

Tom Kyte
May 20, 2003 - 1:38 pm UTC

that is a terrible approach.

don't do that anyone!

if you have that in your code, get rid of that developer....


hints are hints, they are not directives, not commands.

if you want the right answer, all of the time, please use my approach.


If you use this approach, with a hint, you are basically saying "just give me any row for that pet_id -- thanks".

Don't do this
Don't do this
Don't do this
Don't do this
(just keep repeating that over and over and over)

simply superb!!!!!!

praveen, May 20, 2003 - 11:36 pm UTC

Excellent solution.

curious!

A Reader, May 21, 2003 - 2:33 am UTC

I am just wondering: if you push the predicate into the inner query and the result is the same, then why is there a need for the outer one?


SELECT weight from (select weight from t where pet_id = :b1
and rownum = 1
order by pet_id DESC, setup_date DESC, weight DESC )


it looks similar to

select weight from t
where pet_id = :b1
and rownum = 1
order by pet_id DESC, setup_date DESC, weight DESC

Tom, what is the difference? Could you please explain more?

Thank you.

Tom Kyte
May 21, 2003 - 8:36 am UTC

they are not the same.

We want to

a) sort
b) then get the first row


Your query with it "pushed" would

a) get A row
b) sort it


quite different.
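The distinction shows up on any engine that lets you limit before or after sorting; a small sketch in Python with SQLite (LIMIT standing in for ROWNUM, hypothetical data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (pet_id int, setup_date int, weight int)")
conn.executemany("insert into t values (?, ?, ?)",
                 [(1, 10, 5), (1, 20, 9), (1, 30, 7)])

# a) sort, then take the first row: the weight at the latest setup_date
latest_weight = conn.execute(
    "select weight from t where pet_id = 1 "
    "order by setup_date desc limit 1").fetchone()[0]

# b) take A row, then sort it: whichever single row the scan happened to hit
arbitrary_weight = conn.execute(
    "select weight from (select weight from t where pet_id = 1 limit 1) "
    "order by weight desc").fetchone()[0]

# latest_weight is deterministically the answer; arbitrary_weight is merely
# some weight for that pet_id
```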

How do I optimize this query? It is taking more than 10 min....

siva, May 21, 2003 - 3:02 am UTC

Hi TOM,
Could you please tell me how to optimize this query?
I created indexes on all joining columns. The tables have 5 lakh, 1 lakh, 10 thousand, 1 thousand and 500 rows.

It is taking 10-15 minutes to run the report from this query. I'm waiting for your reply.

Thanks & Regards,
Siva

SELECT
JOB.EMPLID,
JOB.DEPTID,
JOB.SETID_DEPT ,
job.LOCATION,
job.SETID_LOCATION,
loc.descr ,
ADDRESS.NAME_PREFIX,
ADDRESS.FIRST_NAME,
ADDRESS.MIDDLE_NAME,
ADDRESS.LAST_NAME,
ADDRESS.ADDRESS1,
ADDRESS.ADDRESS2,
ADDRESS.ADDRESS3,
ADDRESS.ADDRESS4,
ADDRESS.CITY,
ADDRESS.STATE,
ADDRESS.POSTAL,
ADDRESS.COUNTRY,
SCHEME.GRANT_DT
FROM PS_JOB JOB,
PS_S_SS_PERDATA_VW ADDRESS,
ps_location_tbl Loc,
PS_S_SS_SCHEME SCHEME ,
PS_S_SS_CTRY_SETUP COUNTRY
WHERE 1=1
AND JOB.REG_TEMP = 'R'
AND JOB.EMPL_CLASS IN ('F','U')
AND JOB.EMPL_STATUS IN ( 'A','L','S','P')
AND TO_CHAR(GRANT_DT,'MMYYYY') = '052003'
AND JOB.EFFDT =
( SELECT MAX(B.EFFDT)
FROM PS_JOB B
WHERE JOB.EMPLID =B.EMPLID
AND JOB.EMPL_RCD = B.EMPL_RCD
AND B.EFFDT <= sysdate )
AND JOB.EFFSEQ =
(SELECT MAX(C.EFFSEQ)
FROM PS_JOB C
WHERE JOB.EMPLID =C.EMPLID
AND JOB.EMPL_RCD =C.EMPL_RCD
AND JOB.EFFDT = C.EFFDT )
AND JOB.EMPL_RCD=0
AND Loc.effdt =
( Select max(LocEffdt.effdt) from ps_location_tbl LocEffdt
where loc.setid = LocEffdt.setid
and loc.location = LocEffdt.location
and LocEffdt.effdt <= sysdate )
AND LOC.EFF_STATUS='A'
AND COUNTRY.EFFDT =
( SELECT MAX(CNTRY.EFFDT)
FROM PS_CTRY_SETUP CNTRY
WHERE COUNTRY.COUNTRY = CNTRY.COUNTRY
AND CNTRY.EFFDT <= SYSDATE)
AND COUNTRY.EFF_STATUS = 'A'
AND JOB.EMPLID=ADDRESS.EMPLID
AND job.SETID_LOCATION = loc.setid
AND job.LOCATION = loc.location
AND loc.country = country.country
AND SCHEME.SCHEME_TYPE = COUNTRY.SCHEME_TYPE

/


Tom Kyte
May 21, 2003 - 8:45 am UTC

hopefully, it is not using your indexes, they could be the major cause of performance issues.

tell you what, when I'm taking questions -- feel free to post the query, a definition of a "lakh", a tkprof, and the output of a set autotrace traceonly explain.

Make sure you are using the CBO and look at ANALYTIC functions instead of those nasty correlated subqueries. For example, these two queries are the same:

select object_name, object_type, created
from t
where created = ( select max(created)
from t t2
where t2.object_type = t.object_type )
/
select object_name, object_type, created
from (
select object_name, object_type, created, max(created) over (partition by object_type) max_created
from t
)
where created = max_created
/


(create table t as select * from all_objects) but the second is more efficient.
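The equivalence of the two forms (and the single-pass analytic rewrite) can be checked on any engine with window functions; a small sketch in Python with SQLite (hypothetical data; needs SQLite 3.25+ for the window function):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (object_name text, object_type text, created int)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("A", "TABLE", 1), ("B", "TABLE", 3),
    ("C", "INDEX", 2), ("D", "INDEX", 5)])

# correlated subquery: conceptually re-scans t for each row's object_type
correlated = conn.execute("""
    select object_name, object_type, created
      from t
     where created = (select max(created) from t t2
                       where t2.object_type = t.object_type)
     order by object_name""").fetchall()

# analytic rewrite: one pass, max(created) per object_type carried on each row
analytic = conn.execute("""
    select object_name, object_type, created
      from (select object_name, object_type, created,
                   max(created) over (partition by object_type) max_created
              from t)
     where created = max_created
     order by object_name""").fetchall()
```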

question regadring complex view merging and push predicates

A reader, May 21, 2003 - 8:01 am UTC

Hello

Oracle says views with

Set operators (UNION, UNION ALL, INTERSECT, MINUS)
A CONNECT BY clause
A ROWNUM pseudocolumn
Aggregate functions (AVG, COUNT, MAX, MIN, SUM) in the select list

are not mergeble

and views with

A GROUP BY clause
A DISTINCT operator in the select list

can be merged with complex view merging. But given a view definition such as

CREATE VIEW avg_salary_view AS
SELECT department_id, AVG(salary) AS avg_sal_dept
FROM employees
GROUP BY department_id;

and a query

SELECT departments.location_id, avg_sal_dept
FROM departments, avg_salary_view
WHERE departments.department_id = avg_salary_view.department_id
AND departments.location_id = 2400;

Oracle merges it to

SELECT departments.location_id, AVG(salary)
FROM departments, employees
WHERE departments.department_id = employees.department_id
AND departments.location_id = 2400
GROUP BY departments.rowid, departments.location_id;

Why can it merge here? We have an aggregate function AVG() here!

And why is the following example, which is very similar to the one above,

CREATE VIEW emp_group_by_deptno
AS SELECT department_id,
AVG(salary) avg_sal,
MIN(salary) min_sal,
MAX(salary) max_sal
FROM employees
GROUP BY department_id;


SELECT *
FROM emp_group_by_deptno
WHERE department_id = 50;

transformed into

SELECT department_id,
AVG(salary) avg_sal,
MIN(salary) min_sal,
MAX(salary) max_sal
FROM employees
WHERE department_id = 50
GROUP BY department_id;

Why is the view not merged here; instead the predicate is pushed?

Both examples have aggregate functions and GROUP BY, but one can use complex view merging and the other cannot... All these examples are from the documentation.

Confused :-((

Tom Kyte
May 21, 2003 - 9:05 am UTC

and this relates to the question at hand how?

Lakh

Kalita, May 21, 2003 - 8:57 am UTC

One lakh = 100,000 in India :-)

Randy, May 21, 2003 - 10:41 am UTC

I have been trying to use what you have shown here but I have not had the success I'd hoped for. If you have time, please take a look at what I have.

The SQL is generated by our custom application for ad hoc reports. I have all the data the users can query in four tables. The sub-select is what is slowing it down.

Running the whole SQL takes ~28 sec and
62K physical reads. If I first run the sub-select and put those results into the main select, total runtime is less than 5 sec.

My database is OEE 8.1.7.3.4. V_RPT_ALL is a view of three tables RPT_CR(65k rows) and RPT_ASSOC(300k rows) have a many to many relationship with Cross_association in the middle of the two.

Here is the explain plan for the whole select and as two parts:

SELECT CRTABLE.CR_ID, CRTABLE.STATUS_ID, CRTABLE.DATEREQUESTED, CRTABLE.LOCATION, CRTABLE.AREA,
CRLARGE.DESCRIPTIONTEXT
FROM CMR.V_RPT_ALL CRTABLE,
CMR.V_CMRGETCMRTEXTDETAILS CRLARGE
WHERE CRTABLE.CR_ID IN
(SELECT CR_ID
FROM CMR.V_RPT_ALL A
WHERE DATEREQUESTED BETWEEN
TO_DATE('3/18/2003','mm/dd/yyyy')
AND TO_DATE('4/2/2003 23:59','mm/dd/yyyy hh24:mi')
AND LOCATION_ID IN (657)
AND SYS_TYPE_ID IN (208244)
AND SYS_NAME_ID IN (4885)
AND ASSOC_TYPE_ID = 8
GROUP BY CR_ID
HAVING COUNT (*) >= 1)
AND CRTABLE.CR_ID = CRLARGE.CR_ID
ORDER BY CRTABLE.CR_ID;

143 rows selected.

Elapsed: 00:00:28.05

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=12306 Card=7 Bytes=14567)
1 0 SORT (ORDER BY) (Cost=12306 Card=7 Bytes=14567)
2 1 NESTED LOOPS (Cost=12304 Card=7 Bytes=14567)
3 2 NESTED LOOPS (Cost=12297 Card=7 Bytes=14343)
4 3 NESTED LOOPS (Cost=12294 Card=1 Bytes=2041)
5 4 NESTED LOOPS (Cost=15 Card=1 Bytes=26)
6 5 VIEW OF 'VW_NSO_1' (Cost=14 Card=1 Bytes=13)
7 6 FILTER
8 7 SORT (GROUP BY) (Cost=14 Card=1 Bytes=33)
9 8 HASH JOIN (Cost=11 Card=1 Bytes=33)
10 9 NESTED LOOPS (Cost=6 Card=21 Bytes=462)
11 10 TABLE ACCESS (BY INDEX ROWID) OF 'RPT_ASSOC' (Cost=3 Card=1 Bytes=14)
12 11 INDEX (RANGE SCAN) OF 'AK_RPT_ASSOC_SYS_NAME' (NON-UNIQUE) (Cost=1 Card=1)
13 10 INDEX (RANGE SCAN) OF 'AK_CROSS_ASSOCIATION' (UNIQUE) (Cost=3 Card=403417 Bytes=3227336)
14 9 INDEX (RANGE SCAN) OF 'AK_RPT_CR' (NON-UNIQUE) (Cost=4 Card=867 Bytes=9537)
15 5 TABLE ACCESS (BY INDEX ROWID) OF 'RPT_CR' (Cost=1 Card=61103 Bytes=794339)
16 15 INDEX (UNIQUE SCAN) OF 'PK_RPT_CR' (UNIQUE)
17 4 VIEW OF 'V_CMRGETCMRTEXTDETAILS'
18 17 SORT (GROUP BY) (Cost=12279 Card=60957 Bytes=7375797)
19 18 MERGE JOIN (OUTER) (Cost=12279 Card=241598 Bytes=29233358)
20 19 SORT (JOIN) (Cost=126 Card=61490 Bytes=245960)
21 20 INDEX (FAST FULL SCAN) OF 'PK_CHANGEREQUEST' (UNIQUE) (Cost=8 Card=61490 Bytes=245960)
22 19 SORT (JOIN) (Cost=12153 Card=241598 Bytes=28266966)
23 22 TABLE ACCESS (BY INDEX ROWID) OF 'LARGE_TEXT' (Cost=5691 Card=241598 Bytes=28266966)
24 23 INDEX (FULL SCAN) OF 'PK_LARGE_TEXT' (UNIQUE) (Cost=606 Card=241598)
25 3 INDEX (RANGE SCAN) OF 'PK_CROSS_ASSOCIATON' (UNIQUE) (Cost=3 Card=403417 Bytes=3227336)
26 2 TABLE ACCESS (BY INDEX ROWID) OF 'RPT_ASSOC' (Cost=1 Card=191931 Bytes=6141792)
27 26 INDEX (UNIQUE SCAN) OF 'PK_RPT_ASSOC' (UNIQUE)

Statistics
----------------------------------------------------------
29 recursive calls
12 db block gets
6283 consistent gets
62428 physical reads
0 redo size
15119 bytes sent via SQL*Net to client
1424 bytes received via SQL*Net from client
11 SQL*Net roundtrips to/from client
12 sorts (memory)
3 sorts (disk)
143 rows processed

/* This is the sub-select. The "HAVING COUNT" part is necessary because of business rules. The Count >= Number will be equal to the number of IDs in the LOCATION_ID IN statement. */

SELECT CR_ID
FROM CMR.V_RPT_ALL A
WHERE DATEREQUESTED
BETWEEN TO_DATE('3/18/2003','mm/dd/yyyy')
AND TO_DATE ('4/2/2003 23:59','mm/dd/yyyy hh24:mi')
AND LOCATION_ID IN (657)
AND SYS_TYPE_ID IN (208244)
AND SYS_NAME_ID IN (4885)
AND ASSOC_TYPE_ID = 8
GROUP BY CR_ID
HAVING COUNT (*) >= 1;

35 rows selected.

Elapsed: 00:00:00.81

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=33)
1 0 FILTER
2 1 SORT (GROUP BY) (Cost=13 Card=1 Bytes=33)
3 2 HASH JOIN (Cost=11 Card=1 Bytes=33)
4 3 NESTED LOOPS (Cost=6 Card=21 Bytes=462)
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'RPT_ASSOC' (Cost=3 Card=1 Bytes=14)
6 5 INDEX (RANGE SCAN) OF 'AK_RPT_ASSOC_SYS_NAME' (NON-UNIQUE) (Cost=1 Card=1)
7 4 INDEX (RANGE SCAN) OF 'AK_CROSS_ASSOCIATION' (UNIQUE) (Cost=3 Card=403417 Bytes=3227336)
8 3 INDEX (RANGE SCAN) OF 'AK_RPT_CR' (NON-UNIQUE) (Cost=4 Card=867 Bytes=9537)

Statistics
----------------------------------------------------------
7 recursive calls
0 db block gets
17 consistent gets
0 physical reads
0 redo size
1457 bytes sent via SQL*Net to client
647 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
35 rows processed

-- Main Select with results from sub-select

SELECT CRTABLE.CR_ID, CRTABLE.STATUS_ID,
CRTABLE.DATEREQUESTED, CRTABLE.LOCATION,
CRTABLE.AREA, CRLARGE.DESCRIPTIONTEXT
FROM CMR.V_RPT_ALL CRTABLE,
CMR.V_CMRGETCMRTEXTDETAILS CRLARGE
WHERE CRTABLE.CR_ID IN
(79645,79655,79657,79667,79692,79718,
79721,79726,79847,79888,80116,80166,
80172,80187,80216,80221,80241,80414,
80417,80418,80525,80551,80648,80854,
80923,80925,80931,80997,81002,81017,
81020,81169,81207,81208,81239)
AND CRTABLE.CR_ID = CRLARGE.CR_ID
ORDER BY CRTABLE.CR_ID;

143 rows selected.

Elapsed: 00:00:04.22

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=153 Card=1 Bytes=2068)
1 0 NESTED LOOPS (Cost=153 Card=1 Bytes=2068)
2 1 NESTED LOOPS (Cost=152 Card=1 Bytes=2036)
3 2 MERGE JOIN (Cost=149 Card=1 Bytes=2028)
4 3 VIEW OF 'V_CMRGETCMRTEXTDETAILS' (Cost=142 Card=35 Bytes=70525)
5 4 SORT (GROUP BY) (Cost=142 Card=35 Bytes=4235)
6 5 NESTED LOOPS (OUTER) (Cost=142 Card=139 Bytes=16819)
7 6 INLIST ITERATOR
8 7 INDEX (RANGE SCAN)OF 'PK_CHANGEREQUEST' (UNIQUE) (Cost=2 Card=35 Bytes=140)
9 6 TABLE ACCESS (BY INDEX ROWID) OF 'LARGE_TEXT' (Cost=4 Card=241598 Bytes=28266966)
10 9 INDEX (RANGE SCAN) OF 'PK_LARGE_TEXT' (UNIQUE) (Cost=3 Card=241598)
11 3 SORT (JOIN) (Cost=7 Card=35 Bytes=455)
12 11 INLIST ITERATOR
13 12 TABLE ACCESS (BY INDEX ROWID) OF 'RPT_CR' (Cost=5 Card=35 Bytes=455)
14 13 INDEX (RANGE SCAN) OF 'PK_RPT_CR' (UNIQUE) (Cost=2 Card=35)
15 2 INLIST ITERATOR
16 15 INDEX (RANGE SCAN) OF 'PK_CROSS_ASSOCIATON' (UNIQUE) (Cost=3 Card=231 Bytes=1848)
17 1 TABLE ACCESS (BY INDEX ROWID) OF 'RPT_ASSOC' (Cost=1 Card=191931 Bytes=6141792)
18 17 INDEX (UNIQUE SCAN) OF 'PK_RPT_ASSOC' (UNIQUE)

Statistics
----------------------------------------------------------
22 recursive calls
0 db block gets
4311 consistent gets
0 physical reads
0 redo size
15119 bytes sent via SQL*Net to client
1424 bytes received via SQL*Net from client
11 SQL*Net roundtrips to/from client
8 sorts (memory)
0 sorts (disk)
143 rows processed


Thanks

Tom Kyte
May 21, 2003 - 11:22 am UTC

what does something like:

select *
from (
SELECT CRTABLE.CR_ID, CRTABLE.STATUS_ID, CRTABLE.DATEREQUESTED,
CRTABLE.LOCATION, CRTABLE.AREA,
CRLARGE.DESCRIPTIONTEXT,
max(case when daterequested between TO_DATE('3/18/2003','mm/dd/yyyy') AND TO_DATE('4/2/2003 23:59','mm/dd/yyyy hh24:mi')
and location_id in (657)
and SYS_TYPE_ID IN (208244)
AND SYS_NAME_ID IN (4885)
AND ASSOC_TYPE_ID = 8
then 1
else null
end) over(partition by crtable.cr_id) flag
FROM CMR.V_RPT_ALL CRTABLE,
CMR.V_CMRGETCMRTEXTDETAILS CRLARGE
where CRTABLE.CR_ID = CRLARGE.CR_ID
)
where flag = 1
ORDER BY CR_ID;


do

Randy, May 21, 2003 - 11:37 am UTC

Not so good. I also tried moving the sub-select up into the FROM part but that didn't help either.

143 rows selected.

Elapsed: 00:01:74.80

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=56610 Card=398550 By
tes=846520200)
1 0 VIEW (Cost=56610 Card=398550 Bytes=846520200)
2 1 WINDOW (BUFFER)
3 2 MERGE JOIN (Cost=56610 Card=398550 Bytes=828186900)
4 3 MERGE JOIN (Cost=9309 Card=402035 Bytes=25328205)
5 4 SORT (JOIN) (Cost=8949 Card=403417 Bytes=20170850)
6 5 MERGE JOIN (Cost=3902 Card=403417 Bytes=20170850)
7 6 SORT (JOIN) (Cost=2366 Card=191931 Bytes=8061102)
8 7 TABLE ACCESS (FULL) OF 'RPT_ASSOC' (Cost=260 Card=191931 Bytes=8061102)
9 6 SORT (JOIN) (Cost=1536 Card=403417 Bytes=3227336)
10 9 TABLE ACCESS (FULL) OF 'CROSS_ASSOCIATION' (Cost=59 Card=403417 Bytes=3227336)
11 4 SORT (JOIN) (Cost=361 Card=61103 Bytes=794339)
12 11 TABLE ACCESS (FULL) OF 'RPT_CR' (Cost=166 Card=61103 Bytes=794339)
13 3 SORT (JOIN) (Cost=47301 Card=60957 Bytes=122828355)
14 13 VIEW OF 'V_CMRGETCMRTEXTDETAILS' (Cost=12279 Card=60957 Bytes=122828355)
15 14 SORT (GROUP BY) (Cost=12279 Card=60957 Bytes=7375797)
16 15 MERGE JOIN (OUTER) (Cost=12279 Card=241598 Bytes=29233358)
17 16 SORT (JOIN) (Cost=126 Card=61490 Bytes=245960)
18 17 INDEX (FAST FULL SCAN) OF 'PK_CHANGEREQUEST' (UNIQUE) (Cost=8 Card=61490 Bytes=245960)
19 16 SORT (JOIN) (Cost=12153 Card=241598 Bytes=28266966)
20 19 TABLE ACCESS (BY INDEX ROWID) OF 'LARGE_TEXT' (Cost=5691 Card=241598 Bytes=28266966)
21 20 INDEX (FULL SCAN) OF 'PK_LARGE_TEXT' (UNIQUE) (Cost=606 Card=241598)


Statistics
----------------------------------------------------------
0 recursive calls
197 db block gets
13753 consistent gets
55425 physical reads
0 redo size
15174 bytes sent via SQL*Net to client
1424 bytes received via SQL*Net from client
11 SQL*Net roundtrips to/from client
0 sorts (memory)
9 sorts (disk)
143 rows processed



A reader, May 21, 2003 - 12:42 pm UTC

Hey Tom,

As many visitors of this web site are Indian, maybe you should get used to some Indian phrases :-)

Randy, May 21, 2003 - 1:27 pm UTC

Tom,  Thanks for your help.  I think I found the problem.

The view V_CMRGETCMRTEXTDETAILS is used to flatten the text data into four columns.

CREATE OR REPLACE VIEW CMR.V_CMRGETCMRTEXTDETAILS (
   CR_ID,
   DESCRIPTIONTEXT,
   CHANGEPLANTEXT,
   TESTPLANTEXT,
   BACKOUTPLANTEXT )
AS
(SELECT CR.CR_ID,
             MAX (DECODE (D.TEXT_TYPE_ID, 1, D.TEXT, NULL)) AS DESCRIPTIONTEXT,
             MAX (DECODE (D.TEXT_TYPE_ID, 2, D.TEXT, NULL)) AS CHANGEPLANTEXT,
             MAX (DECODE (D.TEXT_TYPE_ID, 3, D.TEXT, NULL)) AS TESTPLANTEXT,
             MAX (DECODE (D.TEXT_TYPE_ID, 4, D.TEXT, NULL)) AS BACKOUTPLANTEXT
        FROM CMR.LARGE_TEXT D, CMR.CHANGEREQUEST CR
       WHERE CR.CR_ID = D.CR_ID(+)
    GROUP BY CR.CR_ID)
/

SQL> DESC LARGE_TEXT
 Name                   Null?    Type
 --------------------- -------- ----------------
 CR_ID                  NOT NULL NUMBER(15)
 TEXT_TYPE_ID           NOT NULL NUMBER(15)
 VERSION                NOT NULL NUMBER(15)
 TEXT                            VARCHAR2(4000)

I created a table via (select * from V_CMRGETCMRTEXTDETAILS)
and it returned very quickly.

I guess I will take this route unless you can suggest a better way.

Thanks for your time.
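The MAX(DECODE(...)) pattern in that view is a row-to-column pivot: one output row per CR_ID, one column per TEXT_TYPE_ID. The same idea can be sketched portably with CASE (Python/SQLite, hypothetical data and column names):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table large_text (cr_id int, text_type_id int, txt text)")
conn.executemany("insert into large_text values (?, ?, ?)", [
    (100, 1, "description"), (100, 2, "change plan"),
    (100, 3, "test plan"),   (100, 4, "backout plan")])

# pivot: CASE tags each row's text into one column, MAX collapses the group
row = conn.execute("""
    select cr_id,
           max(case when text_type_id = 1 then txt end) as descriptiontext,
           max(case when text_type_id = 2 then txt end) as changeplantext,
           max(case when text_type_id = 3 then txt end) as testplantext,
           max(case when text_type_id = 4 then txt end) as backoutplantext
      from large_text
     group by cr_id""").fetchone()
```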
 

Query with explain plan

siva, May 22, 2003 - 7:31 am UTC

Hello TOM,
   I ran the query with autotrace on. Can you suggest how I can rewrite the second query? I have been fighting with this query for the past week.
   Is it possible to use a materialized view for this query? If so, which refresh method (COMPLETE REFRESH etc.) can I use?
   I used analytic functions, but the output is different. Why? How do I use an analytic function on the PS_JOB table, given there are
   two MAX functions with different logic?

SQL> get s_ss_r01_1_qry
  1  SELECT count(*)  FROM        PS_JOB JOB,
  2        PS_S_SS_PERDATA_VW ADDRESS,
  3        ps_location_tbl Loc,
  4        PS_S_SS_CTRY_SETUP  COUNTRY ,
  5        PS_S_SS_SCHEME_TBL  SCHEME
  6        WHERE 1=1
  7        AND JOB.REG_TEMP = 'R'
  8        AND JOB.EMPL_CLASS IN ('F','U')
  9        AND JOB.EMPL_STATUS IN ( 'A','L','S','P')
 10        AND JOB.EMPL_RCD=0
 11        AND TO_CHAR(S_SS_GRANT_DT,'MMYYYY') =  '052003'
 12        AND JOB.EMPLID=ADDRESS.EMPLID
 13        AND job.SETID_LOCATION = loc.setid
 14        AND job.LOCATION =  loc.location
 15        AND loc.country = country.country
 16        AND SCHEME.S_SS_SCHEME_TYPE = COUNTRY.S_SS_SCHEME_TYPE
 17        AND JOB.EFFDT = ( SELECT
 18                     MAX(B.EFFDT) FROM PS_JOB B
 19             WHERE JOB.EMPLID =B.EMPLID
 20             AND JOB.EMPL_RCD = B.EMPL_RCD AND B.EFFDT  <= sysdate)
 21             AND JOB.EFFSEQ = (SELECT MAX(C.EFFSEQ)
 22                     FROM PS_JOB C WHERE JOB.EMPLID =C.EMPLID
 23             AND JOB.EMPL_RCD =C.EMPL_RCD
 24                     AND JOB.EFFDT = C.EFFDT )
 25             AND Loc.effdt = ( Select max(LocEffdt.effdt)
 26             from ps_location_tbl LocEffdt where loc.setid = LocEffdt.setid and
 27             loc.location = LocEffdt.location and LocEffdt.effdt <= sysdate ) AND
 28             LOC.EFF_STATUS='A' AND COUNTRY.EFFDT = ( SELECT MAX(CNTRY.EFFDT) FROM
 29             PS_S_SS_CTRY_SETUP CNTRY WHERE COUNTRY.COUNTRY = CNTRY.COUNTRY AND
 30           CNTRY.EFFDT <= JOB.EFFDT ) AND COUNTRY.EFF_STATUS = 'A'
 31* ORDER BY      SCHEME.S_SS_SCHEME_TYPE,COUNTRY.COUNTRY,JOB.EMPLID
SQL> /

  COUNT(*)
----------
     27288


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=105)
   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       NESTED LOOPS (Cost=5 Card=1 Bytes=105)
   4    3         NESTED LOOPS (Cost=4 Card=1 Bytes=97)
   5    4           NESTED LOOPS (Cost=3 Card=1 Bytes=57)
   6    5             NESTED LOOPS (Cost=2 Card=1 Bytes=31)
   7    6               TABLE ACCESS (FULL) OF 'PS_S_SS_CTRY_SETUP' (C
          ost=1 Card=1 Bytes=18)

   8    6               TABLE ACCESS (BY INDEX ROWID) OF 'PS_S_SS_SCHE
          ME_TBL' (Cost=1 Card=1 Bytes=13)

   9    8                 INDEX (RANGE SCAN) OF 'PS1S_SS_SCHEME_TBL' (
          NON-UNIQUE) (Cost=1 Card=1)

  10    5             TABLE ACCESS (BY INDEX ROWID) OF 'PS_LOCATION_TB
          L' (Cost=1 Card=61 Bytes=1586)

  11   10               INDEX (RANGE SCAN) OF 'PSALOCATION_TBL' (NON-U
          NIQUE)

  12    4           TABLE ACCESS (BY INDEX ROWID) OF 'PS_JOB' (Cost=1
          Card=1 Bytes=40)

  13   12             INDEX (RANGE SCAN) OF 'PSDJOB' (NON-UNIQUE)
  14    3         INDEX (RANGE SCAN) OF 'PS_PERSONAL_DATA' (UNIQUE)
  15    2       SORT (AGGREGATE)
  16   15         FIRST ROW (Cost=3 Card=2 Bytes=38)
  17   16           INDEX (RANGE SCAN (MIN/MAX)) OF 'PSAJOB' (NON-UNIQ
          UE) (Cost=3 Card=2)

  18    2       SORT (AGGREGATE)
  19   18         FIRST ROW (Cost=3 Card=1 Bytes=22)
  20   19           INDEX (RANGE SCAN (MIN/MAX)) OF 'PSAJOB' (NON-UNIQ
          UE) (Cost=3 Card=1)

  21    2       SORT (AGGREGATE)
  22   21         FIRST ROW (Cost=2 Card=1 Bytes=20)
  23   22           INDEX (RANGE SCAN (MIN/MAX)) OF 'PS_LOCATION_TBL'
          (UNIQUE) (Cost=2 Card=1)

  24    2       SORT (AGGREGATE)
  25   24         FIRST ROW (Cost=2 Card=1 Bytes=12)
  26   25           INDEX (RANGE SCAN (MIN/MAX)) OF 'PS_S_SS_CTRY_SETU
          P' (UNIQUE) (Cost=2 Card=1)





Statistics
----------------------------------------------------------
          7  recursive calls
          4  db block gets
    3144530  consistent gets
       4336  physical reads
          0  redo size
        203  bytes sent via SQL*Net to client
        315  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
         29  sorts (memory)
          0  sorts (disk)
          1  rows processed



  1  SELECT count(1)
  2  FROM  PS_JOB JOB,  PS_S_SS_PERDATA_VW ADDRESS,
  3        (select setid, location, descr, COUNTRY, effdt from ( select setid, location, descr ,COUNTRY, eff_status
  4         effdt, max(effdt) over (partition by  setid, location) max_effdt
  5             from ps_location_tbl    )
  6             where effdt = max_effdt
  7            and effdt<=sysdate
  8            and eff_status = 'A' ) Loc,
  9        (select Country ,s_ss_scheme_type,effdt from
 10           ( select Country ,s_ss_scheme_type, eff_status,
 11           effdt, max(effdt) over (partition by Country ) max_effdt
 12      from PS_S_SS_CTRY_SETUP  )
 13      where effdt = max_effdt
 14      and effdt<=sysdate
 15      and eff_status = 'A' )  COUNTRY ,
 16        PS_S_SS_SCHEME_TBL  SCHEME
 17        WHERE 1=1
 18   AND JOB.REG_TEMP = 'R'
 19   AND JOB.EMPL_CLASS IN ('F','U')
 20   AND JOB.EMPL_STATUS IN ( 'A','L','S','P')
 21   AND JOB.EMPL_RCD=0
 22        AND TO_CHAR(S_SS_GRANT_DT,'MMYYYY') =  '052003'
 23        AND JOB.EMPLID=ADDRESS.EMPLID
 24        AND job.SETID_LOCATION = loc.setid
 25        AND job.LOCATION =  loc.location
 26        AND loc.country = country.country
 27        AND JOB.EFFDT = ( SELECT
 28                     MAX(B.EFFDT) FROM PS_JOB B
 29             WHERE JOB.EMPLID =B.EMPLID
 30             AND JOB.EMPL_RCD = B.EMPL_RCD AND B.EFFDT  <= sysdate)
 31       AND JOB.EFFSEQ = (SELECT MAX(C.EFFSEQ)
 32                     FROM PS_JOB C WHERE JOB.EMPLID =C.EMPLID
 33                     AND JOB.EMPL_RCD =C.EMPL_RCD
 34                     AND JOB.EFFDT = C.EFFDT )
 35*       AND SCHEME.S_SS_SCHEME_TYPE = COUNTRY.S_SS_SCHEME_TYPE
SQL> save job1 repl
Wrote file job1
SQL> /

  COUNT(1)
----------
     26563


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=44 Card=1 Bytes=122)
   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       NESTED LOOPS (Cost=44 Card=1 Bytes=122)
   4    3         NESTED LOOPS (Cost=43 Card=1 Bytes=114)
   5    4           NESTED LOOPS (Cost=39 Card=35 Bytes=2590)
   6    5             NESTED LOOPS (Cost=6 Card=1 Bytes=40)
   7    6               TABLE ACCESS (FULL) OF 'PS_S_SS_SCHEME_TBL' (C
          ost=1 Card=1 Bytes=13)

   8    6               VIEW
   9    8                 WINDOW (SORT)
  10    9                   TABLE ACCESS (FULL) OF 'PS_S_SS_CTRY_SETUP
          ' (Cost=1 Card=82 Bytes=1476)

  11    5             VIEW (Cost=34 Card=3526 Bytes=119884)
  12   11               WINDOW (SORT)
  13   12                 TABLE ACCESS (FULL) OF 'PS_LOCATION_TBL' (Co
          st=14 Card=3526 Bytes=91676)

  14    4           TABLE ACCESS (BY INDEX ROWID) OF 'PS_JOB' (Cost=1
          Card=1 Bytes=40)

  15   14             INDEX (RANGE SCAN) OF 'PSDJOB' (NON-UNIQUE)
  16    3         INDEX (RANGE SCAN) OF 'PS_PERSONAL_DATA' (UNIQUE)
  17    2       SORT (AGGREGATE)
  18   17         FIRST ROW (Cost=3 Card=2 Bytes=38)
  19   18           INDEX (RANGE SCAN (MIN/MAX)) OF 'PSAJOB' (NON-UNIQ
          UE) (Cost=3 Card=2)

  20    2       SORT (AGGREGATE)
  21   20         FIRST ROW (Cost=3 Card=1 Bytes=22)
  22   21           INDEX (RANGE SCAN (MIN/MAX)) OF 'PSAJOB' (NON-UNIQ
          UE) (Cost=3 Card=1)





Statistics
----------------------------------------------------------
         13  recursive calls
         12  db block gets
    1375223  consistent gets
       2612  physical reads
          0  redo size
        203  bytes sent via SQL*Net to client
        315  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
         33  sorts (memory)
          0  sorts (disk)
          1  rows processed

 

Tom Kyte
May 23, 2003 - 8:10 am UTC

when I'm taking questions -- feel free to put that there with all of the supporting information one would need to be able to give you a suggestion with. you know, tkprofs, descriptions of the tables, cardinalities, etc.

          L' (Cost=1 Card=61 Bytes=1586)

  11   10               INDEX (RANGE SCAN) OF 'PSALOCATION_TBL' (NON-U
          NIQUE)

  12    4           TABLE ACCESS (BY INDEX ROWID) OF 'PS_JOB' (Cost=1
          Card=1 Bytes=40)

  13   12             INDEX (RANGE SCAN) OF 'PSDJOB' (NON-UNIQUE)
  14    3         INDEX (RANGE SCAN) OF 'PS_PERSONAL_DATA' (UNIQUE)
  15    2       SORT (AGGREGATE)
  16   15         FIRST ROW (Cost=3 Card=2 Bytes=38)
  17   16           INDEX (RANGE SCAN (MIN/MAX)) OF 'PSAJOB' (NON-UNIQ
          UE) (Cost=3 Card=2)

  18    2       SORT (AGGREGATE)
  19   18         FIRST ROW (Cost=3 Card=1 Bytes=22)
  20   19           INDEX (RANGE SCAN (MIN/MAX)) OF 'PSAJOB' (NON-UNIQ
          UE) (Cost=3 Card=1)

  21    2       SORT (AGGREGATE)
  22   21         FIRST ROW (Cost=2 Card=1 Bytes=20)
  23   22           INDEX (RANGE SCAN (MIN/MAX)) OF 'PS_LOCATION_TBL'
          (UNIQUE) (Cost=2 Card=1)

  24    2       SORT (AGGREGATE)
  25   24         FIRST ROW (Cost=2 Card=1 Bytes=12)
  26   25           INDEX (RANGE SCAN (MIN/MAX)) OF 'PS_S_SS_CTRY_SETU
          P' (UNIQUE) (Cost=2 Card=1)





Statistics
----------------------------------------------------------
          7  recursive calls
          4  db block gets
    3144530  consistent gets
       4336  physical reads
          0  redo size
        203  bytes sent via SQL*Net to client
        315  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
         29  sorts (memory)
          0  sorts (disk)
          1  rows processed



  1  SELECT count(1)
  2  FROM  PS_JOB JOB,  PS_S_SS_PERDATA_VW ADDRESS,
  3        (select setid, location, descr, COUNTRY, effdt from ( select setid, location, descr, COUNTRY, eff_status,
  4         effdt, max(effdt) over (partition by  setid, location) max_effdt
  5             from ps_location_tbl    )
  6             where effdt = max_effdt
  7            and effdt<=sysdate
  8            and eff_status = 'A' ) Loc,
  9        (select Country ,s_ss_scheme_type,effdt from
 10           ( select Country ,s_ss_scheme_type, eff_status,
 11           effdt, max(effdt) over (partition by Country ) max_effdt
 12      from PS_S_SS_CTRY_SETUP  )
 13      where effdt = max_effdt
 14      and effdt<=sysdate
 15      and eff_status = 'A' )  COUNTRY ,
 16        PS_S_SS_SCHEME_TBL  SCHEME
 17        WHERE 1=1
 18   AND JOB.REG_TEMP = 'R'
 19   AND JOB.EMPL_CLASS IN ('F','U')
 20   AND JOB.EMPL_STATUS IN ( 'A','L','S','P')
 21   AND JOB.EMPL_RCD=0
 22        AND TO_CHAR(S_SS_GRANT_DT,'MMYYYY') =  '052003'
 23        AND JOB.EMPLID=ADDRESS.EMPLID
 24        AND job.SETID_LOCATION = loc.setid
 25        AND job.LOCATION =  loc.location
 26        AND loc.country = country.country
 27        AND JOB.EFFDT = ( SELECT
 28                     MAX(B.EFFDT) FROM PS_JOB B
 29             WHERE JOB.EMPLID =B.EMPLID
 30             AND JOB.EMPL_RCD = B.EMPL_RCD AND B.EFFDT  <= sysdate)
 31       AND JOB.EFFSEQ = (SELECT MAX(C.EFFSEQ)
 32                     FROM PS_JOB C WHERE JOB.EMPLID =C.EMPLID
 33                     AND JOB.EMPL_RCD =C.EMPL_RCD
 34                     AND JOB.EFFDT = C.EFFDT )
 35*       AND SCHEME.S_SS_SCHEME_TYPE = COUNTRY.S_SS_SCHEME_TYPE
SQL> save job1 repl
Wrote file job1
SQL> /

  COUNT(1)
----------
     26563


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=44 Card=1 Bytes=122)
   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       NESTED LOOPS (Cost=44 Card=1 Bytes=122)
   4    3         NESTED LOOPS (Cost=43 Card=1 Bytes=114)
   5    4           NESTED LOOPS (Cost=39 Card=35 Bytes=2590)
   6    5             NESTED LOOPS (Cost=6 Card=1 Bytes=40)
   7    6               TABLE ACCESS (FULL) OF 'PS_S_SS_SCHEME_TBL' (C
          ost=1 Card=1 Bytes=13)

   8    6               VIEW
   9    8                 WINDOW (SORT)
  10    9                   TABLE ACCESS (FULL) OF 'PS_S_SS_CTRY_SETUP
          ' (Cost=1 Card=82 Bytes=1476)

  11    5             VIEW (Cost=34 Card=3526 Bytes=119884)
  12   11               WINDOW (SORT)
  13   12                 TABLE ACCESS (FULL) OF 'PS_LOCATION_TBL' (Co
          st=14 Card=3526 Bytes=91676)

  14    4           TABLE ACCESS (BY INDEX ROWID) OF 'PS_JOB' (Cost=1
          Card=1 Bytes=40)

  15   14             INDEX (RANGE SCAN) OF 'PSDJOB' (NON-UNIQUE)
  16    3         INDEX (RANGE SCAN) OF 'PS_PERSONAL_DATA' (UNIQUE)
  17    2       SORT (AGGREGATE)
  18   17         FIRST ROW (Cost=3 Card=2 Bytes=38)
  19   18           INDEX (RANGE SCAN (MIN/MAX)) OF 'PSAJOB' (NON-UNIQ
          UE) (Cost=3 Card=2)

  20    2       SORT (AGGREGATE)
  21   20         FIRST ROW (Cost=3 Card=1 Bytes=22)
  22   21           INDEX (RANGE SCAN (MIN/MAX)) OF 'PSAJOB' (NON-UNIQ
          UE) (Cost=3 Card=1)





Statistics
----------------------------------------------------------
         13  recursive calls
         12  db block gets
    1375223  consistent gets
       2612  physical reads
          0  redo size
        203  bytes sent via SQL*Net to client
        315  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
         33  sorts (memory)
          0  sorts (disk)
          1  rows processed

 

Excellent work

A reader, June 04, 2003 - 5:26 am UTC

Excellent work....


Tuning Subquery

Praveen, June 11, 2003 - 9:27 am UTC

Hi Tom,
Can you pl. help me to tune this query,
csa_acc_perf->129 million records, and cln_pre_account-> 500,000 records.
SELECT a.account_id,
a.bill_sequence,
a.tot_amount,
a.bill_amount,
a.tot_vat,
a.bill_vat
FROM (SELECT account_id,
bill_sequence,
tot_amount,
bill_amount,
tot_vat,
bill_vat,
max(bill_sequence)
over (partition by account_id) max_bill_sequence
FROM csa_acc_perf ) a
where a.bill_sequence = max_bill_sequence
and exists (select 'X'
from cln_pre_account cln
where a.account_id = cln.account_id)


Tom Kyte
June 11, 2003 - 6:21 pm UTC

it'll go really fast if you don't run it (that's the easiest way to speed something up).

seriously -- the inline view and the way you have it coded could be harming you here. It precludes the predicate being pushed down -- so that query must materialize the inline view for all account_ids, even ones NOT in the cln_pre_account table.

Perhaps you mean to query:

select * from
(
SELECT account_id,
bill_sequence,
tot_amount,
bill_amount,
tot_vat,
bill_vat,
max(bill_sequence)
over (partition by account_id) max_bill_sequence
FROM csa_acc_perf
where account_id IN ( select account_id from cln_pre_account )
)
where bill_sequence = max_bill_sequence


or if account_id is known to be unique in cln_pre_account

select * from (
SELECT account_id,
bill_sequence,
tot_amount,
bill_amount,
tot_vat,
bill_vat,
max(bill_sequence)
over (partition by account_id) max_bill_sequence
FROM csa_acc_perf a, cln_pre_account b
where a.account_id = b.account_id )
where bill_sequence = max_bill_sequence
/


remember, analytics get done AFTER the where clause, so if you push them down into an inline view they must get done BEFORE the outer portion of the query can work on them -- hence, you may be applying analytics to lots of data that you need not.



A reader, June 12, 2003 - 12:34 am UTC


Very smart.

Kashif, August 15, 2003 - 4:35 pm UTC


rownum vs sub select

Alex, August 19, 2003 - 12:59 am UTC

Was just wondering whether I should be recommending rownum instead of a sub-select, and here is the evidence for rownum!

/*+ index_desc

Mark, October 15, 2003 - 10:11 am UTC

In the above example with the /*+ index_desc */ hint you say this is a bad approach.

Would you agree that it is useful in the following scenario to get the list of id's in latest order first ?

select /*+ index_desc ( d table_uk ) */ *
from drive_table
order by id desc

table_uk - unique key for id.

The plan for with and without the hint:


SELECT STATEMENT Optimizer MODE=CHOOSE 3 M 66000
SORT ORDER BY 3 M 348 M 66000
TABLE ACCESS FULL MRP.DRIVE_TABLE 3 M 348 M 7967

SELECT STATEMENT Optimizer MODE=CHOOSE 3 M 755514
TABLE ACCESS BY INDEX ROWID MRP.DRIVE_TABLE 3 M 348 M 755514
INDEX FULL SCAN DESCENDING MRP.TABLE_UK 3 M 8312


In my query against a large table the performance is massively improved - why doesn't the optimizer identify this plan by default (8.1.7.4)?



Tom Kyte
October 15, 2003 - 6:34 pm UTC

is it improved?

I cannot tell from your plan how many rows are coming out of this (I don't know what you used to dump the plan, so I can't tell which figures are the cost, card and bytes).

I'll guess 3 million.

to read 3 million rows via an index -- really utterly bad

it gets the first rows first -- but the last row could take days.

Now, if you want the first rows first, you can (and should) say /*+ first_rows */

that is a 'safe' hint. it gives the optimizer something to work with - it is not dictating HOW to get first rows first, just that you would like the first rows first.

Else, it is thinking -- hmmm, lots of rows - index = kiss of death.

Hmm

Mark, October 16, 2003 - 10:02 am UTC

We want the last 10 orders in a 3million order table:

apps@apr11i.devl> set autotrace traceonly
apps@apr11i.devl> SELECT *
FROM (
SELECT /*+ index_desc ( a mrp_gross_requirements_u1) */ *
FROM mrp_gross_requirements a
ORDER BY demand_id DESC )
WHERE ROWNUM <=10
2 3 4 5 6 7
apps@apr11i.devl> /

10 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=755514 Card=3763361
Bytes=1802649919)

1 0 COUNT (STOPKEY)
2 1 VIEW (Cost=755514 Card=3763361 Bytes=1802649919)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'MRP_GROSS_REQUIREMEN
TS' (Cost=755514 Card=3763361 Bytes=365046017)

4 3 INDEX (FULL SCAN DESCENDING) OF 'MRP_GROSS_REQUIREME
NTS_U1' (UNIQUE) (Cost=8312 Card=3763361)





Statistics
----------------------------------------------------------
11 recursive calls
0 db block gets
31 consistent gets
0 physical reads
0 redo size
5021 bytes sent via SQL*Net to client
355 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
10 rows processed

1 SELECT *
2 FROM (
3 SELECT *
4 FROM mrp_gross_requirements a
5 ORDER BY demand_id DESC )
6* WHERE ROWNUM <=10
apps@apr11i.devl> /

10 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=66000 Card=3763361 B
ytes=1802649919)

1 0 COUNT (STOPKEY)
2 1 VIEW (Cost=66000 Card=3763361 Bytes=1802649919)
3 2 SORT (ORDER BY STOPKEY) (Cost=66000 Card=3763361 Bytes
=365046017)

4 3 TABLE ACCESS (FULL) OF 'MRP_GROSS_REQUIREMENTS' (Cos
t=7967 Card=3763361 Bytes=365046017)





Statistics
----------------------------------------------------------
17 recursive calls
4 db block gets
62519 consistent gets
62507 physical reads
0 redo size
5021 bytes sent via SQL*Net to client
355 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
10 rows processed

And First Rows:

1 SELECT *
2 FROM (
3 SELECT /*+ First_Rows */ *
4 FROM mrp_gross_requirements a
5 ORDER BY demand_id DESC )
6* WHERE ROWNUM <=10
apps@apr11i.devl> /

10 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=66000 Card
=3763361 Bytes=1802649919)

1 0 COUNT (STOPKEY)
2 1 VIEW (Cost=66000 Card=3763361 Bytes=1802649919)
3 2 SORT (ORDER BY STOPKEY) (Cost=66000 Card=3763361 Bytes
=365046017)

4 3 TABLE ACCESS (FULL) OF 'MRP_GROSS_REQUIREMENTS' (Cos
t=7967 Card=3763361 Bytes=365046017)





Statistics
----------------------------------------------------------
3 recursive calls
4 db block gets
62516 consistent gets
62506 physical reads
0 redo size
5021 bytes sent via SQL*Net to client
355 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
10 rows processed

It looks like the gains are made because there is no 'SORT' operation needed when using the index and it speeds up our process significantly.

Why isn't the /*+ First_Rows */ hint causing the optimizer to say 'hey, I have an index here I can use to get the answer really quickly'?

Tom Kyte
October 16, 2003 - 11:29 am UTC

does demand_id have a NOT NULL constraint on it?

I cannot reproduce your findings in 8.1.7.4:

big_table@ORA817DEV> big_table@ORA817DEV> big_table@ORA817DEV> set autotrace traceonly explain
big_table@ORA817DEV>
big_table@ORA817DEV> SELECT *
2 FROM (
3 SELECT /*+ FIRST_ROWS */ *
4 FROM big_table a
5 ORDER BY id DESC )
6 WHERE ROWNUM <=10
7 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=26416 Card=1576192 Bytes=222243072)
1 0 COUNT (STOPKEY)
2 1 VIEW (Cost=26416 Card=1576192 Bytes=222243072)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=26416 Card=1576192 Bytes=143433472)
4 3 INDEX (FULL SCAN DESCENDING) OF 'BIG_TABLE_IDX' (NON-UNIQUE) (Cost=3594 Card=1576192)



big_table@ORA817DEV> SELECT *
2 FROM (
3 SELECT /*+ index_desc( A BIG_TABLE_IDX ) */ *
4 FROM big_table a
5 ORDER BY id DESC )
6 WHERE ROWNUM <=10
7 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=26416 Card=1576192 Bytes=222243072)
1 0 COUNT (STOPKEY)
2 1 VIEW (Cost=26416 Card=1576192 Bytes=222243072)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=26416 Card=1576192 Bytes=143433472)
4 3 INDEX (FULL SCAN DESCENDING) OF 'BIG_TABLE_IDX' (NON-UNIQUE) (Cost=3594 Card=1576192)



big_table@ORA817DEV>
big_table@ORA817DEV> set autotrace off
big_table@ORA817DEV>


Yep, definitely not nullable

Mark, October 17, 2003 - 3:38 am UTC

Demand ID is not null (apologies for the SQL layout - it's auto-generated from TOAD)

CREATE TABLE MRP_GROSS_REQUIREMENTS
(
DEMAND_ID NUMBER,
LAST_UPDATE_DATE DATE,
LAST_UPDATED_BY NUMBER,
CREATION_DATE DATE,
CREATED_BY NUMBER,
LAST_UPDATE_LOGIN NUMBER,
INVENTORY_ITEM_ID NUMBER,
ORGANIZATION_ID NUMBER,
COMPILE_DESIGNATOR VARCHAR2(10),
USING_ASSEMBLY_ITEM_ID NUMBER,
USING_ASSEMBLY_DEMAND_DATE DATE,
USING_REQUIREMENTS_QUANTITY NUMBER,
ASSEMBLY_DEMAND_COMP_DATE DATE,
DEMAND_TYPE NUMBER,
ORIGINATION_TYPE NUMBER,
DISPOSITION_ID NUMBER,
DAILY_DEMAND_RATE NUMBER,
REQUEST_ID NUMBER,
RESERVE_QUANTITY NUMBER,
PROGRAM_APPLICATION_ID NUMBER,
PROGRAM_ID NUMBER,
PROGRAM_UPDATE_DATE DATE,
SOURCE_ORGANIZATION_ID NUMBER,
RESERVATION_ID NUMBER,
UPDATED NUMBER,
STATUS NUMBER,
APPLIED NUMBER,
DEMAND_CLASS VARCHAR2(30),
FIRM_QUANTITY NUMBER,
FIRM_DATE DATE,
OLD_DEMAND_QUANTITY NUMBER,
OLD_DEMAND_DATE DATE,
DEMAND_SCHEDULE_NAME VARCHAR2(10),
PROJECT_ID NUMBER(15),
TASK_ID NUMBER(15),
PLANNING_GROUP VARCHAR2(30),
END_ITEM_UNIT_NUMBER VARCHAR2(30),
LENDING_PROJECT_ID NUMBER,
LENDING_TASK_ID NUMBER,
CHECK (DEMAND_ID IS NOT NULL),
CHECK (LAST_UPDATE_DATE IS NOT NULL),
CHECK (LAST_UPDATED_BY IS NOT NULL),
CHECK (CREATION_DATE IS NOT NULL),
CHECK (CREATED_BY IS NOT NULL),
CHECK (INVENTORY_ITEM_ID IS NOT NULL),
CHECK (ORGANIZATION_ID IS NOT NULL),
CHECK (COMPILE_DESIGNATOR IS NOT NULL),
CHECK (USING_ASSEMBLY_ITEM_ID IS NOT NULL),
CHECK (USING_ASSEMBLY_DEMAND_DATE IS NOT NULL),
CHECK (USING_REQUIREMENTS_QUANTITY IS NOT NULL),
CHECK (DEMAND_TYPE IS NOT NULL),
CHECK (ORIGINATION_TYPE IS NOT NULL)
)
TABLESPACE MRPTAB
PCTUSED 80
PCTFREE 2
INITRANS 10
MAXTRANS 255
STORAGE (
INITIAL 426M
NEXT 20M
MINEXTENTS 1
MAXEXTENTS 2147483645
PCTINCREASE 0
FREELISTS 4
FREELIST GROUPS 1
BUFFER_POOL DEFAULT
)
LOGGING
NOCACHE
NOPARALLEL;


CREATE INDEX MRP_GROSS_REQUIREMENTS_N1 ON MRP_GROSS_REQUIREMENTS
(ORGANIZATION_ID, COMPILE_DESIGNATOR, INVENTORY_ITEM_ID, USING_ASSEMBLY_ITEM_ID, USING_ASSEMBLY_DEMAND_DATE)
LOGGING
TABLESPACE MRPIDX
PCTFREE 1
INITRANS 11
MAXTRANS 255
STORAGE (
INITIAL 154640K
NEXT 880K
MINEXTENTS 1
MAXEXTENTS 2147483645
PCTINCREASE 0
FREELISTS 4
FREELIST GROUPS 1
BUFFER_POOL DEFAULT
)
NOPARALLEL;


CREATE INDEX MRP_GROSS_REQUIREMENTS_N2 ON MRP_GROSS_REQUIREMENTS
(APPLIED, COMPILE_DESIGNATOR, ORGANIZATION_ID)
LOGGING
TABLESPACE MRPIDX
PCTFREE 1
INITRANS 11
MAXTRANS 255
STORAGE (
INITIAL 86080K
NEXT 6160K
MINEXTENTS 1
MAXEXTENTS 2147483645
PCTINCREASE 0
FREELISTS 4
FREELIST GROUPS 1
BUFFER_POOL DEFAULT
)
NOPARALLEL;


CREATE UNIQUE INDEX MRP_GROSS_REQUIREMENTS_U1 ON MRP_GROSS_REQUIREMENTS
(DEMAND_ID)
LOGGING
TABLESPACE MRPIDX
PCTFREE 1
INITRANS 11
MAXTRANS 255
STORAGE (
INITIAL 8920K
NEXT 8920K
MINEXTENTS 1
MAXEXTENTS 2147483645
PCTINCREASE 0
FREELISTS 4
FREELIST GROUPS 1
BUFFER_POOL DEFAULT
)
NOPARALLEL;


Any chance it's an init.ora setting?

SELECT name,value
FROM v$parameter
ORDER BY name
/

O7_DICTIONARY_ACCESSIBILITY TRUE
_complex_view_merging TRUE
_fast_full_scan_enabled FALSE
_like_with_bind_as_equality TRUE
_new_initial_join_orders TRUE
_optimizer_mode_force TRUE
_optimizer_undo_changes FALSE
_or_expand_nvl_predicate TRUE
_ordered_nested_loop TRUE
_push_join_predicate TRUE
_push_join_union_view TRUE
_shared_pool_reserved_min_alloc 4100
_sort_elimination_cost_ratio 5
_sqlexec_progression_cost 0
_table_scan_cost_plus_one TRUE
_trace_files_public TRUE
_use_column_stats_for_function TRUE
active_instance_count
always_anti_join NESTED_LOOPS
always_semi_join STANDARD
aq_tm_processes 1
audit_file_dest ?/rdbms/audit
audit_trail NONE
background_core_dump full
background_dump_dest /orav8174APR11I/rdbms/log/APR11I/background
backup_tape_io_slaves FALSE
bitmap_merge_area_size 1048576
blank_trimming FALSE
buffer_pool_keep
buffer_pool_recycle
commit_point_strength 1
compatible 8.1.7
control_file_record_keep_time 7
control_files /orav8174APR11I/dbs/dbsAPR11I/cntl01APR11I.dbf, /orav8174APR11I/dbs/dbsAPR11I/cntl02APR11I.dbf, /orav8174APR11I/dbs/dbsAPR11I/cntl03APR11I.dbf
core_dump_dest /orav8174APR11I/rdbms/log/APR11I/core
cpu_count 6
create_bitmap_area_size 8388608
cursor_sharing EXACT
cursor_space_for_time FALSE
db_block_buffers 41400
db_block_checking FALSE
db_block_checksum TRUE
db_block_lru_latches 3
db_block_max_dirty_target 0
db_block_size 8192
db_domain
db_file_direct_io_count 64
db_file_multiblock_read_count 8
db_file_name_convert
db_files 500
db_name APR11I
db_writer_processes 1
dblink_encrypt_login FALSE
dbwr_io_slaves 0
disk_asynch_io TRUE
distributed_transactions 165
dml_locks 10000
enqueue_resources 32000
event 10262 trace name context forever, level 4000
fast_start_io_target 41400
fast_start_parallel_rollback LOW
fixed_date
gc_defer_time 10
gc_files_to_locks
gc_releasable_locks 0
gc_rollback_locks 0-1024=32!8REACH
global_names FALSE
hash_area_size 30000000
hash_join_enabled TRUE
hash_multiblock_io_count 0
hi_shared_memory_address 0
hpux_sched_noage
hs_autoregister TRUE
ifile ?/dbs/ifilecbo.ora
instance_groups
instance_name APR11I
instance_number 0
java_max_sessionspace_size 0
java_pool_size 100000000
java_soft_sessionspace_limit 8000000
job_queue_interval 90
job_queue_processes 5
large_pool_size 0
license_max_sessions 0
license_max_users 0
license_sessions_warning 0
lm_locks 12000
lm_ress 6000
local_listener
lock_name_space
lock_sga FALSE
log_archive_dest /archlogs/archAPR11I/arch
log_archive_dest_1
log_archive_dest_2
log_archive_dest_3
log_archive_dest_4
log_archive_dest_5
log_archive_dest_state_1 enable
log_archive_dest_state_2 enable
log_archive_dest_state_3 enable
log_archive_dest_state_4 enable
log_archive_dest_state_5 enable
log_archive_duplex_dest
log_archive_format %t_%s.dbf
log_archive_max_processes 1
log_archive_min_succeed_dest 1
log_archive_start FALSE
log_archive_trace 0
log_buffer 1048576
log_checkpoint_interval 100000
log_checkpoint_timeout 1200
log_checkpoints_to_alert TRUE
log_file_name_convert
max_commit_propagation_delay 700
max_dump_file_size UNLIMITED
max_enabled_roles 100
max_rollback_segments 100
mts_circuits 0
mts_dispatchers
mts_listener_address
mts_max_dispatchers 5
mts_max_servers 20
mts_multiple_listeners FALSE
mts_servers 0
mts_service APR11I
mts_sessions 0
nls_calendar
nls_comp
nls_currency
nls_date_format DD-MON-RR
nls_date_language
nls_dual_currency
nls_iso_currency
nls_language american
nls_numeric_characters .,
nls_sort binary
nls_territory america
nls_time_format
nls_time_tz_format
nls_timestamp_format
nls_timestamp_tz_format
object_cache_max_size_percent 10
object_cache_optimal_size 102400
open_cursors 1000
open_links 4
open_links_per_instance 4
ops_interconnects
optimizer_features_enable 8.1.7
optimizer_index_caching 90
optimizer_index_cost_adj 40
optimizer_max_permutations 2000
optimizer_mode CHOOSE
optimizer_percent_parallel 0
oracle_trace_collection_name
oracle_trace_collection_path /orav8174APR11I/rdbms/log/APR11I/core
oracle_trace_collection_size 5242880
oracle_trace_enable FALSE
oracle_trace_facility_name oracled
oracle_trace_facility_path ?/otrace/admin/fdf
os_authent_prefix ops$
os_roles FALSE
parallel_adaptive_multi_user FALSE
parallel_automatic_tuning FALSE
parallel_broadcast_enabled FALSE
parallel_execution_message_size 2152
parallel_instance_group
parallel_max_servers 16
parallel_min_percent 0
parallel_min_servers 0
parallel_server FALSE
parallel_server_instances 1
parallel_threads_per_cpu 2
partition_view_enabled FALSE
plsql_v2_compatibility FALSE
pre_page_sga FALSE
processes 500
query_rewrite_enabled TRUE
query_rewrite_integrity enforced
rdbms_server_dn
read_only_open_delayed FALSE
recovery_parallelism 5
remote_dependencies_mode TIMESTAMP
remote_login_passwordfile NONE
remote_os_authent FALSE
remote_os_roles FALSE
replication_dependency_tracking TRUE
resource_limit FALSE
resource_manager_plan
rollback_segments rb01, rb02, rb03, rb04, rb05, rb06, rb07, rb08, rb09, rb10, rb99
row_locking always
serial_reuse DISABLE
serializable FALSE
service_names APR11I
session_cached_cursors 200
session_max_open_files 10
sessions 600
shadow_core_dump full
shared_memory_address 0
shared_pool_reserved_size 52428800
shared_pool_size 524288000
sort_area_retained_size 0
sort_area_size 11796480
sort_multiblock_read_count 2
sql92_security FALSE
sql_trace FALSE
sql_version NATIVE
standby_archive_dest ?/dbs/arch
star_transformation_enabled FALSE
tape_asynch_io TRUE
text_enable FALSE
thread 0
timed_os_statistics 0
timed_statistics TRUE
tracefile_identifier
transaction_auditing TRUE
transactions 660
transactions_per_rollback_segment 20
use_indirect_data_buffers FALSE
use_sigio TRUE
user_dump_dest /orav8174APR11I/rdbms/log/APR11I/user
utl_file_dir /usr/tmp


Tom Kyte
October 17, 2003 - 9:56 am UTC

how about this -- use the 10053 trace event, compress the resulting trace file, and email it to me (and then wait till next week - I'm taking a 3-day weekend :)

i'll take a look at that.

alter session set events '10053 trace name context forever, level 1';
set autotrace traceonly explain
query with first_rows
query with index_desc
exit



query tuning question

A reader, October 17, 2003 - 11:19 am UTC

Tom,

select count(distinct(a)), b
from t1
group by b;

A query like the above takes a long time in our system.
I don't have access to the tkprof yet (sorry). Anyway,
I was thinking: would the above query benefit if
we create an index on (a, b) - to use the index as a
skinny table?

Unfortunately, as I was typing, I just realized b is nullable so this option is perhaps not feasible!
Version is 9.0.1.4.

Any other comments from you may be helpful (I know
I have not given you enough info)

Thank you for your unbelievable service!

Tom Kyte
October 17, 2003 - 12:40 pm UTC

as long as one of A or B is defined as "NOT NULL" or you put in a predicate that asserts that null A's and B's are not part of the result set -- sure.

If B is nullable and A is NOT NULL -- it is OK -- the entire table will be covered by the index. It would only be if both A and B were nullable that the index could not be used.
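
That idea can be sketched concretely. The following is a hypothetical illustration (the index name is made up; T1, A and B are just the names from the question, with A assumed NOT NULL):

```sql
-- Sketch: with A declared NOT NULL, an index on (a, b) holds an entry
-- for every row of the table, so the optimizer can answer the aggregate
-- from the index alone -- using it as a "skinny table" -- without
-- visiting t1 itself.
create index t1_ab_idx on t1 (a, b);

select count(distinct a), b
  from t1
 group by b;
```

Whether the optimizer actually chooses the index depends on statistics and version -- as always, benchmark it.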

thanx!

A reader, October 17, 2003 - 3:59 pm UTC


why index_desc is bad approach?

A reader, October 24, 2003 - 7:06 am UTC

Hi

I don't understand why using this hint is a bad approach. I get what I asked for. For example:

select * from emp;

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7999 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7839 KING PRESIDENT 17-NOV-81 5000 10
7369 SMITH CLERK 7902 17-DEC-80 800 20
7566 JONES MANAGER 7839 02-APR-81 2975 20
7788 SCOTT ANALYST 7566 09-DEC-82 3000 20
7876 ADAMS CLERK 7788 12-JAN-83 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 10 20
7934 MILLER CLERK 7782 12-JAN-83 1300 10

I want empno 7999

select /*+ index_desc (emp emp_pk) */ * from emp where rownum = 1

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7999 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30

I get what I asked for.... If this is a bad approach, then when do we use this hint?

Tom Kyte
October 24, 2003 - 9:42 am UTC

why use sql at that point. we should just write in assembler maybe.

you don't get what you asked for there. what you get is very unstable, unreliable code that, if I ever saw it in any application I was responsible for, would get you a sound smack on the head.

I don't know what you were "asking for" there but let me tell you what this query:

select /*+ index_desc (emp emp_pk) */ * from emp where rownum = 1



does -- it returns a RANDOM record from emp. nothing more, nothing less. It does not have to use that index. That index might not exist.

it is very very very unreliable. don't use hints in an attempt to get an "answer". the only thing you'll get is a "bug"


this is as bad as relying on "group by" to sort -- the database just doesn't work that way.

I don't get your point, it does not return random values

A reader, October 24, 2003 - 10:46 am UTC

Hi

I have this table

select empno, ename from emp;

EMPNO ENAME
---------- ----------
7999 ALLEN
7521 WARD
7654 MARTIN
7698 BLAKE
7782 CLARK
7839 KING
8100 SMITH
7566 JONES
7788 SCOTT
7876 ADAMS
7900 JAMES
7902 FORD
7934 MILLER

If I want the max empno I can use

select /*+ index_desc (emp emp_pk) */ empno, ename from emp where rownum = 1;

EMPNO ENAME
---------- ----------
8100 SMITH

without sorting huh

Tom Kyte
October 24, 2003 - 11:08 am UTC

no you cannot

no you cannot

no you cannot

you cannot do this reliably, you cannot count on this being so.

if you do, you are mistaken, you write fragile code that WILL BREAK.

suppose one year from now -- you go on vacation. Someone comes along and says "you know what, emp_pk doesn't fit the corporate standard for index names. It must be ABC$123_EMP_PRIM_KEY_INDEX. Let me fix that"

now what?

or -- you know what, EMPNO needs to be unique but it must permit nulls for temporary employees (system change). Lets make EMPNO nullable. Now what? (index path for your query is obviated, cannot be used)


or -- you upgrade. And we decide that, since hints are hints and not directives, we'll ignore you. The FASTEST way to answer that query is in fact a FULL SCAN that just stops after the first row. Fair is fair -- we are TOTALLY correct and justified in doing that.



The ONLY way -- repeat -- THE ONLY way to get ordered data from a relational database is to use the ORDER BY clause. PERIOD. anything else -- well, that's a bug waiting to happen.


The funny thing is -- the database will do the right thing for you "out of the box", without you telling it anything special.

big_table@ORA920> set autotrace traceonly explain
big_table@ORA920> select * from ( select * from big_table order by id desc ) where rownum = 1;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=58184 Card=1 Bytes=141)
1 0 COUNT (STOPKEY)
2 1 VIEW (Cost=58184 Card=3500000 Bytes=493500000)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=58184 Card=3500000 Bytes=350000000)
4 3 INDEX (FULL SCAN DESCENDING) OF 'BIG_TABLE_PK' (UNIQUE) (Cost=7658 Card=3500000)




That is the only correct way to achieve your goal.

your approach -- frightens me, scares me to death. if you do this lots, you have just created a maintenance nightmare of EPIC proportions for the developers/dba's that come after you.





How about this one ?

A reader, November 05, 2003 - 6:56 am UTC

Hi Tom,

I have a query using the MIN function (and I think it is similar for MAX). The query gets the min seq for each id. What do you think of this query compared with your suggestion?

select a.id, a.seq from seq_table a
where not exists (select 1 from seq_table b
where b.id = a.id and b.seq < a.seq)

Thanks,
David

Tom Kyte
November 05, 2003 - 9:29 am UTC

looks pretty inefficient, but benchmark it.

think about the join it must do to satisfy that!

i'd rather make a single pass on the table, not multiple ones.
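
A single-pass version of that query might look like the following sketch, using an analytic MIN over the seq_table from the question (for illustration only):

```sql
-- One pass over seq_table: compute each id's minimum seq with an
-- analytic function, then keep only the rows matching that minimum.
select id, seq
  from (select id, seq,
               min(seq) over (partition by id) min_seq
          from seq_table)
 where seq = min_seq;
```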

Something similar to this

A reader, December 09, 2003 - 11:47 am UTC

Related to this, how do I go about doing the following:

Suppose I have a date in my query results like so

select ...,my_date,.... from ....

I want to get all the rows where my_date equals the second-highest date in the resultset. Basically, find the second-highest my_date and then get all rows where my_date equals it.

I tried something like this

select ...,dense_rank() over (order by my_date)
from ...

I get something like

1
1
1
2
2
3
3
3
4

Now, I want to get just the rows with 3.

How can I do this? max(dense_rank() ...)-1 is not allowed!

Thanks
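
One way to get exactly those rows (a sketch only -- "my_table" and the column list are placeholders for the real query) is to rank the dates descending in an inline view and filter on rank 2:

```sql
-- dense_rank() over (order by my_date desc) assigns 1 to the rows
-- with the highest my_date and 2 to the second-highest distinct
-- my_date -- exactly the rows wanted.
select *
  from (select t.*,
               dense_rank() over (order by my_date desc) dr
          from my_table t)
 where dr = 2;
```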

Table Join Order in From List

A reader, January 14, 2004 - 7:22 pm UTC

Hi Tom,

I was asked by our DBA to list the bigger tables first in the FROM list. He said that although Oracle doesn't rely on the order of the list to decide the join order, it may sometimes be easier to tune statements with the ORDERED hint if the bigger tables are listed first.

How reasonable does this sound to you?

Tom Kyte
January 15, 2004 - 3:09 am UTC

sounds really *wrong* ;) on a couple of levels

a) you shouldn't be using ordered hints in practice
b) the size of the table is simply not relevant. the size of the result set returned by the table is relevant
c) you generally want the smaller tables to drive (generally, not always, but generally)

consider:


big_table@ORA9IR2> /*
DOC>drop table t;
DOC>create table t ( x varchar2(30) );
DOC>insert into t values ( 'OWNER' );
DOC>commit;
DOC>analyze table t compute statistics;
DOC>*/
big_table@ORA9IR2>
big_table@ORA9IR2> select count(*) from t;

COUNT(*)
----------
1

big_table@ORA9IR2> select count(*) from big_table;

COUNT(*)
----------
1000000

big_table@ORA9IR2> set autotrace on
big_table@ORA9IR2> select t.x, big_table.created
2 from big_table, t
3 where t.x = big_table.owner
4 /

X CREATED
------------------------------ ---------
OWNER 12-MAY-02


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1353 Card=35701 Bytes=714020)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=1351 Card=35701 Bytes=535515)
2 1 NESTED LOOPS (Cost=1353 Card=35701 Bytes=714020)
3 2 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1 Bytes=5)
4 2 INDEX (RANGE SCAN) OF 'BIG_TABLE_OWNER_IDX' (NON-UNIQUE) (Cost=178 Card=35701)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
8 consistent gets
0 physical reads
0 redo size
438 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

big_table@ORA9IR2>
big_table@ORA9IR2> select /*+ ORDERED */ t.x, big_table.created
2 from big_table, t
3 where t.x = big_table.owner
4 /

X CREATED
------------------------------ ---------
OWNER 12-MAY-02


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2043 Card=35701 Bytes=714020)
1 0 HASH JOIN (Cost=2043 Card=35701 Bytes=714020)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=1379 Card=999620 Bytes=14994300)
3 1 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1 Bytes=5)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
14343 consistent gets
15484 physical reads
120 redo size
438 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

big_table@ORA9IR2>
big_table@ORA9IR2> set autotrace off


hmmm, big table first not so good....

maybe they are thinking of the RBO -- not the CBO with the "big tables first" (implying "small tables last")

but in any case, don't list the tables in any order other than one that pleases you as you code it.

re: Table Join Order (Continuing)

A reader, January 14, 2004 - 7:38 pm UTC

Hi Tom,

Also, please explain the table join order mechanism for ANSI joins, and whether the ORDERED hint applies to ANSI joins as well.

Thanks a bunch.

Tom Kyte
January 15, 2004 - 3:11 am UTC

the join order mechanism for ansi joins is identical to "old fashioned" joins. the optimizer decides.

and yes, you get the same bad results as above using:

big_table@ORA9IR2> select /*+ ORDERED */ t.x, big_table.created
2 from big_table join t on (t.x = big_table.owner)
3 /

X CREATED
------------------------------ ---------
OWNER 12-MAY-02


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2043 Card=35701 Bytes=714020)
1 0 HASH JOIN (Cost=2043 Card=35701 Bytes=714020)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=1379 Card=999620 Bytes=14994300)
3 1 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1 Bytes=5)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
14341 consistent gets
15502 physical reads
0 redo size
438 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed


Getting back to the original question...

A reader, April 15, 2004 - 2:07 pm UTC

Even this will work:

select max_weight
from ( select setup_date,
max (weight) over (partition by setup_date) max_weight,
max (setup_date) over () max_date
from t6
where pet_id = :b1 )
where setup_date = max_date
/

And it seems to take about the same amount of LIOs as Tom's version of the query.


Tom Kyte
April 15, 2004 - 2:39 pm UTC

it'll depend on the number of rows per pet_id actually -- and the width of the rows.

Setting up the example:

SELECT weight from (select weight from t where pet_id = :b1
order by pet_id DESC, setup_date DESC, weight DESC )
where rownum = 1

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1000 0.12 0.12 0 0 0 0
Fetch 1000 0.06 0.02 0 2000 0 1000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2001 0.18 0.15 0 2000 0 1000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 121 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
1000 COUNT STOPKEY (cr=2000 r=0 w=0 time=20803 us)
1000 VIEW (cr=2000 r=0 w=0 time=15485 us)
1000 INDEX RANGE SCAN DESCENDING T_IDX (cr=2000 r=0 w=0 time=12237 us)(object id 37026)

********************************************************************************

SELECT max_weight from ( select setup_date,
max (weight) over (partition by setup_date) max_weight,
max (setup_date) over () max_date
from t
where pet_id = :b1 )
where setup_date = max_date

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1000 0.12 0.12 0 0 0 0
Fetch 1000 0.22 0.22 0 2129 0 1000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2001 0.34 0.35 0 2129 0 1000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 121 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
1000 VIEW (cr=2129 r=0 w=0 time=222837 us)
30622 WINDOW BUFFER (cr=2129 r=0 w=0 time=192353 us)
30622 INDEX RANGE SCAN T_IDX (cr=2129 r=0 w=0 time=42307 us)(object id 37026)



You can see the analytics in this case were not as effective as saying "get me the first row here"

Analytics would be useful if we needed to get the first row for a bunch of groups (eg: get me the max weight for this SET of pet_ids). the rownum = 1 trick would not work.

When I have a singleton row like this -- i tend to go for rownum over analytics. analytics are great for sets.
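For that set case, a sketch against the same t table from the benchmark (the pet_id bind list is a made-up placeholder):

```sql
-- Sketch: max weight on the latest setup_date for a whole set of
-- pet_ids -- the case where the rownum = 1 trick cannot be used.
select pet_id, max(weight) weight
  from (select pet_id, weight,
               dense_rank() over (partition by pet_id
                                  order by setup_date desc) rnk
          from t
         where pet_id in (:b1, :b2, :b3))
 where rnk = 1
 group by pet_id;
```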

Use of rownum for obtaining max value

David Swift, July 13, 2004 - 10:52 am UTC

Very useful information - thanks

how to do some form of order by in a sub-select

Ryan Gaffuri, July 15, 2004 - 3:07 pm UTC

I need to do an insert into a table based on a join of three tables

ParentTable
ChildTableA
ChildTableB

There is a row in child table that needs the following decode:

decode(childTableA.col1,'XX',childTableA.col2,'AA',childTableB.col2,'DUMMY')

It is possible for many records to be in the child table, each with different col1 values, so a 'distinct' will not limit me to one row. However, if 'XX' or 'AA' appears, always use its col2, even if other values exist.

How do I ensure only one row comes back? I am on 9.2.0.3. It doesn't appear that I can use an order by in a sub-select. Also, it doesn't appear I have anything 'natural' to order on. I thought about adding an extra column as a 'key', such that

XX 1
AA 2
All others 3

So I can order by on that field.

My DML is of the form:

insert into myNewTable
Select
ParentTable.primaryKey
,ParentTable.col5
,( select decode(childTableA.col1,'XX',childTableA.col2,'AA',childTableB.col2,'DUMMY')
from childTableA
where primaryKey = parentTable.PrimaryKey
order by col1)
From ParentTable,
ChildTableA,
ChildTableB
Where ParentTable.primary_key = childTableA.primary_key
and parentTAble.primary_key = childTableB.primary_key

I can add a 'key' field that maps to childTable.col1 to establish priorities, so I can order by, but I am hoping there is a way to do the whole thing in SQL. I can do it in pl/sql also, but am hoping it can be done with just an insert select.

Tom Kyte
July 15, 2004 - 6:28 pm UTC

not following -- why the scalar subquery? seems you already have access to everything post join.

how's about a simple tiny example (few columns as few rows as you can) to demonstrate the problem -- show the inputs and describe the outputs you need.

followup

Ryan Gaffuri, July 15, 2004 - 3:09 pm UTC

Doesn't look like I was clear. I need 1 record returned for every record in the parentTable.

Tom Kyte
July 15, 2004 - 6:30 pm UTC

ahh, ok, something like:


insert into myNewTable
Select ParentTable.primaryKey,ParentTable.col5,
decode( (select col1 from childtablea where fkey = pkey ),
'XX', (select col2 from childtablea where fkey=pkey),
'AA', (select col2 from childtableb where fkey=pkey),
'DUMMY' )
from parenttable
/



join performs full table scan

a reader, July 16, 2004 - 3:42 am UTC

hi tom,
i have a doubt about table scans.
i have 2 queries:
1) select e.empno,e.ename,d.loc
from emp e,dept d
where e.deptno=d.deptno and d.loc='CHICAGO';
2) select e.empno,e.ename,d.loc
from emp e,(select deptno,loc from dept where loc='CHICAGO') d
where e.deptno=d.deptno;

both the queries are giving the same execution plan -- why? I thought the first query does the join first and then checks for the location, so it performs a full table scan; but in the second one, only the deptnos with that location come back, so why is it performing the full table scan there also?

is there any mistake in my understanding? please clear it up for me

Tom Kyte
July 16, 2004 - 11:08 am UTC

you are ascribing procedural logic to a non-procedural language.

those queries are semantically equivalent and the optimizer sees them "as the same query"

() in SQL do not force an operation to take place at a certain point - wrapping the () around the dept query did not say "find loc=chicago and then join".

only with AND's and OR's in a predicate would they do much.
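(If one ever truly needed the inline view evaluated as written -- rarely a good idea -- referencing rownum inside the inline view prevents the optimizer from merging it. A sketch only, using the emp/dept tables from the question:)

```sql
-- Sketch: rownum in the inline view blocks view merging, so the
-- loc = 'CHICAGO' filter is applied inside the view before the join.
select e.empno, e.ename, d.loc
  from emp e,
       (select deptno, loc, rownum r
          from dept
         where loc = 'CHICAGO') d
 where e.deptno = d.deptno;
```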

any way to optimize?

Ryan Gaffuri, July 16, 2004 - 8:58 am UTC

good suggestion tom.

I have two questions:

1. What happens if that sub-select has 3 records returned:

'XX'
'AA'
'Some other value'

XX and AA have priority, but what if 'some other value' is returned first -- then it will trigger the decode and I will get an incorrect value. My requirement: if there are 3 records and one of them is either 'XX' or 'AA', return that specific value. Only use the other value if 'some other value' exists and neither 'XX' nor 'AA' exists.

Question 2:

any way to optimize this? so that I do not have to perform the second round of selects in the decode? There are several hundred thousand records involved so we are talking about several hundred thousand additional select statements against the same table?



Tom Kyte
July 16, 2004 - 11:24 am UTC

insert into myNewTable
Select ParentTable.primaryKey,ParentTable.col5,
decode( (select max(col1) from childtablea where fkey = pkey and col1 in ('XX','AA') ),
'XX', (select col2 from childtablea where fkey=pkey),
'AA', (select col2 from childtableb where fkey=pkey),
'DUMMY' )
from parenttable
/

depends on the relationship between the child tables -- if you expect to find it in A most of the times -- and such (for optimization). if I almost always expected to get from A, i might do it differently and so on.
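If the per-row scalar subqueries against childtablea ever became the bottleneck, one hedged alternative (same hypothetical fkey/pkey/col names as above) is to aggregate childtablea once per key and outer-join the result:

```sql
-- Hypothetical sketch: collapse childtablea to one row per fkey,
-- preferring 'XX' over 'AA' over anything else via an ordering decode.
insert into myNewTable
select p.primaryKey, p.col5,
       decode( a.best_col1,
               'XX', a.best_col2,
               'AA', (select col2 from childtableb b
                       where b.fkey = p.primaryKey),
               'DUMMY' )
  from parenttable p,
       (select fkey,
               max(col1) keep (dense_rank first
                   order by decode(col1,'XX',1,'AA',2,3)) best_col1,
               max(col2) keep (dense_rank first
                   order by decode(col1,'XX',1,'AA',2,3)) best_col2
          from childtablea
         group by fkey) a
 where p.primaryKey = a.fkey(+);
```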


Analytical Functions?

Vinnie, September 01, 2004 - 2:49 pm UTC

Tom, could I use analytical functions to help in the following?

event table
time_id number
id varchar2
side_id varchar2
etime_id number

side table
time_id number
id varchar2
name varchar2
etime_id number

select unique a.id from event a, side b
where a.side_id = b.id
and a.etime_id <= (select max(etime_id) from event where time_id = a.time_id and id=a.id and etime_id <= a.time_Id)
and b.etime_id <= (select max(etime_id) from side where
time_id = b.time_id and id= b.id and etime_id <= a.time_Id)

???

IF not, what would a more appropriate way to accomplish this be?


Tom Kyte
September 01, 2004 - 7:57 pm UTC

probably, looks like it.

but I'd need a create table and some insert into's to play with (with boundary values thought about and such as well)

Help

V, September 02, 2004 - 8:17 am UTC

Can this be sent via e-mail, for security reasons?

Tom Kyte
September 02, 2004 - 9:44 am UTC

umm, no -- you already have the query here, i could DEDUCE the table structure needed -- and insert some data.

I'm just too lazy to do so.

eg: make something up.

Made up Data

Vinnie, September 02, 2004 - 11:59 am UTC

Here is the made up data.
In the actual tables there are 46000 rows in event
& 25 rows in event_status.

CREATE TABLE EVENT (
ROW_SEQUENCE_ID NUMBER (19) NOT NULL,
TIME NUMBER (19) NOT NULL,
NEW_TIME NUMBER NOT NULL,
ID VARCHAR2 (256),
ES_ID VARCHAR2 (256),
EVENT_NAME VARCHAR2 (256));


CREATE TABLE EVENT_STATUS (
ROW_SEQUENCE_ID NUMBER (19) NOT NULL,
TIME NUMBER (19) NOT NULL,
NEW_TIME NUMBER NOT NULL,
ID VARCHAR2 (256),
NAME VARCHAR2 (256));

insert into event values (1,35,73,138,'20695','AUDIT');
insert into event values (1,35,150,138,'20695','AUDIT');
insert into event values (1,35,73,138,'20695','AUDIT');
insert into event values (1,35,73,138,'20696','AUDIT');
insert into event values (1,35,150,138,'20696','AUDIT');

insert into event_status values (1,35,74,'20696','off');
insert into event_status values (1,35,74,'20695','on');




select unique name from event a, event_status b
where a.es_id = b.id
and
a.new_time <= (select max(new_time) from event
where time=a.time
and id = a.id
and new_time <= a.new_time)
and b.new_time <= (select max(new_time) from event_status
where time=b.time
and id = b.id
and new_time <= a.new_time)

Tom Kyte
September 02, 2004 - 1:16 pm UTC

In looking at this deeper, I'm confused.

a.new_time <= (select max(new_time) from event
where time=a.time
and id = a.id
and new_time <= a.new_time)


if you get the max(new_time) from event

SUCH THAT

new_time <= a.new_time

well, that is like saying "where 1=1" isn't it?

you are getting the set of records from event where (time,id) = (a.time,a.id) and new_time <= a.new_time.

Well, that set always includes the record "A"
and the max(new_time) is going to be a.new_time by definition???



correct me if I'm wrong -- but that is just

select distinct name from event a, event_status b where a.es_id = b.id;



More to follow

V, September 03, 2004 - 5:55 pm UTC

The event table continually adds rows for a given name at different time intervals. Occasionally a row is added that has a relationship to the event_status table, i.e.:

EVENT EVENT_STATUS
ID ES_ID NEW_TIME TIME EVENT_NAME ID NAME NEW_TIME
1 200 10 35 AUDIT 200 E 10
2 10 35 A
3 300 10 35 B 300 J 10
1 350 11 35 AUDIT 200 E 20
3 300 12 35 B 300 J 12


This is hard to explain!
WE want to be able to select a given NEW_TIME (say 11)
And retrieve only:
EVENT EVENT_STATUS
ID ES_ID NEW_TIME TIME EVENT_NAME ID NAME NEW_TIME
2 10 35 A
3 300 10 35 B 300 J 10
1 350 11 35 AUDIT 200 NE 20

Hope this helps!

Tom Kyte
September 03, 2004 - 6:18 pm UTC

right -- but your query is *wrong* then,


So, when you give me the correct query (or the test case to play with -- create table, insert into -- "this is the output we need, the rows are there because <details here>") we can work from that.

otherwise, your query has a pair of "where 1=1 and 1=1"'s in it.

Detail

V, September 07, 2004 - 9:32 am UTC

Hopefully this is what you need.

CREATE TABLE EVENT (
ID VARCHAR2 (256),
TIME NUMBER NOT NULL,
NEW_TIME NUMBER (19) NOT NULL,
ES_ID VARCHAR2 (256),
EVENT_NAME VARCHAR2 (256));


CREATE TABLE EVENT_STATUS (
ID VARCHAR2 (256),
NAME VARCHAR2 (256),
NEW_TIME NUMBER (19) NOT NULL,
TIME NUMBER NOT NULL
);

insert into event values ('1',35,11,'200','AUDIT');
insert into event values ('3',35,10,'300','B');
insert into event values ('1',35,12,'350','AUDIT');
insert into event values ('3',35,12,'300','B');
insert into event values ('1',35,13,'200','AUDIT');

insert into event_status values ('200','E',10,35);
insert into event_status values ('300','J',10,35);
insert into event_status values ('200','E',20,35);
insert into event_status values ('300','J',12,35);
insert into event_status values ('350','aa',12,35);
insert into event_status values ('200','E1',13,35);

The Output we need is:

A=EVENT
B=EVENT_STATUS
WHERE NEW_TIME <=11

A.ID B.ES_ID A.NEW_TIME A.TIME A.EVENT_NAME B.ID
3 300 10 35 B J
1 200 11 35 AUDIT NE


Tom Kyte
September 07, 2004 - 10:02 am UTC

that is a join with a where as specified?!?



ops$tkyte@ORA9IR2> select distinct a.ID ,       a.ES_ID ,  A.NEW_TIME,   A.TIME ,  A.EVENT_NAME,   B.ID
  2  from event a, event_status b where a.es_id = b.id and a.new_time <= 11;
 
ID    ES_ID   NEW_TIME       TIME EVENT ID
----- ----- ---------- ---------- ----- -----
1     200           11         35 AUDIT 200
3     300           10         35 B     300
 

DBA

David, September 07, 2004 - 5:50 pm UTC

Hi Tom,

In one of our production databases, the following SQL is a killer due to a high number of physical reads. Could you give me some suggestions on how to get rid of the full table scan with the max() function?
--
Create table t ( entity_type VARCHAR2(50),
updt_dtt DATE )
--
select entity_type, max(updt_dtt) from t
group by entity_type;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=12863 Card=9 Bytes=1
62)

1 0 SORT (GROUP BY) (Cost=12863 Card=9 Bytes=162)
2 1 TABLE ACCESS (FULL) OF 'T' (Cost=1446 Card=177
6231 Bytes=31972158)

Thanks! Thanks! Thanks! in advance
--David

Tom Kyte
September 08, 2004 - 8:13 am UTC

depends -- without knowing the physical schema -- really hard to say.

for example, if you have:

a) a table with the approximately 9 (so says the optimizer) entity_types listed AND
b) there was an index on (entity_type, updt_dtt) on this table

then i could give you a query that would be amazingly fast.

Else, all I could say is "if entity_type or updt_dtt is made NOT NULL, and you index entity_type, updt_dtt, then the optimizer could use an index fast_full scan"


So, here is the example showing what I mean (i hope)....

REM drop table t;
REM
REM create table t ( entity_type varchar2(50), updt_dtt date );
REM
REM insert into t select 'entity ' || rownum, sysdate from all_users where rownum <= 9;
REM alter table t minimize records_per_block;
REM
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;
REM insert /*+ append */ into t select entity_type, updt_dtt-rownum from t;
REM commit;

<b>that'll give me about 294,912 rows on lots of blocks (big)

We then query:
</b>

ops$tkyte@ORA9IR2> begin
  2          dbms_stats.gather_table_stats
  3          ( user, 'T', method_opt => 'for all indexed columns size 254', cascade=> TRUE );
  4  end;
  5  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @trace
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context forever, level 12';
 
Session altered.
 
ops$tkyte@ORA9IR2> select entity_type, max(updt_dtt) from t no_index group by entity_type;
 
ENTITY_TYPE                                        MAX(UPDT_
-------------------------------------------------- ---------
entity 1                                           08-SEP-04
entity 2                                           08-SEP-04
entity 3                                           08-SEP-04
entity 4                                           08-SEP-04
entity 5                                           08-SEP-04
entity 6                                           08-SEP-04
entity 7                                           08-SEP-04
entity 8                                           08-SEP-04
entity 9                                           08-SEP-04
 
9 rows selected.

<b>That'll definitely full scan (no other path..)</b>

 
ops$tkyte@ORA9IR2> @traceoff
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context off';
 
Session altered.
 
ops$tkyte@ORA9IR2> create index t_idx on t(entity_type,updt_dtt);
 
Index created.
 
ops$tkyte@ORA9IR2> begin
  2          dbms_stats.gather_table_stats
  3          ( user, 'T', method_opt => 'for all indexed columns size 254', cascade=> TRUE );
  4  end;
  5  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @trace
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context forever, level 12';
 
Session altered.
 
ops$tkyte@ORA9IR2> select entity_type, max(updt_dtt) from t with_index group by entity_type;
 
ENTITY_TYPE                                        MAX(UPDT_
-------------------------------------------------- ---------
entity 1                                           08-SEP-04
entity 2                                           08-SEP-04
entity 3                                           08-SEP-04
entity 4                                           08-SEP-04
entity 5                                           08-SEP-04
entity 6                                           08-SEP-04
entity 7                                           08-SEP-04
entity 8                                           08-SEP-04
entity 9                                           08-SEP-04
 
9 rows selected.

<b>that'll full scan as well -- since the optimizer doesn't know it could use the index...  no NOT NULL's</b>
 
ops$tkyte@ORA9IR2> @traceoff
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context off';
 
Session altered.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> alter table t modify entity_type not null;
 
Table altered.
 
ops$tkyte@ORA9IR2> @trace
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context forever, level 12';
 
Session altered.
 
ops$tkyte@ORA9IR2> select entity_type, max(updt_dtt) from t with_index_and_not_null group by entity_type;
 
ENTITY_TYPE                                        MAX(UPDT_
-------------------------------------------------- ---------
entity 1                                           08-SEP-04
entity 2                                           08-SEP-04
entity 3                                           08-SEP-04
entity 4                                           08-SEP-04
entity 5                                           08-SEP-04
entity 6                                           08-SEP-04
entity 7                                           08-SEP-04
entity 8                                           08-SEP-04
entity 9                                           08-SEP-04
 
9 rows selected.

<b>that'll index fast full scan (we hope...) because of the NOT NULL</b> 

ops$tkyte@ORA9IR2> @traceoff
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context off';
 
Session altered.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table t2 as select distinct entity_type from t;
 
Table created.
 
ops$tkyte@ORA9IR2> @trace
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context forever, level 12';
 
Session altered.
 
ops$tkyte@ORA9IR2> select entity_type, (select max(updt_dtt) from t where entity_type = t2.entity_type) upd_dtt
  2    from t2;
 
ENTITY_TYPE                                        UPD_DTT
-------------------------------------------------- ---------
entity 1                                           08-SEP-04
entity 2                                           08-SEP-04
entity 3                                           08-SEP-04
entity 4                                           08-SEP-04
entity 5                                           08-SEP-04
entity 6                                           08-SEP-04
entity 7                                           08-SEP-04
entity 8                                           08-SEP-04
entity 9                                           08-SEP-04
 
9 rows selected.

<b>and that one will literally *fly*</b>


ops$tkyte@ORA9IR2> @traceoff
ops$tkyte@ORA9IR2> alter session set events '10046 trace name context off';
 
Session altered.


<b>tkprof says....</b>


select entity_type, max(updt_dtt) from t no_index group by entity_type
                                                                                                             
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.05       1.10      31996      36892          0           9
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.05       1.10      31996      36892          0           9
                                                                                                             
Rows     Row Source Operation
-------  ---------------------------------------------------
      9  SORT GROUP BY (cr=36892 r=31996 w=0 time=1102260 us)
 294912   TABLE ACCESS FULL T (cr=36892 r=31996 w=0 time=721012 us)
********************************************************************************
select entity_type, max(updt_dtt) from t with_index group by entity_type
                                                                                                             
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      1.03       1.30      33194      36892          0           9
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.03       1.30      33194      36892          0           9
                                                                                                             
Rows     Row Source Operation
-------  ---------------------------------------------------
      9  SORT GROUP BY (cr=36892 r=33194 w=0 time=1305103 us)
 294912   TABLE ACCESS FULL T (cr=36892 r=33194 w=0 time=687102 us)
********************************************************************************
select entity_type, max(updt_dtt) from t with_index_and_not_null group by entity_type
                                                                                                             
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.56       0.84          0       1165          0           9
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.57       0.85          0       1165          0           9
                                                                                                             
Rows     Row Source Operation
-------  ---------------------------------------------------
      9  SORT GROUP BY (cr=1165 r=0 w=0 time=849853 us)
 294912   INDEX FAST FULL SCAN T_IDX (cr=1165 r=0 w=0 time=231213 us)(object id 29260)
********************************************************************************
select entity_type, (select max(updt_dtt) from t where entity_type = t2.entity_type) upd_dtt
  from t2
                                                                                                             
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          1         24          0           9
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          1         25          0           9
                                                                                                             
Rows     Row Source Operation
-------  ---------------------------------------------------
      9  SORT AGGREGATE (cr=20 r=0 w=0 time=174 us)
      9   FIRST ROW  (cr=20 r=0 w=0 time=129 us)
      9    INDEX RANGE SCAN (MIN/MAX) T_IDX (cr=20 r=0 w=0 time=108 us)(object id 29260)
      9  TABLE ACCESS FULL T2 (cr=4 r=1 w=0 time=132 us)


<b>so, assuming you have a super type table with the list of entity types AND you have entity_type/updt_dtt indexed in this table -- I believe we could get this query running in about 0.00 to 0.01 seconds on average....</b>


 

DBA

David, September 08, 2004 - 2:38 pm UTC

Great!!! Tom, Thanks for your excellent example.
---
here is my test results:

SQL> create table dba_t as select * from production_table;
Table created.

SQL> desc dba_t;
Name                Null?    Type
------------------- -------- ---------------------------
ENTITY_TYPE                  VARCHAR2(50)
UPDT_DTT                     DATE

SQL> select count(*) from dba_t;

  COUNT(*)
----------
   1776682
SQL> select count(distinct ENTITY_TYPE ) from dba_t;

COUNT(DISTINCTENTITY_TYPE)
--------------------------
                         9
SQL> select count(*) from dba_t
  2  where ENTITY_TYPE is null;

  COUNT(*)
----------
         0
SQL> select count(*) from dba_t
  2  where updt_dtt is null;

  COUNT(*)
----------
         0

**** There are 9 ENTITY_TYPE in list, and no Null value in either columns ****

SQL> create index dba_t_indx001 on dba_t(entity_type,updt_dtt);

Index created.

set timing on
set autotrace traceonly
SQL> begin
  2  dbms_stats.gather_table_stats
  3  ( user, 'DBA_T', method_opt => 'for all indexed columns size 254', cascade=> TRUE );
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL> select ENTITY_TYPE, max(UPDT_DTT) from dba_t
  2  group by ENTITY_TYPE;

9 rows selected.

Elapsed: 00:00:12.18

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12873 Card=9 Bytes=1
          62)

   1    0   SORT (GROUP BY) (Cost=12873 Card=9 Bytes=162)
   2    1     TABLE ACCESS (FULL) OF 'DBA_T' (Cost=1446 Card=1776682 B
          ytes=31980276)

Statistics
----------------------------------------------------------
          0  recursive calls
         24  db block gets
       6042  consistent gets
       6034  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        252  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          9  rows processed

SQL> alter table dba_t modify entity_type not null;

Table altered.

SQL> select ENTITY_TYPE, max(UPDT_DTT) from dba_t
  2  group by ENTITY_TYPE;

9 rows selected.

Elapsed: 00:00:13.69

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=7621 Card=9 Bytes=16
          2)

   1    0   SORT (GROUP BY NOSORT) (Cost=7621 Card=9 Bytes=162)
   2    1     INDEX (FULL SCAN) OF 'DBA_T_INDX001' (NON-UNIQUE) (Cost=
          7621 Card=1776682 Bytes=31980276)

Statistics
----------------------------------------------------------
        153  recursive calls
          0  db block gets
       7647  consistent gets
       7621  physical reads
          0  redo size
        548  bytes sent via SQL*Net to client
        252  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          9  rows processed

*** At this point, I found that whether it is a full table scan or an index scan, the "physical reads" and "consistent gets" are at the same level. ***

SQL> set autotrace off
SQL> create table dba_t2 as select distinct ENTITY_TYPE from dba_t;

Table created.

SQL> select * from dba_t2;

ENTITY_TYPE
--------------------------------
CARR_T
DLVY_SCHD_T
ECHG_RATE_T
FC_CRS_RFRC_T
GEO_AREA_T
HUB_T
LANE_ASSC_T
TFF_T
ZN_T

SQL> set autotrace traceonly
SQL> select entity_type, (select max(updt_dtt) from dba_t where
  2  entity_type = dba_t2.entity_type) upd_dtt
  3* from dba_t2

9 rows selected.

Elapsed: 00:00:02.43

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=82 Bytes=2214
          )

   1    0   TABLE ACCESS (FULL) OF 'DBA_T2' (Cost=1 Card=82 Bytes=2214
          )

Statistics
----------------------------------------------------------
         22  recursive calls
         12  db block gets
         35  consistent gets
         15  physical reads
          0  redo size
        542  bytes sent via SQL*Net to client
        252  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          9  rows processed

SQL> alter table dba_t modify entity_type null;

Table altered.

*** purpose: to avoid making changes in our production environment ***

SQL> select entity_type, (select max(updt_dtt) from dba_t where
  2  entity_type = dba_t2.entity_type) upd_dtt
  3  from dba_t2;

9 rows selected.

Elapsed: 00:00:02.73

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=82 Bytes=2214
          )

   1    0   TABLE ACCESS (FULL) OF 'DBA_T2' (Cost=1 Card=82 Bytes=2214
          )

Statistics
----------------------------------------------------------
        143  recursive calls
         12  db block gets
         60  consistent gets
         19  physical reads
          0  redo size
        542  bytes sent via SQL*Net to client
        252  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
          9  rows processed

*** This particular SQL comes from a view that is queried a couple of thousand times a day. I believe rewriting the view to join to an entity_type list table will solve the issue. ***
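A sketch of such a rewrite, using the table names from the test above (the view name v_last_updt is hypothetical):

```sql
-- Hypothetical rewrite of the view: drive from the small entity_type list
-- table (dba_t2 here) and probe the (entity_type, updt_dtt) index once per
-- type via a scalar subquery, instead of aggregating 1.7 million rows.
create or replace view v_last_updt
as
select t2.entity_type,
       (select max(t.updt_dtt)
          from dba_t t
         where t.entity_type = t2.entity_type) updt_dtt
  from dba_t2 t2;
```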

Again, really appreciate your help and effort.
--David
 



 

LEAST of 3 date fields

A reader, October 12, 2004 - 5:08 pm UTC

I have a large table (55 million rows) with 3 date fields. There is a non-unique index on each of these 3 date fields. I want to get the least of these 3 dates.

When I do

select min(d1) from big_table;

I get a INDEX FULL SCAN (MIN/MAX) and get the answer back in a jiffy (3 LIOs)

But when I do

select least(min(d1),min(d2),min(d3)) from big_table;

It does a FTS on the table!

Why is the CBO not smart enough to execute the above as 3 separate min/max probes on each of the 3 indexes and return the least of them?!

Thanks


Tom Kyte
October 12, 2004 - 6:57 pm UTC

select min(d)
from ( select min(d1) d from big_table
union all
select min(d2) d from big_table
union all
select min(d3) d from big_table )


you are applying functions and such -- in general, your query is "much harder than just read three indexes"

but use the union all, it'll do what you want.


A reader, October 12, 2004 - 7:01 pm UTC

Yes, I did use that workaround to make it use all 3 indexes, but why is my query "much harder than just read three indexes"? Why can't the CBO figure this out itself?

Forget about the LEAST() part, even a simple query like

select min(d1),min(d2),min(d3) from big_table;

with separate indexes on d1,d2 and d3 results in a plan involving index full scans (no min/max), hash join and what not!

Why is this so difficult?

Tom Kyte
October 12, 2004 - 7:54 pm UTC

once it gets beyond the simple

select min(column) from t;

it in general "gets really hard" and is "not the common case"

think about it.

select min(d1), min(d2), count(*) from t;

that is very close to yours, we could use d1-idx, d2-idx, and a fast full scan on some non-null index. but we won't.

select min(d1+d2) from t;

close -- but different, not the simple case.

It is easy to teach software to find the "common case that many people do" and optimize for it.

Much harder to generalize out for every possible case.

This frankly is just a query with a set of indexes that is in general "not very common". Sure, it technically "could", but once you get beyond the simple "we can min/max probe a single index" case, it doesn't.
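A hedged workaround in the same spirit as the UNION ALL above -- split the aggregates into separate one-row subqueries so each min() can use its own index probe, then stitch the scalars together:

```sql
-- Sketch: each inline view returns exactly one row, so joining them is just
-- combining three scalars; each min() is eligible for an
-- INDEX FULL SCAN (MIN/MAX) on its own single-column index.
select a.min_d1, b.min_d2, c.min_d3
  from (select min(d1) min_d1 from big_table) a,
       (select min(d2) min_d2 from big_table) b,
       (select min(d3) min_d3 from big_table) c;
```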

A reader, October 12, 2004 - 8:24 pm UTC

I guess we will have to disagree

I don't see why needing the least value out of 3 columns is "uncommon". I do have the 3 indexes to help the CBO out.

Yes, select min(d1),min(d2),count(*) is also a good example where the CBO should be smarter.

I am not saying generalize out every possible case. Just see if all the columns in the select are 100% aggregate functions with no other data. If so, rewrite it internally as N individual UNION ALL selects and optimize each select as it currently does!

MVs and query rewrite do this kind of stuff all the time and much more complicated than this, why is this a big deal? The CBO is such a complicated beast, it takes into account zillions of things, this is so trivial.

Thanks

A reader, October 15, 2004 - 12:15 pm UTC

(1) select min(d1),min(d2),min(d3) from big_Table

(2) select min(d)
from ( select min(d1) d from big_table
union all
select min(d2) d from big_table
union all
select min(d3) d from big_table )

Earlier, I had (1) above in a packaged function defined with a WNDS pragma and it compiled fine [This is a 100-year-old package that predates me, so don't ask me why the pragma was there -- probably some restriction on Oracle 7.x or something]

When I changed the query to (2) above, the package wouldn't compile, saying that the pragma was violated!

Can you see a reason why (1) wouldn't violate the pragma but (2) would?

Thanks

Tom Kyte
October 15, 2004 - 3:07 pm UTC

ops$tkyte@ORA9IR2> create table t ( d1 int, d2 int, d3 int );
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace package foo
  2  as
  3          function f return number;
  4          pragma restrict_references( f, wnds );
  5
  6          pragma restrict_references(foo,wnds);
  7  end;
  8  /
 
Package created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace package body foo
  2  as
  3          function f return number
  4          is
  5                  l_num number;
  6          begin
  7                  select min(d) into l_Num
  8                    from ( select min(d1) d from t
  9                           union all
 10                                   select min(d2) d from t
 11                                   union all
 12                                   select min(d3) d from t );
 13                  return l_num;
 14          end;
 15  end;
 16  /
 
Package body created.


<b>can you help me reproduce?</b>
 

A reader, October 15, 2004 - 3:49 pm UTC

Argh!

SQL> DROP TABLE t;

Table dropped.

SQL> create table t ( d1 int, d2 int, d3 int );

Table created.

SQL> 
SQL> 
SQL> create or replace package foo
  2  as
  3           function f return number;
  4           pragma restrict_references( f, wnds );
  5  
  6           pragma restrict_references(foo,wnds);
  7  end;
  8  /

Package created.

SQL> 
SQL> create or replace package body foo
  2   as
  3           function f return number
  4           is
  5                   l_num number;
  6           begin
  7                   select min(d) into l_Num
  8                     from ( select min(d1) d from t
  9                            union all
 10                                    select min(d2) d from t
 11                                    union all
 12                                    select min(d3) d from t );
 13                   return l_num;
 14           end;
 15  BEGIN
 16      EXECUTE IMMEDIATE 'alter session set skip_unusable_indexes=true';
 17  END;
 18  /

Warning: Package Body created with compilation errors.

SQL> sho err
Errors for PACKAGE BODY FOO:

LINE/COL ERROR
-------- -----------------------------------------------------------------
3/10     PLS-00452: Subprogram 'F' violates its associated pragma
6/10     PLS-00452: Subprogram 'FOO' violates its associated pragma

I guess in this case the EXECUTE IMMEDIATE is really WNDS, but since it can, in theory, do anything, Oracle raises the error, right? 

Tom Kyte
October 15, 2004 - 6:00 pm UTC

correct -- hence the union all had nothing to do with it :) -- that is why i asked for you to help repro it

it was the new execute immediate you added (which didn't exist way back in 7.3)

Question related to max query tuning.

A reader, December 09, 2004 - 12:54 pm UTC

I think we have really deviated from the actual thread. I found this thread really interesting and would like some ideas on max query tuning. I have a query, which I want to post as written by the developer. Using some ideas from this thread, I used analytics to transform it. The newer query is certainly outperforming the original, but I just wanted to make sure that the new query will not deviate in results depending on varying data sets. This is my first dip into analytics and it has sparked my curiosity no end.

Original Query :

select count(*) from
ps_project p, ps_ey_gb_track_prj r, ps_ey_erp e
where r.ey_tn = 'R1'
and r.ey_an >= 8
and p.project_id = r.ey_za
and e.business_unit = p.business_unit
and e.project_id = p.project_id
and e.setid = 'GLOBE'
and e.effdt = (select max(effdt) from ps_ey_erp
where business_unit = e.business_unit
and project_id = e.project_id
and setid = e.setid
and effdt <= SYSDATE)

Re-written query :

select count(*) from
ps_project p, ps_ey_gb_track_prj r,
(select ey_erp_pct,project_id,business_unit
from (select ey_erp_pct,project_id,business_unit,setid,effdt,
max(effdt) over (partition by business_unit,project_id,setid) m_eff
from ps_ey_erp)
where effdt = m_eff
and effdt <= sysdate
and setid = 'GLOBE' ) e
where r.ey_tn = 'R1'
and r.ey_an >= 8
and p.project_id = r.ey_za
and e.business_unit = p.business_unit
and e.project_id = p.project_id
/

Thanks a lot.

subquery tuning

daniel, May 09, 2005 - 8:25 pm UTC

If this doesn't apply to this topic, please disregard.

I'm trying to tune the following query with the subquery.

This one took about 27 minutes.
SQL> SELECT cd.sor_account_number,open_date,common_account_id,sor_timestamp,cis_change_detail_id
  2        FROM v_cis_change_detail cd
  3           ,(SELECT sor_account_number,COUNT(1) inc_cnt
  4             FROM v_cis_account a
  5             GROUP BY sor_account_number
  6             HAVING COUNT(1) > 1
  7                     ) m_a
  8        WHERE changed_table_name = 'ldgrods'
  9        AND cd.sor_account_number = m_a.sor_account_number
 10        AND fk_system_info_id = 11
 11        ORDER BY sor_account_number,sor_timestamp,cis_change_detail_id
 12  /

36072 rows selected.

Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=237263 Card=1277598  Bytes=89431860)                                                       
   1    0   FILTER                                                              
   2    1     SORT (GROUP BY) (Cost=237263 Card=1277598 Bytes=89431860)                                                                     
   3    2       HASH JOIN (Cost=83679 Card=25551946 Bytes=1788636220)           
   4    3         INDEX (FAST FULL SCAN) OF 'IDX_CIS_ACCT_SAN_FSII' (NON-UNIQUE) (Cost=577 Card=3800775 Bytes=34206975)                     
   5    3         TABLE ACCESS (FULL) OF 'CIS_CHANGE_DETAIL' (Cost=592  56 Card=25572478 Bytes=1559921158)        

Then, I created a table from the subquery test_p
and used it in the query, and it took about 6 minutes.

  1  SELECT cd.sor_account_number,open_date,common_account_id,sor_timestamp,cis_change_detail_id
  2        FROM v_cis_change_detail cd
  3           ,odsobj.test_p m_a
  4        WHERE changed_table_name = 'ldgrods'
  5        AND cd.sor_account_number = m_a.sor_account_number
  6        AND fk_system_info_id = 11
  7*       ORDER BY sor_account_number,sor_timestamp,cis_change_detail_id

36072 rows selected.

Elapsed: 00:06:30.08

Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=21184 Card=28283 Byt          
          es=1781829)                                                           
                                                                                
   1    0   SORT (ORDER BY) (Cost=21184 Card=28283 Bytes=1781829)               
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'CIS_CHANGE_DETAIL' (Co          
          st=5 Card=7 Bytes=378)                                                
                                                                                
   3    2       NESTED LOOPS (Cost=21037 Card=28283 Bytes=1781829)              
   4    3         TABLE ACCESS (FULL) OF 'TEST_P' (Cost=2 Card=4207 By          
          tes=37863)                                                            
                                                                                
   5    3         INDEX (RANGE SCAN) OF 'IDX_CIS_CHGDTL_SACT_STS' (NON          
          -UNIQUE) (Cost=2 Card=20)                                             
                                                                                

Is there a way to force the optimizer to emulate this process? It looks like it might perform better if it first got the results from the subquery and then joined them to the main query. Or am I comparing apples to oranges here?
Thanks Tom. 

Tom Kyte
May 10, 2005 - 7:32 am UTC

and how long did it take to create and analyze the table test_p?

test_p

daniel, May 10, 2005 - 8:12 am UTC

About a minute or so to create and I did not analyze the table.  All the other tables in the query have been analyzed.  
I don't have the stats for the create, but I have the stats for the select part:
SQL> SELECT sor_account_number,COUNT(1) inc_cnt
  2             FROM v_cis_account a
  3             GROUP BY sor_account_number
  4             HAVING COUNT(1) > 1
  5  /

4207 rows selected.

Elapsed: 00:00:17.03

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3263 Card=190039 Byt
          es=1710351)

   1    0   FILTER
   2    1     SORT (GROUP BY) (Cost=3263 Card=190039 Bytes=1710351)
   3    2       INDEX (FAST FULL SCAN) OF 'IDX_CIS_ACCT_SAN_FSII' (NON
          -UNIQUE) (Cost=577 Card=3800775 Bytes=34206975) 

Tom Kyte
May 10, 2005 - 9:19 am UTC

what is the db release.

and how does

with m_a as (SELECT sor_account_number,COUNT(1) inc_cnt
FROM v_cis_account a
GROUP BY sor_account_number
HAVING COUNT(1) > 1
)

SELECT cd.sor_account_number,open_date,common_account_id,sor_timestamp,cis_change_detail
_id
FROM v_cis_change_detail cd, m_a
WHERE changed_table_name = 'ldgrods'
AND cd.sor_account_number = m_a.sor_account_number
AND fk_system_info_id = 11
ORDER BY sor_account_number,sor_timestamp,cis_change_detail_id

do

little better

daniel, May 10, 2005 - 9:46 am UTC

Databases release is 9.2.0.6

It shaved some time off the query, but the plan looks the same to me.
Elapsed: 00:19:06.05

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=237263 Card=1277598
Bytes=89431860)

1 0 FILTER
2 1 SORT (GROUP BY) (Cost=237263 Card=1277598 Bytes=89431860
)

3 2 HASH JOIN (Cost=83679 Card=25551946 Bytes=1788636220)
4 3 INDEX (FAST FULL SCAN) OF 'IDX_CIS_ACCT_SAN_FSII' (N
ON-UNIQUE) (Cost=577 Card=3800775 Bytes=34206975)

5 3 TABLE ACCESS (FULL) OF 'CIS_CHANGE_DETAIL' (Cost=592
56 Card=25572478 Bytes=1559921158)

Tom Kyte
May 10, 2005 - 10:39 am UTC

what is your dynamic sampling set to?

it must have sampled the test_p table you created, I'm thinking -- it got the card= value way too accurate

the problem is the card=, see how it thinks "25572478" rows? you get 4207 in real life.


are the stats correct and up to date on these tables?

What about first_value?

Dave Hemming, May 10, 2005 - 11:06 am UTC

What - if any - would be the difference between

MIN() OVER (PARTITION BY ... )
and
FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ...)

and between
MAX() OVER (PARTITION BY ... )
and
FIRST_VALUE() OVER (PARTITION BY ... ORDER BY ... DESC)

If I want to get, say, only the most recent row for each ID I have been doing:

SELECT match_id, update_timestamp, .... from
(select match_id, update_timestamp, FIRST_VALUE(update_timestamp) OVER (PARTITION BY match_id ORDER BY update_timestamp DESC) as most_recent,
.....
FROM subject_match)
WHERE update_timestamp = most_recent;

Should I be using MAX(update_timestamp) OVER (...) instead?

Tom Kyte
May 10, 2005 - 1:17 pm UTC

ops$tkyte@ORA10G> select x,
  2         min(x) over () min_x,
  3         first_value(x) over (order by x) fv_x,
  4         max(x) over () max_x,
  5         first_value(x) over (order by x desc) fv2_x,
  6         first_value(x) over (order by x desc NULLS LAST) fv3_x
  7    from t;
 
         X      MIN_X       FV_X      MAX_X      FV2_X      FV3_X
---------- ---------- ---------- ---------- ---------- ----------
         1          1          1          2                     2
         2          1          1          2                     2
                    1          1          2                     2

 
gotta watch out for nulls.


max(update_timestamp) over (partition by match_id)

seems "more clear" from a readability perspective, but that might be a matter of opinion.  

The max(x) over () invokes a WINDOW(buffer) step, the first_value -- a WINDOW(sort) step.


You may find the max to be marginally more cpu efficient:

select *
  from (
select t.*,
       max(last_ddl_time) over (partition by owner) max_ddl
  from big_table.big_table t
       )
 where last_ddl_time = max_ddl
                                                                                                                                             
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      124      7.41      27.67      62771      14490        125        1839
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      126      7.41      27.67      62771      14490        125        1839
                                                                                                                                             
select *
  from (
select t.*,
       first_value(last_ddl_time) over (partition by owner order by last_ddl_time desc nulls last) max_ddl
  from big_table.big_table t
       )
 where last_ddl_time = max_ddl
                                                                                                                                             
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      124     11.86      35.28      76189      14490        173        1839
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      126     11.86      35.28      76189      14490        173        1839


that was a 1,000,000 row table.



 

dynamic sampling

daniel, May 10, 2005 - 2:32 pm UTC

SQL> show parameter dyn

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling           integer     1

cis_change_detail and cis_account were analyzed yesterday and I just did it again.
I used this command:

exec DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'odsobj', TABNAME => 'cis_account', CASCADE => TRUE, method_opt => 'FOR ALL indexed COLUMNS SIZE auto', degree=>8); 

Tom Kyte
May 10, 2005 - 2:34 pm UTC

let's see an autotrace traceonly explain of this:

SELECT sor_account_number,COUNT(1) inc_cnt
FROM v_cis_account a
GROUP BY sor_account_number
HAVING COUNT(1) > 1

trace

daniel, May 10, 2005 - 2:46 pm UTC

SQL> set autotrace traceonly explain
SQL> SELECT sor_account_number,COUNT(1) inc_cnt
  2                 FROM v_cis_account a
  3                GROUP BY sor_account_number
  4               HAVING COUNT(1) > 1 
  5  /

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3313 Card=192458 Byt
          es=1732122)

   1    0   FILTER
   2    1     SORT (GROUP BY) (Cost=3313 Card=192458 Bytes=1732122)
   3    2       INDEX (FAST FULL SCAN) OF 'IDX_CIS_ACCT_SAN_FSII' (NON
          -UNIQUE) (Cost=591 Card=3853541 Bytes=34681869)

 

Tom Kyte
May 10, 2005 - 2:57 pm UTC

that's the problem -- when it knows 4207 for the cardinality (very different from it thinking 192,458 ) the plan is radically different.

so, does it have useful information on sor_account_number? does it know the number of distinct values, for example (eg: did it gather anything on that column)?

CARD

daniel, May 10, 2005 - 3:06 pm UTC

There is an index on this column (there is also another field on this index), so when I analyzed the table it computed stats on it as well.

  1  SELECT COLUMN_NAME, LAST_ANALYZED, SAMPLE_SIZE,
  2  NUM_DISTINCT
  3   FROM DBA_TAB_COLUMNS WHERE
  4  TABLE_NAME = 'CIS_CHANGE_DETAIL'
  5* AND COLUMN_NAME = 'SOR_ACCOUNT_NUMBER'
SQL> /

COLUMN_NAME                    LAST_ANAL SAMPLE_SIZE NUM_DISTINCT
------------------------------ --------- ----------- ------------
SOR_ACCOUNT_NUMBER             10-MAY-05    78558092      3849159
 

Tom Kyte
May 10, 2005 - 11:14 pm UTC

wow,

78.5 million rows
3.8 million distinct values

but only 4207 having count(*) > 1

can we make use of that somehow - really skewed data -- is there like a single SOR_ACCOUNT_NUMBER that accounts for most of the data or something?

curious ...

Gabe, May 10, 2005 - 3:38 pm UTC

Looking at the very first sql and its plan from Daniel, it seems to me Oracle rewrote that query … since the GROUP BY and FILTER (probably for the HAVING) come after the join of cis_change_detail and [the index_ffs of] cis_account (BTW, it seems he's using views rather than the actual tables … any filtering in there?).

He's saying the sub-query factoring didn't do it … the rewrite still occurred (!?!).

What if we really force the materialization … I don’t know maybe …

with m_a as (select group by having)
,m_b as (select rownum rn, m_a.* from m_a)
select …

??


Tom Kyte
May 10, 2005 - 11:16 pm UTC

it looks like a bad cardinality= to me.

no filtering

daniel, May 10, 2005 - 3:46 pm UTC

no filtering in the views...they're simply "select * from table_name"

CARD

daniel, May 11, 2005 - 12:38 pm UTC

78.5 million rows, 3.8 million distinct values -- this is on CIS_CHANGE_DETAIL.


but only 4207 having count(*) > 1 -- that's on the cis_account table.

can we make use of that somehow - really skewed data -- is there like a single
SOR_ACCOUNT_NUMBER that accounts for most of the data or something?

I don't see anything ... on cis_change_detail the most rows for an account is 163, and on cis_account it is 5.


histograms

daniel, May 12, 2005 - 8:41 am UTC

Tom,
do you think that playing with histograms could help here?

Tom Kyte
May 12, 2005 - 12:30 pm UTC

with 3.8 million - i'm thinking "no, probably not"

is there a small number of sor account numbers that account for the vast preponderance of the repeats?

if you first_rows this query, it would probably "fix" the problem given the circumstances.
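For example, a sketch only (the hint simply nudges the optimizer toward the nested-loops plan that the pre-built TEST_P table produced; results are unchanged):

```sql
-- Same query as above with a FIRST_ROWS hint; the optimizer then favors
-- index-driven nested loops over the hash join it chose under ALL_ROWS.
SELECT /*+ FIRST_ROWS */
       cd.sor_account_number, open_date, common_account_id,
       sor_timestamp, cis_change_detail_id
  FROM v_cis_change_detail cd,
       (SELECT sor_account_number
          FROM v_cis_account
         GROUP BY sor_account_number
        HAVING COUNT(*) > 1) m_a
 WHERE changed_table_name = 'ldgrods'
   AND cd.sor_account_number = m_a.sor_account_number
   AND fk_system_info_id = 11
 ORDER BY sor_account_number, sor_timestamp, cis_change_detail_id;
```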

Why does EXEC have consistent gets?

Karen M, May 12, 2005 - 1:54 pm UTC

I just noticed that on the original example showing the difference between the two queries there were 2000 consistent gets on the EXEC of the first query.
It appears to be the subquery in the predicate. But from some quick tests I ran, it's not just the subquery -- it's something special about the subquery (like the aggregate, perhaps?). I have created several different tests (which I'd be glad to post if needed) that show consistent gets in the EXEC (and also in the PARSE, by the way), but being able to create the behavior is the easy part (it seems). What eludes me is why it's happening (and somehow I believe it should be "obvious").

My question is:
What conditions cause consistent gets to show up in EXEC (or PARSE for that matter) when executing a SELECT statement and why?

Tom Kyte
May 12, 2005 - 1:58 pm UTC

some queries (i have no list) do their work during the 'open' phase and others during 'fetch'.

aggregates, sorts and such can be seen in execute phase (and select for updates will do it) but not always.

the io in parse should only be related to queries run to parse the query.

tuning subquery

vishal, July 13, 2005 - 7:15 am UTC

I have got one query but it is taking too much time.
The query is using a correlated subquery.
Tom, can you help me out in this regard?

We have one table holding lakhs of records; actually, it is a repository table of an ETL tool and we have to query that table. I am sending this example -- please suggest some possible way to tune it.

select starttime from xyz where (select max(starttime) from xyz1 where xyz.starttime=xyz1.starttime
group by jobid);




Tom Kyte
July 13, 2005 - 12:46 pm UTC

well, that is an incomplete query and the correlated component doesn't really make sense.




query taking too much time

vishal, July 14, 2005 - 3:10 am UTC

Hi tom,

I'm extremely sorry for that incomplete query.
Now I'm sending the whole query:
SELECT
rwr.server_name,
rat.subject_area,
rwr.workflow_name,
rwr.start_time,
rwr.end_time,
rwr.run_err_msg,
prq.queue_name,
decode(rwr.run_status_code,1,'Succeeded',
2,'Disabled',
3,'Failed',
4,'Stopped',
5,'Aborted',
6,'Running',
7,'Suspending',
8,'Suspended',
9,'Stopping',
10,'Aborting',
11,'Waiting',
12,'Scheduled',
13,'UnScheduled',
14,'Unknown',
15,'Terminated',
NULL) Status,
rat.task_type_name,
rat.task_name,
rwr.user_name
FROM rep_all_tasks rat, rep_wflow_run rwr,rep_task_inst rti,wamu_inf_prd_queue prq
WHERE rwr.subject_id = rat.subject_id
AND rwr.workflow_id = rti.workflow_id
AND rti.task_id = rat.task_id
AND rwr.user_name = 'u192276'
AND rwr.subject_area = LTRIM(RTRIM(prq.folder_name))
AND rwr.start_time = (SELECT MAX (rwr1.start_time)
FROM REP_WFLOW_RUN rwr1
WHERE rwr1.workflow_id = rwr.workflow_id
GROUP BY workflow_id)
order by rwr.workflow_name,rat.subject_area,rat.task_name;

The whole query is not taking much time; the only time is consumed in the subquery part. Here we are querying views, and as far as I know we cannot build an index on a view. Also, we have a constraint that we cannot modify the base table indexing -- I mean, we have indexes on the base tables but we cannot change them.

So can you suggest how to tune this query with an analytic function, or some other means such as a hint?


Tom Kyte
July 14, 2005 - 10:31 am UTC

well, hard to say as I don't know your model and would have to make lots of assumptions.

Like: is workflow_id unique within user_name/subject_area? If so, you can use analytics.
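Assuming workflow_id does partition the rows the way the poster wants, a sketch of an analytic rewrite of the start_time filter (untested against the posted schema):

```sql
-- Instead of re-scanning REP_WFLOW_RUN in a correlated MAX subquery for every
-- row, compute the per-workflow max once with an analytic and keep only the
-- latest run; join this inline view in place of rwr in the original query.
select *
  from (select rwr.*,
               max(rwr.start_time)
                 over (partition by rwr.workflow_id) max_start
          from rep_wflow_run rwr)
 where start_time = max_start;
```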



subquery taking too much time

vishal, July 14, 2005 - 3:31 am UTC

hi tom ,

SELECT MAX (weight) weight
FROM client.pet_weight
WHERE pet_id = :pet_id
AND setup_date =
(SELECT MAX (setup_date)
FROM client.pet_weight
WHERE pet_id = :pet_id)
solution

SELECT weight from (select weight from t where pet_id = :b1
order by pet_id DESC, setup_date DESC, weight DESC )
where rownum = 1

First, I appreciate your solution to the above query.
I have got one doubt: now suppose, instead of passing one value,
we have different setupdates for a number of petids, like

consider this table:

create table this_table( petid number, setupdate date ,otherdetail varchar2(10));

content of table

petid  setupdate  otherdetail
    1  01/01/05   ok
    1  01/02/05   bad
    2  01/03/05   fine
    2  01/01/05   ok
    3  01/01/05   bad
    3  01/02/05   fine
    3  01/03/05   bad

Likewise, now if I fire this query

select otherdetail from this_table where setupdate = (select max(setupdate) from this_table a,this_table b where a.petid=b.petid group by petid);

Here we have few records in this table and also very few columns. If we had 1 lakh records, could you suggest any means, without working on the table's indexes, to make it faster?

Please help me out -- we have this kind of scenario at a higher level and it is taking 2 hrs to run that query.

Tom Kyte
July 14, 2005 - 10:32 am UTC

that other query you fire would not work -- the single-row subquery would return more than a single row.
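A working alternative for the this_table example above -- a sketch that uses an analytic to pick each petid's row as of its latest setupdate:

```sql
-- Rank rows within each petid by setupdate descending and keep the first.
-- (Ties on setupdate would need an extra tiebreaker in the ORDER BY.)
select petid, setupdate, otherdetail
  from (select petid, setupdate, otherdetail,
               row_number() over (partition by petid
                                  order by setupdate desc) rn
          from this_table)
 where rn = 1;
```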

sort group by and max()

A reader, July 22, 2005 - 11:08 am UTC

Hi

I have a query which uses a correlated subquery with a max function; tab_b is a VIEW:

select *
from tab_a a, tab_b b, tab_c c
where a.id = b.id
and b.id = c.id
and c.line_num = (select max(line_num) from tab_c d
where d.id = c.id)
and c.item_id = :b1

I see this in tkprof output
6 FILTER
6 HASH JOIN
6 NESTED LOOPS
6 NESTED LOOPS
6 TABLE ACCESS BY INDEX ROWID TAB_A
6 INDEX UNIQUE SCAN TAB_A_U1 (object id 11889)
6 TABLE ACCESS BY INDEX ROWID TAB_B
6 INDEX UNIQUE SCAN TAB_B_U1 (object id 11889)
6 TABLE ACCESS BY INDEX ROWID TAB_C
6 INDEX RANGE SCAN TAB_C_IDX1 (object id 195358)
11756271 VIEW
11756271 SORT GROUP BY
11756271 FILTER
11756271 TABLE ACCESS FULL TAB_C

I don't understand the SORT GROUP BY and the view; there is no GROUP BY whatsoever -- how come the execution plan has a GROUP BY?







Tom Kyte
July 22, 2005 - 11:14 am UTC

max()

you have an aggregate, aggregates GROUP.

This query was (apparently) rewritten by the optimizer as:

select *
from tab_a a, tab_b b, tab_c c,
(select id, max(line_num) line_num from tab_c d group by id) XX

where a.id = b.id
and b.id = c.id
and ( c.line_num = XX.line_num and c.id = XX.id )




That's probably not a good rewrite

A reader, July 23, 2005 - 3:32 am UTC

Hi

Maybe you are correct that the CBO rewrote the query into:

select *
from tab_a a, tab_b b, tab_c c,
(select id, max(line_num) line_num from tab_c d group by id) XX

where a.id = b.id
and b.id = c.id
and ( c.line_num = XX.line_num and c.id = XX.id )

However, that is not a good choice, since id in tab_c is indexed and has very good selectivity (many distinct values). Not sure why this happens!

Tom Kyte
July 23, 2005 - 9:34 am UTC

because indexes do not rule

because full scans are not evil

because perhaps it thought "I'll be running that little subquery *a lot*"

because bulk operations are much better than little operations when you do the little operation a lot



maybe yes, maybe not

A reader, July 23, 2005 - 9:46 am UTC

Hi

The query is from a standard Oracle Applications 11i concurrent program. It runs several times (once for each parent key), around 400 or 500 times, so I am not sure it is cool to full scan tab_c 400 times, since tab_c has 4 million rows! :-)

I think the problem is with bind peeking (I have run into several such problems already with other databases), because the problem started when we migrated to 9i; in 8i this batch process runs lightning fast.

I will propose that the client flush the shared pool a few times per day until we get confirmation from Support that we are allowed to add a hint to this standard process.

Thanks!!!

Not getting the same result

Reader, July 25, 2005 - 9:24 pm UTC

Hi,

Back to the original question. I set up the same test case
and ran both queries with sql_trace.

I have two doubts.

o I am getting the result with 3 block gets instead of the 2000 blocks you got.

o You used ORDER BY ... DESC on all the indexed columns and then rownum = 1 in the outer query; I could not figure out why.

- Since Oracle is reading the data from an index, will it get the one row in descending order and stop processing, or will it sort the whole result set (using the index) and then give the first row?

If it sorts the whole result set, would this still be the best way to get the answer if the table is gigabytes in size?

Thanks,



Tom Kyte
July 26, 2005 - 7:26 am UTC

1) you executed the query 1000 times and got 3 block gets? I think maybe you executed the query once....


2) ignore the index, the index is used but ignore it.

look at the query and what it says:


SELECT weight from (select weight from t where pet_id = :b1
order by pet_id DESC, setup_date DESC, weight DESC )
where rownum = 1

SORT T by pet_id/setup_date/weight in DESC order (since we fetch a single pet_id, this is really sorted by setup_date and then weight DESC). That will bring us the NEWEST setup_date first. Just keep the first one; that is all we really wanted, in effect, for this query.


forget the index, the index is relevant to BOTH queries (try finding the max without it?). the QUERY is what is important, understand the syntax there...




one more doubt

A reader, July 26, 2005 - 8:18 pm UTC


Thanks, For your clarification.

Will Oracle create/read the whole result set, fully sort it, and then return 1 record to the outer query?

or

Will Oracle create/read the whole result set, find the first/newest record (as per the ORDER BY clause), and then return that to the outer query?





Tom Kyte
July 27, 2005 - 8:44 am UTC

what happens totally depends on what is available.

got an index? No, it would not.

would it ever "fully sort" -- no, probably not. There is a top-n optimization, the database recognizes the construct:

select *
from (select .... from .... where ... ORDER BY a,b,c)
where rownum <= :x

as a top n query and realizes "I don't have to sort the entire thing, I just need to get the first :x records - sort them and then get the :X+1st record and see if it is less than the current :X record I have, if so, put it into the array I built and kick out a record - else discard it"

Eg: if you:

select * from (select * from one_billion_row_table order by unindexed_column) where rownum <= 10;

it will not have to stash 1,000,000,000 records in temp and sort them, it'll get 10, sort them and then get the 11th and see if it needs to be in the 10 (kicking out one if so) and then get the 12th and so on. It would need space to hold 10 records - not 1,000,000,000.


How it is physically processed is as efficient as can be depending on what is available.

How it is logically processed is the same regardless -- logically it says "sort it by these columns descending/ascending, then return the first one(s)"
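One hedged way to see the top-n optimization at work, using the t table from the original benchmark, is to compare the session statistics for the two forms under autotrace (exact numbers will vary by system):

```sql
set autotrace traceonly statistics

-- full sort: the entire result set goes through the sort
select * from t order by weight;

-- top-n form: only 10 rows are ever retained in the sort area, so even a
-- very large table should show "sorts (disk)" of 0 for this one
select *
  from (select * from t order by weight)
 where rownum <= 10;
```

The second query still reads the whole table, but its sort workload stays tiny regardless of table size.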

Thanks !!

Reader, July 29, 2005 - 5:18 pm UTC

Thanks for your help.

Meaning, in the case of an unindexed column, Oracle will ultimately have to scan the whole table. Correct?

Thanks,


Tom Kyte
July 29, 2005 - 5:57 pm UTC

which column do you refer to?

select *
from (select .... from .... where ... ORDER BY a,b,c)
where rownum <= :x


Oracle will full scan IF needed, but not if not NEEDED.

Re:

A reader, July 29, 2005 - 9:18 pm UTC

Case 1.

Select * from ( select * from table order by unindexcolumn desc )
where rownum <= 10;

Case 2.

Select * from ( select unindexcolumn from table order by unindexcolumn desc )
where rownum <= 10;

I think in both cases Oracle will have to read the full table (by "full table" I mean for that particular column).

In what scenario does Oracle not read the full table for an unindexed column?

Thanks




Tom Kyte
July 30, 2005 - 8:44 am UTC

Yes, it would read the entire table.

You say "unindexed column" as if it "meant something"

there are columns you select
there are columns you put a predicate on
there are columns you order by


You would have to say "no indexes on ANY column touched by the query"


Thanks For your reply

A reader, July 30, 2005 - 2:49 pm UTC


teach me how to fish

Arul Ramachandran, July 30, 2005 - 4:48 pm UTC

When asked on how you think of clever ways to rewrite a query your response was
"
think in sets, know what is available to you, understand SQL. it all comes from that.
"

Could you please elaborate on your reply?

Normally the developer would write the SQL as SELECT MAX(col1) with a subquery on the same table using MAX(col2). Down the line, performance stinks and then we look to rewrite the SQL.

What would be your advice to do this right the first time?

Thank you.

Tom Kyte
July 30, 2005 - 5:56 pm UTC

have them read this page?

Don't know what to say -- it comes from experience and experiences. Think of different ways to say the same question and many times a query "falls out"

Kathy, September 12, 2005 - 12:32 am UTC

Tom, using the following way (rownum = 1) to find a max is very clever in the case of just a single pet_id:

SELECT weight from (select weight from t where pet_id = :b1
order by pet_id DESC, setup_date DESC, weight DESC )
where rownum = 1

But my question is: if no pet_id is provided and we need to find the max(weight) for all pet_ids, without using MAX with GROUP BY, how can I do this? Thanks a lot

Tom Kyte
September 12, 2005 - 7:21 am UTC

you'd drop the where clause and run the query again.

Get max weight for EACH pet_id without using 'max' and 'group by'

Kathy, September 13, 2005 - 3:13 am UTC

Sorry Tom, I don't exactly get your meaning. If I remove the where clause, how can I then get the max weight of EACH pet_id without using MAX and GROUP BY, for performance reasons? Thanks in advance ^.^


Tom Kyte
September 13, 2005 - 11:52 am UTC

I did not read it that way; I thought you meant "if no record is found for a specific pet id, return the max(weight) over all pet_ids".


the way to get the max(weight) by pet_id is group by -- no "magic" there, unless you stored the max(weight) by pet_id separately.
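A sketch against the t table from the original benchmark: the first query is the natural GROUP BY form; the second generalizes the thread's original question (the weight recorded at each pet's LATEST setup_date) using the same fixed-width encode/max/substr trick that appears later in this thread, and assumes non-negative weights:

```sql
-- max weight per pet_id: group by is the natural form (table t from the benchmark)
select pet_id, max(weight)
  from t
 group by pet_id;

-- the weight at each pet's LATEST setup_date, via a fixed-width
-- encode / max / decode trick (assumes weight is non-negative)
select pet_id,
       to_number( substr( max( to_char(setup_date,'yyyymmddhh24miss') || weight ),
                          15 ) ) weight
  from t
 group by pet_id;
```

Because 'yyyymmddhh24miss' is exactly 14 characters, substr(..., 15) peels the weight back off the string that sorted highest, i.e. the one with the newest setup_date.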

MAX Subquery

Andy, December 06, 2005 - 10:58 am UTC

I've been through as many examples as I can spare time for, but so far I haven't been satisfied with what I've read.

Our developers have been in a similar situation to Kathy's, above.

We have an SQL query which runs on both Oracle and SQL Server. The Oracle one runs >40 mins; SQL Server takes 2.5 secs (note: the SQL Server development was actually converted from the Oracle development).

We are after the last start_date per ID. The original (slow) query is:

SELECT rs.ID, rs.projected_status_code
FROM resource_schedule rs,
(SELECT ID, MAX (start_date) max_date
FROM resource_schedule
WHERE entry_type = 'C'
GROUP BY ID) x
WHERE (rs.ID = x.ID)
AND rs.start_date = x.max_date
AND entry_type = 'C'

476 rows selected.

Elapsed: 00:49:31.67

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=8319577 Card=143 Byt
es=7293)
1 0 FILTER
2 1 SORT (GROUP BY) (Cost=8319577 Card=143 Bytes=7293)
3 2 HASH JOIN (Cost=5187 Card=456038657 Bytes=23257971507)
4 3 TABLE ACCESS (FULL) OF 'RESOURCE_SCHEDULE' (Cost=198
Card=505804 Bytes=8092864)

5 3 TABLE ACCESS (FULL) OF 'RESOURCE_SCHEDULE' (Cost=198
Card=505804 Bytes=17703140)


Statistics
----------------------------------------------------------
5 recursive calls
72 db block gets
6464 consistent gets
17447 physical reads
0 redo size
11263 bytes sent via SQL*Net to client
844 bytes received via SQL*Net from client
33 SQL*Net roundtrips to/from client
2 sorts (memory)
1 sorts (disk)
476 rows processed

We have tried all sorts of combinations of suggestions (from this page, and others) but could not find a solution.

The solution I have found (elsewhere on this site) seems to be the OLAP function MAX() OVER().

SELECT *
FROM (SELECT t1.*,
MAX (max_date) OVER (PARTITION BY ID) md
FROM (SELECT ID, projected_status_code, MAX (start_date) max_date
FROM RESOURCE_SCHEDULE
WHERE entry_type = 'C'
GROUP BY ID, projected_status_code) t1) t2
WHERE max_date = md

476 rows selected.

Elapsed: 00:00:01.10

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=CHOOSE 561 44 K 1611
1 0 VIEW 561 44 K 1611
2 1 WINDOW BUFFER 561 15 K 1611
3 2 SORT GROUP BY 561 15 K 1611
4 3 TABLE ACCESS FULL RESOURCE_SCHEDULE 505 K 13 M 198

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3231 consistent gets
3224 physical reads
0 redo size
37523 bytes sent via SQL*Net to client
3580 bytes received via SQL*Net from client
245 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
476 rows processed

This solves the problem very nicely, and negates any 'tuning' of the database, EXCEPT that we require the (considerably more expensive) Enterprise Edition to be able to use it. This also indicates that SELF JOINS on huge tables are bad - but can be solved using the MAX OVER. We have Standard edition already distributed around our Customers, which is adequate, bar this one problem.

I have an alternative which gets around the problem with the existing Standard Edition. I was looking into UDAGs (user-defined aggregates; Enterprise again), but the recommendation for 8i was to create a user function. So I did (for 9i):

SELECT ID, Get_Projected_Status(ID,max_date) projected_status_code
FROM
(SELECT ID, MAX (start_date) max_date
FROM RESOURCE_SCHEDULE
WHERE entry_type = 'C'
GROUP BY ID)

476 rows selected.

Elapsed: 00:00:01.61

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=CHOOSE 561 8 K 1163
1 0 SORT GROUP BY 561 8 K 1163
2 1 TABLE ACCESS FULL RESOURCE_SCHEDULE 505 K 7 M 198

Statistics
----------------------------------------------------------
476 recursive calls
0 db block gets
5135 consistent gets
3221 physical reads
0 redo size
37436 bytes sent via SQL*Net to client
3336 bytes received via SQL*Net from client
245 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
476 rows processed
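The Get_Projected_Status function itself was not posted; a purely hypothetical sketch of what such a lookup might look like, with names and logic assumed from the query above, is:

```sql
-- HYPOTHETICAL sketch only; the actual function was not shown in the post
create or replace function Get_Projected_Status(
    p_id       in number,
    p_max_date in date
) return varchar2
is
    l_code resource_schedule.projected_status_code%type;
begin
    -- fetch the status row for this id at its latest start_date;
    -- rownum = 1 guards against duplicate start_dates (an assumption)
    select projected_status_code
      into l_code
      from resource_schedule
     where id = p_id
       and start_date = p_max_date
       and entry_type = 'C'
       and rownum = 1;
    return l_code;
end Get_Projected_Status;
/
```

Note the 476 recursive calls in the statistics above: the function fires once per row returned, which is the cost of this approach.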


In summary, Oracle is not necessarily working as we would expect. The MAX subquery works as expected in SQL Server, but Oracle struggles with it.

Q: Perhaps this indicates a Tuning problem?
A: Maybe. Oracle can process the query efficiently with exactly the same setup, but it requires our Customers to buy Enterprise Edition and requires further development

Q: Is this feasible?
A: Unfortunately, the Customer's budget cannot cover this, and why should they? But, Oracle can process MAX subqueries if you embed part of the result in a
Function (i.e. still further development). However, the SQL is then not portable.


Is it true, therefore, that we are forced to code around what appear to be deficiencies in Oracle's ability to process straight SQL? We realise that Oracle and SQL Server are very different, but find the 'workarounds' required difficult to explain and justify.

Any comments would be appreciated.

Tom Kyte
December 06, 2005 - 2:53 pm UTC

analytics are a FEATURE of Oracle 9i standard, enterprise, personal editions and up. (even XE in 10g has it!)


Use analytics.


and even if you don't, you can use:

SELECT ID,
substr(
MAX(to_char(start_date,'yyyymmddhh24miss')||projected_status_code),
15 ) projected_status_code
FROM resource_schedule
WHERE entry_type = 'C'
GROUP BY ID
/

and the right analytic would likely be:

select *
from (select id, projected_status_code, start_date,
max(start_date) over (partition by id) max_dt
from resource_schedule)
where max_dt = start_date;

probably not the one you have, which scans, groups, then windows, then filters.


I would say it looks like your default memory settings are a tad low if you are sorting to disk in the first place. Might that be part of the issue here?

MAX Subquery

Andy, December 07, 2005 - 4:39 am UTC

Thanks for your reply. Very lateral!

OLAP's still disallow portability, so we'll have to weigh up that one.

We'll have a look at the memory settings, too.

Tom Kyte
December 07, 2005 - 7:13 am UTC

SELECT ID,
substr(
MAX(to_char(start_date,'yyyymmddhh24miss')||projected_status_code),
15 ) projected_status_code
FROM resource_schedule
WHERE entry_type = 'C'
GROUP BY ID
/

all databases..... and likely better than analytics in this particular case.

Max Subquery

Andy, December 07, 2005 - 4:44 am UTC

Scratch the OLAP comment - just found out that SQL Server has OLAP too!

Tom Kyte
December 07, 2005 - 7:13 am UTC

different kind of olap, as far as I know, they have barely implemented the ANSI analytic functions.

A reader, December 16, 2005 - 11:43 am UTC

Hi Tom,

UPDATE client_details c
SET mktg_level_id =
(SELECT MAX (mk_c.customer_id)
FROM customer mk_c,
customer_hierarchy mk_ch
WHERE mk_c.valid_flag = 'Y'
AND mk_c.cl_mktg_lvl_client_loc = 1
AND mk_ch.valid_flag = 'Y'
AND mk_ch.child_customer_id = c.cust_level_id
AND mk_ch.parent_customer_id = mk_c.customer_id),
mktg_level_name =
(SELECT MAX (mk_c.customer_description)
FROM customer mk_c,
customer_hierarchy mk_ch
WHERE mk_c.valid_flag = 'Y'
AND mk_c.cl_mktg_lvl_client_loc = 1
AND mk_ch.valid_flag = 'Y'
AND mk_ch.child_customer_id = c.cust_level_id
AND mk_ch.parent_customer_id = mk_c.customer_id)
WHERE c.cust_level_id IN (
SELECT mk_ch.child_customer_id
FROM customer mk_c,
customer_hierarchy mk_ch
WHERE mk_c.valid_flag = 'Y'
AND mk_c.cl_mktg_lvl_client_loc = 1
AND mk_ch.valid_flag = 'Y'
AND mk_ch.child_customer_id = c.cust_level_id
AND mk_ch.parent_customer_id = mk_c.customer_id)


Which is the better way to do the above update?

client_details has around 800,000 rows

Current Plan is

Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop

UPDATE STATEMENT Optimizer Mode=CHOOSE 1 33
UPDATE CLIENT_DETAILS
FILTER
TABLE ACCESS FULL CLIENT_DETAILS 1 53 21
TABLE ACCESS BY INDEX ROWID CUSTOMER 1 11 6
NESTED LOOPS 1 25 12
TABLE ACCESS BY INDEX ROWID CUSTOMER_HIERARCHY 1 14 6
INDEX RANGE SCAN CUST_CHILD_I1 1 3
INDEX RANGE SCAN CUSTOMER_I0 3 2
SORT AGGREGATE 1 25
TABLE ACCESS BY INDEX ROWID CUSTOMER 1 11 6
NESTED LOOPS 3 75 18
TABLE ACCESS BY INDEX ROWID CUSTOMER_HIERARCHY 2 28 6
INDEX RANGE SCAN CUST_CHILD_I1 4 3
INDEX RANGE SCAN CUSTOMER_I0 3 2
SORT AGGREGATE 1 61
TABLE ACCESS BY INDEX ROWID CUSTOMER 1 47 6
NESTED LOOPS 3 183 18
TABLE ACCESS BY INDEX ROWID CUSTOMER_HIERARCHY 2 28 6
INDEX RANGE SCAN CUST_CHILD_I1 4 3
INDEX RANGE SCAN CUSTOMER_I0 3 2


Tom Kyte
December 16, 2005 - 1:04 pm UTC

definitely START with:

set (a,b) = ( select max(x), max(y) from ..... )

but sigh, no version, so who knows if this could be done via merge or not :(
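Tom's combined-set suggestion, sketched against the posted update (all table and column names come from the post; the filtering predicate is assumed to be unchanged):

```sql
-- a sketch of the single-subquery form Tom suggests (names from the posted update);
-- the hierarchy lookup now runs once per row instead of twice
update client_details c
   set (mktg_level_id, mktg_level_name) =
       ( select max(mk_c.customer_id), max(mk_c.customer_description)
           from customer mk_c, customer_hierarchy mk_ch
          where mk_c.valid_flag = 'Y'
            and mk_c.cl_mktg_lvl_client_loc = 1
            and mk_ch.valid_flag = 'Y'
            and mk_ch.child_customer_id = c.cust_level_id
            and mk_ch.parent_customer_id = mk_c.customer_id )
 where exists ( select null
                  from customer mk_c, customer_hierarchy mk_ch
                 where mk_c.valid_flag = 'Y'
                   and mk_c.cl_mktg_lvl_client_loc = 1
                   and mk_ch.valid_flag = 'Y'
                   and mk_ch.child_customer_id = c.cust_level_id
                   and mk_ch.parent_customer_id = mk_c.customer_id );
```

As in the original, MAX(customer_id) and MAX(customer_description) may come from different parent rows; if they must come from the same row, a different construct would be needed.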

why not FAST FULL SCAN

Rajeswari, December 23, 2005 - 2:06 am UTC

select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE 9.2.0.6.0 Production
TNS for HPUX: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production


set autotrace traceonly explain

QUERY 1
==========
SELECT port.LOCATION LOCATION,port.EQUIP_ID EQUIP_ID,
port.CHASSIS_ID CHASSIS_ID, port.PORT_ID PORT_ID,
port.PORT_TYPE PORT_TYPE,
port.ASSGNMT_RESTRCT ASSGNMT_RESTRCT
FROM
xbo_et_port port,
(select asptmp.location,asptmp.equip_id,asptmp.chassis_id,asptmp.port_id
from
(select asp.location,asp.equip_id,asp.chassis_id,asp.port_id,asp.ckt_nbr,asp.version,
max(asp.ckt_nbr) over (partition by asp.location,asp.equip_id,asp.chassis_id,asp.port_id) maxcktnbr,
min(asp.version) over (partition by asp.location,asp.equip_id,asp.chassis_id,asp.port_id) minversion
from xbo_st_portckt asp
) asptmp
where asptmp.ckt_nbr = asptmp.maxcktnbr and asptmp.version = asptmp.minversion
) asgnport
WHERE
asgnport.LOCATION = port.location AND asgnport.EQUIP_ID = port.equip_id
and asgnport.CHASSIS_ID = port.chassis_id AND asgnport.PORT_ID = port.port_id;


----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 348 | 56028 | 17 |
|* 1 | HASH JOIN | | 348 | 56028 | 17 |
|* 2 | VIEW | | 348 | 41760 | 5 |
| 3 | WINDOW BUFFER | | 348 | 12180 | 5 |
| 4 | INDEX FULL SCAN | XBO_ST_PORTCKT_AK1 | 348 | 12180 | 5 |
| 5 | TABLE ACCESS FULL | XBO_ET_PORT | 7395 | 296K| 10 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("ASPTMP"."LOCATION"="PORT"."LOCATION" AND
"ASPTMP"."EQUIP_ID"="PORT"."EQUIP_ID" AND
"ASPTMP"."CHASSIS_ID"="PORT"."CHASSIS_ID" AND
"ASPTMP"."PORT_ID"="PORT"."PORT_ID")
2 - filter("ASPTMP"."CKT_NBR"="ASPTMP"."MAXCKTNBR" AND
"ASPTMP"."VERSION"="ASPTMP"."MINVERSION")

QUERY 2
==========

select *
from
(select asp.ckt_nbr,asp.version,
max(asp.ckt_nbr) over (partition by asp.location,asp.equip_id,asp.chassis_id,asp.port_id) maxcktnbr,
min(asp.version) over (partition by asp.location,asp.equip_id,asp.chassis_id,asp.port_id) minversion,
port.LOCATION LOCATION,port.EQUIP_ID EQUIP_ID,
port.CHASSIS_ID CHASSIS_ID, port.PORT_ID PORT_ID,
port.PORT_TYPE PORT_TYPE,
port.ASSGNMT_RESTRCT ASSGNMT_RESTRCT
from
xbo_st_portckt asp,
xbo_et_port port
WHERE
asp.LOCATION = port.location AND asp.EQUIP_ID = port.equip_id
and asp.CHASSIS_ID = port.chassis_id AND asp.PORT_ID = port.port_id
) asptmp
where asptmp.ckt_nbr = asptmp.maxcktnbr and asptmp.version = asptmp.minversion;

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 348 | 50460 | 19 |
|* 1 | VIEW | | 348 | 50460 | 19 |
| 2 | WINDOW SORT | | 348 | 26448 | 19 |
|* 3 | HASH JOIN | | 348 | 26448 | 13 |
| 4 | INDEX FAST FULL SCAN| XBO_ST_PORTCKT_AK1 | 348 | 12180 | 2 |
| 5 | TABLE ACCESS FULL | XBO_ET_PORT | 7395 | 296K| 10 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ASPTMP"."CKT_NBR"="ASPTMP"."MAXCKTNBR" AND
"ASPTMP"."VERSION"="ASPTMP"."MINVERSION")
3 - access("ASP"."LOCATION"="PORT"."LOCATION" AND
"ASP"."EQUIP_ID"="PORT"."EQUIP_ID" AND "ASP"."CHASSIS_ID"="PORT"."CHASSIS_ID"
AND "ASP"."PORT_ID"="PORT"."PORT_ID")

Note: cpu costing is off

Constraint
===========
1 select cast(column_name as varchar2(30)) column_name,position from user_cons_columns
2* where constraint_name='XBO_ST_PORTCKT_AK1' order by position
ncon@DEV74> /

COLUMN_NAME POSITION
------------------------------ ----------
LOCATION 1
EQUIP_ID 2
CHASSIS_ID 3
PORT_ID 4
CKT_NBR 5
VERSION 6
CHANNEL_NBR 7
LOG_PORT_ID 8
LOG_CHAN_ID 9



When I run the below query I am getting index "FULL SCAN" on XBO_ST_PORTCKT table.

QUERY 3
==========

select asp.location,asp.equip_id,asp.chassis_id,asp.port_id,asp.ckt_nbr,asp.version,
max(asp.ckt_nbr) over (partition by asp.location,asp.equip_id,asp.chassis_id,asp.port_id) maxcktnbr,
min(asp.version) over (partition by asp.location,asp.equip_id,asp.chassis_id,asp.port_id) minversion
from xbo_st_portckt asp;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=348 Bytes=12180)
1 0 WINDOW (BUFFER) (Cost=5 Card=348 Bytes=12180)
2 1 INDEX (FULL SCAN) OF 'XBO_ST_PORTCKT_AK1' (UNIQUE) (Cost=5 Card=348 Bytes=12
180)


The 2nd query should be more efficient because of lower consistent gets, the predicate being moved into the inline view, and the analytic function being executed at the end.

1) In query 2, a FAST FULL SCAN happens, apparently because the predicate is moved into the inline view, but it does not happen in the first query. I am not very clear on this; will you please elaborate?

2) In query 3, I am using the analytic function only on indexed columns of the XBO_ST_PORTCKT table. Then why is a FAST FULL SCAN not happening?

3) If time permits, will you please explain WINDOW (SORT) and WINDOW (BUFFER)?


Tom Kyte
December 23, 2005 - 12:19 pm UTC

well, it is always "in general, lowering the logical IO is the goal".

the index full scan is skipping the sort step - and LIO's could well be exactly the same using a full scan or fast full scan (in fact, full scan could do *less* LIO since it won't read all of the branch blocks)

Rajeswari, December 28, 2005 - 5:27 am UTC

Thanks Tom. Your response sheds some light on understanding the explain plan better.

Will you please provide a link where I can get more details on WINDOW BUFFER and WINDOW SORT?

One more question - Sorting

Rajeswari, December 28, 2005 - 7:24 am UTC

I tried the Oracle documentation for an example of how an index FULL scan eliminates sorting while an index FAST full scan does not, but was not able to find sufficient detail.

Will you please give a simple example which shows this (sorting) step and explain it (similar to the pseudo-code explanations you have given for nested loops vs. hash joins)?

Tom Kyte
December 28, 2005 - 9:55 am UTC

index full scans read the index "in order", a single block at a time. They start at the root, navigate down the left-hand side to the very first leaf block, and then read all of the leaf blocks in order (they all point to each other in a doubly linked list).

index FAST full scans just use multiblock IO (like a full table scan would) to read ALL of the blocks of an index, in any order. They ignore the branch blocks and just process the leaf blocks, but the leaf blocks are hit in "some random order", not predictable.
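A hedged way to see the two access paths side by side is to force each with a hint (INDEX and INDEX_FFS are standard optimizer hints; the table and index names below are taken from the post above):

```sql
-- index FULL scan: starts at the leftmost leaf and walks the leaf chain
-- in key order (single-block reads), so rows arrive already sorted and
-- no separate sort step is needed for this ORDER BY
select /*+ index(asp xbo_st_portckt_ak1) */
       location, equip_id, chassis_id, port_id
  from xbo_st_portckt asp
 order by location, equip_id, chassis_id, port_id;

-- index FAST full scan: multiblock reads of all index blocks in whatever
-- order they come; there is no ordering guarantee, so an ORDER BY here
-- would add a sort step to the plan
select /*+ index_ffs(asp xbo_st_portckt_ak1) */ count(*)
  from xbo_st_portckt asp;
```

Comparing the two plans shows the trade-off: the full scan saves the sort, the fast full scan reads the blocks faster.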

Thanks Tom for the detailed explanation

A reader, December 29, 2005 - 12:33 am UTC

The explanation helps me understand index scans better.
Thanks, Tom, for the great work.

carrying out tests

A reader, December 29, 2005 - 5:33 am UTC

Hi

I don't understand why, in your test, you order by pet_id. You said it's to ensure the index is used. How so? If you don't order by pet_id, won't the index be used?

Tom Kyte
December 29, 2005 - 11:45 am UTC

the optimizer wasn't smart enough to use the index unless I ordered by the entire index key - even though that technically wasn't necessary due to the "where pet_id = constant" predicate.

Multiple max() and min() sub-queries

naresh, December 30, 2005 - 5:03 am UTC

Tom,

Great thread!

Here is my test case (a longish one, I hope it's ok) - I have a table: 

create table port_ctn (subs_no number(9), status char(1), seq_no number(9))

The table maintains the history of a subscriber (identified by subs_no) and status, with seq_no increasing for every new version.


What we need is

A) To get the first record with status != 'D', that occurs after the last record with status 'D'. 
B) If a status 'D' does not exist, then get the first record.

(All ordering in the above requirement is stated in terms of seq_no values).

test data for case A

delete port_ctn;
insert into port_ctn values (100,'D',1);
insert into port_ctn values (100,'N',2);
insert into port_ctn values (100,'R',3);
insert into port_ctn values (100,'D',4);
insert into port_ctn values (100,'R',5);
insert into port_ctn values (100,'N',6);
insert into port_ctn values (100,'D',7);
insert into port_ctn values (100,'N',8);
insert into port_ctn values (100,'D',9);
commit;

for case B

delete port_ctn;
insert into port_ctn values (100,'N',2);
insert into port_ctn values (100,'R',3);
commit;

First came up with this:

Select * from 
Port_ctn
Where subs_no = 100 and seq_no = 
    (select min(seq_no) from port_ctn where subs_no = 100 and status != 'D'
         and seq_no > (select nvl(max(seq_no),0) from port_ctn a where subs_no = 100 and status = 'D'
                    and exists (select null from port_ctn where subs_no = 100 and seq_no > a.seq_no
                                and status != 'D')
                 )
    )

Then thought over analytics and came up with this

Select subs_no, decode(status, 'D', next_status, status),
         decode(status, 'D', next_seq, seq_no)
from
(Select subs_no, seq_no, status, lag(status) over (order by seq_no desc) as next_status,  -- for case A
                    lag(seq_no) over (order by seq_no desc) as next_seq, -- for case A
                    lead (status) over (order by seq_no desc) as prev_status  -- for case B
From port_ctn
Where subs_no = 100)
Where ( (status = 'D' and nvl(next_status,'D') != 'D') -- for case A
   or   (prev_status is NULL) -- for case B
    )
 and rownum = 1


For test case A:
SQL> Select subs_no, seq_no, status, lag(status) over (order by seq_no desc) as next_status,  -- for case A
  2                                     lag(seq_no) over (order by seq_no desc) as next_seq, -- for case A
  3                                     lead (status) over (order by seq_no desc) as prev_status  -- for case B
  4  From port_ctn
  5  Where subs_no = 100
  6  /

   SUBS_NO     SEQ_NO STATUS NEXT_STATU   NEXT_SEQ PREV_STATU
---------- ---------- ------ ---------- ---------- ----------
       100          9 D                            N    
       100          8 N      D                   9 D
       100          7 D      N                   8 N          >> selected for case A
       100          6 N      D                   7 R
       100          5 R      N                   6 D
       100          4 D      R                   5 R    >> selected for case A
       100          3 R      D                   4 N
       100          2 N      R                   3 D
       100          1 D      N                   2    >> selected, for both case A and B

**** QUESTION ****: Will the rownum = 1 always work in this case, meaning I can depend on the order of retrieval from sub-query? 
The presence of two conditions in the outer query does not matter?

For Test Case B:
  1  Select subs_no, seq_no, status, lag(status) over (order by seq_no desc) as next_status,  -- for case A
  2                                     lag(seq_no) over (order by seq_no desc) as next_seq, -- for case A
  3                                     lead (status) over (order by seq_no desc) as prev_status  -- for case B
  4  From port_ctn
  5* Where subs_no = 100
SQL> /

   SUBS_NO     SEQ_NO STATUS NEXT_STATU   NEXT_SEQ PREV_STATU
---------- ---------- ------ ---------- ---------- ----------
       100          3 R                            N
       100          2 N      R                   3        >> selected for case B

In this case it seems the rownum works OK as only one row gets selected (?).

This is more or less a "design review" - but I would really appreciate your thoughts.

Thanks,
Naresh. 

Tom Kyte
December 30, 2005 - 9:32 am UTC

ops$tkyte@ORA10GR2> break on subs_no skip 1
ops$tkyte@ORA10GR2> select *
  2    from port_ctn
  3   order by subs_no, seq_no;

   SUBS_NO S     SEQ_NO
---------- - ----------
       100 D          1
           N          2
           R          3
           D          4
           R          5
           N          6
           D          7
           N          8
           D          9

       101 N          2
           R          3

       102 D          2
           D          3


13 rows selected.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select subs_no,
  2         to_number( substr( max( to_char( seq_no, 'fm00000') || status ), 1, 5 ) ) seq_no,
  3         substr( max( to_char( seq_no, 'fm00000') || status ), 6 ) status
  4    from (
  5  select subs_no,
  6         status,
  7         nvl(lag(status) over(partition by subs_no order by seq_no),'D') last_status,
  8         seq_no
  9    from port_ctn
 10         )
 11   where last_status = 'D'
 12     and status <> 'D'
 13   group by subs_no
 14  /

   SUBS_NO     SEQ_NO ST
---------- ---------- --
       100          8 N

       101          2 N


is one approach. 

Thank You !

Naresh, December 31, 2005 - 4:25 pm UTC

Hello Tom,

Thanks a lot for your insight - even your test cases were more complete.

Looking at your query, which is much cleaner, I came up with the query below. I understand that it will work for only one subs_no at a time, but it looks much better than my first one (which also works for one subs_no only)!

**** MY QUESTION STILL REMAINS THOUGH ****:

Can I depend on the order of the inner query - meaning can I use rownum = 1 in the outer query?

select subs_no, status, seq_no from (
select subs_no, status,
nvl(lead(status)
over(partition by subs_no order by seq_no desc),'D') prev_status,
seq_no
from port_ctn where subs_no = 100)
where status != 'D' and prev_status = 'D'
and rownum = 1

Thanks and a Happy New Year to You,
Naresh.


Tom Kyte
December 31, 2005 - 5:15 pm UTC

confusion, mine works on any number of subs_no? that is why I changed your inputs.


I see no order by in your query, hence, rownum of 1 is assigned to "some first row", nothing more - nothing less.



Order by

Naresh, January 01, 2006 - 7:06 am UTC

Tom,

I meant

- your query with the max in the outer select works for multiple subs_no
- the new one I wrote works only for one subs_no (as did the first one I wrote)


Regarding the rownum=1,

I had thought that the order by in the over() clause would perform the ordering - however, from the manual (8.1.7)

***
Note: Analytic functions always operate on rows in the order specified in the ORDER_BY_clause of the function. However, the ORDER_BY_clause of the function does not guarantee the order of the result. Use the ORDER_BY_clause of the query to guarantee the final result ordering.
***

So OK, I will need an explicit order by seq_no desc in the inner query.

Thanks,
Naresh.


Tom Kyte
January 01, 2006 - 11:08 am UTC

The order by in the over() would order the data for the evaluation of the analytic function - this is true - so the lag() value would always be "correct" in that regards - but it does not promise that the result set is subsequently ordered by that same set of attributes!
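So Naresh's query would need the ORDER BY inside the inline view, following the same top-n pattern used earlier in this thread. A sketch, using the port_ctn table from his test case:

```sql
-- sketch: an explicit ORDER BY in the inline view makes rownum = 1 deterministic
select subs_no, status, seq_no
  from (select subs_no, status,
               nvl(lead(status) over (partition by subs_no order by seq_no desc),
                   'D') prev_status,
               seq_no
          from port_ctn
         where subs_no = 100
         order by seq_no desc)
 where status != 'D'
   and prev_status = 'D'
   and rownum = 1;
```

The analytic's own order by still computes prev_status correctly; the inline view's order by is what guarantees which qualifying row rownum = 1 keeps (here, the one with the highest seq_no).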

is there a better way to re-write this query

A reader, February 03, 2006 - 10:32 am UTC

select equipment, equipment_type, action, audit_id, primary_hazards,damages, ssl_user_scac, voided_date, in_visit, out_visit, chassis_container,yard_area
from edi_outbound_view
where action <> 'LOC'
and audit_id > 2751793
and (equipment_type = 'CH' or (equipment_type = 'CN' and audit_id > 9025262 ))
order by ssl_user_scac, created_date

Tom Kyte
February 03, 2006 - 5:18 pm UTC

looks dandy to me, says exactly what you want it to say....

This is one way....

A reader, February 03, 2006 - 12:02 pm UTC

select equipment, equipment_type, action, audit_id, primary_hazards, damages,
ssl_user_scac, voided_date, in_visit, out_visit, chassis_container, yard_area
from edi_outbound_view
where action <> 'LOC'
and (audit_id > 2751793
and equipment_type = 'CH' or (equipment_type = 'CN' and audit_id > 9025262 ))
order by ssl_user_scac, created_date;

is there a better way to re-write this query

Tim, February 05, 2006 - 8:28 pm UTC

I would disagree with the response - well - perhaps this is another way - but not one I would suggest.

I have some problems with the second (rewritten) query.

Does this even give the same result? I don't know. Why should I even have to think this hard about a query which seems rather simple? My own opinion: whenever AND and OR are combined, make liberal use of parentheses around the parts, keeping like terms together - exactly as is done in the first query.

Now, when going back to look at the query (maybe six months later - maybe a different person looking at it) - it is just much easier to understand. In the first query, obviously the inner AND is evaluated first, then the OR. (Forget about any possible boolean short circuit - which may or may not exist.) Then - just a bunch of ANDs - very easy to wrap your mind around.

But in the second query - it is just more confusing.

Maybe it is just me - but I see no gain in trying to make something more complicated than it has to be.
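For reference, AND binds more tightly than OR in SQL, so an unparenthesized mix reads differently than it may look at a glance. A minimal, generic illustration (hypothetical table and columns, not the query above):

```sql
-- a AND b OR c  is evaluated as  (a AND b) OR c,  not  a AND (b OR c)
select *
  from some_table                       -- hypothetical table
 where a = 1 and b = 2 or c = 3;        -- rows with (a=1 and b=2), plus ALL rows with c=3

select *
  from some_table
 where a = 1 and (b = 2 or c = 3);      -- a=1 is required in every returned row
```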

why do I get "invalid column name" in a subquery?

M Jackson, February 23, 2006 - 4:49 pm UTC

I've learned so much from reading Ask Tom, but today I've hit a snag when trying to apply a trick I learned in this thread:

I need to select some data about a set of customer accounts from table c, plus their most recent payment amount from table t.

Table t has many types of cash transactions where payments are identified by trx_type = 'P'.

This query gets the latest payment for customer 123456:

Select amount
from (select amount
from t
where trx_type = 'P'
and customer = 123456
order by trx_date DESC)
where rownum = 1;

The query below gets the set of accounts and other data I need except for the last payment:

select c.account_number,
c.agreement_id
from c
where c.agreement_type = 'B';

I try to incorporate the query for the last payment amount to get the last payment for all the accounts, like this:

select c.account_number,
c.agreement_id
(Select amount
from (select amount
from t
where t.trx_type = 'P'
and t.customer = c.account_number
order by t.trx_date DESC)
where rownum = 1) lastpay
from bl_cus_def_payments cdp
where cdp.agreement_type = 'B';

I get the error "invalid column name" for c.account_number.

Table t is large. I'm trying to avoid a full scan. I've had a similar result when I try to use analytics.

Can I get all the information I need from a single query?

Thanks for your help.


Tom Kyte
February 23, 2006 - 8:09 pm UTC

instead of:

Select amount
from (select amount
from t
where trx_type = 'P'
and customer = 123456
order by trx_date DESC)
where rownum = 1;


select to_number(substr(max(to_char(trx_date,'yyyymmddhh24miss')||amount),15))
from t
where trx_type = 'P'
and t.customer = c.account_number


the problem is - the correlation variables can only go one level down in the scalar subquery
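Putting that together with the original outer query, the whole thing would look something like this (a sketch using the table and column names from the question; the date-to-character trick keeps the subquery a single level deep, so the correlation to c.account_number is legal):

```sql
select c.account_number,
       c.agreement_id,
       ( select to_number(
                  substr(
                    max(to_char(t.trx_date,'yyyymmddhh24miss') || t.amount),
                    15))                 -- 'yyyymmddhh24miss' is 14 chars; amount starts at 15
           from t
          where t.trx_type = 'P'
            and t.customer = c.account_number ) lastpay
  from c
 where c.agreement_type = 'B';
```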


Brilliant!

M Jackson, February 24, 2006 - 10:48 am UTC


A variation on the initial theme

Iudith Mentzel, March 14, 2006 - 12:28 pm UTC

Hi Tom,

First of all, I should underline how useful this thread is
for making clear some basic aspects.

In the following examples I use these tables:

a. A daily currency rates table CURRENCY_RATES,
with a unique index IX_GIZ_DRATE on
( CURRENCY, RATE_DATE ), about 300,000 rows.
b. TAB_CURRENCY, with a unique index on CURR_CODE,
about 200 rows.

My question is directly related to the very first explanation of yours on this thread:

Indeed, for returning the last rate of a given CURRENCY
effective for a given DATE, using a query like:

select t.rate_date, t.rate
from (select rate_date, rate
from currency_rates r
where currency = '&my_currency'
and rate_date <= to_date('&my_date','DDMMYYYY')
order by currency desc, rate_date desc
) t
where rownum = 1

works great and gives a plan like the following:

QUERYPLAN
--------------------------------------------------
COUNT STOPKEY
VIEW
INDEX RANGE SCAN DESCENDING IX_GIZ_DRATE

Here ix_giz_drate is a unique index on (currency, rate_date)

Especially, I totally agree with you that this is the correct solution instead of using an INDEX_DESC hint.

By the way, the following interesting thing happens
(Oracle 8.1.7.4):

If instead of selecting specific columns, I use
SELECT *, then the plan changes to a full table scan:

select * from
( select *
from currency_rates
where currency = '&my_currency'
and rate_date <= to_date('&my_date','DDMMYYYY')
order by currency desc, rate_date desc )
where rownum = 1
/

QUERYPLAN
------------------------------------------------------------
COUNT STOPKEY
VIEW
SORT ORDER BY STOPKEY
TABLE ACCESS FULL GIZ_DAILY_RATE


But the main question I'd like to ask is the following:

If I want to retrieve the LAST rate date not for a specific
CURRENCY and DATE, but for each member of a list of (CURRENCY,DATE) pairs, coming from a "qualifiers" table,
( here I use the TAB_CURRENCY as a qualifiers table ),
then of course I can use the following "classic" select:

select c.curr_code, max(r.rate_date)
from currency_rates r, tab_currency c
where r.currency = c.curr_code
and r.rate_date <= c.curr_refins
group by c.curr_code
/

but this has a plan with an INDEX FAST FULL scan and
not an INDEX RANGE SCAN DESCENDING with a COUNT STOPKEY.

QUERYPLAN
---------------------------------------------------
SORT GROUP BY
HASH JOIN
TABLE ACCESS FULL TAB_CURRENCY
INDEX FAST FULL SCAN IX_GIZ_DRATE

This is in fact a "simplified version" for a very
wide range of questions.

For example, in real-life, instead of CURRENCY_RATES
we have a large FACTS table (millions of rows),
and instead of the TAB_CURRENCY "qualifiers" table
we may have several "code" tables with different conditions
on them (usually "materialized" in a "single" qualifiers
table, using your "ROWNUM materialization technique",
learnt from you a month ago, during your visit in Israel !),

but the problem I want to solve is the same:

For EACH "qualifier" record (qualifier record meaning
a given ENTITY + DATE pair), I want the last
record from the FACTS table for that given ENTITY+DATE
( "last" has the meaning of ordering by DATE separately
for each entity ).

The case of the CURRENCY_RATES table above is maybe not
"the typical case", because these tables are not so big,
so even an INDEX FAST FULL SCAN is so-called "fast enough",
although much less fast than a descending index scan.

But in real life, with a huge FACTS table, I definitely
need the "INDEX descending scan" behavior, where the
scan stops on the first record found FOR EACH ENTITY+DATE.

I can achieve this by using a scalar query to retrieve
the last date only with the INDEX_DESC technique
( the one that you would rather fire the developer who
dares to use it ! ):

select c.curr_code,
( select /*+ INDEX_DESC (R IX_GIZ_DRATE) */
r.rate_date
from currency_rates r
where r.currency = c.curr_code
and r.rate_date <= c.curr_refins
and rownum = 1 ) as RATE_DATE
from
tab_currency c
/

( and this is extremely fast ! )

but unfortunately I cannot use the correct
"ORDER BY + ROWNUM" technique, because this requires
an additional inner-view level and I cannot propagate
the qualifiers (ENTITY+DATE) to the inner-most level,
that is, two levels deeper from the qualifiers table:

select c.curr_code,
( select t.rate_date
from
( select r.rate_date
from currency_rates r
where r.currency = c.curr_code
and r.rate_date <= c.curr_refins
order by r.currency desc, r.rate_date desc ) t
where rownum = 1 )
from
tab_currency c
/

which of course will not work
( ORA-00904 invalid column name ).

Just for checking things out, I tried the INDEX_DESC
method even for a list of about 8,000 qualifier pairs
and it is still far faster than any other approach
I tried up to now.

Of course, an alternative to using the INDEX_DESC hint
could be to simply write a PL/SQL function that will
work for a SINGLE "entity+date" pair, and maybe will
return an SQL "object type" ( a record containing ALL
the facts table columns, not just the last DATE ).

Or, another approach could be to use analytic functions,
to retrieve the last facts record with all its columns,
but I'm not sure whether using a construct like

... OVER (PARTITION BY "entity"
ORDER BY "date" DESC)

could achieve the effect of "COUNT STOP KEY" for each
entity+date pair, separately, just as a PL/SQL function
certainly will.

I intentionally avoided posting one of our real-life queries,
because it could be too "verbose" ( I could do it if you
have patience for it ), but I'd like enormously to hear your opinion on this category of queries,
what do you think is the best approach.

Yes, you are right if you would say that "it depends",
of course, it depends on the real schema design and the
physical characteristics of the data, but I'm kind of a
"theoretician" that needs to agree first with the principle
of a solution, before I even think of what could make it
work amazingly well or wrongly bad in one specific case
or another.

Thanks a lot for your patience, thanks again and again
for the great privilege of having you here in Israel
with us and, also HAPPY BIRTHDAY !

My best regards,
Iudith Mentzel
ZIM Integrated Shipping Services Ltd.
Haifa, Israel
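One more alternative worth mentioning for this "last fact row per qualifier" pattern, on releases where it is available, is the FIRST/LAST aggregate family - it can return columns from the max-date row in a single GROUP BY, without a hint and without the extra view level. An untested sketch against the tables above:

```sql
select c.curr_code,
       max(r.rate_date) last_rate_date,
       -- "keep (dense_rank last ...)" picks the rate from the max rate_date row
       max(r.rate) keep (dense_rank last order by r.rate_date) last_rate
  from currency_rates r,
       tab_currency   c
 where r.currency  = c.curr_code
   and r.rate_date <= c.curr_refins
 group by c.curr_code;
```

This still reads the qualifying rate rows (it does not get the per-entity COUNT STOPKEY behavior), but it avoids both the INDEX_DESC hint and the two-level correlation problem.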


Guna Srinivas, March 31, 2006 - 5:23 am UTC

Hi Tom,

How do we get NVL values from a subquery when the subquery returns no row?

For example:
Update table_A set (a1,a2) = (select nvl(row1,0), nvl(row2,0) from table_b where row3 = 'X')

Assume that the subquery will return at most one row at any time.
Now, if the subquery returns no row, it updates the columns with null. How do I rewrite this query?

Thanks in advance...



Tom Kyte
March 31, 2006 - 12:19 pm UTC

and what would you like it to return exactly?

in order to rewrite it, we need to know what you want it to do.

Guna srinivas, March 31, 2006 - 12:43 pm UTC

Hi Tom,

Thanks for your immediate response..

Actually I have a query like this:

update table_a set a1 = nvl((select col1 from table_b where col3 = 'X'),0), a2 = nvl((select col2 from table_b where col3 = 'X'),0)

If the where clause fails - I mean, if no row has col3 = 'X' - then each subquery returns no row, so I can use NVL to update 0 when it is null.

So I tried combining the two subqueries into one, like this:

Update table_A set (a1,a2) = (select nvl(row1,0), nvl(row2,0) from table_b where row3 = 'X')

but if there is no row - I mean, when the where clause returns false - it updates a1 and a2 to null.

I don't want to select two times from table_b.

So how do I update table_A (a1,a2) without the NULL value?

Thanks in advance..


Tom Kyte
March 31, 2006 - 1:22 pm UTC

update A set (x,y) = (select r1, r2 from b where r3 = 'X' )
WHERE EXISTS (select null from b where r3 = 'X' )


why even update if the row doesn't exist... the where exists will sort of pre-empt the update from happening.
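Editor's followup sketch: if the requirement really were to set the columns to zero when no matching row exists, another single-scan option (given the stated assumption that the subquery returns at most one row) is to aggregate - an aggregate query always returns exactly one row, so NVL then has something to work on:

```sql
-- max() of one row is that row's value; max() over zero rows is NULL,
-- which nvl() turns into 0 -- still a single visit to table_b
update table_a
   set (a1, a2) = ( select nvl(max(row1), 0), nvl(max(row2), 0)
                      from table_b
                     where row3 = 'X' );
```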

Guna srinivas, March 31, 2006 - 1:44 pm UTC

Thanks Tom for your immediate reply

query tuning

sharma, January 10, 2007 - 7:39 am UTC

Dear Tom,

This is how the query looks; if you want, I can post the explain plan for it. I have been tuning this query for many days but I am not able to tune it properly.

My question is:

1. As I understand it, if we use aggregate functions an index won't be used, and this query consists almost entirely of aggregate functions. How do I tune this query? Could you please give me some pointers - it would be a great help if you could advise whether I have to rewrite it.


SELECT
ITM.ITM_INV_ITEM_ID
,ITM.ITM_ORGANIZATION_ID
,ITM.ITM_ITEM_NO
,ITM.ITM_DESCRIPTION
,ITM.ITM_BUYER_ID
,ITM.ITM_BUYER_NAME
,ITM.ITM_PLANNER_CODE
,ITM.ITM_MAKE_OR_BUY
,ITM.ITM_MIN_MINMAX_QTY
,ITM.ITM_MAX_MINMAX_QTY
,ITM.ITM_MIN_ORDER_QTY
,ITM.ITM_PREPROCESSING_LEAD_TIME
,ITM.ITM_FULL_LEAD_TIME
,ITM.ITM_TOTAL_LEAD_TIME
,ITM.ITM_MRP_PLANNING_CODE
,ITM.ITM_FIXED_DAYS_SUPPLY
,ITM.ITM_MRP_PLANNING_MEANING
,ITM.ITM_STD_LOT_SIZE
,ITM.ITM_SAFETY_STOCK
,ITM.ITM_FIXED_LOT_MULTIPLIER
,ITM.ITM_PRIMARY_UOM_CODE
,LOC.SA_SUB_AREA
-- ,LOC.SN_SECTION_NAME
,LOC.AR_AREA
,LOC.SN_ORG_ID
,ORG.IO_INV_ORG_ID
,ORG.IO_INV_ORG_NAME
,ORG.LE_LEGAL_ENTITY_ID
,ORG.LE_LEGAL_ENTITY_NAME
,ORG.OU_OPERATING_UNIT_ID
,ORG.OU_OPERATING_UNIT_NAME
,DFCT.DY_DAY_FK
,DFCT.IMP_IND_FLAG
,SUM(DFCT.ON_HAND_QTY) ON_HAND_QTY
,MAX(DFCT.STD_COST) STD_COST
,SUM(DFCT.STD_VALUE) STD_VALUE
,MAX(DFCT.AVG_COST) AVG_COST
,SUM(DFCT.AVG_VALUE) AVG_VALUE
,SUM(DFCT.GIT_QTY) GIT_QTY
,SUM(DFCT.GIT_STD_VALUE) GIT_STD_VALUE
,SUM(DFCT.GIT_AVG_VALUE) GIT_AVG_VALUE
,SUM(DFCT.SUM_RECEIPTS_IND) SUM_RECEIPTS_IND
,SUM(DFCT.SUM_RECEIPTS_IMP) SUM_RECEIPTS_IMP
,SUM(DFCT.COUNT_RECEIPTS_IMP) COUNT_RECEIPTS_IMP
,SUM(DFCT.COUNT_RECEIPTS_IND) COUNT_RECEIPTS_IND

,MAX(DFCT.LAST_RECEIPT_DATE_IND) LAST_RECEIPT_DATE_IND
,MAX(DFCT.LAST_RECEIPT_DATE_IMP) LAST_RECEIPT_DATE_IMP
,MIN(DFCT.FIRST_RECEIPT_DATE_IMP) FIRST_RECEIPT_DATE_IMP
,MIN(DFCT.FIRST_RECEIPT_DATE_IND) FIRST_RECEIPT_DATE_IND
,SUM(DFCT.PHY_INV_ADJ_QTY_ISSUES) PHY_INV_ADJ_QTY_ISSUES
,SUM(DFCT.PHY_INV_ADJ_QTY_RECEIPTS) PHY_INV_ADJ_QTY_RECEIPTS
,SUM(DFCT.PHY_INV_ADJ_AVG_VALUE_ISSUES) PHY_INV_ADJ_AVG_VALUE_ISSUES
,SUM(DFCT.PHY_INV_ADJ_AVG_VALUE_RECEIPTS) PHY_INV_ADJ_AVG_VALUE_RECEIPTS
,SUM(DFCT.PHY_INV_ADJ_STD_VALUE_ISSUES) PHY_INV_ADJ_STD_VALUE_ISSUES
,SUM(DFCT.PHY_INV_ADJ_STD_VALUE_RECEIPTS) PHY_INV_ADJ_STD_VALUE_RECEIPTS
,MAX(DFCT.LAST_TRANS_DATE_EXCP_RECPT) LAST_TRANS_DATE_EXCP_RECPT
,SUM(DFCT.QTY_COMPLETED) QTY_COMPLETED
,SUM(DFCT.QTY_SHIPPED) QTY_SHIPPED
,SUM(DFCT.TOTAL_QTY_CYCL_COUNT) TOTAL_QTY_CYCL_COUNT
,SUM(DFCT.CONSUMPTION_RATE_VAL) CONSUMPTION_RATE_VAL
,SUM(DFCT.CONSUMPTION_RATE_QTY) CONSUMPTION_RATE_QTY
,0 WIP_QTY
,0 WIP_STD_VALUE
,0 WIP_AVG_VALUE
from dm_inv_dly_item_cube DFCT
,DM_INV_ITEM_DIM ITM
,DM_INV_LOCATION_DIM LOC
,DM_INV_ORG_DIM ORG
,DM_TIME_DIM TD
where DFCT.DY_DAY_FK = TD.DY_DAY_PK
AND DFCT.ITM_INV_ITEM_FK = ITM.ITM_INV_ITEM_PK
AND DFCT.SN_LOCATION_ID_FK = LOC.SN_LOCATION_PK
AND DFCT.SI_INV_ORG_DIM_FK = ORG.SI_INV_ORG_DIM_PK
GROUP BY ITM.ITM_INV_ITEM_ID
,ITM.ITM_ORGANIZATION_ID
,ITM.ITM_ITEM_NO
,ITM.ITM_DESCRIPTION
,ITM.ITM_BUYER_ID
,ITM.ITM_BUYER_NAME
,ITM.ITM_PLANNER_CODE
,ITM.ITM_MAKE_OR_BUY
,ITM.ITM_MIN_MINMAX_QTY
,ITM.ITM_MAX_MINMAX_QTY
,ITM.ITM_MIN_ORDER_QTY
,ITM.ITM_PREPROCESSING_LEAD_TIME
,ITM.ITM_FULL_LEAD_TIME
,ITM.ITM_TOTAL_LEAD_TIME
,ITM.ITM_MRP_PLANNING_CODE
,ITM.ITM_FIXED_DAYS_SUPPLY
,ITM.ITM_MRP_PLANNING_MEANING
,ITM.ITM_STD_LOT_SIZE
,ITM.ITM_SAFETY_STOCK
,ITM.ITM_FIXED_LOT_MULTIPLIER
,ITM.ITM_PRIMARY_UOM_CODE
,LOC.SA_SUB_AREA
-- ,LOC.SN_SECTION_NAME
,LOC.AR_AREA
,LOC.SN_ORG_ID
,ORG.IO_INV_ORG_ID
,ORG.IO_INV_ORG_NAME
,ORG.LE_LEGAL_ENTITY_ID
,ORG.LE_LEGAL_ENTITY_NAME
,ORG.OU_OPERATING_UNIT_ID
,ORG.OU_OPERATING_UNIT_NAME
,DFCT.DY_DAY_FK
,DFCT.IMP_IND_FLAG;


analytical function use?

Mike, March 15, 2007 - 9:39 am UTC

Tom,

How can I improve this SQL query? Can I use an analytic function here, perhaps?


select c.course_no, c.name, a.service_id, count(distinct a.student_id), count(*) ,min(db_date), max(db_date), a.access
from student_activity a, permissions b, Region c
where a.student_id = b.student_id
and a.Region = c.Region
and upper(b.test_id) != 'P'
and a.school_days >= sysdate -1
group by c.course_no, c.name, a.service_id, a.access
order by c.course_no, c.name,a.service_id, a.access
Tom Kyte
March 15, 2007 - 12:29 pm UTC

you want to aggregate, hence analytics would not really be appropriate.

looks good to me.


thanks!

A reader, March 15, 2007 - 1:21 pm UTC


Query not using index

Suraj Sharma, March 21, 2007 - 2:07 am UTC

Hi Tom,

I am working on SQL tuning, for learning purposes, on Oracle 8.1.0.6. I found one difficult query performing badly in our database and am trying to tune it. The query is:

select *
from ( select searchstat0_.id as id3_,
              searchstat0_.materialNumber as material2_3_,
              searchstat0_.lotNumber as lotNumber3_,
              searchstat0_.searchDate as searchDate3_,
              searchstat0_.plant as plant3_,
              searchstat0_.businessUnit as business6_3_,
              searchstat0_.searchresult as searchre7_3_,
              searchstat0_.version as version3_,
              searchstat0_.isinternal as isinternal3_,
              ( select max(md.description)
                  from reguser.materialdescriptions md
                 where searchstat0_.materialNumber = md.materialNumber
                   and md.languageCode = 'EN' ) as formula0_
         from reguser.searchstatistics searchstat0_
        where searchstat0_.searchresult = :RESULT
          and searchstat0_.isinternal = :ISINTERNAL )
where rownum <= 100

There are indexes on table searchstatistics on the ISINTERNAL and MATERIALNUMBER columns. The query is not using either of these indexes - I even added a hint asking the query to use an index. My optimizer mode is set to FIRST_ROWS. At first I was under the impression that the indexes were not being used because I use rownum, but when I issued the same query without rownum it again performed a full table scan. I am not sure why my query is not using the indexes. I used the virtual index wizard (via Oracle Management Server) to see if there would be any performance impact from creating an index on the md.languageCode column - no impact.

1. Please suggest an approach to tuning this query.
2. I am under the impression that if the CBO finds a full table scan more effective than an index scan it will use it, but I am not sure how to confirm that.

I am sorry if I haven't managed to make the question clear.
Tom Kyte
March 21, 2007 - 6:51 am UTC

you want a SINGLE index on (searchresult,isinternal)

but - you could tell the optimizer to use first_rows optimization

select /*+ first_rows */ ....

in current releases you would use first_rows(100) to tell it to get the first 100 rows as fast as possible.
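In other words, something along these lines (the index name is made up; column names are from the posted query - on current releases first_rows(100) would replace the plain first_rows hint):

```sql
create index searchstat_ix                            -- hypothetical index name
    on reguser.searchstatistics (searchresult, isinternal);

select /*+ first_rows */ id, materialNumber, lotNumber
  from ( select id, materialNumber, lotNumber
           from reguser.searchstatistics searchstat0_
          where searchstat0_.searchresult = :RESULT
            and searchstat0_.isinternal = :ISINTERNAL )
 where rownum <= 100;
```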

VEry useful

Suraj Sharma, March 21, 2007 - 7:29 am UTC

Very, very useful reply. Thanks a lot Tom - the SQL is working better now.

How to judge about Index

Rathan Rao, March 21, 2007 - 7:59 am UTC

Hi Tom,

In the above query, how did you judge that the index should be on those columns? Please let me know, so that next time we will also know what to analyze before creating an index.
Tom Kyte
March 21, 2007 - 11:19 am UTC

look at the columns referenced in the predicate...

analytic question

Yoav, February 13, 2008 - 2:19 pm UTC

Hi Tom,
Could you please show , how to move this statment to analytic:

select l.subscriber, l.cust_id, l.campaign_id,
l.start_day, l.end_day
from cust_sale l,
( select lt.cust_id,
max(end_day) as end_day
from cust_sale lt
where end_day <= sysdate
group by lt.cust_id ) lst
where l.cust_id=lst.cust_id
and l.end_day=lst.end_day;

Thank You.

Tom Kyte
February 13, 2008 - 10:50 pm UTC

nope, because every time I write sql without having the ability to test it, I get it wrong.

meaning

no create table
no inserts
no look
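For what it is worth, the usual analytic shape for this kind of query looks like the following - offered only as an editor's sketch and, per the caveat above, untested against real data:

```sql
select subscriber, cust_id, campaign_id, start_day, end_day
  from ( select l.subscriber, l.cust_id, l.campaign_id,
                l.start_day, l.end_day,
                -- per-customer max end_day among rows with end_day <= sysdate
                max(case when l.end_day <= sysdate
                         then l.end_day end)
                  over (partition by l.cust_id) max_end_day
           from cust_sale l )
 where end_day = max_end_day;
```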

max statement

Herbert, April 17, 2008 - 2:25 am UTC

Tom, would you rewrite a max query without subquery also?

SELECT max(weight) from t where pet_id = :b1

to:

SELECT weight from (select weight from t where pet_id = :b1
order by pet_id DESC, weight DESC )
where rownum = 1
Tom Kyte
April 17, 2008 - 7:29 am UTC

no, that would not be necessary in general - you should see a "min/max" optimization in the plan if it is possible.
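For reference, with an index leading on (pet_id, ..., weight) the simple MAX form typically shows that optimization directly in the plan - roughly like this (plan shape from memory; exact details vary by version):

```sql
select max(weight) from t where pet_id = :b1;

-- typical plan:
-- SORT AGGREGATE
--   FIRST ROW
--     INDEX RANGE SCAN (MIN/MAX) T_IDX
```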

Tuning the subquery to get rid of the merge join cartesian and full table scan

Priya, September 16, 2008 - 3:24 pm UTC

Please let me know some suggestions to tune the query below:

INSERT INTO IDS_MEME_MEMB_ATTR_ODS
  ( CLCL_ID, CLCL_PAID_DT, CDML_SEQ_NO, MEME_CK,
    MEME_ATTR_NAME, MEME_ATTR_VALUE, MEPE_EFF_DT, MEPE_TERM_DT )
SELECT DISTINCT
       CLCDML.CLCL_ID,
       CLCLCL.CLCL_PAID_DT,
       CLCDML.CDML_SEQ_NO,
       MEMBXX.MEME_CK,
       MEMBXX.MEME_ATTR_NAME,
       MEMBXX.MEME_ATTR_VALUE,
       MEMBXX.MEME_EFF_DT,
       MEMBXX.MEME_TERM_DT
  FROM IDS_CDML_CL_LINE_WRK1 CLCDML
  JOIN IDS_CLCL_CLAIM_WRK1 CLCLCL ON (CLCLCL.CLCL_ID = CLCDML.CLCL_ID)
  JOIN EDS_MEME_MEMB_ATTR MEMBXX ON (MEMBXX.MEME_CK = CLCLCL.MEME_CK)
 WHERE CLCDML.CDML_FROM_DT >= MEMBXX.MEME_EFF_DT
   AND CLCDML.CDML_FROM_DT <= MEMBXX.MEME_TERM_DT
   AND CLCLCL.IDS_NEGATIVE_IND = 'P'
   AND CLCLCL.IDS_DUP_FLG = 0
   AND MEMBXX.MEME_ATTR_NAME IN
       ( SELECT TRIM(ENTX_REF_VAL)
           FROM IDS_ENTX_ENTITY_XREF IDSXRF
          WHERE IDSXRF.ENTX_ID = :B1
            AND IDSXRF.ENTX_APP_TYPE = 'I'
            AND IDSXRF.ENTX_REF_ID = 'MEME_ATTR_NAME' )

Plan hash value: 3734397932

-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | | | 1087 (100)| |
| 1 | HASH UNIQUE | | 1 | 150 | 1087 (6)| 00:00:02 |
| 2 | HASH JOIN | | 1 | 150 | 1086 (6)| 00:00:02 |
| 3 | TABLE ACCESS BY INDEX ROWID| EDS_MEME_MEMB_ATTR | 1 | 49 | 3 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 33 | 4059 | 926 (5)| 00:00:02 |
| 5 | MERGE JOIN CARTESIAN | | 240 | 17760 | 198 (16)| 00:00:01 |
| 6 | INDEX RANGE SCAN | IDSENT_SK_02 | 1 | 40 | 2 (0)| 00:00:01 |
| 7 | BUFFER SORT | | 7874 | 261K| 196 (16)| 00:00:01 |
| 8 | TABLE ACCESS FULL | IDS_CLCL_CLAIM_WRK1 | 7874 | 261K| 196 (16)| 00:00:01 |
| 9 | INDEX RANGE SCAN | SYS_C00113635 | 1 | | 2 (0)| 00:00:01 |
| 10 | TABLE ACCESS FULL | IDS_CDML_CL_LINE_WRK1 | 14930 | 393K| 157 (9)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
Tom Kyte
September 16, 2008 - 11:49 pm UTC

why? why do you want to get rid of them?

(if they were inherently evil - we would not have implemented them after all)

ROWNUM=1 vs ROWNUM<=1

Bilal, March 25, 2009 - 11:06 am UTC

Tom, why are these queries different in performance?
This is very strange to me - would you please tell me the reason?

When I use ROWNUM <= 1 it performs very well compared to ROWNUM = 1.
I don't understand the logic - would you please tell me the reason.

1) CPU - I/O usage is very high

SELECT inv_number,
inv_type,
inv_sub_type,
carrier_local_flag,
inv_run,
invoice_mode_code
FROM (SELECT insih.inv_number,
insih.inv_type,
insih.inv_sub_type,
insih.carrier_local_flag,
insih.inv_run,
insih.invoice_mode_code
FROM sales_inv_headers insih
WHERE :p_company_code IS NULL
AND:p_system_id IS NULL
AND:p_inv_number IS NULL
AND:p_inv_run IS NULL
AND:p_bulk_suffix IS NULL)
WHERE ROWNUM =1


2) CPU - I/O usage is very low

SELECT inv_number,
inv_type,
inv_sub_type,
carrier_local_flag,
inv_run,
invoice_mode_code
FROM (SELECT insih.inv_number,
insih.inv_type,
insih.inv_sub_type,
insih.carrier_local_flag,
insih.inv_run,
insih.invoice_mode_code
FROM sales_inv_headers insih
WHERE :p_company_code IS NULL
AND:p_system_id IS NULL
AND:p_inv_number IS NULL
AND:p_inv_run IS NULL
AND:p_bulk_suffix IS NULL)
WHERE ROWNUM <=1
Tom Kyte
March 29, 2009 - 8:23 pm UTC

go look in the OTHER PLACE YOU PUT THE SAME EXACT STUFF.


squeaky wheels - they get made fun of here. No grease

using max

babloo, September 26, 2010 - 11:11 am UTC

Hi Tom ,
I have a similar question. I have a query that goes into multiple tables with millions of records and, along with other columns, gets 50,000 ids. Now, for each id, I go to an audit table, which has multiple records per id, pick up the record with the max create date for that id, and get a few other columns from that row.
Can you please suggest a few approaches?
Since we need to go to the audit table for only 50K ids, I do not want to put the audit table in the main query. How do I get the other columns from the audit table where create date = max create date for the id? The id column is in both the main table and the audit table.
Tom Kyte
September 27, 2010 - 11:45 am UTC

... I do not want to put the Audit
table in the main query ...

WHY THE HECK NOT??????


write a single query, just join. Databases where born to join, it is what they do best.

reason

babloo, September 28, 2010 - 7:53 pm UTC

Hi tom,
If I put the audit table in the main query, it looks like we are looking at millions of records while joining them to the main query, whereas the current code executes the audit query via a cursor only 58,000 times, because the main query only gets 58,000 records.

I tried two approaches

first approach

with dev_query as(main_query)
select a.*, b.key_id, b.other_column_from_audit from (SELECT ROW_NUMBER ()
OVER (PARTITION BY key_id
ORDER BY audit_create_date)
v_rank,other_column_from_audit,
mpaa.key_id
FROM audit mpaa where exists ( select 1 from dev_query where dev_query.key_id = mpaa.key_id)
) b, dev_query a
where a.key_id = b.key_id
and b.v_rank = 1



Plan
SELECT STATEMENT CHOOSECost: 2,268,544 Bytes: 16,903,851,756 Cardinality: 51,852,306
23 TEMP TABLE TRANSFORMATION
12 LOAD AS SELECT
11 HASH JOIN Cost: 373,589 Bytes: 1,273,864,048 Cardinality: 7,363,376
1 TABLE ACCESS FULL TABLE main_query_table1 Cost: 3 Bytes: 228 Cardinality: 12
10 HASH JOIN Cost: 373,385 Bytes: 1,133,959,904 Cardinality: 7,363,376
2 TABLE ACCESS FULL TABLE main_query_table2 Cost: 24 Bytes: 277,620 Cardinality: 6,610
9 HASH JOIN Cost: 373,161 Bytes: 824,698,112 Cardinality: 7,363,376
3 INDEX FULL SCAN INDEX (UNIQUE) main_query_table3_PK Cost: 1 Bytes: 29,775 Cardinality: 5,955
8 HASH JOIN Cost: 372,958 Bytes: 787,881,232 Cardinality: 7,363,376
4 TABLE ACCESS FULL TABLE main_query_table4 Cost: 28,739 Bytes: 193,527,968 Cardinality: 6,047,749
7 HASH JOIN Cost: 326,010 Bytes: 552,253,275 Cardinality: 7,363,377
5 TABLE ACCESS FULL TABLE main_query_table5 Cost: 231,872 Bytes: 410,446,686 Cardinality: 7,076,667
6 TABLE ACCESS FULL TABLE main_query_table6 Cost: 44,642 Bytes: 1,093,100,000 Cardinality: 64,300,000
22 HASH JOIN Cost: 1,894,956 Bytes: 16,903,851,756 Cardinality: 51,852,306
14 VIEW Cost: 15,496 Bytes: 2,113,288,912 Cardinality: 7,363,376
13 TABLE ACCESS FULL TABLE (TEMP) SYS.SYS_TEMP_0FD9D664A_6579C20B Cost: 15,496 Bytes: 1,075,052,896 Cardinality: 7,363,376
21 VIEW Cost: 1,724,841 Bytes: 4,179,135,831 Cardinality: 107,157,329
20 WINDOW SORT PUSHED RANK Cost: 1,724,841 Bytes: 3,214,719,870 Cardinality: 107,157,329
19 HASH JOIN RIGHT SEMI Cost: 933,462 Bytes: 3,214,719,870 Cardinality: 107,157,329
16 VIEW Cost: 15,496 Bytes: 95,723,888 Cardinality: 7,363,376
15 TABLE ACCESS FULL TABLE (TEMP) SYS.SYS_TEMP_0FD9D664A_6579C20B Cost: 15,496 Bytes: 1,075,052,896 Cardinality: 7,363,376
18 PARTITION RANGE ALL Cost: 777,679 Bytes: 3,764,650,000 Cardinality: 221,450,000 Partition #: 22 Partitions accessed #1 - #97
17 TABLE ACCESS FULL TABLE AUDIT Cost: 777,679 Bytes: 3,764,650,000 Cardinality: 221,450,000 Partition #: 22 Partitions accessed #1 - #97



other approach

SELECT *
FROM (SELECT ROW_NUMBER ()
OVER (PARTITION BY key_id
ORDER BY mpaa.audit_create_date)
v_rank,
mpaa.key_id
FROM audit mpaa) mpaao
,
(main_query) mpa
where mpaao.key_id=mpa.key_id
and v_rank = 1


Plan
SELECT STATEMENT CHOOSECost: 2,370,005 Bytes: 21,324,309,466 Cardinality: 107,157,334
16 HASH JOIN Cost: 2,370,005 Bytes: 21,324,309,466 Cardinality: 107,157,334
1 TABLE ACCESS FULL TABLE main_query_table1 Cost: 3 Bytes: 228 Cardinality: 12
15 HASH JOIN Cost: 2,367,090 Bytes: 19,288,320,120 Cardinality: 107,157,334
2 TABLE ACCESS FULL TABLE main_query_table2 Cost: 24 Bytes: 277,620 Cardinality: 6,610
14 HASH JOIN Cost: 2,364,155 Bytes: 14,787,712,092 Cardinality: 107,157,334
9 HASH JOIN Cost: 373,161 Bytes: 824,698,112 Cardinality: 7,363,376
3 INDEX FULL SCAN INDEX (UNIQUE) main_query_table3_PK Cost: 1 Bytes: 29,775 Cardinality: 5,955
8 HASH JOIN Cost: 372,958 Bytes: 787,881,232 Cardinality: 7,363,376
4 TABLE ACCESS FULL TABLE main_query_table4 Cost: 28,739 Bytes: 193,527,968 Cardinality: 6,047,749
7 HASH JOIN Cost: 326,010 Bytes: 552,253,275 Cardinality: 7,363,377
5 TABLE ACCESS FULL TABLE main_query_table5 Cost: 231,872 Bytes: 410,446,686 Cardinality: 7,076,667
6 TABLE ACCESS FULL TABLE main_query_table6 Cost: 44,642 Bytes: 1,093,100,000 Cardinality: 64,300,000
13 VIEW Cost: 1,797,983 Bytes: 5,757,700,000 Cardinality: 221,450,000
12 WINDOW SORT PUSHED RANK Cost: 1,797,983 Bytes: 3,100,300,000 Cardinality: 221,450,000
11 PARTITION RANGE ALL Cost: 777,679 Bytes: 3,100,300,000 Cardinality: 221,450,000 Partition #: 15 Partitions accessed #1 - #97
10 TABLE ACCESS FULL TABLE CAUDIT Cost: 777,679 Bytes: 3,100,300,000 Cardinality: 221,450,000 Partition #: 15 Partitions accessed #1 - #97



Tom Kyte
September 29, 2010 - 8:31 am UTC

you seem to think that this "main query" will be executed first and then joined to the other result.

It doesn't have to be. Regardless of how you write it. We rewrite your query however we see fit when we optimize it.

Just code the query naturally - the optimizer is responsible for optimizing. Give it a chance before you start second guessing things.



I cannot help you further if you don't share a schema and the real query.

Correlated Subquery

Orlando Reyes, December 01, 2010 - 3:32 pm UTC

Tom,

I agree on not using hints and not writing fragile code. The question is that we need to do this as a sub-query, and based on your comments "and if they cannot use order by in a subquery, just declare a cursor that looks like the following and open/fetch 1 row/close it:" (our exact situation, and we get an error)

Where and how can we use this cursor if all we have is a simple SQL, not a pl/sql object?

Our query looks like this:

SELECT t1.f1,t1.f2,t1.fn
From t1
Where ex1
And ex2
And t1.dttm = (select max(dt) from t2
Where t2.id = t1.id)
And t1.status = 'xxx';

If there is not a feasible solution for this single query, what would be the best option, perhaps using pl/sql then?

Thanks in advance for your help.

Orlando Reyes

Tom Kyte
December 07, 2010 - 9:00 am UTC

I don't understand - what issue are you hitting. Your example is nothing like the original question.

You need the "first record BY some field" - you get more than one record. The original question gets exactly ONE record only.

Improve sub-query where Order by is not allowed.

Orlando Reyes, December 09, 2010 - 12:53 pm UTC

Well Tom,

The main query returns multiple records, but the sub-query "(select max(dt) from t2 where t2.id = t1.id)" would return only one value: the MAX date for the ID being used at that time. What I am looking for is to use the "index/sort/rownum=1" approach to reduce the I/Os on this sub-query, but for that we would need the "ORDER BY ... DESC", which is not allowed here. I wonder if there are other options to accomplish a similar effect without rewriting the whole query, which is really a lot more complex.

Thanks,

Orlando.

Tom Kyte
December 10, 2010 - 8:27 am UTC

it would have been best to tell us that initially - that you were looking at "tuning" this query:

SELECT t1.f1,t1.f2,t1.fn
From t1
Where ex1
And ex2
And t1.dttm = (select max(dt) from t2
Where t2.id = t1.id)
And t1.status = 'xxx';

(that query is not even a tiny bit similar to the original query at the top of this page - superficially it looks the same, but the original query was a single table query - you have different tables in there)


if you look at that plan now, you should see an index min/max scan - it is already doing the logical equivalent of "rownum = 1"

so tell us - how big is t1, how big is t2, what percentage of t1 will have a status of 'xxx' and ex1 and ex2 be true in general (how many rows will that query return).

if t1.status = 'xxx' and ex1 and ex2 returns a large percentage of the table, perhaps:

SELECT t1.f1,t1.f2,t1.fn
From t1, (select max(dt) max_dt, id from t2 group by id) t2
Where ex1
And ex2
And t1.dttm = t2.max_dt
and t1.id = t2.id
and t1.status = 'xxx';
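Tom's pre-aggregated join and the original correlated subquery are interchangeable shapes; they must return the same rows. A quick sanity check of that claim — a sketch only, with SQLite standing in for Oracle and invented t1/t2 data mirroring the column names above:

```python
import sqlite3

# Sketch (not Oracle): compare the correlated-subquery form with the
# pre-aggregated join on a toy t1/t2 pair. Table and column names mirror
# the example in the thread; the data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, f1 TEXT, dttm TEXT, status TEXT);
CREATE TABLE t2 (id INTEGER, dt TEXT);
INSERT INTO t1 VALUES (1,'a','2010-01-02','xxx'),
                      (1,'b','2010-01-01','xxx'),
                      (2,'c','2010-03-01','xxx');
INSERT INTO t2 VALUES (1,'2010-01-02'),(1,'2010-01-01'),
                      (2,'2010-03-01'),(2,'2010-02-01');
""")

correlated = """
SELECT t1.id, t1.f1 FROM t1
WHERE t1.status = 'xxx'
  AND t1.dttm = (SELECT MAX(dt) FROM t2 WHERE t2.id = t1.id)
"""

pre_aggregated = """
SELECT t1.id, t1.f1
FROM t1 JOIN (SELECT id, MAX(dt) AS max_dt FROM t2 GROUP BY id) m
  ON t1.id = m.id AND t1.dttm = m.max_dt
WHERE t1.status = 'xxx'
"""

a = sorted(conn.execute(correlated))
b = sorted(conn.execute(pre_aggregated))
assert a == b  # same rows either way; the optimizer picks the cheaper plan
```

The point of the pre-aggregated form is that when the filtered t1 is a large fraction of the table, one scan-and-group of t2 plus a hash join can beat running the correlated MAX once per candidate row.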


Jorge, March 04, 2011 - 10:16 am UTC

Tom,

what if I need a set of records with two columns: pet_id and max(weight)?

regards

Tom Kyte
March 04, 2011 - 11:34 am UTC

then you probably want to use analytics - in particular "keep dense rank"

see:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:122801500346829407
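For readers who cannot follow the link: per pet_id, MAX(weight) KEEP (DENSE_RANK LAST ORDER BY setup_date) returns the weight recorded on that pet's latest setup_date. A hedged sketch of the same result — SQLite has no KEEP clause, so ROW_NUMBER() emulates it; the Oracle form is shown in the comment, and the data is invented:

```python
import sqlite3

# What Oracle's
#   SELECT pet_id,
#          MAX(weight) KEEP (DENSE_RANK LAST ORDER BY setup_date) AS weight
#   FROM pet_weight GROUP BY pet_id;
# computes, emulated with a window function (toy data, names from the thread).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pet_weight (pet_id INTEGER, setup_date TEXT, weight INTEGER);
INSERT INTO pet_weight VALUES
  (1,'2011-01-01',10),(1,'2011-02-01',12),
  (2,'2011-01-15',30),(2,'2011-03-01',28);
""")
rows = conn.execute("""
SELECT pet_id, weight FROM (
  SELECT pet_id, weight,
         ROW_NUMBER() OVER (PARTITION BY pet_id
                            ORDER BY setup_date DESC) AS rn
  FROM pet_weight)
WHERE rn = 1
ORDER BY pet_id
""").fetchall()
# Each pet's weight at its latest setup_date, in one pass over the table.
```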

Query

A reader, January 30, 2014 - 1:09 pm UTC

Please help me tune this query :
explain plan for
SELECT FW2.PATIENT_KEY, FW2.REFERRAL_ID,MAX(FW2.WIP_KEY) AS MaxWipKey
FROM FACT_WIP FW2
INNER JOIN
(

SELECT PATIENT_KEY, REFERRAL_ID, MAX(STAGE_START_TIMESTAMP) AS MaxDateTime
FROM SMA.FACT_WIP
WHERE STAGE_START_TIMESTAMP >= '01-JUL-2013' AND STAGE_START_TIMESTAMP < '02-JAN-2014'
GROUP BY PATIENT_KEY, REFERRAL_ID,DRUG_KEY
) FW3
ON FW2.PATIENT_KEY = FW3.PATIENT_KEY
AND FW2.REFERRAL_ID = FW3.REFERRAL_ID
AND FW2.STAGE_START_TIMESTAMP = FW3.MaxDateTime
GROUP BY FW2.PATIENT_KEY, FW2.REFERRAL_ID

Explain Plan O/P:
Plan hash value: 1129752342

--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 75 | | 99365 (2)| 00:19:53 |
| 1 | HASH GROUP BY | | 3 | 75 | | 99365 (2)| 00:19:53 |
| 2 | VIEW | VM_NWVW_1 | 3 | 75 | | 99365 (2)| 00:19:53 |
|* 3 | FILTER | | | | | | |
| 4 | HASH GROUP BY | | 3 | 186 | | 99365 (2)| 00:19:53 |
|* 5 | HASH JOIN | | 3302K| 195M| 113M| 98979 (2)| 00:19:48 |
|* 6 | TABLE ACCESS FULL| FACT_WIP | 3302K| 75M| | 42556 (2)| 00:08:31 |
| 7 | TABLE ACCESS FULL| FACT_WIP | 3440K| 124M| | 42516 (2)| 00:08:31 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("FW2"."STAGE_START_TIMESTAMP"=MAX("STAGE_START_TIMESTAMP"))
5 - access("FW2"."PATIENT_KEY"="PATIENT_KEY" AND
"FW2"."REFERRAL_ID"="REFERRAL_ID")
6 - filter("STAGE_START_TIMESTAMP"<TO_DATE(' 2014-01-02 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "STAGE_START_TIMESTAMP">=TO_DATE(' 2013-07-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
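The query above scans FACT_WIP twice: once to find the latest STAGE_START_TIMESTAMP per group, and once to pick the max WIP_KEY at that timestamp. A window-function rewrite can do it in a single scan. The sketch below is illustrative only (SQLite standing in for Oracle, invented toy data, and it assumes the DRUG_KEY in the inner GROUP BY — which is never selected — was unintentional):

```python
import sqlite3

# One-scan alternative: tag every row with the max timestamp of its
# (patient_key, referral_id) group, then keep rows at that timestamp.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_wip (patient_key INT, referral_id INT, wip_key INT,
                       stage_start_timestamp TEXT);
INSERT INTO fact_wip VALUES
  (1,10,100,'2013-08-01'),(1,10,101,'2013-09-01'),(1,10,102,'2013-09-01'),
  (2,20,200,'2013-07-15');
""")
rows = conn.execute("""
SELECT patient_key, referral_id, MAX(wip_key) AS max_wip_key
FROM (
  SELECT patient_key, referral_id, wip_key, stage_start_timestamp,
         MAX(stage_start_timestamp)
           OVER (PARTITION BY patient_key, referral_id) AS max_ts
  FROM fact_wip
  WHERE stage_start_timestamp >= '2013-07-01'
    AND stage_start_timestamp <  '2014-01-02')
WHERE stage_start_timestamp = max_ts
GROUP BY patient_key, referral_id
ORDER BY patient_key
""").fetchall()
```

Whether this beats the self-join on a real FACT_WIP depends on the cost of the WINDOW SORT versus the second full scan and hash join — worth benchmarking, as the thread keeps saying.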

In Memory Column Store

longTermFanBoy, August 24, 2014 - 2:54 am UTC

SELECT MAX (weight) weight
            FROM client.pet_weight
           WHERE pet_id = :pet_id
             AND setup_date =
                          (SELECT MAX (setup_date)
                             FROM client.pet_weight
                            WHERE pet_id = :pet_id)

was rewritten as 
SELECT weight from (select weight from t where pet_id = :b1
                 order by pet_id DESC, setup_date DESC, weight DESC )
         where rownum = 1



Now it's 2014 and we have the In-Memory Column Store. With that technology my guess is the MAX query would be faster, discounting the fact that the MAX code is a bit awkward.
Does this mean that since the technology has matured we can afford to keep code like the MAX version and not bother?
Or do you still advise us to refactor?

Fastest way I've found to rewrite query

Žilvinas Vidmantas, October 19, 2016 - 2:29 pm UTC

Fastest that I was able to achieve is:
WITH a(Pet_Id, Setup_Date, Weight)
  AS (SELECT NULL,
             MAX(Setup_Date) Setup_Date,
             NULL
        FROM t
       WHERE Pet_Id = :Pet_Id
       UNION ALL
       SELECT NULL, NULL, (SELECT --+ no_unnest
                                  MAX(t.Weight)
                             FROM t
                            WHERE t.Pet_Id = :Pet_Id AND t.Setup_Date = a.Setup_Date)
         FROM a)
SELECT *
  FROM a
 WHERE Pet_Id = 1

Even faster could be something like this:
SELECT MAX(t.Weight) KEEP(DENSE_RANK LAST ORDER BY Pet_Id, Setup_Date, Weight, ROWID)
  FROM t
 WHERE t.Pet_Id = :Pet_Id

But Oracle refuses to use an INDEX RANGE SCAN (MIN/MAX).
In this case we just need the last index entry of the index branch for Pet_Id = :Pet_Id.

General Question

Sam, January 31, 2022 - 10:32 am UTC

Hi Tom, I have a general question regarding the oracle query.
My understanding is that writing a 'subquery' will use more CPU and response time than writing the individual 'query'. Please correct me if I am wrong.
Original Query
e.g. Select max(Order_ID) from t where cust_ID = '1'
In this case we are fetching the previous order by a particular customer and Order_Id is in input.
Select max(Order_ID) from t
where cust_ID='1'
and order_ID < :Order_ID

New Query
Select (Select max(Order_ID) from t where cust_ID = '1') Order,
(select max(Order_ID) from t where cust_ID = '1'
and order_ID < :Order_ID) prev_Order

Also would this query benefit us in Oracle Caching anyhow?
If I write this query as

Select max(Order_ID) from t
where cust_ID = '1'
and order_ID< :Order_ID

"I will populate the Order_ID with max value of order '999999' when I don't have the order_ID in input and will populate it with actual value in input when I have it"
Do you think it would work any better or worse?
Chris Saxon
January 31, 2022 - 11:14 am UTC

Writing a 'subquery' will use more CPU and response time than writing the individual 'query'

Not necessarily - the optimizer will rearrange your query to execute it the fastest way it can. Provided the transformation is possible (i.e. is known to give the same results), the database will often merge subqueries into the main query.

I'm unclear exactly what you're trying to do here. Please submit a new question with a complete example (create table + inserts) showing what your goal is.

Writing a 'subquery' will use more CPU and response time than writing the individual 'query'

Sam, January 31, 2022 - 12:20 pm UTC

Thanks for the response.

We have individual queries in one application which uses DB2, but we are building another application which will use Oracle. To make maximum use of the Oracle cache, we have combined the individual queries into one.

DB2 Individual Query
1. Select max(Order_ID) from t where cust_ID = '1'
2. Select max(Order_ID) from t where cust_ID='1' and order_ID < :Order_ID

Oracle New Query
Select (Select max(Order_ID) from t where cust_ID = '1') Order,
(select max(Order_ID) from t where cust_ID = '1'
and order_ID < :Order_ID) prev_Order

So I am trying to understand which query will use less CPU in this case.

The alternative solution to the above 2 queries is below, but again I am not sure if this is a better approach.

Select max(Order_ID) from t
where cust_ID = '1'
and order_ID< :Order_ID

"I will populate the Order_ID with max value of order '999999' when I don't have the order_ID in input and will populate it with actual value in input when I have it"


The main focus here is to have a query which runs fastest for us and uses the cache to help us with optimization.


Chris Saxon
February 01, 2022 - 2:52 pm UTC

Generally it's better to execute one query instead of two. This reduces the number of roundtrips from app <> DB and can have other benefits.

That said, in this case you're asking two separate questions:

- Find the max ID
- Find the max ID less than some value

Combining them into one query may have minimal advantage here. There's only one true way to know for sure:

Test it on your database!

Provided you have an index on:

t ( cust_id, order_id )


I would expect either way to be fast enough though. Unless you've identified these queries as bottlenecks in your app, I feel like the gains here are too small to worry about.

But see my point above about testing!

How about ...

A reader, January 31, 2022 - 1:10 pm UTC

SELECT ORDER_ID FROM (
SELECT ORDER_ID FROM T
WHERE CUST_ID = 'I'
ORDER BY ORDER_ID DESC)
WHERE ROWNUM <=2
ORDER BY 1 DESC


You'll get 2 rows with the two highest ORDER_IDs for CUST_ID = 'I'

Problem with Suggestion

Sam, February 01, 2022 - 6:42 am UTC

The problem with this solution is that it will always respond back with the two maximum values, whereas the requirement is below.

1. Get maximum order number.
2. Get max previous order number than the order number being received in the input.
e.g. If the max order number in table is '100' and in input we receive the order number as '90', we should be responding back as 100 and 89.

The solution proposed will return 100 and 99.
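Sam's own two-scalar-subquery query does meet this requirement; it is only the ROWNUM <= 2 suggestion that does not. A sketch checking the scalar-subquery shape against Sam's example (SQLite standing in for Oracle, invented orders 1..100 for customer '1', input order 90):

```python
import sqlite3

# Two independent MAX scalar subqueries: overall max, and max below the input.
# Checked against the example in the thread: max order 100, input 90 -> (100, 89).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cust_id TEXT, order_id INTEGER)")
conn.executemany("INSERT INTO t VALUES ('1', ?)",
                 [(i,) for i in range(1, 101)])  # orders 1..100
row = conn.execute("""
SELECT (SELECT MAX(order_id) FROM t WHERE cust_id = '1')  AS max_order,
       (SELECT MAX(order_id) FROM t WHERE cust_id = '1'
          AND order_id < :order_id)                       AS prev_order
""", {"order_id": 90}).fetchone()
```

On Oracle, with an index on (cust_id, order_id), each scalar subquery can be satisfied by a MIN/MAX index scan, so the combined query stays cheap.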

Tune the subquery with max(dt)

John, March 21, 2022 - 4:04 am UTC

Greetings!

Database is on Oracle 19.11.0.0

Tables involved in this query are partitioned on snp_dt (Date column).

There is a use case where the table is queried by (inst_id, grp_id), and the queries do not have the partition key (snp_dt) as input in the WHERE clause.

Local partitioned primary key index on LQY.FCT_lqY_UNV table: SNAP_DATE,INSTRUMENT_ID,SNAP_ID,UNIVERSE_GROUP_ID
Local partitioned Primary key index on LQY.FCT_lqY_BSE table: SNAP_DATE,INSTRUMENT_ID,SNAP_ID


This query is taking forever as the subquery is scanning all partitions via the primary key index.

Is there any way to speed up the query, or can the query be rewritten in a more optimized way?

Thanks for your suggestions.


SQL> SELECT lu.lqY_SCORE
FROM lqY.FCT_lqY_BSE lb
JOIN lqY.FCT_lqY_UNV lu ON lb.inst_id = lu.inst_id
AND lb.snp_dt = lu.snp_dt
AND lu.snp_dt = (SELECT MAX( l1.snp_dt)
FROM lqY.FCT_lqY_UNV l1
WHERE l1.UNV_grp_id = 1
AND l1.inst_id = 134740022
GROUP BY l1.inst_id)
WHERE lu.UNV_grp_id = 1 and lb.inst_id = 134740022
/

Execution Plan
----------------------------------------------------------
Plan hash value: 3959197490

---------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                               |     1 |    37 |   415  (13)| 00:00:01 |       |       |
|   1 |  NESTED LOOPS                               |                               |     1 |    37 |     7   (0)| 00:00:01 |       |       |
|   2 |   PARTITION RANGE SINGLE                    |                               |     1 |    22 |     5   (0)| 00:00:01 |   KEY |   KEY |
|   3 |    TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| FCT_lqY_UNV                   |     1 |    22 |     5   (0)| 00:00:01 |   KEY |   KEY |
|   4 |     SORT CLUSTER BY ROWID BATCHED           |                               |     1 |       |     4   (0)| 00:00:01 |       |       |
|*  5 |      INDEX RANGE SCAN                       | FCT_lq_UNV_UNQ_CONST          |     1 |       |     4   (0)| 00:00:01 |   KEY |   KEY |
|   6 |       SORT GROUP BY NOSORT                  |                               |     1 |    18 |   408  (13)| 00:00:01 |       |       |
|   7 |        PARTITION RANGE ALL                  |                               |     1 |    18 |   408  (13)| 00:00:01 |     1 |  2277 |
|*  8 |         INDEX FULL SCAN                     | FCT_lq_UNV_UNQ_CONST          |     1 |    18 |   408  (13)| 00:00:01 |     1 |  2277 |
|   9 |   PARTITION RANGE ITERATOR                  |                               |     1 |    15 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |    INDEX RANGE SCAN                         | FCT_lqY_BSE_UNQ_CONST         |     1 |    15 |     2   (0)| 00:00:01 |   KEY |   KEY |
---------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("LU"."snp_dt"= (SELECT MAX("L1"."snp_dt") FROM "lqY"."FCT_lqY_UNV" "L1" WHERE
              "L1"."inst_id"=134740022 AND "L1"."UNV_grp_id"=1 GROUP BY 134740022) AND "LU"."inst_id"=134740022 AND
              "LU"."UNV_grp_id"=1)
       filter("LU"."UNV_grp_id"=1)
   8 - filter("L1"."inst_id"=134740022 AND "L1"."UNV_grp_id"=1)
  10 - access("LB"."snp_dt"="LU"."snp_dt" AND "LB"."inst_id"=134740022)

Connor McDonald
March 21, 2022 - 4:45 am UTC

AND lu.snp_dt = (SELECT MAX( l1.snp_dt)
FROM lqY.FCT_lqY_UNV l1
WHERE l1.UNV_grp_id = 1
AND l1.inst_id = 134740022
GROUP BY l1.inst_id)


Firstly, I'm not sure what benefit the GROUP BY has here? You *must* return only 1 row anyway because it's an "=" not an "in", so I'll remove that. Now it's

SELECT MAX( l1.snp_dt)
FROM lqY.FCT_lqY_UNV l1
WHERE l1.UNV_grp_id = 1
AND l1.inst_id = 134740022


Since you *never mentioned* an INST_ID column in the table definitions, I'm going to assume it's INSTRUMENT_ID

Since you *never mentioned* an UNV_grp_id column, I'm going to assume it's UNIVERSE_GROUP_ID

(I mean...come on...help us to help you ....)

Anyway....those values could appear in *any* of the date partitions, so we're always going to have scan all of them.

If you know something about the data that the database does not, e.g. that the provided values will map in some way to a date (e.g. most queries start with recent dates), you could perhaps work your way through a subset of date ranges to narrow down the list of partitions considered
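That idea can be sketched as a client-side loop that probes progressively wider snp_dt windows, so the common case only touches recent partitions. A sketch only: SQLite stands in for the partitioned table, and the window sizes and date values are invented — on Oracle it is the snp_dt predicate that enables partition pruning.

```python
import sqlite3
from datetime import date, timedelta

# Probe recent date ranges first; widen only when nothing is found.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fct_lqy_unv (inst_id INT, unv_grp_id INT, snp_dt TEXT)")
conn.execute("INSERT INTO fct_lqy_unv VALUES (134740022, 1, '2021-06-30')")

def max_snp_dt(inst_id, grp_id, today=date(2022, 3, 21)):
    # Widen the window: 30 days, then 365, then unbounded (illustrative sizes).
    for days in (30, 365, None):
        lo = (today - timedelta(days=days)).isoformat() if days else '0001-01-01'
        row = conn.execute("""
            SELECT MAX(snp_dt) FROM fct_lqy_unv
            WHERE inst_id = ? AND unv_grp_id = ? AND snp_dt >= ?""",
            (inst_id, grp_id, lo)).fetchone()
        if row[0] is not None:
            return row[0]
    return None
```

If most instruments have a recent snapshot, the first probe prunes to a handful of partitions and the unbounded fallback runs only for stale data.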

Tune the subquery with max

John, March 22, 2022 - 4:06 am UTC

Thank you, Connor for the update and fixing the typos.

1) Would creating a new global index on the LQY.FCT_LQY_UNV(INSTRUMENT_ID) column help speed up the subquery?

2) Also, if I rearrange the existing primary key (local index) columns in this order,
(INSTRUMENT_ID, UNIVERSE_GROUP_ID, SNAP_ID, SNAP_DATE)

would that help speed up the subquery without creating a new global index?

Thanks!

Connor McDonald
March 23, 2022 - 2:47 am UTC

re (2), see my comments in response to Iudith

re (1) if inst_id is (nearly) unique, then yes; otherwise you might want to add the GRP_ID into the index as well... ***but*** do not forget the implications that come with global indexes. I don't know what your partition maintenance requirements currently are.


Iudith Mentzel, March 22, 2022 - 10:23 pm UTC

Hello All,

Just a remark to the last post:

If you want to reorder the columns in your PK index, then probably
the best column order for this query would be:

INSTRUMENT_ID,UNIVERSE_GROUP_ID,SNAP_DATE, SNAP_ID

This way the scan of each index partition could be more efficient.

Placing the SNAP_ID before the SNAP_DATE would prevent an index range scan that also includes the SNAP_DATE column.
I expect an INDEX RANGE SCAN DESCENDING to be performed as the innermost operation (once for each partition).

Cheers & Best Regards,
Iudith Mentzel
Connor McDonald
March 23, 2022 - 2:25 am UTC

True...but that's 2000 partitions.

There is still a lot of benefit to be gained by being able to narrow this down to a date range

Tuning the subquery with MAX

John, March 23, 2022 - 5:49 am UTC

Thanks to both Connor and Iudith for your inputs and suggestions.

Hi Connor,

1) I have tested with a global index, FCT_LQ_UNV_IDX1, on the LQY.FCT_LQY_UNV(INST_ID) column, and with an index hint the query comes back quickly.
For some reason, Oracle is not picking the new index even though stats are in place, and I had to force the query with an INDEX hint to choose the new index.
Partitioned tables are enabled with incremental stats. Not sure what could be the reason for Oracle not choosing the global index.

2) The INST_ID column is not unique; of course, I can add UNV_grp_ID too if that improves the query execution time further.

3) Connor, I understand and agree with you on the implications that come with global indexes. In this case, this new global index is large (~250GB) and will grow.

When we do partition maintenance operations like partition compression, move partition, or index rebuild (in case), global index maintenance is very expensive and an index rebuild will take long hours.

Any idea how to convert this index to a globally partitioned index?

Here is sample data for the columns,

INST_ID UNV_GRP_ID
------------- -----------------
134740022 1
134740022 2
134740022 3

4) I have not tested rearranging the existing primary key local index columns. Can you elaborate a little more on this statement?

Does reordering the PK index columns in the order (INST_ID, UNV_grp_ID, SNP_DT, SNAP_ID) help Oracle perform the index range scan faster because there is no need to scan the table partition? Is this the main benefit of ordering this way?

"If you want to reorder the columns in your PK index, then probably the best column order for this query would be: INST_ID,UNV_grp_ID,SNP_DT, SNAP_ID
This way the scan of each index partition could be more efficient.
Placing the SNAP_ID before the SNAP_DATE prevents an index range scan to be performed that also includes the SNAP_DATE column."



Thank you.

See below the execution plan with new global index.


-----
SELECT lu.LQY_SCORE
FROM LQY.FCT_LQY_BSE lb
JOIN LQY.FCT_LQY_UNV lu ON lb.inst_id = lu.inst_id
AND lb.snp_dt = lu.snp_dt
AND lu.snp_dt = (SELECT /*+ INDEX(l1 FCT_LQ_UNV_IDX1) */ MAX( l1.snp_dt)
FROM LQY.FCT_LQY_UNV l1
WHERE l1.UNV_grp_id = 1
AND l1.inst_id = 134740022
)
WHERE lu.UNV_grp_id = 1 and lb.inst_id = 134740022
/


Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 2493378227
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                        | Name                          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                 |                               |     1 |    37 |  1099   (0)| 00:00:01 |       |       |
|   1 |  NESTED LOOPS                                    |                               |     1 |    37 |     7   (0)| 00:00:01 |       |       |
|   2 |   PARTITION RANGE SINGLE                         |                               |     1 |    22 |     5   (0)| 00:00:01 |   KEY |   KEY |
|   3 |    TABLE ACCESS BY LOCAL INDEX ROWID BATCHED     | FCT_LQY_UNV                   |     1 |    22 |     5   (0)| 00:00:01 |   KEY |   KEY |
|   4 |     SORT CLUSTER BY ROWID BATCHED                |                               |     1 |       |     4   (0)| 00:00:01 |       |       |
|*  5 |      INDEX RANGE SCAN                            | FCT_LQ_UNV_UNQ_CONST          |     1 |       |     4   (0)| 00:00:01 |   KEY |   KEY |
|   6 |       SORT AGGREGATE                             |                               |     1 |    18 |            |          |       |       |
|*  7 |        TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| FCT_LQY_UNV                   |   376 |  6768 |  1092   (0)| 00:00:01 | ROWID | ROWID |
|   8 |         SORT CLUSTER BY ROWID                    |                               |  2476 |       |    11   (0)| 00:00:01 |       |       |
|*  9 |          INDEX RANGE SCAN                        | FCT_LQ_UNV_IDX1               |  2476 |       |    11   (0)| 00:00:01 |       |       |
|  10 |   PARTITION RANGE ITERATOR                       |                               |     1 |    15 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 11 |    INDEX RANGE SCAN                              | FCT_LQY_BSE_UNQ_CONST         |     1 |    15 |     2   (0)| 00:00:01 |   KEY |   KEY |
--------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("LU"."snp_dt"= (SELECT /*+ INDEX ("L1" "FCT_LQ_UNV_IDX1") */ MAX("L1"."snp_dt") FROM
              "LQY"."FCT_LQY_UNV" "L1" WHERE "L1"."inst_id"=134740022 AND "L1"."UNV_grp_ID"=1) AND
              "LU"."inst_id"=134740022 AND "LU"."UNV_grp_ID"=1)
       filter("LU"."UNV_grp_ID"=1)
   7 - filter("L1"."UNV_grp_ID"=1)
   9 - access("L1"."inst_id"=134740022)
  11 - access("LB"."snp_dt"="LU"."snp_dt" AND "LB"."inst_id"=134740022)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1080  consistent gets
        116  physical reads
          0  redo size
        558  bytes sent via SQL*Net to client
        832  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

Connor McDonald
March 23, 2022 - 6:15 am UTC

If you add SNAP_DT as a trailing column in the index, then it can be a local index. Try that.

Tuning the subquery with MAX

John, March 24, 2022 - 1:34 am UTC

Hi Connor,

With a local index on FCT_LQY_UNV(INST_ID, UNV_GRP_ID, SNP_DT), the query execution plan is below. Consistent gets are a bit higher than with the global index, but performance-wise they are almost the same. Please provide your comments.


SELECT lu.LQY_SCORE
FROM LQY.FCT_LQY_BSE lb
JOIN LQY.FCT_LQY_UNV lu ON lb.inst_id = lu.inst_id
AND lb.snp_dt = lu.snp_dt
AND lu.snp_dt = (SELECT /*+ INDEX(l1 FCT_LQ_UNV_IDX1_LOCAL) */ MAX( l1.snp_dt)
FROM LQY.FCT_LQY_UNV l1
WHERE l1.UNV_grp_id = 1
AND l1.inst_id = 134740022
)
WHERE lu.UNV_grp_id = 1 and lb.inst_id = 134740022
/

Execution Plan
----------------------------------------------------------
Plan hash value: 2869811492

---------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                               |     1 |    37 |  4568   (1)| 00:00:01 |       |       |
|   1 |  NESTED LOOPS                               |                               |     1 |    37 |     6   (0)| 00:00:01 |       |       |
|   2 |   PARTITION RANGE SINGLE                    |                               |     1 |    22 |     4   (0)| 00:00:01 |   KEY |   KEY |
|   3 |    TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| FCT_LQ_UNV                    |     1 |    22 |     4   (0)| 00:00:01 |   KEY |   KEY |
|   4 |     SORT CLUSTER BY ROWID BATCHED           |                               |     1 |       |     3   (0)| 00:00:01 |       |       |
|*  5 |      INDEX RANGE SCAN                       | FCT_LQ_UNV_IDX1_LOCAL         |     1 |       |     3   (0)| 00:00:01 |   KEY |   KEY |
|   6 |       SORT AGGREGATE                        |                               |     1 |    18 |            |          |       |       |
|   7 |        FIRST ROW                            |                               |     1 |    18 |  4562   (1)| 00:00:01 |       |       |
|   8 |         PARTITION RANGE ALL                 |                               |     1 |    18 |  4562   (1)| 00:00:01 |  2280 |     1 |
|*  9 |          INDEX RANGE SCAN (MIN/MAX)         | FCT_LQ_UNV_IDX1_LOCAL         |     1 |    18 |  4562   (1)| 00:00:01 |  2280 |     1 |
|  10 |   PARTITION RANGE ITERATOR                  |                               |     1 |    15 |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 11 |    INDEX RANGE SCAN                         | FCT_LQ_BSE_UNQ_CONST          |     1 |    15 |     2   (0)| 00:00:01 |   KEY |   KEY |
---------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("LU"."INST_ID"=134740022 AND "LU"."UNV_GRP_ID"=1 AND "LU"."SNP_DT"= (SELECT /*+ INDEX ("L1" "FCT_LQ_UNV_IDX1_LOCAL") */ MAX("L1"."SNP_DT") FROM 
              "LIQUIDITY"."FCT_LQ_UNV" "L1" WHERE
              "L1"."UNV_GRP_ID"=1 AND "L1"."INST_ID"=134740022))
   9 - access("L1"."INST_ID"=134740022 AND "L1"."UNV_GRP_ID"=1)
  11 - access("LB"."SNP_DT"="LU"."SNP_DT" AND "LB"."INST_ID"=134740022)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1395  consistent gets
          3  physical reads
          0  redo size
        558  bytes sent via SQL*Net to client
        838  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

Connor McDonald
March 25, 2022 - 3:03 am UTC

It's possibly going to be a little bit more, because we potentially have to scan partitions in reverse order to find the latest.

You can weigh that (small) extra cost against the benefit of having a local index, which helps things like maintenance etc.

John, March 25, 2022 - 4:35 am UTC

Thank you, Connor!
Connor McDonald
March 28, 2022 - 2:28 am UTC

glad we could help

Rewrite query

A reader, April 12, 2022 - 3:12 am UTC

Greetings!

Do you have any suggestions on whether this MAX() query can be rewritten more efficiently with an analytic function? LIQ.F_LIQ_UNV is accessed twice.

SELECT DISTINCT lb.inst_id inst_id ,lb.COMP_DOLLAR_VOL LIQCOMPDOLLARVOL,lb.DISP_DOLLAR_VOL LIQDISPDOLLARVOL,
lb.VOLATILITY LIQVOLATILITY,lb.AMIHUD LIQAMIHUD,lu_m.UNIVERSE_SIZE LIQUNIVERSESIZE,lu_m.LIQ_RANK LIQRANK,
lu_m.LIQ_SCORE LIQSCORE,lu_m.LIQ_MORE_LIQUID_PERCENT LIQMORELIQUIDPERCENT,lu_m.LIQ_SCORE_CHANGE LIQSCORECHANGE,
lb.snp_dt LIQDATE,lb.PROJ_DOLLAR_VOL LIQPROJDOLLARVOL
FROM LIQ.F_LIQ_BASE lb,
(SELECT lu.inst_id, lu.snp_dt, lu.UNIVERSE_SIZE, lu.LIQ_RANK, lu.LIQ_SCORE ,
lu.LIQ_MORE_LIQUID_PERCENT ,lu.LIQ_SCORE_CHANGE
FROM LIQ.F_LIQ_UNV lu,
(SELECT /*+ FULL(l1) PARALLEL(32) */ l1.inst_id, MAX( l1.snp_dt) m_snp_dt
FROM LIQ.F_LIQ_UNV l1, (SELECT DISTINCT inst_id FROM IG.IMPL_UNVS WHERE UNV_CODE = 14) v
WHERE l1.unv_grp_id = 1 and l1.inst_id = v.inst_id GROUP BY l1.inst_id ) qy1
WHERE lu.inst_id = qy1.inst_id AND lu.snp_dt = qy1.m_snp_dt and lu.unv_grp_id = 1
) lu_m
WHERE lb.inst_id = lu_m.inst_id AND lb.snp_dt = lu_m.snp_dt
/

Note that I have to use the /*+ PARALLEL FULL */ hint on the largest partitioned table, as the number of distinct inst_id values returned from result set "V" is about 102,000+ rows.


Thanks!

Connor McDonald
April 12, 2022 - 4:43 am UTC

Maybe something along these lines

select distinct lb.inst_id inst_id,
                lb.comp_dollar_vol liqcompdollarvol,
                lb.disp_dollar_vol liqdispdollarvol,
                lb.volatility liqvolatility,
                lb.amihud liqamihud,
                lu_m.universe_size liquniversesize,
                lu_m.liq_rank liqrank,
                lu_m.liq_score liqscore,
                lu_m.liq_more_liquid_percent liqmoreliquidpercent,
                lu_m.liq_score_change liqscorechange,
                lb.snp_dt liqdate,
                lb.proj_dollar_vol liqprojdollarvol
  from liq.f_liq_base lb,
       (
         select *
         from (
             select lu.inst_id,
                    lu.snp_dt,
                    lu.universe_size,
                    lu.liq_rank,
                    lu.liq_score,
                    lu.liq_more_liquid_percent,
                    lu.liq_score_change,
                    max(lu.snp_dt) over ( partition by lu.inst_id ) as m_snp_dt
               from liq.f_liq_unv lu,
                             (select distinct inst_id
                              from ig.impl_unvs
                              where unv_code = 14
                             ) v
              where lu.inst_id = v.inst_id
                and lu.unv_grp_id = 1
          )
          where snp_dt = m_snp_dt
       ) lu_m
 where lb.inst_id = lu_m.inst_id
   and lb.snp_dt = lu_m.snp_dt
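The correlated-MAX original and an analytic rewrite of this shape should agree row for row. A quick cross-check of the two forms — a sketch only, with SQLite standing in for Oracle and invented f_liq_unv data (names following the thread):

```python
import sqlite3

# Compare the grouped-join form with the MAX() OVER (PARTITION BY ...) form
# for "latest snapshot per instrument" on toy data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE f_liq_unv (inst_id INT, unv_grp_id INT, snp_dt TEXT, liq_score REAL);
INSERT INTO f_liq_unv VALUES
  (1,1,'2022-01-01',0.5),(1,1,'2022-02-01',0.7),
  (2,1,'2022-01-15',0.9);
""")
grouped = sorted(conn.execute("""
SELECT lu.inst_id, lu.liq_score
FROM f_liq_unv lu
JOIN (SELECT inst_id, MAX(snp_dt) m FROM f_liq_unv
      WHERE unv_grp_id = 1 GROUP BY inst_id) q
  ON lu.inst_id = q.inst_id AND lu.snp_dt = q.m
WHERE lu.unv_grp_id = 1"""))
analytic = sorted(conn.execute("""
SELECT inst_id, liq_score FROM (
  SELECT inst_id, snp_dt, liq_score,
         MAX(snp_dt) OVER (PARTITION BY inst_id) AS m
  FROM f_liq_unv WHERE unv_grp_id = 1)
WHERE snp_dt = m"""))
assert grouped == analytic  # one scan vs two; same answer
```

The analytic form reads the table once; whether that wins in practice depends on the cost of the window sort, which is exactly what the follow-up below reports on.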



MERGE running inner SELECT is blocked on cell smart scan

A reader, May 05, 2022 - 3:16 am UTC

Hi Connor,

Thanks for the analytic version of the query rewrite. While testing it, the analytic version of the query was running much longer than the original version. Now the query business requirement has changed.

Below is the full MERGE SQL used to populate another table via merge statement.

I had to scan the table twice to get the max(SNP_ID) for the corresponding max(SNP_DT) by INST_ID. As LIQ.F_LIQ_UNV is a huge partitioned table, I had to use a PARALLEL DOP 32 full table scan to speed up the query. Note that the LIQ.F_LIQ_UNV table is enabled with QUERY HIGH compression at the table partition level.

1)
The issue is that, at times, the MERGE statement hangs: the PQ coordinator session waits indefinitely on the event "cell smart table scan" against the table LIQ.F_LIQ_UNV and blocks the other 32 PQ slave sessions. This happens very frequently.

Any idea what could be the reason for the "cell smart table scan" hang?

2) Is there a better way to rewrite this inner query so it performs the full table scan once instead of twice?

Appreciate your inputs.


MERGE INTO IMPL_RFD_LIQ_LD TRG USING
(
        SELECT  INST_ID             ,
                LIQDATE             ,
                LIQPROJDOLLARVOL    ,
                LIQCOMPDOLLARVOL    ,
                LIQDISPDOLLARVOL    ,
                LIQVOLATILITY       ,
                LIQAMIHUD           ,
                LIQUNIVERSESIZE     ,
                LIQRANK             ,
                LIQSCORE            ,
                LIQMORELIQUIDPERCENT,
                LIQSCORECHANGE
        FROM
                (
                        SELECT DISTINCT lb.INST_ID INST_ID                       ,
                                lb.SNP_DT LIQDATE                                   ,
                                lb.PROJ_DOLLAR_VOL LIQPROJDOLLARVOL                    ,
                                lb.COMP_DOLLAR_VOL LIQCOMPDOLLARVOL                    ,
                                lb.DISP_DOLLAR_VOL LIQDISPDOLLARVOL                    ,
                                lb.VOLATILITY LIQVOLATILITY                            ,
                                lb.AMIHUD LIQAMIHUD                                    ,
                                lu_m.UNIVERSE_SIZE LIQUNIVERSESIZE                     ,
                                lu_m.LIQ_RANK LIQRANK                            ,
                                lu_m.LIQ_SCORE LIQSCORE                          ,
                                lu_m.LIQ_MORE_LIQUID_PERCENT LIQMORELIQUIDPERCENT,
                                lu_m.LIQ_SCORE_CHANGE LIQSCORECHANGE
                        FROM    LIQ.F_LIQ_BASE lb,
                                (
                                        SELECT DISTINCT lu.INST_ID        ,
                                                lu.SNP_ID                      ,
                                                lu.SNP_DT                    ,
                                                lu.UNIVERSE_SIZE                ,
                                                lu.LIQ_RANK               ,
                                                lu.LIQ_SCORE              ,
                                                lu.LIQ_MORE_LIQUID_PERCENT,
                                                lu.LIQ_SCORE_CHANGE
                                        FROM    LIQ.F_LIQ_UNV lu,
                                                (
                                                        SELECT
                                                                /*+ FULL(l2) PARALLEL(32) */
                                                                l2.INST_ID,
                                                                l2.SNP_DT    ,
                                                                MAX( l2.SNP_ID) m_SNP_ID
                                                        FROM    LIQ.F_LIQ_UNV l2,
                                                                (
                                                                        SELECT
                                                                                /*+ FULL(l1) PARALLEL(32) */
                                                                                l1.INST_ID,
                                                                                MAX( l1.SNP_DT) m_SNP_DT
                                                                        FROM    LIQ.F_LIQ_UNV l1,
                                                                                (
                                                                                        SELECT DISTINCT INST_ID
                                                                                        FROM    ICM.IMPL_UNV
                                                                                        WHERE   UNV_CODE = 12
                                                                                )
                                                                                v
                                                                        WHERE   l1.UNV_GRP_ID = 1
                                                                                AND l1.INST_ID = v.INST_ID
                                                                        GROUP BY l1.INST_ID
                                                                )
                                                                qy1
                                                        WHERE   l2.UNV_GRP_ID = 1
                                                                AND l2.INST_ID = qy1.INST_ID
                                                                AND l2.SNP_DT     = qy1.m_SNP_DT
                                                        GROUP BY l2.INST_ID,
                                                                l2.SNP_DT
                                                )
                                                qy2
                                        WHERE   lu.INST_ID         = qy2.INST_ID
                                                AND lu.SNP_DT         = qy2.SNP_DT
                                                AND lu.SNP_ID           = qy2.m_SNP_ID
                                                AND lu.UNV_GRP_ID = 1
                                )
                                lu_m
                        WHERE   lb.INST_ID = lu_m.INST_ID
                                AND lb.SNP_DT = lu_m.SNP_DT
                                AND lb.SNP_ID   = lu_m.SNP_ID
                )
)
SRC ON
(
        TRG.INST_ID = SRC.INST_ID AND TRG.UNV_CODE = 12
)
WHEN MATCHED THEN
        UPDATE
        SET     TRG.LIQDATE              = SRC.LIQDATE             ,
                TRG.LIQPROJDOLLARVOL     = SRC.LIQPROJDOLLARVOL    ,
                TRG.LIQCOMPDOLLARVOL     = SRC.LIQCOMPDOLLARVOL    ,
                TRG.LIQDISPDOLLARVOL     = SRC.LIQDISPDOLLARVOL    ,
                TRG.LIQVOLATILITY        = SRC.LIQVOLATILITY       ,
                TRG.LIQAMIHUD            = SRC.LIQAMIHUD           ,
                TRG.LIQUNIVERSESIZE      = SRC.LIQUNIVERSESIZE     ,
                TRG.LIQRANK              = SRC.LIQRANK             ,
                TRG.LIQSCORE             = SRC.LIQSCORE            ,
                TRG.LIQMORELIQUIDPERCENT = SRC.LIQMORELIQUIDPERCENT,
                TRG.LIQSCORECHANGE       = SRC.LIQSCORECHANGE
;





Chris Saxon
May 09, 2022 - 2:56 pm UTC

I don't know about the smart scan issue. For the second point: if you're trying to find the highest SNP_ID among the rows that have the highest SNP_DT for each INST_ID, something like this could help:

SELECT  l2.INST_ID,
        MAX ( l2.SNP_ID ) KEEP (
          DENSE_RANK LAST ORDER BY l2.SNP_DT
        ) AS m_SNP_ID
FROM    LIQ.F_LIQ_UNV l2
GROUP   BY l2.INST_ID;


https://oracle-base.com/articles/misc/rank-dense-rank-first-last-analytic-functions#first_and_last
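
For anyone unfamiliar with FIRST/LAST, here is a small Python sketch (toy data, not the real schema) of what MAX(SNP_ID) KEEP (DENSE_RANK LAST ORDER BY SNP_DT) computes per group: restrict each group to the rows holding its latest SNP_DT, then take the max SNP_ID among just those rows.

```python
from collections import defaultdict

# Invented (inst_id, snp_dt, snp_id) rows for illustration.
rows = [
    (1, '2022-01-01', 100), (1, '2022-02-01', 101), (1, '2022-02-01', 105),
    (2, '2022-03-01', 200), (2, '2022-03-01', 202),
]

by_inst = defaultdict(list)
for inst_id, snp_dt, snp_id in rows:
    by_inst[inst_id].append((snp_dt, snp_id))

# Per group: keep only the rows at the latest snp_dt, then MAX(snp_id).
result = {}
for inst_id, vals in by_inst.items():
    last_dt = max(dt for dt, _ in vals)
    result[inst_id] = max(sid for dt, sid in vals if dt == last_dt)

print(result)
```

This is exactly the "highest SNP_ID among the rows with the highest SNP_DT" requirement, done in one pass per group.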

optimize query

A reader, November 30, 2022 - 5:28 am UTC

Greetings!

I am just wondering if this query can be optimized in a more efficient way. Thanks for your suggestions.

Local Index F_LIQ_BASE_UNQ_CONST on LIQ.F_LIQ_BASE (SNP_DATE, INSTR_ID, SNP_ID)
Local Index F_LIQ_UNV_IDX1 on LIQ.F_LIQ_UNV(INSTR_ID, UNV_GRP_ID, SNP_DATE)
Local Index F_LIQ_UNV_UNQ_CONST on LIQ.F_LIQ_UNV(SNP_DATE, INSTR_ID, SNP_ID, UNV_GRP_ID)

SELECT *
FROM
  (SELECT instr_id,
    snp_date,
    DECODE(volatility,NULL,'!NA',volatility),
    DECODE(disp_dollar_vol,NULL,'!NA',disp_dollar_vol),
    DECODE(amihud,NULL,'!NA',1000000*amihud),
    DECODE(proj_dollar_vol,NULL,'!NA',proj_dollar_vol)
  FROM LIQ.F_LIQ_BASE
  WHERE instr_id= 11939805
  ) t1
INNER JOIN
  (SELECT *
  FROM
    (SELECT instr_id,
      snp_date,
      liq_score,
      unv_grp_id
    FROM LIQ.F_LIQ_UNV
    WHERE instr_id = 11939805
    AND snp_date   =
      (SELECT MAX(snp_date )
      FROM LIQ.F_LIQ_UNV
      WHERE instr_id = 11939805
      AND snp_date  <= to_date('28-NOV-2022')
      )
    ) PIVOT (MAX(LIQ_score) FOR unv_grp_id IN (1,2,3,4,5,6,7))
  ) t2
ON t1.instr_id  = t2.instr_id
AND t1.snp_date = t2.snp_date
ORDER BY t2.snp_date DESC
FETCH FIRST ROW ONLY; 


Execution Plan
----------------------------------------------------------
Plan hash value: 3261707505

---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name                          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |                               |     1 |   245 |  5077   (1)| 00:00:01 |       |       |
|*  1 |  VIEW                                             |                               |     1 |   245 |  5077   (1)| 00:00:01 |       |       |
|*  2 |   WINDOW SORT PUSHED RANK                         |                               |     4 |   656 |  5077   (1)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                                   |                               |     4 |   656 |  5076   (1)| 00:00:01 |       |       |
|   4 |     SORT CLUSTER BY ROWID                         |                               |     4 |   656 |  5076   (1)| 00:00:01 |       |       |
|   5 |      NESTED LOOPS                                 |                               |     4 |   656 |  5076   (1)| 00:00:01 |       |       |
|   6 |       VIEW                                        |                               |     3 |   321 |  5069   (1)| 00:00:01 |       |       |
|   7 |        HASH GROUP BY PIVOT                        |                               |     3 |    66 |  5069   (1)| 00:00:01 |       |       |
|   8 |         PARTITION RANGE SINGLE                    |                               |     4 |    88 |     6   (0)| 00:00:01 |   KEY |   KEY |
|   9 |          TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| F_LIQ_UNV                     |     4 |    88 |     6   (0)| 00:00:01 |   KEY |   KEY |
|  10 |           SORT CLUSTER BY ROWID                   |                               |     1 |       |     4   (0)| 00:00:01 |       |       |
|* 11 |            INDEX RANGE SCAN                       | F_LIQ_UNV_UNQ_CONST           |     1 |       |     4   (0)| 00:00:01 |   KEY |   KEY |
|  12 |             PARTITION RANGE ITERATOR MIN/MAX      |                               |     1 |    15 |            |          |  2525 |     1 |
|  13 |              SORT AGGREGATE                       |                               |     1 |    15 |            |          |       |       |
|* 14 |               INDEX RANGE SCAN                    | F_LIQ_UNV_IDX1                |  2878 | 43170 |  5063   (1)| 00:00:01 |  2525 |     1 |
|  15 |       PARTITION RANGE ITERATOR                    |                               |     1 |       |     2   (0)| 00:00:01 |   KEY |   KEY |
|* 16 |        INDEX RANGE SCAN                           | F_LIQ_BASE_UNQ_CONST          |     1 |       |     2   (0)| 00:00:01 |   KEY |   KEY |
|  17 |     TABLE ACCESS BY LOCAL INDEX ROWID             | F_LIQ_BASE                    |     1 |    51 |     3   (0)| 00:00:01 |     1 |     1 |
---------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_009"."rowlimit_$$_rownumber"<=1)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("T2"."snp_date") DESC )<=1)
  11 - access("snp_date"= (SELECT MAX("snp_date") FROM "LIQ"."F_LIQ_UNV" "F_LIQ_UNV" WHERE
              "instr_id"=11939805 AND "snp_date"<=TO_DATE(' 2022-11-28 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) AND "instr_id"=11939805)
  14 - access("instr_id"=11939805 AND "snp_date"<=TO_DATE(' 2022-11-28 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       filter("snp_date"<=TO_DATE(' 2022-11-28 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  16 - access("snp_date"="T2"."snp_date" AND "instr_id"=11939805)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         14  consistent gets
          0  physical reads
          0  redo size
       1850  bytes sent via SQL*Net to client
       1413  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

Connor McDonald
December 01, 2022 - 3:29 am UTC

Why is 14 consistent gets a concern?

Optimize query

A reader, December 01, 2022 - 4:44 am UTC

>> Why is 14 consistent gets a concern ?

Hi Connor,

Agreed, 14 consistent gets is low. Given the SLA, I was wondering whether there is room for further optimization.

Also, we are seeing the query response time fluctuate from 10 ms to 2 sec. Our SLA requires the query to complete within 150 ms.

The first time the query runs it usually takes longer than expected, but the next execution is fast. What could be the reason for this?

Sometimes the same query does more consistent gets than a good run does and breaches the SLA: the consistent gets climb from 14 to 30,574. No data loads or DML ran during the slow run.

Any idea what else could be causing this, and how to check/fix it?


BTW, this is Exadata system.

The statistics are below-

Statistics
----------------------------------------------------------
       5141  recursive calls
          0  db block gets
      30574  consistent gets
        397  physical reads
          0  redo size
       1850  bytes sent via SQL*Net to client
       1381  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         26  sorts (memory)
          0  sorts (disk)
          1  rows processed



Thanks!

optimize query

A reader, December 01, 2022 - 4:47 am UTC

Forgot to mention that the partitions on the involved tables are enabled with Exadata QUERY HIGH compression.
Connor McDonald
December 05, 2022 - 5:56 am UTC

Hard to say when we don't have the run-time statistics for the execution plan that actually generated the larger consistent gets.

Anyway... for subqueries in general, sometimes an analytic can avoid a second table visit, eg

SELECT instr_id,
      snp_date,
      liq_score,
      unv_grp_id
    FROM LIQ.F_LIQ_UNV
    WHERE instr_id = 11939805
    AND snp_date   =
      (SELECT MAX(snp_date )
      FROM LIQ.F_LIQ_UNV
      WHERE instr_id = 11939805
      AND snp_date  <= to_date('28-NOV-2022')
      )


can be phrased as

select instr_id,
       snp_date,
       liq_score,
       unv_grp_id
from (
select 
  instr_id,
  snp_date,
  liq_score,
  unv_grp_id,
  max(case when snp_date  <= to_date('28-NOV-2022') then snp_date end) over ( ) as highest_snp_less_than_limit
from liq.f_liq_unv
where instr_id = 11939805
)
where snp_date = highest_snp_less_than_limit


which *might* be more efficient
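
As a quick sanity check of the rewrite (outside Oracle; SQLite also has window functions, and the table and data below are invented), both forms should return the same rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE f_liq_unv (instr_id INT, snp_date TEXT, liq_score REAL, unv_grp_id INT);
INSERT INTO f_liq_unv VALUES
  (11939805, '2022-11-25', 1.5, 1),
  (11939805, '2022-11-28', 2.5, 1),
  (11939805, '2022-12-01', 3.5, 1);
""")

# Original form: correlated MAX subquery -> two visits to the table.
subquery_form = con.execute("""
    SELECT instr_id, snp_date, liq_score, unv_grp_id
    FROM f_liq_unv
    WHERE instr_id = 11939805
      AND snp_date = (SELECT MAX(snp_date) FROM f_liq_unv
                      WHERE instr_id = 11939805 AND snp_date <= '2022-11-28')
""").fetchall()

# Analytic form: windowed MAX(CASE ...) computed in a single scan.
analytic_form = con.execute("""
    SELECT instr_id, snp_date, liq_score, unv_grp_id
    FROM (
        SELECT instr_id, snp_date, liq_score, unv_grp_id,
               MAX(CASE WHEN snp_date <= '2022-11-28' THEN snp_date END)
                 OVER () AS highest_snp_less_than_limit
        FROM f_liq_unv
        WHERE instr_id = 11939805
    )
    WHERE snp_date = highest_snp_less_than_limit
""").fetchall()

print(subquery_form == analytic_form, subquery_form)
```

Same result set either way; whether the analytic version is actually faster still depends on the real data volumes and access paths, hence the "*might*".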

Optimize query

A reader, December 07, 2022 - 4:33 am UTC

Hi Connor,

Thanks for the modified query using the analytic function.

Just a few questions:

SELECT MAX(snp_date )
FROM LIQ.F_LIQ_UNV
WHERE instr_id = 11939805
AND snp_date <= to_date('28-NOV-2022')

1. I keep noticing that, after successive executions, the same query runs faster because the results are already in the database cache.

Since the MAX query above has to index range scan almost all partitions, would it help to keep the local partitioned index in the Exadata storage flash cache to speed up retrieval? The index is about 450 GB in size.

2. I tried different approaches to query optimisation, but I am still not able to consistently meet the query SLA of a 150 ms maximum response time.

One other thing to note here is that the table partitions are compressed with QUERY HIGH. Does Oracle have to decompress the blocks while fetching data?
Rows are mostly inserted; there are no updates, and data is only rarely deleted from the tables.

Would it help if I keep the data uncompressed?


3. I tried with the /*+ RESULT_CACHE */ hint added to the SELECT and saw a little improvement. Would creating a materialized view speed up the query response time?



Thank you for your time!


Connor McDonald
December 09, 2022 - 4:13 am UTC

Can we have just this query

SELECT MAX(snp_date )
FROM LIQ.F_LIQ_UNV
WHERE instr_id = 11939805
AND snp_date <= to_date('28-NOV-2022')

with the gather_plan_statistics hint
