Skip to Main Content
  • Questions
  • How to find out which process caused the maximum IO request to db server?

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, zhu.

Asked: January 04, 2003 - 9:58 am UTC

Last updated: November 20, 2012 - 11:24 am UTC

Version: 8.1.7

Viewed 50K+ times! This question is

You Asked

Hi, tom:
Recently I met a severe performance issue on my db server, though finally resolved it, it took me quite long time and have adverse impact on client response time. It is because of Heavy IO wait in the db server.
It is that after a heavy batch operation on the db server, some module of the app begin to slow down and middle ware (tuxedo) slow down. Server has high IO wait(at worst time, 90% cpu spent on IO wait). How can i find out which process(It is a dedicated database server) caused the most io request to the server(I am in unix, sun solaris).As It is middle ware, connection is kept for a long time, and i checked v$sesstat for physical read, redo size , hard to find out solution.(though finally i found the clue from v$sesstat through reboot unix and middleware), I need some method to quickly find out which bad process pool the whole system down.(unix tips, or oracle tips, anything that will work for me, according to your experience.)
Thanks very much.


and Tom said...

why do you care about a TRANSIENT process (a dedicated server). suppose I told you it was dedicated server 1234. Now what?

You want to find the SQL, the processing, that is doing this -- not a dedicated server.


You will use statspack.
You will take a snapshot.
15 minutes later take another snapshot.
You will analyze the report and it'll show you the top sql.


You would have found the problem in 15 minutes in the first place had you simply used statspack from the get go. It'll tell you the top 5 wait events (IO will be there), then you can refer to the top SQL to find the sql with the most physical IO. Then you can refer to the top IO by tablespace to find the tablespace with the most IO. And so on. Everything you need will be there.

Read
$ORACLE_HOME/rdbms/admin/spdoc.txt




Rating

  (56 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

statspack does not work in my case

zhu, January 04, 2003 - 10:46 am UTC

Tom, I did used statspack but failed to find that bad sql.
Yes, the top wait event is :
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read 1,477,384 17,496,979 84.31
db file parallel write 41,433 1,107,580 5.34
enqueue 21,379 503,590 2.43
buffer busy waits 44,763 494,411 2.38
control file parallel write 4,538 252,388 1.22

And I did look at the statspack sql statement part:
Buffer gets largest:
--I removed the detailed sql here.

and disk read largest sql:
^LSQL ordered by Reads for DB: BIDDB Instance: biddb Snaps: 854 -855
-> End Disk Reads Threshold: 1000
--I do not post the detailed sql here.
None of them caused this problem. In fact, It is a big update that caused this problem.And because of Tuxedo timeout control, It always failed to do its update and it is controled by Tuxedo to do this update every twenty minutes, and with more and more records to update, the update is unable to finish timely, and terminated by tuxedo and caused a lot of rollback with again caused heavy IO wait
Statspack does not work in my case.In fact, I used statspack very frequently.

Tom Kyte
January 04, 2003 - 11:15 am UTC

If it were a big update -- it would have fell in that report clearly.


The top sql includes updates.

since I wasn't actually there, I cannot comment further. A big update -- even one that rolls back -- would have been there. Perhaps you were measuring AFTER the fact -- when the update wasn't in fact running.

How long of a window did you use for statspack (hopefully not more than 15-30 minutes at most)?

What WAS the top SQL in that report.

zhu, January 04, 2003 - 12:22 pm UTC

Yes, the time interval is 4 hours:
Snap Id Snap Time Sessions
------- ------------------ --------
Begin Snap: 854 02-Jan-03 17:32:46 458
End Snap: 855 02-Jan-03 21:59:06 458
Elapsed: 266.33 (mins)
But during the whole day, system is behavioring poorly because of that sql.

Top SQL in Statspack:
LSQL ordered by Gets for DB: BIDDB Instance: biddb Snaps: 854 -855
-> End Buffer Gets Threshold: 30000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100

Buffer Gets Executions Gets per Exec % Total Hash Value
--------------- ------------ -------------- ------- ------------
54,986,214 36,769 1,495.5 18.3 630678823
select t1.product_id ,0 ,NVL(t1.BRAND_ID,0) ,NVL(decode(t1.ON_SH
ELL_TYPE,2,t1.FIX_PRICE,6,t1.price1,7,t1.start_price,18,t1.FIX_P
RICE,t1.CURRENT_BID_PRICE),0) ,NVL(t1.DISCOUNT_RATE,0) ,NVL(t1.Q
UANTITY,0) ,TO_CHAR(t1.START_DATE_DTM,'YYYY-MM-DD HH24:MI:SS') ,
TO_CHAR(t1.END_DATE_DTM,'YYYY-MM-DD HH24:MI:SS') ,NVL(t1.STATUS,

36,589,129 23,830 1,535.4 12.2 1828704478
SELECT TO_CHAR(A.JUDGE_ID) || CHR(9) || TO_CHAR(A.AUCTION_ID)
|| CHR(9) || TO_CHAR(A.PRODUCT_ID) || CHR(9) || RTRIM(A.TITLE
) || CHR(9) || RTRIM(A.AUCTION_TYPE) || CHR(9) || TO_CHAR(A.
CRITIC_ID) || CHR(9) || RTRIM(B.USER_ACCOUNT) || CHR(9) || T
O_CHAR(B.USER_TRUST_SCORE) || CHR(9) || TO_CHAR(B.USER_TRUST_C

10,098,274 85,435 118.2 3.4 2026580849
select count(*) into :b0 from products a where ((((((:b1<>0 an
d SHOP_GOODS_CLASS_ID=:b1) or :b1=0) or (:b1=(-1) and (nvl(SHOP_
GOODS_CLASS_ID,0)=0 or not exists (select * from ShopGoodsClas
s b ,shopcatalog c where (((a.shop_goods_class_id=b.shop_goods_c
lass_id and b.shop_id=c.shop_id) and c.user_id=:b5) and c.status

10,004,276 23,837 419.7 3.3 1969715255
begin OPEN :vCursor FOR SELECT TO_CHAR ( A . JUDGE_ID ) || CHR (
9 ) || TO_CHAR ( A . AUCTION_ID ) || CHR ( 9 ) || TO_CHAR ( A .
PRODUCT_ID ) || CHR ( 9 ) || RTRIM ( A . TITLE ) || CHR ( 9 ) |
| RTRIM ( A . AUCTION_TYPE ) || CHR ( 9 ) || TO_CHAR ( A . CRITI
C_ID ) || CHR ( 9 ) || RTRIM ( B . USER_ACCOUNT ) || CHR ( 9 ) |

9,884,393 45,044 219.4 3.3 2203634836
select CRITIC_IDENTITY ,CRITIC_REMARK ,count(*) from UserJudge
where (((ACCEPTER_ID=:b0 and JUDGE_STATE=:b1) and (SYSDATE-CRIT
IC_TIME_DTM)<=:b2) and ACCP_VISIBLE=1) group by CRITIC_IDENTITY,
CRITIC_REMARK

7,301,279 9,567 763.2 2.4 1396772918
select count(B.UC_REF_ID) ,A.PRODUCT_ID ,A.TITLE from PRODUCTS
A ,UC_USERCOMMENT B where (((((((A.STATUS in (1,2) and A.PRODUCT
_ID=B.UC_REF_ID) and B.UC_REF_CODE like '1-%') and B.UC_RECEIVER
_ID=:b0) and A.SELLER_ID=DECODE(:b1,1,A.SELLER_ID,2,:b0,0,A.SELL
ER_ID)) and B.UC_POSTER_TYPE=DECODE(:b1,1,2,2,1,0,B.UC_POSTER_TY

7,181,092 678,559 10.6 2.4 3308940087
begin :sellcount := GetSellAmount ( :p_id ) ; END ;

7,068,876 22,522 313.9 2.4 54290869
select A.CRITIC_IDENTITY ,A.CRITIC_REMARK ,count(*) from UserJ
udge A ,USERPRIVATEINFO B where ((((A.CRITIC_ID=B.USER_ID and B.
USER_TYPE>=:b0) and A.ACCEPTER_ID=:b1) and A.JUDGE_STATE=:b2) an
d A.ACCP_VISIBLE=1) group by A.CRITIC_IDENTITY,A.CRITIC_REMARK


Disk read sql:
^LSQL ordered by Reads for DB: BIDDB Instance: biddb Snaps: 854 -855
-> End Disk Reads Threshold: 1000

Physical Reads Executions Reads per Exec % Total Hash Value
--------------- ------------ -------------- ------- ------------
123,750 28,308 4.4 6.5 3636753446
SELECT A.PRODUCT_ID,B.QUANTITY_GET FROM PRODUCTS A,PRODUCTBIDI
NFO B WHERE B.USER_ID = :b1 AND B.STATUS = 0 AND B.PRODUCT_ID
= A.PRODUCT_ID AND A.STATUS IN ( 1,2 ) AND A.END_DATE_DTM >=
SYSDATE AND A.ON_SHELL_TYPE != 6 ORDER BY A.END_DATE_DTM,A.PROD
UCT_ID,B.ROWID DESC

123,747 14,543 8.5 6.5 2343244299
begin OPEN :myeach_buyinginfo_cur FOR SELECT A . PRODUCT_ID , B
. QUANTITY_GET FROM PRODUCTS A , PRODUCTBIDINFO B WHERE B . USER
_ID = :lUser_id AND B . STATUS = 0 AND B . PRODUCT_ID = A . PROD
UCT_ID AND A . STATUS IN ( 1 , 2 ) AND A . END_DATE_DTM >= SYSDA
TE AND A . ON_SHELL_TYPE <> 6 ORDER BY A . END_DATE_DTM , A . PR

102,595 3,879 26.4 5.4 275259148
select NVL(sum(TX_TransactionAmount),0) into :b0 from TX_Transa
ction where (((PA_AccountID=:b1 and TO_char(TX_TransactionDate,'
YYYYMM')=:b2) and TX_Status not in (1005,4005)) and TX_Transact
ionAmount<0)

93,979 15 6,265.3 4.9 998854616
select count(*) into :b0 from RealNameVerifyInfo A where (((A.
DATE_MEM_APPL between TO_DATE(:b1,:b2) and TO_DATE(:b3,:b2) and
ITEM_TYPE=1) and ITEM_STATUS=1) and CREDITCARD_TYPE=:b5)

82,794 22,912 3.6 4.4 1835894617
select count(*) into :b0 from uc_usercomment where (uc_poster_
id=:b1 and uc_create_dtm>=(sysdate-:b2))

78,503 12 6,541.9 4.1 450156570
SELECT B.USER_ACCOUNT,A.REAL_NAME,A.ITEM_VALUE,NVL(A.CREDITCARD_
TYPE,0),NVL(TO_CHAR(A.CREDITEXPIRE_DATE,:b1) || :b2 ,' ') || NVL
(TO_CHAR(A.CREDITEXPIRE_DATE,:b3) || :b4 ,' ') || NVL(TO_CHAR(A
.CREDITEXPIRE_DATE,:b5) || :b6 ,' ') ,A.TRACE_NO,NVL(A.IDCARD_NO
,' ') FROM REALNAMEVERIFYINFO A,USERPRIVATEINFO B WHERE A.USE

77,713 7 11,101.9 4.1 1894247638
select count(*) into :b0 from USERPRIVATEINFO where (USER_ID>0
and USER_STATUS=:b1)

56,095 12 4,674.6 2.9 2366728271
select count(*) into :b0 from st_smstrace where (((((:b1 is n
ot null and rtrim(st_mpnumber)=:b1) or :b1 is null ) and ((:b4<
>0 and st_status=:b4) or (:b4=0 and st_status<>136))) and ((:b7
is not null and st_submittime>=to_date(:b7,'yyyy-mm-dd')) or :
b7 is null )) and ((:b10 is not null and st_submittime<=to_dat

51,844 5,741 9.0 2.7 1273485882
select count(distinct A.PRODUCT_ID) into :b0 from PRODUCTS A ,U
C_USERCOMMENT B where (((((((A.STATUS in (1,2) and A.PRODUCT_ID=
B.UC_REF_ID) and B.UC_REF_CODE like '1-%') and A.SELLER_ID=DECOD
E(:b1,1,A.SELLER_ID,2,:b2,0,A.SELLER_ID)) and B.UC_RECEIVER_ID=:
b2) and B.UC_POSTER_TYPE=DECODE(:b1,1,2,2,1,0,B.UC_POSTER_TYPE))

51,052 15,033 3.4 2.7 292807924
^LSQL ordered by Reads for DB: BIDDB Instance: biddb Snaps: 854 -855

And the Actual SQL that caused the server slow down is:
UPDATE products
set status=3
WHERE end_date_dtm<=sysdate and
end_date_dtm>sysdate-0.1 and
( ( ( (status=1) or (status=2) ) and (end_date_dtm<sysdate) ) or
( ( (status=1) or (status=2) ) and (quantity=quantity_saled_fix) ) ) and
( (on_shell_type<6) or (on_shell_type=18) ) and
nvl(cost,0)<>1 ;
This SQL will update about 6K records, and should be able to finish in 3 minutes(I tested this SQL in another machine with the actual data exported from the standby database, which has not been updated because of 6 hour later than the production), However, it failed to finish the SQL in time and rolled back by tuxedo.
The strange thing is that in a less powerful machine (3500 8CPU 5 old disk raid5), when I was testing the update, IO wait is about just 30% and IO busy is just about 50-80%, but in the production with 12 CPU and 9 disk Raid5,(with concurrent user), when I manually run the sql, 5 minutes passed and it did not finish.Execution plan is the same and the data is the same.IO busy(from iostat) is 100% and cpu spent on IO wait is 90%(From top and sar/iostat) and I have to abort this SQL and later break this update into several update and commit.
Thanks.



Tom Kyte
January 04, 2003 - 12:40 pm UTC

4 hours is at least 8 times too long.  You cannot pinpoint a bottleneck with such a large timeframe.  15-30 minutes *max*

Seems to me, you tune these two:

  Buffer Gets    Executions  Gets per Exec  % Total  Hash Value
--------------- ------------ -------------- ------- ------------
     54,986,214       36,769        1,495.5    18.3    630678823
select t1.product_id ,0 ,NVL(t1.BRAND_ID,0) ,NVL(decode(t1.ON_SH
ELL_TYPE,2,t1.FIX_PRICE,6,t1.price1,7,t1.start_price,18,t1.FIX_P
RICE,t1.CURRENT_BID_PRICE),0) ,NVL(t1.DISCOUNT_RATE,0) ,NVL(t1.Q
UANTITY,0) ,TO_CHAR(t1.START_DATE_DTM,'YYYY-MM-DD HH24:MI:SS') ,
TO_CHAR(t1.END_DATE_DTM,'YYYY-MM-DD HH24:MI:SS') ,NVL(t1.STATUS,

     36,589,129       23,830        1,535.4    12.2   1828704478
SELECT TO_CHAR(A.JUDGE_ID) || CHR(9)  || TO_CHAR(A.AUCTION_ID)
|| CHR(9)  || TO_CHAR(A.PRODUCT_ID)  || CHR(9)  || RTRIM(A.TITLE
)  || CHR(9)  || RTRIM(A.AUCTION_TYPE)  || CHR(9)  || TO_CHAR(A.
CRITIC_ID)  || CHR(9)  || RTRIM(B.USER_ACCOUNT)  || CHR(9)  || T
O_CHAR(B.USER_TRUST_SCORE)  || CHR(9)  || TO_CHAR(B.USER_TRUST_C

And you'll get your BIGGEST bang for the buch.  55million executions in 4 hours:

ops$tkyte@ORA920> select 32769/(4*60*60) from dual;

36769/(4*60*60)
---------------
     2.55340278


that's 2.5 times A SECOND.  Fix that query and you'll see a huge gain in overall system response time!




If you notice -- you have enqueue waits -- what are the odds this update is actually getting BLOCKED?  

9 disk Raid5 <<<== raid5 = "slow = true", sure hope redo, temp and arch is NOT on raid 5, no matter what glorious things the hardware vendor said about it.


why not consider running the SQL in the database, not via tuxedo, pre-empt the tuxedo problem entirely.  Tuxedo isn't for batch -- it is for high end OLTP.

(but running the same poorly performing query 2.5 times a second -- some very very looooowwww  hanging fruit there).  


Tell me, when you tested the update -- did you do it when you had Upteem hundred/thousand users concurrently hitting the same data? 

Some ?'s

reader, February 04, 2003 - 4:08 pm UTC

Tom,
In your respose :
select 32769/(4*60*60) from dual;

what does each value represent?

Also, I am facing a situation where enqueue is the top wait event. can you please tell me what is this wait event?

thanks.

Tom Kyte
February 04, 2003 - 4:30 pm UTC

it was from:


  Buffer Gets    Executions  Gets per Exec  % Total  Hash Value
--------------- ------------ -------------- ------- ------------
     54,986,214  <b>     36,769  </b>      1,495.5    18.3    630678823
....

And you'll get your BIGGEST bang for the buch.  55million executions in 4 hours:

ops$tkyte@ORA920> select 32769/(4*60*60) from dual;

36769/(4*60*60)
---------------
     2.55340278


that's 2.5 times A SECOND.


that query is executed 2.5 times a second (4 hours * 60 minutes * 60 seconds )


Enqueue -- easy, thats a fancy name for a LOCK.  You have locking contention (eg: I've updated a row you are waiting to update and you wait for me to commit) 

Can you explain what this indicates

Bala, February 04, 2003 - 9:53 pm UTC

How do you arrive at the figure of 55 million executions from the report...

What does that mean ..

that query is executed 2.5 times a second (4 hours * 60 minutes * 60 seconds )

Can you interpret this



Tom Kyte
February 05, 2003 - 7:46 am UTC

sorry that 55mill exec/hour should have been 36k exec/hour -- the math was right the text wasn't:


Buffer Gets    Executions  Gets per Exec  % Total  Hash Value
--------------- ------------ -------------- ------- ------------
     54,986,214       36,769        1,495.5    18.3    630678823
....

And you'll get your BIGGEST bang for the buch.  <b>36,769</b> executions in 4 hours:

ops$tkyte@ORA920> select 32769/(4*60*60) from dual;

36769/(4*60*60)
---------------
     2.55340278


that's 2.5 times A SECOND.


that query is executed 2.5 times a second (4 hours * 60 minutes * 60 seconds )


What I meant by that was take the number of seconds in 4 hours -- 4*60*60 and divide 36,769 by that amount and you get 2.5 executions per second. 

I am in same situation

reader, February 05, 2003 - 10:28 am UTC

I am trying to resolve locking issues on our production system. Just want to get your opinion on the following, which is just a tiny bit of information from tons that i generated (tkprof and statspack).

I am particularly interested in resolving enqueue. And the table in the query was the one being locked.

Thanks.
Avg
Total Wait wait Waits
Event Waits Timeouts Time (cs) (ms) /txn
---------------------------- ------------ ---------- ----------- ------ ------
enqueue 915 904 278,899 3048 2.9
db file sequential read 107,245 0 126,190 12 336.2
latch free 3,920 3,609 5,391 14 12.3


Buffer Gets Executions Gets per Exec % Total Hash Value
--------------- ------------ -------------- ------- ------------
17,463,282 18,033 968.4 46.6 2921279387
select TRACE01.TYPE ,TRACE01.STATUS ,TRACE01.SEQUENCE into :b0,:
b1,:b2 from TRACE TRACE01 where TRACE01.TYPE=NVL(RTRIM(:b3,' ')
,' ')

Tom Kyte
February 05, 2003 - 12:11 pm UTC

my opinion? looks like you have a blocking problem -- that is all I can say about that.

no idea of the duration of the statspack. is that 2788 seconds of wait over 15 minutes, a day, a week....

use v$lock to query up who's blocking who and on what object periodically.

In addition

reader, February 05, 2003 - 11:45 am UTC

Snap Id Snap Time Sessions
------- ------------------ --------
Begin Snap: 12 31-Jan-03 22:03:19 26
End Snap: 14 31-Jan-03 22:30:45 26
Elapsed: 27.43 (mins)

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
enqueue 915 278,899 66.86
db file sequential read 107,245 126,190 30.25

BIG question is :
how can i have enqueue wait of 47min when the duration of
snapshot is 27 min.

please help -thanks

Tom Kyte
February 05, 2003 - 12:16 pm UTC

if you have 47 people wait 1 minute each, you'll have 47 minutes of enqueue waits.


It is cumulative. you can have a huge number of people waiting. If you have 100 people ALL BLOCKED by the same guy for 5 seconds -- that'll be 500 seconds of wait time.

Blocking?

reader, February 05, 2003 - 12:31 pm UTC

This is a month end batch job. how and what triggers blocking? is this different than locking?

the object that is blocked is the table in the top sql.

thanks for your help. your comments are very helpful

Tom Kyte
February 05, 2003 - 12:47 pm UTC

i update a row

you want to update the same row

I block you until I commit.


it is contention for the same resource. Locking is something that can lead to blocking if the locked elements need to be locked by more then one session at the same time.


that top sql is a select without for update. Other then it does probably way too much LIO -- that query isn't relevant.

It'll be the insert/update/deletes or select for updates against that object that are.

OK 2nd statement in Top SQL

reader, February 05, 2003 - 12:59 pm UTC

Tom,
here is the second statement in TOP sql:
***********************************************
select TRACE01.TYPE ,TRACE01.SEQUENCE ,TRACE01.NUMBER0 ,TRACE01.
RANGE_START ,TRACE01.RANGE_END ,TRACE01.STATUS ,TO_CHAR(TRACE01.
LAST_UPDATED,'YYYY-MM-DDHH24.MI.SS') from TRACE TRACE01 where (
TRACE01.TYPE=:b0 and TRACE01.STATUS=NVL(RTRIM(:b1,' '),' ')) for
update of NUMBER0
***************************************
key point is that it is the same table with a update of
statement. Copule of questions :

1. table is small 220 rows, i am told that a process will go only after its row via TYPE col. my understanding is that for update of clause should not lock the table as it is going via TYPE hence will only lock its row. is my understanding correct?

2.Any comments on db file sequential read wait event it is also quite hige for the snapshot window.

Thanks for your help. i appreciate your comments and they are helping in clearing out the path.

thanks.

Tom Kyte
February 05, 2003 - 1:27 pm UTC

1) correct. as long as they each use the OWN :b0 values for that bind -- they'll not block eachother.

2) it is indicative of doing lots and lots of indexed reads that cause lots of physical IOs

reader, February 05, 2003 - 2:58 pm UTC

Tom,
what does waits and total wait time columns indicate?

does the difference between enqueue wait time and snapshot duration means that I have many sessions going after the same row?

is it possible to get that information from statspack report?

also i thing enqueue wait event indicates application issue, am i on the right path?

thanks and appreciate your time.




Tom Kyte
February 05, 2003 - 6:11 pm UTC

waits = number of discrete times you waited on that event.
total wait time = sum of the times you waited..

100 people wait for 2 seconds apiece means -- 100 waits, 200 seconds total wait time


the enqueue waits means you have a blocking/blockee issue. whether it is a single row or many rows or something else -- cannot say. statspack is aggregate instance stuff, you would have to monitor v$lock to track it down.


enqueue waits are typically an application issue, yes.

Autotrace columns

reader, February 05, 2003 - 4:47 pm UTC

Tom,
what is your take on the following, also if you can reinforce my understanding of what each columns means would be a great help. I trid to search the site couldnt find what i was looking for, unless i missed something.

thanks for your time and advice.

SQL> /


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=32)
   1    0   FOR UPDATE
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'TEST_TRACE' (Cost=1 Ca
          rd=1 Bytes=32)

   3    2       INDEX (UNIQUE SCAN) OF 'TRACE1UX' (UNIQUE)




Statistics
----------------------------------------------------------
         79  recursive calls
          1  db block gets
         35  consistent gets
          0  physical reads
          0  redo size
        709  bytes sent via SQL*Net to client
        358  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> /


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=32)
   1    0   FOR UPDATE
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'TEST_TRACE' (Cost=1 Ca
          rd=1 Bytes=32)

   3    2       INDEX (UNIQUE SCAN) OF 'TRACE1UX' (UNIQUE)




Statistics
----------------------------------------------------------
          0  recursive calls
          1  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        709  bytes sent via SQL*Net to client
        358  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed 

Tom Kyte
February 05, 2003 - 6:15 pm UTC

looks like the first time around it did a HARD parse (eg: query was brand new, never seen before). the recursive sql was to parse the query.

the second time, no hard parse, thus no recursive sql, thus less logical IOS...

Why no commit in tkprof

reader, February 07, 2003 - 1:26 pm UTC

Tom,
when I do select for update and issue a commit it does not get recorded in tkprof. It records it when i do insert statement and commit. Can you please explain why this behaviour?


Tom Kyte
February 07, 2003 - 2:02 pm UTC

i see it in mine:

PARSING IN CURSOR #3 len=26 dep=0 uid=146 oct=2 lid=146 tim=1020160631338492 hv=11516084 ad='5eb5fd78'
insert into t values ( 1 )
END OF STMT
PARSE #3:c=3906,e=3149,p=0,cr=1,cu=0,mis=1,r=0,dep=0,og=4,tim=1020160631338469
BINDS #3:
EXEC #3:c=0,e=720,p=0,cr=1,cu=20,mis=0,r=1,dep=0,og=4,tim=1020160631339352
WAIT #3: nam='SQL*Net message to client' ela= 7 p1=1650815232 p2=1 p3=0
WAIT #3: nam='SQL*Net message from client' ela= 1490 p1=1650815232 p2=1 p3=0
=====================
PARSING IN CURSOR #3 len=6 dep=0 uid=146 oct=44 lid=146 tim=1020160631344825 hv=3867936055 ad='5daf8d44'
commit
END OF STMT
PARSE #3:c=0,e=114,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=1020160631344801
XCTEND rlbk=0, rd_only=0

EXEC #3:c=0,e=719,p=0,cr=0,cu=1,mis=0,r=0,dep=0,og=4,tim=1020160631345671
WAIT #3: nam='SQL*Net message to client' ela= 6 p1=1650815232 p2=1 p3=0
WAIT #3: nam='SQL*Net message from client' ela= 860 p1=1650815232 p2=1 p3=0
=====================
PARSING IN CURSOR #3 len=26 dep=0 uid=146 oct=3 lid=146 tim=1020160631348200 hv=2108716148 ad='5eba58f8'
select * from t for update
END OF STMT
PARSE #3:c=1953,e=1051,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=4,tim=1020160631348175
BINDS #3:
EXEC #3:c=0,e=545,p=0,cr=7,cu=2,mis=0,r=0,dep=0,og=4,tim=1020160631348928
WAIT #3: nam='SQL*Net message to client' ela= 5 p1=1650815232 p2=1 p3=0
FETCH #3:c=0,e=96,p=0,cr=5,cu=0,mis=0,r=1,dep=0,og=4,tim=1020160631349194
WAIT #3: nam='SQL*Net message from client' ela= 518 p1=1650815232 p2=1 p3=0
FETCH #3:c=0,e=83,p=0,cr=2,cu=0,mis=0,r=0,dep=0,og=4,tim=1020160631350070
WAIT #3: nam='SQL*Net message to client' ela= 5 p1=1650815232 p2=1 p3=0
WAIT #3: nam='SQL*Net message from client' ela= 1431762 p1=1650815232 p2=1 p3=0
STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='FOR UPDATE (cr=7 r=0 w=0 time=146 us)'
STAT #3 id=2 cnt=2 pid=1 pos=1 obj=39210 op='TABLE ACCESS FULL T (cr=14 r=0 w=0 time=278 us)'
=====================
PARSING IN CURSOR #3 len=6 dep=0 uid=146 oct=44 lid=146 tim=1020160632782666 hv=3867936055 ad='5daf8d44'
commit
END OF STMT
PARSE #3:c=0,e=113,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=1020160632782645
XCTEND rlbk=0, rd_only=0

EXEC #3:c=0,e=323,p=0,cr=0,cu=1,mis=0,r=0,dep=0,og=4,tim=1020160632783109
WAIT #3: nam='log file sync' ela= 379 p1=652 p2=0 p3=0
WAIT #3: nam='SQL*Net message to client' ela= 6 p1=1650815232 p2=1 p3=0
WAIT #3: nam='SQL*Net message from client' ela= 554701 p1=1650815232 p2=1 p3=0
XCTEND rlbk=0, rd_only=1
=====================


Please ignore previous ?

reader, February 07, 2003 - 1:47 pm UTC

I was aggregating please ignore previous ?

thank you

Selecting the correct snapshot time interval

Mohsin, March 12, 2003 - 4:03 pm UTC

Tom:

Earlier on in this thread you suggested a 15-30min sampling interval. Could you please share your real life experience on the correct selection of this interval.

Say a sql starts at time t1 and ends at time t2
but my first snapshot was taken at t1-14 mins and the next one was at t1+1 min. So from the time the sql started and finished the performance metrics are now in two snapshots. How would this be represented in the report if I ran the statspack report that spanned only one snapshot and not the next (which is when the sql actually completed).
Which raises the issue, Having collected statspack info at 15 min interval, is the report that spans several snapshots an average of data collected over all the snapshots.

thanks for sharing your vast pool of knowledge.

MJ

Tom Kyte
March 13, 2003 - 7:28 am UTC

15-30 minutes is the correct sample size.

You are looking at the SYSTEM, not a single query, not a single user. You take a 15 minute window and say "so, what's happening"



Agreed but

Mohsin, March 13, 2003 - 10:05 am UTC

I agree that the data collected is at the system level.

So what path can I follow to "drill" down a statspack datacollection exercise during which we collected snapshot data at a 15min interval over a business day and now I need to analyze the data to determine the "cause of bad/poor performance" or determine what was happening in the d/b.

Let's say we identified that users percieved poor performance was at around 10:00 am.

How should I progress from here?


Tom Kyte
March 14, 2003 - 5:19 pm UTC

umm, look at the top timed events and go from there.

If you have no waits of any measure, then you have done what you can -- beyond changing your ALGORITHMS -- which means you have to go back to the applications and use profilers (dbms_profiler or sql_trace+tkprof) and make them more efficient.

If you have massive waits on resources, remove them -- how you do that varies wait by wait. Big library cache latch waits? you didn't use binds. big PIO waits -- maybe a query or two or more needs to be tuned and so on.

dba

Harry, March 13, 2003 - 4:05 pm UTC

Tom, is there any sql script that can be used to find most expensive sql without using stats pack, and better if it also tell me from how long this query is active.

Regards
Garry

Tom Kyte
March 14, 2003 - 5:43 pm UTC

define "most expensive".

then look at v$sql -- lots of stuff there (hey, thats what statspack uses)

but stats pack will tell you the most "expensive" (by MANY different criteria) over a period of time which is more relevant probably then "forever"

What does CPU Time mean

Arun Gupta, April 02, 2003 - 11:06 pm UTC

Tom
In Oracle 9.2, in the statspack report, I get:

Top 5 timed events %Total
Event Time(s) Ela time
-------------------------------------------------
CPU Time 70 79.8

What is the meaning of CPU Time and why is it appearing ?
Thanks


Tom Kyte
April 03, 2003 - 7:47 am UTC

The Top 5 Wait events section on the first page of the Statspack report (spreport.sql) has been renamed to Top 5 Timed events as it now includes CPU usage as well as wait events. This avoids the need to look at the v$sysstat section of the report to determine if CPU usage or wait events are the main time consumers on a system. Note: this does not mean that there is a CPU time wait event!

from spdoc.txt in rdbms/admin:

9. New and Changed Features
----------------------------

9.1. Changes between 9.0 and 9.2

Changes on the Summary Page of the Instance Report (spreport.sql)

o The Top 5 Wait Events has been changed to be the Top 5 Timed Events.

What was previously the Top 5 Wait Events has been expanded to give the
Top 5 time usage within the instance: i.e. in addition to including Wait
events, this section can now include the 'CPU used by this session'
statistic. This statistic will appear in the Top 5 only if it's value
is one of the the Top 5 users of time for the snapshot interval.

Note that the name of the statistic 'CPU used by this session' will
actually appear in the Top 5 section as 'CPU Time'. The statistic
name is masked in the Top 5 to avoid the confusion of the suffix
'by this session'.
The statistic will continue to appear in the System Statistics
(SYSSTAT) section of the report as 'CPU used by this session'.

Additionally, instead of the percentage calculation being the % Total
Wait Time (which is time for each wait event divided by the total wait
time for this snapshot interval), the percentage calculation is now
% Total Elapsed Time (which is time for each timed event divided by
the total elapsed time).
i.e.
previously: time for each wait event / total wait time for all events
now: time for each timed event / total elapsed time

The total elapsed time is computed using the total wait time for all
events added to the total CPU time used for the interval
i.e.
total elapsed time = total wait time + total CPU time

Purpose
~~~~~~~
The purpose for including CPU time with wait events:

When tuning a system, the first step is to identify where the most of the
time is spent, in order to identify where the most productive tuning
effort should be concentrated.

The majority of time could be spent in waiting for events to complete
(and so be identifyable in the wait event data), or the system could be
consuming much CPU (for which Operating System statistics, and the Oracle
CPU statistic 'CPU used by this session' in SYSSTAT are examined).
Having the CPU Time co-located with the wait events in the Top 5 section
of the instance report makes it easier to compare the relative values
and to identify whether the most productive investigation would occur
by drilling down the wait events, or in reducing Oracle CPU usage
(e.g. by tuning SQL).


any demo?

Reader, July 25, 2003 - 8:33 pm UTC

Tom, do you have any demo to illustrate oracle wait events? Thanks.

Tom Kyte
July 25, 2003 - 8:51 pm UTC

umm, no -- not really.


A reader, July 26, 2003 - 11:55 am UTC

Tom in v$sqlarea the column sql_text is varchar2(1000), what if the query is more than that and I would like to see the entire query without getting truncated.

Thanks.

Tom Kyte
July 26, 2003 - 12:55 pm UTC

v$sqltext_with_newlines


has the full text

A reader, July 26, 2003 - 7:35 pm UTC

Thanks Tom!!

Confused with CPU time.

Babu, August 21, 2003 - 2:59 pm UTC

Hi Tom,

Excume me for asking a simple question here:
What is CPU time, how it is measured, and how is it realted load average???

I see the following output in my statspack report with 4hr interval(very long, i'm gonna use 15 min intervals from now onwards).

Statistic Total per Second per Trans
--------------------------------- ---------------- ------------ ------------
CPU used by this session 2,513,525 58.1 9.5
CPU used when call started 2,515,025 58.2 9.5

What is CPU used by this session "Toal", "Per second", and "per transaction"????

Whats is the unit of CPU time, when it says CPU used by this session per second is 58.1 -- what does 58.1 indicate.

Please explain. Thanks in advance...

Tom Kyte
August 21, 2003 - 7:22 pm UTC

cpu seconds is the number of seconds the cpu was being used.


that number is in centi seconds (100'ths of a second).

so that says "every second, you used 0.58 cpu seconds". if you have a single cpu, that means on average you used about 60% of it.


is that a 12 hour snap really?

Re: Confused with CPU.

Babu, August 21, 2003 - 8:36 pm UTC

Yes, it was a 12 hr snap report.

We have, 42% cpu time event, 25% db file sequential wait event, and 10% db file scattered read wait event.

I can tune sequential and scattered read wait events, but how can I tune CPU time event?

I think, tuning large buffer gets versus executions sql statements in v$sql will reduce CPU time event. Are there any other ways to tune CPU time event, other than looking at expensive sql's.

Thanks a lot for your time.

Tom Kyte
August 21, 2003 - 8:51 pm UTC

math doesn't lie! i was a math major in college :) I knew it was 12 hours, not 4.

the cpu event is just a TIMED EVENT.

it is not a wait event

42% means "i used 42 out of 100 units of some resource that I cannot put into the bank and use later"

you should only be worried if you are near 100%!!! 42% is great -- really. it means you bought two times at much hardware as you really needed!!!!




Math major...;)

Kishan, August 21, 2003 - 11:59 pm UTC

Dear Tom:
I am not sure if you are a math major or magician. Top of this page has this computatin...We all know you can do wonders with Oracle, but how did you manage this? Numbers don't lie eh?:)

ops$tkyte@ORA920> select 32769/(4*60*60) from dual;

36769/(4*60*60)
---------------
2.55340278



Asks for some username/password

A reader, August 22, 2003 - 9:16 am UTC

Hi Tom,

The link you have given in above response asks some userid/password ???



Tom Kyte
August 22, 2003 - 9:26 am UTC

whoops, corrected that.

CPU usage from STATSPACK

Leo, August 22, 2003 - 12:05 pm UTC

Base on the report, our system CPU usage would be more 100%. But I monitored the system, the CPU usage was about 30%.

Snap Length
Start Id End Id Start Time End Time (Minutes)
-------- -------- -------------------- -------------------- -----------
5239 5241 14-Aug-03 14:00:05 14-Aug-03 14:30:03 29.97

Statistic Total per Second per Trans
--------------------------------- ---------------- ------------ ------------
CPU used by this session 1,615,612 898.6 19.6
CPU used when call started 530,644 295.1 6.4

The system has 8 CPUs, the percentage would be

8.98/8=1.12 = 112%

Is my calculation right?



Tom Kyte
August 22, 2003 - 7:53 pm UTC


see
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:7641015793792#8768340190291 <code>

and consider that information....

all you needed was a job that was running for 5 hours to finish in that window.

physical reads

Reader, August 22, 2003 - 11:26 pm UTC

when I am loading data using sql loader, how does oracle account for physical reads and writes? Would oracle consider this (sqlloader operation) also a part of cache hit ratio? Thanks.

Tom Kyte
August 23, 2003 - 10:24 am UTC

depends on whether you use conventional path loads (sql inserts are used) or direct path loads (read input files and write database files, skip the sql engine for inserting)

conventional path loading, sqlldr is just another program doing sql against the database, nothing special at all.

direct path loading, sqlldr bypasses the buffer cache.


cache hit ratios are virtually meaningless!


CPU Time in statspack

A reader, September 10, 2003 - 11:00 pm UTC

Sir,
If CPU time is the topper in the statspack report, should my priority be to look into the topper sql's OR first tune the foreground waits ( which appear after CPU Time) and then look into the sql's if possible.
Rgds..



Tom Kyte
September 11, 2003 - 8:27 am UTC

no...


in many cases -- you would be jumping up and down for joy, you actually used a resource that if you don't use -- you cannot "save up" (eg: if you have a cpu minute and only use 30 cpu seconds, you'll never ever get that other 30 cpu seconds back again)....


it just shows you about how much cpu you used during the window of observation. If you have a 15 minute stats pack, on a 2 cpu box, you have 15*60*2 cpu seconds (1800). if the statspack reports says "hey, you had 1,000 cpu seconds" -- that means there was 800 more you didn't use (you were about 50-60% utilized).

It is just a number...

You need to understand what it represents in the grand context of things...

remember -- there will always be a top five timed events, they are just numbers. you may need to do nothing, you may need to do something.

Your new book

A reader, January 21, 2004 - 8:08 pm UTC

Tom,

Are the info about STATSPACK available in your new book. If not where can I find some very basic info about it. I am very new to using this tool. Please help

Thanks

Tom Kyte
January 22, 2004 - 6:29 am UTC

well, have you read the performance guide (online at otn.oracle.com)

i talk about it -- i do not go into gory detail on every wait, every statistic (the reference guide does most of that -- metalink.oracle.com would have more info)

It is what it is -- a fairly basic report.

Buffer waits

MEHMOOD, June 05, 2004 - 4:58 pm UTC

Dear Tom:

I have got this by my statspack report. Can you explain it?? how can i tune it?? and is there any tuning needed for this part ??

1-
=========================
Tablespace IO Stats for DB: ECRM Instance: ecrm Snaps: 1 -2
->ordered by IOs (Reads + Writes) desc

Tablespace
------------------------------
Av Av Av Av Buffer Av Buf
Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
ADEP_TBL
19,829 8 0.5 12.0 135 0 199 3.5
TOOLS
1 0 50.0 1.0 1,231 0 0 0.0
UNDOTBS1
1 0 50.0 1.0 729 0 6 0.0
SYSTEM
============================

Many thanks in advance

Tom Kyte
June 06, 2004 - 10:40 am UTC



Looks great.
Looks horrible.
It is just OK.

This is similar to:

I stopped at 55 red lights on my last car trip. Is that good, bad or indifferent.

Don't worry -- whichever you pick from the above list -- I'll convince you the other two were true.



No one could make any sort of comments good bad or otherwise based on the above. We have no idea what the underlying file system looks like (maybe you have 1 terabyte of perfectly striped storage. Maybe you put everything on a single IDE drive). Maybe this was over 2 days. Maybe this was over 5 seconds.

Numbers out of context, meaningless.

More details of statspack report

MEHMOOD, June 07, 2004 - 3:47 am UTC

Dear Tom:

I am pasting here more details of statspack report. Kinldy advice and comment on the statspack report.

=================================
STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
db1 3541912267 db1 1 9.2.0.2.0 NO dbhost

Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 1 02-Jun-04 14:40:44 39 2.5
End Snap: 2 02-Jun-04 15:23:54 41 2.3
Elapsed: 43.17 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 224M Std Block Size: 8K
Shared Pool Size: 208M Log Buffer: 512K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 4,400.78 5,328.67
Logical reads: 950.24 1,150.59
Block changes: 23.93 28.98
Physical reads: 92.22 111.67
Physical writes: 1.11 1.34
User calls: 30.69 37.16
Parses: 6.47 7.83
Hard parses: 0.05 0.06
Sorts: 0.89 1.08
Logons: 0.02 0.03
Executes: 9.31 11.27
Transactions: 0.83

% Blocks changed per Read: 2.52 Recursive Call %: 18.40
Rollback per transaction %: 0.00 Rows per Sort: 180.34

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 99.99
Buffer Hit %: 90.30 In-memory Sort %: 100.00
Library Hit %: 99.51 Soft Parse %: 99.20
Execute to Parse %: 30.53 Latch Hit %: 99.98
Parse CPU to Parse Elapsd %: 91.74 % Non-Parse CPU: 97.58

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 90.00 91.23
% SQL with executions>1: 42.30 42.46
% Memory for SQL w/exec>1: 78.78 78.99

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 41 63.71
db file scattered read 18,013 10 14.88
log file sequential read 106 6 9.27
log file sync 2,141 3 4.58
db file parallel write 209 2 2.81
-------------------------------------------------------------
Wait Events for DB: db1 Instance: db1 Snaps: 1 -2
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file scattered read 18,013 0 10 1 8.4
log file sequential read 106 0 6 57 0.0
log file sync 2,141 0 3 1 1.0
db file parallel write 209 0 2 9 0.1
db file sequential read 1,856 0 1 1 0.9
control file parallel write 858 0 1 1 0.4
buffer busy waits 205 0 1 3 0.1
async disk IO 290 0 0 0 0.1
control file sequential read 640 0 0 0 0.3
log file switch completion 2 0 0 48 0.0
SQL*Net break/reset to clien 602 0 0 0 0.3
log buffer space 4 0 0 10 0.0
latch free 13 4 0 3 0.0
log file single write 4 0 0 9 0.0
direct path read 15 0 0 2 0.0
log file parallel write 3,201 2,898 0 0 1.5
SQL*Net more data to client 155 0 0 0 0.1
LGWR wait for redo copy 75 0 0 0 0.0
direct path write 15 0 0 0 0.0
SQL*Net message from client 79,503 0 54,185 682 37.2
virtual circuit status 86 86 2,520 29297 0.0
wakeup time manager 82 82 2,458 29976 0.0
SQL*Net more data from clien 202 0 4 18 0.1
SQL*Net message to client 79,505 0 0 0 37.2
-------------------------------------------------------------
Background Wait Events for DB: db1 Instance: db1 Snaps: 1 -2
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file parallel write 209 0 2 9 0.1
control file parallel write 854 0 1 1 0.4
control file sequential read 366 0 0 0 0.2
log file single write 4 0 0 9 0.0
direct path read 15 0 0 2 0.0
log file sequential read 4 0 0 7 0.0
db file scattered read 1 0 0 14 0.0
log file parallel write 3,201 2,898 0 0 1.5
LGWR wait for redo copy 75 0 0 0 0.0
direct path write 15 0 0 0 0.0
rdbms ipc message 11,736 8,314 16,063 1369 5.5
smon timer 8 8 2,400 ###### 0.0
-------------------------------------------------------------

=============================

Thanks in advance

Tom Kyte
June 07, 2004 - 8:29 am UTC

it is 3-4 times longer then recommended (15 minutes or so is sufficient)

other than that, it looks "just fine". it is an almost idle system, very little work being done, very few waits.


more statspack details

MEHMOOD, June 07, 2004 - 9:13 am UTC

Dear Tom:

Here is the other details of the statspack report. Kindly explain it, especially "advisory" sections.

====================================================
Buffer Pool Statistics for DB: db1 Instance: db1 Snaps: 1 -2
-> Standard block size Pools D: default, K: keep, R: recycle
-> Default Pools for other block sizes: 2k, 4k, 8k, 16k, 32k

Free Write Buffer
Number of Cache Buffer Physical Physical Buffer Complete Busy
P Buffers Hit % Gets Reads Writes Waits Waits Waits
--- ---------- ----- ----------- ----------- ---------- ------- -------- ------
D 27,790 90.3 2,463,117 238,844 2,848 0 0 205
-------------------------------------------------------------

Instance Recovery Stats for DB: db1 Instance: db1 Snaps: 1 -2
-> B: Begin snapshot, E: End snapshot

Targt Estd Log File Log Ckpt Log Ckpt
MTTR MTTR Recovery Actual Target Size Timeout Interval
(s) (s) Estd IOs Redo Blks Redo Blks Redo Blks Redo Blks Redo Blks
- ----- ----- ---------- ---------- ---------- ---------- ---------- ----------
B 27 9 2231 25621 25091 184320 25091
E 27 8 1101 14606 14609 184320 14609
-------------------------------------------------------------

Buffer Pool Advisory for DB: db1 Instance: db1 End Snap: 2
-> Only rows with estimated physical reads >0 are displayed
-> ordered by Block Size, Buffers For Estimate

Size for Size Buffers for Est Physical Estimated
P Estimate (M) Factr Estimate Read Factor Physical Reads
--- ------------ ----- ---------------- ------------- ------------------
D 16 .1 1,985 24.20 732,554,743
D 32 .1 3,970 24.02 726,816,514
D 48 .2 5,955 23.29 704,721,383
D 64 .3 7,940 6.22 188,119,061
D 80 .4 9,925 1.17 35,338,514
D 96 .4 11,910 1.10 33,221,487
D 112 .5 13,895 1.06 32,037,364
D 128 .6 15,880 1.04 31,472,731
D 144 .6 17,865 1.03 31,167,353
D 160 .7 19,850 1.02 30,919,627
D 176 .8 21,835 1.02 30,778,818
D 192 .9 23,820 1.01 30,632,069
D 208 .9 25,805 1.01 30,598,527
D 224 1.0 27,790 1.00 30,264,848
D 240 1.1 29,775 1.00 30,241,089
D 256 1.1 31,760 1.00 30,219,775
D 272 1.2 33,745 0.99 30,033,195
D 288 1.3 35,730 0.99 29,885,048
D 304 1.4 37,715 0.99 29,853,951
D 320 1.4 39,700 0.99 29,827,397
-------------------------------------------------------------


Buffer wait Statistics for DB: db1 Instance: db1 Snaps: 1 -2
-> ordered by wait time desc, waits desc

Tot Wait Avg
Class Waits Time (s) Time (ms)
------------------ ----------- ---------- ---------
data block 199 1 4
undo block 6 0 0
-------------------------------------------------------------
PGA Aggr Target Stats for DB: db1 Instance: db1 Snaps: 1 -2
-> B: Begin snap E: End snap (rows dentified with B or E contain data
which is absolute i.e. not diffed over the interval)
-> PGA cache hit % - percentage of W/A (WorkArea) data processed only in-memory
-> Auto PGA Target - actual workarea memory target
-> W/A PGA Used - amount of memory used for all Workareas (manual + auto)
-> %PGA W/A Mem - percentage of PGA memory allocated to workareas
-> %Auto W/A Mem - percentage of workarea memory controlled by Auto Mem Mgmt
-> %Man W/A Mem - percentage of workarea memory under manual control

PGA Cache Hit % W/A MB Processed Extra W/A MB Read/Written
--------------- ---------------- -------------------------
100.0 142 0

%PGA %Auto %Man
PGA Aggr Auto PGA PGA Mem W/A PGA W/A W/A W/A Global Mem
Target(M) Target(M) Alloc(M) Used(M) Mem Mem Mem Bound(K)
- --------- --------- ---------- ---------- ------ ------ ------ ----------
B 62 37 42.8 0.0 .0 .0 .0 3,181
E 62 36 43.7 0.0 .0 .0 .0 3,181
-------------------------------------------------------------

PGA Aggr Target Histogram for DB: db1 Instance: db1 Snaps: 1 -2
-> Optimal Executions are purely in-memory operations

Low High
Optimal Optimal Total Execs Optimal Execs 1-Pass Execs M-Pass Execs
------- ------- -------------- ------------- ------------ ------------
8K 16K 1,357 1,357 0 0
16K 32K 32 32 0 0
32K 64K 3 3 0 0
64K 128K 2 2 0 0
128K 256K 156 156 0 0
512K 1024K 4 4 0 0
1M 2M 78 78 0 0
2M 4M 2 2 0 0
-------------------------------------------------------------

PGA Memory Advisory for DB: db1 Instance: db1 End Snap: 2
-> When using Auto Memory Mgmt, minimally choose a pga_aggregate_target value
where Estd PGA Overalloc Count is 0

Estd Extra Estd PGA Estd PGA
PGA Target Size W/A MB W/A MB Read/ Cache Overalloc
Est (MB) Factr Processed Written to Disk Hit % Count
---------- ------- ---------------- ---------------- -------- ----------
16 0.3 28,050.5 943.4 97.0 257
31 0.5 28,050.5 131.1 100.0 50
47 0.8 28,050.5 123.7 100.0 43
62 1.0 28,050.5 0.0 100.0 7
75 1.2 28,050.5 0.0 100.0 0
87 1.4 28,050.5 0.0 100.0 0
99 1.6 28,050.5 0.0 100.0 0
112 1.8 28,050.5 0.0 100.0 0
124 2.0 28,050.5 0.0 100.0 0
186 3.0 28,050.5 0.0 100.0 0
249 4.0 28,050.5 0.0 100.0 0
373 6.0 28,050.5 0.0 100.0 0
497 8.0 28,050.5 0.0 100.0 0
-------------------------------------------------------------
Rollback Segment Stats for DB: db1 Instance: db1 Snaps: 1 -2
->A high value for "Pct Waits" suggests more rollback segments may be required
->RBS stats may not be accurate between begin and end snaps when using Auto Undo
managment, as RBS may be dynamically created and dropped as needed

Trans Table Pct Undo Bytes
RBS No Gets Waits Written Wraps Shrinks Extends
------ -------------- ------- --------------- -------- -------- --------
0 9.0 0.00 0 0 0 0
1 1,111.0 0.00 175,462 2 0 0
2 1,735.0 0.00 211,842 0 0 0
3 1,422.0 0.00 304,618 1 0 0
4 1,318.0 0.00 347,346 1 0 0
5 1,700.0 0.00 250,244 0 0 0
6 1,407.0 0.00 213,008 1 0 0
7 1,065.0 0.00 191,702 0 0 0
8 1,719.0 0.00 218,534 0 0 0
9 1,735.0 0.00 1,226,432 2 0 2
10 1,576.0 0.00 496,944 1 0 0
-------------------------------------------------------------
Rollback Segment Storage for DB: db1 Instance: db1 Snaps: 1 -2
->Optimal Size should be larger than Avg Active

RBS No Segment Size Avg Active Optimal Size Maximum Size
------ --------------- --------------- --------------- ---------------
0 385,024 0 385,024
1 2,220,032 480,529 3,268,608
2 3,268,608 584,495 4,317,184
3 2,220,032 557,324 4,317,184
4 3,268,608 534,367 4,317,184
5 3,268,608 566,470 24,240,128
6 4,317,184 587,203 4,317,184
7 2,220,032 568,629 9,560,064
8 3,268,608 590,992 5,365,760
9 4,317,184 733,416 4,317,184
10 2,220,032 555,789 3,268,608
-------------------------------------------------------------
Undo Segment Summary for DB: db1 Instance: db1 Snaps: 1 -2
-> Undo segment block stats:
-> uS - unexpired Stolen, uR - unexpired Released, uU - unexpired reUsed
-> eS - expired Stolen, eR - expired Released, eU - expired reUsed

Undo Undo Num Max Qry Max Tx Snapshot Out of uS/uR/uU/
TS# Blocks Trans Len (s) Concurcy Too Old Space eS/eR/eU
---- -------------- ---------- -------- ---------- -------- ------ -------------
1 865 7,924,796 6 1 0 0 0/0/0/0/0/0
-------------------------------------------------------------


Undo Segment Stats for DB: db1 Instance: db1 Snaps: 1 -2
-> ordered by Time desc

Undo Num Max Qry Max Tx Snap Out of uS/uR/uU/
End Time Blocks Trans Len (s) Concy Too Old Space eS/eR/eU
------------ ------------ -------- ------- -------- ------- ------ -------------
02-Jun 15:21 63 ######## 3 1 0 0 0/0/0/0/0/0
02-Jun 15:11 68 ######## 3 1 0 0 0/0/0/0/0/0
02-Jun 15:01 129 ######## 5 1 0 0 0/0/0/0/0/0
02-Jun 14:51 59 ######## 5 1 0 0 0/0/0/0/0/0
02-Jun 14:41 546 ######## 6 1 0 0 0/0/0/0/0/0
-------------------------------------------------------------
Latch Activity for DB: db1 Instance: db1 Snaps: 1 -2
->"Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for
willing-to-wait latch get requests
->"NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests
->"Pct Misses" for both should be very close to 0.0

Pct Avg Wait Pct
Get Get Slps Time NoWait NoWait
Latch Requests Miss /Miss (s) Requests Miss
------------------------ -------------- ------ ------ ------ ------------ ------
Consistent RBA 3,204 0.0 0 0
FAL request queue 2 0.0 0 0
FIB s.o chain latch 18 0.0 0 0
FOB s.o list latch 34 0.0 0 0
SQL memory manager latch 1 0.0 0 842 0.0
SQL memory manager worka 57,503 0.0 0 0
active checkpoint queue 1,054 0.0 0 0
alert log latch 5 0.0 0 0
archive control 88 0.0 0 0
archive process latch 57 0.0 0 0
cache buffer handles 488 0.0 0 0
cache buffers chains 5,287,588 0.0 0.0 0 465,443 0.0
cache buffers lru chain 253,252 0.0 0.0 0 256,993 0.0
channel handle pool latc 110 0.0 0 0
channel operations paren 1,859 0.0 0 0
checkpoint queue latch 66,224 0.0 0 1,729 0.0
child cursor hash table 2,521 0.0 0 0
dml lock allocation 21,333 0.0 0 0
dummy allocation 110 0.0 0 0
enqueue hash chains 36,830 0.0 0 0
enqueues 11,179 0.0 0 0
event group latch 56 0.0 0 0
hash table column usage 23 0.0 0 3,142 0.0
job_queue_processes para 42 0.0 0 0
ktm global data 8 0.0 0 0
kwqit: protect wakeup ti 82 0.0 0 0
lgwr LWN SCN 3,388 0.1 0.0 0 0
library cache 255,258 0.0 0.1 0 0
library cache load lock 114 0.0 0 0
library cache pin 144,470 0.0 0.0 0 0
library cache pin alloca 78,597 0.0 0 0
list of block allocation 77 0.0 0 0
loader state object free 12 0.0 0 0
messages 27,321 0.0 0.0 0 0
mostly latch-free SCN 3,407 0.4 0.0 0 0
multiblock read objects 39,322 0.0 0 0
ncodef allocation latch 41 0.0 0 0
post/wait queue 3,220 0.0 0 2,148 0.0
process allocation 56 0.0 0 56 0.0
process group creation 110 0.0 0 0
redo allocation 38,745 0.0 0.0 0 0
redo copy 0 0 32,224 0.2
redo writing 12,433 0.0 0 0
row cache enqueue latch 16,684 0.0 0 0
row cache objects 17,193 0.0 0 0
sequence cache 174 0.0 0 0
session allocation 22,608 0.0 0 0
session idle bit 166,473 0.0 0.0 0 0
session switching 41 0.0 0 0
session timer 865 0.0 0 0
Latch Activity for DB: db1 Instance: db1 Snaps: 1 -2
->"Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for
willing-to-wait latch get requests
->"NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests
->"Pct Misses" for both should be very close to 0.0

Pct Avg Wait Pct
Get Get Slps Time NoWait NoWait
Latch Requests Miss /Miss (s) Requests Miss
------------------------ -------------- ------ ------ ------ ------------ ------
shared pool 96,958 0.0 0.2 0 0
sim partition latch 0 0 35 0.0
simulator hash latch 164,009 0.0 0.0 0 0
simulator lru latch 15,303 0.0 0.0 0 1,115 0.0
sort extent pool 49 0.0 0 0
transaction allocation 111 0.0 0 0
transaction branch alloc 41 0.0 0 0
undo global data 25,374 0.0 0.0 0 0
user lock 216 0.0 0 0
-------------------------------------------------------------
Latch Sleep breakdown for DB: db1 Instance: db1 Snaps: 1 -2
-> ordered by misses desc

Get Spin &
Latch Name Requests Misses Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
cache buffers chains 5,287,588 902 5 0/0/0/0/0
library cache 255,258 59 6 53/6/0/0/0
shared pool 96,958 10 2 8/2/0/0/0
-------------------------------------------------------------
Latch Miss Sources for DB: db1 Instance: db1 Snaps: 1 -2
-> only latches with sleeps are shown
-> ordered by name, sleeps desc

NoWait Waiter
Latch Name Where Misses Sleeps Sleeps
------------------------ -------------------------- ------- ---------- --------
cache buffers chains kcbgtcr: kslbegin excl 0 3 3
cache buffers chains kcbchg: kslbegin: bufs not 0 1 1
cache buffers chains kcbrls: kslbegin 0 1 1
library cache kglic 0 6 0
shared pool kghalo 0 1 0
shared pool kghfen: not perm alloc cla 0 1 0
-------------------------------------------------------------
Dictionary Cache Stats for DB: db1 Instance: db1 Snaps: 1 -2
->"Pct Misses" should be very low (< 2% in most cases)
->"Cache Usage" is the number of cache entries being used
->"Pct SGA" is the ratio of usage to allocated size for that cache

Get Pct Scan Pct Mod Final
Cache Requests Miss Reqs Miss Reqs Usage
------------------------- ------------ ------ ------- ----- -------- ----------
dc_histogram_defs 1,162 0.0 0 0 5,919
dc_object_ids 1,454 0.2 0 0 1,924
dc_objects 905 2.4 0 0 2,672
dc_profiles 57 0.0 0 0 1
dc_rollback_segments 189 0.0 0 0 22
dc_segments 2,089 0.0 0 0 2,535
dc_sequences 3 0.0 0 3 3
dc_tablespaces 1,310 0.0 0 0 17
dc_user_grants 353 0.0 0 0 25
dc_usernames 196 0.0 0 0 46
dc_users 1,194 0.0 0 1 46
-------------------------------------------------------------


Library Cache Activity for DB: db1 Instance: db1 Snaps: 1 -2
->"Pct Misses" should be very low

Get Pct Pin Pct Invali-
Namespace Requests Miss Requests Miss Reloads dations
--------------- ------------ ------ -------------- ------ ---------- --------
BODY 10 0.0 10 0.0 0 0
INDEX 574 0.0 574 0.0 0 0
SQL AREA 16,745 0.5 67,840 0.4 13 0
TABLE/PROCEDURE 2,527 1.0 3,098 2.6 5 0
TRIGGER 109 0.0 109 0.0 0 0
-------------------------------------------------------------
Shared Pool Advisory for DB: db1 Instance: db1 End Snap: 2
-> Note there is often a 1:Many correlation between a single logical object
in the Library Cache, and the physical number of memory objects associated
with it. Therefore comparing the number of Lib Cache objects (e.g. in
v$librarycache), with the number of Lib Cache Memory Objects is invalid

Estd
Shared Pool SP Estd Estd Estd Lib LC Time
Size for Size Lib Cache Lib Cache Cache Time Saved Estd Lib Cache
Estim (M) Factr Size (M) Mem Obj Saved (s) Factr Mem Obj Hits
----------- ----- ---------- ------------ ------------ ------- ---------------
112 .5 100 14,763 40,590 1.0 5,776,674
144 .7 131 18,133 40,676 1.0 5,792,676
176 .8 162 24,243 40,681 1.0 5,795,028
208 1.0 193 30,370 40,687 1.0 5,798,846
240 1.2 225 37,310 40,688 1.0 5,800,091
272 1.3 256 44,261 40,688 1.0 5,800,671
304 1.5 287 49,835 40,689 1.0 5,801,697
336 1.6 319 58,277 40,695 1.0 5,803,078
368 1.8 338 62,300 40,697 1.0 5,804,786
400 1.9 338 62,300 40,697 1.0 5,804,786
432 2.1 338 62,300 40,697 1.0 5,804,786
-------------------------------------------------------------
SGA Memory Summary for DB: db1 Instance: db1 Snaps: 1 -2

SGA regions Size in Bytes
------------------------------ ----------------
Database Buffers 234,881,024
Fixed Size 742,152
Redo Buffers 798,720
Variable Size 637,534,208
----------------
sum 873,956,104
-------------------------------------------------------------


SGA breakdown difference for DB: db1 Instance: db1 Snaps: 1 -2

Pool Name Begin value End value % Diff
------ ------------------------------ ---------------- ---------------- -------
java free memory 263,802,880 263,802,880 0.00
java memory in use 4,632,576 4,632,576 0.00
large free memory 117,440,512 117,440,512 0.00
shared 1M buffer 1,056,768 1,056,768 0.00
shared FileOpenBlock 4,168,240 4,168,240 0.00
shared KGK heap 7,000 7,000 0.00
shared KGLS heap 13,666,504 13,674,896 0.06
shared KQR L PO 4,698,216 4,720,744 0.48
shared KQR M PO 4,694,520 4,694,520 0.00
shared KQR S SO 6,400 6,400 0.00
shared MTTR advisory 63,840 63,840 0.00
shared PL/SQL DIANA 5,562,272 5,562,272 0.00
shared PL/SQL MPCODE 1,932,536 1,932,536 0.00
shared PLS non-lib hp 2,088 2,088 0.00
shared VIRTUAL CIRCUITS 1,243,200 1,243,200 0.00
shared XDB Schema Cac 5,650,352 5,650,352 0.00
shared dictionary cache 3,229,952 3,229,952 0.00
shared errors 71,080 71,080 0.00
shared event statistics per sess 6,304,960 6,304,960 0.00
shared fixed allocation callback 472 472 0.00
shared free memory 25,166,480 22,059,056 -12.35
shared joxlod: in ehe 270,568 270,568 0.00
shared joxlod: in phe 9,900,864 9,900,864 0.00
shared joxlod: init P 152 152 0.00
shared joxs heap init 4,240 4,240 0.00
shared kgl simulator 12,430,208 12,430,208 0.00
shared library cache 47,470,632 47,935,616 0.98
shared miscellaneous 19,469,912 19,522,328 0.27
shared parameters 84,720 84,720 0.00
shared partitioning d 93,680 93,680 0.00
shared pl/sql source 7,464 7,464 0.00
shared sessions 1,649,440 1,649,440 0.00
shared sim memory hea 183,328 183,328 0.00
shared sql area 79,638,608 82,195,248 3.21
shared subheap 102,312 102,312 0.00
shared table definiti 7,760 10,224 31.75
shared transaction 1,121,560 1,121,560 0.00
shared trigger defini 3,032 3,032 0.00
shared trigger inform 1,872 1,872 0.00
shared trigger source 2,472 2,472 0.00
shared type object de 1,690,536 1,690,536 0.00
buffer_cache 234,881,024 234,881,024 0.00
fixed_sga 742,152 742,152 0.00
log_buffer 787,456 787,456 0.00
-------------------------------------------------------------

Tom Kyte
June 07, 2004 - 10:13 am UTC

tell you what -- if you have a specific question about something - great. Otherwise, this is not really relevant to the original question ............





Buffer pool advisory

MEHMOOD, June 07, 2004 - 12:03 pm UTC

Tom:

Actually if you could explain that how can we interpret this Buffer pool advisory in the above statspack report, I will be thankfull to you.

Tom Kyte
June 07, 2004 - 1:27 pm UTC

Buffer Pool Advisory for DB: db1 Instance: db1 End Snap: 2
-> Only rows with estimated physical reads >0 are displayed
-> ordered by Block Size, Buffers For Estimate

Size for Size Buffers for Est Physical Estimated
P Estimate (M) Factr Estimate Read Factor Physical Reads
--- ------------ ----- ---------------- ------------- ------------------
D 16 .1 1,985 24.20 732,554,743
D 32 .1 3,970 24.02 726,816,514
D 48 .2 5,955 23.29 704,721,383
D 64 .3 7,940 6.22 188,119,061
D 80 .4 9,925 1.17 35,338,514
D 96 .4 11,910 1.10 33,221,487
D 112 .5 13,895 1.06 32,037,364
D 128 .6 15,880 1.04 31,472,731
D 144 .6 17,865 1.03 31,167,353
D 160 .7 19,850 1.02 30,919,627
D 176 .8 21,835 1.02 30,778,818
D 192 .9 23,820 1.01 30,632,069
D 208 .9 25,805 1.01 30,598,527
D 224 1.0 27,790 1.00 30,264,848
D 240 1.1 29,775 1.00 30,241,089
D 256 1.1 31,760 1.00 30,219,775
D 272 1.2 33,745 0.99 30,033,195
D 288 1.3 35,730 0.99 29,885,048
D 304 1.4 37,715 0.99 29,853,951
D 320 1.4 39,700 0.99 29,827,397
-------------------------------------------------------------


says -- if you use you current buffer cache (factr = 1.0) of 224m, you would have done about 30.2 million physical IO's..

If you go up 10% more to 1.1 -- with 240meg, you'll do about 30 million. Going up 40% to a factor of 1.4 would take you down to 29.8.

In other words, a 40% increase in the amount of ram would pay off with a 1% decrease in PHI..

In fact, you would reduce your buffer cache by a good amount and only see a marginal increase in PIO.

It is basically a select * from this view:

</code> http://docs.oracle.com/docs/cd/B10501_01/server.920/a96536/ch345.htm#1113278 <code>



A reader, June 07, 2004 - 12:54 pm UTC

For MEHMOOD from Karachi, PAKISTAN spdoc.txt is what you should be looking for.

Thanks Tom

MEHMOOD, June 07, 2004 - 3:40 pm UTC

Dear Tom:

Thanks for your answer and your valueble time.

Thanks

What would you ste db_cache to?

Steve, July 02, 2004 - 6:18 am UTC

In the above example then, is it suggesting that by reducing the db_Cache to 96M (a 60% reduction) you'd increase your physical reads by only 10%, and therefore you should reduce your db_cache to that level?

Or, as with all this tuning malarkey, there's more to it than that and it requires a holistic approach?

Thanks

Tom Kyte
July 02, 2004 - 10:09 am UTC

"only 10%" -- 10% is pretty large actually

I look at it the other way -- increasing by 100m would only decrease by 1%, that's not significant. so don't increase.

they could probably go a little smaller if starved for memory elsewhere, but not by much.

Yes well....

Steve, July 02, 2004 - 10:54 am UTC


"only 10%" -- 10% is pretty large actually

I was thinking relative to the 60% reduction in cache size, but I take your point.

However, if their cache size was set to 96M and this report re-run you would be looking at a 16.67% rise in cache would result in a reduction of 1.2m reads (or 3.5%). Would that be worth it or is it more a judgement call based on the overall performace of the database?

Tom Kyte
July 02, 2004 - 11:10 am UTC


3.5% would be considered getting there. it is the difference between 96 and 224 (10%) that would be interesting.

It is a bunch of numbers you can look at, in conjunction to the amount of free ram on your system to see if "going up would make a marked difference".

just remark regarding TUXEDO

Edgar, March 23, 2005 - 5:46 am UTC

> .. middle ware (tuxedo) slow down...
> How can i find out which
process(It is a dedicated database server) caused...

Just look on top, then use this script
</code> http://www.orafaq.org/scripts/sql/unixusr.txt <code>
Than You can find that problematic SQL statement interactivelly.

But regarding more systematic approach..
In our configuration complex Tuxedo application (Singl.eView) use XA interface to access Oracle database.

This in turn makes some usual OracleDBA approaches to performance tuning unappliable:
- it is no longer possible to turn on SQL trace for particular real user session or batch process
- it is no longer possible to find performance/wait statistics for particular real user session or batch process, only instance-wide now makes sence
- it is no longer possible to identify real user id from v$session etc.,
- it is no longer possible to identify who is executing given particular SQL using only v$% views

Sesions, transactions, resource profiles, user security & audit - all that is under control of Tuxedo TPM (transaction processing monitor), Oracle in this picture (XA) plays a role of RM (resource manager).

There is a "stack" of Tuxedo server processes trerodb and trerwdb which executes particular SQL statements in database thru dedicated server connections (one per process). Some number of that statements executed in different Oracle dedicated server sessions can made up one single XA transaction.

So, while solving performance issues, OracleDBA (assuming having no Tuxedo competence) is a little limited in mean of instrumentation.

Tom, have You any suggestions for me?
Sincerely,


Tom Kyte
March 23, 2005 - 8:51 am UTC

the suggestion would be to have the developers instrument their custom middle tier so you can performance tune them.

10g has dbms_monitor so that if the developers help a little bit -- you get sql_trace=true back for a "session"

but the tux middle tier has to help -- or you can just trace transactions (which is what tux is all about). I've always just traces transactions from tuxedo middle tiers. each "service" is a transaction.

What is Parse CPU to Parse Elapsd?

Parikshit Paul, April 03, 2005 - 12:15 pm UTC

TOM:
What is the meaning of the component Parse CPU to Parse Elapsd?
My statspack report show the eficiancy percentage of this component as 21.38.It should be nearer to 100.Shoudn't it be?
How to tune it?


physical reads issue

Jelena, July 11, 2005 - 1:12 pm UTC

Hi Tom,
thanks for great advices first!

I'm trying to test different SGA sizes (within availble 4GB or RAM) and test queries with flashing buffer cache, to see what is execution time when it it has to do a lot of disk IO.
What is see is that quite simple query on quite big tables (~10 mil) if data is cached takes 0.01 sec (execution plan is perfect, using 2 indexes) but if i flash the cache, it has ~4000 physical reads and takes 23 seconds (???). For me this is far too much time for 4000 reads. It's production system, SLES8, CRS & ASM and Raid1.

Which execution time would you expect on average production system, if you see 11.000 consistent gets and 4000 physical reads?

Tom Kyte
July 11, 2005 - 2:52 pm UTC

sys@ORA10GR1> select 23/4000 from dual;

23/4000
----------
.00575

1 row selected.

doesn't seem entirely unreasonable - you don't say if the reads are single block or multiblock either, but seems "ok", what does your hardware vendor say your average response for the type of IO you are doing should be.

A reader, July 13, 2005 - 2:39 pm UTC

Could you please comment on this:
ache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 1,200M Std Block Size: 8K
Shared Pool Size: 752M Log Buffer: 3,072K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 6,018.22 4,427.22
Logical reads: 6,884.12 5,064.21
Block changes: 40.14 29.53
Physical reads: 61.56 45.28
Physical writes: 2.68 1.97
User calls: 93.43 68.73
Parses: 30.71 22.59
Hard parses: 9.49 6.98
Sorts: 17.40 12.80
Logons: 0.06 0.05
Executes: 60.37 44.41
Transactions: 1.36

% Blocks changed per Read: 0.58 Recursive Call %: 68.97
Rollback per transaction %: 0.33 Rows per Sort: 35.58

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 99.12 In-memory Sort %: 100.00
Library Hit %: 95.58 Soft Parse %: 69.11
Execute to Parse %: 49.12 Latch Hit %: 99.96
Parse CPU to Parse Elapsd %: 63.18 % Non-Parse CPU: 87.90

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 93.69 93.07
% SQL with executions>1: 46.85 47.17
% Memory for SQL w/exec>1: 58.99 57.28

Thanks.

Tom Kyte
July 13, 2005 - 2:59 pm UTC

comment:

it is a statspack report.



that soft parse % makes me feel really sorry for you. Ask the developers "how in the world can it be true that every second there are 10 SQL queries that never existed before in this system? Wow, we do lots of different sql -- OR DID YOU NOT USE BIND VARIABLES"

A reader, July 13, 2005 - 2:41 pm UTC

Could you please explain what should be the range for each of the above?

Tom Kyte
July 13, 2005 - 3:01 pm UTC

ranges should be negative infinity to positive infinity?

the only thing that makes me feel bad off hand is the soft parse percent, your application is a true "killer app", a killer of resources. You probably use more of your cpu to hard parse than you do to actually perform work on your data.

A reader, July 13, 2005 - 2:42 pm UTC

Forgot to mention we have 8cpus and statspack time limit is 1hour.

A reader, July 13, 2005 - 3:13 pm UTC

Thanks a lot Tom!

Could you please explain what these mean:
% Non-Parse CPU
Memory Usage %
% SQL with executions>1
% Memory for SQL w/exec>1

And what should their approximate ranges be?

Thanks.


Tom Kyte
July 13, 2005 - 3:24 pm UTC

ranges should be 0 to 100.

ratios are just that, ratios, ways to hide stuff, 10 hard parses PER SECOND FOR AN HOUR, now that is scary stuff.

10 Hard Passes

A reader, July 14, 2005 - 12:14 am UTC

Dumb question, but how is it 10 hard passes per second?


Tom Kyte
July 14, 2005 - 10:24 am UTC

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
....
Parses: 30.71 22.59
Hard parses: 9.49 6.98


Performance Statistics for Import

A reader, March 14, 2006 - 1:56 am UTC

I am importing an apps database (source:11.5.7/9206,solaris) to target (9206,linux).

DB size: 30GB.

Target server details:
shared_pool_reserved_size1943040                           shared_pool_size                        419430400   
pga_aggregate_target1073741824                             
processes                               1800 
log_buffer                              10485760                               
log_checkpoint_interval                 999999999                              
log_checkpoint_timeout                  3600                                   
log_checkpoints_to_alert                TRUE       
db_block_size                           8192                                   
db_cache_size                           1073741824      
db_file_multiblock_read_count           8         
db_writer_processes                     1 
event                                   10932 trace name context level 32768  
filesystemio_options                    directio    
java_pool_size                          67108864                               
job_queue_processes                     10                                     
log_archive_max_processes               1      
open_cursors                            2000
olap_page_pool_size                     67108864  
timed_statistics                        TRUE                                   
undo_management                         AUTO                                   
undo_retention                          28800                                  
undo_suppress_errors                    FALSE    


Logfiles: 100MB, 4 groups, 2 members.



Linux machine 
4 cpus. Good idle time. No other process running apart from oracle database .

============
 00:35:27  up 19 days,  8:45, 15 users,  load average: 0.00, 0.00, 0.00 
153 processes: 151 sleeping, 1 running, 1 zombie, 0 stopped 
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle 
           total    0.3%    0.0%    2.2%   0.0%     0.7%    0.1%   96.3% 
           cpu00    1.3%    0.0%    8.3%   0.0%     3.1%    0.0%   87.0% 
           cpu01    0.0%    0.0%    0.3%   0.0%     0.0%    0.0%   99.6% 
           cpu02    0.1%    0.0%    0.1%   0.0%     0.0%    0.3%   99.2% 
           cpu03    0.0%    0.0%    0.0%   0.0%     0.0%    0.3%   99.6% 
Mem:  8195684k av, 7357128k used,  838556k free,       0k shrd,   64484k buff 
                   5635648k actv, 1158440k in_d,  183740k in_c 
Swap: 6289384k av,       0k used, 6289384k free                 6922976k cached 

$ cat /etc/redhat-release
Red Hat Enterprise Linux AS release 3 (Taroon Update 6)

=====================================

All tablespaces created and datafiles have been sized as per source(i.e the required size pre-allocated)
============================================

import script:

export NLS_LANG=AMERICAN_AMERICA.US7ASCII
nohup imp sys/xxx full=y buffer=15248880 ignore=y commit=n file=file.dmp log=file.log resumable=y resumable_timeout=28800 STATISTICS=none TOID_NOVALIDATE= \(SYS.AQ\$_AGENT, SYS.AQ\$_JMS_USERPROPERTY, SYS.AQ\$_JMS_USERPROPARRAY, SYS.AQ\$_JMS_MESSAGE, SYS.AQ\$_JMS_TEXT_MESSAGE, SYS.AQ\$_JMS_BYTES_MESSAGE, SYS.AQ\$_JMS_STREAM_MESSAGE, SYS.AQ\$_JMS_MAP_MESSAGE, SYS.AQ\$_JMS_OBJECT_MESSAGE, SYSTEM.REPCAT\$_OBJECT_NULL_VECTOR, MDSYS.SDO_GEOMETRY\) &

Source charset is US7ASCII
==============================
This is not a very large database (with reference to apps databases).
The import slowed down especially during stored procedure/package creation drastically.

Ex:

select object_name,object_type,owner,TO_CHAR(LAST_DDL_TIME,'DD-MON-YYYY HH24:MI:SS') LAST_DDL from dba_objects  where last_ddl_time between to_date('13-MAR-2006:16:58:00','DD-MON-YYYY:HH24:MI:SS') and to_date('13-MAR-2006:21:30:00','DD-MON-YYYY:HH24:MI:SS');
 
OBJECT_NAME
--------------------------------------------------------------------------------
OBJECT_TYPE        OWNER                          LAST_DDL
------------------ ------------------------------ --------------------
EDW_HR_POSITION_M_C
PACKAGE BODY       APPS                           13-MAR-2006 21:29:44
 
EDW_HR_PRSN_TYP_M_C
PACKAGE BODY       APPS                           13-MAR-2006 16:58:49
 
EDW_ORGANIZATION_M_C
PACKAGE BODY       APPS                           13-MAR-2006 17:05:07

In 6 hrs..it created just 3 packages. Not sure if import was doing anything else. There was no checkpoint occuring either.

I want to find out what is the bottleneck. 
I was not able to use statspack as the statspack packages were having compilation errors and import was still going on.

select event,P1TEXT,P2TEXT,P3TEXT,WAIT_TIME,SECONDS_IN_WAIT,STATE from v$session_wait where sid=9;

EVENT
----------------------------------------------------------------
P1TEXT
----------------------------------------------------------------
P2TEXT
----------------------------------------------------------------
P3TEXT                                                            WAIT_TIME
---------------------------------------------------------------- ----------
SECONDS_IN_WAIT STATE
--------------- -------------------
single-task message
                                                                          0
            168 WAITING


====================================================
I used utlbstat and utlestat to collect some stats:


SQL> select n1.name "Statistic",
  2          n1.change "Total",
  3          round(n1.change/trans.change,2) "Per Transaction",
  4          round(n1.change/((start_users + end_users)/2),2)  "Per Logon",
  5          round(n1.change/(to_number(to_char(end_time,   'J'))*60*60*24 -
  6                   to_number(to_char(start_time, 'J'))*60*60*24 +
  7                   to_number(to_char(end_time,   'SSSSS')) -
  8                   to_number(to_char(start_time, 'SSSSS')))
  9            , 2) "Per Second"
 10      from
 11               stats$stats n1,
 12               stats$stats trans,
 13               stats$dates
 14      where
 15            trans.name='user commits'
 16       and  n1.change != 0
 17      order by n1.name;

Statistic                          Total Per Transact    Per Logon   Per Second
--------------------------- ------------ ------------ ------------ ------------
CPU used by this session               5            5          .29          .01
CPU used when call started             5            5          .29          .01
CR blocks created                     21           21          1.2          .04
DBWR checkpoint buffers wri           64           64         3.66          .12
DBWR transaction table writ            5            5          .29          .01
DBWR undo block writes                26           26         1.49          .05
SQL*Net roundtrips to/from            13           13          .74          .02
active txn count during cle            3            3          .17          .01
background timeouts                  624          624        35.66          1.2
buffer is not pinned count          9936         9936       567.77        19.03
buffer is pinned count               150          150         8.57          .29
bytes received via SQL*Net          1180         1180        67.43         2.26
bytes sent via SQL*Net to c         2258         2258       129.03         4.33
calls to get snapshot scn:          1145         1145        65.43         2.19
calls to kcmgas                      207          207        11.83           .4
calls to kcmgcs                       10           10          .57          .02
change write time                      5            5          .29          .01
cleanout - number of ktugct            4            4          .23          .01
cleanouts and rollbacks - c            1            1          .06            0
cluster key scan block gets          360          360        20.57          .69
cluster key scans                     49           49          2.8          .09
commit cleanouts                     254          254        14.51          .49
commit cleanouts successful          254          254        14.51          .49
commit txn count during cle            1            1          .06            0
consistent changes                    38           38         2.17          .07
consistent gets                    10901        10901       622.91        20.88
consistent gets - examinati          779          779        44.51         1.49
data blocks consistent read           38           38         2.17          .07
db block changes                    2898         2898        165.6         5.55
db block gets                       3223         3223       184.17         6.17
deferred (CURRENT) block cl          104          104         5.94           .2
enqueue releases                     721          721         41.2         1.38
enqueue requests                     715          715        40.86         1.37
execute count                        629          629        35.94          1.2
free buffer requested                 89           89         5.09          .17
immediate (CR) block cleano            1            1          .06            0
immediate (CURRENT) block c           43           43         2.46          .08
index fetch by key                   383          383        21.89          .73
index scans kdiixs1                  428          428        24.46          .82
logons cumulative                      2            2          .11            0
logons current                         1            1          .06            0
messages received                     55           55         3.14          .11
messages sent                         55           55         3.14          .11
no work - consistent read g         9466         9466       540.91        18.13
opened cursors cumulative            191          191        10.91          .37
opened cursors current                 9            9          .51          .02
parse count (hard)                     6            6          .34          .01
parse count (total)                  231          231         13.2          .44
parse time cpu                        22           22         1.26          .04
parse time elapsed                    22           22         1.26          .04
physical reads                         1            1          .06            0
physical writes                       64           64         3.66          .12
physical writes non checkpo           28           28          1.6          .05
recursive calls                     5040         5040          288         9.66
recursive cpu usage                   20           20         1.14          .04
redo blocks written                  906          906        51.77         1.74
redo entries                        1488         1488        85.03         2.85
redo size                         420500       420500     24028.57       805.56
redo synch writes                      1            1          .06            0
redo wastage                       13824        13824       789.94        26.48
redo write time                        3            3          .17          .01
redo writes                           55           55         3.14          .11
rollbacks only - consistent           37           37         2.11          .07
rows fetched via callback            223          223        12.74          .43
session cursor cache count            47           47         2.69          .09
session cursor cache hits             75           75         4.29          .14
session logical reads              14124        14124       807.09        27.06
session pga memory                441976       441976     25255.77        846.7
session pga memory max            310904       310904     17765.94        595.6
session uga memory                 89680        89680      5124.57        171.8
session uga memory max            155128       155128      8864.46       297.18
shared hash latch upgrades           428          428        24.46          .82
sorts (memory)                       142          142         8.11          .27
sorts (rows)                        4414         4414       252.23         8.46
switch current to new buffe           58           58         3.31          .11
table fetch by rowid                 278          278        15.89          .53
table scan blocks gotten            9044         9044        516.8        17.33
table scan rows gotten             15730        15730       898.86        30.13
table scans (long tables)              1            1          .06            0
table scans (short tables)            72           72         4.11          .14
user calls                            17           17          .97          .03
user commits                           1            1          .06            0
workarea executions - optim           32           32         1.83          .06

83 rows selected.

SQL> 
SQL> column "Event Name" format a32 trunc;
SQL> set numwidth 13;
SQL> Rem System wide wait events for non-background processes (PMON,
SQL> Rem SMON, etc).  Times are in hundreths of seconds.  Each one of
SQL> Rem these is a context switch which costs CPU time.  By looking at
SQL> Rem the Total Time you can often determine what is the bottleneck
SQL> Rem that processes are waiting for.  This shows the total time spent
SQL> Rem waiting for a specific event and the average time per wait on
SQL> Rem that event.
SQL> select  n1.event "Event Name",
  2           n1.event_count "Count",
  3           n1.time_waited "Total Time",
  4           round(n1.time_waited/n1.event_count, 2) "Avg Time"
  5      from stats$event n1
  6      where n1.event_count > 0
  7      order by n1.time_waited desc;

Event Name                               Count    Total Time      Avg Time     
-------------------------------- ------------- ------------- -------------     
rdbms ipc message                          121         52637        435.02     
wakeup time manager                         17         48144          2832     
SQL*Net message from client                 26         23274        895.15     
db file sequential read                    745             1             0     
log file sync                                1             1             1     
SQL*Net message to client                   27             0             0     
control file sequential read               140             0             0     

7 rows selected.

SQL> 
SQL> 
SQL> Rem System wide wait events for background processes (PMON, SMON, etc)
SQL> select  n1.event "Event Name",
  2           n1.event_count "Count",
  3           n1.time_waited "Total Time",
  4           round(n1.time_waited/n1.event_count, 2) "Avg Time"
  5      from stats$bck_event n1
  6      where n1.event_count > 0
  7      order by n1.time_waited desc;

Event Name                               Count    Total Time      Avg Time     
-------------------------------- ------------- ------------- -------------     
rdbms ipc message                          562        152498        271.35     
smon timer                                   2         58596         29298     
pmon timer                                 181         50841        280.89     
control file parallel write                174            51           .29     
log file parallel write                     55            11            .2     
LGWR wait for redo copy                      1             0             0     
control file sequential read                72             0             0     

7 rows selected.

SQL> 
SQL> 
SQL> column latch_name format a18 trunc;
SQL> set numwidth 11;
SQL> Rem Latch statistics. Latch contention will show up as a large value for
SQL> Rem the 'latch free' event in the wait events above.
SQL> Rem Sleeps should be low.    The hit_ratio should be high.
SQL> select name latch_name, gets, misses,
  2       round((gets-misses)/decode(gets,0,1,gets),3)
  3         hit_ratio,
  4       sleeps,
  5       round(sleeps/decode(misses,0,1,misses),3) "SLEEPS/MISS"
  6      from stats$latches
  7       where gets != 0
  8       order by name;

LATCH_NAME                GETS      MISSES   HIT_RATIO      SLEEPS SLEEPS/MISS 
------------------ ----------- ----------- ----------- ----------- ----------- 
Consistent RBA              55           0           1           0           0 
SQL memory manager       11591           0           1           0           0 
active checkpoint          179           0           1           0           0 
cache buffer handl          34           0           1           0           0 
cache buffers chai       33966           0           1           0           0 
cache buffers lru          122           0           1           0           0 
channel handle poo           3           0           1           0           0 
channel operations         353           0           1           0           0 
checkpoint queue l       17178           0           1           0           0 
child cursor hash           22           0           1           0           0 
dml lock allocatio         485           0           1           0           0 
dummy allocation             3           0           1           0           0 
enqueue hash chain        1412           0           1           0           0 
enqueues                   708           0           1           0           0 
event group latch            2           0           1           0           0 
job_queue_processe           9           0           1           0           0 
ktm global data              2           0           1           0           0 
kwqit: protect wak          17           0           1           0           0 
lgwr LWN SCN               203           0           1           0           0 
library cache             4097           0           1           0           0 
library cache load          22           0           1           0           0 
library cache pin         2588           0           1           0           0 
library cache pin         1128           0           1           0           0 
messages                  1451           0           1           0           0 
mostly latch-free          203           0           1           0           0 
ncodef allocation            8           0           1           0           0 
object stats modif           1           0           1           0           0 
post/wait queue              1           0           1           0           0 
process allocation           2           0           1           0           0 
process group crea           3           0           1           0           0 
redo allocation           1716           0           1           0           0 
redo writing               621           0           1           0           0 
row cache enqueue         1354           0           1           0           0 
row cache objects         1443           0           1           0           0 
session allocation        1969           0           1           0           0 
session idle bit            48           0           1           0           0 
session switching            8           0           1           0           0 
session timer              181           0           1           0           0 
shared pool               4277           0           1           0           0 
simulator hash lat         769           0           1           0           0 
sort extent pool            10           0           1           0           0 
spilled msgs queue          17           0           1           0           0 
trace latch                  1           0           1           0           0 
transaction alloca           6           0           1           0           0 
transaction branch           8           0           1           0           0 
undo global data           582           0           1           0           0 

46 rows selected.

SQL> 
SQL> set numwidth 16
SQL> Rem Statistics on no_wait gets of latches.  A no_wait get does not
SQL> Rem wait for the latch to become free, it immediately times out.
SQL> select name latch_name,
  2       immed_gets nowait_gets,
  3       immed_miss nowait_misses,
  4       round((immed_gets/(immed_gets+immed_miss)), 3)
  5         nowait_hit_ratio
  6      from stats$latches
  7       where immed_gets + immed_miss != 0
  8       order by name;

LATCH_NAME              NOWAIT_GETS    NOWAIT_MISSES NOWAIT_HIT_RATIO          
------------------ ---------------- ---------------- ----------------          
SQL memory manager              173                0                1          
cache buffers chai               34                0                1          
cache buffers lru              2885                0                1          
checkpoint queue l               27                0                1          
library cache                    82                0                1          
post/wait queue                   1                0                1          
process allocation                2                0                1          
redo copy                      1459                1             .999          
row cache objects                20                0                1          
simulator lru latc                7                0                1          

10 rows selected.

SQL> 
SQL> Rem Buffer busy wait statistics.  If the value for 'buffer busy wait' in
SQL> Rem the wait event statistics is high, then this table will identify
SQL> Rem which class of blocks is having high contention.  If there are high
SQL> Rem 'undo header' waits then add more rollback segments.  If there are
SQL> Rem high 'segment header' waits then adding freelists might help.    Check
SQL> Rem v$session_wait to get the addresses of the actual blocks having
SQL> Rem contention.
SQL> select * from stats$waitstat
  2    where count != 0
  3    order by count desc;

no rows selected


==============================================

Any pointers in identifying the bottleneck to this will be helpful.

I see two things:
1)single-task message
2)wakeup time manager 
consiming more time and lots of wait events.

I am not sure if this is anyway related.




       
 

physical read IO requests

A reader, December 22, 2006 - 4:12 pm UTC

Hi Tom,
wish you are doing fine
the v$mystat in 10g have field called "physical read IO requests"

but in 9i this field isn't there , is there any other way i can know how many IO requests my session has performed in 9i ?

many thanks
Tom Kyte
December 24, 2006 - 8:50 am UTC

ops$tkyte%ORA9IR2> select name from v$statname where name like '%physical%';

NAME
------------------------------
physical reads
physical writes
physical writes non checkpoint
physical reads direct
physical writes direct
physical reads direct (lob)
physical writes direct (lob)

7 rows selected.

what about number of fetched blocks

A reader, January 03, 2007 - 6:18 pm UTC

Hi Tom,

yes , thanks i got this.

what about number of blocks which have been fetched by those physical reads ?
in 10g i can check the "physical reads cache " statistics an similar for this in 9i ?

Thank you very much

To bump up iowait

Raj, January 14, 2007 - 4:41 pm UTC

Hi Tom,

We are testing db monitoring tool(BMC Patrol).We need to get alert from patrol if server iowait is more than 30%.

My problem is its very difficult to simulate iowait more than 30% in dev environment.Can you suggest some script/code to bump up iowait.

Regds...Raj

Server Statistics - using Toad

Elahe Faghihi, August 27, 2007 - 1:02 pm UTC

Hi Tom,
I noticed that when I want to run a query in Oracle 10g database, it would fail with the following error message:
ORA-01652: unable to extend temp segment by 128 in tablespace USRTEMP

Then I ran statistic on server and notiecd that there are red signs in front of the following parameters. I am wondering what does it mean when and how i can fix it:

Chain Fetch ratio = 0.0282
Redo Space Wait Ratio = 1.1264
Parse to Execute Ratio = 26.8149

Tom Kyte
September 04, 2007 - 12:18 pm UTC

umm, your temp is too small - and that has nothing to do with those ratios, they are entirely unrelated.



I'm not really concerned about those ratios (except maybe the parse to execute), that one can only be fixed by the development staff - instead of always closing cursors, they should be reusing them (open once, execute many)

Is ora-8007 indicative of an I/O issue too?

Raj, October 10, 2012 - 12:09 pm UTC

Hi Tom,
I saw an ora-8007 recently and was curious as to the cause. It's not listed in the alert log, so I assume that it's an issue from an application's sql. I am just trying to figure out how to simulate sql to generate the same error to help explain what to look for to the developers. The only output from the error they handled in their code was "ORA-08007: Further changes to this block by this transaction not allowed". I'm not sure if more would be given that just wasn't captured. Can you explain the error and simulate?

Thanks!
Tom Kyte
October 10, 2012 - 2:09 pm UTC

are you using a 32k block size?

...more info...

Raj, October 10, 2012 - 2:12 pm UTC

Sorry... I failed to include sufficient information:
-Attempting to reproduce ora-8007 on 10.2.0.4 running AIX.
-The tablespace this occurred on had an 8k block size.

Here is what I tried to encourage the 8007 error:

create tablespace testora8007 datafile '/tmp/TestOra8007.dbf' size 100m extent management local uniform size 1m;

create table raj (col1 varchar2(10)) pctfree 0 tablespace testora8007;

declare
r number;
begin
for r in 1 .. 50000
loop
insert into raj values (r);
delete from raj;
end loop;
end ;
/

But no luck, so the limitation is >50000... or my logic is off. Help?

Tom Kyte
October 10, 2012 - 4:29 pm UTC

It should not happen on an 8k block - we'd need to have more than 4095 rows possible on the block and that isn't possible in 8k - or even 16k.

You might want to utilize support on this one - they can set an event to capture more information if it happens again.

Academically curious now...

Raj, October 11, 2012 - 1:17 pm UTC

Thanks Tom, I thought the same (shouldn't fit), then thought maybe an insert and subsequent delete might count as two block hits within a transaction, hence my test attempt.

How is the 4095 limitation? Because it sounds then like it's a block header limitation then (since my little test failed).

Thanks again!
Tom Kyte
October 11, 2012 - 3:43 pm UTC

could be a block header or undo segment header limitation. It is a magic number very close to 4096

Anand, October 13, 2012 - 12:59 pm UTC

Hi Tom,

One of my process in batch used to finish on 6-7 min.But now suddenly start running for 20-25 min.

And we gather stats every weekend on every table.

After looking at AWR we came to know that there is a high cluster wait.So we have start to process in same instance.


But after running in same instance still we have high cluster wait.Can you please help.

Below is the AWR report :
SQL ordered by Cluster Wait Time

Wait Time (s) CWT % of Elapsd Time Elapsed Time(s) CPU Time(s) Executions SQL Id SQL Module SQL Text

6,068.91 63.66 9,532.96 1,798.17 32 6jnyts7jvg982 ln_crr_pop_arrear_10 SELECT A.COD_CC_BRN, :B4 , A...

Full query is :

SELECT A.COD_CC_BRN, :B4 , A.COD_PROD, A.COD_ACCT_NO, A.COD_CUST_ID, A.COD_CRR_FROM, A.COD_CRR_TO, A.DAT_PROCESS, A.COD_CRR_MVMT_REASON, DECODE(C.FLG_SECURED, 'Y', D.COD_AC_SECURE_PLAN, D.COD_AC_UNSECURE_PLAN) COD_AC_PLAN, B.COD_PLAN_ID, C.BAL_BOOK, NULL DAT_LAST_CR, NULL DAT_LAST_DR, NULL FLG_PREV_BAL_STAT FROM AC_ACCT_CRR_CODE A, AC_ACCT_PREFERENCES B, LN_ACCT_MAST C, LN_PROD_MAST D, AC_EOD_CRR_PROV_PROC E, LN_ACCT_STREAM_MAST F WHERE A.COD_ACCT_NO = F.COD_ACCT_NO AND A.COD_ACCT_NO = B.COD_ACCT_NO AND A.COD_ACCT_NO = C.COD_ACCT_NO AND F.COD_ARREAR_STREAM = :B3 AND A.COD_PROD = D.COD_PROD AND B.COD_PLAN_ID = E.COD_PREF_ID AND E.FLG_CRR_PROC = 'Y' AND C.COD_ACCT_STAT NOT IN (1, 5, 11, 12, 20, 24) AND (:B2 IN (B.COD_FWD_MVMNT, B.COD_REV_MVMNT) OR :B1 IN (B.COD_FWD_MVMNT, B.COD_REV_MVMNT)) AND A.FLG_MNT_STATUS = 'A' AND B.FLG_MNT_STATUS = 'A' AND C.FLG_MNT_STATUS = 'A' AND D.FLG_MNT_STATUS = 'A' AND C.FLG_CRR = 'Y' AND C.FLG_CHARGING = 'Y' ORDER BY A.COD_CC_BRN, A.COD_PROD, A.COD_ACCT_NO

in this ln_acct_mast is view.

Is this because of this view ?

Why cluster wait is high even though we are running on same instance ??

I have monitor the batch process and this query in cursor took around 15 min.



anand, October 20, 2012 - 3:26 am UTC

Hi Tom,

Waiting for your response on above post.

Thanks a lot...for giving us your precious time.
Tom Kyte
October 23, 2012 - 11:53 am UTC

do you have a comparative AWR report - do you know if the "high cluster wait" is new or was always there - are the plans the same.

compare the "bad" to the 'good' and report back on what changed between the two

Anand, November 05, 2012 - 4:25 am UTC

Hi Tom,

Below is the AWR report where it finish in 5 min :

SQL ordered by Elapsed Time


sed Time (s) CPU Time (s) Executions Elap per Exec (s) % Total DB Time SQL Id SQL Module SQL Text

1,483 455 16 92.70 2.04 3z0c1whmx708n ln_crr_code_shell_10 SELECT A.COD_CC_BRN, :B7 , A...

SQL ordered by Cluster Wait Time :

Cluster Wait Time (s) CWT % of Elapsd Time Elapsed Time(s) CPU Time(s) Executions SQL Id SQL Module SQL Text

722.14 48.69 1,483.23 455.12 16 3z0c1whmx708n ln_crr_code_shell_10 SELECT A.COD_CC_BRN, :B7 , A...



Below is the AWR report where it finish in 15 min :

SQL ordered by Elapsed Time

Elapsed Time (s) CPU Time (s) Executions Elap per Exec (s) % Total DB Time SQL Id SQL Module SQL Text
13,294 2,527 32 415.44 27.37 3z0c1whmx708n ln_crr_code_shell_14 SELECT A.COD_CC_BRN, :B7 , A...

SQL ordered by Cluster Wait Time

Cluster Wait Time (s) CWT % of Elapsd Time Elapsed Time(s) CPU Time(s) Executions SQL Id SQL Module SQL Text

5,255.29 39.53 13,294.00 2,526.73 32 3z0c1whmx708n ln_crr_code_shell_14 SELECT A.COD_CC_BRN, :B7 , A...


There is no code change has been done.

Please help.


Anand, November 20, 2012 - 9:59 am UTC

Hi Tom,

Can you just help me to reply with above post why my execution increase from 16 to 32 without any code change ?

I really cant find any reason why my batch process time increase suddenly.Table analyzed on weekly basis.Ya there is growth of data in table but that's not so huge growth.

If i see AWR report cluster wait and elapsed time increased.

Though i have changed the query with inline view which came back to normal.

Just curious to know the reason.

Thanks alot Tom.

Tom Kyte
November 20, 2012 - 11:24 am UTC

Though i have changed the query with inline view which came back to normal.


that would indicate "query plan changed and didn't change for the better" to explain your 2x increase in response time.

You could have used ASH to review that - it would show you the history of the plans for that query - assuming you have the history going back that far now.


Anand, December 02, 2012 - 3:12 am UTC

Hi Tom,

Thanks lot for your help.Always appreciate it.

After reorg of 2 table and revert to old code timing back to normal.

Can you tell me the criteria or any reference site to identify probable table candidate for reorg so that proactively identify and do reorg to avoid such problem further ?

As you always say that table reorg shouldn't be done scheduled base.

waiting for your reply.