You Asked
Hi Tom,
Here is Statspack report that was obtained when the end-users complained about the poor reponse times in the system.
Please advice on the following:
(1) The Top wait event i.e. Latch Free.
(2) Latches Activity.
During this period an mpstat on the OS side shows that
wt and idl times are 0 and the user processes is taking more than 90% cpu( 4 cpu).
SunOS 5.9$ mpstat 5 5
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
20 134 14 2669 463 139 1318 235 38 23 3 2343 80 20 0 0
21 128 12 2816 328 101 1290 238 38 24 3 2279 89 11 0 0
22 144 13 3073 1134 1030 1117 225 39 26 3 2342 91 9 0 0
23 133 12 3380 290 57 1284 240 41 25 3 2243 92 8 0 0
This happened continously for the past 2 days.
3.) Is there anything else to investigate with respect to the poor reponse times and the system hanging on the end-users.
Thanks Tom in advance.
Ram.
STATSPACK report for
DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
XXXXX 3013888371 XXXXX 1 9.2.0.6.0 NO prodb
Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 185 08-Feb-06 11:10:58 159 39.2
End Snap: 186 08-Feb-06 11:37:02 146 43.3
Elapsed: 26.07 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 448M Std Block Size: 16K
Shared Pool Size: 400M Log Buffer: 512K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 74,923.36 4,903.55
Logical reads: 21,238.92 1,390.03
Block changes: 360.10 23.57
Physical reads: 1,293.47 84.65
Physical writes: 103.05 6.74
User calls: 1,181.01 77.29
Parses: 252.17 16.50
Hard parses: 22.01 1.44
Sorts: 22.95 1.50
Logons: 0.40 0.03
Executes: 349.55 22.88
Transactions: 15.28
% Blocks changed per Read: 1.70 Recursive Call %: 46.94
Rollback per transaction %: 4.54 Rows per Sort: 989.63
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 99.97
Buffer Hit %: 94.39 In-memory Sort %: 99.70
Library Hit %: 95.54 Soft Parse %: 91.27
Execute to Parse %: 27.86 Latch Hit %: 99.65
Parse CPU to Parse Elapsd %: 4.27 % Non-Parse CPU: 88.41
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 93.30 92.03
% SQL with executions>1: 61.13 57.51
% Memory for SQL w/exec>1: 51.23 49.54
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
latch free 231,787 18,490 49.24
direct path write 35,679 7,706 20.52
CPU time 3,496 9.31
db file scattered read 461,152 2,755 7.34
direct path read 38,931 1,672 4.45
-------------------------------------------------------------
Wait Events for DB: XXXXX Instance: XXXXX Snaps: 185 -186
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
latch free 231,787 144,912 18,490 80 9.7
direct path write 35,679 0 7,706 216 1.5
db file scattered read 461,152 0 2,755 6 19.3
direct path read 38,931 0 1,672 43 1.6
log file sync 8,576 602 1,299 151 0.4
db file sequential read 293,444 0 723 2 12.3
log file parallel write 31,914 0 342 11 1.3
SQL*Net more data to client 511,438 0 338 1 21.4
log buffer space 547 47 233 426 0.0
buffer busy waits 2,184 10 206 94 0.1
db file parallel write 398 0 154 387 0.0
LGWR wait for redo copy 4,498 2,472 47 11 0.2
log file switch completion 84 6 34 400 0.0
control file parallel write 607 0 28 46 0.0
library cache pin 65 0 9 131 0.0
async disk IO 198 0 5 27 0.0
row cache lock 102 0 5 45 0.0
log file sequential read 156 0 4 26 0.0
wait list latch free 152 0 4 23 0.0
enqueue 2 0 2 920 0.0
ARCH wait on ATTACH 19 0 1 77 0.0
SQL*Net break/reset to clien 784 0 1 2 0.0
log file single write 24 0 0 13 0.0
db file parallel read 12 0 0 20 0.0
control file sequential read 565 0 0 0 0.0
ARCH wait on SENDREQ 19 0 0 6 0.0
ARCH wait on DETACH 19 0 0 1 0.0
buffer deadlock 1 1 0 0 0.0
SQL*Net message from client 1,848,570 0 96,546 52 77.4
jobq slave wait 1,868 944 4,610 2468 0.1
SQL*Net message to client 1,848,567 0 26 0 77.4
SQL*Net more data from clien 3,059 0 0 0 0.1
-------------------------------------------------------------
Background Wait Events for DB: XXXXX Instance: XXXXX Snaps: 185 -186
-> ordered by wait time desc, waits desc (idle events last)
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file parallel write 31,924 0 343 11 1.3
db file parallel write 398 0 154 387 0.0
LGWR wait for redo copy 4,498 2,472 47 11 0.2
latch free 447 200 36 80 0.0
control file parallel write 607 0 28 46 0.0
log buffer space 70 3 19 278 0.0
async disk IO 198 0 5 27 0.0
buffer busy waits 20 0 5 239 0.0
log file sequential read 156 0 4 26 0.0
ARCH wait on ATTACH 19 0 1 77 0.0
db file scattered read 128 0 1 11 0.0
direct path write 78 0 1 18 0.0
db file sequential read 125 0 1 5 0.0
log file single write 24 0 0 13 0.0
control file sequential read 523 0 0 0 0.0
wait list latch free 4 0 0 37 0.0
ARCH wait on SENDREQ 19 0 0 6 0.0
direct path read 78 0 0 1 0.0
ARCH wait on DETACH 19 0 0 1 0.0
rdbms ipc message 13,358 1,340 8,259 618 0.6
smon timer 232 0 1,530 6593 0.0
pmon timer 1,312 466 1,493 1138 0.1
-------------------------------------------------------------
SQL ordered by Gets for DB: XXXXX Instance: XXXXX Snaps: 185 -186
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
1,857,990 1 1,857,990.0 5.6 11.11 181.40 3248088983
Module: dllhost.exe
select uRef || ' (#' || to_char(hMy - 1200000000) || ')' from tr
ans where iType = 12 and bOpen = -1 and (hmy in (select t.hParen
t2 from trans t, detail d where t.hParent2 = trans.hMy and t.hMy
= d.hInvOrRec and (d.hProp IN (2712, 2713, 2714, 2715, 2716, 36
77, 2717, 2718, 2721, 2724, 2859, 4054, 3466, 609, 985, 934, 922
676,302 108,327 6.2 2.0 27.00 430.48 3273774910
SELECT MAX(HMY) FROM H8SUMM WHERE HTENANT = :B2 AND NVL(BMASTERR
ECD, 0) = 0 AND INSTR(:B1 , SACTIONTYPE2A) <> 0 AND TO_CHAR(DTEF
FECTIVE2B, 'YYYYMMDD') || TO_CHAR(NVL(ICORRECTIONNUM,0)) = ( SEL
ECT MAX(TO_CHAR(DTEFFECTIVE2B, 'YYYYMMDD') || TO_CHAR(NVL(ICORRE
CTIONNUM,0))) FROM H8SUMM WHERE HTENANT = :B2 AND NVL(BMASTERREC
-------------------------------------------------------------
Instance Activity Stats for DB: XXXXX Instance: XXXXX Snaps: 185 -186
Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session 349,570 223.5 14.6
CPU used when call started 338,828 216.6 14.2
CR blocks created 3,438 2.2 0.1
Cached Commit SCN referenced 0 0.0 0.0
Commit SCN cached 0 0.0 0.0
DBWR buffers scanned 81,453 52.1 3.4
DBWR checkpoint buffers written 8,958 5.7 0.4
DBWR checkpoints 6 0.0 0.0
DBWR free buffers found 72,997 46.7 3.1
DBWR lru scans 78 0.1 0.0
DBWR make free requests 79 0.1 0.0
DBWR revisited being-written buff 11 0.0 0.0
DBWR summed scan depth 81,453 52.1 3.4
DBWR transaction table writes 75 0.1 0.0
DBWR undo block writes 3,686 2.4 0.2
DFO trees parallelized 0 0.0 0.0
PX local messages recv'd 0 0.0 0.0
PX local messages sent 0 0.0 0.0
Parallel operations downgraded 25 0 0.0 0.0
SQL*Net roundtrips to/from client 1,845,799 1,180.2 77.2
SQL*Net roundtrips to/from dblink 0 0.0 0.0
active txn count during cleanout 4,323 2.8 0.2
background checkpoints completed 6 0.0 0.0
background checkpoints started 6 0.0 0.0
background timeouts 1,802 1.2 0.1
branch node splits 0 0.0 0.0
buffer is not pinned count 21,749,100 13,906.1 910.1
buffer is pinned count 13,628,581 8,713.9 570.3
bytes received via SQL*Net from c 88,199,894 56,393.8 3,690.8
bytes received via SQL*Net from d 0 0.0 0.0
bytes sent via SQL*Net to client 1,408,176,981 900,368.9 58,926.9
bytes sent via SQL*Net to dblink 0 0.0 0.0
calls to get snapshot scn: kcmgss 1,345,109 860.0 56.3
calls to kcmgas 47,854 30.6 2.0
calls to kcmgcs 2,039 1.3 0.1
change write time 34,940 22.3 1.5
cleanout - number of ktugct calls 4,577 2.9 0.2
cleanouts and rollbacks - consist 1,778 1.1 0.1
cleanouts only - consistent read 307 0.2 0.0
cluster key scan block gets 35,092 22.4 1.5
cluster key scans 19,142 12.2 0.8
commit cleanout failures: block l 747 0.5 0.0
commit cleanout failures: buffer 5 0.0 0.0
commit cleanout failures: callbac 5 0.0 0.0
commit cleanout failures: cannot 37 0.0 0.0
commit cleanout failures: hot bac 0 0.0 0.0
commit cleanouts 87,684 56.1 3.7
commit cleanouts successfully com 86,890 55.6 3.6
commit txn count during cleanout 4,942 3.2 0.2
consistent changes 195,457 125.0 8.2
consistent gets 32,790,808 20,966.0 1,372.2
consistent gets - examination 8,443,878 5,398.9 353.3
current blocks converted for CR 60 0.0 0.0
cursor authentications 18,690 12.0 0.8
data blocks consistent reads - un 184,284 117.8 7.7
db block changes 563,198 360.1 23.6
Instance Activity Stats for DB: XXXXX Instance: XXXXX Snaps: 185 -186
Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
db block gets 426,840 272.9 17.9
deferred (CURRENT) block cleanout 37,148 23.8 1.6
dirty buffers inspected 78 0.1 0.0
enqueue conversions 640 0.4 0.0
enqueue deadlocks 0 0.0 0.0
enqueue releases 130,800 83.6 5.5
enqueue requests 130,810 83.6 5.5
enqueue timeouts 4 0.0 0.0
enqueue waits 2 0.0 0.0
exchange deadlocks 1 0.0 0.0
execute count 546,702 349.6 22.9
free buffer inspected 12,419 7.9 0.5
free buffer requested 1,872,044 1,197.0 78.3
hot buffers moved to head of LRU 39,683 25.4 1.7
immediate (CR) block cleanout app 2,085 1.3 0.1
immediate (CURRENT) block cleanou 1,906 1.2 0.1
index fast full scans (full) 1,513 1.0 0.1
index fast full scans (rowid rang 0 0.0 0.0
index fetch by key 6,013,423 3,844.9 251.6
index scans kdiixs1 5,187,054 3,316.5 217.1
leaf node 90-10 splits 56 0.0 0.0
leaf node splits 119 0.1 0.0
logons cumulative 618 0.4 0.0
messages received 16,479 10.5 0.7
messages sent 16,479 10.5 0.7
no buffer to keep pinned count 0 0.0 0.0
no work - consistent read gets 18,451,850 11,797.9 772.1
opened cursors cumulative 375,051 239.8 15.7
parse count (failures) 910 0.6 0.0
parse count (hard) 34,425 22.0 1.4
parse count (total) 394,387 252.2 16.5
parse time cpu 40,504 25.9 1.7
parse time elapsed 949,246 606.9 39.7
physical reads 2,022,993 1,293.5 84.7
physical reads direct 159,729 102.1 6.7
physical reads direct (lob) 0 0.0 0.0
physical writes 161,170 103.1 6.7
physical writes direct 143,872 92.0 6.0
physical writes non checkpoint 158,108 101.1 6.6
pinned buffers inspected 1,696 1.1 0.1
prefetch clients - default 2 0.0 0.0
prefetched blocks 1,108,825 709.0 46.4
prefetched blocks aged out before 0 0.0 0.0
process last non-idle time 1,562 1.0 0.1
queries parallelized 0 0.0 0.0
recovery array read time 0 0.0 0.0
recovery array reads 0 0.0 0.0
recovery blocks read 0 0.0 0.0
recursive calls 1,633,893 1,044.7 68.4
recursive cpu usage 37,457 24.0 1.6
redo blocks written 243,227 155.5 10.2
redo buffer allocation retries 585 0.4 0.0
redo entries 282,608 180.7 11.8
redo log space requests 86 0.1 0.0
redo log space wait time 3,445 2.2 0.1
redo ordering marks 0 0.0 0.0
Latch Activity for DB: XXXXX Instance: XXXXX Snaps: 185 -186
->"Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for
willing-to-wait latch get requests
->"NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests
->"Pct Misses" for both should be very close to 0.0
Pct Avg Wait Pct
Get Get Slps Time NoWait NoWait
Latch Requests Miss /Miss (s) Requests Miss
------------------------ -------------- ------ ------ ------ ------------ ------
Consistent RBA 16,006 0.0 0 0
FAL request queue 12 0.0 0 0
FIB s.o chain latch 84 0.0 0 0
FOB s.o list latch 4,874 0.0 1.0 0 0
SQL memory manager latch 1 0.0 0 511 0.0
SQL memory manager worka 199,851 0.0 0.8 5 0
active checkpoint queue 943 0.0 0 0
alert log latch 32 0.0 0 0
archive control 70 0.0 0 0
archive process latch 67 4.5 0.0 0 0
cache buffer handles 28,611 0.0 0 0
cache buffers chains 61,536,340 0.1 3.4 0 3,249,271 0.4
cache buffers lru chain 28,340 2.9 0.5 99 4,101,010 2.2
channel handle pool latc 992 0.0 0 0
channel operations paren 3,838 0.0 0.0 0 0
checkpoint queue latch 119,324 0.0 0.3 0 17,964 0.0
child cursor hash table 245,426 0.1 0.3 7 0
commit callback allocati 1 0.0 0 0
constraint object alloca 1 0.0 0 0
dml lock allocation 85,042 0.0 0.2 2 0
dummy allocation 1,250 0.0 0 0
enqueue hash chains 262,645 0.0 0.5 10 0
enqueues 378,411 0.2 0.6 67 0
event group latch 309 0.0 0 0
hash table column usage 437 0.9 4.0 0 2,548,839 0.0
job workq parent latch 1 100.0 0.0 0 620 1.5
job_queue_processes para 335 0.0 0 0
ktm global data 232 0.0 0 0
lgwr LWN SCN 16,017 0.0 0.1 0 0
library cache 5,084,839 2.2 0.5 6432 100,934 11.2
library cache load lock 4,888 0.0 0 0
library cache pin 2,853,744 0.1 0.4 283 0
library cache pin alloca 1,614,590 0.1 0.5 216 0
list of block allocation 988 0.0 0 0
loader state object free 794 0.0 0 0
longop free list parent 2,575 0.0 0 187 0.0
message pool operations 637 0.0 0 0
messages 51,139 0.0 0.2 0 0
mostly latch-free SCN 16,054 0.1 0.1 0 0
multiblock read objects 1,278,671 0.2 0.3 173 0
ncodef allocation latch 428 0.0 0 0
object stats modificatio 102 0.0 0 0
post/wait queue 14,689 0.0 0.0 0 9,209 0.0
process allocation 309 0.3 33.0 1 309 0.0
process group creation 631 0.0 0 0
redo allocation 315,286 0.1 0.3 17 0
redo copy 0 0 283,343 1.7
redo writing 50,122 0.0 0.5 0 0
row cache enqueue latch 6,335,779 0.4 0.0 52 0
row cache objects 6,767,248 0.4 0.2 1386 1,584 0.2
Latch Sleep breakdown for DB: XXXXX Instance: XXXXX Snaps: 185 -186
-> ordered by misses desc
Get Spin &
Latch Name Requests Misses Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
shared pool 4,600,581 127,578 24,417 104263/22273
/988/54/0
library cache 5,084,839 111,775 52,506 62069/47161/
2326/219/0
cache buffers chains 61,536,340 42,486 142,926 0/0/0/0/0
row cache objects 6,767,248 29,574 4,696 24914/4624/3
6/0/0
row cache enqueue latch 6,335,779 22,846 497 22354/487/5/
0/0
library cache pin 2,853,744 3,540 1,364 2178/1360/2/
0/0
multiblock read objects 1,278,671 3,023 775 2249/773/1/0
/0
session idle bit 3,975,458 2,393 1,794 0/0/0/0/0
library cache pin allocati 1,614,590 2,359 1,108 1251/1108/0/
0/0
session allocation 808,491 2,067 803 1264/803/0/0
/0
cache buffers lru chain 28,340 819 424 397/420/2/0/
0
enqueues 378,411 617 354 263/354/0/0/
0
redo allocation 315,286 331 94 239/90/2/0/0
simulator lru latch 47,662 242 208 34/208/0/0/0
child cursor hash table 245,426 160 52 108/52/0/0/0
enqueue hash chains 262,645 108 52 56/52/0/0/0
dml lock allocation 85,042 36 8 28/8/0/0/0
SQL memory manager workare 199,851 24 20 4/20/0/0/0
undo global data 136,739 14 7 7/7/0/0/0
redo writing 50,122 11 6 5/6/0/0/0
messages 51,139 10 2 8/2/0/0/0
mostly latch-free SCN 16,054 9 1 8/1/0/0/0
lgwr LWN SCN 16,017 8 1 7/1/0/0/0
simulator hash latch 1,820,154 7 3 4/3/0/0/0
hash table column usage la 437 4 16 0/0/0/0/0
checkpoint queue latch 119,324 3 1 2/1/0/0/0
sequence cache 34,660 2 2 0/2/0/0/0
user lock 4,199 2 1 1/1/0/0/0
process allocation 309 1 33 0/0/0/1/0
FOB s.o list latch 4,874 1 1 0/1/0/0/0
-------------------------------------------------------------
and Tom said...
if you are a frequent reader of this site or any of my writings.... This is a "no brainer"
Hard parses: 22.01
so, you are telling me that in a 26 minute period, there were 34,428 new SQL statements that never ever existed before coming into your system.
This is not possible, not unless someone of course has forgotten to use - well, yes: bind variables...
Here, this looks like a culprit (one of many I am sure)
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
1,857,990 1 1,857,990.0 5.6 11.11 181.40 3248088983
Module: dllhost.exe
select uRef || ' (#' || to_char(hMy - 1200000000) || ')' from tr
ans where iType = 12 and bOpen = -1 and (hmy in (select t.hParen
t2 from trans t, detail d where t.hParent2 = trans.hMy and t.hMy
= d.hInvOrRec and (d.hProp IN (2712, 2713, 2714, 2715, 2716, 36
77, 2717, 2718, 2721, 2724, 2859, 4054, 3466, 609, 985, 934, 922
It does not use binds - and it does "quite a bit of work".
It is really hard to "tune a system" using aggregate information like this, however I feel very confident in stating the following:
a) the developers have chosen to not use bind variables. This system therefore cannot and will not scale.
b) there is some obvious low hanging fruit in the form of really expensive to execute SQL statements.
Rating
(8 ratings)
Is this answer out of date? If it is, please let us know via a Comment