Performance Investigation Approach when using STATSPACK

Additional Information...

Matt, February 02, 2003 - 7:33 pm UTC

I've already captured data with a vanilla STATSPACK configuration. Some of the SQL I am seeing pulled out seems to be from the STATSPACK snapshot queries. As a result (see SNIP below) I am a little wary of increasing the snapshot level at this time, since the doco highlights that extra work will be carried out. I have some data now, and before I take this next step I want to see whether I can make good use of what I already have.

I'll provide some additional information and then can you please confirm this approach of increasing the level to 10?

[SNIP From ?/rdbms/admin/spdoc.txt]
Levels >= 10 Additional statistics: Parent and Child latches
This level includes all statistics gathered in the lower levels, and
additionally gathers Parent and Child Latch information. Data
gathered at this level can sometimes cause the snapshot to take longer
to complete i.e. this level can be resource intensive, and should
only be used when advised by Oracle personnel.
[END]

OK, I can't point the finger at what changed in the upgrade but surely I can highlight where gains can be made (and get a feel for how much gain might be felt by the end users).

Can you provide some guidance as far as the following SNAP is concerned? It was taken as described above.

The system is a HYBRID - 80% batch processing (reports and data loads); 20% OLTP.

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
xxxx xxxxxxxxxx xxxx 1 9.0.1.4.0 NO xxxxxx.xxxxx
.com

Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 66 31-Jan-03 16:43:01 25 15.4
End Snap: 67 31-Jan-03 17:13:04 23 11.6
Elapsed: 30.05 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 393M Std Block Size: 16K
Shared Pool Size: 80M Log Buffer: 160K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 347.06 78,219.00
Logical reads: 42.38 9,552.25
Block changes: 0.21 47.13
Physical reads: 0.56 125.38
Physical writes: 0.32 72.38
User calls: 3.62 815.00
Parses: 0.24 53.50
Hard parses: 0.00 0.00
Sorts: 1.12 252.50
Logons: 0.00 0.38
Executes: 4.30 969.13
Transactions: 0.00

% Blocks changed per Read: 0.49 Recursive Call %: 45.56
Rollback per transaction %: 62.50 Rows per Sort: 5.93

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 99.66
Buffer Hit %: 99.25 In-memory Sort %: 100.00
Library Hit %: 100.00 Soft Parse %: 100.00
Execute to Parse %: 94.48 Latch Hit %: 100.00
Parse CPU to Parse Elapsd %: 84.62 % Non-Parse CPU: 99.35

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 93.99 93.95
% SQL with executions>1: 67.62 67.33
% Memory for SQL w/exec>1: 36.57 36.39

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (s) Wt Time
-------------------------------------------- ------------ ----------- -------
control file parallel write 604 16 49.30
db file parallel write 178 7 20.24
db file sequential read 383 4 11.78
log file parallel write 100 2 6.39
async disk IO 15 1 3.29
-------------------------------------------------------------

What I see is:

1) % Blocks changed per Read: 0.49 - very low, a reporting process must be running.
2) Recursive Call %: 45.56 - We use mostly PL/SQL packages so that should be attributable to this
3) Rollback per transaction %: 62.50 - rather high and requires investigation

4) Execute to Parse %: 94.48 - For each parse we execute about 18 times (969.13 executes / 53.50 parses per transaction); could be the nature of the running report
5) Parse CPU to Parse Elapsd %: 84.62 - low. we are waiting for something - see WAITS below
6) % Non-Parse CPU: 99.35 - OK

7) Memory Usage %: 93.99 93.95 - a little high
8) % SQL with executions>1: 67.62 67.33 - OK given HYBRID system.
9) % Memory for SQL w/exec>1: 36.57 36.39 - ?????
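The Execute to Parse figure in item 4 can be checked against the Load Profile. This is only a sketch: it assumes the usual definition of the ratio, 100 * (1 - parses / executes), and plugs in the per-transaction numbers from the report above.

```sql
-- Parses and Executes (per transaction) are copied from the Load Profile above.
-- execute_to_parse_pct should come out close to the 94.48 that the report shows.
select round(100 * (1 - 53.50 / 969.13), 2) as execute_to_parse_pct,
       round(969.13 / 53.50, 1)             as executes_per_parse
  from dual;
```

Read this way, the percentage says how rarely a statement is parsed relative to how often it is executed, so the second column is the "executions per parse" number.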

[SNIP DB Reference...]
Control file Parallel Write:

This event occurs while the session is writing physical blocks to all control files. This happens when:
The session starts a control file transaction (to make sure that the control files are up to date in case the session crashes before committing the control file transaction)

The session commits a transaction to a control file
Changing a generic entry in the control file, the new value is being written to all control files
Wait Time: The wait time is the time it takes to finish all writes to all control files
[END]

Should I investigate this further, and if so how?

Do you have any other comments on the report?

February 03, 2003 - 7:14 am UTC

My comments:

o the reason the statspack queries were the top is....

o ZERO tps -- this system was not active at all during the measurement time. suggest you find a busy time, measure then (and use level 10 - statspack.snap runs for what? a couple of seconds. that is the ONLY point in time it affects the performance at all! when .snap is actually running)

o the rollback % could be attributable to the fact that there were so few actual transactions....

o 16 seconds of wait time in 30 minutes is less than bothersome, it is noise. just make sure care has been taken to spread IO out for controlfiles just as you would for log files. we write to them continuously, you don't want them on busy file systems.

If there was a performance issue during this window, it's not evident in this snippet.
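A minimal sketch of taking level 10 snapshots around a busy period, assuming a standard STATSPACK install and a SQL*Plus session connected as the STATSPACK owner (typically PERFSTAT). The i_snap_level parameter sets the level for that one snapshot only.

```sql
-- Take a snapshot at the start of the busy window.
execute statspack.snap(i_snap_level => 10);

-- ... let the workload run for the period you want to measure ...

-- Take a second snapshot at the end of the window.
execute statspack.snap(i_snap_level => 10);

-- Generate the report; supply the two snap ids when prompted.
@?/rdbms/admin/spreport
```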

Apologies (and some valid STATSPACK data).

Matt, February 04, 2003 - 12:35 am UTC

A lesson learnt there. Check the validity of your data. I had missed the zero transaction count and had assumed that the system was in use 24/7 and so any SNAP would have highlighted an issue. Apologies for wasting your time.

I now have a correct SNAP for a data load process that takes 55 minutes to complete (and is reported to have taken approximately 50% less time in the past).

This problem data load started before and finished after the SNAP IDs used to generate this.

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
xxxx xxxxxxxxxx xxxx 1 9.0.1.4.0 NO xxxxxxxxxxx
.com

Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 205 03-Feb-03 05:29:01 29 14.7
End Snap: 206 03-Feb-03 05:59:05 28 18.1
Elapsed: 30.07 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 393M Std Block Size: 16K
Shared Pool Size: 80M Log Buffer: 160K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 172,724.94 478,641.77
Logical reads: 4,331.81 12,003.97
Block changes: 978.82 2,712.42
Physical reads: 12.98 35.96
Physical writes: 15.66 43.41
User calls: 1,352.03 3,746.65
Parses: 9.82 27.21
Hard parses: 0.00 0.01
Sorts: 5.83 16.16
Logons: 0.00 0.01
Executes: 2,746.23 7,610.13
Transactions: 0.36

% Blocks changed per Read: 22.60 Recursive Call %: 77.13
Rollback per transaction %: 0.00 Rows per Sort: 1.95

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.22 Redo NoWait %: 99.98
Buffer Hit %: 99.86 In-memory Sort %: 100.00
Library Hit %: 100.00 Soft Parse %: 99.95
Execute to Parse %: 99.64 Latch Hit %: 97.39
Parse CPU to Parse Elapsd %: 67.47 % Non-Parse CPU: 99.93

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 53.36 53.77
% SQL with executions>1: 68.11 69.37
% Memory for SQL w/exec>1: 71.10 70.79

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (s) Wt Time
-------------------------------------------- ------------ ----------- -------
buffer busy waits 60,866 699 22.89
latch free 219,813 515 16.87
log file parallel write 10,751 494 16.17
db file sequential read 9,711 319 10.44
db file parallel write 2,446 278 9.10
-------------------------------------------------------------
Wait Events for DB: NSL1 Instance: NSL1 Snaps: 205 -206
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
buffer busy waits 60,866 73 699 11 93.5
latch free 219,813 30,717 515 2 337.7
log file parallel write 10,751 7,337 494 46 16.5
db file sequential read 9,711 0 319 33 14.9
db file parallel write 2,446 2,446 278 114 3.8
log buffer space 1,375 73 260 189 2.1
control file parallel write 1,672 0 112 67 2.6
control file sequential read 21,147 0 78 4 32.5
async disk IO 638 0 66 103 1.0
log file sync 808 11 56 69 1.2
direct path write 5,266 0 51 10 8.1
log file switch completion 163 0 49 302 0.3
log file sequential read 431 0 39 90 0.7
LGWR wait for redo copy 144 80 15 104 0.2
direct path read 12,760 0 13 1 19.6
db file scattered read 89 0 4 46 0.1
enqueue 883 2 3 4 1.4
log file single write 124 0 3 26 0.2
process startup 2 0 0 200 0.0
buffer deadlock 68 68 0 0 0.1
SQL*Net more data to client 4 0 0 0 0.0
SQL*Net break/reset to clien 2 0 0 0 0.0
SQL*Net message from client 2,439,175 0 12,751 5 3,746.8
pipe get 4,344 3,648 9,845 2266 6.7
jobq slave wait 45 43 136 3031 0.1
SQL*Net message to client 2,439,173 0 26 0 3,746.8
SQL*Net more data from clien 2,568 0 0 0 3.9
-------------------------------------------------------------
Background Wait Events for DB: NSL1 Instance: NSL1 Snaps: 205 -206
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file parallel write 10,750 7,337 494 46 16.5
db file parallel write 2,446 2,446 278 114 3.8
control file parallel write 1,672 0 112 67 2.6
async disk IO 638 0 66 103 1.0
direct path write 5,266 0 51 10 8.1
log file sequential read 431 0 39 90 0.7
control file sequential read 2,956 0 36 12 4.5
LGWR wait for redo copy 144 80 15 104 0.2
direct path read 12,760 0 13 1 19.6
log buffer space 91 1 11 118 0.1
log file single write 124 0 3 26 0.2
enqueue 7 2 1 151 0.0
latch free 40 39 1 19 0.1
db file scattered read 12 0 1 58 0.0
db file sequential read 5 0 0 16 0.0
rdbms ipc message 17,264 11,117 8,400 487 26.5
smon timer 6 6 1,843 ###### 0.0
pmon timer 604 604 1,804 2987 0.9
-------------------------------------------------------------

This was generated for

My conclusions:

1) There is some work being done!
2) A very small percentage of logical reads resulted in physical I/O (<<1%).
3) 78% of blocks read are unchanged
4) Recursive calls most likely due to using PL/SQL packages

5) Parse CPU to Parse Elapsed indicates that there are some waits occurring

6) Some memory in the shared pool seems to be wasted (but actually the shared pool hasn't filled with SQL since the last reboot)

We have some waits:

buffer busy waits - 40% of the 30 minute period
latch free - 28% of the 30 minute period
log file parallel write - 27% of the 30 minute period
db file sequential read - 18% of the 30 minute period
db file parallel write - 15% of the 30 minute period
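The percentages above can be reproduced from the Top 5 Wait Events section. A sketch, assuming the 30.07 minute elapsed time from the report header is the right denominator:

```sql
-- Wait seconds are copied from the Top 5 Wait Events section above.
-- 30.07 * 60 is the elapsed time of the snapshot window in seconds.
select event,
       wait_s,
       round(100 * wait_s / (30.07 * 60), 1) as pct_of_elapsed
  from (select 'buffer busy waits' as event, 699 as wait_s from dual union all
        select 'latch free', 515 from dual union all
        select 'log file parallel write', 494 from dual union all
        select 'db file sequential read', 319 from dual union all
        select 'db file parallel write', 278 from dual)
 order by wait_s desc;
```

Keep in mind that these wait totals are summed over every session in the instance, including background processes (the log file parallel write time, for example, appears in full in the Background Wait Events section). With several loaders running at once they are not slices of a single 30 minute timeline.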

I assume that if we fix the buffer busy waits, then the knock-on effect will reduce some of the remaining waits.

Buffer Busy Waits:
Wait until a buffer becomes available. This event happens because a buffer is either being read into the buffer cache by another session (and the session is waiting for that read to complete) or the buffer is in the buffer cache, but in an incompatible mode (that is, some other session is changing the buffer).
Wait Time: Normal wait time is 1 second. If the session was waiting for a buffer during the last wait, then the next wait will be 3 seconds.
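One way to see which blocks are involved is to sample the waiting sessions while the load is running. This is a sketch, assuming SELECT access to V$SESSION_WAIT and DBA_EXTENTS; for this event P1 is the file number and P2 the block number, while the meaning of P3 varies by release. The bind variable names are placeholders.

```sql
-- Sessions that are in a buffer busy wait right now.
-- Run this repeatedly during the load to build up a picture.
select sid,
       p1 as file_no,
       p2 as block_no,
       p3
  from v$session_wait
 where event = 'buffer busy waits';

-- Map one sampled file/block pair to the segment that owns it.
-- :file_no and :block_no are bind variables set from the query above.
select owner, segment_name, segment_type
  from dba_extents
 where file_id = :file_no
   and :block_no between block_id and block_id + blocks - 1;
```

If the same few segments keep turning up, those are the tables and indexes to look at first.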

Questions:
==========

Do my conclusions above make sense?
How do I identify which buffers are being waited for?
How do I reduce these waits?
Is there anything else that I need to look at?

Thanks and Regards,

February 04, 2003 - 7:48 am UTC

how many cpus do you have?
how many processes run in parallel to do this load (and if one - why???)
what were the statistics for CPU in the statspack report?

Information as requested and some additional stuff

Matt, February 04, 2003 - 5:44 pm UTC

The machine has 4 CPUs and 4GB of memory.

There are several (12) processes that carry out the data load. For the period of this snapshot there were 6 processes loading concurrently (roughly half of these complete midway through the snapshot).

When you ask for the CPU statistics, I think that you want the Instance Activity Stats. Here they are (plus some extra):

Instance Activity Stats for DB: NSL1 Instance: NSL1 Snaps: 205 -206

Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session 522,826 289.8 803.1
CPU used when call started 522,829 289.8 803.1
CR blocks created 8,513 4.7 13.1
DBWR buffers scanned 2,792 1.6 4.3
DBWR checkpoint buffers written 21,349 11.8 32.8
DBWR checkpoints 62 0.0 0.1
DBWR free buffers found 1,640 0.9 2.5
DBWR lru scans 140 0.1 0.2
DBWR make free requests 158 0.1 0.2
DBWR revisited being-written buff 0 0.0 0.0
DBWR summed scan depth 2,792 1.6 4.3
DBWR transaction table writes 405 0.2 0.6
DBWR undo block writes 8,364 4.6 12.9
SQL*Net roundtrips to/from client 2,438,683 1,351.8 3,746.1
background checkpoints completed 62 0.0 0.1
background checkpoints started 62 0.0 0.1
background timeouts 2,019 1.1 3.1
buffer is not pinned count 2,322,395 1,287.4 3,567.4
buffer is pinned count 3,784,897 2,098.1 5,814.0
bytes received via SQL*Net from c 600,699,809 332,982.2 922,734.0
bytes sent via SQL*Net to client 415,529,514 230,337.9 638,294.2
calls to get snapshot scn: kcmgss 3,453,833 1,914.5 5,305.4
calls to kcmgas 23,186 12.9 35.6
calls to kcmgcs 7,451 4.1 11.5
change write time 30,285 16.8 46.5
cleanouts and rollbacks - consist 8,283 4.6 12.7
cleanouts only - consistent read 147 0.1 0.2
cluster key scan block gets 322,248 178.6 495.0
cluster key scans 47,441 26.3 72.9
commit cleanout failures: callbac 11 0.0 0.0
commit cleanout failures: cannot 72 0.0 0.1
commit cleanouts 30,681 17.0 47.1
commit cleanouts successfully com 30,598 17.0 47.0
consistent changes 1,700,678 942.7 2,612.4
consistent gets 6,840,920 3,792.1 10,508.3
consistent gets - examination 4,432,309 2,456.9 6,808.5
cursor authentications 20 0.0 0.0
data blocks consistent reads - un 1,699,869 942.3 2,611.2
db block changes 1,765,784 978.8 2,712.4
db block gets 973,743 539.8 1,495.8
deferred (CURRENT) block cleanout 23,958 13.3 36.8
dirty buffers inspected 492 0.3 0.8
enqueue conversions 6 0.0 0.0
enqueue releases 19,106 10.6 29.4
enqueue requests 21,250 11.8 32.6
enqueue timeouts 2,117 1.2 3.3
enqueue waits 833 0.5 1.3
exchange deadlocks 68 0.0 0.1
execute count 4,954,193 2,746.2 7,610.1
free buffer inspected 495 0.3 0.8
free buffer requested 29,411 16.3 45.2
hot buffers moved to head of LRU 18,758 10.4 28.8
immediate (CR) block cleanout app 8,430 4.7 13.0
immediate (CURRENT) block cleanou 4,980 2.8 7.7
leaf node splits 66 0.0 0.1
logons cumulative 4 0.0 0.0
messages received 12,452 6.9 19.1
messages sent 12,452 6.9 19.1
no buffer to keep pinned count 0 0.0 0.0
no work - consistent read gets 1,390,113 770.6 2,135.4
opened cursors cumulative 17,636 9.8 27.1
parse count (failures) 1 0.0 0.0
parse count (hard) 9 0.0 0.0
parse count (total) 17,716 9.8 27.2
parse time cpu 363 0.2 0.6
parse time elapsed 538 0.3 0.8
physical reads 23,407 13.0 36.0
physical reads direct 12,834 7.1 19.7
physical writes 28,259 15.7 43.4
physical writes direct 5,266 2.9 8.1
physical writes non checkpoint 22,372 12.4 34.4
pinned buffers inspected 0 0.0 0.0
prefetched blocks 772 0.4 1.2
process last non-idle time 2,088,423,277 1,157,662.6 3,208,023.5
recovery blocks read 0 0.0 0.0
recursive calls 8,225,672 4,559.7 12,635.4
recursive cpu usage 186,061 103.1 285.8
redo blocks written 314,595 174.4 483.3
redo buffer allocation retries 1,555 0.9 2.4
redo entries 890,378 493.6 1,367.7
redo log space requests 200 0.1 0.3
redo log space wait time 4,924 2.7 7.6
redo ordering marks 2 0.0 0.0
redo size 311,595,792 172,724.9 478,641.8
redo synch time 5,553 3.1 8.5
redo synch writes 1,806 1.0 2.8
redo wastage 5,667,272 3,141.5 8,705.5
redo write time 81,272 45.1 124.8
redo writer latching time 1,495 0.8 2.3
redo writes 10,751 6.0 16.5
rollback changes - undo records a 3 0.0 0.0
rollbacks only - consistent read 502 0.3 0.8
rows fetched via callback 34,940 19.4 53.7
session connect time 2,088,423,277 1,157,662.6 3,208,023.5
session logical reads 7,814,584 4,331.8 12,004.0
session uga memory max 3,031,480 1,680.4 4,656.7
shared hash latch upgrades - no w 897,857 497.7 1,379.2
shared hash latch upgrades - wait 10,013 5.6 15.4
sorts (memory) 10,523 5.8 16.2
sorts (rows) 20,511 11.4 31.5
summed dirty queue length 2,312 1.3 3.6
switch current to new buffer 401 0.2 0.6
table fetch by rowid 2,835,932 1,572.0 4,356.3
table fetch continued row 3,098 1.7 4.8
table scan blocks gotten 95,475 52.9 146.7
table scan rows gotten 5,390,773 2,988.2 8,280.8
table scans (long tables) 0 0.0 0.0
table scans (short tables) 51,298 28.4 78.8
user calls 2,439,071 1,352.0 3,746.7
user commits 651 0.4 1.0
user rollbacks 0 0.0 0.0
write clones created in backgroun 1 0.0 0.0
write clones created in foregroun 794 0.4 1.2

(...snip)

Buffer wait Statistics for DB: NSL1 Instance: NSL1 Snaps: 205 -206
-> ordered by wait time desc, waits desc

Tot Wait Avg
Class Waits Time (s) Time (ms)
------------------ ----------- ---------- ---------
data block 55,376 664 12
undo block 5,405 45 8
undo header 36 0 10
segment header 87 0 1
-------------------------------------------------------------

PGA Memory Stats for DB: NSL1 Instance: NSL1 Snaps: 205 -206
-> WorkArea (W/A) memory is used for: sort, bitmap merge, and hash join ops

Statistic Begin (M) End (M) % Diff
----------------------------------- ---------------- ---------------- ----------
maximum PGA allocated 48.387 48.387 .00
-------------------------------------------------------------
Enqueue activity for DB: NSL1 Instance: NSL1 Snaps: 205 -206
-> Enqueue stats gathered prior to 9i should not be compared with 9i data
-> ordered by waits desc, requests desc

Avg Wt Wait
Eq Requests Succ Gets Failed Gets Waits Time (ms) Time (s)
-- ------------ ------------ ----------- ----------- ----------- ------------
TX 14,186 14,185 0 816 2.78 2
HW 228 228 0 10 1.70 0
CF 1,743 1,726 17 7 152.29 1
-------------------------------------------------------------
Rollback Segment Stats for DB: NSL1 Instance: NSL1 Snaps: 205 -206
->A high value for "Pct Waits" suggests more rollback segments may be required
->RBS stats may not be accurate between begin and end snaps when using Auto Undo
managment, as RBS may be dynamically created and dropped as needed

Trans Table Pct Undo Bytes
RBS No Gets Waits Written Wraps Shrinks Extends
------ -------------- ------- --------------- -------- -------- --------
0 7.0 0.00 0 0 0 0
2 180.0 0.00 106,798 0 0 0
3 198.0 0.00 172,758 0 0 0
4 172.0 0.00 73,484 0 0 0
5 2,466.0 0.00 10,223,994 2 0 0
6 8,150.0 0.00 27,858,772 5 0 4
7 9,057.0 0.00 29,412,858 6 0 5
8 203.0 0.00 244,494 0 0 0
9 9,109.0 0.01 29,223,536 6 0 5
10 2,392.0 0.00 9,777,734 2 0 1
11 6,333.0 0.00 22,915,372 5 0 4
-------------------------------------------------------------

Rollback Segment Storage for DB: NSL1 Instance: NSL1 Snaps: 205 -206
->Optimal Size should be larger than Avg Active

RBS No Segment Size Avg Active Optimal Size Maximum Size
------ --------------- --------------- --------------- ---------------
0 1,556,480 4,915 1,556,480
2 10,469,376 0 10,485,760 36,683,776
3 10,469,376 0 10,485,760 10,469,376
4 10,469,376 0 10,485,760 10,469,376
5 15,712,256 4,678,219 10,485,760 20,955,136
6 31,440,896 7,197,523 10,485,760 31,440,896
7 36,683,776 9,346,924 10,485,760 36,683,776
8 10,469,376 0 10,485,760 10,469,376
9 36,683,776 9,346,924 10,485,760 36,683,776
10 15,712,256 1,518,960 10,485,760 15,712,256
11 31,440,896 7,199,803 10,485,760 31,440,896
-------------------------------------------------------------
Latch Activity for DB: NSL1 Instance: NSL1 Snaps: 205 -206
->"Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for
willing-to-wait latch get requests
->"NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests
->"Pct Misses" for both should be very close to 0.0
-> ordered by Wait Time desc, Avg Slps/Miss, Pct NoWait Miss desc

Pct Avg Wait Pct
Get Get Slps Time NoWait NoWait
Latch Requests Miss /Miss (s) Requests Miss
------------------------ -------------- ------ ------ ------ ------------ ------
shared pool 73,360 0.0 0.0 0 0
messages 50,845 0.1 0.0 0 0
redo allocation 914,048 0.3 0.0 0 0
session idle bit 4,878,547 0.0 0.1 0 0
library cache 28,579,396 4.6 0.1 0 0
cache buffer handles 39,898 0.0 0.2 0 0
latch wait list 1,100,299 0.6 0.2 0 0
undo global data 124,805 0.0 0.3 0 0
redo writing 32,727 0.1 0.3 0 0
cache buffers chains 17,523,755 0.4 0.4 0 30,543 0.0
list of block allocation 27,350 0.0 0.5 0 0
transaction allocation 41,136 0.1 0.5 0 0
checkpoint queue latch 146,571 0.0 0.5 0 0
enqueues 51,443 0.0 0.6 0 0
enqueue hash chains 41,265 0.1 0.7 0 0
cache buffers lru chain 26,876 0.0 1.0 0 28,229 0.1
job workq parent latch 0 0 2 0.0
process allocation 3 0.0 0 3 0.0
redo copy 0 0 892,071 0.0
FAL request queue 122 0.0 0 0
FIB s.o chain latch 494 0.0 0 0
FOB s.o list latch 769 0.0 0 0
child cursor hash table 101 0.0 0 0
user lock 8 0.0 0 0
transaction branch alloc 28 0.0 0 0
sort extent pool 34 0.0 0 0
session timer 605 0.0 0 0
session switching 28 0.0 0 0
session allocation 1,476 0.0 0 0
sequence cache 2,043 0.0 0 0
row cache objects 2,614 0.0 0 0
ncodef allocation latch 28 0.0 0 0
multiblock read objects 260 0.0 0 0
loader state object free 420 0.0 0 0
library cache load lock 30 0.0 0 0
ktm global data 6 0.0 0 0
job_queue_processes para 33 0.0 0 0
hash table column usage 4 0.0 0 0
file number translation 18,170 0.0 0 0
event group latch 3 0.0 0 0
process group creation 7 0.0 0 0
post/wait queue latch 1,949 0.0 0 0
dml lock allocation 1,588 0.0 0 0
channel operations paren 544 0.0 0 0
channel handle pool latc 7 0.0 0 0
SQL memory manager worka 67 0.0 0 0
active checkpoint queue 2,175 0.0 0 0
archive control 186 0.0 0 0
archive process latch 123 0.0 0 0
-------------------------------------------------------------
Latch Sleep breakdown for DB: NSL1 Instance: NSL1 Snaps: 205 -206
-> ordered by misses desc
Latch Name Requests Misses Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
library cache 28,579,396 1,311,420 188,628 1140566/154209/15624/1021/0
cache buffers chains 17,523,755 78,497 29,264 0/0/0/0/0
latch wait list 1,100,299 6,836 1,564 5291/1526/19/0/0
redo allocation 914,048 3,102 131 2979/115/8/0/0
session idle bit 4,878,547 1,816 172 0/0/0/0/0
messages 50,845 60 2 58/2/0/0/0
transaction allocation 41,136 37 19 18/19/0/0/0
undo global data 124,805 37 10 0/0/0/0/0
enqueue hash chains 41,265 32 22 10/22/0/0/0
redo writing 32,727 23 8 15/8/0/0/0
enqueues 51,443 18 11 7/11/0/0/0
checkpoint queue latch 146,571 17 9 8/9/0/0/0
cache buffer handles 39,898 5 1 4/1/0/0/0
cache buffers lru chain 26,876 3 3 0/3/0/0/0
list of block allocation 27,350 2 1 1/1/0/0/0
Latch Miss Sources for DB: NSL1 Instance: NSL1 Snaps: 205 -206
-> only latches with sleeps are shown
-> ordered by name, sleeps desc

NoWait Waiter
Latch Name Where Misses Sleeps Sleeps
------------------------ -------------------------- ------- ---------- --------
cache buffer handles kcbzfs 0 1 1
cache buffers chains kcbgtcr: kslbegin 0 18,616 10,949
cache buffers chains kcbgcur: kslbegin 0 7,188 1,447
cache buffers chains kcbchg: kslbegin: bufs not 0 1,529 4,296
cache buffers chains kcbzwb 0 1,086 225
cache buffers chains kcbrls: kslbegin 0 560 7,639
cache buffers chains kcbchg: kslbegin: call CR 0 188 4,610
cache buffers chains kcbget: pin buffer 0 21 12
cache buffers chains kcbcge 0 6 5
cache buffers chains kcbzgb: scan from tail. no 0 5 0
cache buffers chains kcbget: exchange 0 2 2
cache buffers chains kcbnlc 0 2 36
cache buffers chains kcbgtcr 0 2 0
cache buffers chains kcbzib: finish free bufs 0 1 0
cache buffers lru chain kcbzgb: multiple sets nowa 0 3 0
checkpoint queue latch kcbklbc: Link buffer into 0 7 5
checkpoint queue latch kcbbxsv: move to being wri 0 2 2
enqueue hash chains ksqrcl 0 17 12
enqueue hash chains ksqgtl3 0 3 5
enqueue hash chains ksqcmi: get hash chain lat 0 2 5
enqueues ksqdel 0 5 1
enqueues ksqrcl 0 5 8
enqueues ksqgtl2 0 1 3
latch wait list kslges add 0 1,346 257
latch wait list kslges delete 0 218 1,307
library cache kglupc: child 0 63,235 59,894
library cache kglpnc: child 0 59,487 61,164
library cache kglpnp: child 0 33,337 31,786
library cache kglpndl: child: before pro 0 29,344 31,674
library cache kglpnal: child: alloc spac 0 1,421 1,425
library cache kgllkdl: child: cleanup 0 908 544
library cache kglhdgc: child: 0 374 853
library cache kglhdgn: child: 0 247 89
library cache kgllkdl: child: free pin 0 139 1,097
library cache kglic 0 118 91
library cache kglpsl: child 0 6 7
library cache kgldte: child 0 0 3 13
library cache kglini: child 0 2 2
library cache kglpin 0 2 1
library cache kglati 0 1 0
library cache kgldti: 2child 0 1 0
library cache kgldtld: 2child 0 1 0
library cache kglget: child: KGLDSBRD 0 1 0
list of block allocation ktlabl 0 1 0
messages ksaamb: after wakeup 0 1 2
messages ksarcv 0 1 0
redo allocation kcrfwr: redo allocation 0 127 107
redo allocation kcrfwi: before write 0 3 14
redo allocation kcrfwi: more space 0 1 10
redo writing kcrfsr 0 8 0
session idle bit ksupuc: clear busy 0 90 105
session idle bit ksupuc: set busy 0 82 66
transaction allocation ktcxba 0 13 13
transaction allocation ktcdso 0 6 6
undo global data ktubnd 0 7 0
undo global data ktudnx: KSLBEGIN 0 3 3
-------------------------------------------------------------
Dictionary Cache Stats for DB: NSL1 Instance: NSL1 Snaps: 205 -206
->"Pct Misses" should be very low (< 2% in most cases)
->"Cache Usage" is the number of cache entries being used
->"Pct SGA" is the ratio of usage to allocated size for that cache

Get Pct Scan Pct Mod Final Pct
Cache Requests Miss Reqs Miss Reqs Usage SGA
---------------------- ------------ ------ ------ ----- -------- ---------- ----
dc_files 1 0.0 0 0 1 9
dc_free_extents 202 7.4 19 0.0 19 99 27
dc_object_ids 58 0.0 0 0 531 100
dc_objects 33 18.2 0 0 1,755 100
dc_profiles 2 0.0 0 0 1 33
dc_rollback_segments 168 0.0 0 0 15 79
dc_segments 77 0.0 0 19 511 99
dc_sequences 33 0.0 0 33 7 78
dc_tablespaces 113 0.9 0 0 55 87
dc_used_extents 19 100.0 0 19 22 92
dc_user_grants 6 16.7 0 0 23 66
dc_usernames 10 20.0 0 0 12 52
dc_users 19 5.3 0 0 27 90
-------------------------------------------------------------
Library Cache Activity for DB: NSL1 Instance: NSL1 Snaps: 205 -206
->"Pct Misses" should be very low

Get Pct Pin Pct Invali-
Namespace Requests Miss Requests Miss Reloads dations
--------------- ------------ ------ -------------- ------ ---------- --------
BODY 29 0.0 29 0.0 0 0
CLUSTER 5 0.0 8 0.0 0 0
PIPE 3,237 0.0 3,933 0.0 0 0
SQL AREA 17,602 0.0 9,152,079 0.0 0 0
TABLE/PROCEDURE 86 10.5 4,870,564 0.0 0 0
-------------------------------------------------------------
SGA Memory Summary for DB: NSL1 Instance: NSL1 Snaps: 205 -206

SGA regions Size in Bytes
------------------------------ ----------------
Database Buffers 419,430,400
Fixed Size 435,752
Redo Buffers 180,224
Variable Size 150,994,944
----------------
sum 571,041,320
-------------------------------------------------------------
(...snip)

I think that there may be a couple of issues here (none of which are upgrade related), but I guess I need some confirmation:

1) Freelist contention for the tables being inserted/updated (if this is the case how do I track this - look at the INSERTS and UPDATES in the SQL part of the report? How do I fix it - LMT with auto segment management?)

2) Possibly some 'Checkpoint not complete' errors - nothing in the alert log, but from memory I don't think that 9i logs these in the alert.log by default. How do I confirm this?

3) Lots of indexes used for any queries (currently using RBO)

Can you please comment on the additional information I have provided?

February 04, 2003 - 7:52 pm UTC

1) just add more freelists, ALTER TABLE and ALTER INDEX let you do that.

2) they would be, you would see log waits big time if there were.

3) i was looking to see if you are cpu bound. you are not, looks like you were less than 75% utilized (2.8 cpu seconds used per second, you have 4 cpu seconds per second to use)

bump up freelists, increase load processes.

freelists will reduce buffer busy waits
more loads will use the CPU you have idle.
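A sketch of the freelist change, assuming the segments live in tablespaces with manual segment space management (the FREELISTS setting has no effect under automatic segment space management). The object names are hypothetical, and 12 is just an illustrative value chosen to match the number of load processes mentioned earlier.

```sql
-- load_target and load_target_pk are hypothetical names for a table
-- and an index that the concurrent load processes insert into.
alter table load_target storage (freelists 12);
alter index load_target_pk storage (freelists 12);
```

A value close to the number of sessions that insert into the segment at the same time is a reasonable starting point; re-run the load and compare the buffer busy waits in a fresh snapshot.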

Clarification

Matt, February 04, 2003 - 9:42 pm UTC

OK, I can add more freelists to the tables and indexes involved. I guess 10-20 will be a reasonable number depending on how many processes insert/update concurrently.

I understand the 'checkpoint not complete' is not an issue here. The message would be logged in the alert.log.

As far as your CPU comments are concerned:

If I understand you correctly since we have 4 cpus, for every elapsed second we have 4 cpu seconds.

How did you calculate that only 2.8 CPU seconds are used per second? The data I see is:

Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session 522,826 289.8 803.1
CPU used when call started 522,829 289.8 803.1

Is this really 289.8 seconds per second? Ah, I see it now: this is actually 289.8% CPU per second, which is 289.8/4 = 72% per CPU. So we are wasting CPU cycles.

Is this correct?

Many thanks.

February 05, 2003 - 7:41 am UTC

The stats for cpu are in hsecs (hundredths of a second). I divided by 100: 2.898 ~ 2.8. Using pct's worked since you really divide them by 100 as well.

You are not "wasting" them
You are just not "using them"

You have 7200 cpu seconds in that 30 minute window (30*60*4), you are using 5,228 of them.
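The same arithmetic as a query, using the "CPU used by this session" total from the report and the 4 CPUs and 30 minute window from the thread:

```sql
-- 522826 is "CPU used by this session", reported in hundredths of a second.
-- 30 * 60 * 4 is the CPU time available: 30 minutes on 4 CPUs.
select 522826 / 100                                   as cpu_seconds_used,
       30 * 60 * 4                                    as cpu_seconds_available,
       round(100 * (522826 / 100) / (30 * 60 * 4), 1) as pct_used
  from dual;
```

The result lands a little under 73%, which is where the "less than 75% utilized" figure comes from.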

A reader, February 04, 2003 - 9:48 pm UTC

Hi Tom,

Am going through your book Chapt 10 Tuning Strategies and Tools, page 480

"On the system reported on, one out of every 345 transaction resulted in rollback - this is acceptable"

Tom how did you calculate that number 345 ??

Thanks..

February 05, 2003 - 7:43 am UTC

rollback per transaction = 0.29%

  1* select 29/100, round( 1/345*100, 2 ) from dual
ops$tkyte@ORA920> /

    29/100 ROUND(1/345*100,2)
---------- ------------------
       .29                .29


1 out of 345 is 0.29% of the time...

Additional on initrans and maxtrans

Matt, February 04, 2003 - 10:59 pm UTC

I've just been investigating the required changes.

Most of the system's tables have an initrans of 1 and a maxtrans of 255, which I guess are the defaults.

Is there any good reason why this might be required???

My understanding is that this will seriously impact the concurrency of updates to data that is found on the same block.

Is there any advantage to these current settings?

A specific set (included in this performance problem) have
initrans of 0 and a maxtrans of 0.

select table_name, ini_trans, max_trans, freelists, freelist_groups from user_tables

TABLE_NAME                      INI_TRANS  MAX_TRANS  FREELISTS FREELIST_GROUPS
------------------------------ ---------- ---------- ---------- ---------------
TABLE_XXXXXXXXXXXXXXX03                 0          0          1               1

February 05, 2003 - 7:53 am UTC

for most systems, the defaults are just fine as there is generally enough room on a block for the transaction entries to grow if need be.

where you might need to fine tune them is on a system that uses serializable transactions heavily. There, we may need to pre-allocate the transaction slots on the block to avoid them being reused -- much in the same way we preallocate rollback to avoid it being reused. There it is not so much a concurrency thing as it is giving us the ability to provide the serializable feature. You can get "cannot serialize access" errors if we reuse the transaction slots during your transaction (just like you can get ora-1555 if we reuse rollback you need during your transaction).

Error in previous post.

Matt, February 05, 2003 - 2:45 am UTC

Sorry the point of the above post was that I have found a number of tables with initrans and maxtrans of zero in the database.

Is this unusual and if so why might someone have decided to set these values?

It seems to me that this must be an error as I can't think of a good reason why someone might want to limit block updates like this?

February 05, 2003 - 8:06 am UTC

Those tables are not segments -- not like normal tables. ini trans and max trans of 0 aren't "real" (they should have nulled them out instead of decoding them to zero)

They could be tables in a CLUSTER (look at C_OBJ#)

Or they could be index organized tables (IOTs)

Both of those "table" types are tables that do not have segments -- but rather live in or use some other segment.

The cluster tables "live in" the cluster segment. the Cluster would have the ini/max trans.

The IOT tables only have an index, no table -- the ini/max trans would be found with the index.
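
A sketch of where to look for the real values, assuming the standard dictionary views USER_TABLES, USER_CLUSTERS and USER_INDEXES (use the DBA_ versions, plus the owner columns, to look across schemas). The join conditions are an illustration, not something taken from this thread:

```sql
-- Tables stored in a cluster: the ini/max trans belong to the cluster segment.
select t.table_name, c.cluster_name, c.ini_trans, c.max_trans
  from user_tables t, user_clusters c
 where t.cluster_name = c.cluster_name;

-- Index organized tables: the ini/max trans belong to the IOT's index segment.
select t.table_name, i.index_name, i.ini_trans, i.max_trans
  from user_tables t, user_indexes i
 where t.iot_type   = 'IOT'
   and i.table_name = t.table_name
   and i.index_type = 'IOT - TOP';
```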

Mike, February 05, 2003 - 10:03 am UTC

"
Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session                     522,826          289.8        803.1
CPU used when call started                   522,829          289.8        803.1

"

Instead of using statspack, can we use some v$ views to find the same/similar stat info?
Thanks.

February 05, 2003 - 12:03 pm UTC

sure, all statspack does is query v$ tables after all.

v$sesstat joined to v$statname by statistic# is what you are looking for.
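
A minimal sketch of that join. The second query, against V$SYSSTAT, is an addition here rather than part of the answer above: it gives the instance-wide totals since startup, which is the kind of figure statspack takes the difference of between two snapshots.

```sql
-- Per-session values of the two CPU statistics (units are hsecs, 1/100 s).
select s.sid, n.name, s.value
  from v$sesstat s, v$statname n
 where s.statistic# = n.statistic#
   and n.name in ( 'CPU used by this session',
                   'CPU used when call started' )
 order by s.value desc;

-- Instance-wide totals since startup.
select name, value
  from v$sysstat
 where name in ( 'CPU used by this session',
                 'CPU used when call started' );
```

To get a "per second" figure like the report's, take two readings and divide the difference by the elapsed seconds between them.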

Spot on.

Matt, February 05, 2003 - 5:21 pm UTC

Yes these are clustered tables (I looked at dba_tables.cluster_name and dba_clusters).

initrans set to 2 and maxtrans set to 255 (the defaults)

STATSPACK WAIT STATS

Zoran Martic, February 07, 2003 - 12:58 pm UTC

Hi all,

Be aware of how statspack reports its top wait events:

These statistics cover all Oracle processes, foreground (your server processes) and background (LGWR, DBWR, ...).
The top waits can be very confusing at times,
because they do not exclude background processing time that does not directly slow down your foreground processes.

Because of this you can see control file parallel write as the top wait event when, in real life, this event is not slowing down your database at all.

YOU NEED TO THINK ONLY ABOUT WAITS THAT CAUSE YOUR FOREGROUND PROCESSES TO SLOW DOWN.

This is what is wrong with statspack and with the many scripts around that are based on the wait event method.

The only scripts I know of that show you this difference are Steve Adams' scripts at www.ixora.com.au.

See the statspack output that Matt from Australia posted.
The first three events might not be on that list at all if you did not count the work that DBWR (db file parallel write), LGWR (log file parallel write) or CKPT (99% of control file parallel write) did.
If DBWR can write in the background without causing foreground processes to slow down, you DO NOT CARE.

YOU WILL NOT FIND THIS FROM STATSPACK.

Here is one of the more useful scripts, which shows statistics for foreground processes only:

SQL> @response_time_breakdown

MAJOR    MINOR         WAIT_EVENT                                SECONDS    PCT
-------- ------------- ---------------------------------------- -------- ------
CPU time parsing       n/a                                           201  3.85%
         reloads       n/a                                           435  8.33%
         execution     n/a                                          2508 48.04%

disk I/O normal I/O    db file sequential read                       549 10.52%
         full scans    db file scattered read                         47   .89%
         direct I/O    direct path read                                0   .00%
         other I/O     control file sequential read                    0   .00%
                       control file parallel write                     0   .00%

waits    DBWn writes   rdbms ipc reply                               680 13.02%
                       free buffer waits                              66  1.27%
         LGWR writes   log file switch completion                     10   .20%
         enqueue locks enqueue                                         0   .01%
         other locks   latch free                                     33   .62%
                       sort segment request                            1   .02%
                       undo segment extension                          0   .00%
                       buffer busy waits                               0   .00%
                       library cache load lock                         0   .00%

latency  commits       log file sync                                 337  6.45%

MAJOR    MINOR         WAIT_EVENT                                SECONDS    PCT
-------- ------------- ---------------------------------------- -------- ------
latency  network       SQL*Net more data to client                   267  5.12%
                       SQL*Net message to client                      73  1.40%
                       SQL*Net more data from client                   9   .17%
                       SQL*Net break/reset to client                   2   .03%
         file ops      file open                                       2   .04%
                       file identify                                   0   .00%
         process ctl   process startup                                 0   .01%
         misc          refresh controlfile command                     0   .00%


In my statspack report for the same database I get totally the wrong waits to look at.

THERE ARE WAIT METHODS THAT WILL WASTE YOUR TIME
AND THERE ARE CORRECT WAIT METHODS THAT WILL POINT YOU TO TUNE FOREGROUND WAIT EVENTS FIRST.

Sometimes you do need to tune where and how DBWR is writing (to which disk) if you want reads to be faster, or to make LGWR write faster (if you wait on LGWR).

Be smart. Working from statspack you will go and tune stupid CKPT, LGWR and DBWR waits without noticing that maybe only part of these events (it could be 10%) is slowing down your application.
YOU ALWAYS NEED TO IDENTIFY THE BIGGEST WAIT.
You will never find this one with statspack.

Regards,
Zoran

February 07, 2003 - 1:58 pm UTC

well, while i agree with lots said here - i would say that statspack does tell you this.

Waits are just tricky -- control file writes, well, you might actually be doing them.

sqlnet message from client - could be an idle wait, could mean the client is a huge bottleneck. sometimes you need to ignore, sometimes not.

statspack shows you the waits -- all of them. Yes, you need to use your knowledge of the system in order to figure out the wheat from the chaff -- but to just ignore them? nah, you want to see them and ignore them if you find them to not be the cause.

All of the waits are there. You can even tell statspack to consider these other waits as "idle waits" so they don't show up in the top 5 if you like.
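
A hedged sketch of that last point. It assumes the 9i statspack schema, where the report leaves out events listed in the PERFSTAT table STATS$IDLE_EVENT; check spdoc.txt for your own release before relying on it. The event chosen here is only an example:

```sql
-- Run as PERFSTAT. Adds one event to the list statspack treats as idle,
-- so later reports no longer show it in the Top 5 section.
insert into stats$idle_event ( event )
values ( 'control file parallel write' );

commit;
```

The event's time then disappears from the top waits entirely, so this only makes sense once you have established that the wait really is not hurting your foreground sessions.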

Statspack WAIT clarification

Zoran Martic, February 08, 2003 - 4:57 am UTC

Hi Tom,

Just so I am not misunderstood:
The point is to get all the WAITS that are causing your foreground processes to slow down.

Of course that could include controlfile writes or DBWR writes (on a log switch when the next redo log is still active, or if you need to read from disk and high disk utilization slows down your sequential/scattered reads).

But in most cases these will not be the TOP WAIT EVENT at all.
In many databases they could be number 20 or 30.

What statspack gives you is not what you need to tune first.
Statspack is great for top SQL, but not for waits.

Imagine this.

Suppose Oracle just created this script and said: put the TOP WAITS for the foreground first (that includes time caused by the background if, for example, your commit after a delete has to wait for LGWR to switch logfiles before it can write the log buffer to the new redo log). This is very easy: just eliminate the time for background processes from v$session_event (that will always be time that is not directly included in foreground waits).
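
The idea of dropping background time from v$session_event can be sketched as follows. This is an illustration only, not the Steve Adams script and not anything statspack runs. It relies on V$SESSION.TYPE being 'USER' for foreground sessions and 'BACKGROUND' for DBWR, LGWR, CKPT and the rest:

```sql
-- Waits accumulated by foreground (user) sessions only.
-- time_waited is in hsecs (1/100 s).
select e.event,
       sum( e.total_waits )       total_waits,
       sum( e.time_waited ) / 100 seconds_waited
  from v$session_event e, v$session s
 where s.sid  = e.sid
   and s.type = 'USER'
 group by e.event
 order by seconds_waited desc;
```

Two caveats: v$session_event only covers sessions that are connected right now (waits of sessions that have already disconnected are gone), and idle events such as "SQL*Net message from client" will still appear and have to be judged separately.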

I used statspack for a long time and used to tune the top waits (it was stupid of me, but I did not think too much about it at the time).
In 8i these were usually controlfile waits, DBWR writing, ... Usually few of them are really top waits at all.

Oracle is built so that the background processes work as optimally as possible, which means controlfile writes are usually >90% (maybe more) a CKPT wait event.

Do you mean that this CKPT time should be the TOP event, as it normally is in statspack?

You said you can exclude some events from statspack.
But in that case you will lose the part of that event's time that is causing foreground processes to slow down (say, for example, the 10% of DBWR writes that slows down your application).

Do you mean that all DBWR write time is causing slowdown?

Some of this is in the OS, where you cannot see it that easily. For example, reads (sequential/scattered) always cause foregrounds to slow down, but you can spread the write IO better (not like the SAM method), moving it from the disk with the most reads to other disks, and so speed up reads because that disk is less utilized.

This is a hard game.

This guy Steven Adams is very smart. I realized that many sites are not dealing with waits in the proper way (for a long time I used scripts from orapub, but they had the same problem as statspack).

What do you think?

I do not think that I am on the wrong track.

Also Enterprise Manager, Quest tools, ... all use the ALL WAIT EVENTS strategy.

Just think about what your response time is: client, network and foreground process time (including all waits).

Is STATSPACK telling us whether to tune CKPT, DBWR or foreground (server) processes?
What is more important to tune, CKPT or foreground processes?
Statspack does not tell us that for now, as far as I know.

Regards,
Zoran

Statspack review

Zoran Martic, February 08, 2003 - 2:23 pm UTC

Hi Tom,

In any case STATSPACK is useful as a free tool.
You just need to look at the background waits in the report and remove them from the top waits (or from all waits).

I am not sure how many people look at the background waits.

Not to use hard words, but maybe I just cannot see the reason why the top waits are not, as would be natural, only the top waits for foreground processes.

I spent a lot of time tuning stupid events in the past just because I watched the top waits by default without looking at the background waits.
In the end I am the guilty one, because WE DID NOT (OR DO NOT) LOOK AT statspack reports PROPERLY. I am not sure whether I am the only one.

Regards,
Zoran

Enqueues...

Matt, February 10, 2003 - 11:40 pm UTC

Ok, behind the freelist issue there is another issue relating to enqueues. See the top 5 waits below.

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                Wait % Total
Event                                               Waits    Time (s) Wt Time
-------------------------------------------- ------------ ----------- -------
enqueue                                               410       1,169   21.63
db file sequential read                            56,729       1,119   20.71
log file parallel write                            17,359         789   14.61
log buffer space                                    2,457         787   14.57
db file parallel write                              5,081         563   10.42

enqueues:
=========
Enqueues are locks that serialize access to database resources. This event indicates that the session is waiting for a lock that is held by another session.

Enqueue activity for DB: NSL1 Instance: NSL1 Snaps: 580 -581
-> Enqueue stats gathered prior to 9i should not be compared with 9i data
-> ordered by waits desc, requests desc

                                                          Avg Wt         Wait
Eq     Requests    Succ Gets Failed Gets       Waits   Time (ms)     Time (s)
-- ------------ ------------ ----------- ----------- ----------- ------------
UL        3,609        2,209       1,400          25       48.08            1
TX       15,618       15,616           2           5  230,535.20        1,153

So, from above I am concluding that I am seeing TX enqueues on the system. I have not seen these for other data loads.

I think that I have waits for an Interested Transaction List (ITL).
How can I prove this given the remaining information in the report?
How can I identify which tables need an increased INITRANS/MAXTRANS?

BTW:

The following parameters are set in the init.ora (I cannot find reference on the system as to why these may have initially been set):
enqueue_resources 30000
_enqueue_locks 30000

The system is the same as earlier.

February 11, 2003 - 8:22 am UTC

I would be more inclined to think you have classic blocking/locking as waits for itls are rather rare. You had 5 waits for loooonnngggg times (on average). Looks more like a classic "blocker/blockee -- I've got the row you want to have already locked".

Now - fortunately for you -- these waits are very long -- so you can in fact monitor the system and see what the blocker is (and what they are doing) and who the blockee is and what they are TRYING to do. That'll give you a very good clue.

Additional on INITRANS

Matt, February 11, 2003 - 1:28 am UTC

For any INITRANS Changes to take effect MUST the table be re-created?

BTW, previously there were approx. 100 concurrent loads.

February 11, 2003 - 8:23 am UTC

it'll affect newly added blocks, but if you want it on the existing blocks -- yes.
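
A minimal sketch of the two cases. T and T_IDX are hypothetical names and 10 is only an example value. It assumes an ordinary heap table; tables stored in a cluster cannot be moved this way:

```sql
-- Affects only blocks formatted from now on; existing blocks keep
-- whatever ITL space they already have.
alter table t initrans 10;

-- Rebuilds the segment so every block is formatted with the new setting.
-- The move changes rowids, so the table's indexes become UNUSABLE
-- and have to be rebuilt afterwards.
alter table t move initrans 10;
alter index t_idx rebuild initrans 10;
```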

Investigation locks

Matt, February 11, 2003 - 10:50 pm UTC

Can you suggest how I highlight the blockers and blockees?

I've been using:

SELECT s.sid
,s.serial#
,sw.event
-- ,sw.p1text
,sw.p1
-- ,sw.p2text
,sw.p2
-- ,sw.p3text
,sw.p3
-- ,sw.wait_time
,sw.seconds_in_wait
,s.sql_hash_value
,s.sql_address
FROM V$SESSION s
,V$SESSION_WAIT sw
WHERE s.sid = sw.sid
AND s.status = 'ACTIVE'
AND sw.wait_time = 0
order by sw.event

To try and catch active waits for all events (and then filter only on the enqueues). I then have the info here to find the session info (and the SQL...and the locks etc).

Is this what you had in mind?

If this is application related. What can I possibly do to fix it?

February 12, 2003 - 8:20 am UTC

see
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:839412906735

You can use v$lock to see this easily...

If this is application related, we'll have to look at the app and see how it is they want to work on the same exact rows and change that somehow.
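
A sketch of a v$lock self-join that pairs blockers with blockees while the wait is happening. The column aliases are illustrative; the columns themselves (SID, TYPE, ID1, ID2, REQUEST, BLOCK, CTIME) are standard V$LOCK columns:

```sql
-- Sessions holding a lock that someone else is waiting on (block = 1),
-- paired with the sessions requesting the same resource (request > 0).
select h.sid   blocker_sid,
       w.sid   waiter_sid,
       h.type  lock_type,
       h.id1,
       h.id2,
       w.ctime seconds_waiting
  from v$lock h, v$lock w
 where h.block   = 1
   and w.request > 0
   and h.type    = w.type
   and h.id1     = w.id1
   and h.id2     = w.id2;
```

With the SIDs in hand, v$session (sql_hash_value, sql_address) shows what each side is running, as in the query in the question above.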

Calculating CPU usage from STATSPACK

Logan Palanisamy, March 21, 2003 - 6:49 pm UTC

Tom,

Here are the CPU related stats from my STATSPACK output for a four hour period on a 4 CPU box. There is only one instance on this machine.

Statistic                                    Total   per Second    per Trans
--------------------------------- ---------------- ------------ ------------
CPU used by this session               136,025,771      9,478.5      9,065.4
CPU used when call started               2,607,877        181.7        173.8

How do I calculate CPU usage? CPU used by this session or CPU used when call started?

Total available CPU seconds = 4 * 60 * 60 * 4 = 57,600 seconds.

CPU Usage when call started% = 26,078/57,600 * 100 = 45%

"CPU used by this session" gives over 100% which doesn't seem correct. What is it then?

March 21, 2003 - 7:21 pm UTC

4 hours is 3 hours and 45 minutes too long for a statspack. but anyway...

CPU used by this session is normal -- but tell me, do you have some really LONG running things? Looks more like an overlap problem than anything else here.

(but generally, you want to use OS TOOLS to measure OS statistics like CPU utilization)

CPU Usage

Logan Palanisamy, March 22, 2003 - 11:53 am UTC

Tom,

The reason we have a 4-hour snapshot interval is because it is an Oracle Applications database with over 300 tablespaces. If we do a snapshot every 15 minutes, it fills up space pretty quickly. Of course, whenever there is any special need we take a manual on-demand snapshot.

If we measure the CPU usage at the OS level then it gives it at the machine level instead of at the database level. That is the reason we are calculating the CPU usage using STATSPACK's data.

March 22, 2003 - 12:37 pm UTC

you only need two snapshots -- 15 minutes apart.

No one said you had to take one every 15 minutes -- just that when you take one, you want one 15 minutes later (unless the one you just took is the one that is "15 minutes later")
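
A sketch of an on-demand pair of snapshots, assuming a standard statspack install and a SQL*Plus session connected as PERFSTAT:

```sql
-- First snapshot, taken when the workload of interest is running.
exec statspack.snap

-- ... wait about 15 minutes ...

-- Second snapshot.
exec statspack.snap

-- Prompts for the begin and end snap ids and a report file name.
@?/rdbms/admin/spreport
```

Taking snapshots only when they are needed keeps the space used by the PERFSTAT schema down without stretching the window to four hours.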

It'll be extremely hard since CPU usage is measured after calls complete and in your case, you have really long running sessions. long running calls/sessions that span your observation window will mess you up -- as they have here.

what you really care about is overall machine utilization, the OS level should be what you are looking at.

But anyway -- consider this example:

create or replace procedure run_until( p_date in date )
as
begin
    while sysdate < p_date
    loop
        null;
    end loop;
end;
/
exec run_until( sysdate + 1/24/60 * 3 )

So, that procedure just "burns CPU" for 3 minutes as called.

I took a snapshot a little before it started....
I took a snapshot right before it ended...
I took a snapshot right after it finished...

and the results are:

            Snap Id     Snap Time      Sessions Curs/Sess Comment
            ------- ------------------ -------- --------- -------------------
Begin Snap:      21 22-Mar-03 12:24:42       16       4.0
  End Snap:      24 22-Mar-03 12:29:23       16       4.3
   Elapsed:            4.68 (mins)

Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session                       1,092            3.9        273.0
CPU used when call started                     1,093            3.9        273.3

apparently my system was virtually idle.... OR was it:

            Snap Id     Snap Time      Sessions Curs/Sess Comment
            ------- ------------------ -------- --------- -------------------
Begin Snap:      21 22-Mar-03 12:24:42       16       4.0
  End Snap:      25 22-Mar-03 12:29:45       16       4.4
   Elapsed:            5.05 (mins)

Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session                      18,619           61.5      3,723.8
CPU used when call started                    18,619           61.5      3,723.8

Wow, in less than 1/2 of a minute, I used gobs of CPU time...

            Snap Id     Snap Time      Sessions Curs/Sess Comment
            ------- ------------------ -------- --------- -------------------
Begin Snap:      24 22-Mar-03 12:29:23       16       4.3
  End Snap:      25 22-Mar-03 12:29:45       16       4.4
   Elapsed:            0.37 (mins)

Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session                      17,527          796.7     17,527.0
CPU used when call started                    17,526          796.6     17,526.0

And that shows a caveat of statspack -- some statistics are just not reported until they are done. So, looking at your 4 hour window -- all that had to happen was a couple of 8 hour long jobs FINISHED and dumped all of their CPU time into your window.

Matt, March 23, 2003 - 8:59 pm UTC

Very interesting. I was using Metalink Note:223117.1 and the COE performance approach for some tuning. Basically using a 9.0.1 statspack and breaking down the response time into its constituent components to highlight where the most benefit could be gained.

These were my wait events:
Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                Wait % Total
Event                                               Waits    Time (s) Wt Time
-------------------------------------------- ------------ ----------- -------
db file sequential read                            48,481       1,054   23.37
log buffer space                                    3,824       1,002   22.21
log file parallel write                            21,373         858   19.01
latch free                                        448,006         453   10.04
db file parallel write                              2,612         420    9.30

This was my CPU usage:
Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session                     679,596          188.9        678.2

So, response time = service time + wait time

Calculating the total wait time in centi-seconds (using the number for the db file sequential read entry) we have:

(105400/23.37)*100 = 451006

Service time in centi-seconds is 679596

So, response time = 451006 + 679596 = 1130602 centi-seconds

So calculating percentages (of total response time) for each response time component we get:

CPU Time = (679596/1130602)*100 = 60%
db file sequential read (105400/1130602)*100 = 9%
log buffer space (100200/1130602)*100 = 9%
...

So, suddenly I am focused on tuning CPU usage rather than anything else (and the log buffer space wait as well)
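
The breakdown above can be reproduced in one statement. This is only a sketch using the numbers quoted in this post, converted to centiseconds; it reads nothing from the statspack tables, and the total wait time is back-calculated from the 23.37% figure exactly as in the text:

```sql
-- response time = service (CPU) time + total wait time
select component,
       cs,
       round( 100 * cs / ( 679596 + round( 105400 / 23.37 * 100 ) ), 1 ) pct
  from ( select 'CPU time' component, 679596 cs from dual union all
         select 'db file sequential read', 105400 from dual union all
         select 'log buffer space',        100200 from dual union all
         select 'log file parallel write',  85800 from dual union all
         select 'latch free',               45300 from dual union all
         select 'db file parallel write',   42000 from dual )
 order by cs desc;
```

The percentages do not add up to 100 because the waits outside the top 5 are part of the total but are not listed as rows.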

Now, there were a number of queries with a large ratio of LIO to executions (500:1), which was where I was going to start my tuning - these were the problem statements.

What is a 'good' LIO to execution ratio? Obviously it depends on the number of rows returned. Is it possible to find out how many rows were processed by an individual SQL statement using STATSPACK?

The system itself has a number of long running processes which may have been contributing to the CPU stat by 'dropping in' on my snap, definitely worth being aware of this (in my case, I believe that CPU is indeed in need of tuning).

March 24, 2003 - 7:34 am UTC

so now you see that all of those CPU related numbers are potentially "suspect" on any system. Just takes some long running things to muck them all right....

But -- why would this make you tune CPU usage more than anything else? I would be looking at my indexing setup on this database with the goal to reduce the db file sequential reads.

A good LIO to execution ratio (man do i hate ratios, there is one "ratio" i use and that's the soft parse one), like all ratios, has no single "good goal number". For if we propose one -- there will be some guy walking himself into a corner trying to get the massive query from heck to fit the ratio (fruitlessly).

As you pointed out - it is a function of the number of rows returned (for one) but also the number of tables joined (for another). eg:

select * from emp, dept where emp.deptno = dept.deptno and emp.empno = :x;

I would expect

2-3 lio's to read the emp index
1 lio to table access by rowid the emp table
2-3 lio's to read the dept index
1 lio to table access by rowid the dept table
---
6-8 lios

Now, if that were:

select * from emp, dept where emp.deptno = dept.deptno and dept.deptno = 10;

I would expect

2-3 lio's to read the dept index
1 lio to read the dept table
2-3 lio's to read the index on emp(deptno) and start range scanning
ceil(1*N/num-rows-per-leaf-block) lio's to scan said index
1*N lio's to table access by rowid the emp table, n = number of rows to return

A 3 table join would take more, a query that returns 100 rows takes more, the array fetch size will impact this, yadda yadda yadda.

You need to look at that query and say to yourself "what is its job, what does it do".

Remember, a query that has a 5000 to 1 ratio of lios to executions might not be the one you want to tune first -- especially if there is one that has a 10 to 1 ratio. What? Well, if you execute the 5000:1 query once a day but the 10:1 query a hundred times/second -- knocking even 1 lio off of that one to get it to 9:1 is what you want to do.
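
One way to look at this, sketched against V$SQL rather than the statspack tables: rank the cached statements by total logical I/O and show the per-execution and per-row figures alongside. The cut-off of 20 rows and the column aliases are arbitrary choices for illustration:

```sql
-- buffer_gets, executions and rows_processed are cumulative since the
-- cursor was loaded into the shared pool.
select *
  from ( select sql_text,
                executions,
                buffer_gets,
                rows_processed,
                round( buffer_gets / greatest( executions, 1 ) )     lio_per_exec,
                round( buffer_gets / greatest( rows_processed, 1 ) ) lio_per_row
           from v$sql
          order by buffer_gets desc )
 where rownum <= 20;
```

Ordering by total buffer_gets puts the 10:1 query that runs a hundred times a second above the 5000:1 query that runs once a day.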

CPU usage

Matt, March 24, 2003 - 6:01 pm UTC

"I would be looking at my indexing setup on this database with the goal to reduce the db file sequential reads."

Agreed, but in this case, fixing the LIO's would fix the db file sequential reads (and vice-versa). This wasn't really evident from the info posted by me.

The problem query itself is core to all our data loads, so should be the focus of attention.

What is happening is the use of "PARTITION VIEWS", which are not implemented correctly. This is causing painful index range scans of multiple partitions - no partition pruning! This causes the db file sequential reads and the LIO's.....still working towards those partitioned tables.

Yes, I understood that LIO's were a function of many things. I don't really think that my thoughts had crystallised into complete sense. I was thinking about trying to pull out high LIO SQL using a more complicated criteria, but as you have indicated. The actual criteria is so complicated as to be 'too hard'.

All my conclusions are based on application knowledge and the way that the data is used. All R.O.T have to be put in context.

March 24, 2003 - 8:05 pm UTC

well said "all ROT have to be put in context"... That may be my epitaph -- either that or "use binds"

from performance tuning guide in 8.1.7

A reader, April 01, 2003 - 7:01 am UTC

Hi
http://download-west.oracle.com/docs/cd/A87860_01/doc/server.817/a76992/ch18_cpu.htm#669
from the doc it says this

Read Consistency
Your system may spend excessive time rolling back changes to blocks in order to maintain a consistent view. Consider the following scenarios:

If there are many small transactions and an active long-running query is running in the background on the same table where the inserts are happening, then the query may need to roll back many changes.

If the number of rollback segments is too small, then your system could also be spending a lot of time rolling back the transaction table. Your query may have started long ago; because the number of rollback segments and transaction tables is very small, your system frequently needs to reuse transaction slots.

A solution is to make more rollback segments, or to increase the commit rate. For example, if you batch ten transactions and commit them once, then you reduce the number of transactions by a factor of ten.

============================================================

My question is:

for the first statement why the query has to rollback many times?!?!?!

and for the second, why does Oracle have to spend a lot of time rolling back the transaction table? how is transaction table size related to RBS size?

And finally, in the solutions it says increase commit rate, isn't it the other way round?

April 01, 2003 - 7:54 am UTC

well, the "increase the commit rate" is poorly worded at best.  The explanation of what to do (don't commit batch/batch/batch - do ten batches then commit) tells you the correct advice -- commit LESS OFTEN, the goal -- reduce the number of commits.

"increase the commit rate" should probably read decrease.



As for the questions

it is not that it is rolling back many times, but rather that it is rolling back many changes.  Suppose you have a query that takes 10 minutes.  In that 10 minutes, block X was modified by 500 different transactions.  Block X is the last block you are going to read.  We have to rollback 500 times.  Consider this example:

ops$tkyte@ORA817DEV> create table t ( x char(255) );
Table created.

ops$tkyte@ORA817DEV> insert into t values ( 0 );
1 row created.


ops$tkyte@ORA817DEV> create or replace procedure update_t_lots( p_times in number )
  2  is
  3          pragma autonomous_transaction;
  4  begin
  5          for i in 1 .. p_times
  6          loop
  7                  update t set x = i;
  8                  commit;
  9          end loop;
 10  end;
 11  /
Procedure created.


ops$tkyte@ORA817DEV> create or replace procedure run_demo( p_times in number )
  2  as
  3          type rc is ref cursor;
  4
  5          l_cur rc;
  6          l_data t.x%type;
  7  begin
  8          open l_cur for
  9          'select * from t update_' || p_times;
 10
 11          update_t_lots( p_times );
 12
 13          fetch l_cur into l_data;
 14          close l_cur;
 15  end;
 16  /
Procedure created.

ops$tkyte@ORA817DEV> alter session set sql_trace=true;
Session altered.

ops$tkyte@ORA817DEV> exec run_demo( 0 );
PL/SQL procedure successfully completed.

ops$tkyte@ORA817DEV> exec run_demo( 500 );
PL/SQL procedure successfully completed.

tkprof shows:

select * from t update_0

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          1          4           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          1          4           1
********************************************************************************
select * from t update_500

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.01       0.01          0        496          4           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.01       0.01          0        496          4           1


See how the second select did 496 lio's to read that one row?  that was the rolling back of the intervening 500 transactions to reconstruct that block as it was when the cursor was opened.

The transaction table is just "data" itself.  It is rolled back to the point in time needed in order to process the rbs data.

read ITL 500 times?

A reader, April 01, 2003 - 9:19 am UTC

Hi

In your example where 496 LIOs took place: as far as I understand, when we need a consistent view of a modified data block we first read the ITL in the header of the block where the row is located; from that we find the transaction slot in the corresponding transaction table in the RBS, and from there the undo needed to build the consistent view for our query. Does this mean, in your example, that Oracle goes through this process 500 times? Once for each row change?

April 01, 2003 - 4:12 pm UTC

there were 500 some odd changes to unwind through -- so yes, there was a whole lot of rollback reading going on to iteratively read through all of the changes.

it wasn't that it read the ITL 500 times, it read through the chain of undo to get back in time to JUST BEFORE the query was opened (and no further). It has to walk the list of changes.
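
One hedged way to watch this from the querying session itself, assuming the statistic names below exist under these names in your release (they do in the 8i/9i versions discussed here, but check V$STATNAME). Run it before and after the fetch and compare the two readings:

```sql
-- v$mystat holds the statistics of the current session only.
select n.name, m.value
  from v$mystat m, v$statname n
 where m.statistic# = n.statistic#
   and n.name in ( 'consistent gets',
                   'data blocks consistent reads - undo records applied' );
```

In a case like run_demo( 500 ) both values should rise together, which is the rollback reading described above showing up as logical I/O.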

CPU Utilisation is 100%

Vivek Sharma, April 02, 2003 - 10:54 am UTC

Hi Tom,

This article of yours was very helpful. Thanks. Going from ur calculation, the statistics of one of my database (statspack report) have db file scattered read as the top most event in the top 5 events about 76% and latch free being 14%.

CPU Statistics is :
Instance Activity Stats for DB: CMSOB Instance: cmsob Snaps: 35 -41

Statistic                                    Total   per Second    per Trans
--------------------------------- ---------------- ------------ ------------
CPU used by this session                 3,562,881        123.7         85.8
CPU used when call started               3,562,881        123.7         85.8

There are 2 CPU's.
Going thru ur calculations we are utilizing 1.2 cpu seconds per second. The Snapshot was taken for an interval of 8 hours (I know it is too high and will reduce it to 15-30 minutes). So the available CPU was 2 * 60 * 60 * 8 = 57600 seconds.
So 35628/57600 is 62%. But I have seen that the CPU is at 100% and oracle.exe takes around 99% of CPU (Windows NT Server). If we are using only 62% CPU then why is oracle showing 100%? But let me tell u that when this statspack report was generated, I did not check the CPU utilisation physically on the server. Shall I run statspack when the CPU is at 100% and check? And what is the solution to reduce the CPU utilisation, and how much should it be? Because when the CPU is high, users complain about the slowdown of the database.

Awaiting your reply.

Thanks in Advance
Vivek Sharma.

April 02, 2003 - 12:46 pm UTC

read through this:

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:7641015793792#8768340190291

and you'll see that computing CPU utilization with statspack can be unreliable.

Suppose you have 5 points in time:

t1 -- jobs 1 and 2 are running

t2 -- you take a snapshot

t3 -- job 1 completes

t4 -- you take another snapshot 15 minutes later.

t5 -- job 2 completes

And those are the only two jobs.

Now, say T1 and T2 are separated by 2 hours. Statspack would show around 2 hours of CPU time used in that 15 minute snapshot. Job 1 contributes all of its runtime when it completes. If you had 8 cpus, you would say "wow, I was 100% utilized". If you had 4 cpus -- you would be wondering what sort of magic allowed you to be 200% utilized. And if you had 16 cpus, you'd be thinking "hmm, 50%".

Truth was -- if you were 8 cpus, you were 25% (at most, if job 1 finished near the end of the window) utilized.

4 cpus -- 50%

16 cpus -- 1/8th (12.5%)

and so on...
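
The arithmetic behind those percentages can be sketched as a query against DUAL. It assumes, as in the example above, two single-threaded jobs, a 15 minute (900 second) snapshot window and roughly 2 hours (7,200 seconds) of CPU credited to that window when job 1 completes. The column aliases are made up for illustration.

```sql
-- apparent_pct   : CPU seconds reported in the window / CPU seconds available
-- actual_max_pct : two jobs can keep at most two cpus busy at once
select ncpu,
       round( 100 * 7200 / (ncpu * 900), 1 ) apparent_pct,
       round( 100 * 2 / ncpu, 1 )            actual_max_pct
  from ( select 4 ncpu from dual union all
         select 8      from dual union all
         select 16     from dual )
 order by ncpu;
```

For 4, 8 and 16 cpus this gives an apparent 200%, 100% and 50% against an actual ceiling of 50%, 25% and 12.5%.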

The way to reduce cpu utilization is -- to use less.

Look at your low hanging fruit. If you take a query that does 5 LIO's and is executed 100,000 times -- and reduce that to 3 LIO's by removing a table access by rowid or something -- that might do it. Taking the query from heck and reducing its work -- that might do it. You are looking to remove things here, to reduce the need for this shared resource.
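
One way to go looking for that kind of statement is to rank the shared pool's SQL by logical I/O. The query below is only a sketch: it reads V$SQL, whose counters are cumulative since each cursor was loaded rather than limited to a snapshot window, and the cut-off of ten rows is arbitrary.

```sql
-- top 10 statements by total buffer gets, with gets per execution
select *
  from ( select sql_text,
                executions,
                buffer_gets,
                round( buffer_gets / executions, 1 ) gets_per_exec
           from v$sql
          where executions > 0
          order by buffer_gets desc )
 where rownum <= 10;
```

A statement with a modest gets_per_exec but a very large executions count can matter as much as the single "query from heck".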


Shared Pool Statistics

Vivek Sharma, April 04, 2003 - 7:18 pm UTC

Hi Tom,

Can you please tell us what these figures in the statspack report indicate?

Shared Pool Statistics        Begin   End
                              ------  ------
             Memory Usage %:   79.11   80.14
    % SQL with executions>1:   74.05   67.74
  % Memory for SQL w/exec>1:   32.10   18.82

What should these values be, and what is the corrective action if they are off?

April 05, 2003 - 11:40 am UTC

from spdoc.txt in rdbms/admin:

Instance Efficiency - Shared Pool Statistics are shown for the begin and
end snapshots.

o Memory Usage %: The percentage of the shared pool which is used.

o % SQL with executions>1: The percentage of reused SQL (i.e. the
percentage of SQL statements with more than one execution).

o % Memory for SQL w/exec>1: The percentage of memory used for SQL
statements with more than one execution.

This data is newly gathered by the 8.1.7 Statspack for level 5 snapshots
and above, and so will not evident if the report is run against older
data captured using the 8.1.6 Statspack.

There is no single "corrective action".

You would like ALL of your sql to be reused -- but some won't be -- just because it isn't (infrequently run queries for example).

The first number is how much of the shared pool you are using.
The second two are described above.
It is information you can use to see if you think things are happening the way they should. A really low 2nd/3rd number coupled with a low soft parse pct would indicate you are not using bind variables and are hard parsing way too much.
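
For a quick look at the soft parse percentage outside of a report, something like the following can be run. It is a sketch that assumes the percentage is derived as 100 * (1 - hard parses / total parses), and it uses V$SYSSTAT, so the result covers the period since instance startup rather than a snapshot interval.

```sql
-- soft parse % since instance startup
select round( 100 * ( 1 - h.value / t.value ), 2 ) soft_parse_pct
  from v$sysstat t, v$sysstat h
 where t.name = 'parse count (total)'
   and h.name = 'parse count (hard)'
   and t.value > 0;
```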

Too much hard parsing

Vivek Sharma, April 06, 2003 - 10:57 pm UTC

Hi Tom,
Thanks for your reply.

Please investigate this for me. The soft parse ratio is 91.43% and the last value in the Shared Pool Statistics looks very low.
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                Soft Parse %:   91.43
          Execute to Parse %: -106.97
 Parse CPU to Parse Elapsd %:   61.99

Shared Pool Statistics        Begin   End
                              ------  ------
             Memory Usage %:   79.11   80.14
    % SQL with executions>1:   74.05   67.74
  % Memory for SQL w/exec>1:   32.10   18.82

In the top SQL section of the statspack report, all the queries use literals instead of bind variables. Is this low value caused by not using bind variables? I have already asked the application developers to modify their scripts, so I believe my approach is correct. I have also seen the latch free event in the top 5 wait events, so using bind variables should reduce that as well. Am I correct?

Secondly, the Execute to Parse ratio is -106.97. So, according to your formula as I understood it, for every 2.07 parses there is 1 execution. Am I correct?

Thanks
Vivek Sharma

April 07, 2003 - 7:54 am UTC

yes, yes and yes and yes. all yes's.

yes - that soft parse ratio is very low.
yes - they are parsing more than they execute.
yes - the latch frees are almost certainly from the high parse counts

but adding binds is only 50% of the solution (a big 50% nonetheless). You must also teach them to parse ONCE and execute OVER AND OVER.
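
The 2.07 figure can be checked by hand. Assuming the report computes Execute to Parse % as 100 * (1 - parses / executions), a value of -106.97 means parses / executions = 1 - (-106.97 / 100). The second query is a sketch of the same ratio taken from V$SYSSTAT, which is cumulative since instance startup.

```sql
-- parses per execution implied by Execute to Parse % = -106.97
select round( 1 - (-106.97) / 100, 2 ) parses_per_execute
  from dual;

-- the same percentage computed from the instance-wide counters
select round( 100 * ( 1 - p.value / e.value ), 2 ) execute_to_parse_pct
  from v$sysstat p, v$sysstat e
 where p.name = 'parse count (total)'
   and e.name = 'execute count'
   and e.value > 0;
```

The first query returns 2.07. Any negative percentage means statements are being parsed more often than they are executed.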

What did you upgrade ?

Randy, April 07, 2003 - 2:47 pm UTC

I faced a similar problem when I upgraded from 7.3.4 to 8.1.7. I can share my experience if it helps. What kind of migration did you do, and how is the application written? What's the interface language?

Randy

Need Your Advice

Riaz Shahid, April 15, 2003 - 10:28 am UTC

Hello Tom !

Consider the following statspack report (for 24 minutes):

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 99.83
Buffer Hit %: 80.38 In-memory Sort %: 99.59
Library Hit %: 97.57 Soft Parse %: 82.06
Execute to Parse %: 86.54 Latch Hit %: 99.96
Parse CPU to Parse Elapsd %: 0.86 % Non-Parse CPU: 99.99

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
SQL*Net message from dblink                         4,372      121,921   91.68
db file scattered read                              9,060        2,670    2.01
db file sequential read                             7,771        2,116    1.59
db file parallel write                                192        1,349    1.01
control file parallel write                           619        1,102     .83

My questions:

(1) Why is the "Parse CPU to Parse Elapsd" ratio so low?
(2) Look at the top 5 wait events. My understanding is that db file scattered read and db file sequential read tell us something about the settings of optimizer_index_caching and optimizer_index_cost_adj. So if they are among the top 5 wait events, does that mean the current values of these parameters are wrong?

Admin@STARR.LHR> show parameter optimizer

NAME TYPE VALUE
==================================== ======= ======
optimizer_features_enable string 8.1.7
optimizer_index_caching integer 100
optimizer_index_cost_adj integer 5
optimizer_max_permutations integer 80000
optimizer_mode string CHOOSE
optimizer_percent_parallel integer 0

(3) There are before-insert triggers on some tables that run code like: select count(*) into a from t where check_code='ABC';
where ABC is fixed. My question is: will this code be reused if another user inserts data into the table that has the trigger? My understanding is that it should be reused, since it is 100% the same in both cases.

Moreover, will caching table t (which is used in the triggers) improve performance or not?

Waiting for your valuable advice

Riaz Shahid

April 15, 2003 - 10:39 am UTC

1) my question - why is your soft parse % so poor. It should be 99.5%+

You are waiting on parses, the elapsed time to parse is high relative to the cpu time for the same. This happens when you hard parse way too much (you are waiting in line to parse queries)

2) no they don't (tell you about the setting of anything). they indicate you are reading single blocks (sequential read) or multi-block io (scattered read).

Looking at your top 5 waits, they are not relevant. they are 3.6% of the waits. your major wait is message from dblink. as far as I can see, IO isn't an issue for you -- the network is.

3) the code will be reused, shared.

I would question why you need to do a select count(*) though. I find 99.999% of the time -- the code surrounding a count(*) is either

o wrong
o inefficient at best

alter table cache modifies the way the blocks are treated in the buffer cache -- it affects full scans of LARGE tables only. The way to improve that would be not to full scan a large table.

Great

Riaz Shahid, April 15, 2003 - 11:20 am UTC

Tom !

Many thanks for your quick response...

Actually we are using a 256K ISDN connection and users access the database remotely. So I agree this is due to network speed. We need to upgrade it.

The reason behind using count(*) is that we use this code:

----other code---
select count(*) into a from check_codes where check_code='ABC';
if a=1 then
------
------
------
end if;

I expect you will suggest changing it to:

----other code---
begin
select 1 into a from check_codes where check_code='ABC';
------
------
------
exception
when no_data_found then
......
....
end;

(am i right ?)
If so what will be the performance benefit.

What if I make a function for that, like:

function myfunction(chkid in varchar2) return number is
a number(1);
begin
select 1 into a from check_codes where check_id=chkid;
return a;
exception
-- there is exactly one row in check_codes table against each check_id and check_id is PK
when no_data_found then return 0;
end;

and in trigger:

...............
...............
...............
a:=myfunction('ABC');
if a=1 then
............
............
............
else
............
............
end if;

Please advise

April 15, 2003 - 11:43 am UTC

what is the PURPOSE of the count(*). why do you do it in the first place. It looks a lot like "RI" to me.

...

Riaz Shahid, April 15, 2003 - 12:12 pm UTC

Hello Tom !

cr@STARR.LHR> desc check_codes
Name Null? Type
----------------------- -------- ----------------
CHECK_ID NOT NULL VARCHAR2(7)
CHECK_DESCRIPTION VARCHAR2(200)
ENABLE_FLAG VARCHAR2(1)
AUTO_FLAG VARCHAR2(1)
ENABLED_DATE DATE
REMARKS VARCHAR2(100)

We run various checks against the rows that are being inserted into table t. Before running a check, we need to know whether that specific check is enabled or not. For that purpose we use:

select count(*) into a from check_Codes
where check_id='ABC' and enable_flag='1'

(sorry for not giving you the complete detail earlier, i.e. enable_flag)

a=1 ==> (the check is enabled)
a=0 ==> (the check is not enabled and so it should not be run)

Regards

April 16, 2003 - 9:13 am UTC

ok, this falls into the 'inefficient' arena.

Here is what you should do:

create package checks
as

g_abc_enabled boolean default false;
...
procedure ABC( .... );
....
end;
/

create package body checks
as

procedure ABC( ..... )
is
begin
if ( NOT g_abc_enabled ) then return; end if;
....
end;

BEGIN
for x in ( select 1 from check_codes
where check_id = 'ABC' and enable_flag='1' )
loop
g_abc_enabled := true;
end loop;
....
end;
/

That way -- you peek at that table ONCE, any SQL in CHECKS.ABC will be parsed once per session (different from the trigger) and the trigger would be coded as:

begin
if ( checks.g_abc_enabled ) then checks.abc( ..... ); end if;
end;

To Riaz ....

J. Laurindo Chiappa, April 16, 2003 - 7:01 am UTC

PMFJI, but as I understand, you do ** NOT ** want to know the EXACT number of records with the count, you ONLY want to know IF at least one record exists, right ? So, *** WHY *** to use COUNT and read the table in full ? Consider :

scott@PO8I:SQL>select count(*) from big_table where c1=1;

  COUNT(*)
----------
         1


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'BIG_TABLE'




Statistics
----------------------------------------------------------
          0  recursive calls
         23  db block gets
       3140  consistent gets
       3140  physical reads
          0  redo size
        375  bytes sent via SQL*Net to client
        425  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

scott@PO8I:SQL>select 1 from big_table where c1=1 and rownum =1;

         1
----------
         1


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   COUNT (STOPKEY)
   2    1     TABLE ACCESS (FULL) OF 'BIG_TABLE'
   
Statistics
------------------------------------------------------
          0  recursive calls
          4  db block gets
          1  consistent gets
          8  physical reads
          0  redo size
        368  bytes sent via SQL*Net to client
        425  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

scott@PO8I:SQL>


[]s

 Chiappa

to Chiapa...

Riaz Shahid, April 16, 2003 - 7:56 am UTC

Thanks, Chiappa, for your help. But consider:

cr@STARR.LHR> set autotrace on
cr@STARR.LHR> select count(*) from check_codes
2 where check_id='200052I';

COUNT(*)
==========
1

Elapsed: 00:00:00.97

Execution Plan
==========================================================
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=8)
1 0 SORT (AGGREGATE)
2 1 INDEX (UNIQUE SCAN) OF 'PK_CHECK_CODE' (UNIQUE)

Statistics
==========================================================
0 recursive calls
0 db block gets
1 consistent gets
0 physical reads
0 redo size
367 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

cr@STARR.LHR> ed
Wrote file afiedt.buf

1 select 1 from check_codes
2 where check_id='200052I'
3* and rownum=1
cr@STARR.LHR> /

1
==========
1

Elapsed: 00:00:00.56

Execution Plan
==========================================================
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=8)
1 0 COUNT (STOPKEY)
2 1 INDEX (UNIQUE SCAN) OF 'PK_CHECK_CODE' (UNIQUE)

Statistics
==========================================================
0 recursive calls
0 db block gets
1 consistent gets
0 physical reads
0 redo size
360 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

As autotrace shows, there's no difference between the first and second approaches. What's your opinion about it?

Thanks

for Chaiapa

Riaz Shahid, April 16, 2003 - 8:03 am UTC

Please add this to above comment:

"and the table check_codes is not a big table...it contains only 66 rows. Further it is not expected that it will grow much more (maximum 1 row per 5 months)."

Regards

Riaz

Flip switch after update...

Jan, April 16, 2003 - 9:49 am UTC

Keep in mind that the global variable g_abc_enabled will not change if you update check_codes.enable_flag. You might want to add a procedure in that package to do that and add an after_update trigger on check_codes.enable_flag if that's a concern.
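
A minimal sketch of the kind of procedure Jan describes, trimmed down to the flag handling (procedure ABC and the other checks are left out). The procedure name refresh is hypothetical; check_codes, check_id and enable_flag are the table and columns shown earlier in this thread.

```sql
create or replace package checks
as
    g_abc_enabled boolean default false;

    -- hypothetical helper: re-read CHECK_CODES on demand
    procedure refresh;
end checks;
/

create or replace package body checks
as
    procedure refresh
    is
    begin
        -- start from "disabled", then flip the flag if an enabled row exists
        g_abc_enabled := false;
        for x in ( select 1
                     from check_codes
                    where check_id = 'ABC'
                      and enable_flag = '1' )
        loop
            g_abc_enabled := true;
        end loop;
    end refresh;

begin
    -- initialization section: runs once, the first time a session touches the package
    refresh;
end checks;
/
```

Package variables are private to each session, so a call such as "exec checks.refresh" from SQL*Plus changes the flag only in the session that makes the call. Other connected sessions keep their own copy until they call it themselves or reconnect.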

Some comments more , to Riaz

J. Laurindo Chiappa, April 16, 2003 - 10:16 am UTC

Well, my line of thinking was: COUNT always MUST read until the end of the table, SO asking to read just the first record (the rownum in my case, OR a cursor reading just the one matching the key) avoids that. BUT 66 rows means just a handful of blocks, sooo small, so you will probably not see any RADICAL performance improvement here...
Anyway, by all means follow Tom's tip and implement it in stored PL/SQL. As he said, it will help in reducing parses (*** ALWAYS *** a very very good thing), and that's it: you will be doing the best possible regarding performance/scalability. Maybe the bottleneck in your system resides somewhere else (network, some other SQL, hardware usage...) and this action will not improve anything greatly, BUT at least you are doing all the things at your disposal to alleviate it.

[]s

Chiappa

???

Riaz Shahid, April 16, 2003 - 10:41 am UTC

Tom !

I was a little confused by your solution. I want the package to be dynamic, i.e.:

give it a check_id and the procedure should tell whether this check is enabled or not, and that's it. Your code is static (i.e. only for check_code='ABC').

I don't know very much about PL/SQL (like global variables in packages), so I am a bit confused by your code.

Please advise

April 16, 2003 - 11:02 am UTC

you have a set KNOWN universe of check codes (you must, else you would not have any code to execute for the CHECK)...

So, the "...." in the code represent you setting booleans for check='XYZ', check='LMN' and so on

If you don't know about packages and global variables et al -- time to read that plsql book (it's short). Without that knowledge, you'll not be writing efficient code in any regard.

all my code does is set a bunch of booleans -- driven by a lookup table -- to tell us whether a check is enabled or not in our session.

Great

Riaz Shahid, April 19, 2003 - 12:28 pm UTC

Hello Tom !

You are really great.

I am now seriously thinking about implementing your solution. But what is meant by (as Jan said):

"Keep in mind that the global variable g_abc_enabled will not change if you update check_codes.enable_flag. You might want to add a procedure in that package to do that and add an after_update trigger on check_codes.enable_flag if that's a concern. "

Does it mean that if a user is inserting into the table and I enable some check during the insertion, then that check will not run for that insertion (I mean on the records inserted after the update of enable_flag for that check)?

OR

does it mean that if I enable a check, then all the users have to reconnect to the DB in order for that check to be run on the rows?

OR

does it mean I'll have to do "something" (like writing a routine), or else that check won't be run on any records (no matter whether in the current session or later ones)?

Regards

April 19, 2003 - 1:03 pm UTC

a trigger would be meaningless in this context.

This package would set its values ONCE per session -- which, for a data load, should be more than sufficient. Why in the middle of a load would you turn on a check -- only for part of it.

If you understand how global package variables work -- what I've done above, you can make it do whatever you need it to do.

Performance problem after upgrade

GM, May 08, 2003 - 9:28 am UTC

Hi Tom,

I have also run into performance problems after an upgrade and would like to ask for your help. One SQL statement (create table ... as select .... from .... where...) now needs about 7 hours to finish, versus 1.5 hours before the upgrade. The statement uses a db link to tables in remote databases. I use statspack to collect statistics, but can you tell me exactly what to look for in the report, and which values? I am still not a very good report reader and need to gain more experience. I took snapshots during this create statement, and this is an extract from the report, the top 5 events:
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
SQL*Net message to dblink                       2,341,402     617,778    85.75
SQL*Net message from dblink                     2,341,403      28,891     4.01
db file sequential read                         1,969,684      22,457     3.12
direct path write                                  99,163      15,543     2.16
direct path read                                  502,504      12,232     1.70
Isn't the wait value for SQL*Net message to/from dblink too high? What else should I look for in the report?

many thanks in advance

May 08, 2003 - 10:01 am UTC

do you have the plans from before and after for this CTAS?

CTAS

A reader, May 08, 2003 - 10:50 am UTC

No, to my great regret! I tried to execute this CTAS in another database (I just copied the tables used in the statement) and the result was the same, about 8 hours. Maybe I should install 8i on a test machine, import the required tables and see how much time the CTAS needs. But what could cause this long wait?

thanks and regards,

May 08, 2003 - 6:15 pm UTC

look at the plan, does the current plan look "reasonable"?

CTAS - more info

GM, May 08, 2003 - 11:17 am UTC

After some time, the status of the session executing this CTAS in the base database is
ACTIVE, while in the remote database it is INACTIVE. In the v$session view in the base
database, the seq# for this session is growing constantly and the value for the
event is 'SQL*Net message from dblink'.
In v$open_cursor there are a few rows for this SID. The value of SQL_TEXT for
one of them is SAVEPOINT DOA__LOCKPOINT - I have never seen it before.

regards,

CTAS

GM, May 09, 2003 - 9:44 am UTC

Hi Tom,

Now everything seems OK - I just deleted and re-created the statistics for all tables.

regards,

Steve, May 13, 2003 - 12:13 pm UTC

Hi Tom

Following on from the first post above, we have a production database where the 'Rollback per transaction %' is constantly in the 70%-80% range. How can I go about investigating the user(s)/process(es) that will be causing this?
Thanks.
Oh, we're on 8.1.7. and W2K (ugh!) ;-)

May 13, 2003 - 4:56 pm UTC

you can look at v$sesstat -- join to v$statname and look for "user rollbacks"

find sessions that have lots, find out what apps those sessions are running.
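
That lookup might be written as below. It is a sketch: the choice of columns from V$SESSION is just one reasonable set for identifying the application, and the figures only cover sessions that are still connected.

```sql
-- sessions ranked by the number of rollbacks they have issued
select s.sid,
       s.username,
       s.program,
       st.value user_rollbacks
  from v$sesstat st, v$statname sn, v$session s
 where sn.name = 'user rollbacks'
   and st.statistic# = sn.statistic#
   and s.sid = st.sid
   and st.value > 0
 order by st.value desc;
```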

Waits for dblink

David Pujol, May 14, 2003 - 2:58 pm UTC

Hi Tom, I am studying a system with a lot of "SQL*Net message from dblink" waits (60% of response time). I'd like to know what ways exist to minimize these waits. Can I count these waits as idle waits?

Thanks and regards

Sean, May 21, 2003 - 11:48 am UTC

I got the following from statspack:
            Snap Id     Snap Time      Sessions Curs/Sess Comment
            ------- ------------------ -------- --------- -------------------
Begin Snap:     988 21-May-03 09:49:37    2,156      25.6
  End Snap:     989 21-May-03 10:04:40    2,160      25.6
   Elapsed:               15.05 (mins)

It says there were more than 2,000 sessions at both the begin and the end snapshot. That was not the case: I checked over and over, and we have never had more than 150 database sessions, active and inactive combined, at any given time. The DB is 9.1.0.3.0.

Thanks

May 21, 2003 - 2:54 pm UTC

do a sanity check on your logons current statistic in v$sysstat then. it just pulls right from there on the snap.

Sean, May 21, 2003 - 3:14 pm UTC

""
do a sanity check on your logons current statistic in v$sysstat then.  it just 
pulls right from there on the snap.
""

On my 9iR1:
SQL> select count(*) from v$session;

  COUNT(*)
----------
       132

SQL> select * from v$sysstat where name like '%logon%';

STATISTIC# NAME                           CLASS      VALUE
---------- ------------------------- ---------- ----------
         0 logons cumulative                  1      12122
         1 logons current                     1       2194

There is an inconsistency between v$session and v$sysstat here; the session count from v$session should be the correct one for my database.

On my 9i R2

SQL> select count(*) from v$session;

  COUNT(*)
----------
        79

SQL> select * from v$sysstat where name like '%logon%';

STATISTIC# NAME                           CLASS      VALUE
---------- ------------------------- ---------- ----------
         0 logons cumulative                  1     798604
         1 logons current                     1         78

They were both correct.

Do you think I should file a TAR for the R1?

Thanks

May 21, 2003 - 3:21 pm UTC

sure, if it is wrong, it is wrong. I see a documented issue with that statistic in conjunction with parallel query.

RE: logons current

Mark A. Williams, May 21, 2003 - 3:33 pm UTC

As a possibility, check out bug #2327249 (V$SYSSTAT SHOWS HUGE VALUES) - fixed in 10i

Searching metalink for "v$sysstat logons current" turned up a lot of bug reports for high values in v$sysstat.

HTH,

Mark

any doc

Reader, August 22, 2003 - 11:23 pm UTC

Tom, could you provide any link to doc to interpret the terminology in statspack report. For example, what do the following terminology mean:

waits (is it like these many times server process waited for a particular event?)

timeouts means?

Thanks.

August 23, 2003 - 10:23 am UTC

don't over analyze things here. timeouts and waits are exactly what you think they are. they were WAITS and timeouts waiting for something.

Waits:

I waited for IO (db file scattered read for example).
I waited for Latches (latch free for example)
I waited N times, for so many seconds for that resource.

Timeouts:

Timeouts mean "i was waiting. I was willing to wait only so long. I timed out waiting, what I was waiting for wasn't available in the period of time I was willing to wait"
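
Both counters can be seen side by side in V$SYSTEM_EVENT. The query below is a small sketch; the figures are cumulative since instance startup and the time is reported in centiseconds.

```sql
-- waits, timeouts and time waited per event, worst first
select event,
       total_waits,
       total_timeouts,
       time_waited
  from v$system_event
 order by time_waited desc;
```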

reason for waits

Reader, August 23, 2003 - 10:02 am UTC

Tom, what are the possible reasons that server process waits for events like db file sequential read and db file scattered read? Thanks.

August 23, 2003 - 12:06 pm UTC

your disks do not instantaneously return data upon request.

so we wait.

db file sequential reads are usually caused by index accesses (single block IO)

db file scattered reads are usually caused by scanning (multi-block IO)
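
To watch the difference as it happens, the sessions currently waiting on either event can be listed from V$SESSION_WAIT. This is a sketch; it relies on the P1, P2 and P3 columns holding the file number, starting block number and block count for these two events, and the aliases are made up for readability.

```sql
-- sessions waiting on a read right now; block_cnt is 1 for single block IO
select sid,
       event,
       p1 file_no,
       p2 block_no,
       p3 block_cnt
  from v$session_wait
 where event in ( 'db file sequential read', 'db file scattered read' );
```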

who decides how much time should a process wait ?

reader, August 23, 2003 - 11:40 am UTC

"I timed out waiting, what I was waiting for wasn't available in the period of time I was
willing to wait"

Does Oracle have a default timeout value, in milliseconds for example, after which it lets a process time out and try again? If I see a large number in the timeouts column, for example for the db file sequential read or scattered read events, what would be the reason and how should one approach it? Thanks.

August 23, 2003 - 12:15 pm UTC

we decide how long to wait for what and where.

enqueues time out every 3 seconds for example.
latches based on a spin

not all waits have timeouts.

larger number than what? no single number by itself means anything. large waits on IO related events mean you want to look for techniques to

o reduce the amount of io you do
o make your io faster

What does it mean by latches based on a spin? Thanks.

Mani, August 23, 2003 - 12:56 pm UTC

August 23, 2003 - 3:33 pm UTC

when we go to get a latch

and we cannot get it right away.

we "spin" -- do a tight loop if you will -- just burning some cpu -- to see if after we spin a bit, we can get the latch. if after spinning some number of times, we don't get it, we go to sleep to be timed out (woken up) later to try again. These are "sleeps"
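
The outcome of that spinning is visible per latch in V$LATCH. The query below is a sketch: misses counts requests that did not get the latch on the first try, spin_gets counts misses that were resolved while spinning, and sleeps counts the times a process gave up spinning and slept.

```sql
-- latches that were missed at least once, most sleeps first
select name,
       gets,
       misses,
       spin_gets,
       sleeps
  from v$latch
 where misses > 0
 order by sleeps desc;
```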

how to set spin_count?

reader, August 23, 2003 - 4:19 pm UTC

http://download-west.oracle.com/docs/cd/A87860_01/doc/server.817/a76992/ch18_cpu.htm#4404

from the above link, "In some cases, the spin count may be set too high." How to find out what is set? Thanks.

August 23, 2003 - 6:30 pm UTC

i would ignore that spurious comment. no need to change it.

CPU data

Reader, August 24, 2003 - 10:39 am UTC

Tom, in the statspack report, is "CPU used by this session" value in column "per second" and "per transaction" for n number of CPUs? If so, should I divide that value by n to get a value for one CPU?

How accurate is the data Oracle gets on CPU consumption? Thanks.

August 24, 2003 - 11:44 am UTC

it is the amount of cpu used by those sessions -- aggregate over the number of cpus.

you cannot really "divide", not sure what meaning you would attempt to even derive from that? one of the cpus might have been 100% and the other 10% or both at 55%. You can only multiply up, not really divide down.

we get this info from the OS itself.

What does it mean cpu used when call started? Thanks

Ramu, August 24, 2003 - 1:51 pm UTC

August 24, 2003 - 4:52 pm UTC

the amount of cpu time you had already used before the current call (if any) started.

current call means the start of the snapshot?

Reader, August 24, 2003 - 6:12 pm UTC

August 24, 2003 - 9:03 pm UTC

start of call

call?

Venkat, August 24, 2003 - 9:41 pm UTC

If both CPU used by the session and CPU used when call started have the same value, how do I interpret them? Thanks.

August 25, 2003 - 6:30 am UTC

consider them the same for all intents and purposes.
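
The two statistics can be compared directly in V$SYSSTAT. This is a sketch that assumes, as the earlier calculation in this thread does, that the values are recorded in centiseconds, so dividing by 100 gives seconds.

```sql
-- instance-wide CPU counters since startup
select name,
       value,
       round( value / 100 ) cpu_seconds
  from v$sysstat
 where name in ( 'CPU used by this session',
                 'CPU used when call started' );
```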

What does call mean? Is it the start of the execution of SQL? Thanks

Reader, August 24, 2003 - 9:42 pm UTC

August 25, 2003 - 6:30 am UTC

any client call to the database.

parse
execute
run this plsql block
fetch me a row
close that cursor
commit
.....

A reader, August 27, 2003 - 6:30 pm UTC

Tom,

Most web sites say to ignore "slave wait" events when tuning the system, but in my database it is at the top of the list:

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
slave wait                                          3,657      376,505   99.99
control file parallel write                           305           17     .00
io done                                                 2            2     .00
log file parallel write                                 2            2     .00
control file sequential read                           37            1     .00
-------------------------------------------------------------

Can you please elaborate on what the "slave wait" event is and what causes it?

Thanks...

August 27, 2003 - 7:26 pm UTC

it is an io slave waiting to be told to do something, yours are not doing anything so they wait and wait and wait.

you basically appear to have an idle system there.

block splits

Reader, August 28, 2003 - 5:52 pm UTC

I heard that If I do not set a proper pctfree for index, oracle might have to do block splits? How do I know whether Oracle did block splits? Is there any statistic that I can look at? Thanks.

August 29, 2003 - 9:03 am UTC

even if you set it, Oracle might have to do block splits. they are normal, natural and happen all of the time.

for every leaf and branch block you see in an index -- we split at some time.

the index starts as a single block.
it fills
splits -- becomes more than one block.
they fill, split -- becomes yet more blocks.

so, a simple count of blocks used in the index will tell you how many splits.
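
Two places to look, sketched below. The first query assumes the release exposes split counters in V$SYSSTAT under names containing "node splits" (for example "leaf node splits" and "branch node splits"); the second reads the block counts from the dictionary, which are only populated after the index has been analyzed. MY_INDEX is a placeholder name.

```sql
-- instance-wide split counters since startup (statistic names are an assumption)
select name, value
  from v$sysstat
 where name like '%node splits%';

-- height and leaf block count for one index, as of the last analyze
select index_name, blevel, leaf_blocks
  from user_indexes
 where index_name = 'MY_INDEX';
```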

Load Issue!!

Sriram, October 20, 2003 - 9:46 pm UTC

Tom,
We are running Oracle Applications 11i on Windows 2000 servers. We are having some serious performance issues during peak load. Is there a way I can identify the maximum number of users the server can take? What kind of solution would you recommend in this case? RAC?

Will statspack reveal anything that could tell me that it is a load related performance issue?

Thanks,
Sriram

October 20, 2003 - 9:51 pm UTC

well, given the level of detail here -- not too much can be said.

you might be interested in Cary Millsap's new book on amazon.

a 10 minute stats pack might be somewhat informative.

but given the info here (i can run windows on a laptop, i can run windows on bigger machines. I can have apps 11i with 5 users, I can have it with 50,000 users. Lots of room in these analog measures -- not really sure where you are)

I cannot possibly recommend anything short of "find out whats wrong", "discover if you were even in the realm of realistic in sizing". If you have services on site -- they are fairly adept at sizing exercises for something as well known as apps -- they would be the most appropriate path to take.

How to find the CPU Utilisation

Praveen, October 21, 2003 - 8:28 am UTC

Hi Tom,

I have one question.. Suppose I have a server with two databases on it. I want to find out the CPU utilised by each database, so that I can set certain parameters in the init parameter file to evenly allocate the CPU across two databases. In this way I can utilise the CPU appropriately. Or is there any other method to find the CPU utilisation by each database.

Regards,
Praveen.

October 21, 2003 - 4:59 pm UTC

there are no init.ora parameters that would accomplish this. none.

the only way to effectively control resource utilization is to -- well -- have a SINGLE INSTANCE per server. You'll never be sorry for going that way.

Praveen - Have a look at the Oracle doco for resource manager

Matt, October 21, 2003 - 6:43 pm UTC

I think Tom is correct. A single instance is the best way to achieve what you want - that and resource manager will allow you to allocate server resources amongst applications/application users.

Mat.

How well my database is doing?

Subbu, November 25, 2003 - 3:01 pm UTC

Is there some kind of report that Oracle provides, maybe as part of OEM, with some fancy charts to show to management? How can I tell others that the database is performing well based on such charts/graphs?
My experience with such things is that they vary from day to day depending on what else was happening at the same time. Are there any standard things that can be pulled from the database every day to tell or convince someone that the system is ok or not ok, and why?
I know how it sounds asking such questions, but is there a way that you can help me with this?
The key thing is that a statspack report with some numbers will not be useful here, as this is not for other DBAs to understand but for folks who know nothing about databases.
I'm looking for something ready to present that comes with OEM, or some other simple way of getting this information.
Thanks for your help.
Subbu

November 25, 2003 - 3:51 pm UTC

you cannot.

it is a fallacy to think you can.

one person's ideal system is another's nightmare system.

since it is for the pointy haired manager, any graph will do, just steal some from quotes.yahoo.com or something, they'll be as meaningful as anything else.

A real manager would want to know

o what's the end user experience
o how many transactions have I historically done, how long did they take and how are we doing now

for example -- note that NEITHER of those can be captured in the database!!! they can ONLY be captured by the application itself.

On asktom, I'm very interested in:

1) how many pages did I serve up in the last minute, hour, day, week, month
2) take 1 and compare the day to yesterday (are we up or down), the week to last week, the month to last month
3) what's the average, minimum and maximum page render time BY page type and aggregated
4) how many users have i had in the last hour, day, week, month
5) take numbers 3 and 4 and compare to historical values

etc -- as you can see, none of them are database metrics, they are APPLICATION metrics through and through. This is what I measure, this is what I report on. It is the only meaningful thing.

Tell me -- you show a graph that shows the cache hit ratio in the database was 99.99%. Does that translate into anything "useful" when reporting on how well your application (and remember, it is the application that is paramount, relevant here) performed?

it could be an indication of really bad performance
it could be an indication of really good performance
it could be totally irrelevant to anything (most probable)

If you have a 99.99% cache hit but the application takes 30 seconds to tab from field to field -- who cares? The end users will be telling you "it is very slow"

So, your database "doing well", nope, not going to go there (pet peeve of mine). Your application (and I'll bet you cannot get this information -- developers, for whatever reason, are loath to instrument their code) must supply you the pretty charts.

That's what I really like about html db (the tool we used to build this site). It does this all for me - the entire application is 100% instrumented so we can report on its use, performance and fine tune it when needed!
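
For an application that does not come instrumented, a minimal sketch of what "instrumenting the code" could look like in PL/SQL follows. The table app_timing_log and the procedure render_page are hypothetical names made up for illustration; dbms_application_info and dbms_utility.get_time are the supplied packages:

```sql
-- Hypothetical log table; name and columns are assumptions for illustration only.
create table app_timing_log
( page_name   varchar2(100),
  started_at  date,
  elapsed_cs  number          -- elapsed time in hundredths of a second
);

create or replace procedure render_page( p_page_name in varchar2 )
as
    l_start    number := dbms_utility.get_time;  -- ticks in 1/100 s
    l_elapsed  number;
begin
    -- Make the current activity visible in v$session (MODULE / ACTION columns).
    dbms_application_info.set_module( module_name => p_page_name,
                                      action_name => 'render' );

    null;  -- the real work of the page would go here

    l_elapsed := dbms_utility.get_time - l_start;

    -- Record one row per page view; the caller's transaction commits it.
    insert into app_timing_log( page_name, started_at, elapsed_cs )
    values ( p_page_name, sysdate, l_elapsed );

    dbms_application_info.set_module( module_name => null,
                                      action_name => null );
end;
/
```

With something like that in place, "pages served per hour" and "min/avg/max render time by page" become simple GROUP BY queries against the log table.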

Arun Gupta, November 25, 2003 - 4:58 pm UTC

Tom,
I always used to get the hash value from spreport(start id, end id) and then run sprepsql(start id, end id, hash value) to get detailed information about a sql including execution plan. Today I ran sprepsql and the report had a big list of All Optimizer Plan(s) for this Hash Value. Then there were three different plans with different plan hash value. Which plan actually represents the plan for the query? How can there be multiple plans for the same query? What is plan hash value and how do I get the plan for a specific PHV?

Thanks.

November 25, 2003 - 6:41 pm UTC

they all do.

select * from emp;

run by 50 people -- each of which own a table EMP -- 50 plans.

select * from emp;

run by 50 people using the CBO each with different sort_area_sizes -- 50 plans.

there is a view v$sql_shared_cursor that can help you pinpoint the differences (why there are child cursors)
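
To see the child cursors and their plans for one statement, a minimal sketch against v$sql, assuming :hv is bound to the hash value reported by spreport:

```sql
-- One row per child cursor; different PLAN_HASH_VALUEs mean different plans
-- for what is textually the same statement.
select child_number,
       plan_hash_value,
       parsing_user_id,
       executions
  from v$sql
 where hash_value = :hv
 order by child_number;
```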

well said but.....

Subbu, November 25, 2003 - 5:20 pm UTC

Tom,
Thanks for your response. Well said.
Our monitoring here is mature enough to capture what is going on with the application and to analyze the historical data and trends. The application even captures Oracle time and stores it in the database. The application team produces the chart to show how a particular page was performing.
Now there are a few days where there is variation in the overall response time.
It can be attributed to the database, the network, another app server, or the CPU or memory of one of the other components, etc.
What is required is some trend information for the database, to say that if I see this wait time go from x to y then there is some problem.
Thinking about this over and over, I thought of providing information about the various wait times and plotting them.
Again I have a problem here: if the usage is less, then the wait time is going to be less.
Which then leads to taking the average wait time, which may or may not tell the exact health status.
Thinking about it again and again, I wonder what would be best to share with others to make them understand that the database is doing fine.
Any thought?
Thanks
Subbu

November 25, 2003 - 6:49 pm UTC

excellent -- rare to have that happen!

you need to trace the application -- not the database. database = big thing. application = thing that has measurable performance variation.

averages are virtually useless (you might want to get Cary Millsap's new book
http://www.amazon.com/exec/obidos/tg/detail/-/059600527X/
) you want the application to spit out a trace (10046 trace) when this happens. statspack can only get you so far. it's really nice but not the right tool for this in my opinion. I'd want an application trace here.
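
A minimal sketch of turning that trace on and off from within the application's own session (level 12 includes bind values and wait events); how the application decides "when this happens" is left as an assumption:

```sql
alter session set timed_statistics = true;

-- Start extended SQL trace for this session.
alter session set events '10046 trace name context forever, level 12';

-- ... run the slow part of the application here ...

-- Stop tracing; the trace file is written to user_dump_dest.
alter session set events '10046 trace name context off';
```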

latch related question

Sudhir, November 25, 2003 - 6:51 pm UTC

Can you please comment on why so many latches are used?
And will it be a performance problem? Thanks

SQL> select * from dba_sequences where sequence_name like 'LT%';

SEQUENCE_OWNER                 SEQUENCE_NAME                   MIN_VALUE  MAX_VALUE INCREMENT_BY C O
------------------------------ ------------------------------ ---------- ---------- ------------ - -
CACHE_SIZE LAST_NUMBER
---------- -----------
LT                       LT_SEQ_NBR                        1 1000000000            1 N N
    100000    53445258


SQL> select * from stats where name like 'LATCH.lib%';

NAME                                                    VALUE
-------------------------------------------------- ----------
LATCH.library cache                                 965461615
LATCH.library cache pin                             612216914
LATCH.library cache pin allocation                  172640540
LATCH.library cache load lock                            4203

SQL> declare
  2  n number := 0;
  3  begin
  4   for i in 1..100000
  5   loop
  6    select LT.LT_seq_nbr.nextval into n from dual;
  7   end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

SQL> select * from stats where name like 'LATCH.lib%';

NAME                                                    VALUE
-------------------------------------------------- ----------
LATCH.library cache                                 965965393
LATCH.library cache pin                             612618872
LATCH.library cache pin allocation                  172642010
LATCH.library cache load lock                            4203

SQL> select 965965393-965461615, 612618872-612216914 from dual;

965965393-965461615 612618872-612216914
------------------- -------------------
             503778              401958

SQL>

November 25, 2003 - 7:14 pm UTC

latches are lightweight serialization devices

getting a sequence is going to be a tiny "one at a time" type of operation.

Also, with the default cache of 20 (as with little_cache below), you get tons and tons of recursive sql to update seq$ and commit that tiny transaction.


ops$tkyte@ORA920> create sequence little_cache;
 
Sequence created.
 
ops$tkyte@ORA920> create sequence big_cache cache 1000000;
 
Sequence created.
 
ops$tkyte@ORA920>
ops$tkyte@ORA920> exec runStats_pkg.rs_start
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> declare
  2      n number;
  3  begin
  4      for i in 1 .. 100000
  5      loop
  6          select little_cache.nextval into n from dual;
  7      end loop;
  8  end;
  9  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> exec runStats_pkg.rs_middle
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> declare
  2      n number;
  3  begin
  4      for i in 1 .. 100000
  5      loop
  6          select big_cache.nextval into n from dual;
  7      end loop;
  8  end;
  9  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> exec runStats_pkg.rs_stop(20000);
Run1 ran in 2855 hsecs
Run2 ran in 2132 hsecs
run 1 ran in 133.91% of the time
 
Name                                  Run1        Run2        Diff
STAT...session logical reads       320,725     300,569     -20,156
LATCH.shared pool                  225,252     200,295     -24,957
LATCH.row cache objects             30,038          36     -30,002
LATCH.enqueue hash chains           30,052          44     -30,008
LATCH.library cache pin            440,205     400,204     -40,001
LATCH.library cache                560,353     500,374     -59,979
LATCH.cache buffers chains         670,983     604,673     -66,310
STAT...recursive calls             170,125     100,131     -69,994
STAT...session pga memory         -131,072      65,536     196,608
STAT...redo size                 4,388,524      70,492  -4,318,032
 
Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
2,354,253   2,007,468    -346,785    117.27%
 
PL/SQL procedure successfully completed.


some amount of latching is going to be 100% unavoidable.  Your goal is to use techniques that minimize latching

Tom, thank you. One comment though

Sudhir, November 25, 2003 - 9:37 pm UTC

Even in your example, a minimum of 500k library cache latches were obtained for 100k sequence nextvals. So more like 5 latches for 1 nextval? Does the 5 then include all child latches etc.?

I like the stats view you have so much that I am using it as much as I can, just to get some idea of various things that are hard to understand.

Thanks

November 26, 2003 - 7:41 am UTC

you were running SQL, forget about the sequence for a moment.  You did stuff in the database and virtually everything is going to take some latching.  



ops$tkyte@ORA920> exec runStats_pkg.rs_start
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> declare
  2      n number;
  3  begin
  4      for i in 1 .. 100000
  5      loop
  6          select i into n from dual;
  7      end loop;
  8  end;
  9  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> exec runStats_pkg.rs_middle
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> declare
  2      n number;
  3  begin
  4      for i in 1 .. 100000
  5      loop
  6          select big_cache.nextval into n from dual;
  7      end loop;
  8  end;
  9  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> exec runStats_pkg.rs_stop(20000);
Run1 ran in 2063 hsecs
Run2 ran in 2415 hsecs
run 1 ran in 85.42% of the time
 
Name                                  Run1        Run2        Diff
STAT...session pga memory max       65,536     131,072      65,536
LATCH.shared pool                  100,170     200,263     100,093
STAT...session pga memory         -131,072           0     131,072
LATCH.library cache pin            200,106     400,223     200,117
LATCH.sequence cache                     0     300,002     300,002
LATCH.library cache                200,228     500,418     300,190
 
Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
1,106,842   2,008,294     901,452     55.11%
 
PL/SQL procedure successfully completed.

Related to Statspack

Sudhir, November 26, 2003 - 6:29 am UTC

Is there some way to save statspack data to a remote database rather than the current one where you are collecting it? That really would open up a door for us. We have many instances and each takes up gigs of space for statspack, partly due to a 6-month to 1-year retention policy for statspack data. And then we have extensions running as well.

Thank you

November 26, 2003 - 7:50 am UTC

statspack is just a bunch of tables and a script to query them.

Sure, you could export/import it into various schemas on another instance -- but -- it'll take the same amount of space at the end of the day?

I would (for ease of use reasons) just leave it where it is. Adding a couple of gigs to a database isn't going to "hurt" it.

Arun Gupta, November 26, 2003 - 8:51 am UTC

This is a production system and we have only one copy of each table. We use pga_aggregate_target and WORKAREA_SIZE_POLICY=AUTO. The only other thing I can think of: can FGAC cause this behaviour?

The other thing I observed is that statspack queries the Oracle version when spreport is run. Snapshots taken with version 9.2.0.3 can show up as 9.2.0.4 because I upgraded. Is this correct?

Thanks...

<quote>
they all do.

select * from emp;

run by 50 people -- each of which own a table EMP -- 50 plans.

select * from emp;

run by 50 people using the CBO each with different sort_area_sizes -- 50 plans.

there is a view v$sql_shared_cursor that can help you pinpoint the differences
(why there are child cursors)
</quote>

November 26, 2003 - 10:34 am UTC

look in the view i told you about, it'll tell you why.

FGAC can definitely do this -- IF your predicate policy returns N different predicates, you have N different queries that all "look" the same in v$sql but are different under the covers (the hidden predicates)

statspack is just reporting the version of the database it is connected to, so yes, 9204 would be "correct"

Arun Gupta, November 26, 2003 - 1:24 pm UTC

Sorry, didn't mean to ignore your advice. I did check the v$sql_shared_cursor view and was trying to interpret the results among other things. I ran the query as:
SELECT
*
FROM
v$sql_shared_cursor
WHERE
kglhdpar in ( select address from v$sql where hash_value = 3949466389)

The AUTH_CHECK_MISMATCH=Y. Can this be due to FGAC? This database uses FGAC extensively and the predicates do vary. The different plans shown by sprepsql also vary quite a bit.

One more question. If sprepsql reports 6 distinct plans, then why are not all plans shown in the report? The report only had two or three plans.

Thanks

November 26, 2003 - 2:51 pm UTC

AUTH_CHECK_MISMATCH means different users accessing different objects (the select * from emp example where there is more than one emp table)

or the predicate policy references other tables so the comprehensive set of tables in the query is different from query to query.

statspack only shows that which was executed during the time measured. if the other 3 or 4 plans were not used....

Is statspack right tool for DW also?

Arun Gupta, December 18, 2003 - 3:35 pm UTC

Tom,
Can I use statspack as the starting point to identify resource intensive queries in a DW database? For OLTP databases, I take snapshots not more than 15 minutes apart. Is this ok for DW also? Please give me some pointers to keep in mind when tuning DW queries.

Thanks.

December 18, 2003 - 4:16 pm UTC

you can, but you'll probably just want to look at v$sql directly as you'll have much less shared sql in a DW (more one off queries) and will need to look for patterns. a tool like OEM might help you here as well.
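
A minimal sketch of looking at v$sql directly for the heaviest statements by logical I/O. The cut-off of 10 rows is arbitrary, and the numbers are cumulative for as long as each statement has been in the shared pool:

```sql
select *
  from ( select sql_text,
                buffer_gets,
                disk_reads,
                executions,
                -- greatest() avoids divide-by-zero for statements not yet executed
                round( buffer_gets / greatest( executions, 1 ) ) gets_per_exec
           from v$sql
          order by buffer_gets desc )
 where rownum <= 10;
```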

Negative soft parse ratio!

A reader, December 18, 2003 - 6:00 pm UTC

Hi Tom,
Snippet from statspack report shows negative soft parse ratio as below:
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait Ratio: 99.99
Buffer Hit Ratio: 92.57
Library Hit Ratio: 91.21
Redo NoWait Ratio: 100.00
In-memory Sort Ratio: 100.00
Soft Parse Ratio: -1,356.50
Latch Hit Ratio: 98.16

What could be the reason for this?
Thanks

December 18, 2003 - 6:32 pm UTC

the formula is

round(100*(1-:hprs/:prse),2) pctval

so, look UP in your report to:

Parses: 0.24 53.50
Hard parses: 0.00 0.00

and based on the number of seconds you ran for -- determine how many hard and how many parses you did and we'll work from there.

More info...

A reader, December 18, 2003 - 6:48 pm UTC

Thanks for quick response.
Yes, there are more hard parses, as below:

Start Id End Id Start Time End Time (Minutes)
-------- -------- -------------------- -------------------- -----------
82 85 19-Dec-03 08:57:21 19-Dec-03 09:42:44 45.38

Cache Sizes
~~~~~~~~~~~
db_block_buffers: 32768
db_block_size: 16384
log_buffer: 10485760
shared_pool_size: 100M

Load Profile
~~~~~~~~~~~~
Per Second Per Transaction
--------------- ---------------
Redo size: 2,377.99 4,072.49
Logical reads: 415.02 710.75
Block changes: 8.47 14.50
Physical reads: 30.83 52.80
Physical writes: 0.12 0.21
User calls: 5.82 9.96
Parses: 4.32 7.39
Hard parses: 62.89 107.71
Sorts: 1.70 2.90
Transactions: 0.58

Rows per Sort: 47.03
Pct Blocks changed / Read: 2.04
Recursive Call Pct: 94.05
Rollback / transaction Pct: 0.00

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait Ratio: 99.99
Buffer Hit Ratio: 92.57
Library Hit Ratio: 91.21
Redo NoWait Ratio: 100.00
In-memory Sort Ratio: 100.00
Soft Parse Ratio: -1,356.50
Latch Hit Ratio: 98.16
It is a DW system, and I ran the report at the peak morning time.
What should I do to improve this, given it is a DW system?

December 19, 2003 - 6:54 am UTC

one of the counters must have rolled over -- ignore this report, do it at another peak.

In a DW, hard parsing is *normal* and expected. You are running queries that take many seconds (minutes/hours) instead of many queries per second. Hard parsing is not only OK in a DW, but many times mandatory (to take full advantage of histograms, star transformations and the like)
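
As a worked check of the formula quoted earlier, round(100*(1-:hprs/:prse),2), the per-second rates from the Load Profile above can be plugged in directly. This is only a sketch using the rounded rates (4.32 parses and 62.89 hard parses per second), so it lands near the reported -1,356.50 rather than exactly on it:

```sql
-- hard parses / parses = 62.89 / 4.32, which is about 14.56,
-- so 100 * (1 - 14.56) comes out at roughly -1,356
select round( 100 * ( 1 - 62.89 / 4.32 ), 2 ) soft_parse_pct
  from dual;
```

Whenever hard parses exceed total parses the ratio goes negative. Real counts cannot do that (every hard parse is also a parse), which is why the inputs look suspect.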

Still negative values

A reader, December 22, 2003 - 8:28 pm UTC

Tom,
I ran at various times, and every time I got negative soft parse ratio:
STATSPACK report for

DB Name DB Id Instance Inst Num Release OPS Host
---------- ----------- ---------- -------- ---------- ---- ----------
PEDW 1372283050 pedw 1 8.1.6.3.0 NO laedw001

Snap Length
Start Id End Id Start Time End Time (Minutes)
-------- -------- -------------------- -------------------- -----------
468 470 23-Dec-03 09:00:13 23-Dec-03 09:30:21 30.13

Cache Sizes
~~~~~~~~~~~
db_block_buffers: 32768
db_block_size: 16384
log_buffer: 10485760
shared_pool_size: 100M

Load Profile
~~~~~~~~~~~~
Per Second Per Transaction
--------------- ---------------
Redo size: 2,570.95 4,960.81
Logical reads: 614.46 1,185.64
Block changes: 9.15 17.66
Physical reads: 46.62 89.96
Physical writes: 0.14 0.27
User calls: 5.18 9.99
Parses: 5.24 10.11
Hard parses: 204.17 393.95
Sorts: 2.62 5.06
Transactions: 0.52

Rows per Sort: 45.94
Pct Blocks changed / Read: 1.49
Recursive Call Pct: 96.06
Rollback / transaction Pct: 0.00

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait Ratio: 99.99
Buffer Hit Ratio: 92.41
Library Hit Ratio: 91.67
Redo NoWait Ratio: 100.00
In-memory Sort Ratio: 100.00
Soft Parse Ratio: -3,795.84
Latch Hit Ratio: 97.21

__________________
Further, I get lots of waits for library cache pin. How can I reduce this?
Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
library cache pin                                 639,106      229,914   78.42
latch free                                        901,144       50,436   17.20
library cache load lock                                65        7,402    2.52
db file scattered read                              1,503        4,614    1.57
log file sync                                         913          452     .15
-------------------------------------------------------------

December 23, 2003 - 9:49 am UTC

look further down and see what numbers are being fed into this -- also, is there a chance that some of the v$ numbers have "wrapped around"? (they are in general 32 bit integers, so sometimes on a DB that has been up for a long time they "roll around, go negative, approach zero and go for another loop". that can easily cause things like this)
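
To see the raw numbers being fed into the ratio, a minimal sketch against the statspack repository, assuming the default PERFSTAT schema and the snapshot ids 468 and 470 from the report above:

```sql
-- If the value for a statistic is lower (or negative) in the later snapshot,
-- or hard parses grew by more than total parses, the counter is suspect.
select snap_id, name, value
  from perfstat.stats$sysstat
 where snap_id in ( 468, 470 )
   and name in ( 'parse count (total)', 'parse count (hard)' )
 order by name, snap_id;
```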

trace vs. perception

j., December 23, 2003 - 10:58 am UTC

hi tom,

we've traced a procedure that used to run for some hundred seconds but now consistently takes about 12000 seconds, and got the following summary for such a long run:

********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     0.00       0.00          0          0          0          0
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        2     0.00       0.00          0          0          0          0

Misses in library cache during parse: 0

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse       24     0.09       0.06          0          0          0          0
Execute 102853     8.62       7.97          0      14678        915       2348
Fetch   102848    12.27      11.57          0     963803          0      72465
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total   205725    20.98      19.61          0     978481        915      74813

Misses in library cache during parse: 4

19 user SQL statements in session.
7 internal SQL statements in session.
26 SQL statements in session.
********************************************************************************
Trace file: deut_ora_338494.trc
Trace file compatibility: 9.00.01
Sort options: default

1 session in tracefile.
19 user SQL statements in trace file.
7 internal SQL statements in trace file.
26 SQL statements in trace file.
21 unique SQL statements in trace file.
206127 lines in trace file.

from this we (DBAs are involved) can't figure out the reason why it is taking that much time. we double-checked the structures and the data processed, but didn't find a clue ...

so we now don't have an idea how to go on.

what would you suggest?

December 23, 2003 - 11:57 am UTC

according to that, it took 20 seconds in the database.

means -- database not the issue. the network OR the application is the issue.

that's why it is some kind of mystery to us

j., December 25, 2003 - 6:15 am UTC

we have a 100% server side application. we only submit an anonymous block containing the call of one package procedure. the executed code processes "local" data only (no database links involved).

one thing is making us (very) confused: the trace file grows to nearly its final size in the first few minutes although the application is still running for hours after that (we examined active statements) ...

December 25, 2003 - 10:11 am UTC

what is your max dump file size, and is the last line in the trace file "hey, this file was truncated"?
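
A minimal sketch of checking the limit and lifting it for the session being traced. If the trace file stops growing after a few minutes while the job keeps running, a size limit is the first thing to rule out:

```sql
-- Current limit (in OS blocks unless suffixed with K or M, or UNLIMITED).
select name, value
  from v$parameter
 where name = 'max_dump_file_size';

-- Lift the limit in the session that will be traced, before enabling the trace.
alter session set max_dump_file_size = unlimited;
```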

what's going on with my database

A reader, December 25, 2003 - 6:49 pm UTC

Hi

I am looking at a statspack report from last month from my database and I am puzzled

Snap Id Snap Time Sessions
------- ------------------ --------
Begin Snap: 2612 13-Nov-03 16:48:38 340
End Snap: 2613 13-Nov-03 17:22:27 340
Elapsed: 33.82 (mins)

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
enqueue                                               657      200,756   70.10
db file sequential read                           109,537       33,992   11.87
latch free                                         40,310       20,274    7.08
db file scattered read                             79,659       15,955    5.57
log file sync                                       6,827        5,080    1.77

Statistic Total per Second per Trans
--------------------------------- ---------------- ------------ ------------
CPU used by this session 234,822 115.7 20.4
CPU used when call started 234,905 115.8 20.5

How come my waits are using more CPU than my system...?! 200 seconds of waits in 33 minutes snap, my enqueues must be very bad, there must be many users waiting huh

This application uses BLOBs, maybe that is the problem?

December 25, 2003 - 7:37 pm UTC

that is not 200 seconds, that is 2007 seconds of wait time for enqueues.

that could have been one user waiting pretty much the entire time (2029 seconds)...

or it could have been 2000 users each waiting a second....

or any number/combo in between.

Now, enqueues are LOCKS (select * from emp for update in session one. Now, do the same in session 2, session 2 blocks. it'll accumulate enqueue waits until session 1 commits or rolls back)

this has nothing to do with blobs. you have a concurrency issue (maybe -- if just one guy was blocked the entire time, you might just have had a strange occurrence. I lean towards that -- that one user was blocked for the entire time -- but only because the numbers are so close, the total time and the total wait time)

Look for blockers and blockees.
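
For reference, the enqueue figure is in centiseconds: 200,756 cs / 100 is about 2,007 seconds, against a snapshot window of 33.82 minutes x 60, about 2,029 seconds. To look for blockers and blockees while the problem is happening, a minimal sketch against v$lock follows. It only shows locks held right now, not history, so it has to be run during the slowdown:

```sql
-- A holder has BLOCK = 1; a waiter has REQUEST > 0 on the same resource (ID1, ID2).
select blocker.sid   blocking_sid,
       waiter.sid    waiting_sid,
       waiter.type   lock_type,
       waiter.id1,
       waiter.id2
  from v$lock blocker,
       v$lock waiter
 where blocker.block  = 1
   and waiter.request > 0
   and blocker.id1    = waiter.id1
   and blocker.id2    = waiter.id2;
```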

I meant 2000 seconds

A reader, December 25, 2003 - 6:55 pm UTC

waits in 33 minutes snap

sorry!

Understanding Statspack.

noel seq, January 20, 2004 - 6:54 am UTC

I have learnt immensely from this site. With regard to analyzing statspack reports, kindly throw some light on the following extract of my statspack report; I would like to know if I am on the right track.

Although the soft parse ratio is good at 99.8%, I find that the logical reads per transaction at 50,000 seems to be very high. Is this the logical I/O you always talk about, which is further reflected in the SQL query part of the statspack report in the 'gets per exec' column? Should I start by tuning the queries with high 'gets per exec' first?

Secondly, I find that the rollback per transaction is 39%, which again seems high. How do I identify which queries need tuning? Or is it that I keep tuning those queries with high 'gets per exec' and my rollback per transaction will reduce?

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 3,866.26 21,026.65
Logical reads: 9,789.83 53,242.02
Block changes: 18.28 99.42
Physical reads: 2,196.84 11,947.51
Physical writes: 6.44 35.01
User calls: 116.91 635.84
Parses: 18.54 100.84
Hard parses: 0.04 0.20
Sorts: 2.16 11.73
Logons: 0.08 0.46
Executes: 26.18 142.41
Transactions: 0.18

% Blocks changed per Read: 0.19 Recursive Call %: 10.78
Rollback per transaction %: 39.04 Rows per Sort: 24.98

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 77.56 In-memory Sort %: 99.86
Library Hit %: 99.63 Soft Parse %: 99.80
Execute to Parse %: 29.19 Latch Hit %: 99.98
Parse CPU to Parse Elapsd %: 83.24 % Non-Parse CPU: 99.99

SQL ordered by Gets

Buffer Gets Executions Gets per Exec % Total Hash Value
--------------- ------------ -------------- ------- ------------
3,917,864 5 783,572.8 39.4 1676175757
select i.branch_cd , i.series_prefix, i.serial_no, i.execution_
dt,i.settlement_type,i.settlement_no from instruction i where
i.instruction_no in (select id.instruction_no from instruction_
detail id where i.depos_cd = id.depos_cd and i.dp_id = id.dp_i
d and i.client_cd = id.client_cd and i.instruction_no = id.ins

929,943 3 309,981.0 9.3 2369995686
SELECT /*+ RULE*/ c.dep_ac_no, h.holder_nm holder_nm, a.addr1, a
.addr2, a.addr3, a.city, a.state, a.country, a.zip, a.phone1, a.
phone2, a.phone3, a.fax, a.email, c.client_cd,c.client_stat FROM
client c, client_address a, client_holders h WHERE c.depos_cd =
h.depos_cd AND c.dp_id = h.dp_id AND c.client_cd = h.client_cd

767,158 3 255,719.3 7.7 921042803
SELECT /*+ RULE*/ c.dep_ac_no, h.holder_nm holder_nm, a.addr1, a
.addr2, a.addr3, a.city, a.state, a.country, a.zip, a.phone1, a.
phone2, a.phone3, a.fax, a.email, c.client_cd,c.client_stat FROM
client c, client_address a, client_holders h WHERE c.depos_cd =
h.depos_cd AND c.dp_id = h.dp_id AND c.client_cd = h.client_cd

Kind Regards,
Noel.

January 20, 2004 - 8:21 am UTC

LIOs and parsing rates have no real correlation.

50,000 LIO's per transaction seems high if you are a transactional system, yes. You would identify the high load sql (seems you have that -- right there -- might be as easy as losing the rule hint to fix them -- might be in need of looking at indexes).

a 40% rollback per transaction means that 40% of the time -- you do work and then undo that work. You need some transaction level auditing (in the application) to see which transactions are rolling back most frequently and if they are "read/write" transactions -- you need to figure out "why"
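
Until the application can report this itself, a rough first look at which connected sessions are doing the rolling back is possible from the session statistics. This is only a sketch: it covers sessions that are still connected, and it cannot say which business transaction rolled back or why, which is what the application-level auditing is for:

```sql
select s.sid,
       s.username,
       s.program,
       st.value  rollbacks
  from v$sesstat  st,
       v$statname n,
       v$session  s
 where n.name        = 'user rollbacks'
   and st.statistic# = n.statistic#
   and s.sid         = st.sid
   and st.value      > 0
 order by st.value desc;
```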

What about RESOURCE_LIMIT=true

Scott Watson, January 20, 2004 - 2:00 pm UTC

Tom,

Can you explain why setting the RESOURCE_LIMIT parameter to true before running your run_until procedure failed to update the CPU used statistic more frequently?

I was led to believe that by setting the parameter RESOURCE_LIMIT to TRUE, CPU usage statistics would be updated more often, and as a result I could see which process was actually consuming CPU cycles. However, taking statspack snapshots
o before the procedure started
o before it ended
o and just after it ended

showed that all the time ended up in the final snapshot, i.e. there was no CPU time in the snapshot collected for 2.5 minutes of the 3 minute job.

Assuming I had a profile set up for my user, how is the profile enforced if the CPU time is not updated more frequently?

By the way, I ran the test the first time with RESOURCE_LIMIT=FALSE and then ran the test again after issuing an alter system set resource_limit=true. I was running this test on my laptop (9.2.0.3 on W2K)

Thanks,
Scott

January 20, 2004 - 2:16 pm UTC

because it was not designed to do that? the code just doesn't do that -- that stat is not updated in the system until the call completes.

you have to have a call complete to get that statistic dumped.

CPU time is not reliable with statspack when you have long running procedures.

Rollback in Transactions

noel, January 28, 2004 - 4:10 am UTC

You were talking about transaction auditing at the application level. How should I go about it? Could you be a little more clear?

I checked the view v$db_object_cache which showed me a lot of updates. As per the application, each record has to be authorised by a specific person(s). Is this the rollback that is appearing in the report?

BTW is the entire row containing all the old values copied to the rollback segment, or only the old values of the updated column(s)?

This is because the columns that are updated contain null values and are updated only at the time of such authorisations and not at the time of the insert.

If the second option is true, I am back to square one and will have to look at other possibilities.

Noel.

January 28, 2004 - 8:45 am UTC

I mean the application itself would be instrumented (contain code) to let you (us) know what it is doing and how long it takes to do that.

sufficient data is placed into rollback -- if you insert a row of 5meg -- the undo generated is very very small (undo = delete rowid). if you update 5meg of data -- the undo is very large. If you delete -- same thing, large. depends - we put in there just enough to undo it again.

I don't see two options in this so cannot really comment.

v$filestat

David, April 07, 2004 - 8:33 pm UTC

With oracle data spread across logical volume managers with striping and mirroring, are results from v$filestat still useful? Thanks.

April 08, 2004 - 10:00 am UTC

yes. you have timing information....

you probably haven't striped all physical devices as one big stripe set soooo you still have hotspots.

it can identify what object(s) are "hot" easily.
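
A minimal sketch of reading that timing information per datafile, assuming timed_statistics is on. READTIM and WRITETIM are in hundredths of a second and cumulative since startup:

```sql
select d.name,
       f.phyrds,
       f.phywrts,
       f.readtim,
       f.writetim,
       -- average time per physical read, in centiseconds
       round( f.readtim / greatest( f.phyrds, 1 ), 2 ) avg_read_cs
  from v$filestat f,
       v$datafile d
 where f.file# = d.file#
 order by f.readtim desc;
```

Files near the top with a high average read time are the candidates for "hot" spots, whatever the volume manager is doing underneath.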

volume vs. file

David, April 08, 2004 - 11:11 am UTC

I hear that in 10g if you use ASM, oracle is striping at the file level and not at the logical level. What is the difference between the two? Thanks.

April 08, 2004 - 3:16 pm UTC

ASM is an alternative to using files (or raw disk). It is an entirely different beast.

We stripe all database objects across all available devices (really bringing to an end that "few extents....." discussion for once and for all) in a disk group.

totally different than using tablespaces containing many files physically.

statspack time interval

A reader, April 23, 2004 - 12:25 pm UTC

You are an advocate of the time interval between two
snapshots being 15 minutes to a max of 30 minutes
(during peak time or when the problem is occurring.)

Do you think there is any advantage of taking this
idea further and reducing the time between to even
less - say 1 minute? That way the averaging effect
would be even less. Let us assume that we are talking
about the peak/problem time only for this discussion.
(This is somewhat in line with Cary's principle
although I do understand that statspack's systemwide
approach is very different from Cary's session based
approach.)

Many Thanx!

April 23, 2004 - 2:03 pm UTC

yes, i do small snaps of individual applications running in small windows like that.

ramp something up and get it going steady state. won't really matter if you measure 15 minutes or 1 minute for a transactional system (quicky transactions)

Fastest time to execute a transaction

A reader, April 29, 2004 - 4:25 pm UTC

Hi Sir,

May I ask how long should it take to insert a new row into a normal table, say 10 columns with number or varchar2 columns? Assume no triggers or MVs affected.

I am debugging a performance issue here, in which the application (written in C#) inserts 1500 new rows into a table in one transaction, and on average each of them takes 10 milliseconds. (Somehow I can't use bulk mode for technical reasons).

Is 10 milliseconds for one simple insert statement considered to be fast, normal, or slow, based on your experience?

Thanks,

April 29, 2004 - 4:51 pm UTC

well.... when you take into consideration the round trips, the extra work row by row processing entails -- perhaps 10ms is hugely awesome.

Are you array processing? I cannot give you a C# example since I don't run any OS's that are able to run that language (funny problem sort of -- thought we tackled that sort of stuff years ago, cross platform) but in pro*c I set up two examples:

static void bad_insert( )
{
    EXEC SQL BEGIN DECLARE SECTION;
        int     a = 1;
        int     b = 1;
        int     c = 1;
        int     d = 1;
        int     e = 1;
        varchar f[40];
        varchar g[40];
        varchar h[40];
        varchar i[40];
        varchar j[40];
        int     I = 0;
    EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard();

    /* one set of values, reused for every row */
    f.len = sprintf( f.arr, "hello world %d", I );
    g.len = sprintf( g.arr, "hello world %d", I );
    h.len = sprintf( h.arr, "hello world %d", I );
    i.len = sprintf( i.arr, "hello world %d", I );
    j.len = sprintf( j.arr, "hello world %d", I );

    /* 1500 executes, one row per execute */
    for( I = 0; I < 1500; I++ )
    {
        exec sql insert into t bad (a,b,c,d,e,f,g,h,i,j)
                 values ( :a, :b, :c, :d, :e, :f, :g, :h, :i, :j );
    }
    exec sql commit;
}

insert slow by slow (meant row by row) versus:

static void array_insert( )
{
    EXEC SQL BEGIN DECLARE SECTION;
        int     a[100];
        int     b[100];
        int     c[100];
        int     d[100];
        int     e[100];
        varchar f[100][40];
        varchar g[100][40];
        varchar h[100][40];
        varchar i[100][40];
        varchar j[100][40];
        int     I;
    EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR DO sqlerror_hard();

    /* fill the host arrays with 100 rows worth of values */
    for( I = 0; I < 100; I++ )
    {
        a[I] = b[I] = c[I] = d[I] = e[I] = I;
        f[I].len = sprintf( f[I].arr, "hello world %d", I );
        g[I].len = sprintf( g[I].arr, "hello world %d", I );
        h[I].len = sprintf( h[I].arr, "hello world %d", I );
        i[I].len = sprintf( i[I].arr, "hello world %d", I );
        j[I].len = sprintf( j[I].arr, "hello world %d", I );
    }

    /* 15 executes, each binding arrays of 100 rows = 1500 rows */
    for( I = 0; I < 15; I++ )
    {
        exec sql insert into t good (a,b,c,d,e,f,g,h,i,j)
                 values ( :a, :b, :c, :d, :e, :f, :g, :h, :i, :j );
    }
    exec sql commit;
}

Array processing -- do about 100 at a time. The results are:

insert into t bad (a,b,c,d,e,f,g,h,i,j)
values
(:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9)

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute   1500     1.40       1.45          0         27       1723       1500
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total     1501     1.40       1.45          0         27       1723       1500

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 120

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    1500        0.00          0.00
  SQL*Net message from client                  1500        0.00          0.13
********************************************************************************

insert into t good (a,b,c,d,e,f,g,h,i,j)
values
(:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9)

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute     15     0.04       0.02          0         37        332       1500
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total       16     0.04       0.02          0         37        332       1500

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 120

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net more data from client                  60        0.00          0.00
  SQL*Net message to client                      15        0.00          0.00
  SQL*Net message from client                    15        0.00          0.00

The array processing was sooo fast, it was hard to actually time. The row by row stuff was soooo slow, so slow. And it did lots more work (query mode gets). it would generate more redo, more undo as well.

So, I think it should be very fast to insert 1500 rows, as long as you don't do it a row at a time.

Think how long it would take to get a book from the bookstore if you had to get it a page at a time, instead of a book at a time.
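
To put numbers on the two tkprof reports above, here is a small Python sketch (an illustration only, not part of the original test; `per_row_ms` is a made-up helper name) that turns the two Execute lines into elapsed time per row and rows per execute:

```python
def per_row_ms(elapsed_seconds, rows):
    """Average elapsed time per inserted row, in milliseconds."""
    return elapsed_seconds * 1000.0 / rows

# (execute calls, elapsed seconds, rows) copied from the two Execute lines.
reports = {
    "row by row": (1500, 1.45, 1500),
    "array of 100": (15, 0.02, 1500),
}

summary = {}
for name, (executes, elapsed, rows) in reports.items():
    # Store (ms per row, rows per execute) for each approach.
    summary[name] = (per_row_ms(elapsed, rows), rows / executes)
    print(f"{name:<13} {summary[name][0]:.3f} ms/row, "
          f"{summary[name][1]:.0f} rows per execute")
```

That works out to roughly 1 ms per row for the row by row version against roughly 0.01 ms per row for the array version. The 0.02 second figure is close to the resolution of the timer, so the second number is best read as an order of magnitude rather than a precise measurement.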

Dictionary cache pct miss very high

Arun Gupta, June 08, 2004 - 9:02 am UTC

Tom,
I am seeing the following in statspack reports everyday:
Dictionary Cache Stats for DB: DPWP Instance: DPWP Snaps: 266 -267
->"Pct Misses" should be very low (< 2% in most cases)
->"Cache Usage" is the number of cache entries being used
->"Pct SGA" is the ratio of usage to allocated size for that cache

                                   Get    Pct    Scan   Pct      Mod      Final
Cache                         Requests   Miss    Reqs  Miss     Reqs      Usage
------------------------- ------------ ------ ------- ----- -------- ----------
dc_files                           184  100.0       0              0          0
dc_free_extents                      3  100.0       0              0          0
dc_global_oids                       4   50.0       0              0          0
dc_histogram_defs              545,202    3.1       0              8      1,753
dc_object_ids                2,991,357    0.1       0              0        842
dc_objects                     342,517    2.0       0              0        776
dc_profiles                      3,597    0.1       0              0          2
dc_rollback_segments               251    0.0       0              0         36
dc_segments                    698,620    0.7       0              0        766
dc_sequences                     5,253    4.5       0          5,253         19
dc_tablespace_quotas                20   50.0       0              5          0
dc_tablespaces               1,819,263    0.0       0              0          4
dc_user_grants               1,337,393    0.1       0              0        204
dc_usernames                   738,073    0.2       0              0        100
dc_users                     3,932,786    0.1       0             21        362
          -------------------------------------------------------------

The pct miss for some dictionary cache objects is 100%. Is this something alarming? If yes, please point me in the right direction on how to fix it. The shared pool advisory from the same statspack report is as follows.

Shared Pool Advisory for DB: DPWP Instance: DPWP End Snap: 267
-> Note there is often a 1:Many correlation between a single logical object
in the Library Cache, and the physical number of memory objects associated
with it. Therefore comparing the number of Lib Cache objects (e.g. in
v$librarycache), with the number of Lib Cache Memory Objects is invalid

Estd
Shared Pool SP Estd Estd Estd Lib LC Time
Size for Size Lib Cache Lib Cache Cache Time Saved Estd Lib Cache
Estim (M) Factr Size (M) Mem Obj Saved (s) Factr Mem Obj Hits
----------- ----- ---------- ------------ ------------ ------- ---------------
208 .6 192 24,230 3,391,960 1.0 819,472,088
256 .7 239 30,744 3,401,062 1.0 820,733,548
304 .9 286 37,249 3,406,432 1.0 821,466,257
352 1.0 333 43,720 3,409,779 1.0 821,901,967
400 1.1 380 50,254 3,411,890 1.0 822,182,640
448 1.3 427 56,801 3,413,226 1.0 822,379,457
496 1.4 474 63,279 3,414,085 1.0 822,518,910
544 1.5 521 69,457 3,415,637 1.0 823,109,880
592 1.7 568 73,864 3,417,017 1.0 823,660,169
640 1.8 616 89,130 3,417,681 1.0 823,888,281
688 2.0 664 103,456 3,418,025 1.0 823,969,827
736 2.1 703 105,921 3,418,224 1.0 824,029,568
-------------------------------------------------------------

We are on 9ir2. The report is over a 15 minute period.
Thanks.

June 08, 2004 - 9:55 am UTC

look at the numbers -- 184, 3, 4, 20.... hmmm. 100% of a very small number is, well, a small number of misses.
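
The same point in numbers: a rough Python sketch (illustrative only; the helper name is made up, and because the report rounds "Pct Miss" the counts are approximate) that converts the percentages back into miss counts for a few of the caches listed above:

```python
# (cache, get requests, pct miss) copied from the report above.
rows = [
    ("dc_files", 184, 100.0),
    ("dc_free_extents", 3, 100.0),
    ("dc_global_oids", 4, 50.0),
    ("dc_tablespace_quotas", 20, 50.0),
    ("dc_histogram_defs", 545202, 3.1),
    ("dc_sequences", 5253, 4.5),
]

def approx_misses(gets, pct_miss):
    """Approximate number of misses implied by a rounded percentage."""
    return round(gets * pct_miss / 100.0)

misses = {name: approx_misses(gets, pct) for name, gets, pct in rows}

# Print the caches ordered by how many misses they actually had.
for name, count in sorted(misses.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<22} {count:>8,}")
```

The four caches showing 50% or 100% add up to about 200 misses over the 15 minute window, while dc_histogram_defs at 3.1% accounts for roughly 17,000.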

9i upgrade

A reader, June 22, 2004 - 4:51 pm UTC

Tom, we have upgraded to 9i. We have a baseline with 8i.
We got some queries running 15-35% faster in 9i. We also have other queries running 20-60% slower in 9i compared to 8i. Where should I start to identify the reasons for this performance degradation?

June 22, 2004 - 10:19 pm UTC

do you still have the 8i instance?

9i upgrade

durga rani, June 23, 2004 - 11:47 am UTC

Yes. We have a complete backup of 8i instance.
Thank you for answering my question.

June 23, 2004 - 12:54 pm UTC

compare the autotrace traceonly explains and tkprofs for those queries. You might quickly see a pattern.

free buffer

Rory B. Concepcion, July 02, 2004 - 10:05 pm UTC

Hi Tom,

Here's my question. When I ran an update on a table converting the null values of a column into 0, I saw that the waits were free buffer waits. Correct me if I'm wrong, but does it mean that the dirty buffers are being written to free up space? So one solution would be for me to increase my db cache size.
Are there tables that I can query to see the amount of free space in the database buffers? I mean like dba_free_space for database files? Thanks so much

July 03, 2004 - 10:16 am UTC

another solution would be to checkpoint more aggressively.

v$bh is the view that shows the buffer cache.

sprepsql.sql not found in 8.1.7.4

Mike, July 07, 2004 - 4:37 pm UTC

There is no sprepsql.sql installed in 8.1.7. How can I use the Hash Values from statspack report to find out the full queries?
Thank you.

July 07, 2004 - 6:19 pm UTC

sprepsql reports things that didn't exist in 8i (resource usage and sql plans)

but, you can use the hash against V$SQLTEXT_WITH_NEWLINES (hash_value is generally "good enough" for a key there)

Top 10 proactive monitoring script

SHGoh, July 13, 2004 - 6:04 am UTC

Dear Tom,

As a good DBA, it is important to be able to anticipate what is going to happen to the system (data files are about to fill up, table A has almost reached max extents) and act on it before the problem happens. Be proactive. I would appreciate it if you could tell us what the TOP events are that we must be aware of.

One of your very good examples is detecting segments that are nearly at max extents (max_extents - extents < 3).

Thanks in advance.

Rgds
Goh

July 13, 2004 - 11:45 am UTC

i don't measure "extents" and things like that. I use autoextend datafiles, LMTs and watch the file system. there are thousands of tools out there with reports -- i would suggest finding one you like and using it (OEM for example, the capacity planner).

that example isn't anything I use, someone asked me how to write a query that did that is all.

confused with stats

mike, July 13, 2004 - 8:55 am UTC

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
SQL*Net message from dblink                       105,584   18,877,832   95.61
log file sync                                      27,247      423,843    2.15
latch free                                         49,276      316,227    1.60
log file parallel write                            13,742       45,607     .23
buffer busy waits                                     977       40,465     .20
          -------------------------------------------------------------
Wait Events for DB: OPP1 Instance: OPP1 Snaps: 356 -357
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> ordered by wait time desc, waits desc (idle events last)

Tom,
The snapshots were taken at an interval of 60 minutes. I am not sure I understand the stats shown above. cs - centisecond - 100th of a second, the event SQL*Net message from dblink had 18,877,832 cs = 18,877.832 s
perfstat@opp1> select 18877.832/60/60 hours from dual;

HOURS
----------
5.24384222

but the snap was 1 hour!

July 13, 2004 - 12:05 pm UTC

don't forget -- the waits = sum of waits over sessions.

suppose you had 100 users, if they all waited 189 seconds (each), you would have 18900 seconds of wait.

also, sql*net messages in general (and this one in particular) "dump time". the wait isn't over till the wait is over.

Sooooo, if you started a long running remote statement at 9am.....
and it was still executing remotely at 10am, 11am, noon, 1pm..... (locally you are waiting on sqlnet message from dblink)

and you took a snapshot at 1pm

and at 1:15, the job finished

and you took a snapshot at 1:30pm -- you would see 4 hours and 15 minutes of sqlnet message from dblink contributed to the statspack!

(cpu is done the same way by the way)

so, there are two things that could have happened:

a) lots of users with small waits added up
b) a long running session "dumping" time
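
Both effects can be sketched with a few lines of Python (an illustration only; the function names are made up). Note that at 100 cs per second, 18,877,832 cs is 188,778.32 seconds, about 52 hours, which is ten times the figure quoted in the question; either way it is far more than the one hour window, and the explanation is the same.

```python
CS_PER_SECOND = 100  # the report states waits in centiseconds

def summed_wait_hours(wait_cs):
    """Total wait time, summed over all sessions, in hours."""
    return wait_cs / CS_PER_SECOND / 3600.0

def credited_hours(wait_start, wait_end, snap_begin, snap_end):
    """Hours of a single wait credited to a snapshot window.

    The whole wait is recorded when it ends, so it lands in whichever
    window contains the end time, however long ago it started.
    Times are hours on a 24 hour clock.
    """
    if snap_begin < wait_end <= snap_end:
        return wait_end - wait_start
    return 0.0

# a) summed over sessions: hours of wait reported for a 1 hour window
total_hours = summed_wait_hours(18_877_832)
print(f"{total_hours:.1f} hours of wait in a 1 hour window")

# b) dumping: remote call from 9:00 to 13:15, snapshots at 13:00 and 13:30
dumped = credited_hours(9.0, 13.25, 13.0, 13.5)
print(f"{dumped} hours credited to a 30 minute window")
```

Read as case a), about 52 hours in a one hour window would mean something like 52 sessions waiting on the dblink for the entire hour on average.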

stats

MIke, July 16, 2004 - 1:06 am UTC

The following stats were collected on our OLTP database at peak time, over an interval of 60 min. How do you think the database performed?

What is the wait on library cache load lock?

Snap Id Snap Time Sessions
------- ------------------ --------
Begin Snap: 2238 14-Jul-04 19:06:29 351
End Snap: 2248 14-Jul-04 20:06:57 351
Elapsed: 60.47 (mins)

Cache Sizes
~~~~~~~~~~~
db_block_buffers: 10000
log_buffer: 163840
db_block_size: 8192
shared_pool_size: 100000000

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 681,454.91 111,080.49
Logical reads: 103,117.74 16,808.70
Block changes: 6,018.62 981.06
Physical reads: 9,969.64 1,625.10
Physical writes: 469.42 76.52
User calls: 1,540.18 251.06
Parses: 206.48 33.66
Hard parses: 42.95 7.00
Sorts: 72.06 11.75
Logons: 0.37 0.06
Executes: 1,874.89 305.62
Transactions: 6.13

% Blocks changed per Read: 5.84
Recursive Call %: 72.80
Rollback per transaction %: 1.28
Rows per Sort: 541.47

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 97.42
Redo NoWait %: 100.00
Buffer Hit %: 90.33
In-memory Sort %: 97.27
Library Hit %: 96.97
Soft Parse %: 79.20
Execute to Parse %: 88.99
Latch Hit %: 99.02
Parse CPU to Parse Elapsd %: 87.86
% Non-Parse CPU: 99.97

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 74.99 78.84
% SQL with executions>1: 51.42 75.82
% Memory for SQL w/exec>1: 32.12 56.87

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
library cache load lock                             2,235      234,948   28.60
latch free                                         87,932      118,594   14.43
direct path write                                 816,881      116,646   14.20
db file sequential read                        25,748,443       97,729   11.90
buffer busy waits                               9,670,941       78,884    9.60

July 16, 2004 - 11:03 am UTC

reading tea leaves, that is what this is like.

looks like a lightly used system with 6tps.

if the latch frees are library cache latches (which i sort of suspect) this either was taken right after a cold start (lots of cache loads there) or the developers didn't read the chapter on bind variables.

the direct path write could indicate your sort area size is too small

but -- at the end of the day, ignore the statspack for a moment -- how did the system perform from the end users perspective? that is what counts ..
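
For anyone wondering where the bind variable suspicion comes from, the parse ratios in the report can be reproduced from the Load Profile. This is a Python sketch under the assumption that the report derives them as 100 * (1 - hard parses / parses) and 100 * (1 - parses / executes); the variable names are made up:

```python
# Per-second rates copied from the Load Profile above.
parses = 206.48
hard_parses = 42.95
executes = 1874.89

# Assumed formulas for the two "Instance Efficiency" lines.
soft_parse_pct = 100.0 * (1.0 - hard_parses / parses)
execute_to_parse_pct = 100.0 * (1.0 - parses / executes)

print(f"Soft Parse %:       {soft_parse_pct:.2f}")
print(f"Execute to Parse %: {execute_to_parse_pct:.2f}")
print(f"about 1 parse call in {parses / hard_parses:.1f} is a hard parse")
```

The results agree with the 79.20 and 88.99 printed in the report. Roughly one parse call in five being a hard parse, about 43 per second, is the kind of number that points at SQL built with literals instead of binds.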

Mike, July 16, 2004 - 11:28 am UTC

"looks like a lightly used system with 6tps."

what's 6tps?

July 16, 2004 - 1:56 pm UTC

tps = transactions per second.

MIke, July 16, 2004 - 8:28 pm UTC

"
tps = transactions per second.
"
How is the transaction defined? upon COMMIT?
1) update 5 records in a table;
commit;
2) update 50 records in a table;
commit;

Do both 1) and 2) count as one completed transaction each?

July 16, 2004 - 11:02 pm UTC

i read their statspack. it shows transaction per second.

transactions are delineated by commits in databases.

Reader

A reader, July 18, 2004 - 8:11 pm UTC

For the WAIT on the 10046 trace file

WAIT #6: nam='db file sequential read' ela= 9195 p1=14 p2=7354 p3=1

What is the ela=9195 in seconds? Is this 9.195 seconds or
less than that? If I add all the elapsed time together for
this wait, in a trace file that I took for 2 hrs, I get a total of 21 hours. Seems like something is wrong.

The Oracle version is 9.2.0.4 on SUN

July 19, 2004 - 7:24 am UTC

divide by 1,000,000 in 9i to get seconds.
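
As a quick check, here is a minimal Python sketch (the regular expression and function name are illustrative, not an official trace parser) that pulls the event name and `ela` value out of a WAIT line and applies the 9i divisor:

```python
import re

# Matches the start of a 10046 WAIT line: cursor number, event name, ela.
WAIT_RE = re.compile(r"WAIT #(\d+): nam='([^']+)' ela=\s*(\d+)")

def parse_wait(line, ticks_per_second=1_000_000):
    """Return (event, seconds) for a WAIT line, or None if it does not match.

    ticks_per_second defaults to the 9i unit of 1,000,000 per second.
    """
    match = WAIT_RE.match(line)
    if match is None:
        return None
    _cursor, event, ela = match.groups()
    return event, int(ela) / ticks_per_second

line = "WAIT #6: nam='db file sequential read' ela= 9195 p1=14 p2=7354 p3=1"
event, seconds = parse_wait(line)
print(f"{event}: {seconds:.6f} s")
```

So ela=9195 is about 9 milliseconds, not 9 seconds, and a total that was summed as if the unit were milliseconds shrinks by a factor of 1,000.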

Correction

A reader, July 18, 2004 - 10:03 pm UTC

Subsequent to posting about the elapsed time calculation,
I looked at a reference point where it reads,
in 9i, the elapsed time is in 1,000,000ths of a second

Please confirm, Thanks

Trigger is contextswitch

Oleg Oleander, July 30, 2004 - 2:00 pm UTC

Dear Tom,

Does placing an insert trigger on a table mean that for every inserted row a context switch occurs to run the trigger's PL/SQL block?
That is a bad thing for "insert as select", isn't it?

Please share your thoughts on it?

Best regards,
Oleg

July 30, 2004 - 5:45 pm UTC

will it help performance? no.

use triggers sparingly, every trigger added will necessarily increase the runtime of the modifications on that table.

Excellent guidelines

jharvu, August 11, 2004 - 10:17 am UTC

Hi Tom,
Your explanations are simply priceless and I for one , appreciate your efforts in helping millions of DBAs .
Thank you.

Reader

A reader, August 21, 2004 - 4:17 pm UTC

The "library cache pin" wait event. Would it even remotely
be caused by flushing the shared_pool on a live database?

Thanks

August 21, 2004 - 8:23 pm UTC

well, flushing the shared pool on a live database is going to cause tons of havoc -- lots and lots of hard parsing, tons of latch frees on the library cache.

it could also cause this event as this is taken to load the stuff into the cache as well and if they aren't there, well, they have to be gotten there (so it'll be held slightly longer)

but flushing the shared pool is a huge huge performance hit.

A reader, August 21, 2004 - 11:40 pm UTC

>>>>>>>>flushing the shared pool on a live database is going to cause tons of
havoc -- lots and lots of hard parsing, tons of latch frees on the library
cache.
>>>>>>>>>
does bouncing a database do the same as flushing the shared pool?

August 22, 2004 - 8:18 am UTC

it is even worse than flushing the shared pool -- you just lost your buffer cache you spent hours warming up.

how to stop statspack running

A reader, September 15, 2004 - 11:36 am UTC

Tom,

Please let us know, how to stop statspack running ...

@?/rdbms/admin/spdrop -- to drop

is there anything like endstats?

Thanks

September 15, 2004 - 11:43 am UTC

statspack only runs when YOU RUN IT.

so, to stop it running, stop running it. it only "runs" when you "statspack.snap"

CPU time

David Pujol, September 16, 2004 - 11:17 am UTC

Hi Tom, I'm analyzing a statspack report:

Snap Id Snap Time Sessions
------- ------------------ --------
Begin Snap: 381 14-Sep-04 17:52:50 37,377
End Snap: 382 14-Sep-04 18:07:57 37,377
Elapsed: 15.12 (mins)

and ..

CPU used by this session 11,939,921 13,164.2 102.0
CPU used when call started 1,293,999 1,426.7 11.1

I have 24 CPUs. In a 15 min window, I have 15x60x24 = 21600 CPU seconds. Looking at "CPU used by this session", I have 119399 CPU seconds. If I have a 21600 second CPU window, how is it possible that I'm getting 119399 CPU seconds? I'm using MTS.

A lot of thanks

September 16, 2004 - 11:35 am UTC

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:7641015793792#8768340190291

looks like cpu dumping to me. do you have any long running processes on this machine?
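
The mismatch is easy to quantify. A small Python sketch using the figures from the question (variable names are made up; the statistic is taken to be in centiseconds, as the question assumes):

```python
cpus = 24
elapsed_minutes = 15.12        # from the snapshot header
cpu_used_cs = 11_939_921       # "CPU used by this session", centiseconds

# CPU seconds the machine could possibly have delivered in the window.
available_cpu_seconds = cpus * elapsed_minutes * 60
# CPU seconds the statistic claims were used in the window.
reported_cpu_seconds = cpu_used_cs / 100
ratio = reported_cpu_seconds / available_cpu_seconds

print(f"available: {available_cpu_seconds:,.1f} CPU seconds")
print(f"reported:  {reported_cpu_seconds:,.1f} CPU seconds")
print(f"reported is {ratio:.1f}x what the machine could deliver")
```

Since the machine cannot deliver more than about 21,800 CPU seconds in that window, most of the reported time must have been accumulated earlier and only recorded during this snapshot interval, which is what "dumping" means here.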

Connection pool and STATSPACK

A reader, September 16, 2004 - 2:29 pm UTC

We have an EJB application contained in an oracle application server with connection pool set. This oracle application server connects to the Oracle database.
If the mode of connect between the oracle application server and the oracle database is dedicated, can STATSPACK be run to trace a particular session?
Or will we run into the problems you mentioned in your book Expert One-on-One Oracle, pg. 452, on setting up trace:
<quote>...I would recommend never attempting to use SQL Trace with MTS.......</quote>

Thank you

September 16, 2004 - 3:14 pm UTC

you can trace a particular connection pool session - but that will map to dozens of application sessions -- probably not what you wanted.

it would be the same as tracing with shared server before 10g gave us a way to tie it all together.

Connection pool and STATSPACK

A reader, September 16, 2004 - 3:38 pm UTC

Q1) In the scenario mentioned in the previous posting - how can we do tracing on an EJB application which has connection pooling.

Q2)
<quote>
it would be the same as tracing with shared server before 10g gave us a way to tie it all together.
</quote>

...before 10g gave us a way to tie it all together.

The version of the DB we have is 10.1.0.3.0 - Prod. How is it different in 10g. Can you please explain or point to a URL.

Thank you

September 16, 2004 - 3:42 pm UTC

q1) prior to 10g, where you can stuff "logical session" info into the trace files, it is difficult to do application tracing unless the middle tier was built to do it.

it would have to turn on and off tracing for a "logical session" -- and you would have to take all of the result trace files basically, tkprof them and then look at them. It would be similar to a tkprof with "aggregate=no".

q2) https://asktom.oracle.com/Misc/oramag/on-fetching-storing-and-indexing.html

Kimberly Floss just did an article on that, would be a good starting point.

Tracing a user connecting to an app server

A reader, September 17, 2004 - 11:42 am UTC

Tom,

Thank you for pointing to the article above.

1) Is Automatic Database Diagnostic Monitor (ADDM) a replacement for STATSPACK in 10g?

2) Looked at the above article.
The trcsess utility lets you selectively extract trace data from numerous trace files and save them into a single file based on criteria such as session ID or module name. But in a J2EE environment where you have connection pooling, the session ID itself could be shared (right?). That being the case, how can we trace the activity of a single user connecting to the application server (connection pooling present on app server), which in turn is connecting to the database?

September 17, 2004 - 11:57 am UTC

1) it is a bigger, better, more GUI statspack that does much more than statspack.

statspack is still "there"

2) you can trace by "service", by client id OR by session id. client id is something unique you make up.

A reader, September 17, 2004 - 12:24 pm UTC

Tom,
We have a performance problem in the EJB application after the DB was upgraded to 10g. The db seems to be behaving erratically at times. Are AWR and ADDM the right tools for diagnosing the problems, or are there any other new tools in the 10g arsenal?

Thank you

September 17, 2004 - 1:18 pm UTC

yes, they would be tools to use.

old tricks

Javier Morales, September 21, 2004 - 7:03 am UTC

Hi Tom,

I'm trying to find the right size of buffer cache in a database. Months ago, one DBA did a tuning report finding that the buffer cache in an Oracle8i instance was really small (db_block_buffers=2500 with 8kb block size, so a total of 20Mb of buffer cache). I think so too (just a feeling), but he wrote in the report that the right size would be 40000 block buffers.

He didn't say why 40000, but he said to lower this number if there were problems with swap or the like.

OEMGR does a very pretty graph to find the "approximate" right size based on benefit and cache hits, but I can't use this tool.

My way to find this manually was (in Oracle8):

select sum(count) result
from sys.x$kcbrbh
where indx < :value;

where :value was the number of blocks to add, and "result" shown the increase in hits.

Is there something somewhere (dictionary, v$ or else) where I can find an approach to this?

Thanks in advance,
Javier

September 21, 2004 - 8:06 am UTC

the cache advisors were added in 9i, in 8i -- there are none.

So, your buffer cache is 20m, small -- sure, but have you identified it as a problem? are your applications primarily waiting on IO?

If so, increase the buffer cache based on the RAM in your system (hopefully, you have only one INSTANCE running on your server).

Assuming Oracle has "the machine" (the machine is a db server) and you are running in dedicated server mode -- save about 1-3meg / connected user and give the rest to the database (perhaps saving 20% for other stuff and the OS)

Eg: say 1 gig of ram, 50 connected users.

200 meg for stuff
150 meg for pga's in the dedicated server
---
350

leaving 650 for your SGA.

This is a "ROT"ten Rule Of Thumb
                   ^    ^  ^

a starting place if you have no idea where else to start.
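
The arithmetic of that rule of thumb, written out as a Python sketch (the function and parameter names are made up for illustration; the 20% reserve and the 3 meg per user figure are the ones used in the example above, not tuning advice in themselves):

```python
def sga_starting_point_mb(ram_mb, connected_users,
                          pga_mb_per_user=3, reserve_fraction=0.20):
    """Rough SGA budget for a dedicated-server, single-instance box.

    reserve_fraction is held back for the OS and other "stuff";
    pga_mb_per_user is the per-connection allowance (1 to 3 meg above).
    """
    reserved = ram_mb * reserve_fraction
    pga_total = connected_users * pga_mb_per_user
    return ram_mb - reserved - pga_total

# The worked example: "1 gig" of RAM (taken as 1000 meg), 50 connected users.
sga_mb = sga_starting_point_mb(1000, 50)
print(f"starting SGA budget: {sga_mb:.0f} meg")
```

If the server hosts other instances or applications, pass the RAM you are willing to give this instance as `ram_mb` instead of the machine total.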

Of course, if you have other things on this server, other instances, whatever -- all bets are off -- look at the ram you want to use for this instance and use that number instead of the "1gig"

But first and foremost, take some measurements -- how many PIO's do you do? is your major wait for IO? save that information so that after you make any changes you can see what you actually accomplished, what you did.

am I suffering I/O contention?

pinguman, October 14, 2004 - 3:06 am UTC

Hi

I have this statspack report (part of it), working with Oracle 8.1.7.4 on Sun Solaris 8 (Fujitsu Primepower), 6 external disks on RAID-5

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                 Wait % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read                            60,952       55,221   44.09
db file scattered read                             32,709       23,171   18.50
log file sync                                       2,528       17,757   14.18
db file parallel write                                310       11,241    8.98
log file parallel write                             3,214        5,733    4.58

Statistic Total per Second per Trans
--------------------------------- ---------------- ------------ ------------
CPU used by this session 85,613 23.5 36.5
CPU used when call started 85,611 23.5 36.5

Tablespace
------------------------------
Av Av Av Av Buffer Av Buf
Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
CWS_D
76,592 21 ###### 8.0 919 0 33 36.4
TEMP
5,881 2 521.9 21.2 12,779 4 1 10.0
CWS_I
17,253 5 15.8 1.0 717 0 0 0.0
AURORA20_D
6,298 2 16.4 3.7 185 0 0 0.0
SYSTEM
2,079 1 29.2 1.1 1,079 0 1 10.0
RBS
29 0 34.8 1.0 2,296 1 0 0.0
PERFSTAT
147 0 50.5 1.0 177 0 0 0.0
ADM_AURORA
55 0 34.9 1.9 1 0 0 0.0
FARMACIA
17 0 97.1 3.0 39 0 0 0.0
DWH
3 0 350.0 1.0 2 0 0 0.0

As you can see my wait time is roughly 120000 cs and service time 85613 cs. Most of the waits are index scans. Am I correct that my I/O system is suffering contention?

Also if we look at the average read of each tablespace in milliseconds we see quite high values, from 15ms to 521ms. I think it's very slow for SCSI disks; unfortunately I don't have a statspack report from other hardware to compare with. Do you agree with me that the disks are quite slow :-?

Cheers

October 14, 2004 - 9:59 am UTC

hehehe, how did you compute service time from stats pack (i gotta get that note pulled one of these days).

I stopped at 500 red lights on my last car trip.

Was I suffering from red light contention? was that

a) good
b) bad
c) neither good nor bad

(warning, which ever one you pick -- I'll successfully prove in turn that the other two were the right answer. For you see the right answer is

d) insufficient data to really say

but, lets see:

db file sequential read 60,952 55,221 44.09
db file scattered read 32,709 23,171 18.50

that would be an average of 0.009056307 seconds/random IO and 0.007062276 per multi-block IO.

sooo, ask your hardware vendor "is that slow"
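
The averages come straight from the two Top 5 lines: wait time divided by number of waits. A Python sketch (the helper name is made up; times are taken as centiseconds, as the report header says):

```python
def avg_wait_seconds(waits, time_cs):
    """Average time per wait in seconds, from a Top 5 Wait Events line."""
    return time_cs / 100.0 / waits

# (waits, time in cs) from the two read events in the report.
sequential = avg_wait_seconds(60_952, 55_221)   # single block reads
scattered = avg_wait_seconds(32_709, 23_171)    # multi-block reads

print(f"db file sequential read: {sequential * 1000:.1f} ms per wait")
print(f"db file scattered read:  {scattered * 1000:.1f} ms per wait")
```

That gives roughly 9 ms and 7 ms per wait, in line with the figures quoted above (small differences in the last digits come from rounding of the inputs).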

from this metalink note,

pinguman, October 14, 2004 - 10:13 am UTC

Hi

From note 223117.1 , Tuning I/O-related waits Last Revision Date: 18-MAY-2004 it states how can we determine wait time is affecting the performance or not. If you say we cannot determine the service time from statspack what is that note doing on Metalink...? And being revised again and again...?

I asked if the disks are slow because again statspack shows the average read in ms; if that's not correct data, what is it for in the statspack report?

db file sequential read 60,952 55,221 44.09
db file scattered read 32,709 23,171 18.50

Those are waits and not the time it takes to read a block no? I mean if these data arent meaningful then why we gather all this and generate a report.

Btw the interval is one hour

October 14, 2004 - 10:50 am UTC

(hence my comment above... you know ... (i gotta get that note pulled one of these days). )

you said "disks slow", i see average response time of 0.009 and 0.007

is that slow?

the data is meaningful -- you just need to understand what you are looking at (eg: you gave NO TIME WINDOW -- if that was a 30 second statspack, maybe I'd be really worried. if that was a 30 minute one -- maybe not)

You need to take the numbers, your KNOWLEDGE of your system, the "goal" <<<=== most important and use them all.

for example:

"User George is complaining about slow performance. George is my boss. I must make George happy. George is running program 'X'. Program 'X' does *no physical IO whatsoever*. "fixing" physical io might actually make George's life miserable -- as the users that were waiting on IO are now constantly consuming CPU -- which is what George needed MORE of (now he has less)"

Statspack is "system wide". What are you trying to make "go faster" (and why). You need to see what *IT* is most impacted by.

the data is 100% meaningful.
the data must be analyzed 100% in context.

the number 42 -- does it mean anything to you? not out of context like that it doesn't...

I don't see how you can go from "0.009/0.007 seconds average response time" to "i have disk contention" yet.

what was the logic behind it?

what is the rated response time of your hardware itself? how fast is it *supposed* to be (yes, 9ms, 7ms seems a bit slow, 5ms is averagish? (guessing -- i'm not a hardware guy really, don't read the boxes anymore) -- but is that contention induced or configuration induced or is it not relevant at all because it is not affecting George or something else?)

what is average read ms

A reader, October 15, 2004 - 3:05 am UTC

From the previous user's post, what does Av Rd(ms) in Statspack mean?

October 15, 2004 - 11:31 am UTC

from spreport (8i)

, decode( sum(e.phyrds - nvl(b.phyrds,0))
        , 0, 0
        , (sum(e.readtim - nvl(b.readtim,0)) /
           sum(e.phyrds - nvl(b.phyrds,0)))*10) atpr

where phyrds is the number of physical reads, but NOT the number of blocks read...

readtim is time (only measured in 1/100th of a second in 8i)

so, without knowing the number of blocks (the actual IO amounts, how many bits did we read) and with the granularity of the clock being at 1/100th, not really 1/1000th....

I took the WAIT time and divided by the number of waits -- at an aggregate level, seems 'reasonable'. Every time we waited it was for about 0.007 to 0.009 seconds.
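
The decode above boils down to a few lines. Here it is as a Python sketch, with `b_` for the begin snapshot and `e_` for the end snapshot; the sample numbers are invented purely to show the arithmetic and are not taken from any report in this thread:

```python
def av_rd_ms(b_phyrds, e_phyrds, b_readtim, e_readtim):
    """Av Rd(ms) as computed by the 8i report query quoted above.

    phyrds counts read calls (not blocks); readtim is in centiseconds,
    so multiplying by 10 converts centiseconds per read to milliseconds.
    """
    reads = e_phyrds - b_phyrds
    if reads == 0:
        return 0.0
    return (e_readtim - b_readtim) / reads * 10

# Invented example: 1,000 read calls taking 900 cs between the snapshots.
example = av_rd_ms(b_phyrds=5_000, e_phyrds=6_000,
                   b_readtim=12_000, e_readtim=12_900)
print(f"Av Rd(ms) = {example:.1f}")
```

Because the clock only ticks in centiseconds and a single read call may fetch one block or many, the result is a coarse average per call, which is why it is treated with some caution above.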

what is the use of Av Rd(ms)

A reader, October 15, 2004 - 3:21 pm UTC

Hi

This seems interesting, what is the use of Av Rd(ms)? Is it the disk seek time? The time it takes to do what? If it's the time to perform X reads but we don't know the number of blocks being read, then what's the meaning of this statistic?

October 15, 2004 - 5:59 pm UTC

it cannot be seek time...

it is the time to perform reads
divided by
the number of times a read call was issued

read calls can read varying amounts of information. take it for what it is, the above calculation.

buffer busy wait

aru, October 19, 2004 - 9:58 pm UTC

Hi Tom,

Our database is not running anything in parallel, so why am I getting this wait here?

Vanp > select event, total_waits, average_wait
from v$system_event
where event in ('buffer busy wait','db file parallel write');

EVENT TOTAL_WAITS AVERAGE_WAIT
---------------------------------------------- ------- ------------
db file parallel write 19763 1.70788848

Please could you clarify when and under what circumstances we can get waits for this event, and furthermore what exactly this wait event is all about? No docs that I checked have been really helpful.

Thanks,
Regards,
Aru.

October 20, 2004 - 7:05 am UTC

how's about the reference guide? did you try that:

http://docs.oracle.com/docs/cd/B10501_01/server.920/a96536/apa5.htm#971497

End time

Yogesh, October 20, 2004 - 7:09 am UTC

Is there any way to know the start and end time of a query in 8.0.4? A user started the query when he left the office, and now he wants to know how long it took to execute.

I couldn't find any system table/view. Please help.

October 20, 2004 - 11:18 am UTC

there isn't one by default; you would have had to know beforehand that you wanted to know that.

DBWR

Aru, October 20, 2004 - 5:27 pm UTC

Hi Tom,

Yes I had already read that .....

"db file parallel write
This event occurs in the DBWR. It indicates that the DBWR is performing a parallel write to files and blocks. The parameter requests indicates the real number of I/Os that are being performed. When the last I/O has gone to disk, the wait ends.

Wait Time: Wait until all of the I/Os are completed

Parameters:

files
This indicates the number of files to which the session is writing

blocks
This indicates the total number of blocks to be written

requests
This indicates the total number of I/O requests, which will be the same as blocks"

What confused me was this "It indicates that the DBWR is performing a parallel write to files and blocks"
-Why is the DBWR performing a parallel write to files and blocks when we have only one DBWR configured.
-Under what circumstances does the DBWR do parallel writes and reads?
-Is/are there any other options available for the DBWR?

Thanks Tom,
Regards,
ARU.

October 20, 2004 - 8:52 pm UTC

o why wouldn't it?

o whenever it can (it is faster than write, wait, write, wait, write, wait; it is just WRITE, wait)

o serial writing

it is just dbwr sending off to the OS a bunch of stuff to do, rather than one at a time.

Av Rd(ms) statistic

A reader, October 20, 2004 - 5:53 pm UTC

Hi

I have read the entire thread and find the question about the Av Rd(ms) shown in the Statspack report quite amusing. I don't see the use of this statistic either. If we see 10ms or 510ms, so what? What does that mean?

You said

"it is the time to perform reads
divided by the number of times a read call was issued"

A read call can read X blocks so if let's say Oracle makes 4 read calls and each reads 10000 blocks and all this in 10 seconds does this mean Oracle is reading fast enough?

If for example yesterday snapshot between 11 am and 12 am I see 500ms in my temp tablespace and today I see 100ms what does that mean? That my temp is less loaded today or did I have I/O problems yesterday and today not?

It's just that I find many statistics ambiguous; some of them don't seem to have any meaning at all!

Rgds

but the question on spin_count remains

rahul, October 26, 2004 - 6:37 am UTC

Hi Tom,
I wanted to know a bit about spin_count, so I scanned the post. It helped, but I still have a few doubts.
1] Agreed, the solution lies in reducing the number of I/Os, but I would like to know how to find the value of spin_count
and how to increase it (I mean, on what basis).
2] Will increasing the number of DML latches help?
3] What will be the effect of increasing the number of processors?

please correct me if i am on the wrong track.

October 26, 2004 - 7:59 am UTC

1) leave it default, period. (that's already been stated on this page)

2) help what exactly?

3) the system could

a) run slower
b) run faster
c) not be changed at all

there are always at least three outcomes -- maybe more (permutations of the above list, some things might be faster -- others slower)

sorry -- but I don't know what "track" you are on, what problem you are trying to solve.

Statspack snapids

Robert, December 29, 2004 - 3:24 pm UTC

Tom,

Is it true that statspack compares and produces a report based on the snapshot images captured AT any two given points in time, not BETWEEN any two given points in time?

Thanks,
Robert D. Ware

December 29, 2004 - 7:47 pm UTC

the images (numbers) are captured AT two points in time.

then statspack reports on the differences in the numbers BETWEEN those two points in time.

At noon, you say "take a snapshot", it copies the v$ table numbers.

15 minutes later you say "take another snapshot", it copies the v$ table numbers..

when you generate the report, you say "take 12:15 and 12:00 and report on the difference in the numbers between those two points in time"
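A minimal sketch of that idea in Python, with made-up counter values: each snapshot is just a copy of cumulative counters, and the report is the difference between two copies. The dictionary keys and numbers are hypothetical and stand in for rows copied from the v$ views.

```python
def snapshot_delta(begin, end):
    """Subtract the begin snapshot from the end snapshot, statistic by statistic."""
    return {name: value - begin.get(name, 0) for name, value in end.items()}


# Hypothetical cumulative counters captured AT two points in time.
snap_1200 = {"physical reads": 10_000, "user calls": 52_000}
snap_1215 = {"physical reads": 14_500, "user calls": 61_000}

# The report describes what happened BETWEEN those two points.
report = snapshot_delta(snap_1200, snap_1215)
elapsed_seconds = 15 * 60
per_second = {name: diff / elapsed_seconds for name, diff in report.items()}

print(report)
print(per_second)
```

Nothing is recorded about when, inside the 15 minutes, the work happened; only the totals at each end are known.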

A reader, January 20, 2005 - 5:09 pm UTC

Tom,

buffer cache hit ratio = ( 1 - ( physical reads / (db block gets + consistent gets))) * 100

I read in some article that this percentage should be greater than 90% and so DB_BLOCK_BUFFERS have to be increased. Is that true? Should DB_BLOCK_BUFFERS be increased?

What is the reason behind this ratio being low? Please
clarify.

January 20, 2005 - 7:40 pm UTC

pardon me, i just fell on the floor laughing... just getting up now.

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:4973488577582

(suggest you ask the author of the article "why, and are there any cases where this is not true")

Buffer Hit%

Bipul, January 21, 2005 - 9:39 am UTC

buffer hit % has nothing [or very little] to do with performance. In our database [it's a typical OLTP, with some batch jobs] I see buffer hit % between 70 and 95, and the performance of the db doesn't correlate with this number. I think a very low number can be alarming.

January 21, 2005 - 12:22 pm UTC

it is more reasonable to say that "a buffer cache hit ratio that CHANGED" is a cause for a flag. That is, you take a system that has been getting a cache hit of X% historically. Today it starts getting a Y% hit. that would be cause for "curiosity" -- not alarm and not joy.

Say that X<Y. Is that good? I say no, it is bad because the developer decided to rule hint a query to "make it use indexes" in the mistaken belief "indexes good, full scans bad". The cache hit went up simply because of the extra 50,000,000,000 LIO's the system is performing trying to run this query (really slowly) now.

But wait, I say yes, it is good -- because we altered some lobs to be nocache (we only read the documents once) and we stopped having them push stuff out of the buffer cache.

But wait, I say neither good nor bad. We just plopped a new application in last night -- the old stuff is still doing what it is doing, the new stuff is going just great, but tends to read a really small table lots.

Same with X>Y -- is it good, bad or indifferent -- the answer is "yes" to all three.

A change in a ratio is an indicator something has changed.
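The "hinted query" case can be illustrated with a small sketch, using the simple ratio formula quoted earlier in the thread. All numbers are hypothetical; the point is only that piling on logical I/O raises the ratio while the physical work stays the same.

```python
def buffer_hit_pct(physical_reads, logical_reads):
    """Hit ratio as (1 - physical reads / logical reads) * 100."""
    return (1 - physical_reads / logical_reads) * 100


# Hypothetical workload: 100,000 physical reads against 1,000,000 logical reads.
before = buffer_hit_pct(100_000, 1_000_000)

# Same physical reads, but an index-driven plan now performs ten times the
# logical I/O to answer the same questions.
after = buffer_hit_pct(100_000, 10_000_000)

print(f"before: {before:.1f}%  after: {after:.1f}%")
```

The ratio climbs from roughly 90% to roughly 99% even though the system is doing far more work per query, which is why the change is a reason to look, not a verdict.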

Cache hit of one segment

Bipul, January 24, 2005 - 4:59 pm UTC

Hi Tom.

If I want to measure the cache hit ratio of one particular segment [we are using 9iR2], how do I do that? For example, we have an Oracle Text index and I want to measure the cache hit ratio of the token table [$I table] of this index.

Regards
-bipul

January 24, 2005 - 5:09 pm UTC

ops$tkyte@ORA9IR2> select statistic_name, value
  2  from v$segstat
  3  where obj# = ( select object_id from all_objects where owner = 'SCOTT' and object_name = 'EMP' );
 
STATISTIC_NAME                                                        VALUE
---------------------------------------------------------------- ----------
logical reads                                                           256
buffer busy waits                                                         0
db block changes                                                         16
physical reads                                                            6
physical writes                                                           8
physical reads direct                                                     0
physical writes direct                                                    0
global cache cr blocks served                                             0
global cache current blocks served                                        0
ITL waits                                                                 0
row lock waits                                                            0
 
11 rows selected.
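One way to turn those two counters into a per-segment ratio is sketched below in Python. This is an assumption about how to combine them, not something the query itself reports: it applies the same (1 - physical / logical) form to the "logical reads" and "physical reads" rows shown above and ignores direct reads.

```python
def segment_hit_pct(logical_reads, physical_reads):
    """Per-segment hit ratio from two v$segstat values (direct reads ignored)."""
    if logical_reads == 0:
        return None  # no activity recorded, so no meaningful ratio
    return (1 - physical_reads / logical_reads) * 100


# Values copied from the SCOTT.EMP output above.
emp_hit = segment_hit_pct(logical_reads=256, physical_reads=6)
print(f"{emp_hit:.2f}%")
```

To measure a specific interval, the same delta approach used for snapshots would apply: capture the values twice and work with the differences.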

Bipul, January 25, 2005 - 12:40 pm UTC

Thanks. Exactly what I was looking for.

Plans not coming

A reader, February 08, 2005 - 4:18 pm UTC

Hi Tom,

Taking an SP snap at level 6 is supposed to give "SQL Plans and SQL Plan usage", but my SP report does not show these details. Please let me know if I am missing something.

Thanks,
Adam

CPU used by this session

Nilanjan Ray, February 10, 2005 - 2:40 am UTC

Tom,

Just wanted to know: what exactly is "CPU used by this session"? One site (http://www.ubtools.com/cgi-bin/ib/ikonboard.cgi?act=ST;f=25;t=4) says
<<quote>>
CPU used by this session = parse time cpu + recursive cpu usage + others

This is the most well-known, but wrong formula I've read in many Oracle documentations.

parse time cpu includes parse cpu time of both recursive and user statements. recursive cpu usage includes both parse cpu time and non-parse cpu of recursive statements. That means parse cpu usage of recursive statements is included in both parse time cpu and recursive cpu usage. In other words, it's duplicated and formula above is not correct.

ubTools offers the following formula:

CPU used by this session = parse time cpu + others(exec_and_fetch_time_cpu)
<<quote>>

what is exec_and_fetch_time_cpu ?

Regards

February 11, 2005 - 2:43 am UTC

I am not so sure they are correct, unless they are talking about the description of cpu used by this session (it is not clear to me whether they are saying "the description is wrong" or "the value reported by the statistic is wrong").

if the values were wrong, the cpu times reported for most things would exceed elapsed time by large margins. so, they should be able to demonstrate that for us.

(and you would have to ask the author of an article in most cases "what did you mean by this 'exec and fetch time cpu', and how exactly do you think we could find it")

I think they were saying "the description provided is wrong", but I have an easier description.

cpu used by this session is cpu used by that session.

db file sequential read

jm, February 12, 2005 - 1:42 am UTC

hi tom

i am tuning a database

this is what i got earlier in statspack

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.94 Redo NoWait %: 99.98
Buffer Hit %: 93.30 In-memory Sort %: 100.00
Library Hit %: 99.73 Soft Parse %: 98.84
Execute to Parse %: 68.87 Latch Hit %: 99.95
Parse CPU to Parse Elapsd %: % Non-Parse CPU:

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 82.05 78.93
% SQL with executions>1: 57.04 59.70
% Memory for SQL w/exec>1: 60.07 62.32

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (s) Wt Time
-------------------------------------------- ------------ ----------- -------
db file sequential read 5,671,445 50,760 74.40
db file scattered read 1,759,355 9,043 13.25
db file parallel read 124,094 4,830 7.08
buffer busy waits 143,744 1,732 2.54
db file parallel write 12,074 1,231 1.80
-------------------------------------------------------------

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file sequential read 5,671,445 0 50,760 9 97.5
db file scattered read 1,759,355 0 9,043 5 30.2

SQL ordered by Gets for DB: GEMS Instance: GEMS Snaps: 4063 -4064
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100

CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
161,082,493 1 161,082,493.0 63.0 0.00 0.00 859932146
SELECT DISTINCT ( A . DOCKET_NO ) DOCNO , A . BKG_STN BKGSTN ,
A . DLY_STN DLYSTN , A . OU_CODE OUCD , A . PROD_SERV_CODE SERCD
, C . BOOKING_BASIS BASIS , A . IN_OUT_DATE_TIME INDT , C . ACT
UAL_WT ACTWT , C . CHARGED_WT CHWT , A . GOODS_CODE , B . GOODS_
NAME DESR , A . ASSURED_DLY_DT ASSDT FROM GEMS_CONSIGNMENT_STOCK

5,687,553 3 1,895,851.0 2.2 0.00 0.00 3937628940
SELECT DISTINCT ( A . DOCKET_NO ) DOCNO , A . BKG_STN BKGSTN ,
A . DLY_STN DLYSTN , A . OU_CODE OUCD , A . PROD_SERV_CODE SERCD
, C . BOOKING_BASIS BASIS , A . IN_OUT_DATE_TIME INDT , C . ACT
UAL_WT ACTWT , C . CHARGED_WT CHWT , A . GOODS_CODE , B . GOODS_
NAME DESR , A . ASSURED_DLY_DT ASSDT FROM GEMS_CONSIGNMENT_STOCK

--------------------------------------------------------------
i changed db_file_multiblock_read_count to 16 from 8
optimizer_index_caching to 90 from default
optimizer_index_cost_adj to 25 from default

and then i got this in statspack

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.86 Redo NoWait %: 99.99
Buffer Hit %: 89.71 In-memory Sort %: 100.00
Library Hit %: 99.76 Soft Parse %: 98.94
Execute to Parse %: 77.13 Latch Hit %: 99.95
Parse CPU to Parse Elapsd %: % Non-Parse CPU:

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 63.81 91.49
% SQL with executions>1: 31.43 31.47
% Memory for SQL w/exec>1: 46.43 42.97

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (s) Wt Time
-------------------------------------------- ------------ ----------- -------
db file sequential read 4,145,170 28,771 74.75
db file scattered read 947,484 3,407 8.85
db file parallel read 65,188 2,575 6.69
buffer busy waits 138,956 2,015 5.24
db file parallel write 13,844 854 2.22

Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file sequential read 4,145,170 0 28,771 7 44.3
db file scattered read 947,484 0 3,407 4 10.1

-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100

CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
5,219,628 2,177 2,397.6 5.3 0.00 0.00 2838017588
select parent_owner,parent_name,parent_link_name,parent_type,par
ent_timestamp,property from ora_kglr7_dependencies where owner=:
1 and name=:2 and type=:3 and obj#=:4 order by order_number

4,940,031 60 82,333.9 5.1 0.00 0.00 768710814
Select Zone_Code , Acc_OU , Billing_OU , Cust_Code , Min ( Cust_
Name ) Cust_Name , Sum ( Total_MonthEnd_OS ) + Sum ( Total_Month
End_OS1 ) Total_MonthEnd_OS , Sum ( Amount_Collected ) + Sum ( W
rite_Off ) Amount_Collected , Sum ( Amount_Deducted ) Amount_Ded
ucted , Sum ( Claim_Amt ) Claim_Amt , Sum ( Others ) Others , Su

the cost of the most expensive query has been reduced from 3250 to only 74, but db file sequential read is still taking 74% of the total wait time.

what do u suggest.

February 12, 2005 - 12:40 pm UTC

"u" is out at lunch, maybe when they come back.....

u/you

A reader, February 13, 2005 - 10:49 pm UTC

sorry for referring as u.

but suggest why db file sequential reads are high

A reader, February 14, 2005 - 7:40 am UTC

February 14, 2005 - 8:37 am UTC

because you are doing lots of index reads. perhaps it was the setting of optimizer related parameters that says "indexes are cached", when in fact -- they are not.

A reader, February 14, 2005 - 11:08 pm UTC

i changed db_file_multiblock_read_count to 16 from 8
optimizer_index_caching to 90 from default
optimizer_index_cost_adj to 25 from default

even then statspack shows the same top 5 with same percentage.

February 15, 2005 - 3:14 pm UTC

so? those just made indexes "even more appealing", you increase db file sequential reads (INDEX reads typically)

you said "indexes are cached", but -- they apparently were not.

you said "table scans are more expensive", but -- they might not be.

it did exactly what you asked it to do it appears.
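For what it is worth, the two Top 5 listings posted above can be compared with a little arithmetic. The sketch below uses only the db file sequential read figures from those two reports; the snapshot lengths were not posted, so treating the two intervals as comparable is an assumption.

```python
def avg_wait_ms(wait_time_s, waits):
    """Average wait in milliseconds: total wait time divided by number of waits."""
    return wait_time_s / waits * 1000


# db file sequential read, first report: 5,671,445 waits, 50,760 s
before_ms = avg_wait_ms(50_760, 5_671_445)
# db file sequential read, second report: 4,145,170 waits, 28,771 s
after_ms = avg_wait_ms(28_771, 4_145_170)

# Change in absolute wait time for this event between the two reports.
drop_pct = (1 - 28_771 / 50_760) * 100

print(f"avg wait: {before_ms:.2f} ms -> {after_ms:.2f} ms")
print(f"wait time for this event fell by {drop_pct:.1f}%")
```

The averages round to the 9 ms and 7 ms shown in the reports. The "% Total Wt Time" column is a share of all wait time, so it can sit at about 74% in both reports even though the absolute time waited on this event is lower in the second one.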

10G TUNING APPROACH

A reader, February 21, 2005 - 12:11 pm UTC

Hi Tom,
Why does he say
http://www.adp-gmbh.ch/ora/misc/10g.html
"statspack will be somewhat obsolete with 10g although it is still around. It's functionality is "taken over" by the Oracle kernel and the results can be made visible with enterprise manager (EM)."

Is there something better to use than statspack in 10g?
Please clarify if you can.
Thank you.

February 21, 2005 - 12:59 pm UTC

AWR, ADDM

yes. consider them statspack on steroids.

http://docs.oracle.com/docs/cd/B14117_01/server.101/b10752/toc.htm

Thanks Tom

A reader, February 21, 2005 - 1:25 pm UTC

Oracle Consultant

Dawar, February 22, 2005 - 5:38 pm UTC

OS: Linux
DB Version: 10.1.0.2.0

How do I increase the shared pool size?
I would like to add another 50 MB to the current size.

Regards,
Dawar

February 23, 2005 - 1:48 am UTC

if you set up the sga_max_size larger than the initial sga size, you can use alter system to add N*granule_size bits of ram to a pool online (granules are the minimum size of memory chunks in 9i and up, typically 4 or 8 meg in size).

if your sga_max_size = sga current size, then you would have to shrink some other component by 50m and then add it to the shared pool using alter system again.

Or, use alter system to set the shared_pool_size = itself+50m scope=spfile and restart.
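Since memory is handed out in whole granules, a request for "another 50 MB" will not necessarily land on exactly 50 MB. The sketch below assumes the request is rounded up to the next multiple of the granule size; the function name is hypothetical, and the 4 MB and 8 MB granule sizes are the typical values mentioned above, not values read from this particular instance.

```python
import math


def round_up_to_granule(requested_mb, granule_mb):
    """Smallest multiple of the granule size that covers the request."""
    return math.ceil(requested_mb / granule_mb) * granule_mb


for granule_mb in (4, 8):
    added = round_up_to_granule(50, granule_mb)
    print(f"granule {granule_mb} MB: 50 MB requested -> {added} MB added")
```

With a 4 MB granule that is 13 granules; with an 8 MB granule it is 7 granules.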

ADDM and AWR

A reader, February 23, 2005 - 6:01 am UTC

In 10g, using ADDM and AWR, is it still good practice to use snapshot periods of around 15-30 minutes to identify a possible problem more precisely?

February 23, 2005 - 9:14 am UTC

if you want to "tune an application", yes -- addm/awr can make suggestions and spot *trends* using the hour long views (they are especially good at trending and noticing that "something has changed over time").

If you were to ask me to look at it, I'd still be asking for a smaller window.

read time

reader, March 11, 2005 - 3:26 am UTC

Tom,

The average read time on our temp tablespace as per statspack report is as below:(The same is less than 10ms on other tablespaces)

TEMP /d1/oradata/PROD/temp1.dbf
5,681 5 269.5 8.1 2,487 2 0

I think that 269.5 ms is very high. What could be the possible cause of this?

The temp tablespace size is 1GB and the pga_aggregate_target size is 200M.

Thanks and Regards.

March 11, 2005 - 6:16 am UTC

use OS utilities to measure this, async io -- multiblock reads and other things can get in the way of these numbers.

you want the raw throughput numbers, these are aggregates (read call for N blocks took "x ms", not single block io's all)

low execute to parse in statspack report

jianhui, March 15, 2005 - 12:06 pm UTC

Tom,
Over a window of several days, the STATSPACK reports show a pretty low Execute to Parse ratio, like 30% or 60%, with a correspondingly low soft parse ratio of around 80%. First of all, other than fixing the application code to make better use of soft parses with bind variables and PL/SQL, is there anything that can be done at the database instance level to improve that? Secondly, how do I find the code (SQL) that uses a poor parsing approach, given that the developers are using lots of third-party tools over which they have little or no control?
Many thanks

March 15, 2005 - 9:04 pm UTC

well, in the olden days of client server, an exec/parse ratio of 90+% would be common (you logged in at 9am, ran the application for the day, logged out -- plenty of reuse chances)

then enter the app server tier. Now each page is a "connection pool grab", limited sharing. so 40-60% becomes normal.

Negative is always bad.

session_cached_cursors may be of some use here.
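The ratio itself is simple to reproduce. The sketch below assumes the Statspack definition 100 * (1 - parses / executes) and checks it against Load Profile figures that appear in reports posted elsewhere in this thread; the function name is made up for the illustration.

```python
def execute_to_parse_pct(parses, executes):
    """Execute to Parse % = 100 * (1 - parses / executes)."""
    return 100 * (1 - parses / executes)


# Load Profile rates (per second) from two reports posted in this thread.
report_a = execute_to_parse_pct(parses=8.25, executes=13.64)     # report shows 39.49
report_b = execute_to_parse_pct(parses=32.77, executes=237.19)   # report shows 86.18

# More parses than executes: statements are parsed and then never executed.
negative = execute_to_parse_pct(parses=10, executes=5)

print(f"{report_a:.2f}  {report_b:.2f}  {negative:.2f}")
```

The small difference on the first report comes from working with the already rounded per-second rates. A negative result means parse calls outnumber executions, which is why it is always a bad sign.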

Relevancy of Top Statspack wait events and Zoran's post

AR, April 27, 2005 - 6:43 pm UTC

Tom,
Way up on this thread, Zoran Martic from Dublin, quoted Steven Adams and explained why the "top 5 wait" events listed by Statspack are not usually very useful. You seem to have generally agreed with him as well.

I tried reading his posts a few times, but I am having a hard time comprehending them! Why are these top Statspack "background events" not relevant when trying to figure out a slowdown in a "foreground" process? Why are events like "control file parallel write" and "db file parallel write" not relevant to "foreground" processes? I don't get it :(.

It'd be great if you could explain the fundamental differences between Steve Adams' script (http://www.ixora.com.au/scripts/sql/response_time_breakdown.sql) and the Statspack top events.

Many many thanks.
- AR

April 27, 2005 - 8:28 pm UTC

The problem with the background waits is that a background waits on them, clients do not. clients are the ones that are concerned about response times.

Consider:

biggest wait is log file parallel write.

do you care?

not unless clients are waiting on lots of log file syncs. clients waiting on lgwr wait on log file sync, lgwr might wait in log file parallel write. If clients are not waiting on lgwr, who cares what lgwr is waiting on.

DFS LOCK HANDLE

bipul, April 28, 2005 - 11:44 am UTC

Hi Tom.

Recently I have started seeing some slowness in the performance of our database, and I was looking at the statspack report for the period when performance is bad. The top 5 timed events are:

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 916 24.80
DFS lock handle 23,314 621 16.80
global cache cr request 211,299 617 16.71
db file sequential read 90,770 501 13.57
async disk IO 5,712 233 6.31

We collect statspack data every 30 minutes and each node has 2 CPUs [it's a 2 node cluster]. I would really appreciate it if you could explain the "DFS lock handle" wait event. What's the cause and how do I resolve it?

Thanks
bipul

April 28, 2005 - 2:01 pm UTC

I'd much rather back up to the application here -- can you trace AN APPLICATION instance and see what the APPLICATIONS (which is what people are waiting for) are waiting for.

this can be caused by lots of stuff. do you truncate, drop, use parallel query, update intensive (you'd need to explain what you do)

A reader, May 02, 2005 - 4:50 am UTC

My statspack info is as follows. I have a problem: my buffer hit ratio is very low, and physical reads are very high. Could you please tell me the reason for this?

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 128M Std Block Size: 8K
Shared Pool Size: 128M Log Buffer: 512K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 2,254.49 6,437.35
Logical reads: 903.76 2,580.55
Block changes: 7.18 20.51
Physical reads: 819.44 2,339.79
Physical writes: 0.85 2.43
User calls: 12.93 36.92
Parses: 8.25 23.57
Hard parses: 0.87 2.49
Sorts: 1.31 3.75
Logons: 0.00 0.00
Executes: 13.64 38.95
Transactions: 0.35

% Blocks changed per Read: 0.79 Recursive Call %: 80.37
Rollback per transaction %: 0.00 Rows per Sort: 30.35

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 9.33 In-memory Sort %: 100.00
Library Hit %: 97.51 Soft Parse %: 89.45
Execute to Parse %: 39.49 Latch Hit %: 99.50
Parse CPU to Parse Elapsd %: 95.69 % Non-Parse CPU: 89.43

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 80.75 91.12
% SQL with executions>1: 22.85 21.31
% Memory for SQL w/exec>1: 20.03 19.38

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file scattered read 47,736 40 51.38
CPU time 36 45.74
db file sequential read 257 0 .64
log file parallel write 381 0 .56
SQL*Net break/reset to client 8,147 0 .54
-------------------------------------------------------------

May 02, 2005 - 8:31 am UTC

why is it a problem? there is insufficient data here to comment on anything.

no idea how long the snap was for.
what sql was running.

this could have been a simple full table scan of a big table with not much else going on and is perfectly "awesome" for all we know.
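The posted Load Profile is at least internally consistent with that reading. The sketch below recomputes the hit ratio from the per-second rates and converts the physical read rate to a throughput figure; it uses the simple (1 - physical / logical) form and the 8K block size from the Cache Sizes section, and it assumes every physical read is one standard-size block.

```python
logical_reads_per_s = 903.76    # Load Profile: Logical reads
physical_reads_per_s = 819.44   # Load Profile: Physical reads
block_size_kb = 8               # Cache Sizes: Std Block Size

# Hit ratio from the two rates above.
hit_pct = (1 - physical_reads_per_s / logical_reads_per_s) * 100

# Approximate read throughput, assuming one 8K block per physical read.
read_mb_per_s = physical_reads_per_s * block_size_kb / 1024

print(f"buffer hit: {hit_pct:.2f}%")
print(f"read rate:  {read_mb_per_s:.2f} MB/s")
```

This reproduces the 9.33% in the report and works out to only about 6.4 MB/s of reads, a modest load that a single large scan could generate on its own.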

see
http://www.jlcomp.demon.co.uk/statspack_01.html

A reader, May 20, 2005 - 2:05 pm UTC

I have been asked to interpret STATSPACK reports from 5 different databases. Which STATSPACK sections should I check? Can you please explain the following?

Thanks in advance!!

Cache Sizes (end)
~~~~~~~~~~~~~~~~
Buffer Cache: 784M Std Block Size: 8K
Shared Pool Size: 512M Log Buffer: 512K

Load Profile
~~~~~~~~~~~ Per Second Per Transaction

Redo size: 1,204.41 117,251.35
Logical reads: 95.86 9,332.38
Block changes: 7.78 757.09
Physical reads: 0.18 17.12
Physical writes: 0.12 12.01
User calls: 0.90 87.65
Parses: 1.16 112.99
Hard parses: 0.01 0.64
Sorts: 0.88 85.31
Logons: 0.02 1.69
executes: 2.00 194.30
Transactions: 0.01

shailesh saraff, June 02, 2005 - 8:31 am UTC

Hello Tom,

How can I get the queries with the highest elapsed time from a statspack report? Tkprof allows us to sort on elapsed time for parse/execute/fetch and to filter. Is this possible in statspack?
Please let me know.

Does anyone know of any other site like oraperf.com for statspack report analysis?

Thanks & Regards,

Shailesh

statspack report

dhamodharan.L, November 21, 2005 - 9:27 am UTC

The answer you gave is very useful to me.
Please tell me how to find the poorly performing query using this statspack report.
Note: in this application, loading the screen itself is slow, and I want to know which query is used to load the screen. Are there any tools in Oracle other than statspack to find poorly performing queries? Please give me some guidance. Thank you.

November 21, 2005 - 9:53 am UTC

use sql_trace and tkprof.

documented in the performance guide (and thousands of examples on this site)

Statspack of a DWH database !!

Charlee, February 15, 2006 - 4:53 pm UTC

Hello Tom,

One of my DWH databases is giving performance problems. The top 5 events are:
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 715,234 6,308 25.09
PX Deq: Execute Reply 3,010 5,510 21.92
CPU time 4,714 18.75
io done 430,912 3,631 14.44
db file scattered read 132,392 2,927 11.64

What could be wrong ? Can you pls suggest something ( From OS side CPU shows 0% Idle). Many thanks for your reply.

February 15, 2006 - 9:53 pm UTC

looks perfect.

I'm assuming this is for a 15 week period of time.

(yes, sarcasm).....

slightly "insufficient data" to say anything about anything.

Statspack of a DWH database !!

Charlee from London, February 16, 2006 - 8:03 pm UTC

Tom,

This is one hour interval data..and DB performance is very slow.. I am pasting here some other events. Pls suggest me what i can do here .. Thanks a lot

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 240M Std Block Size: 16K
Shared Pool Size: 160M Log Buffer: 1,024K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 275,011.55 30,736.09
Logical reads: 4,056.67 453.39
Block changes: 1,053.13 117.70
Physical reads: 760.03 84.94
Physical writes: 211.14 23.60
User calls: 298.66 33.38
Parses: 32.77 3.66
Hard parses: 0.13 0.01
Sorts: 6.35 0.71
Logons: 0.04 0.00
Executes: 237.19 26.51
Transactions: 8.95

% Blocks changed per Read: 25.96 Recursive Call %: 15.05
Rollback per transaction %: 2.73 Rows per Sort: 683.19

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 99.99
Buffer Hit %: 83.68 In-memory Sort %: 99.87
Library Hit %: 99.86 Soft Parse %: 99.60
Execute to Parse %: 86.18 Latch Hit %: 99.98
Parse CPU to Parse Elapsd %: 32.61 % Non-Parse CPU: 98.77

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 94.14 94.31
% SQL with executions>1: 37.45 38.41
% Memory for SQL w/exec>1: 37.53 38.46

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 715,234 6,308 25.09
PX Deq: Execute Reply 3,010 5,510 21.92
CPU time 4,714 18.75
io done 430,912 3,631 14.44
db file scattered read 132,392 2,927 11.64
-------------------------------------------------------------

Wait Events for DB: CUBDB Instance: CUBDB Snaps: 3999 -4000
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file sequential read 715,234 0 6,308 9 19.9
PX Deq: Execute Reply 3,010 2,806 5,510 1830 0.1
io done 430,912 27,527 3,631 8 12.0
db file scattered read 132,392 0 2,927 22 3.7
log file sync 18,818 336 886 47 0.5
log buffer space 937 245 398 425 0.0

Background Wait Events for DB: CUBDB Instance: CUBDB Snaps: 3999 -4000
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
io done 430,843 27,515 3,629 8 12.0
log file parallel write 60,862 0 245 4 1.7
db file parallel write 8,491 0 79 9 0.2
control file parallel write 2,216 0 67 30 0.1
db file single write 4,192 0 53 13 0.1
log buffer space 373 1 10 26 0.0
db file sequential read 4,372 0 6 1 0.1
LGWR wait for redo copy 334 100 2 7 0.0

SQL ordered by Gets for DB: CUBDB Instance: CUBDB Snaps: 3999 -4000
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100

CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
2,755,724 1 2,755,724.0 16.9 134.39 938.95 3567182773
Module: wiqt.exe
SELECT D_CONTRACT.CONTRACT_ID, R_SYSTEM.DESCRIPTION, D_
CONTRACT.PDM_CONTRACT_START_DATE, D_CONTRACT.PDM_CUR_CONTRACT
_MATURITY_DATE, decode(D_VENDOR.SOURCE_PLATFORM_ID,'GBR', D_V
ENDOR.VENDOR_NAME,D_VENDOR.VENDOR_BRANCH_NAME), D_CUSTOMER.CU
STOMER_NAME, R_COLLATERAL.DESCRIPTION, R_COLLATERAL.DW_COD

2,171,491 2,391 908.2 13.3 138.88 693.26 3617941024
Module: pmdtm@ukkinua01ceefge (TNS V1-V3)
INSERT INTO PDM_DBE.F_RENTAL(CONTRACT_KEY,GE_FISCAL_MONTH_KEY,RE
NTAL_DUE_DATE_KEY,RENTAL_TYPE_REF_KEY,PLATFORM_REF_KEY,SYSTEM_RE
F_KEY,VENDOR_KEY,CHANNEL_REF_KEY,SUB_CHANNEL_REF_KEY,SALES_GROUP
_REF_KEY,CURRENCY_REF_KEY,RENTAL_PERIOD,INTEREST_INCOME,FINANCIA
L_RENTAL_AMOUNT,TOTAL_RENTAL_AMOUNT,LIFE_INS_PREMIUM_AMOUNT,NON_

1,993,489 496,763 4.0 12.2 158.28 468.84 2971466967
Module: pmdtm@ukkinua01ceefge (TNS V1-V3)
UPDATE PDM_STG.T_PDM_STG_RENTAL SET PROC_TAG = :1 WHERE FILE_ID
= :2 AND REC_NO = :3

SQL ordered by Reads for DB: CUBDB Instance: CUBDB Snaps: 3999 -4000
-> End Disk Reads Threshold: 1000

CPU Elapsd
Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
1,282,800 17 75,458.8 42.0 1619.38 10847.80 2917432900
Module: busobj.exe
SELECT COUNT(DISTINCT A_CONTRACT_PORTFOLIO_CUBE.CONTRACT_KEY)
, TO_CHAR(D_DATE.FIN_YEAR) || ' - ' || lpad(TO_CHAR(D_DATE.FI
N_MONTH_OF_YEAR),2,0), SUM(A_CONTRACT_PORTFOLIO_CUBE.CURRENT_
OEC*USD_CONV_RATE_PNL), R_CHANNEL.DESCRIPTION, SUM(A_CONTR
ACT_PORTFOLIO_CUBE.GUARANTEED_RESIDUAL_AMOUNT*USD_CONV_RATE_PNL)

231,709 1 231,709.0 7.6 423.33 2537.14 1022648319
Module: PL/SQL Developer
SELECT R_CHANNEL.DESCRIPTION, R_COLLATERAL.DESCRIPTION, R_PLA
TFORM.DESCRIPTION, SUM(A_CONTRACT_PORTFOLIO.ORIG_COST_OF_EQUIPM
ENT_USD), SUM(A_CONTRACT_PORTFOLIO.RESIDUAL_VALUE_USD), COUNT(
DISTINCT D_CONTRACT.CONTRACT_ID), COUNT( DISTINCT DECODE(D_CON
TRACT_STATUS.LIVE, 'Y', D_CONTRACT.CONTRACT_ID, NULL)) FROM D_

164,914 1 164,914.0 5.4 84.28 516.41 3670692112
Module: PL/SQL Developer
select count(*) from a_contract_portfolio where platform_ref_key
=20003 and system_ref_key=45026 and month_key between 38534 an
d 38687

Instance Activity Stats for DB: CUBDB Instance: CUBDB Snaps: 3999 -4000

Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session 471,444 117.3 13.1
CPU used when call started 471,871 117.4 13.1
CR blocks created 2,198 0.6 0.1
Cached Commit SCN referenced 17,820 4.4 0.5
Commit SCN cached 3 0.0 0.0
DBWR buffers scanned 806,990 200.7 22.4
DBWR checkpoint buffers written 86,266 21.5 2.4
DBWR checkpoints 22 0.0 0.0
DBWR free buffers found 608,584 151.4 16.9
DBWR lru scans 2,149 0.5 0.1
DBWR make free requests 3,447 0.9 0.1
DBWR revisited being-written buff 2,630 0.7 0.1
DBWR summed scan depth 806,990 200.7 22.4

February 17, 2006 - 1:32 pm UTC

tea leaves....

you spent a lot of time waiting for physical IO, but you already knew that.

Really, nothing much can be determined from this. Other than you have queries that are "big", do lots of physical IO.

look at techniques to make them go faster - materialized views, bitmap indexes, segment space compression, etc.

Yet another Statspack report

Jason Flemming, February 27, 2006 - 6:21 pm UTC

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
CDXWEB 373975140 cdxweb 1 9.2.0.3.0 NO CHCODEORCL1

Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 821 27-Feb-06 00:01:38 14 2.1
End Snap: 823 27-Feb-06 00:31:44 14 2.1
Elapsed: 30.10 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 512M Std Block Size: 8K
Shared Pool Size: 512M Log Buffer: 512K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 20,407.11 5,754.14
Logical reads: 2,020.52 569.72
Block changes: 45.62 12.86
Physical reads: 382.12 107.75
Physical writes: 11.03 3.11
User calls: 731.90 206.37
Parses: 39.49 11.13
Hard parses: 1.25 0.35
Sorts: 19.43 5.48
Logons: 1.80 0.51
Executes: 40.18 11.33
Transactions: 3.55

% Blocks changed per Read: 2.26 Recursive Call %: 12.06
Rollback per transaction %: 0.00 Rows per Sort: 18.88

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 99.99
Buffer Hit %: 81.51 In-memory Sort %: 99.99
Library Hit %: 98.45 Soft Parse %: 96.83
Execute to Parse %: 1.71 Latch Hit %: 99.95
Parse CPU to Parse Elapsd %: 102.58 % Non-Parse CPU: 95.66

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 91.70 91.69
% SQL with executions>1: 35.89 35.90
% Memory for SQL w/exec>1: 39.86 39.84

This is a Statspack report I have taken from one of our applications. It is an OLTP application. The execute to parse ratio is 1.71%. Could you guide me as to why it could be so low? Where should I start looking for problems?

February 27, 2006 - 6:40 pm UTC

the only reason it could be that low (and that soft parse percent is pretty low too!) is because........

The application is parsing a lot and not reusing cursors. The only thing - the ONLY thing that can cause this is by an application doing

parse/ (please bind!) / execute / close / goto PARSE

over and over. Probably doesn't use any PLSQL (which caches sql nicely for you).

You are likely spending as much, if not more, time PARSING SQL as you are executing it!

see how you parse just about the same number of things you execute per second... You have to look at the application to fix this, there is nothing you can really do at the database level (there is session cached cursors which can marginally help, but won't affect these reported numbers at all)
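To make the two ratios concrete, here is a small Python sketch that recomputes them from the Load Profile above. It assumes the usual definitions, Execute to Parse % = 100 * (1 - parses / executes) and Soft Parse % = 100 * (1 - hard parses / parses), and it works from the rounded per-second figures, so the last digit may differ from the report. The function names are made up for illustration.

```python
def execute_to_parse_pct(parses, executes):
    """Share of executions that did not need their own parse call."""
    return 100.0 * (1.0 - parses / executes)


def soft_parse_pct(hard_parses, parses):
    """Share of parse calls that found the statement already in the shared pool."""
    return 100.0 * (1.0 - hard_parses / parses)


# Per-second figures copied from the Load Profile in the report above.
parses_per_sec = 39.49
hard_parses_per_sec = 1.25
executes_per_sec = 40.18

e2p = execute_to_parse_pct(parses_per_sec, executes_per_sec)
soft = soft_parse_pct(hard_parses_per_sec, parses_per_sec)

print(f"Execute to Parse %: {e2p:.2f}")
print(f"Soft Parse %:       {soft:.2f}")
```

With roughly 39.5 parses for every 40.2 executes, almost every execution carries its own parse call, which is why the ratio sits near 1.7% instead of near 100%.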

Absolutely right!!!

Jason Flemming, February 27, 2006 - 9:39 pm UTC

Tom,
That is exactly how this application is architected. It is a third-party application that we are using. There are absolutely no PL/SQL procedures used; it is all Java.

Thanks a lot for your input, Tom, you are the best!

Better Understanding

Jason Flemming, February 28, 2006 - 11:03 am UTC

This is the output from v$sql. This includes SQL_TEXT,PARSE_CALLS,EXECUTIONS.

SELECT * FROM CDX_VisitModified WHERE RTRIM(SiteID)=:1
41703 41704

select privilege#,level from sysauth$ connect by grantee#=prior privilege# and privilege#>0 start with (grantee#=:1 or grantee#=1) and privilege#>0
10640 10640

ALTER SESSION SET NLS_LANGUAGE = 'AMERICAN'
10637 10637

SELECT VALUE FROM NLS_INSTANCE_PARAMETERS WHERE PARAMETER ='NLS_DATE_FORMAT'
10637 10637

ALTER SESSION SET NLS_TERRITORY = 'AMERICA'
10637 10637

SELECT * FROM CDX_TBPatient WHERE RTRIM(SiteID)=:1 AND RTRIM(TableName)=:2
9343 9343

SELECT * FROM CDX_TBSession WHERE RTRIM(SessionID)=:1
6657 6657

UPDATE CDX_TBSession SET LastUpdated=:1, CDXData=:2 WHERE RTRIM(SessionID)=:3
907 907

Bind variables are used wherever necessary, yet the number of parse calls still equals the number of executions. How should the system be architected to correct this?

Thanks again

March 01, 2006 - 7:35 am UTC

The recursive sql issued by oracle (select privilege#...) and the DDL (alter) will always look like this. We cannot "cache" our cursors (as our recursive sql would deplete your application's set of available cursors). And hopefully, recursive sql isn't done unnecessarily (eg: you don't hard parse like mad for example)

the other ones - you have to go back to the application and tell them to NOT parse/bind/execute/close - but rather to parse once and execute many.

Put it all into PL/SQL and it'll cache itself.
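As a rough illustration of the difference between the two call patterns, the following Python sketch uses a toy session object that only counts calls. `ToySession` and its methods are hypothetical and are not any real driver API; the point is only how PARSE_CALLS and EXECUTIONS move under each pattern.

```python
class ToySession:
    """Hypothetical stand-in for a database session; it only counts calls."""

    def __init__(self):
        self.parse_calls = 0
        self.executions = 0

    def parse(self, sql):
        # A real driver would return a cursor or statement handle here.
        self.parse_calls += 1
        return sql

    def execute(self, handle, binds):
        self.executions += 1


SQL = "SELECT * FROM CDX_TBSession WHERE RTRIM(SessionID)=:1"
session_ids = [f"S{i}" for i in range(1000)]

# Pattern visible in v$sql above: parse / bind / execute / close on every call.
naive = ToySession()
for sid in session_ids:
    handle = naive.parse(SQL)
    naive.execute(handle, [sid])

# Parse once, execute many: keep the handle open and only rebind.
reuse = ToySession()
handle = reuse.parse(SQL)
for sid in session_ids:
    reuse.execute(handle, [sid])

print("naive:", naive.parse_calls, "parses,", naive.executions, "executions")
print("reuse:", reuse.parse_calls, "parses,", reuse.executions, "executions")
```

In a Java client the second pattern usually corresponds to preparing a `PreparedStatement` once, keeping it open, and re-executing it with new bind values instead of preparing it inside the loop.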

I feel bad for you:

SELECT * FROM CDX_TBPatient WHERE RTRIM(SiteID)=:1 AND RTRIM(TableName)=:2
9343 9343

SELECT * FROM CDX_TBSession WHERE RTRIM(SessionID)=:1
6657 6657

UPDATE CDX_TBSession SET LastUpdated=:1, CDXData=:2 WHERE RTRIM(SessionID)=:3
907 907

those rtrims have to hurt don't they.... OUCH that hurts.

Rtrims did hurt!

Jason Flemming, March 01, 2006 - 12:58 pm UTC

The RTRIMs did hurt, until I built some function-based indexes and we got some decent performance improvement. Could you help me understand (maybe with a couple of lines of Java code) how to achieve it?

Tom, thanks a lot for your contribution. I have both your books and read them very often. On a different note, did you get to writing the book on Analytics you had promised you would write?

March 01, 2006 - 1:52 pm UTC

no java, we are just doing SQL?

I would question the need for rtrim in the first place. tell me why it is needed, and I'll tell you why you have a bug in your developed application!!!

I don't recall promising to write a book about analytics?

Maybe not promised !

Jason, March 01, 2006 - 6:24 pm UTC

Maybe you did not promise; you mentioned you were thinking of writing a book on Analytics:

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:12864646978683

Thanks for the reply to my original query. I have found the direction I need to pursue.

March 02, 2006 - 9:12 am UTC

"were thinking of"

is not equal to

"promised to write"

physical writes non checkpoint surge

jianhui, March 23, 2006 - 12:05 pm UTC

Hi Tom,
I heard that parallel query causes "physical writes non checkpoint", since dirty buffers need to be written to disk for the parallel read.

I have a database showing "physical writes non checkpoint" surging from 1-2 digits per 10 seconds to the 4-digit range (5000) per 10 seconds, lasting for about 50 seconds (we sample the system stats every 10 seconds). However, I don't see any parallel query around that time window at all.

(1) Are there other activities that could cause these non-checkpoint writes to jump all of a sudden?

(2) Could the buffer cache be too small, forcing dirty buffers to be flushed out to make room?

(3) How should I deal with them?

The database is 9205 on Solaris, fast_start_mttr_target is 0.

Thank you.

March 23, 2006 - 1:48 pm UTC

they are blocks that had to be written because we needed the buffers - does not mean anything "bad" necessarily - you would need a WAIT for something like "free buffer waits" before you get concerned.

This is a statistic, do you have some corresponding increase in waits you need to account for?

high physical writes non checkpoint

jianhui, March 24, 2006 - 12:55 am UTC

Hi Tom,
Thank you for the answer. The highest wait during the same time was enqueue; however, I did not have details about which enqueue it was. Based on your experience, which types of enqueue should be considered first? Is there any connection between enqueue waits and high non-checkpoint writes?
Thanks

March 24, 2006 - 9:39 am UTC

enqueues are TYPICALLY for rows

but there are lots
http://docs.oracle.com/docs/cd/B19306_01/server.102/b14237/enqueues.htm#i855817

if it was TC, it could be related. but we'd need that detail.

Oracle documentation not so comprehensive

jianhui, March 24, 2006 - 5:15 pm UTC

Tom,
Thanks for the link. Sometimes I am frustrated when I want to find out what an event, a latch, or an enqueue means: all I get from the Oracle documentation is its name, without much explanation. Without help from someone at Oracle, all I can do is guess the meaning from the name, and in many cases that does not help. I guess these are business secrets Oracle does not want to make public :-) I am stuck here with the reference documentation; a Google search or a visit to your site is the only option left for me :-)

Wait Event Question

Len, March 25, 2006 - 8:15 am UTC

Hi Mr.Kyte,

I was asked to look at a statspack report that ran for 1200 min.
Actually, I'm not going to check it, but when I looked at
the top 5 wait events I found that the top wait event
accounted for 11,000,000 cs.

If the report covered 1200 min, how come 11,000,000/100/60 = about 1800 min is bigger?

Thank You.

March 25, 2006 - 9:09 am UTC

1200 minutes, only about 1200-15 minutes too long!

Let's say the report was for 1 minute instead.

And during that 1 minute I had the EMP table locked.

And during that 1 minute 500 users tried to update rows in EMP (but would be blocked).

We would experience in that 1 minute 500 minutes of enqueue waits.

Multiple sessions waiting simultaneously each contribute their bit of wait time. It is very easy to get more "wait time" than "elapsed wall clock time" since N sessions can all be waiting for the same thing.
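The arithmetic behind this can be sketched in a few lines of Python. The figures come from the example above and from the question (11,000,000 cs over a 1200-minute report); the variable and function names are invented for illustration, and the "average sessions waiting" figure is only a way of reading the total, not something the report prints.

```python
def total_wait_minutes(waiting_sessions, minutes_each):
    """Wait time is summed across sessions, not measured on the wall clock."""
    return waiting_sessions * minutes_each


# The EMP example: 500 blocked sessions during a 1-minute report.
example_wait_min = total_wait_minutes(500, 1)

# The question: 11,000,000 centiseconds of wait in a 1200-minute report.
reported_wait_cs = 11_000_000
reported_wait_min = reported_wait_cs / 100 / 60   # cs -> s -> min
report_window_min = 1200

# Average number of sessions waiting on that event at any instant.
avg_sessions_waiting = reported_wait_min / report_window_min

print(f"EMP example: {example_wait_min} wait minutes in 1 elapsed minute")
print(f"Reported wait: {reported_wait_min:.1f} min over {report_window_min} min")
print(f"Average sessions waiting: {avg_sessions_waiting:.2f}")
```

So the reported wait works out to roughly 1,833 minutes (the question rounds this to about 1800), which only requires an average of about one and a half sessions waiting on that event throughout the report window.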

Sessions Question

Lena, March 27, 2006 - 1:31 pm UTC

Hi Mr. Kyte
I have a question about the "Sessions" column at the beginning of the statspack report.

I took the following steps:
1. Started Oracle on my laptop.
2. The first thing after Oracle opened:
execute statspack.snap -- twice.
3. @spreport.sql;

The result was as follows:
DB Name DB Id Instance Inst Num Release RAC Host
------------ ----------- ------------ -------- ----------- --- ----------------
xxxx 1112574731 xxxx 1 10.1.0.3.0 NO localhost.locald
omain

Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 21 26-Mar-06 21:25:42 18 4.9
End Snap: 22 26-Mar-06 21:34:01 18 5.7
Elapsed: 8.32 (mins)

Could you please explain how come sessions = 18?

Thanks.

March 27, 2006 - 3:24 pm UTC

select count(*) from v$session. You have 18 sessions.

select * from v$session and you can see them, they are the Oracle backgrounds.

Additional question about sessions in statspack

Len, April 03, 2006 - 4:04 pm UTC

Hi Tom,
Thanks for your answer.
I would like to ask another question about this issue.
In your presentation: "Tools I Use", in the section about
the statspack there were more than 67,000 sessions in your
statspack example.
1. It looks to me like a HUGE number of sessions.
Even if there were 1000 users, and each one of them
had 40 sessions (active or inactive), there are still
27,000 sessions missing.
Could you please explain this number?
After all, it's really a huge number of sessions; should I
be worried if I get a number of this size in my statspack?
2. Is there any init.ora parameter that sets the
maximum number of sessions?
3. Where can I find more information about each section of
the statspack report (except for Expert One-on-One :))?
Thanks for your patience.

Is there a resource contention?

jade, April 10, 2006 - 10:41 am UTC

Hi Tom,

I have gathered a Statspack report to identify whether there is resource contention, but I really need your help. I don't have an old report to compare against... :(

We had an application deployment one week ago which involved changes to middle tier Java code and database packages. Ever since then, we get a CPU hit on the database from a scheduled job that runs every 5 minutes in the application (40% of CPU usage for that single process). This problem never happened before the deployment, and the system load is almost the same as before.

Two things I really don't understand about this:
1. I am not able to reproduce the same problem in another environment.
2. The CPU hit happens at any time, even when there are no transactions going on in the system. The job does a full table scan on a table which has only 300-400 rows at the moment. That is a very small table.

I have attached a stats report, hope you can help me identify the problem:

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
DEXIT01 257137264 DEXIT01 1 9.2.0.1.0 NO scdb1

Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 19 09-Apr-06 18:00:40 ####### .0
End Snap: 20 09-Apr-06 19:00:42 ####### .0

Elapsed: 60.03 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 1,520M Std Block Size: 4K
Shared Pool Size: 512M Log Buffer: 2,048K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 192.26 2,849.89
Logical reads: 45,699.52 677,406.02
Block changes: 0.44 6.56
Physical reads: 129.37 1,917.66
Physical writes: 3.24 47.98
User calls: 1.41 20.95
Parses: 0.85 12.57
Hard parses: 0.00 0.00
Sorts: 0.53 7.86
Logons: 0.04 0.64
Executes: 1.29 19.07
Transactions: 0.07

% Blocks changed per Read: 0.00 Recursive Call %: 72.51
Rollback per transaction %: 98.77 Rows per Sort: 19.92

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 99.72 In-memory Sort %: 100.00
Library Hit %: 100.00 Soft Parse %: 100.00
Execute to Parse %: 34.11 Latch Hit %: 100.00
Parse CPU to Parse Elapsd %: 85.19 % Non-Parse CPU: 99.98

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 33.00 33.00
% SQL with executions>1: 89.55 89.55
% Memory for SQL w/exec>1: 76.61 76.61

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 2,820 93.99
db file scattered read 58,227 153 5.10
direct path write 181 9 .29
db file sequential read 19,758 7 .25
control file parallel write 1,169 4 .13
-------------------------------------------------------------
Wait Events for DB: DEXIT01 Instance: DEXIT01 Snaps: 19 -20
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file scattered read 58,227 0 153 3 239.6
direct path write 181 0 9 48 0.7
db file sequential read 19,758 0 7 0 81.3
control file parallel write 1,169 0 4 3 4.8
buffer busy waits 1,652 0 4 2 6.8
ARCH wait on SENDREQ 59 0 2 30 0.2
control file sequential read 901 0 0 0 3.7
latch free 178 19 0 2 0.7
process startup 3 0 0 76 0.0
log file sync 7 0 0 28 0.0
db file parallel write 10 5 0 10 0.0
log file parallel write 22 22 0 2 0.1
SQL*Net more data to client 1,521 0 0 0 6.3
direct path read 37 0 0 1 0.2
undo segment extension 242 242 0 0 1.0
virtual circuit status 4,806 4,807 140,072 29145 19.8
SQL*Net message from client 5,349 0 28,791 5383 22.0
jobq slave wait 66 63 196 2966 0.3
SQL*Net message to client 5,348 0 0 0 22.0
-------------------------------------------------------------
Background Wait Events for DB: DEXIT01 Instance: DEXIT01 Snaps: 19 -20
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
control file parallel write 1,169 0 4 3 4.8
process startup 3 0 0 76 0.0
control file sequential read 468 0 0 0 1.9
db file parallel write 10 5 0 10 0.0
log file parallel write 22 22 0 2 0.1
rdbms ipc reply 4 0 0 10 0.0
rdbms ipc message 4,313 4,281 24,056 5578 17.7
smon timer 13 12 3,419 ###### 0.1
-------------------------------------------------------------
SQL ordered by Gets for DB: DEXIT01 Instance: DEXIT01 Snaps: 19 -20
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100

CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
165,196,310 85 1,943,486.0 100.4 2730.90 2693.33 3893717104
BEGIN dtx.TxMover_sp(:1,:2,:3,:4); END;

164,823,840 85 1,939,104.0 100.1 2724.06 2686.56 1068249169
DELETE FROM transactions t WHERE status = :b1
AND trans_id != (SELECT MAX (tra
ns_id) FROM transactions tt
WHERE t.pos_device_id = tt.pos_device_id
AND status = :b1)

April 11, 2006 - 10:19 am UTC

the job is doing something that uses cpu - trace the job.

turn on sql trace, run the job in the foreground in sqlplus, exit sql plus, use tkprof to see what it does.

and why is it a problem for something to use cpu?

Trace file for the process

A reader, April 12, 2006 - 11:36 am UTC

Hi Tom,

Thanks. I am pretty new to this. The reason we consider this a problem is that this application has been running in the system for the past 2 years and it had never caused high memory/IO/CPU usage before the new changes took place a couple of weeks ago. It is such an important module that it keeps running with an interval of only 3 seconds. It is being considered a potential bottleneck. We are still trying to do the load testing.

I have gathered the trace file by running it on the front end. But I need your expert opinion on this. I appreciate your help!

TKPROF: Release 9.2.0.1.0 - Production on Wed Apr 12 10:43:08 2006

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Trace file: dexit01_ora_19546.trc
Sort options: default

********************************************************************************
count = number of times OCI procedure was executed
cpu = cpu time in seconds executing
elapsed = elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current = number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************

alter session set sql_trace = true

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 0 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1 0.00 0.00 0 0 0 0

Misses in library cache during parse: 0
Misses in library cache during execute: 1
Optimizer goal: CHOOSE
Parsing user id: 69
********************************************************************************

BEGIN DBMS_OUTPUT.ENABLE(2000); END;

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.02 0.01 0 0 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69
********************************************************************************

declare
out_ret_code number;
out_rows number;
out_trans_id transactions.TRANS_ID%type;
begin
dtx.txmover_sp(0,out_ret_code,out_rows,out_trans_id);
dbms_output.put_line('out_ret_code ' || out_ret_code);
dbms_output.put_line('out_rows ' || out_rows);
dbms_output.put_line('out_trans_id ' || out_trans_id);
end;

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.03 0 0 0 0
Execute 1 0.03 0.05 0 0 0 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.04 0.09 0 0 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69
********************************************************************************

select condition
from
cdef$ where rowid=:1

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 14 0.00 0.00 0 0 0 0
Execute 14 0.00 0.01 0 0 0 0
Fetch 14 0.00 0.00 0 28 0 14
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 42 0.00 0.01 0 28 0 14

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: SYS (recursive depth: 2)

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS BY USER ROWID CDEF$

********************************************************************************

SELECT txmover_history_delay
FROM dexit_hosts
WHERE system_id = 1

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.05 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 2 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.02 0.05 0 2 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS BY INDEX ROWID DEXIT_HOSTS
1 INDEX UNIQUE SCAN DHS_PK (object id 28919)

********************************************************************************

SELECT *
FROM transactions
WHERE status = :b2
AND transaction_type not in ('PTV','BAL')
AND transaction_date < :b1
ORDER BY trans_id

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.01 0.01 0 0 0 0
Fetch 6 0.02 0.02 0 1524 0 5
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 8 0.03 0.03 0 1524 0 5

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
5 TABLE ACCESS BY INDEX ROWID TRANSACTIONS
387 INDEX FULL SCAN TRA_PK (object id 29041)

********************************************************************************

INSERT INTO pos_transactions
(trans_id, pos_device_id,
merchant_store_id,
merchant_owner_id,
acquirer_id, pan_id,
transaction_date, pos_date,
transaction_type, amount,
fee_amount, currency_id,
MESSAGE_TYPE,
settle_trans_id,
consumer_acct_id,
contract_id, approval_code,
sys_trace_no, terminal_info,
store_info, owner_info,
ref_trans_id,
monitor_flag, test_flag,
void_flag, status, update_cnt,
settlement_day_id, custom_attr
)
VALUES (:b27, :b26,
:b25,
:b24,
:b23, :b22,
:b21, :b20,
:b19, :b18,
:b17, :b16,
:b15,
:b14,
:b13,
:b12, :b11,
:b10, :b9,
:b8, :b7,
:b6,
:b5, :b4,
:b3, :b2, 0,
NULL, :b1
)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 5 0.01 0.00 0 5 129 5
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 6 0.01 0.01 0 5 129 5

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69 (recursive depth: 1)
********************************************************************************

UPDATE transactions
SET status = :b2
WHERE trans_id = :b1

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 5 0.00 0.00 0 15 5 5
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 6 0.00 0.00 0 15 5 5

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
5 UPDATE
5 INDEX UNIQUE SCAN TRA_PK (object id 29041)

********************************************************************************

SELECT *
FROM transactions
WHERE status = :b2
AND transaction_type = 'PTV'
AND transaction_date < :b1
ORDER BY trans_id

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.02 0.01 0 1512 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.02 0.01 0 1512 0 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
0 TABLE ACCESS BY INDEX ROWID TRANSACTIONS
387 INDEX FULL SCAN TRA_PK (object id 29041)

********************************************************************************

SELECT *
FROM transactions
WHERE status = :b2
AND transaction_type = 'BAL'
AND transaction_date < :b1
ORDER BY trans_id

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.01 0.01 0 1512 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.01 0.01 0 1512 0 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
0 TABLE ACCESS BY INDEX ROWID TRANSACTIONS
387 INDEX FULL SCAN TRA_PK (object id 29041)

********************************************************************************

DELETE FROM transactions t
WHERE status = :b1
AND trans_id !=
(SELECT MAX (trans_id)
FROM transactions tt
WHERE t.pos_device_id = tt.pos_device_id
AND status = :b1)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 23.07 22.64 0 2194958 21 5
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 23.08 22.64 0 2194958 21 5

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69 (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
0 DELETE
5 FILTER
329 TABLE ACCESS FULL TRANSACTIONS
325 SORT AGGREGATE
331 TABLE ACCESS FULL TRANSACTIONS

********************************************************************************

COMMIT

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 1 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.00 0.00 0 0 1 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69 (recursive depth: 1)
********************************************************************************

BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.05 0 0 0 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.01 0.05 0 0 0 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 69

********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 3 0.04 0.05 0 0 0 0
Execute 4 0.03 0.11 0 0 0 3
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 7 0.07 0.16 0 0 0 3

Misses in library cache during parse: 3
Misses in library cache during execute: 1

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 22 0.03 0.07 0 0 0 0
Execute 30 23.09 22.67 0 2194978 156 15
Fetch 23 0.05 0.05 0 4578 0 20
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 75 23.17 22.80 0 2199556 156 35

Misses in library cache during parse: 9

12 user SQL statements in session.
14 internal SQL statements in session.
26 SQL statements in session.
********************************************************************************
Trace file: dexit01_ora_19546.trc
Trace file compatibility: 9.00.01
Sort options: default

1 session in tracefile.
12 user SQL statements in trace file.
14 internal SQL statements in trace file.
26 SQL statements in trace file.
13 unique SQL statements in trace file.
306 lines in trace file.

April 12, 2006 - 11:44 am UTC

look at your delete - doesn't it just "pop right out and smack you in the forehead"?
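A quick Python sketch of why the DELETE stands out, using only numbers copied from the TKPROF output above. Reading the row source as "the correlated subquery ran about 325 times, each time full scanning TRANSACTIONS" is an interpretation of the plan rather than something the trace states directly, and the outer full scan contributes a small share of the gets too, so the per-execution figure is approximate.

```python
# Figures copied from the TKPROF sections above.
delete_query_gets = 2_194_958     # "query" column for the DELETE
delete_cpu_s = 23.08              # total CPU for the DELETE
rows_deleted = 5
subquery_executions = 325         # SORT AGGREGATE row count (assumed = executions)

recursive_total_gets = 2_199_556  # OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
recursive_total_cpu_s = 23.17

gets_per_subquery = delete_query_gets / subquery_executions
gets_per_deleted_row = delete_query_gets / rows_deleted
share_of_gets = delete_query_gets / recursive_total_gets
share_of_cpu = delete_cpu_s / recursive_total_cpu_s

print(f"Gets per subquery execution: {gets_per_subquery:,.0f}")
print(f"Gets per deleted row:        {gets_per_deleted_row:,.0f}")
print(f"Share of recursive gets:     {share_of_gets:.1%}")
print(f"Share of recursive CPU:      {share_of_cpu:.1%}")
```

Several thousand buffer gets for every scan of a table that holds a few hundred rows, repeated for nearly every candidate row, is where virtually all of the job's CPU goes.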

trace file

jade, April 12, 2006 - 12:10 pm UTC

Yes... the delete is a huge problem... at the same time, is the parse a problem also?

Every query running in the job gets at least 1 parse. Is it a hard parse?

Is there a document that explains the trace file in detail?

Jade

April 12, 2006 - 7:32 pm UTC

you must parse at least once - it is not a problem.

See the times there - that says it all - the execute is the time there.

the performance guide walks through the tkprof report.

Is the System CPU Bound?

Su Baba, April 12, 2006 - 4:45 pm UTC

Below is a portion of my statspack report over a period of ~8 minutes. Looking at the "Top 5 Timed Events," it appears that the system is spending 70% of its time waiting for CPU. However, the "Instance Activity Stats for DB" section shows that the system is using 0.12 CPU seconds per second and the machine has 2 CPUs. Isn't this contradictory? Am I reading the stats right?

...
...

Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 2 10-Apr-06 11:49:26 20 6.1
End Snap: 3 10-Apr-06 11:57:37 26 9.2
Elapsed: 8.18 (mins)

...
...

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 59 70.60
SQL*Net more data to client 148,551 9 10.50
log file switch completion 29 3 4.05
log file sync 2,249 3 4.04
log file parallel write 3,078 3 3.82
-------------------------------------------------------------
...
...

Instance Activity Stats for DB: xxxxx Instance: xxxxx Snaps: 2 -3

Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session                       5,895           12.0          3.4
CPU used when call started                     5,893           12.0          3.4
...
...

April 12, 2006 - 7:41 pm UTC

you cannot tell "waiting for cpu" from a statspack.
you can tell "how much cpu" was used.

In ~8 minutes, you used about 1/16th the cpu (assuming no long running calls that is!)

you used 1 out of 16 cpu minutes.

this is not a "wait", it is a "timed event" report
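
The arithmetic behind the "about 1/16th" figure can be sketched as follows. This is only an illustration using the numbers pasted above; the function name is made up for the example, and it assumes that "CPU used by this session" is reported in centiseconds (5,895 cs is about 59 s, which matches the "CPU time" line in the Top 5) and that the machine has the 2 CPUs mentioned in the question.

```python
def cpu_utilization(cpu_used_cs, elapsed_minutes, cpu_count):
    """Return the fraction of available CPU time that was actually used.

    cpu_used_cs is the "CPU used by this session" total, assumed here to be
    in centiseconds (hundredths of a second).
    """
    used_seconds = cpu_used_cs / 100
    available_seconds = elapsed_minutes * 60 * cpu_count
    return used_seconds / available_seconds


# Figures from the report above: 5,895 cs over 8.18 minutes on a 2-CPU machine.
utilization = cpu_utilization(5895, 8.18, 2)
print(f"{utilization:.1%} of the available CPU was used")  # 6.0%
```

With the interval rounded to 8 minutes the result is 58.95 / 960, roughly one sixteenth. The caveat about long running calls still applies: the calculation is only as good as the CPU statistic it starts from.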

Skipping internal statements in Statspack??

Shailesh Saraff, April 13, 2006 - 1:24 am UTC

Hello Tom,

We are using Oracle 10.1.0.2 on Windows 2000 SP4. The Statspack reports we generate show a lot of statements issued internally by Oracle. I would like to know how to skip these statements and report only those executed by our application. TKPROF has the parameter SYS=NO to skip internal statements. Is there something similar in Statspack?

Please let me know.

Thanks & Regards,

Shailesh

April 13, 2006 - 7:46 am UTC

not that I am aware of.

Dave, April 13, 2006 - 5:07 am UTC

If the internal statements are the ones using the most resources, it probably means your app wasn't running or doing anything. Or maybe your app is so good it doesn't need to use any resources.

So no, there is no way, and it would negate the point of it (which is to show the top SQL).

Su Baba, April 13, 2006 - 11:56 am UTC

Back to the question posted on 4/12/06 (statspack report pasted below again). How do I interpret this report? Did the system spend 70% of its time using CPU? Even though CPU time is the top timed event, the system is only using about 1/16th of the CPU. Assuming this report was generated during a period of stress testing (representing peak workload), does this mean the CPU is under-utilized?

...
...

Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 2 10-Apr-06 11:49:26 20 6.1
End Snap: 3 10-Apr-06 11:57:37 26 9.2
Elapsed: 8.18 (mins)

...
...

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                           59    70.60
SQL*Net more data to client                       148,551           9    10.50
log file switch completion                             29           3     4.05
log file sync                                       2,249           3     4.04
log file parallel write                             3,078           3     3.82
-------------------------------------------------------------
...
...

Instance Activity Stats for DB: xxxxx Instance: xxxxx Snaps: 2 -3

Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session                       5,895           12.0          3.4
CPU used when call started                     5,893           12.0          3.4
...
...

April 14, 2006 - 11:56 am UTC

if this was a stress test AND the calls to the database were short, then this says:

you had 8*2*60 = 960 cpu seconds to use
you used 59 of them.

during that period of time, you experienced almost no waiting whatsoever.

The "system" spent 6% of its time doing something (59 out of a possible 960 cpu seconds were used)
The "system" spent about 94% of its time just sitting there waiting to be told to do something.

Now, if your "peak" was measured like this:

a) at time T0 we started some PLSQL and it started working. This plsql takes a while to run.

b) at time T1 we snapped
c) at time T2 (8 minutes later) we snapped

d) at time T3 the plsql started to finish executing.....

Then you have a "not useful" statspack report since the CPU and some other V$ related statistics (which are not updated UNTIL THE CALL COMPLETES - time T3 in this case) are not accounted for in this report.

That is what I mean when I said "assuming you did short calls". If you have a client application doing things like parse, fetching, closing, doing updates and such - then the cpu time is pretty much accurate. If you ran something like I described above with t0, t1, t2, t3 - then your cpu timings are likely way off.
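
The T0/T1/T2/T3 scenario can be illustrated with a toy model. This is a deliberately simplified sketch, not how Oracle accounts for CPU internally: it just assumes, as described above, that a call's CPU is credited to the statistic only when the call completes. The function name and all the numbers are hypothetical.

```python
def cpu_visible_between(calls, snap_begin, snap_end):
    """Sum the CPU of calls that COMPLETE inside the snapshot window.

    Each call is a (start, end, cpu_seconds) tuple. The model credits the
    whole CPU cost of a call at its end time, mimicking a statistic that is
    not updated until the call returns.
    """
    return sum(cpu for start, end, cpu in calls if snap_begin < end <= snap_end)


# Times in seconds. The snapshots are taken at T1 = 60 and T2 = 540 (8 minutes apart).
T1, T2 = 60, 540

# Many short calls: one call per second, each using 0.1 cpu seconds.
short_calls = [(t, t + 0.1, 0.1) for t in range(T1, T2)]

# One long PL/SQL call: starts at T0 = 0, ends at T3 = 600, uses 500 cpu seconds.
long_call = [(0, 600, 500.0)]

print(round(cpu_visible_between(short_calls, T1, T2), 1))  # 48.0
print(cpu_visible_between(long_call, T1, T2))              # 0
```

In this model the short calls are fully accounted for in the window, while the long call contributes nothing to it even though it was burning CPU for the whole 8 minutes.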

Your advice about this statspack

Yoav, April 27, 2006 - 5:59 pm UTC

Hi Tom,
We have a third-party application on an Oracle 8.1.7.4 database that runs a job at
the end of each month. It took about 40 hours to complete.

I took a snapshot for this period and would like to get your opinion on my analysis:

1. The statspack ran 3030 min more than it should ...
2. 1 of 4 transactions ends with a rollback -> very bad application design.
See also SQL ordered by Gets.
3. Soft parse = 99.74. Too good to be true. cursor_sharing = FORCE is set,
due to a performance problem in the application.
4. The application is sorting heavily: about 9,500 times per second -> very bad application design.
5. A lot of row-by-row processing (see also SQL ordered by Executions).
Again, it looks like an application design problem; they are not using bulk processing.
6. From the "Instance Activity Stats", it looks like the PGA should be increased.
7. I would like your advice regarding the parameters at the end of the report.
Do the values of hash_area_size, db_block_buffers, buffer_pool_keep, and buffer_pool_recycle
look reasonable?
8. Because changing the application code could take a lot of time, I would appreciate any
advice from you that I could implement at the instance level.

Thank You very much.

STATSPACK report for

DB Name DB Id Instance Inst Num Release OPS Host
------------ ----------- ------------ -------- ----------- --- ------------
xxxx 12345 xxxxx 1 8.1.7.4.0 NO xxxx

Snap Id Snap Time Sessions
------- ------------------ --------
Begin Snap: 12764 01-May-06 19:00:31 456
End Snap: 12815 03-May-06 22:00:59 456
Elapsed: 3,060.47 (mins)

Cache Sizes
~~~~~~~~~~~
db_block_buffers: 524288 log_buffer: 262144
db_block_size: 4096 shared_pool_size: 89128960

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 129,876.68 29,747.96
Logical reads: 2,054.39 470.55
Block changes: 949.55 217.49
Physical reads: 299.29 68.55
Physical writes: 72.85 16.69
User calls: 329.95 75.58
Parses: 37.97 8.70
Hard parses: 0.10 0.02
Sorts: 9,597.31 2,198.24
Logons: 0.07 0.02
Executes: 195.04 44.67
Transactions: 4.37

% Blocks changed per Read: 46.22 Recursive Call %: 16.25
Rollback per transaction %: 24.00 Rows per Sort: 0.03

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 85.43 In-memory Sort %: 100.00
Library Hit %: 99.97 Soft Parse %: 99.74
Execute to Parse %: 80.53 Latch Hit %: 99.99
Parse CPU to Parse Elapsd %: 79.17 % Non-Parse CPU: 100.00

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 70.24 77.72
% SQL with executions>1: 73.86 25.35
% Memory for SQL w/exec>1: 85.29 74.91

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read 15,204,906 6,856,997 79.03

SQL ordered by Gets
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100

Buffer Gets Executions Gets per Exec % Total Hash Value
--------------- ------------ -------------- ------- ------------
60,169,138 190,149 316.4 15.9 3478003248
rollback

SQL ordered by Executions
-> End Executions Threshold: 100

Executions Rows Processed Rows per Exec Hash Value
------------ ---------------- ---------------- ------------
1,981,159 1,981,159 1.0 603193705
SELECT to_char(last_day(to_date(:1,'yyyymmdd')),'yyyymmdd')
from dual

1,601,575 1,601,575 1.0 1171422917
insert into tc_invoice_lines (invoice_type,invoice_year, invoice
_no,line_no,asmachta_no, item_code,debit_credit_reason, total_am
ount,call_count, dur_min_count,total_amount_for_vat, currency_co
de, currency_amount, heb_inv_reason, display_dur_min_count, orig
in_amount, origin_amount_for_vat) values (:1,:2,:3,:4, :5,:6,:7,

1,585,352 1,585,643 1.0 3414820554
update tc_subs_detail set prev_invoice_date = last_invoice_date,
last_invoice_date = to_date(:1,'yyyymmdd') where area = :2 and
phone = :3 and heker_no = :4 and start_date >= to_date(:5,'yyyym
mddhh24miss')

1,585,351 0 0.0 1735577819
SELECT to_char(min_call_date,'yyyymmddhh24miss'), to_char(max_ca
ll_date,'yyyymmddhh24miss'), (pulse_rate - discount_amount), pul
se_rate, number_of_calls, dur_count, call_type, load_by, unrated
_minute, number_of_unrated_calls, unlimit_internet_use, user_nam
e, 0, discount_amount, accum_by_region, rowid, count_attendees,

1,264,823 1,264,823 1.0 2104591056
insert into tc_deb_cred(area,phone,heker_no,charge_sign, total_c
harge,value_date,deb_cred_status,invoice_year,bezeq_asmachta, de
bit_credit_reason,invoice_number, heskem_no,mivtza_no,source_id,
action_date,deb_cred_status_date,user_no, invoice_type, term_co
de ,charge_month, deb_cred_eng_reason, finterms_line_no,internet

932,026 932,026 1.0 612177218
update tc_finterms set prev_charge_date = last_charge_date, last
_charge_date = to_date(:1,'yyyymmdd'), updating_date = to_date(:
2,'yyyymmddhh24miss') where rowid = :3

869,796 869,796 1.0 1485258072
SELECT to_char(last_day(add_months( to_date(:1,'yyyymmdd'),1)),
'yyyymmdd') from dual

Instance Activity Stats for DB:

Statistic Total per Second per Trans
--------------------------------- ---------------- ------------ ------------
...
session pga memory 19,394,921,792 105,620.7 24,192.2
session pga memory max 19,551,900,520 106,475.6 24,388.0
session uga memory ################ ############ ############
session uga memory max 1,466,483,896 7,986.2 1,829.2
sorts (disk) 11 0.0 0.0
sorts (memory) 1,762,334,147 9,597.3 2,198.2
...
table fetch by rowid 73,565,348 400.6 91.8
table fetch continued row 893,580 4.9 1.1
table scan blocks gotten 56,925,841 310.0 71.0
table scan rows gotten 2,183,093,123 11,888.7 2,723.1

Buffer Pool Statistics for DB:
-> Pools D: default pool, K: keep pool, R: recycle pool

Free Write Buffer
Buffer Consistent Physical Physical Buffer Complete Busy
P Gets Gets Reads Writes Waits Waits Waits
- ----------- ------------- ----------- ---------- ------- -------- ----------
D 66,778,823 114,266,802 53,178,752 12,523,898 0 0 3
K 2,671 57,744 2,671 0 0 0 0
R 1,569,242 13,264,898 1,484,301 558,538 0 0 0
-------------------------------------------------------------

Buffer wait Statistics for DB:
-> ordered by wait time desc, waits desc

Tot Wait Avg
Class Waits Time (cs) Time (cs)
------------------ ----------- ---------- ---------
data block 2 1 1
undo header 1 0 0

init.ora Parameters for DB:

End value
Parameter Name Begin value (if different)
----------------------------- --------------------------------- --------------
O7_DICTIONARY_ACCESSIBILITY TRUE
_trace_files_public TRUE
audit_file_dest /software/oracle/admin/xxxx/aud
audit_trail DB
background_dump_dest /software/oracle/admin/xxxx/bdu
backup_tape_io_slaves TRUE
buffer_pool_keep 15616
buffer_pool_recycle 262144
commit_point_strength 100
compatible 8.1.7
control_files /xxxx/oractl1/control01.ctl, /b
core_dump_dest /software/oracle/admin/xxxx/cdu
cursor_sharing FORCE
db_block_buffers 524288
db_block_lru_latches 16
db_block_size 4096
db_domain WORLD
db_file_multiblock_read_count 8
db_files 1024
db_name xxxxx
db_writer_processes 4
distributed_transactions 250
dml_locks 900
event 4031 trace name errorstack level
hash_area_size 0
hash_join_enabled FALSE
java_pool_size 20971520
job_queue_interval 60
job_queue_processes 1
large_pool_size 0
log_archive_format %t_%S.arc
log_archive_start FALSE
log_buffer 262144
log_checkpoint_interval 0
log_checkpoint_timeout 0
max_dump_file_size 10240
max_enabled_roles 60
open_cursors 1000
optimizer_mode RULE
parallel_max_servers 0
parallel_min_servers 0
partition_view_enabled TRUE
processes 300
remote_dependencies_mode SIGNATURE
remote_login_passwordfile NONE
remote_os_authent TRUE
resource_limit TRUE
rollback_segments BIG_RS, R131, R132, R133, R134, R
session_cached_cursors 40
shared_pool_size 89128960
sort_area_retained_size 4194304
sort_area_size 4194304
timed_statistics TRUE
transactions 913
transactions_per_rollback_seg 5
utl_file_dir *

Statspack - other uses

Jayadevan, August 14, 2006 - 12:41 am UTC

Dear Tom,
Can I use statspack to find out which are the most frequently used procedures/functions in an application? Is there any other way of doing this? I am doing the Database Design Review of an application with 900-odd tables and a really large number of procedures/functions. I want to choose the procedures/functions that are most frequently used instead of going through each and every line of code. For queries, I used TKPROF and chose the most 'problematic' queries.

August 14, 2006 - 10:46 am UTC

no, you cannot.

only the top level calls appear in v$ tables (the "begin PROCEDURE; end;" type of anonymous blocks)

Things that are not directly called via a client - but rather from some other plsql code that is called by the client - you cannot see that.

There is a source code profiler however, dbms_profiler. Search for that on this site.

A reader, August 17, 2006 - 12:45 pm UTC

Hi Tom,
Could you advise whether this formula is approximately correct?
Thank you.

CPU used when call started = parse time cpu + recursive cpu usage + other
OR
CPU used when call started = recursive cpu usage + other (because parse time cpu is included in recursive cpu usage)

August 17, 2006 - 1:05 pm UTC

it would be all cpu used - regardless of the source, so first one.

the others are just breakouts of the aggregate.

A reader, August 17, 2006 - 3:06 pm UTC

Hi Tom,
Sorry to waste your valuable time, but could you confirm
that the recursive cpu usage in the Statspack report
does not include parse time CPU?
Thanks again.

August 17, 2006 - 3:23 pm UTC

never really went into it that far.

but parse time cpu would be something that included recursive sql cpu if anything :)

a parse might cause recursive sql to be executed.
recursive sql might have to be parsed.
which might incur some more recursive sql.
........

six one way, 1/2 dozen the other. But a parse would be the thing that would trigger recursive sql.

Raghu, August 22, 2006 - 12:44 am UTC

While trying to read the statspack-related responses on your site, I see that there are some diagnostic items to look for : keep statspack cycles short, check soft parse % (higher=better) and so on and we are trying to follow those as we troubleshoot a performance issue. To fix this, we recently went from 4CPU to 10CPU and bumped up RAM from 24G to 28G in our DW and performance did not *budge*. Not one bit! I am wondering if we are doing anything that is artificially limiting the horsepower that the DB can get. Here's a very recent statspack: Parse CPU to Parse Elapsd % is 9.52. Isn't that way off? Anything else that is glaringly bad? DB size is about 450G. Runs on IBM AIX.

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
xxxxx 1665185253 xxxxx 1 9.2.0.7.0 NO yyyyy

Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 8303 20-Aug-06 22:00:04 64 459.9
End Snap: 8304 20-Aug-06 22:30:04 78 379.7
Elapsed: 30.00 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 2,048M Std Block Size: 16K
Shared Pool Size: 800M Log Buffer: 4,096K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 1,345,265.56 1,402,129.71
Logical reads: 44,630.19 46,516.70
Block changes: 7,918.18 8,252.88
Physical reads: 1,328.73 1,384.90
Physical writes: 663.81 691.86
User calls: 242.72 252.98
Parses: 9.30 9.69
Hard parses: 2.64 2.75
Sorts: 3.63 3.79
Logons: 0.57 0.59
Executes: 5,049.18 5,262.61
Transactions: 0.96

% Blocks changed per Read: 17.74 Recursive Call %: 95.53
Rollback per transaction %: 0.23 Rows per Sort: ########

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 98.19 In-memory Sort %: 99.01
Library Hit %: 99.89 Soft Parse %: 71.59
Execute to Parse %: 99.82 Latch Hit %: 99.48
Parse CPU to Parse Elapsd %: 9.52 % Non-Parse CPU: 99.61

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 70.10 76.86
% SQL with executions>1: 82.75 80.46
% Memory for SQL w/exec>1: 40.16 38.53

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file scattered read 94,332 4,053 25.24
db file sequential read 209,284 3,842 23.92
CPU time 2,253 14.03
direct path read 98,116 1,748 10.89
PX Deq Credit: send blkd 440,587 1,432 8.92
-------------------------------------------------------------

August 27, 2006 - 1:57 pm UTC

that would simply mean you didn't need more cpu and you didn't need more ram.

You have a data warehouse, so you are not parsing very often at all (and you aren't). So a ratio based on small numbers isn't going to be hugely meaningful.

could well be: suboptimal sql. lack of use of features such as materialized views (which are the "indexes" of a data warehouse), bitmap indexes and the like.

I would not focus on parse related issues on a system where the amount of time spent parsing is trivial when compared to the amount of time spent EXECUTING sql (a warehouse is characterized by queries that take seconds - sometimes many seconds - to execute, as opposed to most traditional systems where queries are expected to be instantaneous).

seems most of the time spent waiting for stuff right now is..... all IO related, the one resource you didn't really seem to touch with your hardware upgrade.
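
To put numbers on that, here is a small sketch that adds up the I/O related lines of the Top 5 above and recomputes the Execute to Parse ratio from the Load Profile. The grouping of events into "I/O" is an assumption made for this example (the three read events), the function names are made up, and the Execute to Parse formula shown is the usual 100 * (1 - parses / executes), which reproduces the 99.82 printed in the report.

```python
# (event, time in seconds, % of total elapsed time) copied from the Top 5 above.
top5 = [
    ("db file scattered read", 4053, 25.24),
    ("db file sequential read", 3842, 23.92),
    ("CPU time", 2253, 14.03),
    ("direct path read", 1748, 10.89),
    ("PX Deq Credit: send blkd", 1432, 8.92),
]

# Assumption for this sketch: these three events are the I/O waits in the list.
IO_EVENTS = {"db file scattered read", "db file sequential read", "direct path read"}


def io_share(events):
    """Return (seconds, percent) spent in the I/O events of a Top 5 list."""
    seconds = sum(s for name, s, _ in events if name in IO_EVENTS)
    percent = sum(p for name, _, p in events if name in IO_EVENTS)
    return seconds, percent


def execute_to_parse_pct(parses_per_sec, executes_per_sec):
    """Share of executions that did not need a parse call, as a percentage."""
    return 100 * (1 - parses_per_sec / executes_per_sec)


io_seconds, io_percent = io_share(top5)
print(f"I/O waits: {io_seconds} s, {io_percent:.2f}% of the timed total")      # 9643 s, 60.05%
print(f"Execute to Parse %: {execute_to_parse_pct(9.30, 5049.18):.2f}")        # 99.82
```

Roughly 60% of the timed total is in the three read events against 14% for CPU, and only about 1 execution in 540 involves a parse call at all, which is why neither the extra CPUs nor the parse ratio is the place to look.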

CPU used when call started

A, August 22, 2006 - 8:39 am UTC

Hi Tom.
Please clarify the difference between "CPU used by this session" and "CPU used when call started".
You said "the amount of cpu time you already used when before the current call (if any) started".
But doesn't "CPU used by this session" also exclude the time of the current call until it ends?
Why is cpu used by this session sometimes less than cpu used when call started, while at other times the reverse is true?

I have searched a lot, but I still do not understand :(

Parsing vs Executing

Raghu, August 28, 2006 - 1:37 pm UTC

You said "I would not focus on parse related issues on a system where the amount of time spent parsing is trivial when compared to the amount of time spent EXECUTING sql" and that's valid. You also said that we weren't either CPU bound or RAM bound but rather IO bound - which is why the bump up did little good. Our DBA/SysAdmin says that IO isn't being contented for and is not the problem - can *statspack* tell us something specific about what the bottleneck might be?

August 28, 2006 - 4:38 pm UTC

are they the same person that said "we need more cpu and memory" by any chance???

You are spending most of your time waiting for IO (that is ALL ANYONE can say from this statspack).

You might well not have a "bottleneck" at all, you may have reached the physical capacity of your given machine to perform.

but who knows, given the tiny bit of information we have here - you cannot really say...

contended

Raghu, August 28, 2006 - 1:38 pm UTC

Sorry, I meant contenDed for

CPU used when call started

A, August 29, 2006 - 2:57 am UTC

Hi Tom
(sorry for the repetition)

Please clarify the difference between "CPU used by this session" and "CPU used when call started".
You said "the amount of cpu time you already used when before the current call (if any) started".
But doesn't "CPU used by this session" also exclude the time of the current call until it ends?
Why is cpu used by this session sometimes less than cpu used when call started, while at other times the reverse is true?

I have searched a lot, but I still do not understand :(

CPU used when call started

A, September 01, 2006 - 7:56 am UTC

Hi Tom.
Tom, please answer (I asked on 22 August and 29 August here).
Or is the question very stupid?

September 01, 2006 - 8:34 am UTC

they are basically the same for all intents and purposes. are you seeing abnormally wide variations between them?

one just reflects the cpu used before the call started, at the start of a user call - i.e. just before executing a parse, exec, fetch etc.

the other on exit from a user call - i.e. just before returning to the client process

CPU used when call started

A, September 01, 2006 - 9:29 am UTC

You said "are you seeing
abnormally wide variations between them?"
Yes, sometimes I see that cpu used by this session is three times higher.

September 01, 2006 - 10:36 am UTC

three times what, a huge number or a small number.

3 is three times 1

but the variation is trivial

good way to model physical writes?

Ryan, September 14, 2006 - 10:24 am UTC

We are doing performance testing on an application in development. We want to see how much CPU, Physical IO, waits, etc... everything is doing so we can do capacity planning.

Is there a good way to model physical writes for a given PL/SQL job? We have been forcing checkpoints to get the numbers and then taking an AWR snapshot. Is there a better way?

We are also having trouble getting consistent redo information. I set autotrace on and do an array insert into a table. I run it 5 times and get 5 different redo numbers. I am the only person on the box. Why does this happen?

September 14, 2006 - 10:46 am UTC

why would you force anything? you just made it entirely artificial.

index splits will generate different amounts of redo (for example). Space management (we needed to extend the table) will generate different amounts of redo. I would never expect them to be identical.

Model your system
Set up the simulation
Let it run and run - snapshot it from time to time.

Your real system will not be a static model either, it will fluctuate up and down over time.

ORA-03001: unimplemented feature

Khurram, January 16, 2007 - 8:01 am UTC

When I look at the diagnostic summary in 10g OEM, there are 5 findings in the performance area. When I drill down, ADDM has identified a number of performance issues, all of which are SQL statements. When I click on these individual SQL statements and run the advisor, it gives me no recommendation for some of them and mostly gives ORA-03001: unimplemented feature in the recommendations pane for the others. Why is ADDM giving no recommendation and raising this error? Is some ADDM configuration missing?

ORA-03001: unimplemented feature

SQL> SELECT banner FROM v$version
  2  /

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.1.0.2.0 - Prod
PL/SQL Release 10.1.0.2.0 - Production
CORE    10.1.0.2.0      Production
TNS for 32-bit Windows: Version 10.1.0.2.0 - Production
NLSRTL Version 10.1.0.2.0 - Production

SQL> show parameter compatible

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
compatible                           string      10.1.0.2.0
SQL>

Thanx Tom

Khurram

Alexander the ok, March 07, 2007 - 10:03 am UTC

Tom,

I was wondering if you know of an easy way to check whether the database was or is working hard, or "maxed out" if you will. I can see statspack has a % memory used, which is nice, but is there anything I can find out about CPU? We have an LPAR configuration where the amount of resources changes dynamically based on what the database needs. It's difficult to know if the CPU is all used because it always appears that way; it will just allocate more if it needs it.

I don't think wait events on a stats pack will give me what I want because if someone says "the database was slow yesterday", and I look at waits, there's no "bad" wait right? I mean, maybe it is maybe it isn't? Does this come down to keeping a snapshot around for comparison?

I guess what I'm really asking is a way to say, "yep the database was really busy during that time, I should look into that" Instead of, the application is slow, maybe it's the network, appserver, etc etc and I can't really get that from the workload repository, or can I? Thanks.

March 07, 2007 - 11:08 am UTC

statspack reports cpu used as well.

We cannot tell you how much of the "system" cpu that relates to - since that is dynamic and changing.

Performance tuning methods

Anand, March 12, 2007 - 7:20 pm UTC

Tom,

I am trying to tune very big application procedures in my Prod instance. Unfortunately, we don't have an equivalent volume of data in any other environment, and management doesn't agree to enable trace on the problem procedures in Production. So my only option is to trace the problem queries in the procedures by going through each query, generating its plan, and finding the problem query.

However, if the query is not the problem and the problem is due to a huge number of executions through a cursor loop, I am not able to figure this out just by going through the procedure. Also, going through 4000 lines of code to find the problem query is time consuming. The developers are already overloaded, and they can't spend time explaining the business logic to me so that I can find out how many times a loop will be executed.

I get a stats report every week, and all the procedures in the top 20 have been in the top 20 for the last 6 months.

I was also checking v$sql for this, but I have not been successful in finding a solution.

Can you please suggest what methods I can use to find the problem queries within the procedures that are in the top 20 of the stats report?

Your help is greatly appreciated

Thanks
Anand

March 12, 2007 - 8:49 pm UTC

... procedures by going through each query,generate plan and findout the problem query ...

I'd be really interested in how THAT WORKS... You can read plans and say "good or bad"?? If you can, please email me directly - we need to codify that, we call that an "optimizer" :)

... and they can't spend time for me to explain business l ...

lol, let us see, you have applications that kill the system, that don't perform in a fashion end users can use them - umm, what are the developers doing that would mean "we cannot make the existing system actually function as it should"??

if you have 10g, start using AWR and ADDM.

it could well be you have no problem queries, only problem procedures.

what is a "stats report"

I am using 9.2.0.4

Anand, March 12, 2007 - 7:24 pm UTC

In the last mail I forgot to mention that I am using 9.2.0.4 in SUN cluster oracle RAC env.

Performance tuning methods

Anand, March 13, 2007 - 6:07 pm UTC

Tom,

Thanks very much for your suggestions.

I am sorry for typing stats report instead of statspack report.

So, in summary: if there is no possibility of enabling trace in Production and there is no equivalent production environment, we have to know the business logic to know how many times the loops inside the procedures will execute.

How will it impact Production if I enable trace only for the particular problem procedure, one time?

Thanks
Anand

March 13, 2007 - 9:02 pm UTC

how will it impact production?

probably not much at all, for a single session, one time.

How to find in which Procedure/Function the particular query is

Anand, March 14, 2007 - 2:57 pm UTC

Tom,

Using v$sql, I found many queries that are doing too many buffer gets. How can I find out which procedure they are in, so that I can inform the developer?

Thanks
Anand

March 15, 2007 - 7:19 am UTC

depends, if you have current software, you have the columns

PROGRAM_ID NUMBER Program identifier
PROGRAM_LINE# NUMBER Program line number

and if the programmer is using dbms_application_info, you have action, client_info, and module to look at.

but you do have the sql_text, you could use "like" on dba_source and see if you can get a hit as well.

Dead lock errors in alert log and audit option

Anand, March 15, 2007 - 5:05 pm UTC

Tom,

Thanks very much for your suggestions. I am learning a lot from your asktom.oracle.com.

Can you please help me resolve 2 more issues related to performance?

1) For one of my applications, a deadlock is happening, but it is not being reported in Oracle's alert.log. We could capture this error in our application log. Someone in my project says that if we capture it internally, it won't appear in the Oracle alert.log.

How is that possible? I think that if there is a deadlock, it should be in alert.log. Can you please tell me which is correct?

2) In my Prod DB, audit sys logging is enabled. I want to trace all update statements on a particular table, with a timestamp and the user who executed them. How can I get this information?

Thanks
Anand

March 16, 2007 - 3:08 pm UTC

1) then it is not happening. tell us what YOU think a deadlock is.

2) in 10g, you can use fine grained auditing (dbms_fga). In all releases, you can audit the fact that the table was updated - but not the statement (via the simple AUDIT command). In 9i you could write a DML trigger to do custom auditing.

Joins and Index fast full scan

Anand, March 16, 2007 - 11:48 pm UTC

Tom,

Thanks very much for your suggestions.

In SQL tuning, I am comfortable with finding the correct indexes, statistics, number of distinct values, column order in an index, etc., but I would like to know about tuning join queries. If two tables are joined on 3 columns, the query goes for a full table scan on both tables. I am asking the developers to use more where clause conditions (not joins) to make use of an index or for better performance.

Mark Gurry (SQL Tuning Pocket Reference) says the joined column should be in the leading position in an index, but I have never seen a query use the index even though the join column is leading in it. Can you please suggest where I can get more information about this?

Why are index fast full scans very expensive? It is almost equivalent to scanning the full table. In which situations does the optimizer use an index fast full scan? How does it work?

Thanks
Anand

March 17, 2007 - 4:11 pm UTC

... I am requesting developrs to use more where clause conditions(not joins) to make use of Index or for more performance ...

well, that just does not make SENSE. No sense whatsoever.

they should put more predicates in? why? that *changes* what they are going after. You cannot tune by changing the question, you have to take the questions they need to answer and answer them.

I disagree in general that

a) full scans are bad
b) indexes are good
c) the join columns should be first
d) that index fast full scans are expensive.

UNDO and TEMP usage by a schema

Anand, March 26, 2007 - 7:37 pm UTC

Tom,

I am learning a lot from you. Thanks for your help. Can I get one more suggestion about performance?

1) Can you please suggest a procedure for identifying the application with high resource consumption?

2) How can I find out the UNDO and TEMP space usage by a schema? Do we have any tables for this?

3) If I want to get the UNDO, TEMP space, or any other resource used by a schema over a 24-hour period, can I get this info?

I am using Oracle 9.2.0.4 in SUN cluster environment.

Thanks
Anand

March 27, 2007 - 9:25 am UTC

1) define high resource first.

2) do you mean by a session? a schema doesn't use that

3) ok, you seem to mean by "login". getting "temp" will not really be possible, as the amount of temp goes up and down over the life of a session (unless you poll frequently and snapshot the v$ tables).

Same with UNDO.

System Performance

RAM, May 18, 2007 - 5:07 am UTC

Here is the excerpt from the statspack data in our system

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:              9,159.71              3,729.73
              Logical reads:             10,094.39              4,110.32
              Block changes:                 28.69                 11.68
             Physical reads:                449.99                183.23
            Physical writes:                 23.57                  9.60
                 User calls:                367.93                149.82
                     Parses:                 54.01                 21.99
                Hard parses:                  0.16                  0.06
                      Sorts:                 10.75                  4.38
                     Logons:                  0.07                  0.03
                   Executes:                101.99                 41.53
               Transactions:                  2.46

Shared Pool Statistics        Begin   End
                              ------  ------
             Memory Usage %:   93.11   93.29
    % SQL with executions>1:   18.00   17.97
  % Memory for SQL w/exec>1:   24.29   24.26

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time                                                          510    91.68
LNS wait on SENDREQ                                   211           9     1.67
log file sync                                       1,900           6     1.13
db file scattered read                             48,772           6     1.10
log file parallel write                             3,885           5      .98

CPU used by this session                      50,969           57.0         23.2
parse count (failures)                             3            0.0          0.0
parse count (hard)                               140            0.2          0.1
parse count (total)                           48,342           54.0         22.0
parse time cpu                                   519            0.6          0.2
parse time elapsed                               531            0.6          0.2
recursive calls                               69,891           78.1         31.8
recursive cpu usage                           17,679           19.8          8.0

We are having performance issues in this system. We captured many SQL statements that do not use bind variables and that are executed heavily on the system.
We verified this from the Shared Pool Statistics section,
where it is shown as % SQL with executions>1: 18.00 17.97

But interestingly the hard parses are low, and the CPU breakdown for parsing is low as well,

i.e. parse CPU is around 1% (519/50969).

I am not able to understand why the hard parses and the parse CPU share are low even though we have captured SQL statements that are executed heavily and do not use bind variables.

We are not CPU bound either: we have 4 CPU seconds available per second and are using 0.57 CPU seconds per second.

Please can I have your thoughts on this?

Thanks in advance

May 18, 2007 - 4:05 pm UTC

so, that would mean

a) you do not have that many sql's not binding - you parse 48,000 plus statements, only 140 are "new"

b) your verification approach is not meaningful, it doesn't show "not using binds", it shows you parse, execute, close - parse, execute, close - the same statements over and over and over. They may (likely do in this case) use binds.

I see nothing to indicate "slow" here.
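The arithmetic behind that reading can be checked from the figures posted above. This is a minimal Python sketch, assuming the totals in the excerpt cover the whole snapshot interval and that the time statistics are in hundredths of a second; the variable and function names are illustrative only and are not part of STATSPACK.

```python
# Figures copied from the statspack excerpt above (totals for the interval).
parse_count_total = 48_342
parse_count_hard = 140
parse_time_cpu = 519      # hundredths of a second
cpu_used = 50_969         # "CPU used by this session", hundredths of a second
cpu_per_second = 57.0     # per-second column of "CPU used by this session"


def pct(part, whole):
    """Return part as a percentage of whole, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


hard_parse_pct = pct(parse_count_hard, parse_count_total)
soft_parse_pct = round(100 - hard_parse_pct, 2)
parse_cpu_pct = pct(parse_time_cpu, cpu_used)
cpu_seconds_per_second = cpu_per_second / 100

print(f"hard parses:   {hard_parse_pct}% of all parse calls")
print(f"soft parses:   {soft_parse_pct}% of all parse calls")
print(f"parse CPU:     {parse_cpu_pct}% of CPU used")
print(f"CPU consumed:  {cpu_seconds_per_second} CPU seconds per second")
```

With well under 1% of parse calls being hard parses, the numbers are consistent with statements being re-parsed (soft parsed) repeatedly rather than with a flood of unique, unbound SQL.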

Nice Update

RAM, May 18, 2007 - 5:25 pm UTC

Hi Tom,

Thanks for the excellent update.

I just wanted to clarify that there is nothing slow at the database level and close this issue at the database end and start investigating at other tiers.

Thanks again

Statspack Level 7

Anand Varadarajan, May 23, 2007 - 8:17 am UTC

The results from my 1-hour statspack report (I know the interval should be reduced to 15 minutes) show the DUAL table. The buffer busy waits are understandable, but I don't know what the remaining items mean. I have not been able to understand why there would be a lock on DUAL.

Top 5 Buf. Busy Waits per Segment for DB: DMS  Instance: DMS1  Snaps: 87 -89
-> End Segment Buffer Busy Waits Threshold:       100

                                                                  Buffer
                                           Subobject  Obj.          Busy
Owner      Tablespace Object Name          Name       Type         Waits  %Total
---------- ---------- -------------------- ---------- ----- ------------ -------
MULDMS     APP        GBL_REP_MIS_DAILY_ST            TABLE           68   50.00
MULDMS     APP        IDX_GBL_REP_MIS_DAIL            INDEX           32   23.53
SYS        SYSTEM     DUAL                            TABLE           12    8.82
MULDMS     APP        SVI_NEW_SWCL                    TABLE           11    8.09
MULDMS     USER6      VH_PSF                          TABLE            4    2.94
          -------------------------------------------------------------


Top 5 Row Lock Waits per Segment for DB: DMS  Instance: DMS1  Snaps: 87 -89
-> End Segment Row Lock Waits Threshold:       100

                                                                     Row
                                           Subobject  Obj.          Lock
Owner      Tablespace Object Name          Name       Type         Waits  %Total
---------- ---------- -------------------- ---------- ----- ------------ -------
SYS        SYSTEM     DUAL                            TABLE           90   91.84
MULDMS     USER5      PH_RECEIPTS                     TABLE            8    8.16
          -------------------------------------------------------------
Top 5 CR Blocks Served (RAC) per Segment for DB: DMS  Instance: DMS1  Snaps: 87
-> End Global Cache CR Blocks Served Threshold:      1000

                                                                      CR
                                           Subobject  Obj.        Blocks
Owner      Tablespace Object Name          Name       Type        Served  %Total
---------- ---------- -------------------- ---------- ----- ------------ -------
MULDMS     USER4      AM_USERS                        TABLE        3,395   16.39
SYS        SYSTEM     DUAL                            TABLE        1,525    7.36
MULDMS     USER5      PD_RECEIPTS                     TABLE        1,468    7.09
MULDMS     INDX       SYS_C0014503                    INDEX        1,382    6.67
MULDMS     INDX2      IDX_PD_RECEIPTS$1               INDEX          746    3.60
          -------------------------------------------------------------

May 23, 2007 - 8:56 am UTC

those numbers are so tiny.. why do you think they are a problem?

are you using an old sqlplus client? there was a release that used to do a select for update on dual upon logging in.

STATSPACK Observations

Mark, May 23, 2007 - 11:29 am UTC

Hello Tom,

This is the STATSPACK report for 1 hour duration. I have some questions.

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
ORADB1 2430511375 ORADB1 1 9.2.0.6.0 NO ORACLE

Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 29275 23-May-07 13:00:02 95 8,212.8
End Snap: 29276 23-May-07 14:00:01 98 7,965.7
Elapsed: 59.98 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 25,216M Std Block Size: 8K
Shared Pool Size: 800M Log Buffer: 1,024K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 23,894.69 2,423.20
Logical reads: 38,129.84 3,866.81
Block changes: 129.02 13.08
Physical reads: 135.94 13.79
Physical writes: 29.44 2.99
User calls: 667.84 67.73
Parses: 123.04 12.48
Hard parses: 2.71 0.27
Sorts: 21.24 2.15
Logons: 0.14 0.01
Executes: 134.32 13.62
Transactions: 9.86

% Blocks changed per Read: 0.34 Recursive Call %: 18.47
Rollback per transaction %: 0.81 Rows per Sort: 292.27

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 99.98
Buffer Hit %: 99.64 In-memory Sort %: 99.99
Library Hit %: 98.89 Soft Parse %: 97.80
Execute to Parse %: 8.40 Latch Hit %: 100.00
Parse CPU to Parse Elapsd %: 29.71 % Non-Parse CPU: 97.40

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 79.06 74.84
% SQL with executions>1: 42.47 52.57
% Memory for SQL w/exec>1: 87.76 89.70

Instance Activity Stats for DB: ORADB1 Instance: ORADB1 Snaps: 29275 -29276

Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session 507,945 141.1 14.3
CPU used when call started 489,641 136.1 13.8
CR blocks created 1,057 0.3 0.0
Cached Commit SCN referenced 0 0.0 0.0
Commit SCN cached 0 0.0 0.0
DBWR buffers scanned 0 0.0 0.0
DBWR checkpoint buffers written 21,430 6.0 0.6
DBWR checkpoints 34 0.0 0.0
DBWR free buffers found 0 0.0 0.0
DBWR lru scans 0 0.0 0.0
DBWR make free requests 0 0.0 0.0
DBWR revisited being-written buff 0 0.0 0.0
DBWR summed scan depth 0 0.0 0.0
DBWR transaction table writes 242 0.1 0.0
DBWR undo block writes 6,458 1.8 0.2
DFO trees parallelized 0 0.0 0.0
PX local messages recv'd 0 0.0 0.0
PX local messages sent 0 0.0 0.0
Parallel operations downgraded 25 0 0.0 0.0
SQL*Net roundtrips to/from client 2,062,875 573.2 58.1
SQL*Net roundtrips to/from dblink 166,228 46.2 4.7
active txn count during cleanout 646 0.2 0.0
background checkpoints completed 22 0.0 0.0
background checkpoints started 22 0.0 0.0
background timeouts 12,680 3.5 0.4
branch node splits 1 0.0 0.0
buffer is not pinned count 121,648,051 33,800.5 3,427.8
buffer is pinned count 66,885,999 18,584.6 1,884.7
bytes received via SQL*Net from c 160,202,690 44,513.1 4,514.2
bytes received via SQL*Net from d 1,253,493,615 348,289.4 35,320.6
bytes sent via SQL*Net to client 644,608,544 179,107.7 18,163.6
bytes sent via SQL*Net to dblink 18,150,918 5,043.3 511.5
calls to get snapshot scn: kcmgss 930,558 258.6 26.2
calls to kcmgas 52,982 14.7 1.5
calls to kcmgcs 2,185 0.6 0.1
change write time 814 0.2 0.0
cleanout - number of ktugct calls 1,461 0.4 0.0
cleanouts and rollbacks - consist 186 0.1 0.0
cleanouts only - consistent read 176 0.1 0.0
cluster key scan block gets 8,218 2.3 0.2
cluster key scans 1,458 0.4 0.0
commit cleanout failures: buffer 3 0.0 0.0
commit cleanout failures: callbac 669 0.2 0.0
commit cleanout failures: cannot 8 0.0 0.0
commit cleanout failures: hot bac 1 0.0 0.0
commit cleanouts 108,176 30.1 3.1
commit cleanouts successfully com 107,495 29.9 3.0
commit txn count during cleanout 14,038 3.9 0.4
consistent changes 3,175 0.9 0.1
consistent gets 136,718,063 37,987.8 3,852.4
consistent gets - examination 12,649,542 3,514.7 356.4
cursor authentications 2,801 0.8 0.1
data blocks consistent reads - un 3,168 0.9 0.1
db block changes 464,352 129.0 13.1
db block gets 511,372 142.1 14.4
deferred (CURRENT) block cleanout 56,603 15.7 1.6
dirty buffers inspected 0 0.0 0.0
enqueue conversions 218,439 60.7 6.2
enqueue releases 208,770 58.0 5.9
enqueue requests 208,796 58.0 5.9
enqueue timeouts 23 0.0 0.0
enqueue waits 52 0.0 0.0
exchange deadlocks 0 0.0 0.0
execute count 483,429 134.3 13.6
free buffer inspected 619 0.2 0.0
free buffer requested 493,557 137.1 13.9
hot buffers moved to head of LRU 6 0.0 0.0
immediate (CR) block cleanout app 362 0.1 0.0
immediate (CURRENT) block cleanou 7,270 2.0 0.2
index fast full scans (full) 13 0.0 0.0
index fast full scans (rowid rang 0 0.0 0.0
index fetch by key 8,030,012 2,231.2 226.3
index scans kdiixs1 3,314,042 920.8 93.4
leaf node 90-10 splits 116 0.0 0.0
leaf node splits 267 0.1 0.0
logons cumulative 495 0.1 0.0
messages received 67,050 18.6 1.9
messages sent 67,051 18.6 1.9
no buffer to keep pinned count 0 0.0 0.0
no work - consistent read gets 120,500,009 33,481.5 3,395.4
opened cursors cumulative 16,145 4.5 0.5
parse count (failures) 91 0.0 0.0
parse count (hard) 9,736 2.7 0.3
parse count (total) 442,824 123.0 12.5
parse time cpu 13,227 3.7 0.4
parse time elapsed 44,519 12.4 1.3
physical reads 489,262 135.9 13.8
physical reads direct 567 0.2 0.0
physical reads direct (lob) 162 0.1 0.0
physical writes 105,943 29.4 3.0
physical writes direct 84,248 23.4 2.4
physical writes direct (lob) 267 0.1 0.0
physical writes non checkpoint 89,403 24.8 2.5
pinned buffers inspected 569 0.2 0.0
prefetched blocks 342,380 95.1 9.7
prefetched blocks aged out before 904 0.3 0.0
process last non-idle time 3,599 1.0 0.1
queries parallelized 0 0.0 0.0
recursive calls 544,515 151.3 15.3
recursive cpu usage 27,796 7.7 0.8
redo blocks written 210,849 58.6 5.9
redo buffer allocation retries 54 0.0 0.0
redo entries 264,717 73.6 7.5
redo log space requests 50 0.0 0.0
redo log space wait time 379 0.1 0.0
redo ordering marks 0 0.0 0.0
redo size 85,996,996 23,894.7 2,423.2
redo synch time 6,612 1.8 0.2
redo synch writes 60,596 16.8 1.7
redo wastage 18,626,712 5,175.5 524.9
redo write time 5,664 1.6 0.2
redo writer latching time 1 0.0 0.0
redo writes 61,657 17.1 1.7
rollback changes - undo records a 2 0.0 0.0
rollbacks only - consistent read 944 0.3 0.0
rows fetched via callback 689,542 191.6 19.4
session connect time 0 0.0 0.0
session logical reads 137,229,292 38,129.8 3,866.8
session pga memory 9,285,440 2,580.0 261.6
session pga memory max 11,374,808 3,160.6 320.5
session uga memory 81,609,178,216 22,675,514.9 2,299,562.6
session uga memory max 50,878,880 14,137.0 1,433.7
shared hash latch upgrades - no w 3,350,591 931.0 94.4
shared hash latch upgrades - wait 0 0.0 0.0
sorts (disk) 9 0.0 0.0
sorts (memory) 76,416 21.2 2.2
sorts (rows) 22,336,571 6,206.3 629.4
summed dirty queue length 0 0.0 0.0
switch current to new buffer 1,278 0.4 0.0
table fetch by rowid 32,759,610 9,102.4 923.1
table fetch continued row 555,801 154.4 15.7
table scan blocks gotten 111,490,075 30,978.1 3,141.5
table scans (long tables) 27 0.0 0.0
table scans (rowid ranges) 0 0.0 0.0
table scans (short tables) 17,460 4.9 0.5
transaction rollbacks 2,748 0.8 0.1
transaction tables consistent rea 0 0.0 0.0
transaction tables consistent rea 0 0.0 0.0
user calls 2,403,548 667.8 67.7
user commits 35,201 9.8 1.0
user rollbacks 288 0.1 0.0
workarea executions - onepass 9 0.0 0.0
workarea executions - optimal 101,146 28.1 2.9
write clones created in backgroun 0 0.0 0.0
write clones created in foregroun 124 0.0 0.0
-------------------------------------------------------------

buffer is not pinned count 121,648,051 33,800.5 3,427.8

The description says "Number of times a buffer was free when visited. Useful only for internal debugging purposes". Can you tell what could be the reason for such a high value?

consistent gets - examination 12,649,542 3,514.7 356.4

Should creating an index reduce 'consistent gets - examination'? If not, how is this different from 'consistent gets'?

no work - consistent read gets 120,500,009 33,481.5 3,395.4

What could be the reason for such a high value?

redo size 85,996,996 23,894.7 2,423.2

Should I add more redo logs?

redo wastage 18,626,712 5,175.5 524.9

What is redo wastage?

session uga memory 81,609,178,216 22,675,514.9 2,299,562.6

Is this value too high?

table scan blocks gotten 111,490,075 30,978.1 3,141.5

Is the value too high because of full table scans?

Are there any other areas which I can tune?

Thanks

May 26, 2007 - 9:32 am UTC

code button, please - it is virtually unreadable as posted.

you hard parse too much - that is what popped out to me.

you prefer unpinned buffers, good to be high.

creating an index will only reduce consistent gets if using the index results in fewer consistent gets - saying 'creating index should reduce' is absolutely not true. You can create an index, force us to use it and have the consistent gets go from 100 to 1000000000000

redo wastage

Mark, May 29, 2007 - 7:42 am UTC

Hello Tom,
I did not understand your reply about "buffer is not pinned count". Can you please explain it to me?

Also, what is redo wastage?

Thanks

May 30, 2007 - 10:50 am UTC

if you got to a buffer and wanted to reuse it- you would be disappointed to find it pinned by someone else.

so, finding it not pinned - that makes your day.

redo wastage:
http://docs.oracle.com/docs/cd/B19306_01/server.102/b14237/stats002.htm#sthref4793

ctl-f and search for redo wastage.

perl scripts consumes CPU

CT, June 06, 2007 - 12:57 pm UTC

Hi Tom
In one of our instances, I am seeing a lot of the following scripts running forever, and the number keeps increasing: at least 20 of them. No users were able to query the database; all of the CPU was consumed by these scripts. We have not seen this happen before.

/home/oracle/rdbms/perl/bin/perl /home/oracle/rdbms/sysman/admin/scripts/db/dbresp.pl

Could you shed some light on what these scripts are doing (I don't see any detailed comments in the script, except that it executes the user-defined SQL and reports success or failure and the response time), and does every SQL statement kick these off? We ended up killing all the processes because the whole system was down.

I am not sure whether this is the right place to ask; if you can clarify or guide me, it would really help us in solving our problem.

Thank you very much
CT

transaction count

nn, July 02, 2007 - 11:00 am UTC

How can I find out the transaction count for my database? The version is 10.2.0.2.

July 03, 2007 - 9:56 am UTC

define transaction count first.

be specific.

parse to elapsed ratio and Freelist

Keyur, July 14, 2007 - 4:03 pm UTC

I was reading following thread and I have question about your comment.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:7641015793792#7687478652988

Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session 522,826 289.8 803.1

2.8 cpu seconds used per second, you have 4 cpu seconds per second to use.

How can you say 2.8 seconds used per second? Is 289.8 in seconds or milliseconds? Why do you divide 289.8 by 100?

Would you please explain why you advise "1) just add more freelists, ALTER TABLE and ALTER INDEX let you do that"? "Buffer Busy Waits" is high for data blocks, so how will increasing freelists help? What if I am using ASSM?

Lastly, what does Parse CPU to Parse Elapsd indicate? How is the ratio calculated?

Thanks

~ Keyur

July 17, 2007 - 10:46 am UTC

it was hundredths of a second. 289.8 was 2.898 cpu seconds
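As a worked version of that conversion, here is a small Python sketch. It assumes, as in the quoted thread, a host with 4 CPUs (so 4 CPU seconds are available per wall-clock second); the function name is only illustrative.

```python
def cpu_seconds_per_second(centiseconds_per_second):
    """Convert a statspack per-second CPU figure (hundredths of a second) to seconds."""
    return centiseconds_per_second / 100


used = cpu_seconds_per_second(289.8)  # "CPU used by this session", per-second column
available = 4                         # assumption: 4 CPUs => 4 CPU seconds per second

print(f"{used:.3f} CPU seconds used per wall-clock second")
print(f"{100 * used / available:.1f}% of the available CPU")
```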

the information was spread all over that thread - you have to look up a review or two to get the entire story. Crux of the story is "doing data loads, buffer busy waits biggest waits". During a concurrent load, high buffer busy waits is an indication of contention on the freelists. If you are using ASSM - the advice cannot apply of course since you do not have freelists to manage!

Parse CPU to Parse Elapsd from statspack:

      ,'Parse CPU to Parse Elapsd %:'                  dscr
      , decode(:prsela, 0, to_number(null)
                      , round(100*:prscpu/:prsela,2))  pctval

- if they were EQUAL (cpu = 5 seconds, elapsed = 5 seconds) this ratio would be "100%" - meaning you didn't have to wait whilst parsing.

if elapsed starts getting much larger than cpu time (eg: cpu = 5 seconds, elapsed = 50 seconds), then this ratio heads toward zero (10% in this case). The further from 100% - the more time you spent WAITING to parse, rather than parsing....
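The same calculation can be sketched in Python to try other inputs. The function below mirrors the decode() expression quoted from statspack above (null when elapsed is zero, otherwise 100 * cpu / elapsed rounded to two decimals); the function name is an illustration, not something statspack defines.

```python
def parse_cpu_to_parse_elapsed_pct(parse_cpu, parse_elapsed):
    """Mirror of the decode() above: None when elapsed is 0, else 100*cpu/elapsed."""
    if parse_elapsed == 0:
        return None
    return round(100 * parse_cpu / parse_elapsed, 2)


# cpu = elapsed: no time spent waiting while parsing
print(parse_cpu_to_parse_elapsed_pct(5, 5))
# elapsed ten times the cpu: most of the parse time was spent waiting
print(parse_cpu_to_parse_elapsed_pct(5, 50))
# "parse time cpu" and "parse time elapsed" from the ORADB1 report earlier in the thread
print(parse_cpu_to_parse_elapsed_pct(13_227, 44_519))
```

The last call uses the raw instance activity figures from the ORADB1 report and reproduces the 29.71 shown on its "Parse CPU to Parse Elapsd %" line.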

running statspack report first time

A reader, July 18, 2007 - 12:22 pm UTC

Hi Tom,

We have DB versions 8.1.6 and 8.1.7 in my company. I have just joined the company and found out that these guys don't run statspack reports. I think the earlier DBA used to do performance tuning using v$ views.

I am thinking of installing statspack and running the report. My question is: if I do this, is it going to affect the performance of the database? Does running statspack take too many resources?
I am new to this field, so forgive me if it's a silly question.

Thanks

July 18, 2007 - 12:58 pm UTC

statspack only consumes resources when the snap is running - so when you issue statspack.snap - while that is running, that is the only 'hit' on the database.

you do not "turn on" statspack, you execute a snap every now and then to collect the contents of the v$ tables.

Buffer Busy Wait

Keyur, July 20, 2007 - 11:24 am UTC

Hello Tom,

Sorry, I went through the whole thread and most of the questions I asked are already answered. But I have a new question.

You said,
"Crux of the story is "doing data loads, buffer busy waits biggest waits". During a concurrent load, high buffer busy waits is an indication of contention on the freelists. If you are using ASSM - the advice cannot apply of course since you do not have freelists to manage!"

If I am using ASSM and I still have buffer busy waits and my table has initrans = 1, can we say that I don't have enough bits in the table's bitmap and so need to increase initrans?

How can we find out whether we are short of bits in the bitmap or short of initrans? Is there any section of the AWR snapshot report we should look at?

Is there any documentation or link about how to read and understand the new AWR snapshot report? I am looking for more detail about Time Model statistics and the AWR report. Please provide any link or reference.

Thanks

July 20, 2007 - 5:08 pm UTC

if you are loading, initrans isn't going to be much of an issue excepting maybe on indexes, but they would show as itl waits, not what you say.

Contention ITL

A reader, July 23, 2007 - 10:24 pm UTC

Then how can we find which segment is the hot segment (contention on a particular segment), or which block is the hot block, specifically using AWR/Statspack?

July 24, 2007 - 9:37 am UTC

v$segment_statistics would be useful

statspack report

A reader, July 31, 2007 - 4:26 pm UTC

Hey Tom,

I joined this new company recently. Here we have Oracle 8.1.7 (pretty old) and no statspack report has ever been taken on any of the databases. So do you think it's a good idea if I start taking statspack reports? By the way, there are no complaints from end users about performance as such, but I just want to see the statistics. I just wanted to check with you whether it is a good idea to run statspack reports even if we don't have a performance issue.

Thanks

August 02, 2007 - 10:39 am UTC

it'll help you generate a baseline of what your database did when users were not complaining.

so when they do complain, you have something to compare to.

so yes, it would be a good idea to have some observations.... for the future.

Automate AWR & ADDM report

A reader, September 03, 2007 - 11:26 pm UTC

Tom,

We have a testing RAC database. We want to automate the generation of AWR and ADDM reports. E.g., for instance 1, run the AWR report for the snapshots between snapshot_id 1 & 2, 2 & 3, 3 & 4, etc. (for all available snapshots). The database will be reset regularly so we can't keep the performance data inside the database. Is there an easy way to pass the parameters in? Thanks!

September 05, 2007 - 1:39 pm UTC

It'll be similar to what I've done with statspack in the past, eg:

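-- map the query's B and E columns to the begin_snap / end_snap substitution variables
column b new_val begin_snap
column e new_val end_snap
define report_name=multiuser_&NumUsers._&NumIters
-- rn = 1 is the latest snapshot (end), rn = 2 is the one before it (begin)
select max(decode(rn,1,snap_id)) e,
       max(decode(rn,2,snap_id)) b
  from (
select snap_id, row_number() over (order by snap_id desc) rn
  from perfstat.stats$snapshot
       )
 where rn <= 2
/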

@?/rdbms/admin/spreport

I read spreport to see what defines it was expecting (B and E apparently). I set these values and the report_name.

then the spreport call doesn't prompt.

looking at awrrpt.sql I see:

....

@@awrrpti

undefine num_days;
undefine report_type;
undefine report_name;
undefine begin_snap;
undefine end_snap;

so, it undefines those after running the report - I would presume those are the defines you need to set before calling awrrpt.sql and it would not prompt for them...

see the heading of awrrpti.sql as well - it lists them all with some notes.
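For the "1 & 2, 2 & 3, 3 & 4" case, one way to drive this is to generate the SQL*Plus commands from a list of snapshot ids. The following Python sketch only builds the script text; it assumes, as presumed above, that presetting the defines listed in awrrpt.sql (num_days, report_type, report_name, begin_snap, end_snap) suppresses the prompts. The report file naming scheme and the default values are hypothetical, and the snapshot ids would have to come from your own query against the AWR snapshot history.

```python
def consecutive_pairs(snap_ids):
    """Return (begin, end) pairs for adjacent snapshots, e.g. 1&2, 2&3, 3&4."""
    ordered = sorted(snap_ids)
    return list(zip(ordered, ordered[1:]))


def awr_script(snap_ids, report_type="text", num_days=1):
    """Build SQL*Plus text that sets the defines and calls awrrpt.sql per pair.

    Assumption: presetting these defines stops awrrpt.sql from prompting.
    """
    lines = []
    for begin, end in consecutive_pairs(snap_ids):
        lines += [
            f"define num_days={num_days}",
            f"define report_type={report_type}",
            f"define begin_snap={begin}",
            f"define end_snap={end}",
            f"define report_name=awr_{begin}_{end}.txt",  # hypothetical naming scheme
            "@?/rdbms/admin/awrrpt.sql",
            "",
        ]
    return "\n".join(lines)


print(awr_script([1, 2, 3, 4]))
```

The defines are repeated for every pair because awrrpt.sql undefines them after each run. To report on a specific RAC instance, the additional defines listed in the heading of awrrpti.sql would be needed as well.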

Is it or isn't it a problematic query?

Yoav, September 05, 2007 - 3:04 pm UTC

Hi Tom,
I analyzed the following one-hour statspack report with one of my colleagues, and we had a difference of opinion regarding Hash Value: 1430248672.

1. Would you consider the first query (Hash Value: 1430248672) a problematic one, or do you suggest forgetting about it and focusing on the rest?

Please note that the first query also appears in the "Execute to parse" section, as shown below.

STATSPACK report for

DB Name         DB Id    Instance     Inst Num Release     Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
xxxx          2993006132 xxx                 1 9.2.0.8.0   NO      venus

            Snap Id     Snap Time      Sessions Curs/Sess Comment
            --------- ------------------ -------- --------- -------------------
Begin Snap:     52230 05-Sep-07 12:00:06      150 #########
  End Snap:     52231 05-Sep-07 13:00:07      154 #########
   Elapsed:               60.02 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
               Buffer Cache:     2,432M      Std Block Size:         8K
           Shared Pool Size:       704M          Log Buffer:     1,024K

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:            104,213.58              8,905.60
              Logical reads:             70,724.88              6,043.81
              Block changes:                355.32                 30.36
             Physical reads:              2,069.13                176.82
            Physical writes:                180.82                 15.45
                 User calls:              3,210.44                274.35
                     Parses:              2,221.00                189.80
                Hard parses:                  6.18                  0.53
                      Sorts:                161.23                 13.78
                     Logons:                  0.13                  0.01
                   Executes:              2,229.63                190.53
               Transactions:                 11.70

  % Blocks changed per Read:    0.50    Recursive Call %:    19.42
 Rollback per transaction %:    0.15       Rows per Sort:   108.30

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.91       Redo NoWait %:  100.00
               Buffer Hit %:   97.10    In-memory Sort %:  100.00
              Library Hit %:   99.76        Soft Parse %:   99.72
         Execute to Parse %:    0.39         Latch Hit %:   99.84
Parse CPU to Parse Elapsd %:   61.57     % Non-Parse CPU:   93.55

Shared Pool Statistics        Begin   End
                              ------  ------
             Memory Usage %:   93.45   93.45
    % SQL with executions>1:   69.16   64.49
  % Memory for SQL w/exec>1:   74.70   71.19

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read                         5,479,368      43,793    75.17
CPU time                                                        6,134    10.53
buffer busy waits                                 218,520       2,738     4.70
db file scattered read                            301,809       2,193     3.76
log file sync                                      43,222       1,606     2.76
          -------------------------------------------------------------

SQL ordered by Gets for DB: xxx Instance: xxx Snaps: 52230 -52231
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100

                                                     CPU      Elapsd
  Buffer Gets    Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
     27,697,086      435,336           63.6   10.9   175.85    210.04 1430248672
Module: PSAPPSRV@cornelius (TNS V1-V3)
SELECT ROLEUSER,ROLENAME,DYNAMIC_SW FROM PSROLEUSER WHERE ROLENAME = :1

     21,534,432        1,307       16,476.2    8.5    95.52    105.30  707978434
Module: PSAPPSRV@cornelius (TNS V1-V3)
SELECT DISTINCT D.RC_DETAIL ,D.RC_SHORT_DESCR FROM PS_RC_QUICK_C
D_TBL A , PS_NAP_QUICK_PRVG QP ,PS_RF_GRP_MEMBER M , PSOPRALIAS
U,PS_RC_CA_TY_DE_TBL D WHERE U.OPRID= :1 AND A.EVENTTYPE = :2 AN
D A.BUSINESS_UNIT = :3 AND A.RC_CATEGORY = :4 AND A.RC_TYPE = :5
AND D.SETID =:6 AND A.EFF_STATUS = 'A' AND QP.BUSINESS_UNIT =

     19,447,876        2,129        9,134.7    7.6   115.36   1252.11  703479427
Module: PSAPPSRV@cornelius (TNS V1-V3)
SELECT 1 FROM PS_RF_INST_PROD P, NAP_IDD D WHERE P.SETID=D.SETID
AND P.INST_PROD_ID=D.INST_PROD_ID AND D.NAP_TAXDN_IND='Y' AND p
.bo_id_cust = :1 and p.nap_crm_status in ('C_03','C_04','I_01')
and rownum < 2

....

SQL ordered by Parse Calls for DB: xxx Instance: xxx Snaps: 52230 -52231
-> End Parse Calls Threshold: 1000

                           % Total
 Parse Calls  Executions   Parses  Hash Value
------------ ------------ -------- ----------
     538,657      538,655     6.74 3963084412
Module: PSAPPSRV@turkish (TNS V1-V3)
SELECT PKG.RB_TEMPLATE_ID, PKG.RB_TEMPLATE_NAME FROM PS_RBC_PACK
AGE_DFN PKG, PS_RBC_TEMPLAT_FIL TMPL WHERE PKG.RB_PACKAGE_NAME=:
1 AND PKG.RB_PACKAGE_ID=:2 AND PKG.LANGUAGE_CD=:3 AND PKG.RB_TEM
PLATE_NAME = TMPL.RB_TEMPLATE_NAME AND PKG.RB_TEMPLATE_ID = TMPL
.RB_TEMPLATE_ID AND PKG.LANGUAGE_CD = TMPL.LANGUAGE_CD AND (TMPL

     522,781      522,781     6.54 2991391735
Module: PSAPPSRV@turkish (TNS V1-V3)
SELECT FILENAME FROM PS_RBC_PACKAGE_FIL WHERE RB_PACKAGE_NAME=:1
AND RB_PACKAGE_ID=:2 AND LANGUAGE_CD=:3

     505,605      505,595     6.32  672227430
Module: PSAPPSRV@turkish (TNS V1-V3)
SELECT 'X' FROM PS_RBC_PACKAG_TMPL PKG, PS_RBC_TEMPLAT_FIL TMPL
WHERE PKG.RB_PACKAGE_NAME=:1 AND PKG.RB_PACKAGE_ID=:2 AND PKG.LA
NGUAGE_CD=:3 AND PKG.RB_TEMPLATE_NAME = TMPL.RB_TEMPLATE_NAME AN
D PKG.RB_TEMPLATE_ID = TMPL.RB_TEMPLATE_ID AND PKG.LANGUAGE_CD =
TMPL.LANGUAGE_CD AND (TMPL.RB_DELV_CHANNEL = '1' OR TMPL.RB_DEL

     435,337      435,336     5.44 1430248672
Module: PSAPPSRV@cornelius (TNS V1-V3)
SELECT ROLEUSER,ROLENAME,DYNAMIC_SW FROM PSROLEUSER WHERE ROLENAME = :1

     277,238      277,238     3.47 1732080893
Module: PSMONITORSRV.exe
SELECT VERSION FROM PSVERSION WHERE OBJECTTYPENAME = 'SYS'

September 05, 2007 - 5:45 pm UTC

well, you spend most time waiting for IO, but I don't see the report that shows the queries that are most affected by IO

it is hard to tune with statspack...

Queries affected by IO

A reader, September 05, 2007 - 6:35 pm UTC

Hi,
Thank you for your very quick response.

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:            104,213.58              8,905.60
              Logical reads:             70,724.88              6,043.81
              Block changes:                355.32                 30.36
             Physical reads:              2,069.13                176.82
            Physical writes:                180.82                 15.45

From the above, I am affected more by logical I/O (70,724 per second) than by physical I/O (~2,000 per second),
so I went to see what was happening in the "SQL ordered by Gets" section.
Do you think I should check the "SQL ordered by Reads" section?
So i

September 11, 2007 - 7:38 am UTC

                                                     CPU      Elapsd
  Buffer Gets    Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
     27,697,086      435,336           63.6   10.9   175.85    210.04 1430248672
Module: PSAPPSRV@cornelius (TNS V1-V3)
SELECT ROLEUSER,ROLENAME,DYNAMIC_SW FROM PSROLEUSER WHERE ROLENAME = :1

given that, a possible IOT (index organized table) might be useful, if you put all of the information for a given rolename together - we can minimize the logical IO for that query.

but, you would actually probably want to see if you are doing most or a lot of your physical IO against this table first (ultimately, you are waiting on IO, physical IO)

Vikas Atrey, September 10, 2007 - 3:14 am UTC

Gets per execution for the query are only 63.6, but the number of executions is very high, so the system will definitely benefit even if only a slight improvement is possible.

By the way, this query looks like it retrieves user-related security info. Can you think of some ways to reduce the number of executions?

September 12, 2007 - 10:50 am UTC

to reduce the number of executions, one must be familiar with the application itself.

Ask the developers why they execute this query so often and if they could see a way to not have to.

wrt Matts question with statspack,

kulbhushan, September 26, 2007 - 1:08 pm UTC

Tom,
This is wrt Matt's question with statspack; I see significant latch free waits there.
Going further down to the instance latch statistics, there are library cache and cache buffers chains latch waits.
So why no suggestions for tuning those?

September 26, 2007 - 10:09 pm UTC

because I wasn't reading a 5 page statspack, I was addressing what he was asking.

wrt Matt's question again

kulbhushan, September 27, 2007 - 6:15 am UTC

Tom,
OK. I would like to know what you would suggest to tune latch free events?

September 27, 2007 - 7:21 am UTC

well, we'd want to start by trying to identify what the latches were being gotten for - and then reduce that need.

But, this is a huge page with lots of statspacks on it. I didn't see a quick, obvious statspack with a LATCH BREAKDOWN section - so I cannot comment.

Statspack data from Matt's question

kulbhushan, September 28, 2007 - 12:29 am UTC

Below are the top 5 wait events and the latch info breakdown from the Statspack reported by Matt. Please explain what you would do to tune the latch free wait event.

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                                Wait % Total
Event                                               Waits    Time (s) Wt Time
-------------------------------------------- ------------ ----------- -------
buffer busy waits                                  60,866         699   22.89
latch free                                        219,813         515   16.87
log file parallel write                            10,751         494   16.17
db file sequential read                             9,711         319   10.44
db file parallel write                              2,446         278    9.10

                                           Pct    Avg   Wait                 Pct
                                    Get    Get   Slps   Time       NoWait NoWait
Latch                          Requests   Miss  /Miss    (s)     Requests   Miss
------------------------ -------------- ------ ------ ------ ------------ ------
shared pool                      73,360    0.0    0.0      0            0
messages                         50,845    0.1    0.0      0            0
redo allocation                 914,048    0.3    0.0      0            0
session idle bit              4,878,547    0.0    0.1      0            0
library cache                28,579,396    4.6    0.1      0            0
cache buffer handles             39,898    0.0    0.2      0            0
latch wait list               1,100,299    0.6    0.2      0            0
undo global data                124,805    0.0    0.3      0            0
redo writing                     32,727    0.1    0.3      0            0
cache buffers chains         17,523,755    0.4    0.4      0       30,543    0.0
list of block allocation         27,350    0.0    0.5      0            0
transaction allocation           41,136    0.1    0.5      0            0
checkpoint queue latch          146,571    0.0    0.5      0            0
enqueues                         51,443    0.0    0.6      0            0
enqueue hash chains              41,265    0.1    0.7      0            0
cache buffers lru chain          26,876    0.0    1.0      0       28,229    0.1

September 28, 2007 - 5:31 pm UTC

major misses are on the library cache - look to reduce parsing.

but we still don't have the correct section, a miss is not a wait, waits are waits and we can miss without waiting.

the section you really want looks more like this (this is NOT from matt, this is from some other system):

Latch Sleep breakdown  DB/Inst: ORA10G/ora10g  Snaps:        98-       99
-> ordered by misses desc

                                      Get                            Spin &
Latch Name                       Requests      Misses      Sleeps Sleeps 1->3+
-------------------------- -------------- ----------- ----------- ------------
library cache                  42,647,725      50,695      87,841 0/22416/2141
                                                                  0/6869
shared pool                    53,865,026      46,005      52,223 0/42249/2624
                                                                  /1132
row cache objects              52,603,886       3,289       3,292 0/3286/3/0
session allocation              5,354,279         730         730 0/730/0/0
child cursor hash table        17,514,294         496         496 0/496/0/0
enqueue hash chains             5,019,461         416         416 0/416/0/0
library cache pin              17,552,400         412         412 0/412/0/0
enqueues                        5,018,368         410         410 0/410/0/0
library cache lock             15,026,414         290         290 0/290/0/0
slave class create                     33           3           3 0/3/0/0
library cache load lock             4,858           1           1 0/1/0/0

the sleeps - they correspond to waits.
the misses - they correspond to elevated CPU time (when we miss, we burn cpu)

the top waits that went with my latch report were:

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                      % Total
Event                                               Waits    Time (s) Call Time
-------------------------------------------- ------------ ----------- ---------
latch: library cache                               87,840      32,275     58.59
CPU time                                                        5,206      9.45
control file parallel write                           522       3,957      7.18
latch: shared pool                                 52,223       3,919      7.11
library cache pin                                   1,078       3,163      5.74
          ------------------------------------------------------------

that library cache wait - all about parsing, the high CPU time, lots of that due to high MISSES on latches

reduce the misses, reduce the cpu time used.
reduce the sleeps, reduce the wait time (which will reduce the misses which will...)
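
A minimal sketch (Python) of the miss/sleep arithmetic, using only the figures from the sample Latch Sleep breakdown and Top 5 Timed Events above. The dictionary layout and the function name are illustrative assumptions, not anything Statspack provides.

```python
# Gets, misses and sleeps come from the sample "Latch Sleep breakdown";
# wait_s is the Time (s) of the matching "latch: ..." line in "Top 5 Timed Events".
latches = {
    "library cache": {"gets": 42_647_725, "misses": 50_695, "sleeps": 87_841, "wait_s": 32_275},
    "shared pool": {"gets": 53_865_026, "misses": 46_005, "sleeps": 52_223, "wait_s": 3_919},
}


def latch_summary(stats):
    """Derive the miss rate, sleeps per miss and average wait per sleep (ms)."""
    return {
        "miss_pct": 100.0 * stats["misses"] / stats["gets"],
        "sleeps_per_miss": stats["sleeps"] / stats["misses"],
        "avg_wait_ms": 1000.0 * stats["wait_s"] / stats["sleeps"],
    }


summary = {name: latch_summary(stats) for name, stats in latches.items()}
for name, s in summary.items():
    print(f"{name:15s} miss%={s['miss_pct']:.3f} "
          f"sleeps/miss={s['sleeps_per_miss']:.2f} avg wait={s['avg_wait_ms']:.0f} ms")
```

Note how small the miss percentage is on the library cache latch even though its sleeps account for most of the call time in that report: a "Pct Get Miss" near zero does not by itself say anything about how long sessions waited.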

Thanks for the explanation

kulbhushan, September 30, 2007 - 12:21 pm UTC

Tom
Thanks for the wonderful explanation.

Cheers

avg read time > 100ms

Reene, December 19, 2007 - 9:09 am UTC

Hi Tom,

In one of my statspack reports, the avg read time for many of the datafiles is more than 140 ms.

The other thing is that I only see the high avg read time (i.e. > 100 ms) for the files where reads/sec is very low;
for the datafiles where reads/sec is high, the avg read time is OK, in the range of 10-30 ms.

My top 5 wait events always show db file sequential read.

I have 3 questions on this:

1. Is it OK to conclude that we have a storage issue?
2. If not, then what could be the other reason?
3. Why is the avg read time high for the datafiles with so few reads/sec?

Allow me to paste part of this statspack so that you will have some data to look at:

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
ODS 2133231000 ods 1 9.2.0.6.0 NO GGNSBIMID4

Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 13263 11-Dec-07 12:00:05 126 #########
End Snap: 13264 11-Dec-07 15:27:44 136 #########
Elapsed: 207.65 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 1,408M Std Block Size: 16K
Shared Pool Size: 864M Log Buffer: 3,072K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: -700,139.29 -3,134,400.08
Logical reads: 291,177.69 1,303,551.16
Block changes: 3,041.97 13,618.37
Physical reads: 6,950.33 31,115.39
Physical writes: 339.98 1,522.01
User calls: 1.48 6.64
Parses: 3.19 14.29
Hard parses: 0.11 0.51
Sorts: 1.00 4.47
Logons: 0.03 0.13
Executes: 84.44 378.04
Transactions: 0.22

% Blocks changed per Read: 1.04 Recursive Call %: 98.79
Rollback per transaction %: 0.00 Rows per Sort: 6310.39

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 97.64 In-memory Sort %: 99.95
Library Hit %: 99.68 Soft Parse %: 96.46
Execute to Parse %: 96.22 Latch Hit %: 99.87
Parse CPU to Parse Elapsd %: 2.00 % Non-Parse CPU: 99.90

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 85.03 87.71
% SQL with executions>1: 21.95 26.35
% Memory for SQL w/exec>1: 12.25 14.13

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 7,361,051 20,964 37.50
db file scattered read 7,513,133 13,417 24.00
PL/SQL lock timer 669 10,641 19.04
CPU time 7,818 13.99
db file parallel write 66,768 1,570 2.81
-------------------------------------------------------------
Wait Events for DB: ODS Instance: ods Snaps: 13263 -13264
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file sequential read 7,361,051 0 20,964 3 2,645.0
db file scattered read 7,513,133 0 13,417 2 2,699.7
PL/SQL lock timer 669 669 10,641 15906 0.2
db file parallel write 66,768 0 1,570 24 24.0
direct path read 89,821 0 896 10 32.3
log buffer space 883 67 189 214 0.3
latch free 30,562 28,875 104 3 11.0
process startup 145 71 95 658 0.1
control file parallel write 5,335 0 38 7 1.9
direct path write 31,396 0 35 1 11.3
log file sync 1,632 4 33 20 0.6
PX Deq: Execute Reply 64 9 21 323 0.0
log file parallel write 67,442 0 12 0 24.2
free buffer waits 9 9 9 977 0.0
log file switch completion 82 0 8 97 0.0
buffer busy waits 2,724 0 7 3 1.0
enqueue 629 0 7 11 0.2
db file parallel read 273 0 5 20 0.1
control file sequential read 6,217 0 5 1 2.2
library cache pin 12 0 3 255 0.0
local write wait 200 0 2 12 0.1
LGWR wait for redo copy 2,680 38 1 0 1.0
async disk IO 274 0 1 4 0.1
PX Deq: Parse Reply 6 0 1 130 0.0
row cache lock 60 0 1 13 0.0
direct path read (lob) 65 0 0 7 0.0
PX Deq: Msg Fragment 110 0 0 4 0.0
PX Deq: Signal ACK 16 8 0 28 0.0
SQL*Net break/reset to clien 27 0 0 16 0.0
PX Deq: Join ACK 6 0 0 65 0.0
log file single write 26 0 0 9 0.0
log file sequential read 26 0 0 6 0.0
PX Deq Credit: send blkd 22 0 0 5 0.0
PX Deq Credit: need buffer 96 0 0 1 0.0
db file single write 85 0 0 1 0.0
undo segment extension 33,680 33,665 0 0 12.1
SQL*Net more data to client 329 0 0 0 0.1
buffer deadlock 90 88 0 0 0.0
SQL*Net message from client 15,431 0 98,044 6354 5.5
queue messages 6,875 6,594 42,003 6110 2.5
PX Idle Wait 2,615 2,582 5,057 1934 0.9
jobq slave wait 578 554 1,670 2889 0.2
PX Deq: Table Q Normal 99,594 0 633 6 35.8
wakeup time manager 38 38 371 9766 0.0
PX Deq: Execution Msg 176 1 20 115 0.1
SQL*Net message to client 15,446 0 0 0 5.6
SQL*Net more data from clien 12 0 0 0 0.0
-------------------------------------------------------------

---then IO stat

TS_CMS_ACCT_DATA /dwdata4/oradata/ods/CMS_ACCT_DATA01.DBF
48 0 106.5 1.0 10 0 0
/dwdata9/oradata/CMS_ACCT_DATA02.DBF
62 0 163.4 1.0 11 0 0
/exportdata1/ods/CMS_ACCT_DATA06.DBF
66 0 138.3 1.0 11 0 0
/exportdata2/ods/CMS_ACCT_DATA07.DBF
52 0 64.2 1.0 10 0 0
/exportdata3/oracle/CMS_ACCT_DATA04.DBF
66 0 170.2 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_ACCT_DATA05.DBF
55 0 182.9 1.0 11 0 0

TS_CMS_ACCT_IDX /dwdata10/oradata/CMS_ACCT_IDX02.DBF
51 0 174.7 1.0 11 0 0
/dwdata14/oradata/CMS_ACCT_IDX03.dbf
47 0 104.9 1.0 11 0 0
/dwdata7/oradata/ods/CMS_ACCT_IDX01.DBF
47 0 122.1 1.0 11 0 0
/exportdata1/ods/CMS_ACCT_IDX05.DBF
66 0 138.3 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_ACCT_IDX04.DBF
66 0 175.9 1.0 11 0 0

TS_CMS_AMOS_DATA /dwdata1/oradata/ods/CMS_AMOS_DATA01.DBF
48 0 105.8 1.0 10 0 0
/exportdata1/ods/CMS_AMOS_DATA02.DBF
66 0 138.3 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_AMOS_DATA01.DBF
66 0 176.2 1.0 11 0 0

TS_CMS_AMOS_IDX /dwdata7/oradata/ods/CMS_AMOS_IDX01.DBF
41 0 116.8 1.0 11 0 0
/exportdata1/ods/CMS_AMOS_IDX03.DBF
66 0 138.3 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_AMOS_IDX02.DBF
55 0 177.5 1.0 11 0 0

TS_CMS_AMPH_DATA /dwdata14/oradata/ods/CMS_AMPH_DATA02.DBF
55 0 173.1 1.0 11 0 0
/dwdata6/oradata/ods/CMS_AMPH_DATA01.DBF
48 0 106.0 1.0 10 0 0
/exportdata1/ods/CMS_AMPH_DATA05.DBF
66 0 138.3 1.0 11 0 0
/exportdata3/oracle/CMS_AMPH_DATA03.DBF
55 0 183.1 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_AMPH_DATA04.DBF
55 0 177.5 1.0 11 0 0

TS_CMS_AMPH_IDX /dwdata13/oradata/ods/CMS_AMPH_IDX02.DBF
10 0 204.0 1.0 10 0 0
/dwdata3/oradata/ods/CMS_AMPH_IDX01.DBF
47 0 121.9 1.0 11 0 0
File IO Stats for DB: ODS Instance: ods Snaps: 13263 -13264
->ordered by Tablespace, File

Tablespace Filename
------------------------ ----------------------------------------------------
Av Av Av Av Buffer Av Buf
Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
TS_CMS_AMPH_IDX /exportdata1/ods/CMS_AMPH_IDX04.DBF
66 0 138.3 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_AMPH_IDX03.DBF
66 0 151.5 1.0 11 0 0

TS_CMS_ASSD_DATA /dwdata2/oradata/ods/CMS_ASSD_DATA01.DBF
48 0 105.8 1.0 10 0 0
/exportdata1/ods/CMS_ASSD_DATA03.DBF
66 0 138.3 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_ASSD_DATA02.DBF
66 0 176.2 1.0 11 0 0

TS_CMS_ASSD_IDX /dwdata7/oradata/odsCMS_ASSD_IDX01.DBF
41 0 116.8 1.0 11 0 0
/exportdata1/ods/odsCMS_ASSD_IDX02.DBF
66 0 110.3 1.0 11 0 0
/sbiods/dm/DataFiles/odsCMS_ASSD_IDX02.DBF
55 0 177.5 1.0 11 0 0

TS_CMS_ATSM_DATA /dwdata1/oradata/ods/CMS_ATSM_DATA01.DBF
48 0 105.8 1.0 10 0 0
/dwdata4/oradata/ods/CMS_ATSM_DATA02.dbf
51 0 174.9 1.0 11 0 0
/exportdata1/ods/CMS_ATSM_DATA05.DBF
55 0 122.7 1.0 11 0 0
/exportdata3/oracle/CMS_ATSM_DATA03.DBF
66 0 170.3 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_ATSM_DATA04.DBF
66 0 157.4 1.0 11 0 0

TS_CMS_ATSM_IDX /dwdata5/oradata/ods/CMS_ATSM_IDX02.DBF
55 0 161.6 1.0 11 0 0
/dwdata6/oradata/ods/CMS_ATSM_IDX01.DBF
41 0 117.1 1.0 11 0 0
/exportdata1/ods/CMS_ATSM_IDX03.DBF
55 0 122.4 1.0 11 0 0

TS_CMS_ATST_DATA /dwdata2/oradata/ods/CMS_ATST_DATA01.DBF
48 0 105.8 1.0 10 0 0
/dwdata8/oradata/ods/CMS_ATST_DATA02.DBF
55 0 161.6 1.0 11 0 0
/exportdata1/ods/CMS_ATST_DATA03.DBF
55 0 122.4 1.0 11 0 0

TS_CMS_ATST_IDX /dwdata6/oradata/ods/CMS_ATST_IDX01.DBF
41 0 116.8 1.0 11 0 0
/exportdata1/ods/CMS_ATST_IDX02.DBF
55 0 122.4 1.0 11 0 0

TS_CMS_AUTH_DATA /dwdata7/oradata/ods/CMS_AUTH_DATA01.DBF
48 0 106.0 1.0 10 0 0
/exportdata1/ods/CMS_AUTH_DATA04.DBF
55 0 122.4 1.0 11 0 0
/exportdata3/oracle/CMS_AUTH_DATA02.DBF
66 0 170.2 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_AUTH_DATA03.DBF
66 0 176.1 1.0 11 0 0

TS_CMS_AUTH_IDX /dwdata2/oradata/CMS_AUTH_IDX02.dbf
48 0 106.0 1.0 10 0 0
/dwdata3/oradata/ods/CMS_AUTH_IDX01.DBF
47 0 121.9 1.0 11 0 0
/exportdata1/ods/CMS_AUTH_IDX04.DBF
55 0 122.4 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_AUTH_IDX03.DBF
66 0 175.9 1.0 11 0 0

TS_CMS_CARD_DATA /dwdata1/oradata/ods/CMS_CARD_DATA01.DBF
48 0 106.5 1.0 10 0 0
/dwdata3/oradata/ods/CMS_CARD_DATA03.DBF
62 0 163.4 1.0 11 0 0
/dwdata5/oradata/ods/CMS_CARD_DATA04.DBF
58 0 168.8 1.0 11 0 0
/dwdata6/oradata/ods/CMS_CARD_DATA02.dbf
51 0 175.3 1.0 11 0 0
/dwdata8/oradata/ods/CMS_CARD_DATA05.DBF
48 0 106.0 1.0 10 0 0
/exportdata1/ods/CMS_CARD_DATA08.DBF
55 0 122.4 1.0 11 0 0
/exportdata3/oracle/CMS_CARD_DATA06.DBF
66 0 154.1 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_CARD_DATA07.DBF
66 0 157.4 1.0 11 0 0

TS_CMS_CARD_IDX /dwdata13/oradata/ods/CMS_CARD_IDX03.DBF
20 0 138.0 1.0 10 0 0
/dwdata7/oradata/ods/CMS_CARD_IDX01.DBF
47 0 122.1 1.0 11 0 0
/dwdata8/oradata/ods/CMS_CARD_IDX02.dbf
51 0 175.3 1.0 11 0 0
/exportdata1/ods/CMS_CARD_IDX05.DBF
55 0 122.4 1.0 11 0 0
/exportdata3/oracle/CMS_CARD_IDX04.DBF
66 0 186.2 1.0 11 0 0

TS_CMS_CUST_DATA /dwdata12/oradata/ods/CMS_CUST_DATA02.dbf
41 0 117.1 1.0 11 0 0
/dwdata3/oradata/ods/CMS_CUST_DATA01.DBF
48 0 106.5 1.0 10 0 0
/exportdata1/ods/CMS_CUST_DATA03.DBF
55 0 122.2 1.0 11 0 0

TS_CMS_CUST_IDX /dwdata12/oradata/ods/CMS_CUST_IDX02.DBF
41 0 117.1 1.0 11 0 0
/dwdata7/oradata/ods/CMS_CUST_IDX01.DBF
47 0 122.1 1.0 11 0 0
/exportdata1/ods/CMS_CUST_IDX03.DBF
55 0 122.2 1.0 11 0 0

TS_CMS_DLT_DATA /dwdata1/oradata/ods/CMS_DLT_DATA01.DBF
830 0 13.3 1.0 2,706 0 1 20.0
/dwdata11/oradata/CMS_DLT_DATA04.DBF
61 0 154.3 1.0 1,859 0 0
/dwdata12/oradata/CMS_DLT_DATA05.DBF
75 0 127.7 1.0 93 0 0
/dwdata13/oradata/CMS_DLT_DATA06.DBF
67 0 141.3 1.0 1,923 0 0
/dwdata14/oradata/CMS_DLT_DATA07.DBF
80 0 119.3 1.0 115 0 0
/dwdata3/oradata/ods/CMS_DLT_DATA02.dbf
142 0 56.9 1.0 2,052 0 0
/dwdata4/oradata/ods/CMS_DLT_DATA03.dbf
61 0 120.7 1.0 1,941 0 0
/exportdata1/ods/CMS_DLT_DATA08.DBF
75 0 100.4 1.0 463 0 0

TS_CMS_HST_MISC_DATA /dwdata5/oradata/ods/CMS_HST_MISC_DATA01.DBF
47 0 153.6 1.0 11 0 0
/exportdata1/ods/CMS_HST_MISC_DATA02.DBF
55 0 122.0 1.0 11 0 0

TS_CMS_HST_MISC_IDX /dwdata5/oradata/ods/CMS_HST_MISC_IDX01.DBF
47 0 153.4 1.0 11 0 0
/exportdata1/ods/CMS_HST_MISC_IDX02.DBF
66 0 130.9 1.0 11 0 0

TS_CMS_MISC_DATA /dwdata1/oradata/ods/CMS_MISC_DATA01.DBF
48 0 105.8 1.0 10 0 0
/dwdata11/oradata/ods/CMS_MISC_DATA06.DBF
55 0 169.3 1.0 11 0 0
/dwdata14/oradata/ods/CMS_MISC_DATA08.dbf
38 0 107.6 1.0 10 0 0
/dwdata14/oradata/ods/CMS_MISC_DATA09.dbf
38 0 107.6 1.0 10 0 0
/dwdata16/oradata/ods/CCMS_MISC_DATA07.DBF
55 0 174.0 1.0 11 0 0
/dwdata4/oradata/ods/CMS_MISC_DATA02.dbf
58 0 168.6 1.0 11 0 0
/dwdata5/oradata/ods/CMS_MISC_DATA03.dbf
66 0 150.9 1.0 12 0 0
/dwdata7/oradata/ods/CMS_MISC_DATA04.dbf
58 0 168.4 1.0 11 0 0
/dwdata8/oradata/ods/CMS_MISC_DATA05.dbf
51 0 174.9 1.0 11 0 0
/exportdata1/ods/CMS_MISC_DATA13.DBF
66 0 130.8 1.0 11 0 0
/exportdata3/oracle/CMS_MISC_DATA10.dbf
55 0 177.3 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_MISC_DATA12.dbf
55 0 176.9 1.0 11 0 0

TS_CMS_MISC_IDX /dwdata2/oradata/ods/CMS_MISC_IDX01.DBF
48 0 105.8 1.0 10 0 0
/dwdata7/oradata/ods/CMS_MISC_IDX02.dbf
59 0 166.1 1.0 12 0 0
/exportdata1/ods/CMS_MISC_IDX03.DBF
66 0 130.8 1.0 11 0 0

TS_CMS_MST_ABEH_DATA /dwdata14/oradata/ods/CMS_MST_ABEH_DATA04.DBF
13,852 1 19.4 23.9 8,708 1 0
/dwdata16/oradata/ods/CMS_MST_ABEH_DATA03.DBF
18,251 1 15.8 18.2 10,115 1 0
/dwdata4/oradata/ods/CMS_MST_ABEH_DATA01.DBF
9,301 1 18.9 4.6 1,931 0 0
/dwdata6/oradata/ods/CMS_MST_ABEH_DATA02.DBF
74 0 137.8 1.4 11 0 0
/exportdata1/ods/CMS_MST_ABEH_DATA06.DBF
86 0 108.3 3.6 11 0 0
/exportdata3/oracle/CMS_MST_ABEH_DATA05.DBF
8,436 1 20.1 16.4 3,775 0 0

TS_CMS_MST_ABEH_IDX /dwdata15/oradata/ods/CMS_MST_ABEH_IDX03.DBF
5,018 0 10.8 1.0 23 0 0
/dwdata3/oradata/ods/CMS_MST_ABEH_IDX02.dbf
8,689 1 5.0 1.0 24 0 0
/dwdata6/oradata/ods/CMS_MST_ABEH_IDX01.DBF
7,133 1 6.1 1.0 24 0 0
/exportdata2/ods/CMS_MST_ABEH_IDX05.DBF
66 0 130.3 1.0 11 0 0
/sbiods/dm/DataFiles/CMS_MST_ABEH_IDX04.DBF
9,622 1 7.8 1.0 20 0 0

TS_CMS_MST_ACCT_DATA /dwdata15/oradata/ods/CMS_MST_ACCT_DATA02.DBF
16,847 1 49.6 61.0 315,211 25 0
/dwdata8/oradata/ods/CMS_MST_ACCT_DATA01.DBF
3,286 0 127.5 58.1 67,593 5 0
/exportdata2/ods/CMS_MST_ACCT_DATA04.DBF
97 0 161.0 18.5 490 0 0
/exportdata3/oracle/CMS_MST_ACCT_DATA03.DBF
4,341 0 51.3 61.7 120,074 10 0

TS_CMS_MST_ACCT_IDX /dwdata4/oradata/ods/CMS_MST_ACCT_IDX01.DBF
20 0 138.0 1.0 10 0 0
/dwdata8/oradata/ods/CMS_MST_ACCT_IDX02.DBF
51 0 174.7 1.0 11 0 0
/exportdata2/ods/CMS_MST_ACCT_IDX03.DBF
66 0 130.3 1.0 11 0 0

December 19, 2007 - 10:50 am UTC

a 3.5 hour statspack is virtually useless for tuning with.

could be that the disks that are infrequently accessed (they are hardly touched in this 3.5 hours) are totally uncached by your san and are sort of on the back burner (no one touched them for a long time)

but not really knowing your storage architecture...

anyway - since they account for such a teeny tiny itsy bitsy bit of the overall IO (you have a lot of IO

db file sequential read                         7,361,051      20,964    37.50
db file scattered read                          7,513,133      13,417    24.00

these things are what I would call "NOISE", annoying but probably nothing to worry about in the grand scheme of things)

Your overall average IO response:

                                                                   Avg
                                                     Total Wait   wait    Waits
Event                               Waits   Timeouts   Time (s)   (ms)     /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file sequential read         7,361,051          0     20,964      3  2,645.0
db file scattered read          7,513,133          0     13,417      2  2,699.7

pretty good - what you need to look into would be "how do I significantly reduce that amount of overall IO"

because even if you "fixed" the thing you are focused on right now - it would not impact anyone, no one would notice - because no one is waiting a horribly long time for it because it happens so *infrequently*
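
As a rough check of those averages, a minimal sketch (Python) that recomputes the average wait from the Waits and Total Wait Time columns quoted above. Only those two columns are used; the variable and function names are illustrative.

```python
# (waits, total wait time in seconds) copied from the wait event lines above.
io_waits = {
    "db file sequential read": (7_361_051, 20_964),
    "db file scattered read": (7_513_133, 13_417),
}


def avg_wait_ms(waits, total_wait_s):
    """Average wait in milliseconds = total wait time / number of waits."""
    return 1000.0 * total_wait_s / waits


per_event = {name: avg_wait_ms(w, t) for name, (w, t) in io_waits.items()}
total_waits = sum(w for w, _ in io_waits.values())
total_wait_s = sum(t for _, t in io_waits.values())
overall_ms = avg_wait_ms(total_waits, total_wait_s)

for name, ms in per_event.items():
    print(f"{name:25s} {ms:.2f} ms")
print(f"{'both combined':25s} {overall_ms:.2f} ms over {total_waits:,} waits")
```

The unrounded values are a little under the 3 ms and 2 ms the report prints, which is why the I/O response time looks fine and the volume of I/O is the thing to attack.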

Reene

Reene, December 20, 2007 - 2:11 am UTC

Hi

How did you arrive at this

"anyway - since they account for such a teeny tiny itsy bitsy bit of the overall IO (you have a lot of IO ) "

especially the last part, "you have a lot of IO".

I just want to learn where this comes from: which data indicates this?

Regards

December 20, 2007 - 10:02 am UTC

ummm

...
anyway - since they account for such a teeny tiny itsy bitsy bit of the overall IO (you have a lot of IO

db file sequential read                         7,361,051      20,964    37.50
db file scattered read                          7,513,133      13,417    24.00

the numbers 7,361,051 and 7,513,133 represent (a cut and paste directly from your report) a lot of IO.

You waited almost 15 million times for IO.

And you are focused on a section of the report that shows 10's of IO's

10's of IO's teeny tiny - compared to 15 MILLION
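
To make the comparison concrete, a minimal sketch (Python). The slow-file figures are one line from the File IO Stats above (66 reads at 138.3 ms for /exportdata1/ods/CMS_ACCT_DATA06.DBF), and the totals are the two wait event lines just quoted. Treating file reads and wait counts as directly comparable is a simplifying assumption.

```python
# One of the "slow" datafiles from the File IO Stats section.
slow_file_reads = 66
slow_file_avg_ms = 138.3

# The two big read events from the wait event section.
total_io_waits = 7_361_051 + 7_513_133
total_io_wait_s = 20_964 + 13_417

slow_file_time_s = slow_file_reads * slow_file_avg_ms / 1000.0
share_of_waits_pct = 100.0 * slow_file_reads / total_io_waits
share_of_time_pct = 100.0 * slow_file_time_s / total_io_wait_s

print(f"time spent reading the slow file: {slow_file_time_s:.1f} s")
print(f"share of all read waits:          {share_of_waits_pct:.5f} %")
print(f"share of all read wait time:      {share_of_time_pct:.3f} %")
```

Even if that file's reads became instantaneous, the saving would be a few seconds out of a 207-minute snapshot interval.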

AWR and trigger SQL

Mike, January 25, 2008 - 5:00 pm UTC

10.2.0.3.0

I was looking at an ASH report in OEM and noticed that SQL executed through the firing of a trigger didn't show up in the report. Do you know if it's captured in the ADDM snapshot and stored in AWR? If so, how can I drill down to it (see how often it's executed, and what type of overhead it incurs.)

performance investigation

deba, April 06, 2008 - 5:22 am UTC

Hi Tom,

Recently, our hardware has been upgraded. Now there are 15 CPUs and each CPU has 4 threads. This leads to a lot of confusion.

Now if I look at CPU stats from v$sqlarea (or anywhere in Oracle), I can get the CPU time in secs per execution ((cpu_time/executions)/1000000). Is this time for each CPU or for each thread? I think logically it should be for each thread. But our OS administrator says there is one CPU with 4 threads, and these are virtual. If I need to monitor CPU, what should I do? Please give me some concepts and, if possible, some explanation with examples.

Another point of confusion: job_queue_processes was initially 4 (for 4 CPUs without any threads in my previous system). What should the value be in the new system?

Thanks
Deba

April 07, 2008 - 8:59 am UTC

cpu time is reported to us by the OS. Whatever your OS returns is what you see there.

It is not clear what you have here physically - do you have a HYPERTHREADED cpu - in which case you have 15 physical cores, which appear to be 60 cpu's to the OS (but there is only really 15 of them).

Or do you have a 4core chip - in which case there really are in effect 4 cpus on each chip and you really do have 60 cpus.

If you need to monitor cpu, you use the OS provided tools to do so, not really sure what you are looking for here at all.
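
For the arithmetic in the question, a minimal sketch (Python). cpu_time in v$sqlarea is in microseconds, hence the division by 1,000,000. The sample values reuse the CPU Time (175.85 s) and Executions (435,336) of the query quoted near the top of this page. The capacity function and the 30-minute window are hypothetical illustrations, and hyperthreaded "CPUs" will not each deliver a full core's worth of work.

```python
def cpu_seconds_per_execution(cpu_time_us, executions):
    """CPU seconds consumed per execution, from a cumulative cpu_time in microseconds."""
    if executions == 0:
        return 0.0
    return cpu_time_us / executions / 1_000_000


def cpu_capacity_seconds(elapsed_s, cpus_seen_by_os):
    """Upper bound on CPU seconds the OS can hand out in a wall-clock window."""
    return elapsed_s * cpus_seen_by_os


per_exec = cpu_seconds_per_execution(cpu_time_us=175_850_000, executions=435_336)
print(f"CPU per execution: {per_exec * 1000:.3f} ms")

# Hypothetical 30-minute window, counted both ways the hardware might be described.
for cpus in (15, 60):
    print(f"{cpus} CPUs -> {cpu_capacity_seconds(30 * 60, cpus):,} CPU seconds available")
```

The per-execution figure is CPU time consumed by the statement, not a per-CPU or per-thread value; how much capacity that represents depends on what the OS counts as a CPU.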

Query

Aru, April 22, 2008 - 6:21 pm UTC

Hi Tom,
Nice thread this. I have been reading a lot of your threads, and you often say 'There is very little happening on the database server' or 'There is no load on the database server' or 'The database server is basically idle'.
How do you come to these conclusions? What part of the report or tkprof or some other tool set do you base these conclusions on? I would love to have the confidence to say that as well when it's the case, but I just don't know how.
Thanks for everything Tom.
Regards,
Aru.

April 23, 2008 - 6:04 pm UTC

... What part of the report or tkprof ...

usually "transactions per second", "executes per second", "cpu used" types of things.

Response Time

bobby, July 24, 2008 - 5:06 am UTC

Hello Tom

I have been reading a lot of your threads, and greatly appreciate your views and knowledge on Oracle. I have cleared up a lot of my own doubts about Oracle concepts just by reading questions and answers on your site.

This is about a doubt I have myself this time.

One of our customers would like to know the number of requests, the response time, and the memory consumed during particular hours of the day. I am not very sure how to get this data, even as a rough estimate.

I have taken snaps 10 minutes apart during those hours and am generating those reports.
I was planning to calculate the total response time as per the suggestions in Metalink Note 223117.1.
I don't know how to check the number of requests from the application, but I could give him the number of transactions happening during that interval by taking the value of 'transactions per second' from the Statspack report and multiplying it by the number of seconds for which the report was generated.
And the response time for each transaction could be obtained by dividing the response time calculated above by the number of transactions calculated above.

From reading your views on this page, I feel that this is not the right approach.
Could you please tell me what the right way to get the required data is, or what assumptions we have to make if we follow this approach?

Thanks very much
Waiting for a brilliant response, as you have provided always to people here.

July 24, 2008 - 10:56 am UTC

the only way to get a transaction response time is.....

for the client to record it, period, there is no other way

In fact, the only meaningful information for an end user (these management types) comes from the application, only the application can say:

we did X order transactions, minimum response time is A, max was B, average was C
we did M return transactions, ....
we did N abc transactions...
and so on.

DBA's and developers - statspack is interesting and useful

end users - application generated data is interesting and useful.
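
A minimal sketch (Python) of what "the client records it" could look like. The class and method names (ResponseTimeLog, record, timed, summary) and the sample durations are hypothetical; a real application would call its own transaction code inside the timed block and persist the results somewhere.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class ResponseTimeLog:
    """Collects response times per transaction TYPE, as seen by the client."""

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, txn_type, seconds):
        self._samples[txn_type].append(seconds)

    @contextmanager
    def timed(self, txn_type):
        """Time whatever runs inside the with-block and record it."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(txn_type, time.perf_counter() - start)

    def summary(self):
        return {
            txn_type: {
                "count": len(samples),
                "min": min(samples),
                "max": max(samples),
                "avg": sum(samples) / len(samples),
            }
            for txn_type, samples in self._samples.items()
        }


log = ResponseTimeLog()
log.record("order", 0.20)   # made-up durations, in seconds
log.record("order", 0.40)
log.record("return", 1.50)
with log.timed("abc"):
    pass                    # the application's real transaction would run here

report = log.summary()
for txn_type, stats in report.items():
    print(f"{txn_type}: n={stats['count']} min={stats['min']:.3f}s "
          f"max={stats['max']:.3f}s avg={stats['avg']:.3f}s")
```

The point of keying on transaction type is that an average across all transactions mixes short and long work into a number nobody can act on.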

but...

bobby, July 25, 2008 - 4:06 am UTC

Thanks for the response Tom

But then, what do the Transactions per second figures in the Statspack report indicate?
Is it possible to get a database perspective for that duration? Is the approach not useful at all, or do some assumptions have to be made to use those figures?

Also, could we get the memory utilized from the reports (using the Memory Usage % figures, maybe)?

July 29, 2008 - 9:40 am UTC

transactions per second is the number of commits per second.

it tells you NOTHING about how long those transactions took.

suppose the following happens

at 6am start 1,000 batch jobs
at 9am take a statspack snapshot
.... all 1,000 batch jobs finish here and commit their work
at 9:05am take a statspack snapshot

you did an average of 3.3 transactions per second.

Now, what can you say about that?

[this space left intentionally blank]

Your management wants a report from your applications stating what work your applications have achieved in X units of time.

DBA's and Developers - they can use statspacks and traces. Managers, it is just a big bunch of numbers that tells them almost nothing.
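
The batch-job example above, worked through as a minimal sketch (Python). The numbers are the ones in the scenario (1,000 jobs started at 6am, all committing between the 9:00 and 9:05 snapshots); the "naive response time" is the figure you would get by dividing the interval by the transaction count.

```python
commits_in_window = 1_000        # all 1,000 batch jobs commit between the snapshots
window_s = 5 * 60                # 9:00am to 9:05am

transactions_per_second = commits_in_window / window_s

# What dividing the interval by the transaction count would suggest.
naive_response_s = window_s / commits_in_window

# What each job actually took: started at 6am, committed just after 9am.
job_elapsed_s = 3 * 60 * 60

print(f"transactions per second: {transactions_per_second:.1f}")
print(f"naive 'response time':   {naive_response_s:.1f} s")
print(f"actual job elapsed time: at least {job_elapsed_s:,} s")
print(f"the naive figure is off by a factor of about {job_elapsed_s / naive_response_s:,.0f}")
```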

Thanks

Bobby, August 01, 2008 - 10:37 am UTC

Thanks Tom

By the way, is that possible on 10g ?
I could see this on the link:

http://www.oracle.com/technology/pub/articles/schumacher_analysis.html

It gives the value of
"Response Time Per Txn (secs)"

August 03, 2008 - 1:56 pm UTC

it will not be an accurate summary from the end user perspective at all.

there is only one place to measure this if you ask me - and that is in the client application. That is the only place you really know what the transaction response times truly were - and you really need to categorize them by transaction TYPE. without that, this is just an average of some of the time spent in the database by any and all transactions - a meaningless number.

Analyse statspack

Louis, August 18, 2008 - 11:32 am UTC

Hi Tom,

The majority of our jobs benefited from implementing better technology (new server, better memory, etc.) - all but one job. We have run a statspack on that one.

Here is the result:

Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 27001 18-Aug-08 09:16:49 38 189.4
End Snap: 27002 18-Aug-08 09:40:27 39 204.0
Elapsed: 23.63 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 816M Std Block Size: 8K
Shared Pool Size: 400M Log Buffer: 7,500K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 85,217.98 240,236.77
Logical reads: 65,659.36 185,099.34
Block changes: 928.93 2,618.74
Physical reads: 59.52 167.80
Physical writes: 6.99 19.71
User calls: 10.26 28.91
Parses: 488.46 1,377.02
Hard parses: 0.00 0.00
Sorts: 267.25 753.40
Logons: 0.12 0.35
Executes: 2,024.67 5,707.72
Transactions: 0.35

% Blocks changed per Read: 1.41 Recursive Call %: 99.81
Rollback per transaction %: 0.00 Rows per Sort: 25.33

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 99.91 In-memory Sort %: 100.00
Library Hit %: 100.01 Soft Parse %: 100.00
Execute to Parse %: 75.87 Latch Hit %: 99.89
Parse CPU to Parse Elapsd %: 30.09 % Non-Parse CPU: 80.33

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 91.72 91.73
% SQL with executions>1: 95.84 95.84
% Memory for SQL w/exec>1: 95.63 95.63

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 83,245 1,151 45.96
latch free 16,686 780 31.13
CPU time 523 20.88
buffer busy waits 1,769 34 1.37
log file parallel write 998 5 .21
-------------------------------------------------------------

Pct Avg Wait Pct
Get Get Slps Time NoWait NoWait
Latch Requests Miss /Miss (s) Requests Miss
------------------------ -------------- ------ ------ ------ ------------ ------
Consistent RBA 1,000 0.0 0 0
FIB s.o chain latch 8 0.0 0 0
FOB s.o list latch 5,404 0.0 0 0
SQL memory manager latch 1 0.0 0 471 0.0
SQL memory manager worka 78,705 0.0 0 0
active checkpoint queue 1,196 0.0 0 0
archive control 2 0.0 0 0
cache buffer handles 1,225,624 0.0 0.0 0 0
cache buffers chains 171,062,801 0.0 0.2 429 415,817 1.1
cache buffers lru chain 17,222 0.1 0.2 0 1,576,003 0.3
channel handle pool latc 349 0.0 0 0
channel operations paren 1,471 0.0 0 0
checkpoint queue latch 117,634 0.0 1.0 0 12,011 0.0
dml lock allocation 65,375 0.0 0.1 0 0
dummy allocation 349 0.0 0 0
enqueue hash chains 115,910 0.0 0.0 0 0
enqueues 11,357 0.0 0.0 0 0
event group latch 175 0.0 0 0
hash table column usage 2 0.0 0 0
job_queue_processes para 24 0.0 0 0
ktm global data 4 0.0 0 0
lgwr LWN SCN 999 0.0 0 0
library cache 13,586,420 0.9 0.0 289 0
library cache pin 11,628,454 0.1 0.1 21 0
library cache pin alloca 3,925,659 0.1 0.0 9 0
list of block allocation 4,705 0.0 0 0
loader state object free 1,108 0.0 0 0
messages 8,075 0.0 0 0
mostly latch-free SCN 1,001 0.2 0.0 0 0
multiblock read objects 1,104 0.0 0 0
ncodef allocation latch 43 0.0 0 0
post/wait queue 315 0.0 0 233 0.0
process allocation 349 0.0 0 175 0.0
process group creation 349 0.0 0 0
redo allocation 1,018,290 0.1 0.1 1 0
redo copy 0 0 1,016,287 0.0
redo writing 5,204 0.0 0 0
resumable state object 5 0.0 0 0
row cache enqueue latch 2,746,890 0.1 0.0 0 0
row cache objects 2,822,625 0.6 0.0 26 0
sequence cache 39,583 0.0 1.0 0 0
session allocation 310,879 0.0 0.0 0 0
session idle bit 30,785 0.0 0 0
session switching 43 0.0 0 0
session timer 481 0.0 0 0
shared pool 5,302,485 0.6 0.0 2 0
sim partition latch 0 0 450 0.0
simulator hash latch 5,478,373 0.0 0.0 0 0
simulator lru latch 5,026 0.1 0.4 0 79,854 4.1
sort extent pool 47 0.0 0 0


Latch Activity for DB: TMP7 Instance: tmp7 Snaps: 27001 -27002
->"Get Requests", "Pct Get Miss" and "Avg Slps/Miss" are statistics for
willing-to-wait latch get requests
->"NoWait Requests", "Pct NoWait Miss" are for no-wait latch get requests
->"Pct Misses" for both should be very close to 0.0

                                           Pct    Avg   Wait                 Pct
                                    Get    Get   Slps   Time       NoWait NoWait
Latch                          Requests   Miss  /Miss    (s)     Requests   Miss
------------------------ -------------- ------ ------ ------ ------------ ------
transaction allocation           65,189    0.0             0            0
transaction branch alloc             43    0.0             0            0
undo global data              4,269,575    0.0    0.0      4            2    0.0
user lock                           698    0.0             0            0
-------------------------------------------------------------
Latch Sleep breakdown for DB: TMP7 Instance: tmp7 Snaps: 27001 -27002
-> ordered by misses desc

                                      Get                         Spin &
Latch Name                       Requests      Misses      Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- --------------------
library cache                  13,586,420     119,756       5,871 114003/5636/116/1/0
cache buffers chains          171,062,801      43,676       8,497 35984/7271/267/154/0
shared pool                     5,302,485      33,844         581 33265/577/2/0/0
row cache objects               2,822,625      15,909         486 15430/473/5/1/0
library cache pin              11,628,454      14,076         829 13254/815/7/0/0
library cache pin allocati      3,925,659       4,035         144 3892/142/1/0/0
undo global data                4,269,575       1,958          94 1866/90/2/0/0
row cache enqueue latch         2,746,890       1,381          34 1348/32/1/0/0
redo allocation                 1,018,290       1,165          90 1076/88/1/0/0
cache buffer handles            1,225,624         585          18 567/18/0/0/0
session allocation                310,879         142           5 137/5/0/0/0
simulator hash latch            5,478,373          48           1 47/1/0/0/0
cache buffers lru chain            17,222          23           4 19/4/0/0/0
dml lock allocation                65,375           9           1 8/1/0/0/0
simulator lru latch                 5,026           5           2 3/2/0/0/0
sequence cache                     39,583           4           4 1/2/1/0/0
checkpoint queue latch            117,634           1           1 0/1/0/0/0
-------------------------------------------------------------
Latch Miss Sources for DB: TMP7 Instance: tmp7 Snaps: 27001 -27002
-> only latches with sleeps are shown
-> ordered by name, sleeps desc

NoWait Waiter
Latch Name Where Misses Sleeps Sleeps
------------------------ -------------------------- ------- ---------- --------
cache buffer handles kcbzgs 0 12 11
cache buffer handles kcbzfs 0 6 7
cache buffers chains kcbchg: kslbegin: bufs not 0 3,336 3,150
cache buffers chains kcbgtcr: fast path 0 3,143 2,121
cache buffers chains kcbgtcr: kslbegin excl 0 1,322 1,521
cache buffers chains kcbchg: kslbegin: call CR 0 212 888
cache buffers chains kcbrls: kslbegin 0 211 426
cache buffers chains kcbzwb 0 100 16
cache buffers chains kcbzgb: scan from tail. no 0 83 0
cache buffers chains kcbget: pin buffer 0 38 151
cache buffers chains kcbgcur: kslbegin 0 16 55
cache buffers chains kcbcge 0 12 136
cache buffers chains kcbget: exchange 0 5 1
cache buffers chains kcbgtcr 0 5 0
cache buffers chains kcbget: exchange rls 0 3 1
cache buffers chains kcbnew 0 2 0
cache buffers chains kcbnlc 0 2 21
cache buffers chains kcbgtcr: kslbegin shared 0 1 0
cache buffers lru chain kcbzgb: multiple sets nowa 0 3 3
cache buffers lru chain kcbgtcr:CR Scan:KCBRSKIP 0 1 0
checkpoint queue latch kcbklbc: Link buffer into 0 1 0
dml lock allocation ktaiam 0 1 1
library cache kglupc: child 0 1,587 1,536
library cache kglpnc: child 0 1,471 2,006
library cache kglpndl: child: before pro 0 843 620
library cache kglpin: child: heap proces 0 632 836
library cache kgllkdl: child: cleanup 0 486 15
library cache kglpnp: child 0 318 534
library cache kglpndl: child: after proc 0 240 7
library cache kglhdgn: child: 0 177 223
library cache kglobpn: child: 0 55 35
library cache kglic 0 36 19
library cache kglpin 0 14 5
library cache kglget: child: KGLDSBRD 0 8 17
library cache kgldte: child 0 0 3 3
library cache kglpsl: child 0 1 0
library cache pin kglpnc: child 0 300 275
library cache pin kglupc 0 219 263
library cache pin kglpndl 0 158 87
library cache pin kglpnal: child: alloc spac 0 98 144
library cache pin kglpnp: child 0 55 61
library cache pin alloca kglpnal 0 75 84
library cache pin alloca kglpndl 0 50 49
library cache pin alloca kgllkdl 0 19 11
redo allocation kcrfwr 0 90 88
row cache enqueue latch kqreqd 0 23 3
row cache enqueue latch kqreqa 0 11 31
row cache objects kqrpfl: not dirty 0 406 19
row cache objects kqrpre: find obj 0 79 466
row cache objects kqrcmt: while loop 0 1 0
sequence cache kdnss 0 2 6
sequence cache kdnnxt: cached seq 0 1 0
sequence cache kdnssd 0 1 0
session allocation ksuprc 0 4 2
session allocation ksudlc 0 1 3
shared pool kghupr1 0 580 577
shared pool kghalo 0 1 1
simulator hash latch kcbsacc: lookup dba 0 1 0
simulator lru latch kcbs_simulate: simulate se 0 2 2
undo global data ktudba: KSLBEGIN 0 89 92
undo global data ktudnx:child 0 4 1
undo global data ktubnd:child 0 1 1
-------------------------------------------------------------

Parse CPU to Parse Elapsd %: 30.09
seems very low. Do the latch figures look normal?

1) library cache is at 0.9 Pct Get Miss.
2) cache buffers chains has 1.1 Pct NoWait Miss.
3) checkpoint queue latch has 1.0 Avg Slps/Miss.
4) other stats ...

Which of these stats seems to be the problem?

Can you advise us on where to look for the problem?

Thanks

August 20, 2008 - 9:59 am UTC

none of what you listed.

look at your waits up there. You parse like mad (488 per second), you are doing IO.

to review a single job, you cannot use statspack (it reports on the whole instance, not on one session); you have to use sql_trace/tkprof, or the ASH information in 10g and above
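
For reference, the latch percentages quoted in the question can be recomputed from the "Latch Sleep breakdown" figures. The following is only a minimal Python sketch: the function names are made up for illustration, and the inputs are copied by hand from the report above. Every miss rate comes out under 1%, which is consistent with none of the listed latches being the main problem.

```python
# Figures copied from the "Latch Sleep breakdown" section of the report above:
# latch name -> (willing-to-wait get requests, misses, sleeps)
latches = {
    "library cache": (13_586_420, 119_756, 5_871),
    "cache buffers chains": (171_062_801, 43_676, 8_497),
    "shared pool": (5_302_485, 33_844, 581),
    "row cache objects": (2_822_625, 15_909, 486),
}


def miss_pct(gets, misses):
    """Percentage of willing-to-wait gets that missed on the first attempt."""
    return 100.0 * misses / gets


def sleeps_per_miss(misses, sleeps):
    """Average number of sleeps per missed get."""
    return sleeps / misses


for name, (gets, misses, sleeps) in latches.items():
    print(f"{name:<22} miss% {miss_pct(gets, misses):5.2f}  "
          f"slps/miss {sleeps_per_miss(misses, sleeps):4.2f}")
```

Rounded to one decimal place, these match the "Pct Get Miss" and "Avg Slps/Miss" columns of the Latch Activity section.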

job problem

Louis, August 22, 2008 - 1:17 pm UTC

Here is some more information.

In the past, this job took a long time to execute. To make it faster, the load was split into 10 identical smaller jobs. So when this job starts, 10 jobs are scheduled and executed at the same time.

So we have 10 processes running concurrently, which makes the whole run faster.

So the tkprof is not for one job but for 10 small jobs.

Here is part of that big tkprof (1,500 KB) with the hottest statements (parse, execute and fetch stats).

Can you help us work out what to look for?

begin select utsp.utsp_id, atgl.atgl_nom_abrg, atgl.atgl_id, 
  atgl.atgl_copie_atgl_id, nvl(utsp.utsp_ind_quest_parent,'N'), 
  atgl.atgl_datatype, atgl.atgl_format_car, atgl.atgl_nbr_car_max_val, 
  nvl(utsp.utsp_nom_proc_entree,atgl.atgl_nom_proc_entree), 
  nvl(utsp.utsp_nom_proc_sortie,atgl.atgl_nom_proc_sortie), 
  nvl(utsp.utsp_nom_proc_lov,atgl.atgl_nom_proc_lov), atgl.atgl_code_info, 
  atgl.atgl_tranche, decode(utsp.utsp_detail_tbl,'SOUS_TBL_DETAIL_OBT','DEOB',
   'INTR_TBL_DETAIL_INTR','DEIN', 'SINI_TBL_DETAIL_SINISTRE_OBT','DSIO', 
  'CORR_TBL_DETAIL_DOCMNT_EXPD','DEDE','SINI_TBL_MNT_COUVRT', 'MOCO'), 
  decode(utsp.utsp_detail_tbl_2,'SOUS_TBL_DETAIL_OBT','DEOB', 
  'INTR_TBL_DETAIL_INTR','DEIN',  'SINI_TBL_DETAIL_SINISTRE_OBT','DSIO', 
  'CORR_TBL_DETAIL_DOCMNT_EXPD','DEDE','SINI_TBL_MNT_COUVRT', 'MOCO'), 
  nvl(atgl.atgl_ind_plage,'N'), nvl(atgl.atgl_ind_val,'N'), 
  nvl(utsp.utsp_ind_compo_base,'N'), nvl(utsp.utsp_ind_compo_etabli,'N'), 
  nvl(atgl.atgl_ind_lov_restrict,'N'), utsp.atgl_id_maitr_dnm, 
  nvl(utsp.utsp_ind_subst,'O'), utsp.atgl_id_detail_mult_dnm, 
  decode(:l_type_trt_spec,'RENOU_NOUVEAU',utsp.vaat_id_val_def_renou,
  nvl(utsp.vaat_id_val_def,atgl.vaat_id_val_def)), utsp.atgl_id_quest_dnm, 
  utsp.utsp_dt_vig, decode(greatest(utsp.utsp_dt_vig,:l_dt_query_vig_expir),  
  :l_dt_query_vig_expir,:l_dt_query_vig_expir,sysdate)  bulk collect into  
  pilo_pck_matrice.l_utsp_id,  pilo_pck_matrice.l_nom_abrg,  
  pilo_pck_matrice.l_atgl_id,  pilo_pck_matrice.l_atgl_copie_atgl_id,  
  pilo_pck_matrice.l_utsp_ind_quest_parent,  pilo_pck_matrice.l_datatype,  
  pilo_pck_matrice.l_format_car,  pilo_pck_matrice.l_nbr_car_max_val,  
  pilo_pck_matrice.l_nom_proc_entree,  pilo_pck_matrice.l_nom_proc_sortie,  
  pilo_pck_matrice.l_nom_proc_lov,  pilo_pck_matrice.l_atgl_code_info,  
  pilo_pck_matrice.l_tranche,  pilo_pck_matrice.l_detail_tbl,  
  pilo_pck_matrice.l_detail_tbl_2,  pilo_pck_matrice.l_ind_plage,  
  pilo_pck_matrice.l_ind_val_spec,  pilo_pck_matrice.l_ind_compo_base,  
  pilo_pck_matrice.l_ind_compo_etabli,  pilo_pck_matrice.l_ind_lov_restrict,  
  pilo_pck_matrice.l_atgl_id_maitr,  pilo_pck_matrice.l_utsp_ind_subst,  
  pilo_pck_matrice.l_atgl_id_detail_mult,  pilo_pck_matrice.l_vaat_id_val_def,
    pilo_pck_matrice.l_atgl_id_quest_dnm,  pilo_pck_matrice.l_dt_vig_utsp,  
  pilo_pck_matrice.l_dt_query_vig_expir_used from   pilo_tbl_attr_glob atgl, 
  pilo_tbl_utilsn_spec utsp where  atgl.atgl_id = utsp.atgl_id_spec and    
  atgl.atgl_ind_compo_inter = 'N' and    utsp.gapr_id = :l_fk_utsp_n and    
  (:l_dt_query_vig_expir between utsp.utsp_dt_vig and nvl(utsp.utsp_dt_expir,
  :l_dt_query_vig_expir) or    sysdate between utsp.utsp_dt_vig and 
  nvl(utsp.utsp_dt_expir,sysdate)) and    utsp.utsp_id = (   select 
  max(utsp2.utsp_id)   from   pilo_tbl_utilsn_spec utsp2   where  
  utsp2.gapr_id = :l_fk_utsp_n   and    utsp2.atgl_id_spec = 
  utsp.atgl_id_spec   and    (:l_dt_query_vig_expir between utsp2.utsp_dt_vig 
  and nvl(utsp2.utsp_dt_expir,:l_dt_query_vig_expir)   or     sysdate between 
  utsp2.utsp_dt_vig and nvl(utsp2.utsp_dt_expir,sysdate))) and    exists (  
  select 1   from   pilo_tbl_ind_attr_spec inas   where  inas.utsp_id = 
  utsp.utsp_id and    inas.inas_ind_somr = decode(:l_ind_lire_somr,'O','O',
  inas.inas_ind_somr) and    nvl(inas.vaat_id_statut,0) = 
  decode(utsp.utsp_ind_compo_etabli,'O',0,decode(:l_mode_exec,'GAPR',
  :l_vaat_id_statut_sous,0)) and    decode(greatest(utsp.utsp_dt_vig,
  :l_dt_query_vig_expir),  :l_dt_query_vig_expir,:l_dt_query_vig_expir,
  sysdate)  between inas.inas_dt_vig and nvl(inas.inas_dt_expir,:l_dt_max)) 
  order by utsp.utsp_id; end;


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       72      0.03       0.01          0          0          0           0
Execute     72      0.11       0.07          0          0          0          72
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      144      0.14       0.09          0          0          0          72

Misses in library cache during parse: 1
Optimizer goal: RULE
Parsing user id: 1728  (OPUS)   (recursive depth: 1)

**********************************************************************************

SELECT UTSP.UTSP_ID, ATGL.ATGL_NOM_ABRG, ATGL.ATGL_ID, 
  ATGL.ATGL_COPIE_ATGL_ID, NVL(UTSP.UTSP_IND_QUEST_PARENT,'N'), 
  ATGL.ATGL_DATATYPE, ATGL.ATGL_FORMAT_CAR, ATGL.ATGL_NBR_CAR_MAX_VAL, 
  NVL(UTSP.UTSP_NOM_PROC_ENTREE,ATGL.ATGL_NOM_PROC_ENTREE), 
  NVL(UTSP.UTSP_NOM_PROC_SORTIE,ATGL.ATGL_NOM_PROC_SORTIE), 
  NVL(UTSP.UTSP_NOM_PROC_LOV,ATGL.ATGL_NOM_PROC_LOV), ATGL.ATGL_CODE_INFO, 
  ATGL.ATGL_TRANCHE, DECODE(UTSP.UTSP_DETAIL_TBL,'SOUS_TBL_DETAIL_OBT','DEOB',
   'INTR_TBL_DETAIL_INTR','DEIN', 'SINI_TBL_DETAIL_SINISTRE_OBT','DSIO', 
  'CORR_TBL_DETAIL_DOCMNT_EXPD','DEDE','SINI_TBL_MNT_COUVRT', 'MOCO'), 
  DECODE(UTSP.UTSP_DETAIL_TBL_2,'SOUS_TBL_DETAIL_OBT','DEOB', 
  'INTR_TBL_DETAIL_INTR','DEIN', 'SINI_TBL_DETAIL_SINISTRE_OBT','DSIO', 
  'CORR_TBL_DETAIL_DOCMNT_EXPD','DEDE','SINI_TBL_MNT_COUVRT', 'MOCO'), 
  NVL(ATGL.ATGL_IND_PLAGE,'N'), NVL(ATGL.ATGL_IND_VAL,'N'), 
  NVL(UTSP.UTSP_IND_COMPO_BASE,'N'), NVL(UTSP.UTSP_IND_COMPO_ETABLI,'N'), 
  NVL(ATGL.ATGL_IND_LOV_RESTRICT,'N'), UTSP.ATGL_ID_MAITR_DNM, 
  NVL(UTSP.UTSP_IND_SUBST,'O'), UTSP.ATGL_ID_DETAIL_MULT_DNM, DECODE(:B2 ,
  'RENOU_NOUVEAU',UTSP.VAAT_ID_VAL_DEF_RENOU,NVL(UTSP.VAAT_ID_VAL_DEF,
  ATGL.VAAT_ID_VAL_DEF)), UTSP.ATGL_ID_QUEST_DNM, UTSP.UTSP_DT_VIG, 
  DECODE(GREATEST(UTSP.UTSP_DT_VIG,:B1 ), :B1 ,:B1 ,SYSDATE) 
FROM
 PILO_TBL_ATTR_GLOB ATGL, PILO_TBL_UTILSN_SPEC UTSP WHERE ATGL.ATGL_ID = 
  UTSP.ATGL_ID_SPEC AND ATGL.ATGL_IND_COMPO_INTER = 'N' AND UTSP.GAPR_ID = 
  :B3 AND (:B1 BETWEEN UTSP.UTSP_DT_VIG AND NVL(UTSP.UTSP_DT_EXPIR,:B1 ) OR 
  SYSDATE BETWEEN UTSP.UTSP_DT_VIG AND NVL(UTSP.UTSP_DT_EXPIR,SYSDATE)) AND 
  UTSP.UTSP_ID = ( SELECT MAX(UTSP2.UTSP_ID) FROM PILO_TBL_UTILSN_SPEC UTSP2 
  WHERE UTSP2.GAPR_ID = :B3 AND UTSP2.ATGL_ID_SPEC = UTSP.ATGL_ID_SPEC AND 
  (:B1 BETWEEN UTSP2.UTSP_DT_VIG AND NVL(UTSP2.UTSP_DT_EXPIR,:B1 ) OR SYSDATE 
  BETWEEN UTSP2.UTSP_DT_VIG AND NVL(UTSP2.UTSP_DT_EXPIR,SYSDATE))) AND EXISTS 
  ( SELECT 1 FROM PILO_TBL_IND_ATTR_SPEC INAS WHERE INAS.UTSP_ID = 
  UTSP.UTSP_ID AND INAS.INAS_IND_SOMR = DECODE(:B7 ,'O','O',
  INAS.INAS_IND_SOMR) AND NVL(INAS.VAAT_ID_STATUT,0) = 
  DECODE(UTSP.UTSP_IND_COMPO_ETABLI,'O',0,DECODE(:B5 ,'GAPR',:B6 ,0)) AND 
  DECODE(GREATEST(UTSP.UTSP_DT_VIG,:B1 ), :B1 ,:B1 ,SYSDATE) BETWEEN 
  INAS.INAS_DT_VIG AND NVL(INAS.INAS_DT_EXPIR,:B4 )) ORDER BY UTSP.UTSP_ID


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       72      0.00       0.00          0          0          0           0
Execute     72      0.05       0.01          0          0          0           0
Fetch       72      0.01       0.01          0        632          0          16
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      216      0.06       0.03          0        632          0          16

Misses in library cache during parse: 1
Optimizer goal: RULE
Parsing user id: 1728  (OPUS)   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  SORT ORDER BY 
      0   FILTER  
      1    NESTED LOOPS  
      1     TABLE ACCESS BY INDEX ROWID PILO_TBL_UTILSN_SPEC 
      1      INDEX RANGE SCAN PILO_IDX_UTSP_GAPR (object id 15949)
      1     TABLE ACCESS BY INDEX ROWID PILO_TBL_ATTR_GLOB 
      1      INDEX UNIQUE SCAN ATGL_PK (object id 15906)
      1    SORT AGGREGATE 
      1     TABLE ACCESS BY INDEX ROWID PILO_TBL_UTILSN_SPEC 
      1      AND-EQUAL  
      2       INDEX RANGE SCAN PILO_IDX_UTSP_GAPR (object id 15949)
      1       INDEX RANGE SCAN PILO_IDX_UTSP_ATGL_SPEC (object id 15946)
      0    TABLE ACCESS BY INDEX ROWID PILO_TBL_IND_ATTR_SPEC 
      0     INDEX RANGE SCAN PILO_IDX_INAS_UTSP (object id 15939)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: RULE
      0   SORT (ORDER BY)
      0    FILTER
      1     NESTED LOOPS
      1      TABLE ACCESS (BY INDEX ROWID) OF 'PILO_TBL_UTILSN_SPEC'
      1       INDEX (RANGE SCAN) OF 'PILO_IDX_UTSP_GAPR' (NON-UNIQUE)
      1      TABLE ACCESS (BY INDEX ROWID) OF 'PILO_TBL_ATTR_GLOB'
      1       INDEX (UNIQUE SCAN) OF 'ATGL_PK' (UNIQUE)
      1     SORT (AGGREGATE)
      1      TABLE ACCESS (BY INDEX ROWID) OF 'PILO_TBL_UTILSN_SPEC'
      1       AND-EQUAL
      2        INDEX (RANGE SCAN) OF 'PILO_IDX_UTSP_GAPR' (NON-UNIQUE)

      1        INDEX (RANGE SCAN) OF 'PILO_IDX_UTSP_ATGL_SPEC' 
                   (NON-UNIQUE)
      0     TABLE ACCESS (BY INDEX ROWID) OF 'PILO_TBL_IND_ATTR_SPEC'
      0      INDEX (RANGE SCAN) OF 'PILO_IDX_INAS_UTSP' (NON-UNIQUE)

********************************************************************************

********************************************************************************

SELECT INAS.UTSP_ID, NVL(INAS.INAS_IND_OBLGTR,'N'), NVL(INAS.INAS_IND_SAISI,
  'N'), NVL(INAS.INAS_IND_MODIF,'N'), NVL(INAS.INAS_IND_TEMP,'N'), 
  NVL(INAS.INAS_IND_SOMR,'N'), NVL(INAS.INAS_IND_TARIF,'N'), 
  NVL(INAS.INAS_IND_TARIF_RENOU,'N'), NVL(INAS.INAS_IND_AFFCHG,'N'), 
  NVL(INAS.INAS_IND_AFFCHG_DESCR,'N'), NVL(INAS.INAS_IND_AFFCHG_MAJUSC,'N'), 
  NVL(INAS.INAS_IND_SAUVGRD,'N'), NVL(INAS.INAS_IND_OBLGTR_PROC_CALC,'N'), 
  NVL(INAS.INAS_SEQ_AFFCHG,0), NVL(INAS.INAS_SEQ_AFFCHG_SOMR,-1), 
  NVL(INAS.VAAT_ID_REGRPMNT,0), DECODE(GREATEST(UTSP.UTSP_DT_VIG,:B1 ), :B1 ,
  :B1 ,SYSDATE) 
FROM
 PILO_TBL_IND_ATTR_SPEC INAS, PILO_TBL_ATTR_GLOB ATGL, PILO_TBL_UTILSN_SPEC 
  UTSP WHERE INAS.UTSP_ID = UTSP.UTSP_ID AND INAS.INAS_IND_SOMR = DECODE(:B6 ,
  'O','O',INAS.INAS_IND_SOMR) AND NVL(INAS.VAAT_ID_STATUT,0) = 
  DECODE(UTSP.UTSP_IND_COMPO_ETABLI,'O',0,DECODE(:B4 ,'GAPR',:B5 ,0)) AND 
  DECODE(GREATEST(UTSP.UTSP_DT_VIG,:B1 ), :B1 ,:B1 ,SYSDATE) BETWEEN 
  INAS.INAS_DT_VIG AND NVL(INAS.INAS_DT_EXPIR,:B3 ) AND ATGL.ATGL_ID = 
  UTSP.ATGL_ID_SPEC AND ATGL.ATGL_IND_COMPO_INTER = 'N' AND UTSP.GAPR_ID = 
  :B2 AND (:B1 BETWEEN UTSP.UTSP_DT_VIG AND NVL(UTSP.UTSP_DT_EXPIR,:B1 ) OR 
  SYSDATE BETWEEN UTSP.UTSP_DT_VIG AND NVL(UTSP.UTSP_DT_EXPIR,SYSDATE)) AND 
  UTSP.UTSP_ID = ( SELECT MAX(UTSP2.UTSP_ID) FROM PILO_TBL_UTILSN_SPEC UTSP2 
  WHERE UTSP2.GAPR_ID = :B2 AND UTSP2.ATGL_ID_SPEC = UTSP.ATGL_ID_SPEC AND 
  (:B1 BETWEEN UTSP2.UTSP_DT_VIG AND NVL(UTSP2.UTSP_DT_EXPIR,:B1 ) OR SYSDATE 
  BETWEEN UTSP2.UTSP_DT_VIG AND NVL(UTSP2.UTSP_DT_EXPIR,SYSDATE))) ORDER BY 
  UTSP.UTSP_ID


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       72      0.00       0.00          0          0          0           0
Execute     72      0.02       0.01          0          0          0           0
Fetch       72      0.00       0.01          0        524          0          16
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      216      0.02       0.03          0        524          0          16

Misses in library cache during parse: 1
Optimizer goal: RULE
Parsing user id: 1728  (OPUS)   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  SORT ORDER BY 
      0   FILTER  
      0    TABLE ACCESS BY INDEX ROWID PILO_TBL_IND_ATTR_SPEC 
      2     NESTED LOOPS  
      1      NESTED LOOPS  
      1       TABLE ACCESS BY INDEX ROWID PILO_TBL_UTILSN_SPEC 
      1        INDEX RANGE SCAN PILO_IDX_UTSP_GAPR (object id 15949)
      1       TABLE ACCESS BY INDEX ROWID PILO_TBL_ATTR_GLOB 
      1        INDEX UNIQUE SCAN ATGL_PK (object id 15906)
      0      INDEX RANGE SCAN PILO_IDX_INAS_UTSP (object id 15939)
      0    SORT AGGREGATE 
      0     TABLE ACCESS BY INDEX ROWID PILO_TBL_UTILSN_SPEC 
      0      AND-EQUAL  
      0       INDEX RANGE SCAN PILO_IDX_UTSP_GAPR (object id 15949)
      0       INDEX RANGE SCAN PILO_IDX_UTSP_ATGL_SPEC (object id 15946)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: RULE
      0   SORT (ORDER BY)
      0    FILTER
      0     TABLE ACCESS (BY INDEX ROWID) OF 'PILO_TBL_IND_ATTR_SPEC'
      2      NESTED LOOPS
      1       NESTED LOOPS
      1        TABLE ACCESS (BY INDEX ROWID) OF 
                   'PILO_TBL_UTILSN_SPEC'
      1         INDEX (RANGE SCAN) OF 'PILO_IDX_UTSP_GAPR' 
                    (NON-UNIQUE)
      1        TABLE ACCESS (BY INDEX ROWID) OF 'PILO_TBL_ATTR_GLOB'
      1         INDEX (UNIQUE SCAN) OF 'ATGL_PK' (UNIQUE)
      0       INDEX (RANGE SCAN) OF 'PILO_IDX_INAS_UTSP' (NON-UNIQUE)
      0     SORT (AGGREGATE)
      0      TABLE ACCESS (BY INDEX ROWID) OF 'PILO_TBL_UTILSN_SPEC'
      0       AND-EQUAL
      0        INDEX (RANGE SCAN) OF 'PILO_IDX_UTSP_GAPR' (NON-UNIQUE)

      0        INDEX (RANGE SCAN) OF 'PILO_IDX_UTSP_ATGL_SPEC' 
                   (NON-UNIQUE)

********************************************************************************

SELECT NOPR.NORM_ID, NOPR.NORM_ID_DEPN 
FROM
 PROD_TBL_NORM_PROD NOPR ,PROD_TBL_NORM NORM WHERE NOPR.PROD_CODE = :B3 AND 
  NORM.NORM_ID = NOPR.NORM_ID AND NORM.NORM_CONTXT IS NULL AND INSTR(',
  '||NVL(:B7 ,NOPR.NORM_ID)||',', ','||NOPR.NORM_ID||',') != 0 AND 
  PILO_PCK_MATRICE_1.FNC_VERIF_SELECT_NORM( :B6 , NOPR.PROD_CODE, 
  NOPR.NORM_ID, NOPR.NOPR_DT_VIG, NOPR.NOPR_DT_EXPIR, :B2 , :B1 , :B5 ) IS 
  NOT NULL CONNECT BY PRIOR NOPR.NORM_ID_DEPN = NOPR.NORM_ID AND 
  NOPR.PROD_CODE = :B3 AND :B2 BETWEEN NOPR.NOPR_DT_VIG AND 
  NVL(NOPR.NOPR_DT_EXPIR,:B1 ) START WITH NOPR.NORM_ID = :B4 AND 
  NOPR.PROD_CODE = :B3 AND :B2 BETWEEN NOPR.NOPR_DT_VIG AND 
  NVL(NOPR.NOPR_DT_EXPIR,:B1 )


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute     47      0.00       0.00          0          0          0           0
Fetch      125      3.34       3.27          0     339622          0         103
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      172      3.34       3.28          0     339622          0         103

Misses in library cache during parse: 0
Optimizer goal: RULE
Parsing user id: 1728  (OPUS)   (recursive depth: 1)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: RULE
      0   FILTER
      0    CONNECT BY (WITHOUT FILTERING)
      0     FILTER
      0      COUNT
      0       NESTED LOOPS
      0        TABLE ACCESS (FULL) OF 'PROD_TBL_NORM'
      0        TABLE ACCESS (BY INDEX ROWID) OF 'PROD_TBL_NORM_PROD'
      0         INDEX (RANGE SCAN) OF 'PROD_IDX_NOPR_NORM' 
                    (NON-UNIQUE)
      0     COUNT
      0      NESTED LOOPS
      0       TABLE ACCESS (FULL) OF 'PROD_TBL_NORM'
      0       TABLE ACCESS (BY INDEX ROWID) OF 'PROD_TBL_NORM_PROD'
      0        INDEX (RANGE SCAN) OF 'PROD_IDX_NOPR_NORM' (NON-UNIQUE)


********************************************************************************

SELECT VAAT.VAAT_ID 
FROM
 PILO_TBL_VAL_ATTR VAAT WHERE VAAT.VAAT_CODE_INFO = UPPER(:B1 )


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute   3832      0.04       0.09          0          0          0           0
Fetch     3832      0.11       0.07          0      11496          0        3832
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     7664      0.15       0.17          0      11496          0        3832

Misses in library cache during parse: 0
Optimizer goal: RULE
Parsing user id: 1728  (OPUS)   (recursive depth: 1)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: RULE
      0   TABLE ACCESS (BY INDEX ROWID) OF 'PILO_TBL_VAL_ATTR'
      0    INDEX (RANGE SCAN) OF 'PILO_IDX_VAAT_CODE_INFO' (NON-UNIQUE)

********************************************************************************

SELECT NORM.NORM_ID 
FROM
 PROD_TBL_NORM NORM WHERE NORM.NORM_ID = :B7 AND :B4 BETWEEN NORM.NORM_DT_VIG 
  AND NVL ( NORM.NORM_DT_EXPIR, :B1 ) AND NORM.VAAT_ID_TYPE_APPL = :B6 AND 
  NORM.NORM_TYPE = 'VALIDATION' AND NORM.NORM_EXEC = DECODE(:B5 ,
  'SOUS_VALIDE_2','V','SINI_EXPEDITION','E',NULL) AND :B4 BETWEEN :B3 AND NVL 
  ( :B2 , :B1 )


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    900      0.01       0.02          0          0          0           0
Fetch      900      0.03       0.01          0       2775          0          21
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1801      0.04       0.04          0       2775          0          21

Misses in library cache during parse: 1
Optimizer goal: RULE
Parsing user id: 1728  (OPUS)   (recursive depth: 2)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: RULE
      0   FILTER
      0    TABLE ACCESS (BY INDEX ROWID) OF 'PROD_TBL_NORM'
      0     INDEX (UNIQUE SCAN) OF 'NORM_PK' (UNIQUE)

********************************************************************************


********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       56      0.13       0.11          4        306          0           0
Execute     82      0.06       0.03          0          0          0           0
Fetch       50      0.28       0.31          0       8946          0          65
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      188      0.47       0.46          4       9252          0          65

Misses in library cache during parse: 16


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     1795      2.81       2.74         40        786          4           0
Execute  15943      4.70       4.86        310      51456       5456        1093
Fetch    32101      8.69       8.86        787     725333         69       28476
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    49839     16.20      16.47       1137     777575       5529       29569

Misses in library cache during parse: 585
Misses in library cache during execute: 3

 1458  user  SQL statements in session.
  738  internal SQL statements in session.
 2196  SQL statements in session.
  600  statements EXPLAINed in this session.
********************************************************************************
Trace file: /oracle/opus/tmp6/log/udump/tmp6_ora_27666.trc
Trace file compatibility: 9.02.00
Sort options: default

       1  session in tracefile.
    1458  user  SQL statements in trace file.
     738  internal SQL statements in trace file.
    2196  SQL statements in trace file.
     985  unique SQL statements in trace file.
     600  SQL statements EXPLAINed using schema:
           LMALLETTE.prof$plan_table
             Default table was used.
             Table was created.
             Table was dropped.
   63940  lines in trace file.

August 26, 2008 - 7:35 pm UTC

Not sure what you are asking for here. You would look for the same sort of stuff you did when you had a single job - your poorly performing bits of code.

trace them
analyze their output
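
One way to pick out the poorly performing bits is to rank statements by logical I/O and by how many buffer gets each returned row costs. The following is only a Python sketch, assuming the per-statement figures are copied by hand out of the tkprof report; the labels and the `profile` helper are made up for illustration.

```python
# (label, executions, query-mode buffer gets, rows), copied from the tkprof excerpt above
statements = [
    ("CONNECT BY on PROD_TBL_NORM_PROD", 47, 339_622, 103),
    ("PILO_TBL_VAL_ATTR lookup", 3_832, 11_496, 3_832),
    ("PROD_TBL_NORM validation lookup", 900, 2_775, 21),
]

# "query" total from OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
RECURSIVE_TOTAL_GETS = 777_575


def profile(execs, gets, rows):
    """Return (gets per execution, gets per row, % of all recursive gets)."""
    per_exec = gets / execs
    per_row = gets / rows if rows else float("inf")
    share = 100.0 * gets / RECURSIVE_TOTAL_GETS
    return per_exec, per_row, share


# Highest logical I/O first.
for label, execs, gets, rows in sorted(statements, key=lambda s: s[2], reverse=True):
    per_exec, per_row, share = profile(execs, gets, rows)
    print(f"{label:<34} gets/exec={per_exec:8.1f} "
          f"gets/row={per_row:8.1f} share={share:5.1f}%")
```

On these figures the CONNECT BY query does about 7,226 gets per execution and accounts for roughly 44% of all recursive logical I/O in the trace, so it is the first statement worth a closer look.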

here is the summary of the tkprof

Louis, August 22, 2008 - 1:31 pm UTC

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       56      0.13       0.11          4        306          0           0
Execute     82      0.06       0.03          0          0          0           0
Fetch       50      0.28       0.31          0       8946          0          65
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      188      0.47       0.46          4       9252          0          65

Misses in library cache during parse: 16

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     1795      2.81       2.74         40        786          4           0
Execute  15943      4.70       4.86        310      51456       5456        1093
Fetch    32101      8.69       8.86        787     725333         69       28476
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    49839     16.20      16.47       1137     777575       5529       29569

Misses in library cache during parse: 585
Misses in library cache during execute: 3

1458 user SQL statements in session.
738 internal SQL statements in session.
2196 SQL statements in session.
600 statements EXPLAINed in this session.
********************************************************************************
Trace file: /oracle/opus/tmp6/log/udump/tmp6_ora_27666.trc
Trace file compatibility: 9.02.00
Sort options: default

1 session in tracefile.
1458 user SQL statements in trace file.
738 internal SQL statements in trace file.
2196 SQL statements in trace file.
985 unique SQL statements in trace file.
600 SQL statements EXPLAINed using schema:
LMALLETTE.prof$plan_table
Default table was used.
Table was created.
Table was dropped.
63940 lines in trace file.

transactions per second in last 7 days

Review, September 09, 2008 - 11:11 am UTC

Tom, is it possible to determine the average transactions per second over the last 7 days?
From the AWR report I can analyze a 1-hour interval, so I have to add up all the values manually to find the average. Which views should I look at to get the values for the last 7 days?

September 10, 2008 - 9:16 am UTC

you would have to be using statspack or awr in general.

you would generate a 7 day statspack report, and then you would have this relatively meaningless number that no one can really use for anything useful :)
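
The arithmetic behind such a number is simple: STATSPACK reports "Transactions" as user commits plus user rollbacks, divided by the elapsed seconds between two snapshots. The following is only a Python sketch with hypothetical counter readings; the `avg_tps` helper and every value in the example are made up for illustration.

```python
from datetime import datetime


def avg_tps(begin, end):
    """Average transactions per second between two snapshots.

    Each snapshot is (timestamp, user_commits, user_rollbacks); the two
    counters are cumulative since instance startup.
    """
    (t0, c0, r0), (t1, c1, r1) = begin, end
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        raise ValueError("end snapshot must be later than begin snapshot")
    if c1 < c0 or r1 < r0:
        # Cumulative counters reset when the instance restarts.
        raise ValueError("counters went backwards: instance restarted between snapshots")
    return ((c1 - c0) + (r1 - r0)) / elapsed


# Hypothetical readings taken seven days apart.
begin = (datetime(2008, 9, 2, 11, 0), 1_000_000, 5_000)
end = (datetime(2008, 9, 9, 11, 0), 1_870_000, 5_920)
print(f"{avg_tps(begin, end):.2f} transactions/sec")
```

A single average over seven days flattens every peak and quiet period into one figure, which is why it tells you so little.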

buffer busy waits

bobby, September 14, 2008 - 2:46 pm UTC

Hello Tom

Thanks for your help and hints.
I have another performance issue which I am trying to resolve, and would greatly appreciate your expert comments on this.

This is an 8.1.7.4.0 database, and a one-hour snap from it shows the figures below. I have read in your comments that you recommend shorter snap durations, but we could only get one-hour snap reports here.

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:              9,042.37              6,287.23
              Logical reads:              7,275.06              5,058.41
              Block changes:                100.31                 69.75
             Physical reads:              1,768.76              1,229.83
            Physical writes:                 28.69                 19.95
                 User calls:                 68.77                 47.82
                     Parses:                 27.05                 18.81
                Hard parses:                  5.59                  3.89
                      Sorts:                 93.43                 64.96
                     Logons:                  0.13                  0.09
                   Executes:                 38.94                 27.07
               Transactions:                  1.44

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 90.31 Redo NoWait %: 99.99
Buffer Hit %: 75.69 In-memory Sort %: 100.00
Library Hit %: 93.05 Soft Parse %: 79.32
Execute to Parse %: 30.54 Latch Hit %: 98.59
Parse CPU to Parse Elapsd %: 59.40 % Non-Parse CPU: 95.49
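
The two parse-related percentages above can be cross-checked against the Load Profile. The following is only a minimal Python sketch, assuming the usual STATSPACK definitions (Soft Parse % = 100 * (1 - hard parses / parses), Execute to Parse % = 100 * (1 - parses / executes)); the function names are made up for illustration, and the inputs are the rounded per-second rates from the Load Profile.

```python
# Per-second rates copied from the Load Profile above.
parses_per_sec = 27.05
hard_parses_per_sec = 5.59
executes_per_sec = 38.94


def soft_parse_pct(parses, hard_parses):
    """Share of parse calls that did not require a hard parse."""
    return 100.0 * (1.0 - hard_parses / parses)


def execute_to_parse_pct(parses, executes):
    """Share of executions that did not need a separate parse call."""
    return 100.0 * (1.0 - parses / executes)


print(f"Soft Parse %:       {soft_parse_pct(parses_per_sec, hard_parses_per_sec):.2f}")
print(f"Execute to Parse %: {execute_to_parse_pct(parses_per_sec, executes_per_sec):.2f}")
```

Both results agree with the reported 79.32 and 30.54 to within the rounding of the inputs.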

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read 5,487,333 3,773,404 45.15
buffer busy waits 1,398,383 3,756,021 44.94
db file parallel write 939 590,541 7.07
db file scattered read 120,181 122,392 1.46
resmgr:waiting in end wait 439 32,271 .39
-------------------------------------------------------------

The tablespace IO stats show the highest value of 'Writes','Av Writes/sec' and of 'Buffer Waits' on a particular tablespace.
The file IO stats similarly show the highest value of 'Writes','Av Writes/sec' and of 'Buffer Waits' on a particular datafile belonging to that tablespace.
The datafile contains only one segment, and that is a partition of a big table.
The Buffer wait stats are as below:

                                 Tot Wait       Avg
Class                    Waits  Time (cs) Time (cs)
------------------ ----------- ---------- ---------
data block           2,512,853  3,740,290         1
undo block              24,477     25,697         1
undo header                  7         63         9
segment header               2          2         1
          -------------------------------------------------------------

The cache buffers chains latch shows the highest number of Get Requests:

                                                Pct    Avg                 Pct
                                     Get        Get   Slps       NoWait NoWait
Latch Name                      Requests       Miss  /Miss     Requests   Miss
----------------------------- -------------- ------ ------ ------------ ------
Token Manager                          1,607    0.0                 782    0.0
X$KSFQP                                    7    0.0                   0
active checkpoint queue latch          3,296    0.0                   0
archive control                           34    0.0                   0
archive process latch                      3    0.0                   0
begin backup scn array                    25    0.0                   0
cache buffer handles                  13,322    0.0    0.0            0
cache buffers chains              59,674,008    1.0    0.0    7,748,873    0.0
cache buffers lru chain            2,426,818    0.9    0.0    6,672,584    1.2
channel handle pool latch                869    0.0                   0
..................
.........

The 'SQL ordered by Gets' section does not show any DML queries, only SELECT queries, and those SELECT queries are using hints to use a particular index in that same table.
There are some UPDATE queries in the 'SQL ordered by Executions' section and 2 INSERT queries in the 'SQL ordered by Version Count' section, with values as below:

 Version
   Count   Executions   Hash Value
-------- ------------ ------------
      29           19   1259756290
      28            6   1485469596

The update queries are on the same table, but the insert queries are on a different table.
The tables and indexes in the database have not been analyzed for more than 2 years, and the customer doesn't want them analyzed because he had a bad experience with performance 2 years back, when they were analyzed.

A comparison of the report with another day's report (when the slowness was not felt) of the same duration and the same time shows a similar trend, but the event 'buffer busy waits' barely figures in the Top 5 events (as shown below). The file IO stats again show the highest value of Buffer Waits on the same datafile, but this time with a very high value of Reads and a lower value of Writes (as compared to other datafiles).

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                              Wait    % Total
Event                                               Waits    Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read                         7,209,057    1,686,614   95.05
db file scattered read                            103,910       27,785    1.57
db file parallel write                              2,169       21,376    1.20
buffer busy waits                                  93,944       15,990     .90
log file sync                                       6,333       11,363     .64
          -------------------------------------------------------------

What's your advice for working towards resolving this issue?

1. Is it right to think that the indexes, having been unanalyzed for such a long period and being forced by hints, are the major cause of the waits, and should be analyzed to reduce them? And how should we decide on the analyze percentage, or whether the statistics should be computed? I have observed earlier in another database that analyzing tables with an estimate percentage is sometimes better for performance than computing the stats.

2. Should we try increasing the PCTFREE values or adding more freelists to reduce waits further? I have learned about this from your books.
3. What other observations could be made from the reports to enhance the performance? Would you like me to paste the complete report here so you could have a look and understand better?

Thanks for the learnings you provide

September 16, 2008 - 9:56 pm UTC

... but we
could get only the one hour snap reports here.
...

you need to rephrase that, of course you can get shorter ones, you and your coworkers have decided not to.

... and
those SELECT queries are using hints to use a particular index in that same
table.
...

and that probably is why you have so many IO requests - the index just might be the worst way to access the data.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:6749454952894#6760861174154

so, which was better - with or without indexes.....

IF you are doing reads only (looks likely, lots of db file sequential reads - the IO we do going from index to table, index to table over and over and over and over....) the buffer busy waits are likely because the cache was full and we needed to empty out parts of it before we could put the block we just read into it.

1) hints can be good, hints can be evil. Even if you updated statistics, if you hinted - so what, it wouldn't change our mind because you made up our mind for us.

2) pctfree? Why, you are reading here - not inserting?? Not sure where the pctfree or freelists came from - they would be something to look at for WRITE contention - but you say you are doing all reads here...

3) an hour is too long. 8i is really old. No, I wouldn't want the entire report. You are doing a ton of IO, look for ways to reduce it (eg: look at the queries doing the most logical IO and see why they are doing so much and do something about it...)
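
As a rough sketch of "look at the queries doing the most logical IO", the shared pool can be queried directly. This assumes SELECT access to v$sqlarea; the top-10 cutoff is arbitrary, and the counters are cumulative for as long as each statement has stayed in the shared pool.

```sql
-- Top 10 statements by logical I/O (buffer gets) currently in the shared pool.
select *
  from (select hash_value,
               buffer_gets,
               disk_reads,
               executions,
               round(buffer_gets / greatest(executions, 1)) as gets_per_exec,
               sql_text
          from v$sqlarea
         order by buffer_gets desc)
 where rownum <= 10;
```

A statement with a very high gets_per_exec is a candidate for a better access path; one with modest gets_per_exec but huge executions is a candidate for being run less often.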

... Soft Parse %: 79.32 ....

and of course the coders did not bind :(

Soft Parse

Bobby, September 23, 2008 - 3:46 am UTC

Thanks Tom

Just one more doubt about the previous question.

Were you able to deduce about non-usage of bind variables, looking at soft parse ratio ?
Actually from the reports, I saw that the queries were really using ':1' and ':2' at the places of variables, so I think bind variables are being used.

September 24, 2008 - 3:36 pm UTC

there was no deducing - it was quite explicitly shown to us that you have a hard parse problem:

                     Parses:                 27.05                 18.81
                Hard parses:                  5.59                  3.89
                      Sorts:                 93.43                 64.96
                     Logons:                  0.13                  0.09
                   Executes:                 38.94                 27.07
               Transactions:                  1.44


Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   90.31       Redo NoWait %:   99.99
            Buffer  Hit   %:   75.69    In-memory Sort %:  100.00
            Library Hit   %:   93.05        Soft Parse %:   79.32
         Execute to Parse %:   30.54         Latch Hit %:   98.59

One out of every Five sql statements is hard parsed, that is really quite bad.
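
The ratio follows directly from the two parse counters: soft parse % = 100 * (1 - hard parses / total parses), and 5.59 hard parses out of 27.05 parses per second is roughly one in five. A sketch of the same calculation from the instance-wide counters, with the caveat that v$sysstat values are cumulative since startup, so this is a lifetime average rather than the figure for one snapshot interval:

```sql
-- Soft parse percentage since instance startup.
select round(100 * (1 - hard.value / nullif(total.value, 0)), 2) as soft_parse_pct
  from v$sysstat hard,
       v$sysstat total
 where hard.name  = 'parse count (hard)'
   and total.name = 'parse count (total)';
```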

You saw reports of "high load sql" - sql that doesn't use binds won't appear that typically - if you think about it - it is executed ONCE and never again - you are seeing in the reports the sql that is executed over and over and over again. You won't find the "non-bind" sql in here - it is only done once and never again.
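
One rough way to go looking for that "executed once" SQL is to group the shared pool by a leading prefix of the statement text and see which prefixes have many near-identical copies. This is only a heuristic sketch: the 60-character prefix and the threshold of 10 copies are arbitrary assumptions, and statements that differ early in their text will be missed.

```sql
-- Statements that look alike for their first 60 characters and were each
-- executed at most once: a common signature of literals instead of binds.
select substr(sql_text, 1, 60) as sql_prefix,
       count(*)                as copies
  from v$sqlarea
 where executions <= 1
 group by substr(sql_text, 1, 60)
having count(*) > 10
 order by copies desc;
```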

INITRANS

Anil, October 29, 2008 - 2:08 pm UTC

Tom,
We have a huge table (20GB in size) with lots of DML and select queries happening simultaneously. The values of INITRANS and MAXTRANS are 1 and 255 respectively. Do you think it's a good candidate for increasing INITRANS, as I have sometimes seen the BUFFER BUSY WAITS event in STATSPACK? When I queried v$session_wait and dba_extents, they pointed to this table.

Most of our queries crawl using this table.

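-- Sessions currently waiting on 'buffer busy waits'
select p1 "File #", p2 "Block #", p3 "Reason Code"
from v$session_wait
where event = 'buffer busy waits';

-- Map a File #/Block # pair from the query above to its segment
-- (&p1 and &p2 are SQL*Plus substitution variables for those two values)
select owner, segment_name, segment_type
from dba_extents
where file_id = &p1
and &p2 between block_id and block_id + blocks - 1;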

Please suggest whether we should increase INITRANS.

October 29, 2008 - 4:13 pm UTC

you would see itl waits in v$segment_statistics - buffer busy waits are typically caused by

a) waiting on another session that has the block in current mode, when you want it in current mode - initrans would not help that, only one session can have the block in current mode. Only solution here is to reduce the number of interested parties (on a 20g table, seems the modifications could be spread all over the place)

b) waiting for dbwr to make space in the buffer cache - you need an empty block and there are none. Solution is to make dbwr more aggressive about cleaning out the buffer cache or making dbwr able to do that faster.

c) waiting for another session to read the block into the cache - you are both interested in it, but only one of you needs to do the physical IO - the other waits.

so, see if you have lots of ITL waits on that segment before you look at initrans (maxtrans is always 255 now, deprecated parameter/setting)
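
A minimal sketch of that check, assuming 9.2 or later with segment-level statistics being collected; the two substitution variables are placeholders for the table's owner and name:

```sql
-- ITL waits versus buffer busy waits for one segment (all partitions listed separately).
select owner, object_name, subobject_name, object_type, statistic_name, value
  from v$segment_statistics
 where owner          = upper('&table_owner')
   and object_name    = upper('&table_name')
   and statistic_name in ('ITL waits', 'buffer busy waits', 'row lock waits')
 order by statistic_name, subobject_name;
```

If 'ITL waits' is near zero while 'buffer busy waits' is large, INITRANS is unlikely to be the thing to change.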

Thanks

Anil, October 30, 2008 - 6:05 am UTC

Tom,
Thanks.

We are using Oracle 9i

Replies to your points :
(a) one of the suggestions was to partition the table

(b) db_writer_processes = 8

(c) following are the other waits as of now on this table

To reduce the logical reads we have created indexes on most of the columns which are used in WHERE clauses, but there are some queries which go for a full table scan and we cannot avoid it.

STATISTIC_NAME                                  VALUE

logical reads                               330561056
buffer busy waits                            82711700
db block changes                             10655152
physical reads                              142062247
physical writes                               1117229
physical reads direct                               0
physical writes direct                              0
global cache cr blocks served                       0
global cache current blocks served                  0
ITL waits                                           2
row lock waits                                     14

Any suggestion ?

October 30, 2008 - 8:37 am UTC

suggestion would be to verify you actually have a problem worth solving.

No data has been presented to argue that you do.

These numbers you present are since the database was started. I'll just assume that was two months ago. These numbers are therefore tiny.
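
To get numbers for a known interval instead of "since startup", one option is to save the counters, let a representative workload run, and subtract. A sketch follows; seg_stat_before, APP_OWNER and BIG_TABLE are hypothetical names used only for illustration.

```sql
-- 1) Remember the current counters in a scratch table.
create table seg_stat_before as
select statistic_name, sum(value) as value
  from v$segment_statistics
 where owner       = 'APP_OWNER'
   and object_name = 'BIG_TABLE'
 group by statistic_name;

-- 2) Let the workload run for a known period (say 15 minutes), then compare.
select a.statistic_name,
       sum(a.value) - b.value as delta
  from v$segment_statistics a,
       seg_stat_before b
 where a.owner          = 'APP_OWNER'
   and a.object_name    = 'BIG_TABLE'
   and a.statistic_name = b.statistic_name
 group by a.statistic_name, b.value
 order by delta desc;
```

Dividing each delta by the elapsed seconds gives a rate that can actually be judged as large or small.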

Thanks

Anil, October 30, 2008 - 11:00 am UTC

Tom,
Does a higher value of "ITL Waits" for an object signify that there are lots of DML operations?

November 02, 2008 - 3:09 pm UTC

well, think about what this means...

If you "get that", you'll be able to answer your own question - and realize the answer is "it depends"

first, before we even begin down this path - ask yourself, "what is lots", what defines to you "lots" (because whatever you pick - odds are, I won't pick that, I'll pick something else...)

Ok, what is an ITL wait - it is when the number of transactions that are trying to take place on a single block exceed the ITL list that block can hold.

Suppose you have a one block table.
Suppose you have a one MILLION block table.

Suppose in both cases ITL waits are evenly distributed across all blocks in the table. In case 1, 100% take place against 1 block. In case 2, 1/1,000,000th (0.0001%) take place against any single block.

Now, if both have 10,000 ITL waits - which has "more DML operations"? (probably)

I'd say case two has lots more DML than case 1 (probably)

You cannot infer absolutely the DML rate from the ITL waits without lots more information....

Loading issues

sriram, February 12, 2009 - 6:36 am UTC

Tom,

We are in the process of doing some loading through an ETL tool which updates destination tables, but we are experiencing a lot of issues with the updates on one of the tables.

The statspack report confirmed that we are having issues with
one of the update queries, which does almost 10k physical reads to process a single row, and this statement is possibly executed many times.

Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
379,920 40 9,498.0 19.8 106.16 218.14 3371213990
Module: pmdtm@lxgdwsi1 (TNS V1-V3)
UPDATE IA_SALES_ORDLNS SET PRODUCT_KEY = :1, SALES_PROD_KEY = :2
, MFG_PROD_KEY = :3, SPLR_PROD_KEY = :4, PLANT_LOC_KEY = :5, STO
RAGE_LOC_KEY = :6, SHIPPING_LOC_KEY = :7, SALES_AREA_ORG_KEY = :
8, SALES_GEO_ORG_KEY = :9, COMPANY_ORG_KEY = :10, BUSN_AREA_ORG_
KEY = :11, CHNL_TYPE_KEY = :12, CHNL_POINT_KEY = :13, CUSTOMER_K

Now we could find another table where the updates were done fast with only 4 buffer gets.

172 40 4.3 0.0 0.01 0.02 2974485037
Module: pmdtm@lxgdwsi1 (TNS V1-V3)
UPDATE OD_SALES_ORDLNS SET CURR_KEY = :1, PRODUCT_ID = :2, SALES
_PROD_ID = :3, MFG_PROD_ID = :4, SPLR_PROD_ID = :5, PLANT_LOC_ID
= :6, STORAGE_LOC_ID = :7, SHIPPING_LOC_ID = :8, SALES_AREA_ORG
_ID = :9, SALES_GEO_ORG_ID = :10, COMPANY_ORG_ID = :11, BUSN_ARE
A_ORG_ID = :12, CHNL_TYPE_ID = :13, CHNL_POINT_ID = :14, CUSTOME

So before trying to explain the plan, I tried to see the total number of blocks and rows in both tables.

Table Name            Rows  No. of Blocks
IA_SALES_ORDLNS    2439913          99480
OD_SALES_ORDLNS    2454030         167419

Now when I do a select count(*) from IA_SALES_ORDLNS it takes around 60 secs to respond,

while select count(*) from OD_SALES_ORDLNS responds within 6-8 secs.

Why is that so?

We won't be able to touch the code as it is generated by Informatica. Also, please let me know how I should go about reducing the physical reads.

Thanks

February 12, 2009 - 3:50 pm UTC

without the plans - preferably in the format of a row source operation report in a tkprof - there are TOO MANY reasons.

I'm not going to hypothesize and list them all.

get a tkprof.
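
A minimal sketch of producing one for the two count(*) queries mentioned in the question, assuming a SQL*Plus session with the ALTER SESSION privilege and access to the server's trace directory. Level 8 of event 10046 adds wait events to the trace.

```sql
alter session set timed_statistics = true;
alter session set max_dump_file_size = unlimited;
alter session set events '10046 trace name context forever, level 8';

select count(*) from IA_SALES_ORDLNS;
select count(*) from OD_SALES_ORDLNS;

alter session set events '10046 trace name context off';

-- Exit the session so the cursors are closed and the row source
-- operation (STAT) lines are written, then format the trace file
-- found in user_dump_dest on the database server:
--   tkprof <trace file>.trc report.txt sys=no
```

Comparing the row source operations and wait events of the two statements side by side is what turns "too many reasons" into one specific reason.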

execution plans on AWR SQL Report

Sean, March 25, 2009 - 2:36 pm UTC

Tom,
I have the query shown as the Top one on AWR.

Elapsed CPU Elap per % Total
Time (s) Time (s) Executions Exec (s) DB Time SQL Id
---------- ---------- ------------ ---------- ------- -------------
3,942 1,619 16,844 0.2 20.0 7zb4x904c5d3d
Module: Prx3Usr.exe
Select RelMessUsers.ID,Messages.SUBJECT,Messages.FROM_R,Messages.PRIORITY,Messag
es.SENT,Messages.ACTIVATIONDATE,Messages.TITLEID,Messages.TYPE,Messages.WHEN_P,M
essages.WHEN_DATE,Messages.SAVEBODYMEMO,Messages.SAVEFREEMEMO,Messages.READTYPE,
Messages.RE_PATIENT,Messages.SAVEINPREVENC,Messages.NURSINGTASKSTATUS,Messages.N

Because it was executed more than 10,000 times in half an hour, it is the main source of the application's slowness, and I want to do something about it. The plan is to 1) reproduce the execution plan shown in the AWR SQL report (awrsqrpt.sql) and 2) improve it.
I used the SQL Id to get the query's execution plans from the report:

Plan Hash Total Elapsed 1st Capture Last Capture
# Value Time(ms) Executions Snap ID Snap ID
--- ---------------- ---------------- ------------- ------------- --------------
1 909396650 3,941,814 1,6844 781 781
2 2151852467 0 0 781 781
-------------------------------------------------------------

I have been able to reproduce Plan 2 with tkprof against the data, and with explain plan on the query with placeholders (:h), etc.

But Plan 2 was NOT used by the application (executions=0). We have to work on Plan 1 and improve that, not Plan 2. It might be due to the sets of data being used, but why does the plan with placeholders match only Plan 2? How can we reproduce Plan 1, which is the one the application actually ran?

The two plans are shown below:

Plan 1(PHV: 909396650)
----------------------

Plan Statistics DB/Inst: PRX3/PRX3 Snaps: 780-781
-> % Total DB Time is the Elapsed Time of the SQL statement divided
into the Total Database Time multiplied by 100

Stat Name Statement Per Execution % Snap
---------------------------------------- ---------- -------------- -------
Elapsed Time (ms) 3,941,814 234.0 20.0
CPU Time (ms) 1,618,764 96.1 19.3
Executions 16,844 N/A N/A
Buffer Gets ########## 9,659.3 21.3
Disk Reads 0 0.0 0.0
Parse Calls 33,688 2.0 1.8
Rows 68,390 4.1 N/A
User I/O Wait Time (ms) 0 N/A N/A
Cluster Wait Time (ms) 0 N/A N/A
Application Wait Time (ms) 0 N/A N/A
Concurrency Wait Time (ms) 884,480 N/A N/A
Invalidations 0 N/A N/A
Version Count 3 N/A N/A
Sharable Mem(KB) 237 N/A N/A
-------------------------------------------------------------

Execution Plan
---------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                  |       |       |   614 (100)|          |
|   1 |  FILTER                              |                  |       |       |            |          |
|   2 |   SORT ORDER BY                      |                  |     1 |   251 |   614   (3)| 00:00:05 |
|   3 |    TABLE ACCESS BY INDEX ROWID       | RELMESSUSERS     |     1 |    40 |     2   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                     |                  |     1 |   251 |   613   (3)| 00:00:05 |
|   5 |      NESTED LOOPS                    |                  |     1 |   211 |   611   (3)| 00:00:05 |
|   6 |       NESTED LOOPS                   |                  |     1 |   167 |   610   (3)| 00:00:05 |
|   7 |        NESTED LOOPS                  |                  |     1 |   143 |   609   (3)| 00:00:05 |
|   8 |         NESTED LOOPS                 |                  |     1 |   120 |   608   (3)| 00:00:05 |
|   9 |          TABLE ACCESS BY INDEX ROWID | HCPEOPLE         |     1 |    27 |     1   (0)| 00:00:01 |
|  10 |           INDEX UNIQUE SCAN          | HC32_PRIMARYKEY  |     1 |       |     0   (0)|          |
|  11 |          TABLE ACCESS FULL           | MESSAGES         |     1 |    93 |   607   (3)| 00:00:05 |
|  12 |         TABLE ACCESS BY INDEX ROWID  | PRXPREVENCGROUPS |     1 |    23 |     1   (0)| 00:00:01 |
|  13 |          INDEX UNIQUE SCAN           | PP148_ID         |     1 |       |     0   (0)|          |
|  14 |        TABLE ACCESS BY INDEX ROWID   | HCPEOPLE         |     1 |    24 |     1   (0)| 00:00:01 |
|  15 |         INDEX UNIQUE SCAN            | HC32_PRIMARYKEY  |     1 |       |     0   (0)|          |
|  16 |       TABLE ACCESS BY INDEX ROWID    | PATIENTSPRAXIS   |     1 |    44 |     1   (0)| 00:00:01 |
|  17 |        INDEX UNIQUE SCAN             | PP78_PRIMARYKEY  |     1 |       |     0   (0)|          |
|  18 |      INDEX RANGE SCAN                | MESDER           |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Plan 2(PHV: 2151852467)
-----------------------

Plan Statistics DB/Inst: PRX3/PRX3 Snaps: 780-781
-> % Total DB Time is the Elapsed Time of the SQL statement divided
into the Total Database Time multiplied by 100

Stat Name Statement Per Execution % Snap
---------------------------------------- ---------- -------------- -------
Elapsed Time (ms) 0 N/A 0.0
CPU Time (ms) 0 N/A 0.0
Executions 0 N/A N/A
Buffer Gets 0 N/A 0.0
Disk Reads 0 N/A 0.0
Parse Calls 0 N/A 0.0
Rows 0 N/A N/A
User I/O Wait Time (ms) 0 N/A N/A
Cluster Wait Time (ms) 0 N/A N/A
Application Wait Time (ms) 0 N/A N/A
Concurrency Wait Time (ms) 0 N/A N/A
Invalidations 0 N/A N/A
Version Count 3 N/A N/A
Sharable Mem(KB) 123 N/A N/A
-------------------------------------------------------------

Execution Plan
---------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                  |       |       |   242 (100)|          |
|   1 |  FILTER                              |                  |       |       |            |          |
|   2 |   SORT ORDER BY                      |                  |   134 | 33634 |   242   (2)| 00:00:02 |
|   3 |    HASH JOIN                         |                  |   134 | 33634 |   241   (2)| 00:00:02 |
|   4 |     HASH JOIN                        |                  |   131 | 27117 |   144   (1)| 00:00:02 |
|   5 |      TABLE ACCESS FULL               | HCPEOPLE         |   210 |  5040 |     3   (0)| 00:00:01 |
|   6 |      HASH JOIN                       |                  |   131 | 23973 |   140   (0)| 00:00:02 |
|   7 |       NESTED LOOPS                   |                  |   133 | 21280 |   138   (0)| 00:00:02 |
|   8 |        NESTED LOOPS                  |                  |   133 |  8911 |     7   (0)| 00:00:01 |
|   9 |         TABLE ACCESS BY INDEX ROWID  | HCPEOPLE         |     1 |    27 |     1   (0)| 00:00:01 |
|  10 |          INDEX UNIQUE SCAN           | HC32_PRIMARYKEY  |     1 |       |     0   (0)|          |
|  11 |         INDEX RANGE SCAN             | RELMESSUSERS_IDX |   133 |  5320 |     6   (0)| 00:00:01 |
|  12 |        TABLE ACCESS BY INDEX ROWID   | MESSAGES         |     1 |    93 |     1   (0)| 00:00:01 |
|  13 |         INDEX UNIQUE SCAN            | ME48_PRIMARYKEY  |     1 |       |     0   (0)|          |
|  14 |       INDEX FAST FULL SCAN           | PRXPREVCGRP_IDX1 |  1071 | 24633 |     2   (0)| 00:00:01 |
|  15 |     TABLE ACCESS FULL                | PATIENTSPRAXIS   | 28574 | 1227K |    96   (2)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

March 29, 2009 - 8:35 pm UTC

... 1,6844 .... interesting, did we really do that or did you edit everything?

use

select * from
table(dbms_xplan.display_cursor(null,null,'typical +peeked_binds'));

but replace null null with the sql_id and child number of interest, it'll show you the bind variable value used to optimize the query.
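
Spelled out as a sketch with the SQL Id from the report above: first list the child cursors, then ask for the plan of the one of interest. Child number 0 is only an assumption here; use whichever child shows the plan hash value you are after. The last query is a fallback for plans that have aged out of the shared pool but are still in AWR, and whether peeked binds appear there depends on what AWR captured.

```sql
-- Which child cursors exist for this statement, and which plan does each use?
select sql_id, child_number, plan_hash_value, executions
  from v$sql
 where sql_id = '7zb4x904c5d3d';

-- Plan plus peeked bind values for one child cursor still in the shared pool.
select *
  from table(dbms_xplan.display_cursor('7zb4x904c5d3d', 0, 'typical +peeked_binds'));

-- Same idea for a plan stored in AWR, identified by its plan hash value.
select *
  from table(dbms_xplan.display_awr('7zb4x904c5d3d', 909396650, null,
                                    'typical +peeked_binds'));
```

Running the statement with the peeked values as binds is what gives a chance of reproducing Plan 1 rather than Plan 2.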

Sean, March 29, 2009 - 9:19 pm UTC

I did not edit the number of executions. Yes, it was abnormal for it to run that many times in half an hour; we had 20,000 to 30,000 at the peak. I am no longer on the project, so I cannot run that query for the peeked binds. Sorry.

Elapsed CPU Elap per % Total
Time (s) Time (s) Executions Exec (s) DB Time SQL Id
---------- ---------- ------------ ---------- ------- -------------
14,043 3,705 28,892 0.5 34.9 f4ckz3ffu7tv7
Module: Prx3Usr.exe
Select RelMessUsers.ID,Messages.SUBJECT,Messages.FROM_R,Messages.PRIORITY,Messag
es.SENT,Messages.ACTIVATIONDATE,Messages.TITLEID,Messages.TYPE,Messages.WHEN_P,M
essages.WHEN_DATE,Messages.SAVEBODYMEMO,Messages.SAVEFREEMEMO,Messages.READTYPE,
Messages.RE_PATIENT,Messages.SAVEINPREVENC,Messages.NURSINGTASKSTATUS,Messages.N

March 30, 2009 - 4:10 pm UTC

1,6844 .... interesting, did we really do that or did you edit everything?

really, we put the comma in the wrong place?

You would need the peeked binds if you want that plan however.....

A Reader

A Reader, April 02, 2009 - 2:39 am UTC

Hi Tom,

I looked through the whole thread but was not able to find a good enough explanation for two of the timed events that are showing up in our AWR report.
This is an IBM AIX server with 100GB of RAM and 16 CPUs, running Oracle 10gR2.
During peak load I found the following in the list of Top 5 Timed Events:

Event                              Waits  Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
----------------------------- ---------- -------- ------------- ------------------ -----------
CPU time                                    1,093                             65.3
PX Deq Credit: send blkd         261,789      293             1               17.5  Other
db file sequential read           32,002       68             2                4.1  User I/O
os thread startup                     67       41           615                2.5  Concurrency
SQL*Net more data to client      779,315       15             0                 .9  Network

I understand that "PX Deq Credit: send blkd" and "SQL*Net more data to client" are idle wait events.
Please let me know:
1) What should I understand if "os thread startup" appears in the top 5 timed events?
2) What should I understand if "PX Deq Credit: send blkd" and "SQL*Net more data to client" appear in the top 5 timed events?
3) I found on Metalink a way to reduce waits on the "PX Deq Credit: send blkd" event, but I want to know whether this can be the cause of slow performance on the DB. Can waits on this event give us any direction as to where the problem lies? Is it the network, the OS or the DB?

Following are the observations on the application side when this report was taken:
1) The application running on this DB experienced very slow performance under load, with several hundred users connected to it and several others trying to connect to the DB.
2) There is also a delay in connecting to the DB via SQL*Plus and with the application connect string during load testing of the application. Can you please guide me on where I should look to find the cause of this delay in getting a connection to the DB? Can I somehow get the cause from the AWR report, or do I need to look at some other utilities at the OS level?
3) What happens in the background when we try to get a connection to the DB bypassing the listener (i.e. without using the TNS string)? I know how the listener handles a new request and that a process is created at the OS end. But what could be the problem if there is a delay in creating a session even when I bypass the listener?
4) Our team is trying to analyse the reason for the performance issue on the DB, and I need to make sure that the DB is not the major concern or bottleneck for the application being slow. I request you to please help me reach a conclusion.

Some more information from the Report below
Operating System Statistics
--------------------------
Statistic Total
NUM_LCPUS 0
NUM_VCPUS 0
AVG_BUSY_TIME 11,445
AVG_IDLE_TIME 49,235
AVG_IOWAIT_TIME 315
AVG_SYS_TIME 1,378
AVG_USER_TIME 10,049
BUSY_TIME 183,425
IDLE_TIME 788,067
IOWAIT_TIME 5,377
SYS_TIME 22,313
USER_TIME 161,112
LOAD 0
OS_CPU_WAIT_TIME 205,100
RSRC_MGR_CPU_WAIT_TIME 0
PHYSICAL_MEMORY_BYTES ###############
NUM_CPUS 16
NUM_CPU_CORES 8

Load Profile
------------
Per Second Per Transaction
Redo size: 93,679.17 10,358.06
Logical reads: 63,792.94 7,053.55
Block changes: 819.31 90.59
Physical reads: 213.39 23.59
Physical writes: 9.77 1.08
User calls: 2,663.28 294.48
Parses: 1,397.17 154.48
Hard parses: 2.68 0.30
Sorts: 562.83 62.23
Logons: 0.26 0.03
Executes: 3,104.11 343.22
Transactions: 9.04

% Blocks changed per Read: 1.28 Recursive Call %: 59.55
Rollback per transaction %: 0.24 Rows per Sort: 14.78

Instance Efficiency Percentages (Target 100%)
--------------------------------------------
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 99.80 In-memory Sort %: 100.00
Library Hit %: 99.61 Soft Parse %: 99.81
Execute to Parse %: 54.99 Latch Hit %: 99.83
Parse CPU to Parse Elapsd %: 25.18 % Non-Parse CPU: 99.31

Shared Pool Statistics
----------------------
Begin End
Memory Usage %: 92.37 92.27
% SQL with executions>1: 92.18 96.01
% Memory for SQL w/exec>1: 92.45 94.31

Please let me know in case more information is required.

Thanks

April 02, 2009 - 9:51 am UTC

you do not say how long that (unreadable) report was for

you have several hundred users and more on the way.... and you are using parallel query? parallel query is good when you have less users than cpu - do you really mean to be using it? You are allowing your several hundred users to be "several THOUSAND users" with parallel operations.

install statspack/perfstat to take performance snapshots

Suraj, April 05, 2009 - 10:08 am UTC

I was asked to install statspack/perfstat to take performance snapshots every 15 minutes on the 15 minutes except during our batch window of midnight until 7am, during which time we would like performance snapshots taken every 2 hours at the top of the hour (midnight, 2, 4 and 6).

I am really not sure how to do that, I am working on Oracle 10g.

April 07, 2009 - 6:02 am UTC

not clear on what you don't know how to do - install statspack or configure a job to run statspack

installing
http://docs.oracle.com/docs/cd/B10501_01/server.920/a96533/statspac.htm#21605

to schedule something for every 15 minutes - unless you are between midnight and 7am - you just need a function to return that date.

ops$tkyte%ORA11GR1> create or replace function my_next_date return date
  2  is
  3  begin
  4          return
  5         case when to_char(sysdate,'hh24') between '00' and '05'
  6              then trunc(sysdate,'dd')+
  7                                (trunc(to_number(to_char(sysdate,'hh24'))/2)+1)*(2/24)
  8              when to_char(sysdate,'hh24') = '06'
  9              then trunc(sysdate,'dd')+7/24
 10                          else trunc(sysdate,'hh') + (trunc(to_number(to_char(sysdate,'mi'))/15)+1)*15/24/60
 11          end;
 12  end;
 13  /

Function created.

and then you can schedule:

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> variable n number
ops$tkyte%ORA11GR1> begin
  2          dbms_job.submit( :n, 'statspack.snap;', 
                    interval => 'my_next_date()' );
  3  end;
  4  /

PL/SQL procedure successfully completed.

or you can use dbms_scheduler, up to you...
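
A sketch of the dbms_scheduler route (10g and later), using two calendar-based jobs instead of a date function. The job names are made up for illustration, and the jobs are assumed to be created by the Statspack owner (PERFSTAT) so that statspack.snap resolves.

```sql
begin
  -- Daytime: every 15 minutes from 07:00 through 23:45.
  dbms_scheduler.create_job(
    job_name        => 'STATSPACK_SNAP_DAY',      -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'begin statspack.snap; end;',
    start_date      => systimestamp,
    repeat_interval => 'FREQ=DAILY; BYHOUR=7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23; BYMINUTE=0,15,30,45; BYSECOND=0',
    enabled         => true);

  -- Batch window: midnight, 2, 4 and 6, at the top of the hour.
  dbms_scheduler.create_job(
    job_name        => 'STATSPACK_SNAP_BATCH',    -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'begin statspack.snap; end;',
    start_date      => systimestamp,
    repeat_interval => 'FREQ=DAILY; BYHOUR=0,2,4,6; BYMINUTE=0; BYSECOND=0',
    enabled         => true);
end;
/
```

With the dbms_job version above, remember to COMMIT after dbms_job.submit; the job does not become visible to the job queue until the submitting transaction commits.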

A reader

A reader, April 06, 2009 - 12:13 am UTC

The report was run for 10 mins interval.

I used the parallel query option for performance reasons, to make things faster, thinking also that the machine is high-end enough to handle it.

Will this hamper the performance further?

April 07, 2009 - 6:06 am UTC

when you have many more concurrent users

than cpus

you don't want those users to be able to use even MORE cpu - would you????

you have 16 cpus, with hundreds of users (assuming many are concurrent) it is highly unlikely you want parallel - think about it, say you have 32 concurrent users (2 per cpu) and you let them run parallel 8 - now you have SIXTEEN per cpu.

is that a) good or b) bad?
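
The arithmetic in that example is 32 users x 8 parallel processes / 16 CPUs = 16 processes per CPU. To see what is really happening on a running system, a sketch like the following counts the parallel execution servers attached to each query coordinator; it assumes SELECT access to v$px_session and only shows statements running in parallel at that moment.

```sql
-- Parallel execution servers currently working for each query coordinator.
select qcsid,
       count(*)        as px_processes,
       max(degree)     as granted_degree,
       max(req_degree) as requested_degree
  from v$px_session
 where sid <> qcsid            -- skip the coordinator's own row
 group by qcsid
 order by px_processes desc;
```

Summing px_processes and comparing it with the CPU count shows how far the box is oversubscribed.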

A Reader

A reader, April 06, 2009 - 12:20 am UTC

And apologies for the unreadable report :( I received the report in HTML format and when I pasted it here it looked that way. I do not have the privileges to get the report myself. I hope you can still make something out of it and help me out.

install statspack/perfstat to take performance snapshots

Suraj Sharma, April 07, 2009 - 11:42 pm UTC

Thanks Tom, Scheduling is what I was looking for...

Regards,
Suraj

statspack report - ASYNC DISK I/O

A reader, April 08, 2009 - 5:47 am UTC

Hello Tom

One of my production databases shows the wait event "async disk IO" as the top timed event in the statspack report once in a while, and this causes overall performance degradation around that time frame.

My questions are:

1) What are the potential reasons for this wait? I have gone through a Metalink note on this as well but could not understand it well.
2) How can this wait be reduced? What should be monitored, and how, to troubleshoot it effectively?

Here is a snap of this wait events

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
abcde 11190066 abcde 1 9.2.0.8.0 NO abcddb
v01

Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 1052 02-Apr-09 02:49:14 54 25.7
End Snap: 1053 02-Apr-09 03:49:17 59 16.5
Elapsed: 60.05 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 5,552M Std Block Size: 8K
Shared Pool Size: 2,048M Log Buffer: 1,024K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 5,299.92 4,862.64
Logical reads: 11,921.29 10,937.71
Block changes: 26.81 24.60
Physical reads: 436.54 400.53
Physical writes: 67.03 61.50
User calls: 210.92 193.52
Parses: 15.33 14.07
Hard parses: 0.03 0.03
Sorts: 8.45 7.76
Logons: 0.06 0.06
Executes: 38.37 35.21
Transactions: 1.09

% Blocks changed per Read: 0.22 Recursive Call %: 21.96
Rollback per transaction %: 0.00 Rows per Sort: 438.41

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 96.73 In-memory Sort %: 99.89
Library Hit %: 99.96 Soft Parse %: 99.79
Execute to Parse %: 60.04 Latch Hit %: 100.00
Parse CPU to Parse Elapsd %: 96.26 % Non-Parse CPU: 99.27

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 37.64 37.67
% SQL with executions>1: 70.02 70.02
% Memory for SQL w/exec>1: 76.85 76.88

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
async disk IO 235,668 10,329 76.35
CPU time 1,311 9.69
log file sync 3,714 416 3.08
log file parallel write 7,924 408 3.01
db file parallel write 216 277 2.05
-------------------------------------------------------------

April 13, 2009 - 1:25 pm UTC

what database features are you using (eg: data guard?)

who is waiting for this (v$session_event will tell you this when this is occurring)
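
A sketch of that lookup, to be run while the problem is occurring; the event name is taken from the report above. Background processes show up with a null username and a program such as "oracle@host (ARC0)".

```sql
-- Which sessions have accumulated waits on 'async disk IO'?
select s.sid,
       s.username,
       s.program,
       e.total_waits,
       e.time_waited          -- in centiseconds
  from v$session_event e,
       v$session s
 where e.sid   = s.sid
   and e.event = 'async disk IO'
 order by e.time_waited desc;
```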

yes we are using data guard

A reader, April 17, 2009 - 7:13 am UTC

Hi Tom

Yes, we are using Data Guard; the protection mode is "maximum performance". Can this wait event be due to Data Guard, and if yes, how?

Kindly explain a bit.

Regards

April 17, 2009 - 10:19 am UTC

you want to see who is waiting on this - it is either a process reading the logs, tracing being written, or arch waiting.

In any event, it is not likely this is the "cause" of the "poor performance" but more likely an overall symptom - your IO is overtaxed at that moment.

If it is a background (arch for example) waiting for this - you don't really care - UNLESS someone is waiting for the background (but we don't see that)

The only client wait event here:

Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
async disk IO                                     235,668      10,329    76.35
CPU time                                                        1,311     9.69
log file sync                                       3,714         416     3.08
log file parallel write                             7,924         408     3.01
db file parallel write                                216         277     2.05

is the log file sync (which is not huge). log file parallel write - lgwr does that (and that is the cause of the log file sync waits...). db file parallel write - that is dbwr and unless you see something like "free buffer" waits - your applications are not waiting on it. CPU time - everyone uses that (it is not a wait event, just a metric).

So, what process is experiencing the wait?
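
A minimal sketch of the "who is waiting" check, assuming the event name is exactly the 'async disk IO' shown in the report and that you can query the V$ views. It has to be run while the problem is occurring, because v$session_event only covers sessions that are currently connected:

```sql
-- Sessions that have waited on 'async disk IO', worst first.
-- s.type separates BACKGROUND processes (arch, lgwr, ...) from USER sessions.
select s.sid,
       s.type,
       s.program,
       e.total_waits,
       e.time_waited          -- hundredths of a second
  from v$session_event e,
       v$session       s
 where e.sid   = s.sid
   and e.event = 'async disk IO'
 order by e.time_waited desc;
```

If every row that comes back is a BACKGROUND session, that matches the "you don't really care" case described above.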

A reader, August 24, 2009 - 4:26 pm UTC

I am using the following SQL to extract the elapsed time and the percentage of DB time.
The Elap per Exec (s) matches, but the % Total DB Time does not match (the difference ranges from 0.5% to 2%).




-- Per-statement elapsed time for one AWR snapshot. The percentage is each
-- statement's share of the elapsed time summed over the statements captured
-- in dba_hist_sqlstat for that snapshot.
select snap_id,
       sql_id,
       executions,
       elapsed_time_delta,                        -- microseconds
       elap_per_sec,                              -- seconds per execution
       -- sum(elapsed_time_delta) over (partition by snap_id) total_db_time,
       round(elapsed_time_delta /
             sum(elapsed_time_delta) over (partition by snap_id) * 100,
             2) percentage_db_time
  from (select snap_id,
               dba_hist_sqlstat.sql_id sql_id,
               sum(executions_delta) executions,
               sum(elapsed_time_delta) elapsed_time_delta,
               dba_hist_sqlstat.module,
               round((sum(elapsed_time_delta) / 1000000) /
                     sum(executions_delta),
                     2) elap_per_sec,
               sum(elapsed_time_total) elapsed_time_total
          from dba_hist_sqlstat
         where snap_id = 5399
           and executions_delta <> 0
           and elapsed_time_delta <> 0
           and instance_number = 2 -- which instance?
         group by snap_id, dba_hist_sqlstat.sql_id, dba_hist_sqlstat.module)
 -- "order by 7" pointed past the six selected columns; order by name instead
 order by percentage_db_time desc


I am doing this because the AWR comparison report does not report elap_per_sec (elapsed time per execution).
I am on 10g Release 2.





  SNAP_ID SQL_ID        EXECUTIONS ELAPSED_TIME_DELTA ELAP_PER_SEC PERCENTAGE_DB_TIME
--------- ------------- ---------- ------------------ ------------ ------------------
     5399 bn30dmqukmymf         60            2557301          .04              12.07
     5399 d950rh1swfab4      22415            2005692            0               9.47


  Elapsed      CPU                  Elap per  % Total
  Time (s)   Time (s)  Executions   Exec (s)  DB Time    SQL Id
---------- ---------- ------------ ---------- ------- -------------
         3          2           60        0.0    12.6 bn30dmqukmymf
         2          2       22,415        0.0     9.9 d950rh1swfab4
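
One possible explanation, offered as an assumption rather than something confirmed in this thread: the AWR report computes "% Total DB Time" against the instance's DB time from the time model, while the query above divides by the sum of elapsed time over the statements captured in dba_hist_sqlstat. The two denominators need not agree. A sketch for fetching the time-model figure, assuming 5398 is the snapshot immediately before 5399 and that the instance was not restarted in between:

```sql
-- DB time for the interval ending at snapshot 5399, instance 2.
-- Time model values are cumulative microseconds, so take end minus begin.
select (e.value - b.value) / 1000000 as db_time_secs
  from dba_hist_sys_time_model b,
       dba_hist_sys_time_model e
 where b.stat_name       = 'DB time'
   and e.stat_name       = 'DB time'
   and b.dbid            = e.dbid
   and b.instance_number = e.instance_number
   and e.instance_number = 2
   and b.snap_id         = 5398   -- assumed begin snapshot
   and e.snap_id         = 5399;
```

Dividing each statement's elapsed_time_delta by this value, instead of by the windowed sum, should land closer to the report's column if that assumption holds.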

Performance Investigation Approach when using STATSPACK

Hari, March 30, 2010 - 3:01 am UTC

Hi Tom,

Greetings

Is there any site or reference material that we can use to analyse an AWR report, one that explains each technical term used in the report, the various sections, and their meanings?

I want to give practical training to my team and help them understand what each term implies. If I have this, I can pass it on to them and only need to clear up their doubts.

Thanks

Hari

awr report understanding

A reader, April 07, 2010 - 8:36 am UTC

Hi Tom,

In the AWR reports there is so much data, and each section has some significance. I have 2 questions.

1) Please consider the following activity report from one of my AWR reports:

Instance Activity Stats - Absolute Values
Statistics with absolute values (should not be diffed)

Statistic                          Begin Value    End Value
------------------------------- -------------- ------------
session cursor cache count           4,141,863    4,167,572
opened cursors current                     633          549
workarea memory allocated               99,369       95,474
logons current                             108          106

What do the begin value and end value represent? Looking at these statistics, what can I conclude?

If the end value for session cursor cache count is greater than the begin value, what does that mean?

2) Can you point me to a document that explains how to interpret an AWR report?

Thanks

April 12, 2010 - 8:47 pm UTC

begin is the value of a given metric at the beginning of the time window for the report.

end is the value of that metric at the end of that time window.

they are just numbers. If I told you:

at the beginning, you had $5 in the bank.
at the end you had $10 in the bank.

What could you do with that information? You could do things like say "at the end I had 10$, I started with 5$". You couldn't say some other things like "In the middle I had 100$", you don't know, you might have, you might not have.

They are just numbers - they represent point in time observations of some metric.

You can refer to the Reference Guide as it explains the statistics, but I would suggest using the EM user interface - it interprets most of what you need interpreted.
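
To see those same two point-in-time observations outside the report, a sketch along these lines should work. The bind names :begin_snap, :end_snap and :inst are placeholders for illustration, not anything from the report above:

```sql
-- The "absolute value" statistics at the begin and end snapshots.
-- Each row is simply what the counter read when that snapshot was taken.
select stat_name,
       snap_id,
       value
  from dba_hist_sysstat
 where snap_id in (:begin_snap, :end_snap)
   and instance_number = :inst
   and stat_name in ('session cursor cache count',
                     'opened cursors current',
                     'workarea memory allocated',
                     'logons current')
 order by stat_name, snap_id;
```

Nothing in these rows says what happened between the two snapshots, which is the point of the bank balance analogy.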

Deepank, April 26, 2010 - 9:27 am UTC

Tom,

I need to investigate a performance issue over a DB link. I need to calculate how much data (in KB/MB) is being transferred over the DB link. How can I approach this?

Thanks in advance.

April 26, 2010 - 9:47 am UTC

NAME                                               VALUE
--------------------------------------------- ----------
bytes sent via SQL*Net to dblink                       0
bytes received via SQL*Net from dblink                 0
SQL*Net roundtrips to/from dblink                      0
bytes via SQL*Net vector to dblink                     0
bytes via SQL*Net vector from dblink                   0

v$mystat joined to v$statname, lots of dblink related statistics.
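
A minimal sketch of that join, assuming you have select access to the two views. Run it in the session doing the work, once before and once after the test, and subtract:

```sql
-- dblink-related statistics for the current session only
select n.name,
       m.value
  from v$mystat   m,
       v$statname n
 where m.statistic# = n.statistic#
   and lower(n.name) like '%dblink%'
 order by n.name;
```

The values are cumulative for the session, so only the difference between two runs of the query describes a given test.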

deepank, April 28, 2010 - 5:12 am UTC

Thanks Tom.

After running a test I got the results below. So the data sent/received over the DB link is (417074 + 15247945) bytes. Do I need to take any other stats into consideration?

NAME                                                   V             DIFF
--------------------------------------------- ---------- ----------------
bytes sent via SQL*Net to dblink                  417074          417,074
bytes received via SQL*Net from dblink          15247945       15,247,945
SQL*Net roundtrips to/from dblink                   1395            1,395

April 28, 2010 - 8:31 am UTC

you tell me if you want any other? You have bytes sent and received - anything else you think you might need?
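
For the KB/MB figure originally asked about, the two byte counts can simply be added and scaled. A small worked example, using 1024-byte kilobytes as an assumption about the unit wanted:

```sql
-- 417074 bytes sent + 15247945 bytes received over the dblink
select 417074 + 15247945                           as total_bytes,
       round((417074 + 15247945) / 1024, 2)        as total_kb,
       round((417074 + 15247945) / 1024 / 1024, 2) as total_mb
  from dual;
```

That is 15,665,019 bytes in total, or roughly 14.94 MB, moved in 1,395 round trips.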

AWR report

Jayadevan, July 21, 2010 - 11:11 pm UTC

Hi Tom,
One of our databases starts slowing down during the night. We took 2 consecutive AWR reports (10 minutes each) during this period and did not find any long-running queries. We found one index consistently under "Segments by Row Lock Waits", taking 95-99% in % Capture. I couldn't see any SQL doing a delete/update against the base table in the AWR reports that were sent to me by the Prod Support team. What could explain this? Is it likely that the locks were there and the UPDATE/DELETE statements were also running, but they did not show up in the AWR reports because they finished executing after the AWR reports were taken?
Thanks.

cursor: pin S wait on X wait in my performance test

A reader, October 13, 2010 - 8:37 am UTC

Hi Tom

During a performance test we ran, the top timed event was found to be cursor: pin S wait on X, so I took the ASH report to see which query accounted for most of the wait.

To my surprise I found the call was to DBMS_MONITOR.SESSION_TRACE_DISABLE.

As the concurrency increases, the percentage of time waited on this increases. Can you let me know when this event happens and how to reduce the wait time on it?

If only the execute privilege is given to me, will I be able to run dbms_monitor.SESSION_TRACE_DISABLE, or do I definitely need the DBA privilege?

The version of db used is 11.1.0.7.

Thanks in advance

October 13, 2010 - 8:59 am UTC

if your biggest, most measurable wait was this - which you obviously would never do in production in general - well done.

Why are you running a performance test with sql tracing? ASH and AWR would have the information you need without it.
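
As a rough illustration of what ASH already holds, a sketch like the following groups recent samples for this event by statement. It assumes the Diagnostics Pack is licensed and that the test ran within the last hour; adjust the window to suit:

```sql
-- Which statements were sampled waiting on "cursor: pin S wait on X"?
-- Each row in v$active_session_history is about one second of session time.
select sql_id,
       count(*) as samples
  from v$active_session_history
 where event       = 'cursor: pin S wait on X'
   and sample_time > sysdate - 1/24
 group by sql_id
 order by samples desc;
```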

Increase in Hard parses

V Dhanumjai, October 25, 2010 - 3:48 am UTC

Hi Tom

While performing a random test I found something peculiar.

I gathered some statistics about parsing from v$mystat, and they looked like this.

parse time cpu 7
parse time elapsed 15
parse count (total) 129
parse count (hard) 8
parse count (failures) 4
parse count (describe) 0

Then I ran the anonymous block below to see how the parse statistics changed.

begin
for i in 1..1000 loop
dj_sys.TURN_TRACE_OFF();
end loop;
end;

dj_sys.TURN_TRACE_OFF() is nothing but a call to DBMS_MONITOR.SESSION_TRACE_DISABLE. I have not been given the DBA privilege to execute any procedure/function of the DBMS_MONITOR package, so every parse should fail and parse count (failures) should increase by 1000. That happened as expected, but I also observed a huge increase in the hard parse count rather than an increase of about 1000.

After the test

parse time cpu 8062
parse time elapsed 8084
parse count (total) 1133
parse count (hard) 321009
parse count (failures) 1004
parse count (describe) 0

                          before     after
parse count (total)          129      1133   (approx. 1000 more, as expected)
parse count (failures)         4      1004   (1000 more, as expected)
parse count (hard)             8    321009

Can you please let me know why there is such a huge difference in the hard parse count, rather than an increase of approximately 1000?

thanks & Regards
V.dhanumjai

October 25, 2010 - 5:59 pm UTC

what is dj_sys and so on. No idea why you think all parsing should fail??

you need to give us a working example - otherwise we are guessing and I don't like to waste my time guessing if I don't have to.

Increase in Hard parses

V. Dhanumjai, October 26, 2010 - 2:27 pm UTC

DJ_SYS is a package; inside the package there is a procedure TURN_TRACE_OFF().

the working code is

create or replace PACKAGE  dj_sys AS

PROCEDURE turn_trace_off;

END dj_Sys;

create or replace PACKAGE BODY dj_sys
AS

PROCEDURE turn_trace_off IS
      str       VARCHAR2(100);
      
   BEGIN
str := 'BEGIN dbms_monitor.session_trace_disable (NULL, NULL); END; ';
  EXECUTE IMMEDIATE str;
   EXCEPTION
   WHEN OTHERS THEN
      dbms_output.put_line(SQLCODE || SQLERRM);
      null;
   END turn_trace_off;
END dj_Sys;

I have not been given any execute privilege on DBMS_MONITOR; the output looked like this when I queried the dba_tab_privs view.

select GRANTEE ,OWNER, GRANTOR,PRIVILEGE, GRANTABLE ,HIERARCHY
 from dba_tab_privs where table_name = 'DBMS_MONITOR'

The output is

GRANTEE      OWNER  GRANTOR  PRIVILEGE  GRANTABLE  HIERARCHY
-----------  -----  -------  ---------  ---------  ---------
OEM_MONITOR  SYS    SYS      EXECUTE    NO         NO
DBA          SYS    SYS      EXECUTE    NO         NO

And when I ran the anonymous block with set serveroutput on

begin
for i in 1..1000 loop
dj_sys.TURN_TRACE_OFF();
end loop;
end;

I was able to see this message in the output

anonymous block completed
-6550ORA-06550: line 1, column 7:
PLS-00201: identifier 'DBMS_MONITOR' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored

The PLS-00201 error above and the output from the dba_tab_privs view led me to assume that the parses were failing due to the lack of execute privilege on the DBMS_MONITOR package.

V$mystat statistics

Before the test

parse time cpu 7
parse time elapsed 15
parse count (total) 129
parse count (hard) 8
parse count (failures) 4
parse count (describe) 0

After the test

parse time cpu 8062
parse time elapsed 8084
parse count (total) 1133
parse count (hard) 321009
parse count (failures) 1004
parse count (describe) 0

I thought my assumption was right when I saw parse count (failures) in v$mystat increase by exactly 1000, but I am not able to identify why the hard parse count increased so drastically.

October 26, 2010 - 8:22 pm UTC

   EXCEPTION
   WHEN OTHERS THEN
      dbms_output.put_line(SQLCODE || SQLERRM);
      null;
   END turn_trace_off;

http://asktom.oracle.com/pls/asktom/asktom.search?p_string=%22i+hate+your+code%22

sorry, you just lost my interest entirely. Not going to look at it while there exists a "when others then null". Tired of that junk, that is just A BUG IN YOUR CODE.

Not worthy of comment.
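
For readers wondering what the objection looks like in practice, here is a sketch of the same package body with the handler removed. It is one possible rewrite, not something posted in this thread, and it keeps the poster's dynamic call unchanged:

```sql
create or replace package body dj_sys as

  procedure turn_trace_off is
  begin
    -- Same dynamic call as the original, but with no exception handler:
    -- any error (such as PLS-00201) now propagates to the caller instead
    -- of being printed and silently swallowed.
    execute immediate
      'begin dbms_monitor.session_trace_disable(null, null); end;';
  end turn_trace_off;

end dj_sys;
/
```

With this version the 1000-iteration loop stops at the first failure, so the missing privilege is obvious immediately. If logging is wanted, a handler that logs and then re-raises with RAISE would keep the error visible as well.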

INITRANS

babloo, August 02, 2011 - 10:21 am UTC

Hi Tom
How do we know what the initial value of INITRANS should be if we know, say, that we have 500 concurrent users, most of them creating new records? I understand that we need to do load testing, but setting a thoughtful value may save me some time.

August 02, 2011 - 11:38 am UTC

I'd let it default and only change it if ITL waits were being reported in v$segment_statistics

The default, especially with ASSM (automatic segment space management) which tends to spread inserts out, is typically more than sufficient.
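
A minimal sketch of that check, assuming access to v$segment_statistics. The counts are cumulative since instance startup:

```sql
-- Segments that have actually experienced ITL waits
select owner,
       object_name,
       object_type,
       value as itl_waits
  from v$segment_statistics
 where statistic_name = 'ITL waits'
   and value > 0
 order by value desc;
```

No rows back after a representative load test suggests the default INITRANS is doing fine for those segments.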

Parallel query - awr report

A reader, June 22, 2012 - 8:21 am UTC

Hi Tom,

Just a clarification on 2 questions,

1. When I look at the AWR report for 12 hours (the total batch process), in "SQL ordered by Elapsed Time" a few queries at the top seem to have misleading timings.
The total batch process itself took just 12 hours, whereas the report says these queries executed for more than 30 hours. They are listed below.

Elapsed Time (s) CPU Time (s) Executions Elap per Exec (s) % Total DB Time SQL Id        SQL Module                                    SQL Text
---------------- ------------ ---------- ----------------- --------------- ------------- --------------------------------------------- ------------------------------
         119,359           46          1         119359.27           22.51 dyzun84pjdd8v pmdtm@itlinuxbl68.hq.emirates.com (TNS V1-V3) select a.*, b.ASSIGNSEQNO , b....
          66,707          279          1          66707.18           12.58 6pbhus24b1a33 pmdtm@itlinuxbl68.hq.emirates.com (TNS V1-V3) SELECT DISTINCT A.TRIPSEQNO, a...
          38,855           34          1          38854.60            7.33 129bmfh01r35k pmdtm@itlinuxbl68.hq.emirates.com (TNS V1-V3) select a.*, b.ASSIGNSEQNO , b....
          14,573        4,778          1          14573.09            2.75 1ggnt0cb7kfgh pmdtm@itlinuxbl68.hq.emirates.com (TNS V1-V3) select '' planningname, flight...

For the first query, the report says the total elapsed time is 119,359 seconds for 1 execution: 119,359/60/60 = about 33 hours.

Below is the execution plan from dba_hist_sql_plan.

OPERATION        OPTIONS        OBJECT_NAME             OPTIMIZER  COST     BYTES    CPU_COST IO_COST  TIME TO_CHAR(TIMESTAMP,'
---------------- -------------- ----------------------- --------- ----- --------- ----------- ------- ----- -------------------
SELECT STATEMENT                                        ALL_ROWS   1037                                     27-01-2012:20:09:19
PX COORDINATOR                                                                                              27-01-2012:20:09:19
PX SEND          QC (RANDOM)    :TQ10003                           1037  24033480  2625014496     966    30 27-01-2012:20:09:19
HASH JOIN        RIGHT OUTER                                       1037  24033480  2625014496     966    30 27-01-2012:20:09:19
PX RECEIVE                                                          934   4948335  2594068128     864    27 27-01-2012:20:09:19
PX SEND          HASH           :TQ10002                            934   4948335  2594068128     864    27 27-01-2012:20:09:19
VIEW                                                                934   4948335  2594068128     864    27 27-01-2012:20:09:19
FILTER                                                                                                      27-01-2012:20:09:19
MERGE JOIN                                                          934   7351812  2594068128     864    27 27-01-2012:20:09:19
SORT             JOIN                                                 1        14       15186       1     1 27-01-2012:20:09:19
BUFFER           SORT                                                                                       27-01-2012:20:09:19
PX RECEIVE                                                            1        14       15186       1     1 27-01-2012:20:09:19
PX SEND          BROADCAST      :TQ10000                              1        14       15186       1     1 27-01-2012:20:09:19
INDEX            RANGE SCAN     INDX_SYS_PARAM3_TEST3                 1        14       15186       1     1 27-01-2012:20:09:19
SORT             JOIN                                               933  56411152  2594052942     863    27 27-01-2012:20:09:19
PX BLOCK         ITERATOR                                           930  56411152  2476072954     863    27 27-01-2012:20:09:19
TABLE ACCESS     FULL           BRN_ODS_ASSIGNMENT                  930  56411152  2476072954     863    27 27-01-2012:20:09:19
BUFFER           SORT                                                                                       27-01-2012:20:09:19
PX RECEIVE                                                          101  16939260    10100078     101     3 27-01-2012:20:09:19
PX SEND          HASH           :TQ10001                            101  16939260    10100078     101     3 27-01-2012:20:09:19
TABLE ACCESS     BY INDEX ROWID BRN_ODS_ASSIGNMENT_TEMP             101  16939260    10100078     101     3 27-01-2012:20:09:19
NESTED LOOPS                                                        102  18966180    10115264     102     3 27-01-2012:20:09:19
INDEX            RANGE SCAN     INDX_SYS_PARAM3_TEST3                 1        14       15186       1     1 27-01-2012:20:09:19
INDEX            RANGE SCAN     IDX_ASS_TEMP1                         9               2159400       9     1 27-01-2012:20:09:19

I heard from one of my colleagues that to get an accurate timing for a parallel query we need to divide by the number of CPUs; my server has 16 CPUs. If so, would it be 119359/16/60/60 = approximately 2 hours? Is this way of analyzing correct? If not, what is the correct way to get the timing of SQL that uses the parallel option?

2. If I run this query manually it finishes in 3 minutes, whereas the report says it took many hours. What is the correct approach to tune this query?

Thanks in Advance.

Thanks,
Sikki.

June 22, 2012 - 8:29 am UTC

if you are using parallel query - the times you are seeing are added up over all of the parallel execution servers.

If you did a parallel 10 and each parallel execution server took 1 minute, then the query would probably take 1 minute of wall clock time to finish - but you would have consumed 10 minutes of database time.

That is the *accurate* amount of time consumed by that query.

I would never divide by the cpu count, that would lead to a grossly inaccurate number.

If you were doing parallel 64 on a 16 core machine - would all 64 servers run at the same time (no, they cannot). simple division is too simple.

If you want a good report - you should be instrumenting the heck out of your batch process.

Or use ASH to investigate the single individual session you are interested in.
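
A rough sketch of the ASH approach, assuming the Diagnostics Pack is licensed and the samples are still retained in AWR. dba_hist_active_sess_history keeps about one sample in ten, so each row stands for roughly 10 seconds of one session's time:

```sql
-- Wall clock window versus accumulated database time for one statement.
-- With parallel execution many sessions are sampled at the same moment,
-- so approx_db_time_secs can be far larger than last_seen - first_seen.
select sql_id,
       min(sample_time)           as first_seen,
       max(sample_time)           as last_seen,
       count(distinct session_id) as sessions_seen,
       count(*) * 10              as approx_db_time_secs
  from dba_hist_active_sess_history
 where sql_id = 'dyzun84pjdd8v'
 group by sql_id;
```

The gap between first_seen and last_seen approximates how long the query ran on the wall clock, while approx_db_time_secs corresponds to the summed figure the AWR report shows. This is the parallel 10 example above in query form: ten servers busy for one minute give about ten minutes of DB time inside a one-minute window.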

parallel awr report

A reader, June 22, 2012 - 11:58 am UTC

Many Thanks for the clarification. 

While this query was executing I could see USERS_EXECUTING showing 65. Can you let me know how I should calculate the timing now?
SQL> set numwidth 20
SQL> select EXECUTIONS,PX_SERVERS_EXECUTIONS,USERS_EXECUTING,FIRST_LOAD_TIME,ROWS_PROCESSED,ELAPSED_TIME,LAST_ACTIVE_TIME from v$sql where sql_id='dyzun84pjdd8v'
  2  /
          EXECUTIONS PX_SERVERS_EXECUTIONS      USERS_EXECUTING
-------------------- --------------------- --------------------
FIRST_LOAD_TIME           ROWS_PROCESSED         ELAPSED_TIME LAST_ACTI
------------------- -------------------- -------------------- ---------
                   1                    32                   65
2012-06-22/20:07:02              1086721          92860759192 22-JUN-12

June 22, 2012 - 4:58 pm UTC

the timing is what the timing is - you told me what the database time was above.

If you want something that shows how long the application ran this query - use ASH on the single session that was your batch job.

Sporadic slowness in queries

A reader, October 28, 2013 - 2:06 am UTC

Our database is on 11.2.0.2. For the past few days the application teams have been complaining about sporadic slowness in some queries. The queries run OK most of the time and return results within milliseconds, but sometimes (a couple of times a day or so) they take 30 seconds or more. We are told about it after the fact, and the ASH reports don't indicate any unusual activity. Because of the sporadic nature of the issue we haven't been able to track and investigate it during a problem period. The SQLs in question have only one SQL ID, so dynamic changes of execution plan should not be happening there.
Any suggestions to tackle this issue would be much appreciated.

November 01, 2013 - 9:08 pm UTC

The SQLs under question have
only one SQL ID so dynamically changing of execution plans should not be
happening there.

??? dynamically changing of plans will only happen if they have the same sql id. You can only change the plan of a query (have more than one plan for a query) if it is the SAME query.

look in v$sql_plan, see if there is more than one plan.

if a query is going from ms's to 30 seconds - ash would be able to see that straight out. can you zero in on one of the sessions and see if we spent more time in the database for that session during that period than normal (eg: is the database slow or is the app server getting slow....)
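
A minimal sketch of the plan check, assuming the cursors are still in the shared pool. The bind :sql_id is a placeholder for the statement in question:

```sql
-- One row per child cursor and plan; more than one distinct
-- plan_hash_value means the same SQL ID has run with different plans.
select sql_id,
       child_number,
       plan_hash_value,
       min(timestamp) as plan_generated
  from v$sql_plan
 where sql_id = :sql_id
 group by sql_id, child_number, plan_hash_value
 order by child_number;
```

If the slow executions have already aged out of the shared pool, dba_hist_sql_plan holds the plans that AWR captured and can be queried the same way, minus the child_number column.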
