CPU used by this session
A reader, July 31, 2006 - 10:49 am UTC
Hi Tom
I think "CPU used by this session" in the statspack report is measured in units of 10 milliseconds (centiseconds), right?
So the CPU time per second should be 0.143 seconds.
July 31, 2006 - 12:25 pm UTC
ah, I see now, a semantic detail. I've modified my response slightly to account for that.
that is the value of that statistic used per second.
You have 3,669,833 cpu units of time used.
In 4,290 minutes
Which is about 14.3 cpu units of time used per second.
that is how that number is derived.
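The arithmetic behind that derivation, as a quick sanity check (the numbers are the ones quoted above; "CPU used by this session" is reported in centiseconds, i.e. 1/100ths of a second):

```python
# "CPU used by this session" over the interval, in centiseconds.
cpu_units = 3_669_833
elapsed_seconds = 4_290 * 60   # 4,290 minutes of elapsed time

units_per_second = cpu_units / elapsed_seconds
print(round(units_per_second, 1))        # 14.3 centiseconds of CPU per second
print(round(units_per_second / 100, 3))  # 0.143 seconds of CPU per second
```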
thank you for the statspack answers, one more question
Learning From Tom, July 31, 2006 - 2:29 pm UTC
hi Tom,
always very useful.
a couple of questions.
1) when checking my statspack report, I found that CPU time is 70% of the response time. do you agree?
when breaking the CPU time into its components, I found the following:
CPU parse = 55% of the total CPU time
CPU recursive = 25% of the total CPU time
CPU other = 20% of the total CPU time
what I understand is that CPU parse time is the time used for parsing the statements, is that right? if so, I guess I have a problem with parsing, am I right?
I saw a nice note on your site about soft and hard parsing. in my report soft parsing is about 95%, which isn't bad, but you mentioned that we can eliminate soft parsing as well by using static PL/SQL packages and keeping our cursors open. I asked our developers if they could do that and they said they have never done it before. could you point me to a resource that shows how to implement code with "no parse"?
2) CPU other time is the time actually used to process the data, am I right? so it won't be easy to reduce!
but what about CPU recursive time? how can I reduce it?
3) is there any way in the statspack report to know how many rows were accessed by the SQL statements in the "SQL ordered by buffer gets" section? I see one statement doing 32,155.7 gets/execution. I believe that is too high, but I need to make sure before highlighting this issue. do you know a good way to verify that?
thanks
July 31, 2006 - 2:59 pm UTC
your statspack is way too long to provide any truly useful information.
I can neither agree nor disagree. Response time is cpu + wait time, and I don't know your wait times. But remember my point right above, please.
there is quite simply put "insufficient detail here" to comment on anything - and the window of observation is absurdly large.
A major contributor to cpu time in many systems is latch contention (we spin and burn cpu). reducing latch contention may reduce (significantly) cpu time. Improving algorithms can reduce cpu time. It could be easy to reduce, it could be impossible to reduce. It can only be said that "we cannot say" right now.
v$sql can give you insight into the rows fetched
ops$tkyte%ORA10GR2> SelecT * from dual;
D
-
X
ops$tkyte%ORA10GR2> /
D
-
X
ops$tkyte%ORA10GR2> /
D
-
X
ops$tkyte%ORA10GR2> select sql_text, rows_processed from v$sql where sql_text like 'SelecT%';
SQL_TEXT
-------------------------------------------------------------------------------
ROWS_PROCESSED
--------------
SelecT * from dual
3
strange problem
A reader, October 02, 2006 - 8:47 am UTC
Hi Tom,
I am running into a strange problem.
I have a 9i database. lately it has been running very slowly; a query that used to take an hour now needs 7 hours!
I generated a statspack report with 20 minutes between snaps:
Elapsed: 20.05 (mins)
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 46,144 96.82
enqueue 400 1,174 2.46
I have 8 cpus, so the available cpu time should be 20 * 60 * 8 = 9,600 sec.
now I have CPU time = 46,144. could you please explain?
I have taken several snaps. some look normal, while others look the same (CPU time > elapsed time).
could you please help
October 02, 2006 - 9:09 am UTC
cpu time is contributed to the v$ tables *after a call completes*
suppose you have a single cpu machine.
suppose also you are the ONLY user on the machine.
Now, since:
ops$tkyte%ORA10GR2> select to_char( to_date('09:20','hh24:mi'), 'sssss' ) from dual;
TO_CH
-----
33600
suppose you have a procedure:
create or replace procedure p
as
begin
    loop
        exit when to_number( to_char(sysdate,'sssss') ) >= 33600;
    end loop;
end;
/
it just burns cpu until the seconds past midnight for the current day pass 9:20am (33,600).
Now, given the following timeline:
8:55am: start procedure
9:00am: snap 1
9:15am: snap 2
9:20am: procedure finishes
9:30am: snap 3
the report for snap1/2 will show ZERO CPU TIME USED
the report for snap2/3 will show 25 MINUTES OF CPU TIME USED (in 15 minutes of available cpu time)
so, I think that you have some really monster long procedures running during this time and they are CPU "dumping": they've been running for hours, and finally dump all of their cpu time into this window.
So, the cpu time is not "correct" for your period of observation.
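The timeline above can be sketched as a toy simulation (the minute values mirror the 8:55/9:20 example; the only point being modeled is that the statistic is published when the call completes, not continuously):

```python
# Times in minutes since 8:00am. A single call starts at 55 (8:55),
# ends at 80 (9:20), burning CPU the whole time. Its CPU time is
# published to the v$ tables only at call completion, i.e. minute 80.
call_start, call_end = 55, 80
publish_time = call_end
cpu_minutes = call_end - call_start     # 25 minutes of CPU

snaps = [60, 75, 90]                    # snaps at 9:00, 9:15, 9:30
windows = list(zip(snaps, snaps[1:]))   # snap1/2 and snap2/3 windows

for lo, hi in windows:
    # a window sees the call's CPU only if the call ended inside it
    seen = cpu_minutes if lo < publish_time <= hi else 0
    print(f"window {lo}-{hi}: {seen} CPU minutes (capacity {hi - lo})")
# window 60-75: 0 CPU minutes (capacity 15)
# window 75-90: 25 CPU minutes (capacity 15)  <- "impossible" number
```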
thanks, but my case seems to be different
A reader, October 02, 2006 - 11:11 am UTC
Hi Tom,
thank you for your kind help, but my case is different from what you explained.
i understand your point ( i have read it in one of your threads before)
but here is my case:
1) I have an idle test machine
2) I scheduled a snapshot every 20 minutes for 10 hours
3) I started running the reporting tool
4) after 9 hours the report was done (previously it took a couple of hours)
5) after 10 hours the snapshots stopped
6) I started to create several statspack reports with random snap ids (e.g. 1-2, 5-6, 20-21, ... etc.)
all reports show cpu time as the highest component, but some show cpu time higher than the total available cpu time, as I stated in my previous question.
I thought it might be a wrong selection (things running from before the report interval began), as you explained, so I created a report between the beginning snap and the last snap (10 hours) and found that the CPU time was higher than the CPU time available during the whole 10 hours (10H * 60 * no. of cpus).
so I am totally stuck and hope you can give me more help.
thanks
October 02, 2006 - 11:33 am UTC
sorry, something is up. You might think the machine is idle, but it really isn't. do you have complete control over this machine? did you verify manually? Have you considered using system tools (vmstat, sar, etc.) to measure as well?
yes, I did
A reader, October 02, 2006 - 11:42 am UTC
Hi Tom,
yes, I have full control of the machine.
I used sar during the whole interval.
the cpu time was distributed between user and system (no wait).
the average from sar was about 50%-60% user
and 10% system.
system-wide, things look fine, but the main confusion is the high cpu time within statspack.
October 02, 2006 - 11:46 am UTC
utilize support if you believe it to be reproducible.
thanks for your help, one more piece
A reader, October 03, 2006 - 3:52 am UTC
Hi Tom,
I will contact support, but here is another piece that might show more details:
Elapsed: 19.97 (mins)
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
SQL*Net message from dblink 262,001 9,210 56.61
again, SQL*Net message from dblink is taking 9,210 seconds! much higher than the total elapsed time.
do you have a clue? is the database link causing all of the misleading data for the CPU and the wait time?
thanks
October 03, 2006 - 6:29 am UTC
cpu time AND sql*net messages are added at the end of the call.
So, I am right back to where I began. CPU/network dumping - and you are doing it to yourself!!!!!
You are the process that starts at 8:55am. You are the process that under-contributes to snap1/2 and OVER contributed to snap2/3.
Does that make sense to you? You are looking at a 20 minute slice of a 10 hour report. You will have some 20 minute slices that underreport cpu/net and others that over report.
This'll sound strange coming from me - but if you do the 10 hour statspack, what then? do the numbers add up at the end?
sorry, but I didn't understand
A reader, October 03, 2006 - 8:55 am UTC
Hi Tom,
sorry, I didn't get your point clearly, please bear with me.
when I created a report from the whole measurement interval:
Elapsed: 569.96 (mins)
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
SQL*Net message from dblink 8,246,491 131,427 56.12
that is the report from the beginning to the end of the whole process.
thanks
October 03, 2006 - 10:43 am UTC
no, cpu time.
Ok, lets say you were taking 15 minute snapshots, every 15 minutes.
You start at 9am you finish at 12 noon.
Your process that starts at 9am does something that runs from 9am-9:16 in a single call. It'll dump 0 seconds of cpu time, 0 seconds of net time into the 9/9:15 slot, but an IMPOSSIBLE 16 minutes into the 9:15/9:30. Further, assume from 9:16-9:29 it did more work. It'll dump that time into the 9:15-9:30 window.
IF you analyze the 9:15 to 9:30 window - you will see impossibly large numbers. If you analyze the 9 to 9:15 window, you will see ZERO.
Now, when you look at net messages (waits) you have to remember they can be multiplied by the number of concurrent sessions, you are using parallel operations, so you have MANY sessions.
If in 10 seconds, 100 sessions wait for sql*net message from dblink, you will have 1,000 seconds of WAIT in that 10 seconds.
But, if you have 10 seconds and 8 cpus and 100 sessions - you can only use 80 seconds of cpu time.
Therefore: does the statspack for the entire process show reasonable cpu time? (sql*net message time you won't ever be able to decipher at the statspack level.)
If the cpu time looks reasonable for the ENTIRE THING then you are suffering from the 9:00 - 9:15, 9:15-9:30 symptoms I've been trying to describe.
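The asymmetry between wait time and cpu time in that explanation is just arithmetic (hypothetical 10-second window, 100 sessions, 8 cpus, exactly as in the example above):

```python
wall_seconds = 10
sessions = 100
cpus = 8

# Wait time SUMS across concurrent sessions: 100 sessions each
# waiting the full 10 seconds report 1,000 seconds of wait.
max_wait_seconds = sessions * wall_seconds

# CPU time cannot exceed the hardware: 8 CPUs can deliver at most
# 80 CPU-seconds in a 10-second wall-clock window.
max_cpu_seconds = cpus * wall_seconds

print(max_wait_seconds, max_cpu_seconds)  # 1000 80
```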
many thanks
A reader, October 03, 2006 - 11:45 am UTC
Hi Tom,
it is so clear now. I forgot the number-of-sessions factor. actually this is a ready-made application, but I can see the number of sessions changes over time.
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
SQL*Net message from dblink 8,246,491 131,427 56.12
CPU time 93,876 39.73
still, the CPU is very high.
but at least the sar cpu output looks normal. I guess I can depend on that instead of "cpu used by this session", can't I?
many thanks
October 03, 2006 - 4:44 pm UTC
in 10 hours, you have 10*60*60*8 cpu seconds (10 hours, 60 minutes in an hour, 60 seconds in a minute, 8 cpus)
that is 288,000 cpu seconds.
Your reported cpu here is very low compared to that.
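The same capacity arithmetic, checked against the reported 93,876 cpu seconds:

```python
hours, cpus = 10, 8

# available CPU over the whole run: 10 hours * 3,600 s/hour * 8 cpus
capacity = hours * 60 * 60 * cpus
reported = 93_876   # CPU time (s) from the 10-hour statspack report

print(capacity)                           # 288000
print(round(100 * reported / capacity))   # 33 -- about a third of capacity
```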
A reader, October 05, 2006 - 3:23 am UTC
Hi Tom,
when I have a PL/SQL block that calls a long running procedure,
and in the statspack report section "SQL ordered by Gets"
it is reported that this block took 1,000 CPU seconds:
1) now, I have 4 CPUs in my machine. is the 1,000 sec distributed among my 4 CPUs, or does it stick to one CPU?
2) could you please show us how we can distribute the load of one session among all the available CPUs?
thanks
October 05, 2006 - 8:11 am UTC
Disk Busy
shyam, October 05, 2006 - 8:41 am UTC
Hi Tom,
I received an issue from the server team regarding disk utilization: oracle is utilizing the maximum of it. The average % Disk Busy was about 74%.
Server:windows 2003
DB version:9.2.0.4
For the above issue I have generated a statspack report. Kindly take a look at it.
Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 3 04-Oct-06 07:33:23 129 .4
End Snap: 4 04-Oct-06 10:16:49 158 .2
Elapsed: 163.43 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 256M Std Block Size: 8K
Shared Pool Size: 384M Log Buffer: 512K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 2,770.53 723.63
Logical reads: 176,645.74 46,137.55
Block changes: 18.42 4.81
Physical reads: 1,612.48 421.16
Physical writes: 0.97 0.25
User calls: 145.04 37.88
Parses: 29.00 7.58
Hard parses: 1.22 0.32
Sorts: 0.85 0.22
Logons: 0.19 0.05
Executes: 25.17 6.57
Transactions: 3.83
% Blocks changed per Read: 0.01 Recursive Call %: 17.02
Rollback per transaction %: 5.31 Rows per Sort: 22.04
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 99.09 In-memory Sort %: 99.98
Library Hit %: 97.37 Soft Parse %: 95.79
Execute to Parse %: -15.23 Latch Hit %: 99.92
Parse CPU to Parse Elapsd %: 85.34 % Non-Parse CPU: 99.37
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 89.12 88.42
% SQL with executions>1: 34.54 35.62
% Memory for SQL w/exec>1: 28.60 32.70
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file scattered read 1,009,020 16,251 62.10
CPU time 4,963 18.97
db file sequential read 1,288,954 2,108 8.05
log file sync 35,665 1,001 3.83
buffer busy waits 203,788 799 3.05
-------------------------------------------------------------
Wait Events for DB: PRD1 Instance: prd1 Snaps: 3 -4
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file scattered read 1,009,020 0 16,251 16 26.9
db file sequential read 1,288,954 0 2,108 2 34.3
log file sync 35,665 2 1,001 28 0.9
buffer busy waits 203,788 0 799 4 5.4
log file parallel write 35,734 31,236 742 21 1.0
control file parallel write 3,122 0 203 65 0.1
db file parallel write 902 0 53 58 0.0
control file sequential read 2,610 0 30 11 0.1
enqueue 11 3 11 1010 0.0
direct path read 566 0 7 12 0.0
SQL*Net more data to client 4,203 0 1 0 0.1
latch free 1,073 999 0 0 0.0
SQL*Net break/reset to clien 268 0 0 0 0.0
direct path write 8 0 0 4 0.0
db file parallel read 1 0 0 15 0.0
LGWR wait for redo copy 4 0 0 0 0.0
SQL*Net message from client 1,423,657 0 532,637 374 37.9
virtual circuit status 327 327 10,784 32977 0.0
SQL*Net message to client 1,423,686 0 2 0 37.9
-------------------------------------------------------------
Background Wait Events for DB: PRD1 Instance: prd1 Snaps: 3 -4
-> ordered by wait time desc, waits desc (idle events last)
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file parallel write 35,732 31,234 742 21 1.0
control file parallel write 3,122 0 203 65 0.1
db file parallel write 902 0 53 58 0.0
control file sequential read 2,202 0 25 11 0.1
db file scattered read 213 0 4 17 0.0
db file sequential read 198 0 3 13 0.0
LGWR wait for redo copy 4 0 0 0 0.0
rdbms ipc message 77,780 40,613 61,400 789 2.1
pmon timer 3,435 3,435 10,801 3144 0.1
smon timer 32 32 10,486 ###### 0.0
-------------------------------------------------------------
SQL ordered by Gets for DB: PRD1 Instance: prd1 Snaps: 3 -4
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
744,032,331 68 10,941,651.9 43.0 1135.48 -3071.54 32535610
Module: CompData_Client.exe
delete acts_ctn_chk where ctn in (select ctn from acts_ctn_log w
here ctn in (select ctn from acts_ctn_chk) and status='SENDED')
547,671,606 69 7,937,269.7 31.6 1501.69 158.10 1443972639
Module: CompData_Client.exe
delete acts_ctn_chk where ctn in (select replace(ctnno,' ','') f
rom warehouse.pbc_ctninf where replace(ctnno,' ','') in (select
ctn from acts_ctn_chk) and odcflag is not null)
214,442,001 70 3,063,457.2 12.4 966.70 -3881.17 4254914190
Module: CompData_Client.exe
delete acts_ctn_chk where ctn in (select replace(ctnno,' ','') f
rom warehouse.pbc_ctninf where substr(material,-3) not in ('CIE
','CIS','CIN','CIT','CIX') and replace(ctnno,' ','') in (select
ctn from acts_ctn_chk))
206,755,165 71 2,912,044.6 11.9 1037.47 4162.61 2821933335
Module: CompData_Client.exe
update warehouse.pbc_ctninf set odcflag='NOT_CIS' where replace(
ctnno,' ','') in (select replace(ctnno,' ','') from warehouse.pb
c_ctninf where substr(material,-3) not in ('CIE','CIS','CIN','C
IT','CIX') and replace(ctnno,' ','') in (select ctn from acts_ct
n_chk))
2,472,052 69 35,826.8 0.1 13.48 14.28 2460733575
Module: ACTS_CTN_PCB_Client.exe
select * from acts_ctn_log where status='WAITING' and ctn like '
C%' and trndate between sysdate-10 and sysdate-1
October 05, 2006 - 9:33 am UTC
you do a ton of physical IO. not really sure "what else to say"?
well, besides: look at the sql doing tons of physical IO and see if you cannot do something to make it perform less physical IO.
Disk Busy
shyam, October 05, 2006 - 8:43 am UTC
From the above statspack, I need your suggestion on which area I have to concentrate. Further, I would like to inform you that oracle.exe uses 800MB of memory and windows 2003 has 4 processors.
October 05, 2006 - 9:34 am UTC
only way to make "disk less busy" is to perform less "physical disk IO", which requires - well, making your sql perform less physical disk IO.
Disk busy
shyam, October 05, 2006 - 8:48 am UTC
I would also like to update you on the CPU utilization.
Expecting your response at the earliest
Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
CPU used by this session 496,276 50.6 13.2
CPU used when call started 496,276 50.6 13.2
CR blocks created 336 0.0 0.0
DBWR buffers scanned 518,626 52.9 13.8
DBWR checkpoint buffers written 1,803 0.2 0.1
DBWR checkpoints 0 0.0 0.0
DBWR free buffers found 517,639 52.8 13.8
DBWR lru scans 273 0.0 0.0
DBWR make free requests 317 0.0 0.0
DBWR summed scan depth 518,626 52.9 13.8
DBWR transaction table writes 51 0.0 0.0
DBWR undo block writes 1,371 0.1 0.0
SQL*Net roundtrips to/from client 1,418,187 144.6 37.8
active txn count during cleanout 872 0.1 0.0
background checkpoints completed 0 0.0 0.0
background checkpoints started 0 0.0 0.0
background timeouts 9,720 1.0 0.3
branch node splits 0 0.0 0.0
buffer is not pinned count 16,004,252 1,632.1 426.3
buffer is pinned count 875,810,604 89,313.8 23,327.6
bytes received via SQL*Net from c 92,010,781 9,383.1 2,450.8
bytes sent via SQL*Net to client 129,679,857 13,224.5 3,454.1
calls to get snapshot scn: kcmgss 335,981 34.3 9.0
calls to kcmgas 36,316 3.7 1.0
calls to kcmgcs 161 0.0 0.0
change write time 150 0.0 0.0
cleanout - number of ktugct calls 586 0.1 0.0
cleanouts and rollbacks - consist 13 0.0 0.0
cleanouts only - consistent read 22 0.0 0.0
cluster key scan block gets 21,402 2.2 0.6
cluster key scans 10,290 1.1 0.3
commit cleanout failures: block l 0 0.0 0.0
commit cleanout failures: buffer 0 0.0 0.0
commit cleanout failures: callbac 21 0.0 0.0
commit cleanout failures: cannot 12 0.0 0.0
commit cleanouts 49,094 5.0 1.3
commit cleanouts successfully com 49,061 5.0 1.3
commit txn count during cleanout 6,232 0.6 0.2
consistent changes 6,362 0.7 0.2
consistent gets 1,732,014,735 176,628.1 46,132.9
consistent gets - examination 1,711,088,536 174,494.0 45,575.6
current blocks converted for CR 9 0.0 0.0
cursor authentications 10,859 1.1 0.3
data blocks consistent reads - un 6,362 0.7 0.2
db block changes 180,666 18.4 4.8
db block gets 173,387 17.7 4.6
deferred (CURRENT) block cleanout 17,801 1.8 0.5
dirty buffers inspected 387 0.0 0.0
enqueue conversions 3,152 0.3 0.1
enqueue deadlocks 0 0.0 0.0
enqueue releases 81,089 8.3 2.2
enqueue requests 81,090 8.3 2.2
enqueue timeouts 0 0.0 0.0
enqueue waits 8 0.0 0.0
execute count 246,826 25.2 6.6
free buffer inspected 31,447 3.2 0.8
Instance Activity Stats for DB: PRD1 Instance: prd1 Snaps: 3 -4
Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
free buffer requested 15,807,192 1,612.0 421.0
hot buffers moved to head of LRU 7,339 0.8 0.2
immediate (CR) block cleanout app 35 0.0 0.0
immediate (CURRENT) block cleanou 625 0.1 0.0
index fast full scans (full) 174 0.0 0.0
index fetch by key 855,408,345 87,233.2 22,784.2
index scans kdiixs1 251,585 25.7 6.7
leaf node 90-10 splits 7 0.0 0.0
leaf node splits 140 0.0 0.0
logons cumulative 1,815 0.2 0.1
messages received 36,547 3.7 1.0
messages sent 36,547 3.7 1.0
no buffer to keep pinned count 0 0.0 0.0
no work - consistent read gets 20,716,134 2,112.6 551.8
opened cursors cumulative 246,262 25.1 6.6
parse count (failures) 260 0.0 0.0
parse count (hard) 11,979 1.2 0.3
parse count (total) 284,415 29.0 7.6
parse time cpu 3,103 0.3 0.1
parse time elapsed 3,636 0.4 0.1
physical reads 15,811,952 1,612.5 421.2
physical reads direct 6,319 0.6 0.2
physical writes 9,496 1.0 0.3
physical writes direct 6,319 0.6 0.2
physical writes non checkpoint 8,750 0.9 0.2
pinned buffers inspected 29,673 3.0 0.8
prefetch clients - default 0 0.0 0.0
prefetched blocks 13,508,488 1,377.6 359.8
prefetched blocks aged out before 46 0.0 0.0
process last non-idle time 2,102,947,664,834 214,455,197.3 ############
recursive calls 291,650 29.7 7.8
recursive cpu usage 862 0.1 0.0
redo blocks written 83,553 8.5 2.2
redo buffer allocation retries 0 0.0 0.0
redo entries 92,094 9.4 2.5
redo log space requests 0 0.0 0.0
redo log space wait time 0 0.0 0.0
redo ordering marks 0 0.0 0.0
redo size 27,167,856 2,770.5 723.6
redo synch time 99,917 10.2 2.7
redo synch writes 35,651 3.6 1.0
redo wastage 14,280,296 1,456.3 380.4
redo write time 89,308 9.1 2.4
redo writer latching time 0 0.0 0.0
redo writes 35,728 3.6 1.0
rollback changes - undo records a 42 0.0 0.0
rollbacks only - consistent read 334 0.0 0.0
rows fetched via callback 65,137 6.6 1.7
session connect time 2,102,947,664,834 214,455,197.3 ############
session logical reads 1,732,188,122 176,645.7 46,137.6
session pga memory 6,913,468 705.0 184.1
session pga memory max 7,520,972 767.0 200.3
session uga memory 22,813,928 2,326.5 607.7
session uga memory max 164,122,128 16,736.9 4,371.5
shared hash latch upgrades - no w 253,730 25.9 6.8
shared hash latch upgrades - wait 7 0.0 0.0
Instance Activity Stats for DB: PRD1 Instance: prd1 Snaps: 3 -4
Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
sorts (disk) 2 0.0 0.0
sorts (memory) 8,315 0.9 0.2
sorts (rows) 183,277 18.7 4.9
summed dirty queue length 1,381 0.1 0.0
switch current to new buffer 23 0.0 0.0
table fetch by rowid 11,698,751 1,193.0 311.6
table fetch continued row 88,986 9.1 2.4
table scan blocks gotten 12,882,386 1,313.7 343.1
table scan rows gotten 1,243,127,869 126,772.2 33,111.2
table scans (long tables) 323 0.0 0.0
table scans (rowid ranges) 0 0.0 0.0
table scans (short tables) 514 0.1 0.0
transaction rollbacks 2 0.0 0.0
transaction tables consistent rea 0 0.0 0.0
transaction tables consistent rea 0 0.0 0.0
user calls 1,422,229 145.0 37.9
user commits 35,551 3.6 1.0
user rollbacks 1,993 0.2 0.1
workarea executions - multipass 2 0.0 0.0
workarea executions - onepass 0 0.0 0.0
workarea executions - optimal 13,399 1.4 0.4
write clones created in foregroun 1 0.0 0.0
-------------------------------------------------------------
Tablespace IO Stats for DB: PRD1 Instance: prd1 Snaps: 3 -4
->ordered by IOs (Reads + Writes) desc
Tablespace
------------------------------
Av Av Av Av Buffer Av Buf
Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
USERS
1,792,321 183 5.0 4.6 107 0 171,736 2.5
INDX
252,961 26 12.3 15.5 677 0 30,599 12.0
TOOLS
236,180 24 13.1 15.0 578 0 1,115 11.0
PLIST_DATA
10,843 1 16.7 12.0 124 0 0 0.0
SYSTEM
3,413 0 13.6 1.3 69 0 0 0.0
TEMP2
1,170 0 7.4 5.4 1,028 0 0 0.0
PLIST_INDX
1,364 0 12.2 3.1 197 0 0 0.0
UNDOTBS1
80 0 15.5 1.0 1,424 0 0 0.0
HRIS_INDX
13 0 19.2 1.0 0 0 0 0.0
HRIS_DATA
11 0 19.1 1.0 0 0 0 0.0
October 05, 2006 - 9:35 am UTC
please stop putting large snippets of virtually "useless" statspack reports here.... You are doing lots of physical IO, what more can one say exactly???
Disk busy
shyam, October 05, 2006 - 12:02 pm UTC
My apologies... I am a beginner and I was not aware of it. Kindly stop using "discouraging" words like useless.
October 05, 2006 - 1:21 pm UTC
well, you have a disk IO issue. All of that stuff is rather "cluttering" here don't you think?
You came in and said "Expecting your response at the earliest", rather pushy? This is just a forum, not support.
thanks for the plsql help
A reader, October 05, 2006 - 11:33 pm UTC
Hi Tom,
thank you for the help and support you are offering us.
I have reviewed the thread you recommended earlier:
</code>
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:10498431232211 <code>
but I am somewhat confused. what do you mean by "run 4 copies of your procedure"? will that cause each procedure to use a different CPU? will they run together, or one after another?
*if you have enough time*, could you please show us an example of code that will stick to one cpu, and its modification that will use as many of the available cpus as possible?
I have 12 CPUs. there is a report that takes the whole night to finish (about 11 hours). I created statspack snaps at the beginning and the end of the process and found that the top statement in the "SQL ordered by buffer gets" section is a plsql block that took about 10 CPU hours. I want to make sure that our developers are making use of all the available CPUs.
sorry for making this long.
thanks
October 06, 2006 - 8:51 am UTC
there can be no such example of a single plsql routine running on 4 cpus simultaneously.
do you have access to my book Expert Oracle Database Architecture, in the parallel chapter I go over this concept in great detail - over many pages - showing how to do it "yourself" or to use parallel pipelined functions.
Basically, the concept is YOU have to DIVIDE AND CONQUER. You have to take your SERIAL process and run many of them at the same time, each working on a different bit of the problem.
And the link I pointed you to, well, that shows the approach.
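The divide-and-conquer shape described above, sketched outside the database in Python (in Oracle you would submit one dbms_job per slice, or use parallel pipelined functions as noted; the `slices` and `worker` names here are purely illustrative stand-ins):

```python
def slices(lo, hi, n):
    """Split the half-open key range [lo, hi) into n near-equal slices."""
    step, rem = divmod(hi - lo, n)
    out, start = [], lo
    for i in range(n):
        end = start + step + (1 if i < rem else 0)
        out.append((start, end))
        start = end
    return out

def worker(lo, hi):
    # stand-in for "process all rows whose key falls in [lo, hi)";
    # in the real thing, each of these runs as its own scheduled job
    return sum(range(lo, hi))

parts = slices(0, 1_000_000, 4)
print(parts)  # four non-overlapping slices covering the whole range

# each slice is independent, so the n workers can run concurrently;
# summed up, they produce the same answer as the one serial job
total = sum(worker(lo, hi) for lo, hi in parts)
assert total == sum(range(1_000_000))
```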
example
Jim, October 06, 2006 - 9:44 am UTC
I agree with TK, you have to roll your own. I had a plsql batch process that ran for 18 hours (due to some serious database design issues that I could not remediate). But I was able to divide the data into 4 parts by distributor id and run 4 parallel plsql jobs, one for each set of distributors. Accounting was ecstatic to get the final report in under 5 hours.
Your opinion
Reader, October 10, 2006 - 6:48 am UTC
Hello Tom,
It's a 30 minute snapshot statspack report. Purposely, I am not sending the entire statspack report; I am copying only the required parts.
I am working on Oracle 9i.
Question 1:
The Gets per Exec is 253.9. I suppose it's too high. What can be done to improve it?
SQL ordered by Gets DB/Inst: ORCL/orcl Snaps: 21-22
-> Resources reported for PL/SQL code includes the resources used by all SQL
statements called by the code.
-> End Buffer Gets Threshold: 10000 Total Buffer Gets: 297,306
-> Captured SQL accounts for 87.4% of Total Buffer Gets
-> SQL reported below exceeded 1.0% of Total Buffer Gets
CPU Elapsd Old
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
40,364 159 253.9 13.6 1.07 2.68 2689373535
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;
broken BOOLEAN := FALSE; BEGIN EMD_MAINTENANCE.EXECUTE_EM_DBMS_J
OB_PROCS(); :mydate := next_date; IF broken THEN :b := 1; ELSE :
b := 0; END IF; END;
Question 2:
The Gets per Exec for the query below is around 315.5. Since these are system-generated queries for Oracle's internal purposes, I suppose we do not have any control over them.
Is there something we can do, or should it be ignored?
SQL ordered by Gets DB/Inst: ORCL/orcl Snaps: 21-22
CPU Elapsd Old
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
12,934 41 315.5 4.4 0.10 0.39 4088878846
update smon_scn_time set orig_thread=0, time_mp=:1, time_dp=:2,
scn=:3, scn_wrp=:4, scn_bas=:5, num_mappings=:6, tim_scn_map=:
7 where thread=0 and scn = (select min(scn) from smon_scn_time
where thread=0)
11,760 38 309.5 4.0 0.06 0.09 146931816
delete from smon_scn_time where thread=0 and scn = (select min(
scn) from smon_scn_time where thread=0)
Thanks
October 10, 2006 - 8:21 am UTC
laughing out loud. why do you suppose it is too high?
it is a procedure, with perhaps tons of sql. you cannot say it is too high from where I'm sitting.
Strange
A reader, October 12, 2006 - 8:35 am UTC
Hi Tom,
i am very confused about one thing in the statspack
i have 9iR2 database
when i generate a statspack report
from snaps 10 to 12, I found the SQL doing the most buffer gets is:
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
234,579,155 1 234,579,155.0 463.1 3604.41 3626.01 1872085176
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;
broken BOOLEAN := FALSE; BEGIN dbms_refresh.refresh('"NEW"."
VIEW"'); :mydate := next_date; IF broken THEN :b
:= 1; ELSE :b := 0; END IF; END;
this is refreshing some materialized views.
it did about 234,579,155 buffer gets.
when I do a statspack report from snaps 1 to 20 (which include snaps 10 to 12),
I can't see the previous statement in the "SQL ordered by buffer gets" section, while the top statement listed is:
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
161,526,828 430 375,643.8 5.4 5395.84 6830.07 2712214176
here the top statement did 161,526,828 buffer gets, therefore
I am confused. where is my previous statement? it should be included.
could you please help ?
thanks
October 12, 2006 - 11:29 am UTC
let us see a bit more of the report, including the section header. I want to know what section you are looking at here.
more details
A reader, October 12, 2006 - 12:04 pm UTC
Hi Tom,
more details as per your request, and thanks for your interest in helping us:
SQL ordered by Gets for DB: ORCL Instance: prd Snaps: 215 -235
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
234,579,155 1 234,579,155.0 95.4 3604.41 3626.01 1872085176
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;
broken BOOLEAN := FALSE; BEGIN dbms_refresh.refresh('"NEW"."
VIEW"'); :mydate := next_date; IF broken THEN :b
:= 1; ELSE :b := 0; END IF; END;
============================================
SQL ordered by Gets for DB: PRD Instance: prd Snaps: 2 -432
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
161,526,828 430 375,643.8 5.4 5395.84 6830.07 2712214176
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;
broken BOOLEAN := FALSE; BEGIN JOBS.DO(I_LEVEL=>7,I_
many thanks
October 13, 2006 - 6:38 am UTC
doh! I know the answer to this....
when you enter "10-12", they query the copies of the v$ tables for snaps 10 and 12. ONLY the sql that was in there at snap 10 will be in the report. (the begin snap drives the report)
when you enter "1-20", they do the same but drive from snap 1
Hence, the sql you are "missing" in the 1-20 report is missing because it had not yet started when snap 1 was taken.
this will only report on sql that was actually in the v$ tables as of snap 1 (or as of snap 10)
sorry i forgot to change the db name for all inputs
A reader, October 12, 2006 - 12:06 pm UTC
sorry i forgot to change the db name for all inputs in my previous post
THANKS
A reader, October 13, 2006 - 10:06 am UTC
Hi Tom,
thank you very much, really i was having a serious problem with this.
you always recommend taking 20 - 30 minute snaps during serious workload. i used to say "oh, why? i guess taking 20 minute snaps and then generating a report with a 24 hour elapsed time will be perfect to show me the total day load". now i understand how wrong i was.
thanks
puzzle
A reader, October 13, 2006 - 1:41 pm UTC
Hi Tom,
please bear with me , i understand your point but it didn't work with me :(
SQL ordered by Gets for DB: ORCL Instance: orcl Snaps: 223 -432
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
84,848,853 209 405,975.4 6.5 2712.94 3427.11 2712214176
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;
broken BOOLEAN := FALSE; BEGIN Jobs.DO(I_LEVEL=>7,I_
still my sql is missing :(
215 - 235 : found
snap 1 - 430 : missed
snap 223 - 430 : missed !
it should already have been running when snap 223 was taken
could you please help ?
thanks
October 13, 2006 - 2:49 pm UTC
IF the sql statement was not in v$sql when your FIRST snap was taken
THEN
there is zero chance the sql will appear in this report
the statspack report only considers SQL that was in the shared pool at the time of the first snap, this means the sql you were looking at WAS NOT THERE at the time the first snap you chose was run, that is all.
It might already have executed when snap 223 was taken, but that by no means implies it was still THERE when snap 223 was taken.
it was there for 215, aged out by 223 - not there anymore.
thanks for bearing with me
A reader, October 13, 2006 - 2:58 pm UTC
Hi Tom,
i understand your point but it is still a puzzle for me
please bear with me
SQL ordered by Gets for DB: ORCL Instance: orcl Snaps: 215 -300
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
94,107,482 379,470 248.0 10.4 5192.27 7575.53 476747500
Module: emp.exe
SELECT INITIAL FROM EMPL WHERE NUMBER
==================================
here snap 215 - 300 : it is missing also
1) do you think it should be there when both snaps are taken? or what is your idea about that?
2) how often will a statement be aged out of v$sql? i thought it was permanent until the database is restarted?
thanks
October 13, 2006 - 7:18 pm UTC
1) read the query, it does an outer join (it is there in sprepins)
but if it was in the first snap, but not the last - then - well, you lose the ending count of the gets, hence it'll have "zero" gets as far as the report was concerned.
the first snap would show "low gets", the second snap - missing - would be "zero" gets.
2) think about it, sort of a function of available memory and what is done on your system!!!
if it were permanent, then the shared pool would have to be infinite.
more details
A reader, October 13, 2006 - 3:15 pm UTC
Hi Tom,
thank you for your efforts and great help
i found something
SQL ordered by Gets for DB: ORCL Instance: orcl Snaps: 2 -235
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
234,579,155 1 234,579,155.0 14.1 3604.41 3626.01 1872085176
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate; broken BOOLEAN := FALSE; BEGIN dbms_refresh.refresh('"NEW"." VIEW"'); :mydate := next_date; IF broken THEN :b := 1; ELSE :b := 0; END IF; END;
====================================================
snap 2 - 235 : it is there
therefore i guess the second snap is what drives the report; a statement should be in the v$ tables when the second snap is taken. is my guess correct?
2) what causes statements to be aged out of v$sql? aren't they still there all the time until the next database restart? could you please explain this
thanks
PL/SQL single threaded question
A reader, October 13, 2006 - 6:26 pm UTC
If PL/SQL is operating with parallel hints or on tables with parallel degrees, is it still on one just CPU? I will check out the section in your book.
October 13, 2006 - 7:21 pm UTC
plsql would be on a single cpu, the sql it executes might be done in parallel.
could you give more explain please
A reader, October 13, 2006 - 10:49 pm UTC
Hi Tom,
thanks for your time.
sorry i didn't get your point well. if it is in the first snap and not in the last snap, it won't show up?
if it isn't in the first snap but it is in the second snap, it will show up?
if it is in both snaps, it will show up with correct statistics for the statement?
please correct me
sorry about the second question
thanks
October 14, 2006 - 8:10 am UTC
sorry, in looking I got it backwards, it has to be in the ENDING snapshot.
Much can be learned about what you are looking at by looking at the stuff that generates what you are looking at. like I said above, peeking at sprepins.sql - the script that does the report, you'll find the sql ordered by gets query to be:
lpad((to_char(100*(e.buffer_gets - nvl(b.buffer_gets,0))/:slr
,'990.0'))
, 6) ||' '||
lpad( nvl(to_char( (e.cpu_time - nvl(b.cpu_time,0))/1000000
, '9990.00')
, ' '),8) || ' ' ||
lpad( nvl(to_char( (e.elapsed_time - nvl(b.elapsed_time,0))/1000000
, '99990.00')
, ' '),9) || ' ' ||
lpad(e.old_hash_value,10)||''||
decode(e.module,null,st.sql_text
,rpad('Module: '||e.module,80)||st.sql_text)
, st.sql_text) aa
, e.old_hash_value hv
from stats$sql_summary e
, stats$sql_summary b
, stats$sqltext st
where b.snap_id(+) = :bid
and b.dbid(+) = e.dbid
and b.instance_number(+) = e.instance_number
and b.old_hash_value(+) = e.old_hash_value
and b.address(+) = e.address
and b.text_subset(+) = e.text_subset
and e.snap_id = :eid
and e.dbid = :dbid
and e.instance_number = :inst_num
and e.old_hash_value = st.old_hash_value
and e.text_subset = st.text_subset
and st.piece <= &&num_rows_per_hash
and e.executions > nvl(b.executions,0)
and 100*(e.buffer_gets - nvl(b.buffer_gets,0))/:slr > &&top_pct_sql
order by (e.buffer_gets - nvl(b.buffer_gets,0)) desc, e.old_hash_value, st.piece
)
so, E is the "ending snap", it is outer joined to B for the "begin snap".
It is likely that your end snap in the report you ran either
a) didn't have your sql
b) didn't have your sql having excessive e.buffer_gets yet
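One quick way to verify (a sketch; the hash value is the one from the reports above, and this assumes the PERFSTAT schema is queryable from your session) is to ask stats$sql_summary directly which snapshots actually captured the statement:

```sql
-- Which snapshots captured this statement at all?  If your report's
-- ending snap id is not in this list, the statement cannot appear in
-- the "SQL ordered by Gets" section of that report.
SELECT snap_id, executions, buffer_gets
  FROM perfstat.stats$sql_summary
 WHERE old_hash_value = 2712214176
 ORDER BY snap_id;
```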
shared sql area
A reader, October 16, 2006 - 7:15 am UTC
Hi Tom,
my system has regular busy hours from 6AM - 1PM. after 1PM the system has very little activity (just at night we have a few PL/SQL packages gathering statistics and uploading data to another database over a database link)
it seems that at the beginning of every day (6AM) we face some hard parses. the hard parses keep varying during the first 4 hours (6 - 10) and then they are nearly gone through the last 3 hours (10 - 1)
i have gathered statspack snaps every hour. it seems to me that most of the sql statements issued during the day are totally aged out during the night and we have to hard parse them again.
first hour before (5-6)
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 59.07 59.58
shared free memory 473,785,752 467,965,752 -1.23
shared sql area 117,770,600 122,042,192 3.63
last hour of system loaded (1PM)
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 95.82 95.79
shared free memory 48,347,072 48,687,184 0.70
shared sql area 568,352,616 569,088,808 0.13
first hour after load (2 PM)
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 95.79 77.61
SGA breakdown difference :
shared free memory 324,825,904 420,684,592 29.51
shared sql area 569,088,808 358,095,656 -37.08
second hour (3 PM)
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 77.61 71.94
SGA breakdown difference
shared free memory 259,180,848 324,825,904 25.33
shared sql area 358,095,656 256,671,168 -28.32
third hour (4 PM)
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 71.94 63.66
shared free memory 324,825,904 420,684,592 29.51
shared sql area 256,671,168 158,459,168 -38.26
fourth hour (5PM)
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 63.66 64.83
shared free memory 420,684,592 407,192,400 -3.21
shared sql area 158,459,168 162,222,736 2.38
as you can see, the "shared sql area" keeps decreasing. i guess it is aging out the morning's sql statements even though there seems to be no need for that memory at the time.
1) do you have any idea how to keep my morning sql statements until the next day so i don't need to hard parse them again?
2) can you tell us more about how oracle ages out the shared pool? i know it is lru, but does it keep aging out even if there is no need for that memory at this point?
thanks
October 16, 2006 - 7:56 am UTC
... we have few PL/SQL packages gathering statistics ...
... are totally aged out during the
night and we have to hard parse them again. ....
No, they are not aged out, they are invalidated because you gathered statistics. You forced them out.
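If the invalidations at gather time are the issue, one option worth knowing about (a sketch; DBMS_STATS has a documented no_invalidate parameter, and the schema name below is a placeholder) is to gather without invalidating the dependent cursors, accepting that existing cursors keep their old plans until they age out naturally:

```sql
BEGIN
  dbms_stats.gather_schema_stats(
    ownname       => 'APP_OWNER',     -- placeholder schema name
    options       => 'GATHER STALE',  -- stale objects only
    no_invalidate => TRUE );          -- leave dependent cursors valid
END;
/
```

The trade-off is exactly the one Tom describes: the point of gathering statistics is usually to let plans change, and no_invalidate defers that.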
A reader, October 16, 2006 - 8:36 am UTC
Hi Tom,
i understand your point now, but i am confused about a few points
1) we gather statistics on stale and invalid objects only; should that have such a big effect?
2) the shared sql area keeps decreasing while there is no load on the system. the statistics gathering is done at 11 PM, but you can see that the shared sql area keeps decreasing, reaching 60% even earlier, and i don't know why. can you explain that please?
3) oracle recommends gathering statistics, and we gather on stale only; why is it affecting the system?
4) the shared pool usage during the heavy load keeps flickering between 95% and 85%; should we increase the shared pool size?
thanks
October 16, 2006 - 10:17 am UTC
1) you tell me - how many objects in general does that affect on YOUR system (your database is not my database, not their database - only you know what happens to your database).
And how many of the queries in the shared pool reference those objects? All it takes is one "popular" table to be analyzed.
2) do you use binds? If you fill up the shared pool and we need to free space in it to parse another query, we whip through and flush what we can. So, if you fill it, then still need more, we "unfill" it.
3) it is doing precisely what it is SUPPOSED to do, why do you gather statistics? In general to get plans to change, what can let a plan change? A hard parse. So, by default, we invalidate the cursors when you gather statistics, else the GOAL of gathering statistics (to get new plans) could not be met!
4) do you use binds correctly and in all cases? If yes, you *might* be undersized. if no, that is the cause of many of your issues
A reader, October 16, 2006 - 12:50 pm UTC
Hi Tom,
thanks for your help
will you please bear with me some more
we are using bind variables as much as possible; most of our statements use them
okay, i can understand when the usage goes from 95.79 > 77.61, as 95 is high and when oracle needs to parse it might need to free up some memory
but what i find strange is when oracle keeps decreasing 77.61 > 71.94 (i see 77 as low, and oracle can allocate any needed memory, although we have very little work during this time)
then it went from 71.94 > 63.66, so it seems to me it is aging out more and more with no actual need
will you please help me understand this issue? (as you can see there is a lot of free memory at 4PM and 5PM; why doesn't oracle reuse this memory instead of aging out?)
2) can i depend on the shared pool advisory in sizing the database? (it said that if i increased my shared pool by 20% the Estd Lib Cache Time saved would improve by about 70 sec) "i have enough unused memory on my system, but as you always mention, *just use what you need*"
thanks
October 16, 2006 - 12:55 pm UTC
it takes exactly this many unbound sql statements to kill your shared pool:
ONE
that is it, just one, only one, no more than one are needed. One application parsing One sql statement frequently enough without binds - bamm, you are dead in the water.
77 went up to 100 and down to 71, nothing unusual or confusing about that - remember, you are seeing two END POINTS, heck it could have:
77 -> 100 -> 50 -> 100 -> 60 -> 100 -> 20 -> 100 -> 71
for all we know during your period of observation.
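To hunt for that one unbound statement, the remove_constants technique from Tom's book can be sketched roughly like this (the regexp version below assumes 10g; in 9i the same idea is usually written with a character-by-character loop, so treat this as an adaptation, not the exact book code):

```sql
-- Collapse literals so statements that differ only in their constants
-- reduce to the same text, then count the duplicates.
CREATE OR REPLACE FUNCTION remove_constants( p_query IN VARCHAR2 )
  RETURN VARCHAR2
AS
  l_query VARCHAR2(32767) := p_query;
BEGIN
  l_query := REGEXP_REPLACE( l_query, '''[^'']*''', '''#''' );  -- string literals
  l_query := REGEXP_REPLACE( l_query, '[0-9]+', '@' );          -- numbers
  RETURN l_query;
END;
/

SELECT remove_constants(sql_text) stripped, COUNT(*) cnt
  FROM v$sqlarea
 GROUP BY remove_constants(sql_text)
HAVING COUNT(*) > 100
 ORDER BY cnt DESC;
```

Any "statement" that shows up hundreds of times in this output is really one statement being hard parsed over and over with different literals.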
A reader, October 16, 2006 - 1:22 pm UTC
Hi Tom,
i hadn't thought that a few statements could do that
but there are some cases where developers preferred not to use bind variables; a bind variable's value is unknown to the optimizer, so it could lead to a wrong plan
in your book, as far as i understand, you said it is common to have a mix of bind and literal values. didn't i get that right?
what about the shared pool advisory? can i consider it?
thanks
October 16, 2006 - 5:23 pm UTC
if you have a query that is going to be repeatedly parsed - hundreds or thousands of times per day - "binds are us" will describe that query.
If there is one particular "skewed" column and it is KNOWN to be skewed AND it has few (say 40 or less values), go ahead, don't bind THAT column, but bind the REST OF THE STUFF in that query.
If there is a literal value that never changes in the query - sure, use a literal
binds binds binds...
sure you can consider it, you can consider all of the stuff.
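In shared pool terms the advice looks like this (emp/empno are just illustration names, not from the thread):

```sql
-- literals: each distinct value is a brand new statement, hard parsed
-- and cached separately in the shared pool
SELECT ename FROM emp WHERE empno = 1234;
SELECT ename FROM emp WHERE empno = 5678;

-- bind: one shared cursor, soft parsed (or, with a held cursor,
-- not parsed at all) on every reuse
SELECT ename FROM emp WHERE empno = :empno;

-- the exception Tom describes: a known-skewed column with few values
-- may keep its literal, while everything else in the query stays bound
SELECT ename FROM emp WHERE status = 'CLOSED' AND empno = :empno;
```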
yes i understand it now
A reader, October 17, 2006 - 4:02 am UTC
Hi Tom,
yes i understand your point, thank you for clearing that up.
will you please forgive me for the long post, but i am still confused about the shared pool usage. this is a 10 min report; you can see my database is nearly idle, but i still have shared pool usage going from 70 > 63%
Elapsed: 10.02 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 2,448M Std Block Size: 8K
Shared Pool Size: 992M Log Buffer: 512K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 5,281.70 24,607.01
Logical reads: 896.27 4,175.65
Block changes: 11.66 54.30
Physical reads: 841.58 3,920.84
Physical writes: 30.39 141.60
User calls: 5,783.94 26,946.90
Parses: 2.32 10.82
Hard parses: 0.04 0.21
Sorts: 1.23 5.73
Logons: 0.12 0.54
Executes: 746.58 3,478.23
Transactions: 0.21
% Blocks changed per Read: 1.30 Recursive Call %: 28.12
Rollback per transaction %: 3.10 Rows per Sort: ########
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 86.89 In-memory Sort %: 99.86
Library Hit %: 99.99 Soft Parse %: 98.07
Execute to Parse %: 99.69 Latch Hit %: 100.00
Parse CPU to Parse Elapsd %: 100.00 % Non-Parse CPU: 99.79
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 70.20 63.63
% SQL with executions>1: 77.20 81.67
% Memory for SQL w/exec>1: 61.43 68.03
SGA breakdown difference for DB: PRD Instance: prd Snaps: 336 -337
Pool Name Begin value End value % Diff
------ ------------------------------ ---------------- ---------------- -------
java free memory 162,922,496 162,922,496 0.00
java memory in use 4,849,664 4,849,664 0.00
large free memory 83,789,744 83,789,744 0.00
large session heap 96,336 96,336 0.00
shared DG Broker heap 50,152 50,152 0.00
shared FileOpenBlock 14,960,384 14,960,384 0.00
shared KGK heap 7,000 7,000 0.00
shared KGLS heap 36,863,488 36,863,488 0.00
shared KQR L PO 21,612,096 21,612,096 0.00
shared KQR M PO 19,262,304 19,262,304 0.00
shared KQR S SO 188,408 188,408 0.00
shared KQR X PO 10,304 10,304 0.00
shared MTTR advisory 669,448 669,448 0.00
shared PL/SQL DIANA 10,755,824 10,755,824 0.00
shared PL/SQL MPCODE 28,035,216 28,035,216 0.00
shared PL/SQL PPCODE 545,904 545,904 0.00
shared PL/SQL SOURCE 1,720 1,720 0.00
shared PLS non-lib hp 2,088 2,088 0.00
shared PX subheap 278,912 278,912 0.00
shared db_block_hash_buckets 10,931,088 10,931,088 0.00
shared dictionary cache 3,229,952 3,229,952 0.00
shared errors 360,912 360,912 0.00
shared event statistics per sess 28,140,000 28,140,000 0.00
shared fixed allocation callback 552 552 0.00
shared free memory 344,983,456 421,069,016 22.05
shared joxlod: in ehe 273,304 273,304 0.00
shared joxs heap init 4,240 4,240 0.00
shared library cache 248,370,744 244,892,336 -1.40
shared miscellaneous 144,216,672 144,243,960 0.02
shared parameters 3,292,920 3,292,920 0.00
shared partitioning d 21,472 21,472 0.00
shared sessions 6,760,000 6,760,000 0.00
shared sim memory hea 2,507,208 2,507,208 0.00
shared sql area 230,850,400 158,220,200 -31.46
shared table definiti 33,264 33,264 0.00
shared temporary tabl 9,440 5,200 -44.92
shared trigger defini 189,256 189,256 0.00
shared trigger inform 27,952 27,952 0.00
shared trigger source 2,072 2,072 0.00
shared type object de 196,136 196,136 0.00
buffer_cache 2,566,914,048 2,566,914,048 0.00
fixed_sga 739,792 739,792 0.00
log_buffer 787,456 787,456 0.00
1) is there a target oracle usually tries to reach for the shared pool usage? you can see that it is aging out a lot of space for nothing. could you please help
2) you recommended sizing the shared pool to just what you need, and some notes suggest a very large shared pool may lead to worse performance. in my case the shared pool advisory shows a better result if i increase it by 20%; can i give it a try? i am just worried it will reduce my performance further! (i was thinking of increasing it, and if i still see those reductions in memory usage i should ask the developers to review their code; they believe they use literals only when needed)
thanks
October 17, 2006 - 4:29 am UTC
it never ages things out for nothing.
I would hardly call 746 executes per second "almost idle" - would you?
A reader, October 17, 2006 - 5:10 am UTC
Hi Tom,
thanks for your reply
but could you please clarify for us the relation between the number of executions and shared pool aging?
almost all were soft parses; they didn't add any new stuff to the shared pool
thanks for your time and kindly support
October 17, 2006 - 5:21 am UTC
there ISN'T one. you said "system idle", I'm saying "not"
for all I know, you gathered statistics during this "you were not really idle" idle time.
Have you identified a performance related issue here or are you simply looking at numbers and saying "hmm"
A reader, October 17, 2006 - 5:39 am UTC
i didn't mean idle as in nothing being done
i meant it as no activity on the shared pool.
also, here i mean idle in terms of cpu usage; it is very little compared to the high load period
yes, my cpu is almost 95% during the high load period
i identified the top wait as latch free; i believe the latches are eating my cpu
i notice that within the first 3 hours (where CPU is almost 95%) there are about 10 hard parses per second.
it looks like the database warms up for a few hours every day and then works fine. i was wondering what is wiping my statements out
October 17, 2006 - 7:07 am UTC
sure there was activity on the shared pool???
I told you what is happening. You gather statistics. The goal of gathering statistics is to change plans. In order to change plans you must invalidate the sql. When you invalidate the sql, you must hard parse it the first time.
So, you are causing this to happen.
A reader, October 17, 2006 - 8:18 am UTC
Hi Tom,
yes, i thought of this as well, but when i checked the number of tables analyzed daily, i found it is nearly zero (we gather statistics on stale only and our tables have little write activity)
as you can see, i started the day with "shared sql area 117,770,600"
at the end of the heavy activity [1PM] (lots of hard parsing)
i ended up with "shared sql area 569,088,808"
at 5 PM i have "shared sql area 158,459,168" only
i gather statistics at 9PM, therefore you can see that a lot of my memory had already been wiped out (569,088,808 - 158,459,168) before the statistics gathering.
i am not sure why all of that memory was wiped. could you give me any clue, or tell me how i can find that out?
thank you very much
A reader, October 17, 2006 - 8:58 am UTC
Hi Tom,
some more info for the same snap i just posted above, in which the usage went from 70 > 63 and you mentioned i have a lot of executions
Library Cache Activity for DB: STS Instance: sts Snaps: 115 -116
->"Pct Misses" should be very low
Get Pct Pin Pct Invali-
Namespace Requests Miss Requests Miss Reloads dations
--------------- ------------ ------ -------------- ------ ---------- --------
BODY 353 0.0 353 0.0 0 0
SQL AREA 694 3.3 451,894 0.0 4 1,687
TABLE/PROCEDURE 287 0.0 1,082 0.8 0 0
TRIGGER 135 0.0 135 0.0 0 0
a huge number of invalidations :( do you have a clue about that?
thank you very much
October 17, 2006 - 9:51 am UTC
your, well, statistics..... definitely would do that.
It only takes one table....
that is referenced in various sqls.
A reader, October 17, 2006 - 11:02 am UTC
Hi Tom,
is there any other reason that can do that?
that happened at 5 - 5:10 PM
while i gather statistics at 9 PM
will you please tell me what other reasons there are for marking sql as invalid
many thanks
October 17, 2006 - 11:04 am UTC
DDL operations, revokes and the like.
A reader, October 17, 2006 - 11:25 am UTC
Hi Tom,
thank you very much
one more thing please
could i identify the real cause ( the DDL that have been done ) from the statspack ?
or how i could do that ?
thank you very much
October 17, 2006 - 11:56 am UTC
you could write a trigger to audit DDL activity, you could enable auditing to audit DDL activity, and if you have 10g, you might be able to spy on the DDL in the shared pool by querying v$sql
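The DDL audit trigger could be sketched along these lines (table and trigger names are made up for illustration; the ora_* calls are the documented trigger event attribute functions):

```sql
-- Somewhere to record the DDL (placeholder table)
CREATE TABLE ddl_log
( ddl_time  DATE,
  username  VARCHAR2(30),
  operation VARCHAR2(30),
  obj_owner VARCHAR2(30),
  obj_name  VARCHAR2(30) );

-- Fire after every DDL statement in the database and record
-- who did what to which object
CREATE OR REPLACE TRIGGER audit_ddl
AFTER DDL ON DATABASE
BEGIN
  INSERT INTO ddl_log
  VALUES ( SYSDATE,
           ora_login_user,
           ora_sysevent,
           ora_dict_obj_owner,
           ora_dict_obj_name );
END;
/
```

With that in place, correlating ddl_log timestamps against the invalidation spikes in the statspack snaps would identify the culprit.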
Statspack.snap is taking too long
RJ, October 23, 2006 - 7:52 am UTC
Hi Tom:
First time running statspack. I'm running it on 9.2.0.5.0 - 64bit Production running on AIX 5.2 . Is there anything I missed? I'm monitoring CPU and Disk usage on the OS side and both are low.
1 SELECT --p.spid,
2 s.sid, s.serial#, s.username,
3 --s.osuser, s.terminal, s.status, s.program,
4 TO_CHAR(SYSDATE, 'hh:mi:ss') CURRENT_TIME,
5 TO_CHAR(LOGON_TIME, 'hh:mi:ss') START_TIME,
6 T.SQL_TEXT
7 FROM v$session s,
8 V$SQL T,
9 v$process p
10 WHERE
11 S.SQL_ADDRESS = T.ADDRESS
12 AND s.username = 'PERFSTAT'
13* AND p.addr = s.paddr
sbc_admin@SBCBO> /
SID SERIAL# USERNAME CURRENT_ START_TI SQL_TEXT
---------- ---------- ---------- -------- -------- --------------------
593 862 PERFSTAT 10:30:02 09:29:17 BEGIN statspack.snap
; END;
Pga Advisor
Yoav, November 05, 2006 - 8:06 am UTC
Hi Tom,
Our PGA is currently 1500 MB.
From the statspack report it looks like even if i set
it to 12,000MB the improvement will be low.
Am I missing something here? Can you please suggest what to do about it?
DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
xxxxx xxxxx xxxxx 1 9.2.0.7.0 NO xxxxx
Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 5164 03-Nov-06 12:00:21 53 100.7
End Snap: 5165 03-Nov-06 13:00:12 49 110.2
Elapsed: 59.85 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 1,304M Std Block Size: 8K
Shared Pool Size: 152M Log Buffer: 1,600K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: -626,533.87 -70,308,848.13
Logical reads: 15,129.40 1,697,802.31
Block changes: 2,129.22 238,938.31
Physical reads: 1,257.50 141,114.53
Physical writes: 389.80 43,743.03
User calls: 85.22 9,563.72
Parses: 33.20 3,725.25
Hard parses: 5.18 581.44
Sorts: 0.65 73.03
Logons: 0.01 1.41
Executes: 34.21 3,839.13
Transactions: 0.01
% Blocks changed per Read: 14.07 Recursive Call %: 28.43
Rollback per transaction %: 3.13 Rows per Sort: 8111.32
Estd Extra Estd PGA Estd PGA
PGA Target Size W/A MB W/A MB Read/ Cache Overalloc
Est (MB) Factr Processed Written to Disk Hit % Count
---------- ------- ---------------- ---------------- -------- ----------
188 0.1 489,372.6 562,709.6 47.0 0
375 0.3 489,372.6 433,642.7 53.0 0
750 0.5 489,372.6 269,687.8 64.0 0
1,125 0.8 489,372.6 263,770.4 65.0 0
1,500 1.0 489,372.6 255,903.4 66.0 0
1,800 1.2 489,372.6 137,898.4 78.0 0
2,100 1.4 489,372.6 137,018.1 78.0 0
2,400 1.6 489,372.6 137,018.1 78.0 0
2,700 1.8 489,372.6 137,018.1 78.0 0
3,000 2.0 489,372.6 137,018.1 78.0 0
4,500 3.0 489,372.6 137,018.1 78.0 0
6,000 4.0 489,372.6 137,018.1 78.0 0
9,000 6.0 489,372.6 137,018.1 78.0 0
12,000 8.0 489,372.6 137,018.1 78.0 0
November 05, 2006 - 9:17 am UTC
I don't see anything that you would be looking to improve - know what I mean? What are you TRYING to fix.
All I see are a bunch of numbers that without any sort of context are rather "not meaningful"
SQL execute elapse time
Ramprasad, November 10, 2006 - 7:19 am UTC
Hi Tom,
- Should we subtract "DB CPU" or "SQL execute elapsed time" in order to determine the wait time? If the answer is "DB CPU" time, then the next question...
- Why do we see a larger "SQL execute elapsed time" value than "DB CPU"?
- What contributes to the difference between DB CPU time and SQL elapsed time?
STATSPACK REPORT 1
===================
Statistic Name Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time 123.3 84.2
DB CPU 113.5 77.5
PL/SQL execution elapsed time 35.8 24.5
connection management call elapsed time 23.8 16.3
parse time elapsed 13.1 8.9
hard parse elapsed time 7.5 5.1
sequence load elapsed time 0.7 .5
PL/SQL compilation elapsed time 0.2 .2
hard parse (sharing criteria) elapsed time 0.2 .1
repeated bind elapsed time 0.0 .0
failed parse elapsed time 0.0 .0
DB time 146.4 N/A
background elapsed time 11.0 N/A
background cpu time 4.8 N/A
STATSPACK REPORT 2
==================
Statistic Name Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time 46.5 93.1
DB CPU 37.6 75.3
parse time elapsed 10.9 21.8
hard parse elapsed time 9.3 18.7
PL/SQL execution elapsed time 7.5 15.1
connection management call elapsed time 4.6 9.2
PL/SQL compilation elapsed time 0.2 .4
hard parse (sharing criteria) elapsed time 0.1 .3
sequence load elapsed time 0.1 .2
repeated bind elapsed time 0.1 .2
hard parse (bind mismatch) elapsed time 0.0 .0
DB time 49.9 N/A
background elapsed time 22.0 N/A
background cpu time 11.0 N/A
STATSPACK REPORT 3
==================
Statistic Name Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time 218.0 98.2
DB CPU 133.6 60.2
parse time elapsed 18.9 8.5
PL/SQL execution elapsed time 18.6 8.4
hard parse elapsed time 16.5 7.4
connection management call elapsed time 4.9 2.2
RMAN cpu time (backup/restore) 2.5 1.1
repeated bind elapsed time 0.3 .1
PL/SQL compilation elapsed time 0.1 .1
sequence load elapsed time 0.1 .1
hard parse (sharing criteria) elapsed time 0.0 .0
inbound PL/SQL rpc elapsed time 0.0 .0
DB time 221.9 N/A
background elapsed time 77.6 N/A
background cpu time 18.8 N/A
Thanks,
Ram
November 10, 2006 - 9:04 am UTC
why subtract anything - wait times are reported for you?
when you see elapsed > cpu, that means you waited for something, sql statements can wait for thousands of things - from IO to enqueues. You need to trace the sql.
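Enabling a 10046 trace for the session is one common way to do that (a sketch; level 8 includes wait events, level 12 adds bind values as well):

```sql
-- turn extended sql trace on for this session
ALTER SESSION SET events '10046 trace name context forever, level 12';

-- ... run the suspect statement(s) here ...

-- turn it off again
ALTER SESSION SET events '10046 trace name context off';
```

The resulting trace file (in user_dump_dest) can then be formatted with tkprof, which breaks elapsed time down into cpu and the individual wait events.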
A reader, November 18, 2006 - 6:52 pm UTC
Hi Tom,
thank you for this very useful thread , one more question please
part of my statspack (60 min elapsed time) "Library Cache Activity " section
Get Pct Pin Pct Invali-
Namespace Requests Miss Requests Miss Reloads dations
--------------- ------------ ------ -------------- ------ ---------- --------
BODY 3,537 5.2 66,216 0.3 8 0
CLUSTER 1,338 0.7 3,228 0.4 4 0
INDEX 1,056 72.0 5,416 23.1 20 0
SQL AREA 4,610,706 0.5 19,780,520 0.4 22,680 5,111
TABLE/PROCEDURE 50,015 5.8 2,714,619 0.7 6,540 0
TRIGGER 284 6.3 213,468 0.0 11 0
-------------------------------------------------------------
as you can see, a huge number of invalidations for SQL AREA. i haven't gathered any statistics, no DDL has been run, plus no grant, revoke or similar commands :-(
1) all was just some select, update, insert commands. is there a reason for those numbers of invalidations?
2) could it be a small shared pool size?
thanks
November 20, 2006 - 2:32 am UTC
huge, where did you see huge - compared to the get/pin requests, it is quite teeny tiny.
are you running with a "full" shared pool (eg: is it large enough for your working set)
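A quick way to check how "full" the shared pool is running (a sketch against v$sgastat):

```sql
-- free memory remaining in the shared pool, in MB; a persistently tiny
-- value suggests the working set no longer fits
SELECT ROUND( bytes/1024/1024, 1 ) free_mb
  FROM v$sgastat
 WHERE pool = 'shared pool'
   AND name = 'free memory';
```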
DB cpu
Manoranjan Dash, November 20, 2006 - 6:22 am UTC
sql execute elapsed time 123.3 84.2
The DB cpu is a percentage. How can you subtract an actual value (say 123.3) from a percentage (84.2)?
Thanks
Mano
November 20, 2006 - 1:37 pm UTC
having no idea what exactly I'm looking at here, I will say you can do whatever you want to with those numbers?
A reader, November 20, 2006 - 3:51 pm UTC
Hi Tom,
Thank you for your time, and help in the previous thread of invalidation.
from statspack the shared pool usage was about 90%, while i have shared_pool_reserved_size at 10%
i have a soft to hard parse ratio of 98%. in your book you advise that there should be no hard parsing, and you said the usual reason is literal values. i checked the application code, plus i ran the script from your book that helps tell whether i am using bind variables or not, and it seems to me that everything is fine. so now i want to confirm whether i need a larger shared pool.
i guess the invalidations could be the reason for the hard parsing, but i don't know the reason for those invalidations. could it be the size of the shared pool?
are there any other reasons to mark sql as invalid?
i mean by huge that about 5000 shared statements have been invalidated. isn't that too much?
Thanks again
November 20, 2006 - 4:11 pm UTC
sounds like you are running your shared pool at full utilization, might be time to think about backing off the shared pool reserved size or making it larger over all.
in comparison to the number of "hits" 5,000 is rather small.
Redo statistic in statspack
A reader, November 20, 2006 - 9:00 pm UTC
Tom, assuming a statistic of:
Per Second Per Transaction
--------------- ---------------
Redo size: 2,770.53 723.63
for a statspack report that spanned 8 hours (normal business hours), assuming also no mass-update/insert or batch job ran, just *normal* OLTP work during the snaps, this is on a NOARCHIVELOG database, can we assume the following:
1) Our daily redo generated would be about (or close to): 2770*60*60*24 bytes.
2) Redo groups of 10m would switch every hour (2770*60*60).
I took the redo statistics from someone else's report here, not my own. I just want to make sure I understand the statistics on the statspack report correctly. In the future, I am trying to use this when changing a NOARCHIVELOG db to ARCHIVELOG mode.
Thanks!
November 22, 2006 - 2:57 pm UTC
#1, sure, maybe, umm, no. This is 8 hours. Unless your next 16 hours are identical, extrapolation like that is dangerous...
#2, absolutely no way you can come to that conclusion. An 8 hour statspack is sooooo long. You might generate 50% of that redo in 1 hour, and the other 50% in the remaining 7.
if this data is of any importance, you will IMMEDIATELY, RIGHT NOW, put it in archive log mode.
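The arithmetic in #1 and #2 can at least be checked numerically. Here is a quick sketch of the poster's calculation, under exactly the uniform-load assumption the answer warns against (redo rate taken from the load profile above):

```python
# Redo rate from the 8-hour statspack load profile above.
redo_per_sec = 2770.53  # bytes of redo generated per second

# 1) Naive daily extrapolation: only valid if the other 16 hours
#    behave exactly like the 8 measured hours.
daily_redo_bytes = redo_per_sec * 60 * 60 * 24
print(f"extrapolated daily redo: {daily_redo_bytes / 2**20:.1f} MB")

# 2) Log-switch estimate for 10 MB redo log groups, same caveat.
hourly_redo_bytes = redo_per_sec * 60 * 60
switches_per_hour = hourly_redo_bytes / (10 * 2**20)
print(f"estimated switches per hour: {switches_per_hour:.2f}")
```

This yields roughly one switch per hour on average, but as the answer says, redo generation is rarely uniform across a business day, so the per-hour figure is the weaker of the two estimates.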
Tom must be off this week
Jim, November 22, 2006 - 2:02 pm UTC
I noticed that the time stamps on the latest updates are all from Monday, and this is Wednesday afternoon. So Tom must have taken the rest of the week off. Have a good one Tom and hurry back!
Another question about statspack waits
Ajeet, December 12, 2006 - 3:58 am UTC
Hi Tom,
This thread was so useful. I have another question, about enqueue waits. One of my statspack reports shows the waits below;
I am just doing a cut & paste of the header section and the wait section.
It shows that the enqueue wait time was 24 minutes in a snap duration of 15 minutes; in fact this application (a set of C programs) runs for only 15 minutes.
Please advise.
STATSPACK report for
DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
xxxxx 1755775419 xxxx 1 9.2.0.8.0 NO spxxxx
Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 93 12-Dec-06 12:42:46 15 9.6
End Snap: 94 12-Dec-06 12:57:46 15 8.8
Elapsed: 15.00 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 2,048M Std Block Size: 8K
Shared Pool Size: 608M Log Buffer: 512K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 589,911.68 77,710.85
Logical reads: 9,254.84 1,219.17
Block changes: 2,097.61 276.32
Physical reads: 17.33 2.28
Physical writes: 48.35 6.37
User calls: 993.48 130.87
Parses: 443.32 58.40
Hard parses: 22.82 3.01
Sorts: 123.68 16.29
Logons: 0.00 0.00
Executes: 490.34 64.59
Transactions: 7.59
% Blocks changed per Read: 22.67 Recursive Call %: 3.79
Rollback per transaction %: 0.00 Rows per Sort: 0.96
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.88 Redo NoWait %: 100.00
Buffer Hit %: 99.82 In-memory Sort %: 100.00
Library Hit %: 98.39 Soft Parse %: 94.85
Execute to Parse %: 9.59 Latch Hit %: 99.96
Parse CPU to Parse Elapsd %: 91.42 % Non-Parse CPU: 95.21
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 8.55 8.84
% SQL with executions>1: 79.88 88.71
% Memory for SQL w/exec>1: 76.40 90.39
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
enqueue 12,948 1,468 75.40
CPU time 360 18.50
log file sync 6,842 37 1.93
log file parallel write 13,818 34 1.76
SQL*Net break/reset to client 40,990 15 .76
-------------------------------------------------------------
Wait Events for DB: FTFXEMC Instance: FTFXEMC Snaps: 93 -94
-> s - second
-> cs - centisecond - 100th of a second
-> ms - millisecond - 1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
enqueue 12,948 0 1,468 113 1.9
log file sync 6,842 0 37 5 1.0
log file parallel write 13,818 0 34 2 2.0
SQL*Net break/reset to clien 40,990 0 15 0 6.0
db file sequential read 11,465 0 14 1 1.7
db file parallel write 4,815 0 11 2 0.7
latch free 142 142 2 15 0.0
December 12, 2006 - 6:59 am UTC
if you had 24 people, each waiting one minute for an enqueue, you would have 24 minutes of wait.
It could take as little as one MINUTE for 24 people to accumulate 24 minutes of wait time.
Remember - waits are aggregated over all sessions - it is very easy for this aggregate wait time to exceed the wall clock time.
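The aggregation described here is plain arithmetic; a minimal sketch using hypothetical numbers matching the example in the answer:

```python
# 24 sessions, each waiting 60 seconds on an enqueue, all within one
# 15-minute (900 s) snapshot window.
sessions = 24
wait_per_session_s = 60
window_s = 15 * 60

aggregate_wait_s = sessions * wait_per_session_s
print(aggregate_wait_s / 60)        # minutes of aggregate wait
print(aggregate_wait_s > window_s)  # aggregate exceeds the wall-clock window
```

Because the statistic sums wait time across all sessions, nothing stops it from exceeding the elapsed time of the report.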
Thanks !!
Ajeet, December 13, 2006 - 12:40 am UTC
Hi Tom,
Where can I read about how all these waits work? The reference manual has only descriptions of the waits.
Please advise.
Thanks
December 13, 2006 - 7:45 am UTC
that is what you need - the description
waits "work" by - well - waiting. Not sure what you mean by "how all these waits work"
avoiding wait
Kalpesh, January 11, 2007 - 6:20 pm UTC
Hi Tom,
what is your suggestion over here for improvement?
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read 500,208 319,282 74.75
direct path read 436,083 24,899 5.83
latch free 108,822 23,930 5.60
log file sync 273,061 12,691 2.97
direct path write 225,965 7,837 1.83
Total time = 18,000 seconds
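One way to read that table (noting the times are in centiseconds): divide total wait time by the number of waits to get the average latency per event. A sketch using the "db file sequential read" row above:

```python
# "db file sequential read" row from the Top 5 Wait Events above.
waits = 500_208
time_cs = 319_282  # centiseconds

avg_wait_ms = time_cs * 10 / waits  # 1 cs = 10 ms
total_wait_s = time_cs / 100

print(f"average single-block read: {avg_wait_ms:.1f} ms")
print(f"total wait: {total_wait_s:.0f} s of the 18,000 s elapsed")
```

At roughly 6 ms per read the individual IOs look like ordinary disk latency, which suggests the volume of reads (i.e., the SQL driving them) is a more promising target than the IO subsystem itself.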
HS message to agent
Raks, April 04, 2007 - 9:29 am UTC
In my STATSPACK report I'm seeing "HS message to agent" in the Top 5 Timed Events.
What does this indicate?
I searched on the net and found that it means heterogeneous services.
Can you tell me what that means?
April 04, 2007 - 11:00 am UTC
you are using the heterogeneous service - to access another foreign database via ODBC for example. It is the server communicating with the HS.
Some more questions on STATSPACK
Gautam, April 20, 2007 - 9:20 am UTC
Hi Tom,
I am a regular reader of asktom and have been trying to learn from your immensely valuable advice.
May I request you to take a look at the statspack report (partially uploaded) and my queries below.
Release : 9.2.0.7.0
Snap Id Snap Time Sessions Curs/Sess Comment
--------- ------------------ -------- --------- -------------------
Begin Snap: 35 20-Apr-07 15:11:33 986 45.9
End Snap: 36 20-Apr-07 15:53:51 994 46.5
Elapsed: 42.30 (mins)
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 6,224M Std Block Size: 8K
Shared Pool Size: 1,024M Log Buffer: 384K
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 963,405.82 1,730.36
Logical reads: 15,981.59 28.70
Block changes: 5,401.55 9.70
Physical reads: 3,044.36 5.47
Physical writes: 266.49 0.48
User calls: 169.10 0.30
Parses: 98.75 0.18
Hard parses: 1.80 0.00
Sorts: 37.46 0.07
Logons: 1.71 0.00
Executes: 677.93 1.22
Transactions: 556.77
% Blocks changed per Read: 33.80 Recursive Call %: 93.61
Rollback per transaction %: 0.04 Rows per Sort: 19.49
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.34 Redo NoWait %: 100.00
Buffer Hit %: 85.68 In-memory Sort %: 100.00
Library Hit %: 99.68 Soft Parse %: 98.18
Execute to Parse %: 85.43 Latch Hit %: 99.53
Parse CPU to Parse Elapsd %: 7.17 % Non-Parse CPU: 98.42
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 86.23 91.18
% SQL with executions>1: 35.19 33.60
% Memory for SQL w/exec>1: 39.29 36.69
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
free buffer waits 185,516 179,261 62.28
db file sequential read 1,245,100 36,847 12.80
enqueue 8,265 19,098 6.64
buffer busy waits 268,106 13,033 4.53
db file scattered read 689,609 9,675 3.36
-------------------------------------------------------------
Background Wait Events for DB: UNIPROD Instance: uniprod Snaps: 35 -36
-> ordered by wait time desc, waits desc (idle events last)
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file parallel write 1,356 0 3,511 2589 0.0
wait for unread message on b 2,482 2,482 2,443 984 0.0
buffer busy waits 3,965 1,291 1,458 368 0.0
log file parallel write 113,560 0 771 7 0.1
async disk IO 4,082 0 117 29 0.0
control file parallel write 937 0 54 57 0.0
log buffer space 1,645 1 48 29 0.0
direct path write 572 0 48 83 0.0
log file sequential read 2,060 0 43 21 0.0
free buffer waits 54 37 40 745 0.0
process startup 132 1 10 79 0.0
db file sequential read 148 0 9 63 0.0
db file scattered read 168 0 9 55 0.0
latch free 455 61 3 7 0.0
direct path read 2,280 0 3 1 0.0
rdbms ipc reply 330 0 1 5 0.0
enqueue 1 0 1 1286 0.0
control file sequential read 1,197 0 1 1 0.0
LGWR wait for redo copy 4,672 4 1 0 0.0
log file single write 8 0 0 7 0.0
rdbms ipc message 12,300 1,420 9,500 772 0.0
pmon timer 856 851 2,421 2828 0.0
smon timer 4,952 0 893 180 0.0
-------------------------------------------------------------
Buffer wait Statistics for DB: UNIPROD Instance: uniprod Snaps: 35 -36
-> ordered by wait time desc, waits desc
Tot Wait Avg
Class Waits Time (s) Time (ms)
------------------ ----------- ---------- ---------
data block 263,032 11,818 45
undo header 4,464 1,497 335
undo block 133 20 153
1st level bmb 336 0 1
segment header 6 0 27
-------------------------------------------------------------
Query #1
Parse CPU to Parse Elapsd %: 7.17 -- Is this something to be concerned about? Can you please explain its implication?
Query #2
The "free buffer waits" and "db file sequential read" events are the top two timed events. Do they indicate a DBWR performance problem, or a physical I/O performance issue on the server machine? How do I determine which?
Query #3
I have the following two setting in the parameter file -
db_writer_processes 4
disk_asynch_io TRUE
How do I find out whether async I/O is supported on my OS platform? If it is supported, do I need more than one DBWR?
Thanking you in anticipation. Please let me know if other parts of the statspack report are needed for you to answer (if you consider the queries worth answering at all).
Have a nice weekend Tom.
April 20, 2007 - 10:33 am UTC
42.30 (mins) - that is about 3x too long....
1) it depends. how much time did you spend parsing.
2) so, each session on average had 3 minutes of wait in 42 minutes for free buffers. free buffer waits are typically associated with dbwr not being able to keep the cache "clean" - the db file sequential read are not related to dbwr at all. That is physical IO performed by the user session.
3) Utilize support to investigate whether you have async io or not.
System-wide Wait Analysis
Mark, April 20, 2007 - 6:22 pm UTC
Hello Tom,
I ran this query to get the System-wide Wait Analysis for current wait events.
select
event c1,
total_waits c2,
time_waited / 100 c3,
total_timeouts c4,
average_wait /100 c5
from
sys.v_$system_event
where
event not in (
'dispatcher timer',
'lock element cleanup',
'Null event',
'parallel query dequeue wait',
'parallel query idle wait - Slaves',
'pipe get',
'PL/SQL lock timer',
'pmon timer',
'rdbms ipc message',
'slave wait',
'smon timer',
'SQL*Net break/reset to client',
'SQL*Net message from client',
'SQL*Net message to client',
'SQL*Net more data to client',
'virtual circuit status',
'WMON goes to sleep'
)
AND event not like 'DFS%'
and event not like '%done%'
and event not like '%Idle%'
AND event not like 'KXFX%'
order by c2 desc;
Average
Event Total Seconds Total Wait
Name Waits Waiting Timeouts (in secs)
------------------------------ ------------ -------- ------------ ---------
db file sequential read 84,536,189 7,922 0 .000
db file scattered read 2,833,336 6,228 0 .000
SQL*Net message to dblink 238,284 0 0 .000
SQL*Net message from dblink 238,276 6,078 0 .030
SQL*Net more data from dblink 144,072 71 0 .000
log file parallel write 88,546 62 0 .000
log file sync 36,151 45 0 .000
jobq slave wait 27,582 79,869 26,415 2.900
control file parallel write 27,122 38 0 .000
control file sequential read 20,429 6 0 .000
db file parallel write 17,511 139 0 .010
buffer busy waits 10,458 137 0 .010
SQL*Net more data from client 6,061 145 0 .020
direct path read (lob) 3,256 7 0 .000
direct path write 2,420 53 0 .020
direct path read 2,139 0 0 .000
process startup 1,146 61 0 .050
SQL*Net more data to dblink 412 0 0 .000
LGWR wait for redo copy 328 0 0 .000
async disk IO 293 0 0 .000
rdbms ipc reply 182 3 1 .020
direct path write (lob) 165 0 0 .000
log file sequential read 160 8 0 .050
single-task message 118 60 0 .510
latch free 113 0 100 .000
undo segment extension 65 0 65 .000
SQL*Net break/reset to dblink 44 0 0 .000
log file single write 9 0 0 .000
enqueue 5 0 0 .000
db file parallel read 4 0 0 .010
free buffer waits 3 1 0 .240
library cache pin 3 0 0 .000
slave TJ process wait 3 0 3 .020
library cache load lock 3 0 0 .000
log file switch completion 2 0 0 .110
instance state change 2 0 0 .000
reliable message 1 0 0 .000
sort segment request 1 1 1 .980
refresh controlfile command 1 0 0 .030
control file heartbeat 1 4 1 3.910
40 rows selected.
How is "SQL*Net more data from dblink" different from "SQL*Net message from dblink"?
Event "jobq slave wait" shows timeouts; does that mean some job is waiting or failing?
What does a timeout on "undo segment extension" mean?
Can you tell me what "slave TJ process wait" is?
April 20, 2007 - 8:08 pm UTC
100% miss in SQL Area of Library Cache Activity
Sam, March 05, 2008 - 5:09 pm UTC
Hi Tom,
In a 15 min AWR in the Library Cache Activity Section (10gR2) I see the pct miss in the SQL Area is 100%.
What is the cause / significance of this ?
The application has SQL that is properly bound.
If this is insufficient information, please let me know and I will post more sections of the AWR.
Thanks
Sam
Library Cache Activity
* "Pct Misses" should be very low
Namespace Get Requests Pct Miss Pin Requests Pct Miss Reloads Invalidations
BODY 374 0.00 87,444 0.00 0 0
INDEX 2 0.00 20 0.00 0 0
SQL AREA 251 100.00 43,515,998 -0.01 0 0
TABLE/PROCEDURE 105 0.00 766,595 -0.00 0 0
TRIGGER 7 0.00 202 0.00 0 0
March 06, 2008 - 8:07 am UTC
251 is a small number, it was probably incurred entirely by AWR itself with some sql that wasn't in the shared pool yet - until you ran the report.
251 is very very very *small*
Consumed resources
Marat, May 22, 2008 - 3:08 pm UTC
Dear Tom,
I'm going to use statspack to take snapshots and then compare them. But our DBA says that running statspack.snap itself consumes a lot of resources, so he does not suggest using it.
Is he right?
Thank you.
May 23, 2008 - 8:11 am UTC
statspack will affect performance in the database overall by NEGATIVE 25%
chew on that for a minute, negative 25%
meaning, the information you get from statspack will enable you to monitor, tune, and diagnose issues you would not otherwise be able to. Enabling you to fix things, make them better, improve stuff.
the only time statspack hits the database is during a snap call - that is it, it is not "on all of the time", only when you snap. Snap every 30 minutes and I'll bet you no one notices in most all cases.
NEGATIVE 25% ?
Marat, May 26, 2008 - 6:49 am UTC
Do you mean that overall server resource consumption is increased by 25% while taking a snap?
May 27, 2008 - 8:01 am UTC
nope, I said NEGATIVE 25% - think about it.
Overall the effect of statspack would be negative 25% - meaning, that without using it - you are running slower than you should, because you have no idea what is going on.
If you have the information provided by statspack, you can make informed decisions regarding your instance and overall performance will improve.
More high and strange values in Statspack?
Lasse Jenssen, October 29, 2008 - 9:12 am UTC
Earlier in this discussion you explained why you can see higher values for CPU seconds than are actually available in the report time span (regarding "Top 5 Timed Events"). In one of my latest reviews of a Statspack report I found some strange numbers (see below). I also see some strange numbers in v$sql (see below). I wonder if this has the same explanation as the one you gave earlier (queries running over multiple snapshots)?
Statspack: 15 minute snapshot 4 x dual cpu:
=============================================
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 4,172,389 54,908 91.53
db file scattered read 43,882 3,692 6.16
CPU time 1,304 2.17
latch free 19,916 52 .09
db file parallel write 1,925 9 .01
SQL ordered by Gets for DB: <ORCL> Instance: <ORCL>
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
8,820,464 2 4,410,232.0 28.8 383.89 397.70 950552572
Module: ORACLE.EXE
SELECT "T1"."PRODUKT_NR","T1"."PRODUKTNAVN",COUNT(*) FROM "ADMIN"."TABLE1" "A2","ADMIN"."TABLE2" "A1" WHERE "A2"."FK_SOMETHING"="A1"."PK_SOMETHING" AND "A2"."CREATED"='15.01.200
8' GROUP BY "A1"."PRODUKT_NR","A1"."PRODUKTNAVN" ORDER BY "A1"."PRODUKT_NR"
SQL ordered by Reads for DB: <ORCL> Instance: <ORCL>
CPU Elapsd
Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
2,505,722 2 1,252,861.0 52.3 383.89 397.70 950552572
Module: ORACLE.EXE
SELECT "T1"."PRODUKT_NR","T1"."PRODUKTNAVN",COUNT(*) FROM "ADMIN"."TABLE1" "A2","ADMIN"."TABLE2" "A1" WHERE "A2"."FK_SOMETHING"="A1"."PK_SOMETHING" AND "A2"."CREATED"='15.01.200
8' GROUP BY "A1"."PRODUKT_NR","A1"."PRODUKTNAVN" ORDER BY "A1"."PRODUKT_NR"
Question: I find it strange that the "Time (s)" for a query accounting for 28% of all buffer gets and 52% of all physical reads shows an "Elapsed Time (s)" of only 397.70 seconds, when the "Time (s)" for "db file sequential read" is 54,908 seconds.
Querying V$SQL
=========================================================
SQL> r
1 select elapsed_time/1000 sec_tot,(ELAPSED_TIME/(0.01+executions))/1000 sec_pr_exec,executions,sql_text
2 from v$sql
3 where (ELAPSED_TIME/(0.01+executions))/1000> 3600
4* order by 2
SEC_TOT SEC_PR_EXEC EXECUTIONS SQL_TEXT
-------------- -------------------- ---------- --------------------------------------------------
9596 9501 1 select MI.FK_LJ_KONTONR KONTONR ...
9990 9891 1 select MI.FK_LJ_KONTONR KONTONR ...
25878 12875 2 SELECT COUNT(*) FROM "ADMIN"."LJ_KUND" "A1"
53156 13256 4 Select distinct T1.KONTONR,T2.INNENFOR_RAMME ...
13648 13513 1 Select T1.FRA_TID FROM_TIME,T2.LJ_PLAN PLAN, Decode(...
13686 13550 1 SELECT OBJ.OBJEKT_NR, OBJ.KOPI_NR, OBJ.UNDER_TYPE...
29084 28796 1 select MI.FK_LJ_KONTONR KONTONR ...
90620 89723 1 select count(*) from lj_kund e, ...utredni)
1196565159062 3793513 315424 select obj#,type#,ctime,mtime,stime,status,dataobj
#,flags,oid$, spare1, spare2 from obj$ where owner
#=:1 and name=:2 and namespace=:3 and remoteowner
is null and linkname is null and subname is null
1196565078699 26977608 44354 select pos#,intcol#,col#,spare1,bo#,spare2 from ic
ol$ where obj#=:1
1196565105734 32471229 36850 SELECT DISTINCT t1.gyldig_fra_dato, t1.gyldig_til_...
2393130114843 42699384 56046 Select t1.GRUPPE_IND From LJ_FAST A, LJ_BAR B, ...
1196565047803 43145877 27733 SELECT COUNT(*) FROM LJ_KONT WHERE KONTONR = :1
1196552851422 47713230 25078 select i.obj#,i.ts#,i.file#,i.block#,i.intcols,i.t ...
2393121687718 117910943 20296 select decode(max(lj_auto.tilgang),null,0,max(...
2393121917953 170014224 14076 select ts#,file#,block#,nvl(bobj#,0),nvl(tab#,0), ...
1196565130105 1549934755 772 SELECT T1.ANDRE_NR, T2.INST, T2.TYPE
1196565032934 5488578657 218 Select Count(*) from LJ_PAKK where FK_LJ_KUNDKUNDEID = :1
1196565079164 70344801832 17 Select /*+ INDEX (LJ_KUNDE LJKUNDE) */ T1.FORNAVN ...
1196565191329 595306065338 2 SELECT LJ_HIST_SEQ.NEXTVAL FROM DUAL
Question: I find these numbers VERY hard to believe. Can this be for real??
October 29, 2008 - 1:25 pm UTC
...
Question: I find it strange that the "Time (s)" for a query accounting for 28% of all buffer gets and 52% of all physical reads shows an "Elapsed Time (s)" of only 397.70 seconds, when the "Time (s)" for "db file sequential read" is 54,908 seconds.
.....
why? You are assuming all of its time is spent on IO. Looking at the CPU and the ELAPSED, I would assume it spends the vast majority of its time munching the data after the IOs (and remember, IOs from the buffer cache consume a HUGE amount of CPU).
cpu time: 383.89
ela time: 397.70
------
diff ~ 14 seconds of wait, probably on IO
the other 383.89 seconds - spent on the CPU doing stuff.
This is for real, they do not look at all 'strange' to me.
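The subtraction in the answer, as a one-line sanity check (figures from the report above):

```python
# CPU and elapsed time for the big query, from the report above.
cpu_s = 383.89
elapsed_s = 397.70

# Time not spent on CPU is time spent waiting, mostly on IO here.
wait_s = elapsed_s - cpu_s
print(f"~{wait_s:.1f} s of wait vs {cpu_s} s on CPU")
```

That is, only about 14 of the 398 elapsed seconds are attributable to waits; the rest is CPU work on the buffers already read.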
Still high and strange values in statspack?
Lasse Jenssen, October 30, 2008 - 9:16 am UTC
Thanks for your reply.
Yes, I did see the CPU_TIME, and I've read (and seen) plenty of times that logical reads are CPU-expensive.
But as I see it, your analysis actually strengthens my question. I'm looking at the value for "db file sequential read" in "Top 5 Timed Events": it says 54,908 seconds. How can that be, when my toughest query in terms of physical reads has an "Elapsed Time" of about 400 seconds, and only around 14 of those seconds are related to IO (db file sequential read and db file scattered read)?
October 30, 2008 - 9:54 am UTC
Well, I haven't tested it yet, but there could definitely be CPU dumping for the SQL statements (they could have started before the snapshot). Other copies of it could have started running and are still running after the end snapshot.
I'll have to play (when I get a chance, getting on a plane shortly, then getting on another plane and will be out of country for a week) with some long running queries that cross snaps to see when exactly the elapsed, cpu and IO counts are updated and how that impacts a statspack.
but - in short, your average IO wait is 0.013 - which sounds about right. You've identified the big query - the biggest bang for the buck. I'd be running with that information and seeing if those queries could not be "tuned" somewhat.
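The 0.013 figure can be reproduced from the Top 5 table earlier in this post: total "db file sequential read" time divided by the number of waits. A sketch:

```python
# "db file sequential read" from the Top 5 Timed Events above.
waits = 4_172_389
total_time_s = 54_908

avg_io_wait_s = total_time_s / waits
print(f"average single-block IO wait: {avg_io_wait_s:.3f} s")
```

About 13 ms per single-block read, which is why the answer calls it "about right" for disk-based IO of that era.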
It seems hard to imagine that a query this simple:
SELECT "T1"."PRODUKT_NR","T1"."PRODUKTNAVN",COUNT(*) FROM "ADMIN"."TABLE1" "A2","ADMIN"."TABLE2"
"A1" WHERE "A2"."FK_SOMETHING"="A1"."PK_SOMETHING" AND "A2"."CREATED"='15.01.200
8' GROUP BY "A1"."PRODUKT_NR","A1"."PRODUKTNAVN" ORDER BY "A1"."PRODUKT_NR"
would need that much IO in the first place and you know, everytime I see a query with a count(*) like that - I get immediately suspicious, my inclination is to not to say "how to make faster" but rather "why the HECK are we running it, what is the point"
and 999 times out of 1000 - it is pointless - it is someone asking if there is work to be done and then doing it - i would just say "do it"
CPU dumping ...
Lasse Jenssen, October 31, 2008 - 6:43 am UTC
Thanks again ... Have a nice and safe journey :-)
I'll definitely discuss the use of some of the queries (which I find in the reports) with the developers.
I'm not sure what you mean by CPU dumping here (but if you're going to test and follow up on this, my head might clear up).
Another thing: how in the world did you figure out my IO wait from these numbers?
If you have time, I'd really like to hear how you tested the behavior of the Statspack compared to when elapsed, cpu and IO are updated.
Another finding in "same" report ..
Lasse Jenssen, October 31, 2008 - 8:18 am UTC
I found this in another report with somewhat similar bad values in "Top 5 Timed Events".
Look at the "Elap per Exec (s)" for the highlighted query. I would imagine this is a recursive query. So this looks like a bug, or what?
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 913,979 43,567 53.61
db file scattered read 175,289 36,446 44.85
CPU time 1,096 1.35
latch free 22,126 86 .11
db file parallel write 1,420 21 .03
CPU Elapsd
Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
5,839 1 5,839.0 0.2 0.59 0.63 1064920033
Module: SQL*Plus
SELECT LOGID,LOGTIME,LOGTYPE, APPLNAME,USERID, NODEID, HOSTADR,
SEVERITY, MSGID,MSGTEXT FROM AS_LOGG WHERE LOGTIME > (select (SY
SDATE - (SUM(0.00074 * 10))) from dual) ORDER BY LOGTIME ASC
...
2,288 2,040 1.1 0.1 1.76 ######### 1480155015
select i.obj#,i.ts#,i.file#,i.block#,i.intcols,i.type#,i.flags,
i.property,i.pctfree$,i.initrans,i.maxtrans,i.blevel,i.leafcnt,i
.distkey, i.lblkkey,i.dblkkey,i.clufac,i.cols,i.analyzetime,i.sa
mplesize,i.dataobj#, nvl(i.degree,1),nvl(i.instances,1),i.rowcnt
,mod(i.pctthres$,256),i.indmethod#,i.trunccnt,nvl(c.unicols,0),n
...
1,661 59 28.2 0.0 0.47 ######### 2826862878
Module: JDBC Thin Client
Select A.STATUS <Removed> ...
<b> 870 24,806 0.0 0.0 5.24 ######### 2703824309
select obj#,type#,ctime,mtime,stime,status,dataobj#,flags,oid$,
spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespa
ce=:3 and remoteowner is null and linkname is null and subname i
s null</b> -------------------------------------------------------------
SQL ordered by Executions for DB: FD Instance: fd Snaps: 19217 -19227
-> End Executions Threshold: 100
CPU per Elap per
Executions Rows Processed Rows per Exec Exec (s) Exec (s) Hash Value
------------ --------------- ---------------- ----------- ---------- ----------
<b> 24,806 856 0.0 0.00 48239.95 2703824309
select obj#,type#,ctime,mtime,stime,status,dataobj#,flags,oid$,
spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespa
ce=:3 and remoteowner is null and linkname is null and subname i
s null</b>
November 02, 2008 - 4:47 pm UTC
yes, something doesn't look right - make sure the report was done for valid snaps (no database restarts between)
most likely culprit would be a 32bit integer rolling negative or back over positive.
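The kind of rollover suspected here can be illustrated with a toy 32-bit counter (hypothetical; this is not a claim about how Oracle stores the statistic internally):

```python
# Simulate a cumulative timing value stored in a signed 32-bit integer.
INT32_MAX = 2**31 - 1

def wrap32(value):
    """Fold an arbitrary integer into signed 32-bit range, as C overflow would."""
    return (value + 2**31) % 2**32 - 2**31

# One unit past the maximum and the counter goes negative:
print(wrap32(INT32_MAX + 1))          # -2147483648
# Keep accumulating and it comes back around through zero:
print(wrap32(INT32_MAX + 1 + 2**31))  # 0
```

A counter that has wrapped like this (negative, or rolled back over positive) produces exactly the kind of absurd per-execution times shown in the report above.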