Thanks for the question, Ardiel.
Asked: November 06, 2019 - 3:49 pm UTC
Last updated: November 25, 2019 - 12:26 pm UTC
Version: 12.2.01
You Asked
Hello Tom
I am a member of a site reliability engineering (SRE) team, and we are trying to develop SRE "golden signals" for an Oracle 11g/12c database.
These signals are:
1) Throughput
2) Latency
3) Response Time
4) Error rate (not sure about this one)
How do I derive these measurements using SQL queries in an Oracle database?
My research has highlighted the following SQL queries:
1) Response time
-------------
select to_char(begin_time,'hh24:mi') time,
       round(value * 10, 2) "Response Time (ms)"
from   v$sysmetric
where  metric_name = 'SQL Service Response Time';
2) Throughput
----------
select a.begin_time, a.end_time,
       round(((a.value + b.value) / 131072), 2) "GB per sec"
from   v$sysmetric a, v$sysmetric b
where  a.metric_name = 'Logical Reads Per Sec'
and    b.metric_name = 'Physical Reads Direct Per Sec'
and    a.begin_time  = b.begin_time;
I'm not sure how to measure latency.
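One possible starting point for latency - a sketch only, assuming the metric name below is available in this database version's v$sysmetric - is the average single-block read latency that the metric views report:

```sql
-- Sketch: average synchronous single-block read latency, in milliseconds.
-- Assumes the 'Average Synchronous Single-Block Read Latency' metric
-- exists in this release of v$sysmetric (it is present in recent versions).
select to_char(begin_time, 'hh24:mi') time,
       round(value, 2) "Single-Block Read Latency (ms)"
from   v$sysmetric
where  metric_name = 'Average Synchronous Single-Block Read Latency';
```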
and Connor said...
I'll be honest - I'm not a fan of these kinds of "signals", at least at the macro level.
For example, if I run 10 SQL statements with the following response times:
1ms
1ms
1ms
10secs
1ms
1ms
1ms
1ms
1ms
1ms
then my average response time is about 1 second: (9 x 1 ms + 10,000 ms) / 10 = 1,000.9 ms.
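The arithmetic can be sketched in SQL - the CTE below just generates hypothetical data matching the ten timings above, to show what the average hides compared with the median and the maximum:

```sql
-- Ten hypothetical response times: nine at 1 ms, one outlier at 10,000 ms.
with samples (ms) as (
  select case when rownum = 4 then 10000 else 1 end
  from   dual
  connect by rownum <= 10
)
select round(avg(ms), 1) avg_ms,     -- 1000.9: looks like "about 1 second"
       median(ms)        median_ms,  -- 1: what most users actually experienced
       max(ms)           max_ms      -- 10000: the outlier the average buries
from   samples;
```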
1 second might be deemed "OK", but we've lost the fact that somewhere in there someone was stuck for 10 seconds, waiting at their browser.
Rather than averaged signals, look out for outliers that exceed your thresholds - and even those thresholds should be defined by business requirements. For example, the 10 seconds above is a disaster in a web browser, but 1,000 seconds might be totally fine for a nightly batch job.
And a good way of doing that is with AWR reports or, if you are not licensed for the Diagnostics Pack, with Statspack, because they capture information from views like V$EVENT_HISTOGRAM to give you a better picture of the distribution of the data.
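For reference, this is a sketch of the kind of distribution V$EVENT_HISTOGRAM exposes directly - the event name here is just an example:

```sql
-- Sketch: wait-time distribution for a single wait event.
-- WAIT_TIME_MILLI is the upper bound of each histogram bucket;
-- WAIT_COUNT is how many waits fell into that bucket.
select wait_time_milli, wait_count
from   v$event_histogram
where  event = 'db file sequential read'   -- example event name
order  by wait_time_milli;
```

A long tail in the higher buckets shows the outliers that an average response-time metric would smooth away.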