Thanks for the question, Ardiel.
Asked: November 06, 2019 - 3:49 pm UTC
Answered by: Connor McDonald - Last updated: November 25, 2019 - 12:26 pm UTC
Category: Database Administration - Version: 12.2.01
Viewed 100+ times
I am a member of a site reliability engineering (SRE) team and we are trying to develop SRE "golden signals" for an Oracle 11g/12c database.
These signals are:
3) Response Time
4) Error rate (not sure about this one)
How do I derive these measurements using SQL queries in an Oracle database?
My research has highlighted the following SQL queries:
1) Response time

select to_char(begin_time,'hh24:mi') time,
       round(value * 10, 2) "Response Time (ms)"
from   v$sysmetric
where  metric_name = 'SQL Service Response Time';
2) Throughput

select a.begin_time, a.end_time,
       round(((a.value + b.value) / 131072), 2) "GB per sec"
from   v$sysmetric a,
       v$sysmetric b
where  a.metric_name = 'Logical Reads Per Sec'
and    b.metric_name = 'Physical Reads Direct Per Sec'
and    a.begin_time  = b.begin_time;

(The divisor 131072 assumes an 8K block size: 1 GB / 8 KB = 131,072 blocks.)
I'm not sure how to measure latency, though.
and we said...
I'll be honest - I'm not a fan of these kinds of "signals", at least at the macro level.
For example, if I run 10 SQL statements where nine complete almost instantly and one takes 10 seconds, then my average response time is 1 second.
1 second might be deemed "OK", but we've lost the fact that somewhere in there someone was stuck for 10 seconds waiting on their browser.
Rather than averaged signals, look out for outliers that exceed your thresholds - and even those thresholds should be defined by business requirements. For example, the 10 seconds above is a disaster on a web browser, but 1,000 seconds might be totally fine for a nightly batch job.
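As a sketch of why the average can mislead, here is a hypothetical query against a made-up logging table (APP_RESPONSE_LOG with an ELAPSED_SECS column - both names are invented for illustration, not a real Oracle view) that reports the average alongside the 99th percentile and a count of statements breaching a 5-second threshold:

-- APP_RESPONSE_LOG(sql_id, elapsed_secs) is a hypothetical table for illustration
select round(avg(elapsed_secs), 2) avg_secs,
       round(percentile_cont(0.99)
               within group (order by elapsed_secs), 2) p99_secs,
       count(case when elapsed_secs > 5 then 1 end) over_threshold
from   app_response_log;

In the 10-statement example above, AVG_SECS would report a comfortable 1 second while P99_SECS and OVER_THRESHOLD would expose the 10-second outlier.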
And a good way of doing that is with AWR reports or, if you are not licensed for those, things like Statspack, because they capture information from views like V$EVENT_HISTOGRAM to give you a better picture of the distribution of your data.
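To give a feel for what V$EVENT_HISTOGRAM holds, here is a simple query against it (the columns are the documented ones; the event name is just a common example) showing how waits for an event are bucketed by duration rather than collapsed into a single average:

select event,
       wait_time_milli,   -- bucket upper bound in ms: 1, 2, 4, 8, ...
       wait_count         -- number of waits that fell in this bucket
from   v$event_histogram
where  event = 'db file sequential read'
order  by wait_time_milli;

A long tail in the higher WAIT_TIME_MILLI buckets is exactly the kind of outlier an average response-time metric would hide.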