capacity planning
Parag J Patankar, October 17, 2005 - 10:35 am UTC
Hi Tom,
Will you please explain the principles we have to keep in mind while creating a new database? That is, how do we do capacity planning effectively? Do you use any tools for this?
regards & thanks
pjp
October 17, 2005 - 11:11 am UTC
You need to get the requirements first and foremost. Before you can "plan capacity", you need a solid understanding of exactly what you are planning for.
Isolating Application Utilization from a database
Anirban, May 31, 2010 - 10:57 am UTC
Hi,
We have a three-node 10.1.0.5 RAC, say ORARAC. We would like to migrate one application (say X) from ORARAC into a separate database. We would like to know: what percentage of ORARAC's CPU is being used by application X? We have thought of gathering stats from gv$sesstat for the schemas belonging to application X and subtracting that from the rest, but any other ideas would greatly help us.
Thank you.
Regards, anirban
June 08, 2010 - 7:50 am UTC
Well, it is a trick question. The less you run on a machine, the less CPU it will use (eg: if you run a query in isolation - by yourself, nothing else going on in the database - and you find it uses X units of CPU, then with two users doing work at the same time you'll find it uses X+Y, where Y is greater than zero units of CPU). The more concurrent workload you have, the larger Y will be.
But anyway.
If you can somehow identify the SQL and PL/SQL used by that application (eg: if that application always sets the action and module columns in v$session, which then appear in v$sql...), you could add that up - see the sketch below.
Your approach of using gv$sesstat will only work if you collect the statistics before the sessions log out - session statistics disappear from v$sesstat when a session ends.
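A minimal sketch of that approach, assuming the application reliably sets MODULE - the module name 'X' below is a placeholder for whatever your application actually sets:
-- CPU consumed by SQL tagged with application X's module, as a percentage of
-- the CPU consumed by all statements still cached in the shared pool
-- (cpu_time is in microseconds). Statements that have aged out of the
-- cache are not counted, so treat this as an approximation.
SELECT ROUND (100 * SUM (CASE WHEN module = 'X' THEN cpu_time ELSE 0 END)
            / GREATEST (SUM (cpu_time), 1), 2) pct_cpu_app_x
  FROM gv$sql;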
Isolating Application Utilization
Anirban, June 03, 2010 - 4:05 am UTC
Hi Tom,
To add to my last post, I am using the following query to get the % of CPU used by the X sessions (the application): (total CPU used by X sessions / total CPU used by all sessions) * 100.
Does it look correct to you please?
Just to clarify, the query returns 45% at 11 AM. I believe that if Oracle is using P% of the total CPU, then 45% of that P% is used by the X sessions and the remaining 55% by all other sessions.
-- CPU used by each application-X session (usernames like '%TEST%'), as a
-- percentage of the CPU used by all currently connected sessions. The
-- inst_id joins matter on RAC: without them the gv$ views cross-match
-- rows from different instances and inflate the sums.
select sysdate, sum(PERCENTAGE_BAMUSER_CPU) from
(
SELECT SID, username, total_bamuser_cpu,
       ROUND (100 * total_bamuser_cpu / GREATEST (total_cpu, 1), 2) PERCENTAGE_BAMUSER_CPU
  FROM (SELECT b.inst_id, b.SID SID, NVL (b.username, p.NAME) username,
               SUM (a.VALUE) total_bamuser_cpu
          FROM gv$statname c,
               gv$sesstat a,
               gv$session b,
               gv$bgprocess p
         WHERE a.inst_id = c.inst_id
           AND a.statistic# = c.statistic#
           AND p.inst_id(+) = b.inst_id
           AND p.paddr(+) = b.paddr
           AND b.inst_id = a.inst_id
           AND b.SID = a.SID
           AND c.NAME = 'CPU used by this session'
           AND b.username LIKE '%TEST%'
         GROUP BY b.inst_id, b.SID, NVL (b.username, p.NAME)),
       (SELECT SUM (a.VALUE) total_cpu
          FROM gv$statname c, gv$sesstat a
         WHERE a.inst_id = c.inst_id
           AND a.statistic# = c.statistic#
           AND c.NAME = 'CPU used by this session')
);
many thanks, Anirban
db time metric for capacity planning
york, December 16, 2014 - 2:26 pm UTC
Hi Tom,
We are working on an Oracle db capacity planning model. One metric we want to use to measure overall db load is "Database Time", which can be used as an indicator of average DB load.
My question is: for the average DB load metric (database time / wall clock elapsed time), what value represents max capacity for the DB as a high-level estimation? (Or how do we find out this value in our specific env?)
Based on my experience, for a single-instance Oracle enterprise db, my estimation is: avg db load > 10 --> db load is somewhat high; avg db load > 30 --> db load is very high. Do you think those are reasonable thresholds?
AVERAGE DATABASE LOAD: At any given time we can count how many users (also called 'active sessions') are waiting for an answer from the instance. This is ADDM's measurement of instance load. The 'Average Database Load' is the average of the load measurements taken over the entire analysis period. We get this number by dividing the 'Database Time' by the analysis period. For example, if the analysis period is 30 minutes and the 'Database Time' is 90 minutes, we have an average of 3 users waiting for a response.
December 17, 2014 - 6:58 pm UTC
depends on the number of cores.
Say you have 32 cores. Every second you have 32 CPU seconds available to you.
Say database time = 100, elapsed time = 4. You would say database load is 25 - to you, load is high.
However, in 4 seconds you have 128 CPU seconds. You used at most 100, and probably less (db time = CPU + wait). So you probably used 70-80 CPU seconds of 128 - about 60% utilization. That isn't too bad; it's about as high as you want to get.
Now, what if you were on a 64 core machine? Same math, but you are only at about 30% of CPU capacity.
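As a rough sketch of reading those numbers live - assuming v$sysmetric is available and that group_id = 2 is the standard 60-second metric group on your version:
-- Average active sessions, DB time and CPU usage over the most recent
-- ~60 second interval. The 'Per Sec' metrics are reported in centiseconds
-- per second; divide by 100 to express them as seconds-per-second, i.e.
-- average sessions active / cores busy. Compare the CPU figure to your
-- core count to get the utilization discussed above.
SELECT metric_name, value, metric_unit
  FROM v$sysmetric
 WHERE metric_name IN ('Average Active Sessions',
                       'Database Time Per Sec',
                       'CPU Usage Per Sec')
   AND group_id = 2;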
But what if you only had one IO path on that machine and were actually IO bound? A large part of the db time would be spent waiting. You'd say "we are low on load" when in fact load is really, really high - because your IO stinks, and if you fixed that you'd go faster (probably :) no assurances...).
But also, since database time = CPU + wait, what if 100 sessions each waited 1 second for a lock? You'd have 100 seconds of db time in that one second. But was the database "under load"?
To me the very idea of "database load" is very very very dependent on what you are doing. I just had this conversation at a round table at the UKOUG last week. "how do you find out how busy a database is" - the answer is "it depends"
for a warehouse, it might be a function of actual cpu used plus load on the IO system plus memory in play.
for an oltp system - to me it is a measure of how many BUSINESS TRANSACTIONS (not user_commits) are done and how fast they are done.
simple things like this:
... if the analysis period is 30 minutes and the 'Database Time' is 90 minutes, we have an average of 3 users waiting for a response. ...
don't work in general (they don't take parallel query into account). And averages are not always useful - did you use that 90 minutes of db time in a 30 second window running a big parallel query, or was it evenly spread over time?
I'd much rather know, as an end user, how much useful work my database performs over a period of time. How many queries did my warehouse do for me? How many business transactions (NOT user commits) did I do? And how long did it take?
I've never found a "database load" that made sense to me. It depends on what you are looking for.
db time metric for capacity planning
York, December 29, 2014 - 7:42 pm UTC
Thank you, Tom. Your input is much appreciated.
First, let me make some assumptions to narrow the scope: single-instance Oracle DB, OLTP, no parallel query in the db.
I am working on an Oracle DB capacity planning model for our OLTP DB env. I will track and trend load metrics at both the db server layer and the DB instance layer. It is possible for DB load to be high even when the db server's CPU and I/O are not busy. For the server layer, common metrics are CPU usage and I/O subsystem load. For the DB instance layer, I think the most important metric will be 'database time', which can serve as an overall indicator of average database load. Other metrics such as transactions per second, logons per second, etc. (the other metrics in the AWR report's 'Load Profile' section) can also be tracked and analyzed over time.
Here are some of my comments for your feedbacks:
But what if you only had one IO path on that machine and were actually IO bound? A large part of the db time would be spent waiting. You'd say "we are low on load" when in fact load is really, really high - because your IO stinks, and if you fixed that you'd go faster (probably :) no assurances...).
In the above example, CPU usage is low, but "database time" will be high, which indicates that overall db load is high. Database time = time working on CPU + time waiting for resources (I/O, locks, latches, etc.). Hence, "DB time is high" means the db is either busy on CPU or facing heavy contention on some resource.
But also, since database time = CPU + wait, what if 100 sessions each waited 1 second for a lock? You'd have 100 seconds of db time in that one second. But was the database "under load"?
This is a good example of how DB instance load can be very high even when db server CPU usage is low and I/O is not busy. If that kind of load lasts one hour, 'database time' will be 100 hours in one wall-clock hour, and the average number of active user sessions will be 100 for that hour. In this case, I would say the DB is under very heavy load and is facing heavy contention on a lock resource. If more traffic comes in and more sessions become active / wait for that lock, the db might hang with a 'max processes exceeded' error.
To my understanding, 'database time' is something like a 'timecard system' for the Oracle db. Assume that in one company, HR requires all workers to submit detailed timecards to their manager, reporting their working hours honestly. If the manager sees that each team member in group 1 spends around 10-20 hours on tasks per week while each team member in group 2 spends around 40-60 hours per week, he can figure out that group 2 is busy and under heavy load. If the manager expects more tasks to come to group 2 next year, he will need to either make group 2 work more efficiently (find the bottleneck and tune - for example, he may find that 5 workers share one hammer and spend many hours waiting for it) or hire more people to expand capacity. I think the 'database time' / 'time model statistics' in Oracle are very similar to timecard management in HR. One thing I am not sure of is if / how we can get a high-level estimation of the 'max capacity limit' of an OLTP DB.
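One sketch I am considering for that - assuming AWR (the Diagnostics Pack) is licensed and the standard dba_hist views are available - is to trend average active sessions (DB time / elapsed) per snapshot; on an OLTP system, sustained values approaching the server's core count would suggest we are nearing CPU capacity:
-- Average active sessions per AWR snapshot: delta of the cumulative
-- 'DB time' statistic (microseconds) divided by the snapshot interval
-- in seconds. Negative deltas mark instance restarts and should be
-- ignored. Compare sustained values against the core count (see
-- NUM_CPU_CORES in dba_hist_osstat).
SELECT snap_id, end_time,
       ROUND ((db_time - LAG (db_time) OVER (ORDER BY snap_id))
            / 1e6 / interval_sec, 2) avg_active_sessions
  FROM (SELECT s.snap_id,
               s.end_interval_time end_time,
               (CAST (s.end_interval_time AS DATE)
              - CAST (s.begin_interval_time AS DATE)) * 86400 interval_sec,
               t.value db_time
          FROM dba_hist_snapshot s, dba_hist_sys_time_model t
         WHERE t.snap_id = s.snap_id
           AND t.dbid = s.dbid
           AND t.instance_number = s.instance_number
           AND t.stat_name = 'DB time')
 ORDER BY snap_id;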
You mentioned that you had a conversation about "how do you find out how busy a database is". Is there a link where I can find the presentation material?
Also, do you happen to know any good resources for Oracle capacity planning?
Thank you for all your input.
Wish you a great holiday.
York
A reader, April 01, 2015 - 7:30 pm UTC
Change to Linux. Windows has a lot of security issues.