Question and Answer

Tom Kyte

Thanks for the question, Beth.

Asked: April 04, 2002 - 9:05 pm UTC

Last updated: November 30, 2006 - 9:11 am UTC

Version: 8.1.6

Viewed 1000+ times

You Asked

Is there a downside to having a very large shared pool?

We have a box with 4GB of memory almost completely devoted to 3 Oracle databases. We have quite a small shared_pool_size (100MB for one db and smaller for the rest).

Is there a downside to allocating 1GB or so to the shared_pool in each database instance, since the box is not being used for other applications?



and Tom said...

YES. definitely.

Especially if you don't make use of bind variables.
The shared pool works best when right-sized to hold what you do. A common "myth" (conventional wisdom) is that if you are getting ORA-4031s due to not using bind variables, you must immediately increase the size of the shared pool.

What that does is this:

o delays the problem a little while.
o makes the problem even WORSE when it does happen.

The larger the shared pool, the harder it is to manage and page the stuff in and out. When not using bind variables, you literally flood the shared pool with statements that will never be reused -- sending us into a frenzy of "load and age and flush". The bigger the shared pool -- the longer it takes to do this -- the more you wait (the system seems to sloooowwww way down) and then eventually, it falls apart.

Take that same system, decrease the shared pool way down -- and there you go, it'll manage itself in a timely fashion.
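
One rough way to see this kind of flooding for yourself (a sketch only -- v$sql and the 60-character prefix are arbitrary, illustrative choices) is to group statements in the shared pool by their leading text; hundreds of copies that differ only in a literal, each executed once, are the tell-tale sign:

-- families of statements sharing the same leading text; many copies with
-- few executions each usually means literals where binds belong
select substr(sql_text, 1, 60) sql_prefix,
       count(*)                copies,
       sum(executions)         total_execs
  from v$sql
 group by substr(sql_text, 1, 60)
having count(*) > 10
 order by copies desc;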


I've seen sites that refused to use bind variables in OLTP. They actually had a guy who would sit there and watch a Quest screen all day, waiting for their 2 gig (yes, 2 gig) shared pool to get 50% utilized. He would then alter system flush shared_pool. That was his job.

I begged them to test a 75m shared pool -- when they finally did, guess what happened.... (rhetorical question, you should be able to figure it out by now)

The real fix for them -- use BINDS, re-assign the guy who was paid to flush the shared pool (heck, they didn't even script that...)

If you ask me, 100mb is NOT small, it's about right.


Comments

MaxU, April 05, 2002 - 11:14 am UTC

Tom, thank you for clear answer!

Tom, should one use bind variables even when some of the values are constants, for example:

'select * from bla_table where level=1 and name=:nm'

If the application uses such code and the 1 is hardcoded, should we use a bind variable for the "level" column too?

Thank you!


Tom Kyte
April 05, 2002 - 11:47 am UTC

Nope, you use bind variables on things that CHANGE from run to run of the query.

If you have my book -- I actually cover this topic in "why I don't like cursor_sharing = force". It over binds -- sometimes you DON'T want to bind everything.

Only bind things that change from execution to execution.
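
A minimal sketch of what that looks like in practice (the table, columns and procedure name are made up for illustration): the constant stays a literal, only the changing value is bound.

create or replace procedure count_open_accounts( p_acct_id in number )
as
    l_cnt number;
begin
    -- status = 'OPEN' is the same on every call, so it stays a literal;
    -- the account id changes from execution to execution, so it is bound
    execute immediate
       'select count(*) from accounts
         where status = ''OPEN'' and acct_id = :id'
       into l_cnt using p_acct_id;
    dbms_output.put_line( l_cnt || ' open account rows' );
end;
/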

Thanks !!!

MaxU, April 05, 2002 - 12:08 pm UTC


Does this apply to MTS?

Fred, May 07, 2003 - 8:22 pm UTC

Does your recommendation about reducing size apply to MTS?

Jean Zhou, September 17, 2003 - 2:29 pm UTC

I forwarded this page to development managers so that they will understand the importance of using bind variables.

Avoiding Shared Pool

Wor, February 02, 2004 - 5:49 pm UTC

Hi Tom,
In my application we have a lot of queries that are formed dynamically, which take space in the shared pool. Yes, we have SQL with bind variables and stored procedures too. But considering the quantity of change and the urgent need to handle dynamic SQL, is there any command that, when executed, does not put the SQL into shared memory?
e.g. Set command 1 => execute SQL => undo command 1,
which will not put the SQL in the shared pool.
And even if we have such a command, how would it affect the performance of the system?
Thanks in advance,

Tom Kyte
February 03, 2004 - 7:38 am UTC

Nope, the SQL is shared in the shared pool.

Just because you build queries dynamically doesn't mean they cannot use binds and hence be reused.
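
For instance, a dynamically built statement can still carry bind placeholders. A hedged sketch using dbms_sql (the emp table and the optional filters are just illustrative), where only the predicates actually appended get bound:

create or replace function count_emps( p_deptno in number   default null,
                                        p_job    in varchar2 default null )
return number
as
    l_stmt   varchar2(4000) := 'select count(*) from emp where 1 = 1';
    l_cursor number := dbms_sql.open_cursor;
    l_ignore number;
    l_cnt    number;
begin
    -- the text is assembled at run time, but with placeholders, not literals
    if p_deptno is not null then l_stmt := l_stmt || ' and deptno = :b_deptno'; end if;
    if p_job    is not null then l_stmt := l_stmt || ' and job = :b_job';       end if;

    dbms_sql.parse( l_cursor, l_stmt, dbms_sql.native );

    -- bind only the placeholders we actually appended
    if p_deptno is not null then dbms_sql.bind_variable( l_cursor, ':b_deptno', p_deptno ); end if;
    if p_job    is not null then dbms_sql.bind_variable( l_cursor, ':b_job', p_job );       end if;

    dbms_sql.define_column( l_cursor, 1, l_cnt );
    l_ignore := dbms_sql.execute( l_cursor );
    l_ignore := dbms_sql.fetch_rows( l_cursor );
    dbms_sql.column_value( l_cursor, 1, l_cnt );
    dbms_sql.close_cursor( l_cursor );
    return l_cnt;
end;
/

This produces only a handful of distinct statements in the shared pool no matter how many different departments and jobs are queried.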

What do you say about this?

Arun Gupta, February 04, 2004 - 11:52 am UTC

Tom,
Please see the recommendation from Oracle support about a 7445 error that we have been getting:

<quote>
The next thing I looked at was your non-default values listed during startup of the database:

shared_pool_size = 100663296
job_queue_processes = 5

Now since you are running a 64-bit version of Oracle, the minimum shared pool is 64MB. Then you need to add about 10MB for each of the job queues, which leaves you short of memory.

The error you see is most likely caused by a misconfigured database.

The minimum shared_pool_size for a single user database with the job queues you have is at least 114MB. Then you have to add about 1 to 1.5MB for each concurrent connection. Even more so depending on the types of operations the database is performing. I would imagine a shared pool in the area of 256MB would be more accurate for your system.
</quote>

My questions are:
a) We take 8 statspack snapshots a day and generate 4 reports, 15 minutes apart during peak database activity periods. The shared pool advisory has never indicated that we need to increase shared pool. In light of this, is the above observation correct?
b) Why do we need to allocate 1MB to 1.5MB memory in shared pool per concurrent connection? We are running in dedicated server mode.
c) Why the job queues take up 10MB per queue? The job queues are active only during off hours. Do the job queues use memory from shared pool even when not active?

We have 100MB shared pool and the system is running fine. Won't these recommendations cause an over allocated shared pool?

Thanks

Tom Kyte
February 04, 2004 - 5:38 pm UTC

I fail to see how "job queue processes" has anything to do with "sizing the shared pool" -- not directly anyhow -- definitely not as a "formula".

I think they need to rethink this -- a 7445 is due to a "program failing" (eg: segmentation fault). While too small of a shared pool could cause this (and 100m is pretty huge to me), it is still something that needs to be *fixed*.

Only if you are using shared server would you add 1 to 1.5 meg per concurrent connection -- but, the funny thing is, you'd better NOT add it to the shared pool -- it would go into the large pool, and you'd watch your average UGA usage to get the real number (versus the 1 to 1.5 meg rule of thumb).

soooooo -- I would not agree with this advice (which, quite honestly, is taken out of context -- so don't get me wrong -- I'd have to see the entire thing in context before being 100% sure).




Arun Gupta, February 04, 2004 - 8:26 pm UTC

Tom,
Thanks for your advice. We were suspecting the 7445 was due to bug 2475995. We were just trying to get confirmation from support whether the trace files indicated as much or not. We had done our due diligence to reproduce the problem and checked that the workaround suggested by Oracle (use the no_merge hint for the inline view) worked fine. However, since this is a production system, we did not want to apply the one-off patch without getting confirmation from support. After analyzing the RDA and trace files, the advice we got was:

<quote>
I have been reviewing this issue on behalf of the owning analyst.

I started with a look at your alert.log and found these errors:

Tue Feb 3 12:24:38 2004
Errors in file /export/home/oracle/CCMP/trace/user/ccmp_ora_5081.trc:
ORA-07445: exception encountered: core dump [0000000100F8E3FC] [SIGSEGV] [Address not mapped to object] [0x000000000] [] []

The next thing I looked at was your non-default values listed during startup of the database:


shared_pool_size = 100663296
job_queue_processes = 5

Now since you are running a 64-bit version of Oracle, the minimum shared pool is 64MB. Then you need to add about 10MB for each of the job queues, which leaves you short of memory.

The error you see is most likely caused by a misconfigured database.

The minimum shared_pool_size for a single user database with the job queues you have is at least 114MB. Then you have to add about 1 to 1.5MB for each concurrent connection. Even more so depending on the types of operations the database is performing. I would imagine a shared pool in the area of 256MB would be more accurate for your system. You should start using STATSPACK to gather db statistics that will enable you to monitor the memory usage and to size the database appropriately.

pga_aggregate_target = 188743680
workarea_size_policy = auto

This is also a source of concern. Your pga_aggregate_target is very low. It should normally be set to about 50-60% of your total amount of physical memory on the server. Since you have about 20GB of RAM on the box, I would at least recommend setting it to 2-3GB.

The last issue I see is that your redolog files are way too small. They are sometimes switched every minute. Oracle recommends sizing them so they only switch every 15-30 minutes during PEAK load.

If you still are seeing the ora-7445 errors after reconfiguring the database, please update the tar with the errors and upload a new alert.log and the tracefile, else the tar will be closed automatically after 2 weeks.
</quote>

Our statspack reports do not indicate any need to increase the PGA aggregate target or the shared pool. Improperly sized redo log files can hardly cause 7445. The shared pool size calculations and recommendations looked factually incorrect so I decided to check with you.

Thanks...

Tom Kyte
February 05, 2004 - 7:28 am UTC

I disagree with their analysis, yes.

How do we explain the recommended SHARED_POOL for Oracle Apps

Hemant K Chitale, February 05, 2004 - 4:58 am UTC

How do we explain the very large recommended SHARED_POOL_SIZEs for Oracle Apps -- e.g. in Note 216205.1 on MetaLink -- sizes of 600M to 800M for up to 500 users, with 600M being the minimum?

Tom Kyte
February 05, 2004 - 7:41 am UTC

they have sized it, they have tested it, they have designed for it. it is a requirement of that packaged application.

apps uses a TON of plsql - lots of it - huge amounts of it.

APP

A reader, February 05, 2004 - 7:39 am UTC

Because Oracle Applications is full of PL/SQL code!

can db cache be too big?

A reader, February 06, 2004 - 4:13 pm UTC

Hi

Recently I visited a shop; they have Oracle 8.1.7 running on Sun Solaris 8 and a database of 10GB. To my surprise, their db block buffers were set to 4GB (the server has 8GB of memory). The guy who sized it said it was because he was taught in Oracle performance tuning courses that more memory == good == improved hit ratio...

My question is can a data cache be too BIG?

Tom Kyte
February 07, 2004 - 2:09 pm UTC

answer = yes.

show them this:

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6749454952894

and ask them if the 98.9% cache hit ratio would be something to be proud of.

I massively preferred my 50% cache hit ratio.


A cache hit ratio by itself is meaningless. The best way to improve performance: identify your top SQL (by IO/CPU amounts), look at each statement and ask yourself "is it doing as little work as possible?"
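
A hedged sketch of that "find the top SQL" step against v$sql (sorting by buffer_gets here; disk_reads or cpu_time work the same way):

-- the ten statements doing the most logical I/O
select sql_text, executions, buffer_gets,
       round( buffer_gets / greatest( executions, 1 ) ) gets_per_exec
  from ( select sql_text, executions, buffer_gets
           from v$sql
          order by buffer_gets desc )
 where rownum <= 10;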

side effect of very very big database cache?

A reader, February 08, 2004 - 1:50 pm UTC

Are there any side effects of a huge data cache? For the shared pool you mentioned that too big makes things harder to manage -- how about the data cache?

Will my application run slower if my data cache is very big?

Tom Kyte
February 08, 2004 - 9:07 pm UTC

yes, it can. it can actually take longer during some operations (like a full scan) to figure out what needs to be read and what is already read -- than to just "read it all". the bigger the cache -- the longer it takes.

bigger = harder to manage, more stuff to manage. lots of stuff you might not be using in the cache to scan over time and time and time and time again.

depends on how you use the data.

db cache size

A reader, February 08, 2004 - 10:36 pm UTC

Hi Tom,

If I have 4GB of RAM on a machine which is dedicated to an Oracle database, what is your recommended db_cache_size?

thanks,

Tom Kyte
February 09, 2004 - 7:19 am UTC

somewhere between 50meg and 3.75gig probably.


depends on what you are doing -- 100%.



"I massively prefered my 50% cache hit ratio."

MEHMOOD, May 25, 2004 - 5:54 am UTC

Dear Tom:

I am amazed by this statement of yours, i.e.

==============================
"I massively preferred my 50% cache hit ratio."
==============================

How come, Tom? Everyone says the DB cache hit ratio should be more than 90%, yet you are happy with 50%. Can you explain with one of your classic examples?


Tom Kyte
May 25, 2004 - 7:38 am UTC

did you read the referenced link? there is an example there.

One query had a high 90% cache hit but took hours.
The same query, different plan, had a 50% cache hit but took seconds.

I preferred the second query plan myself, even though it would be damaging to a cache hit ratio.

how can this be good

sonali, August 25, 2004 - 12:47 pm UTC

"If I have 4G RAM on one machine which is dedicated to oracle database. What is
your recommended db_cache_size?"

and you said
"somewhere between 50meg and 3.75gig probably.
depends on what you are doing -- 100%. "

How could 3.75GB of shared pool be good for this 4GB server? You said that having a big shared pool is bad, but then you say you can have it this big. Can you explain when you would have it this big for a server of this size?
Thanks




Tom Kyte
August 25, 2004 - 1:33 pm UTC

"db_cache_size"

not "shared pool"

Doubts

Aru, September 13, 2004 - 10:31 pm UTC

Hi Tom,

I just read somewhere on Google about the ORA-4031 error:

Users usually get this error while attempting to load a big
package or while executing a very large procedure and there is not sufficient contiguous free memory in the SHARED_POOL. This may be due to:
1. Fragmentation
or
2. Insufficient memory in the shared pool.

But then, if my understanding is correct, isn't the library cache managed by an LRU algorithm? If so, why doesn't it just remove the least recently used statements, blocks, etc. and then use the newfound space to load up new statements, blocks, etc.?
Also, can you please advise if there is a way to check the free space in the shared_pool, and whether it is of any consequence
if I have cursor_space_for_time = true in the production DB?

please clarify in simple words,


Tom Kyte
September 14, 2004 - 7:43 am UTC

lru means you can get rid of things not in use.
lru doesn't mean you can get rid of everything.

uuuFuuuFuuuFuuuFuuuFuuuFuuuF

you might have lots of free (F) space in a heap (add them up, lots of F's).

However, you might have very small F's with used (u) space in between.
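
As for checking free space: a rough look (the total free figure says nothing about how fragmented that space is, which is exactly the uuuF picture above) might be:

-- total free memory in the shared pool
select bytes
  from v$sgastat
 where pool = 'shared pool'
   and name = 'free memory';

-- the reserved pool view also records how often a large allocation
-- could not find a big enough contiguous chunk
select free_space, request_failures, last_failure_size
  from v$shared_pool_reserved;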

7445

abc, February 23, 2005 - 2:18 pm UTC

ORA-07445: exception encountered: core dump [11] [2013252736] [65610752] [2048] [] []
Wed Feb 2 01:26:47 2005
Thread 1 advanced to log sequence 33718
Wed Feb 2 01:26:47 2005
ARC0: Beginning to archive log# 4 seq# 33717

Tom, can you please suggest?

Tom Kyte
February 24, 2005 - 5:08 am UTC

metalink.oracle.com whenever you see

ora-600
ora-7445
ora-3113


cursor_sharing=similar

kapil, May 24, 2006 - 11:35 am UTC

Hi Tom,

Need your expert comments.
In a 9.2.0.6 database I have 2 gig of shared pool on a 32-CPU box with cursor_sharing=similar and high version counts (up to 800 for 40-50 SQLs). Per the application guys, they are using bind variables for 90-95% of the code - yet the database hits ORA-4031 errors frequently.

What is your suggestion about using bind variables for the rest of the application code (approx 5-10% of the total) and switching to cursor_sharing=EXACT? Will this definitely help to reduce shared pool fragmentation, since using bind variables will bring down the version count and less shared pool will be used?

Thanks
kapil




Tom Kyte
May 25, 2006 - 7:11 am UTC

It only takes one bad SQL statement not using binds to totally squash your database.

cursor sharing FORCE would reduce the versions - you are using similar which may or may not have any effect.


they should fix all of the affected statements - tell them to google

sql injection

and ask them if they understand what that might mean.
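
A quick way to find the statements kapil describes (a sketch -- the cutoff of 100 versions is arbitrary) is to look at version counts in v$sqlarea:

-- statements with many child cursors under cursor_sharing=similar
select sql_text, version_count, executions
  from v$sqlarea
 where version_count > 100
 order by version_count desc;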

kapil, May 25, 2006 - 10:27 am UTC

Thank you Sir.

what is stored in miscellaneous shared pool

jianhui, May 30, 2006 - 8:26 pm UTC

Hi Tom,
We had a shared pool ORA-4031 error recently. The SQL area is only 10% of the total shared pool size; the majority of the shared pool is consumed by v$sgastat.name = 'miscellaneous' - 50%, or 500MB. The shared pool size was 900MB; after the error it was increased to 1500MB.

By tracking statspack history over the past week, this area is slowly increasing (~600MB now), which in turn pushes free shared pool memory lower and lower. I am wondering what is stored in this part of the shared pool and what can be the cause of its increase.

Bind variables do not seem to be the cause, since the SQL area in the shared pool is relatively small - less than 10% of the total shared pool size.

Could you point a direction?

SQL> select * from v$sgastat where pool='shared pool' and bytes>100*1024*1024;

POOL        NAME                                BYTES
----------- ------------------------------ ----------
shared pool sql area                        109901936
shared pool free memory                     433180560
shared pool FileOpenBlock                   131580288
shared pool miscellaneous                   619960648
shared pool db_block_buffers               2455218784
shared pool session param values            109480000
shared pool db_block_hash_buckets           713031984

7 rows selected.

SQL> show parameter shared_pool_size

NAME                                 TYPE                             VALUE
------------------------------------ -------------------------------- ------------------------------
shared_pool_size                     big integer                      1509949440

Best Regards, 

Tom Kyte
May 31, 2006 - 8:39 am UTC

Well, in general you WANT to have the shared pool utilized (that it is growing is not necessarily "bad").

Various "miscellaneous" things are stored there (obvious, I know, but miscellaneous is rather 'broad'). segment statistics, advisory information, lots of stuff.

I'll ask you to work with support on this, they can do things like "dump the heap" (careful with that, do during period of low activity for example, on such a large shared pool).

It is not necessarily the root cause of anything however - without versions and lots of other details (that support will extract from you, not asking for them here) it is not really useful to hypothesize.

miscellaneous shared pool growing

jianhui, May 31, 2006 - 11:59 am UTC

Hi Tom,
Indeed, your hypothetical direction was the right one. I opened a TAR yesterday, in parallel with posting this question on your site - I just wanted to get a second opinion from one of the most reputable experts.

Today, Oracle support related this to some bugs.

Best regards,

Cursor_space_for_time

Arindam Mukherjee, July 09, 2006 - 9:16 am UTC

Respected Mr. Tom,

In the Oracle 9i performance tuning guide (page 14-40), it is stated: "Do not set the value of CURSOR_SPACE_FOR_TIME to true, if the amount of memory available to each user for private SQL areas is scarce." My question is: how would I measure the amount of memory available to each user for private SQL areas?
One more request: if you have time, could you kindly write in detail about this parameter and how it is recommended to be used when I also have the CURSOR_SHARING parameter?
I am sorry, I am getting confused between these two parameters.

Regards,
Arindam Mukherjee


Tom Kyte
July 09, 2006 - 1:44 pm UTC

that is 'cga' memory - stored in the user's session memory (meaning in the PGA when using dedicated server, in the SGA when using shared server).

I would just let it default - why are you considering changing it, what are you trying to fix?
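
To the "how would I measure it" part of the question, a rough per-session look (dedicated server keeps the private SQL areas in the PGA, shared server in the UGA inside the SGA) could be:

-- current and maximum UGA/PGA memory per session
select s.sid, s.username, n.name, st.value
  from v$sesstat  st,
       v$statname n,
       v$session  s
 where st.statistic# = n.statistic#
   and st.sid        = s.sid
   and n.name in ( 'session uga memory', 'session uga memory max',
                   'session pga memory', 'session pga memory max' )
 order by s.sid, n.name;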

Very helpful

A reader, July 10, 2006 - 7:01 am UTC


Huge SGA components

Sanji, November 20, 2006 - 5:09 pm UTC

Tom

I'm starting to configure/optimize a production database which is on HP-UX 11i, Oracle 9i Release 2, and is 570GB in size.
Following is the configuration, which I find slightly odd, considering the possibilities.

db_block_size 4096
db_cache_size 3.5 Gb
log_buffer 4194304
log_checkpoint_interval 50000
log_checkpoint_timeout 1800
shared_pool_size 600 Mb

There are 4 groups of "1GB" redo logs with 2 members each.
I have been given to understand that a traditional backup/recovery methodology is used to replicate this environment at a disaster recovery site, and that the system (tablespaces) is put into hot backup mode every half an hour.
From the alert log, there is a recurrence of the following messages.

Beginning log switch checkpoint up to RBA [0xc6de.2.10], SCN: 0x0001.9a2245e3
Thread 1 advanced to log sequence 50910
Current log# 2 seq# 50910 mem# 0: /pfin9/oradata/PFIN/redo/redo_2.log
Current log# 2 seq# 50910 mem# 1: /pfin6/oradata/PFIN/redo/redo_2b.log
Mon Nov 20 09:17:45 2006
ARC1: Evaluating archive log 1 thread 1 sequence 50909
ARC1: Beginning to archive log 1 thread 1 sequence 50909
Creating archive destination LOG_ARCHIVE_DEST_1: '/pfinarc/oradata/archlogs/arch50909.arc'
Mon Nov 20 09:19:50 2006
ARC1: Completed archiving log 1 thread 1 sequence 50909
Mon Nov 20 09:20:56 2006
Completed checkpoint up to RBA [0xc6de.2.10], SCN: 0x0001.9a2245e3

So it's typically taking 3 minutes to do a log switch checkpoint -- quite understandable, considering the size of the redo logs.

Following is a list of wait events from v$system_event. (I'd typically install statspack and get the reports, but as a starting point...)

EVENT TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT TIME_WAITED_MICRO
---------------------------------------- ----------- -------------- ----------- ------------ -----------------
db file sequential read 252148453 0 16292975 0 162929745826
log file sync 69633731 84 11428893 0 114288928336
log file parallel write 69314880 0 9835757 0 98357565643
enqueue 18459 13920 4364021 236 43640206031
db file scattered read 142731306 0 4100917 0 41009170282
log file sequential read 292401 0 1168780 4 11687802958
undo segment extension 139371587 139371582 889585 0 8895845306
log file switch (checkpoint incomplete) 1782 1734 176109 99 1761088287
buffer busy waits 266263 135 72181 0 721813782
control file parallel write 321048 0 54069 0 540687146
db file parallel read 8368 0 14722 2 147222391
log buffer space 1208 16 14006 12 140061608
library cache pin 59 31 9392 159 93915575
db file single write 99456 0 8679 0 86787847
log file switch completion 921 8 6821 7 68208658
latch free 142099 0 3664 0 36640856

I'd request your opinion on my analysis.

1> db_cache_size is an overkill at 3.5 Gb.
* Might induce all the more latch waits to protect memory structures. "Latches inhibit concurrency" as you say.

2> log buffer at 4 MB. Correct me if i'm wrong Tom.
The log buffer gets cleared when it's 1 Mb full OR 1/3rd Full OR every 3 seconds. Typically anything above 1Mb would be a waste.

3> Replace log_checkpoint_interval and log_checkpoint_timeout which have been set to tremendously high values, with fast_start_mttr_target.

4> Reduce Shared_pool_size and monitor for any shared pool related waits.

5> Reduce the size of redo log files to around 150 Mb and monitor for any waits.
I believe 1Gb of redo is risking 1Gb of pertinent data.
This along with the massive db_cache_size might be causing frequent "Checkpoint not complete" errors.

Thanks in advance
Sanji

Tom Kyte
November 22, 2006 - 2:49 pm UTC

you are looking at waits since the database started, quite simply - that is "way too long of a period of time to make sense of them"

1) you cannot say that, it could be that if you reduced it, your physical IO would shoot way up. could be that that wouldn't happen. There is insufficient data here to say.

2) wrong. You are generating redo; when it hits one of the triggers - lgwr starts writing it out - HOWEVER - you are STILL generating redo. So, you can definitely use more than 1mb; just because that is a trigger doesn't mean that is the "upper bound". (A quick check for this is sketched after this answer.)

3) that one might make sense, however - if it ain't broke?

4) why?

5) why?

1gb of redo is risking 1gb of data how??
do you see frequent checkpoint not completes?
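
Regarding point 2, a hedged way to put facts behind the log buffer size is to look at how often sessions actually had to wait for space in it -- "redo buffer allocation retries" should stay tiny relative to "redo entries":

select name, value
  from v$sysstat
 where name in ( 'redo entries',
                 'redo buffer allocation retries',
                 'redo log space requests' );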

Missed a point

Sanji, November 20, 2006 - 6:14 pm UTC

Point 1 is in conjunction with a large shared pool.
Thanks
Sanji

Opinion regarding Large SGA settings

Sanji, November 22, 2006 - 4:44 pm UTC

Tom

1> The buffer cache was set to 3.5G because the previous admin wanted a near 100% cache hit ratio. Otherwise there have been no pertinent waits to suggest that the cache size should be this high.

2> Agreed. We can use more than 1MB, but how is a 4MB log buffer justified?

3> The activity on the database doesn't suggest that SQL statements and other shared pool information would run out of space (except of course if bind variables aren't used - and there, too, a bigger shared pool might suffer comparatively more). I want to start from a lower SGA size, gradually increasing it as and when required.

4> Reiterating, 1GB of redo would probably be covering about one and a half times the buffer cache's worth of blocks. This again was set to counter "Checkpoint Not Complete", but I feel this might (along with the checkpoint-related parameters) result in:
a> A slower instance recovery process.
b> In the event of disaster, a greater chance of losing 1GB of redo data than of losing, say, 150MB of data.
c> Recovering and managing 150MB archives versus 1GB archives should be more convenient.
d> Slower incremental checkpoints and checkpoint not complete errors (due to a slower DBWR, too).
e> Longer complete checkpoints and database stalling during checkpointing, because of redo switching and simultaneously archiving 1GB.
f> Redo-related waits.

There are still Checkpoint not complete errors in the alert log.

Thanks
Sanji

Tom Kyte
November 24, 2006 - 2:47 pm UTC

1) ah, but chicken and egg. If you drop it.... (did you think to use the buffer cache advisor? see the query sketch after this answer)

2) how is it not? depends on the size of your transactions, rate of transactions. Don't ask me to justify it (insufficient data from you to do so), rather you "unjustify" it - you provide facts that show it is "too large"

beside, 4mb is just 4mb, not really worth worrying about.

3) did you think to use the advisors that tell you what will happen if you increase/decrease things?


4)

a) sure, fast_start_mttr_target would be the solution to that IF your instance recovery times are too large. So..... ARE THEY

b) you are not making sense. in case of disaster you lose redo and data, now what?

c) why? it could be true, very personal however.

d) are you SEEING THEM though, that is the point.

e) are you SEEING THAT though, that is the point. is the database "stalling", you don't actually say.

f) free buffer waits (you just seem to have typed out a random wait event, so I did too :) Not sure what f pertains to or why you picked "redo related waits".
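
For the buffer cache advisor Tom mentions in point 1, a sketch of the underlying view (assuming db_cache_advice / statistics_level leaves the advisory enabled; statspack prints the same data in its advisory section):

select size_for_estimate, buffers_for_estimate,
       estd_physical_read_factor, estd_physical_reads
  from v$db_cache_advice
 where name = 'DEFAULT'
 order by size_for_estimate;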

Analysis

Sanji, November 27, 2006 - 3:41 pm UTC

Sorry about that, Tom. I should have looked at the STATSPACK report before speculating about the changes/reasons.
Following is the snapshot of the STATSPACK report with a 15 min snap period.

STATSPACK report for
DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- -----------
PFIN 1191160476 PFIN 1 9.2.0.4.0 NO lawson1

Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 1 27-Nov-06 12:00:06 137 33.9
End Snap: 2 27-Nov-06 12:15:08 137 31.7
Elapsed: 15.03 (mins)

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 3,328M Std Block Size: 4K
Shared Pool Size: 576M Log Buffer: 4,096K

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 828,876.54 220,870.50
Logical reads: 97,985.69 26,110.22
Parses: 532.01 141.76
Hard parses: 1.18 0.32
Sorts: 2.76 0.74
Executes: 2,362.38 629.50
Transactions: 3.75

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 1,835 43.12
enqueue 294 881 20.71
db file sequential read 585,017 649 15.26
log file switch (checkpoint incomplete) 543 544 12.79
buffer busy waits 81 150 3.52

Wait Events for DB: PFIN Instance: PFIN Snaps: 1 -2
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
enqueue 294 293 881 2998 0.1
db file sequential read 585,017 0 649 1 172.8
log file switch (checkpoint 543 538 544 1002 0.2
buffer busy waits 81 50 150 1848 0.0
log file parallel write 3,942 0 74 19 1.2
log file sequential read 1,030 0 55 53 0.3
log file sync 3,347 0 42 13 1.0
db file scattered read 229,567 0 7 0 67.8
log buffer space 13 0 2 116 0.0
control file parallel write 359 0 1 3 0.1
log file switch completion 1 1 1 1001 0.0
control file sequential read 3,170 0 0 0 0.9
db file parallel read 2 0 0 58 0.0
latch free 538 0 0 0 0.2
process startup 1 0 0 41 0.0
log file single write 4 0 0 5 0.0
direct path read 7,835 0 0 0 2.3
direct path write 843 0 0 0 0.2
LGWR wait for redo copy 34 0 0 0 0.0
async disk IO 256 0 0 0 0.1

Background Wait Events for DB: PFIN Instance: PFIN Snaps: 1 -2
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file parallel write 3,942 0 74 19 1.2
log file sequential read 1,030 0 55 53 0.3
control file parallel write 359 0 1 3 0.1
control file sequential read 3,063 0 0 0 0.9
db file sequential read 15 0 0 3 0.0
log file single write 4 0 0 5 0.0
direct path read 456 0 0 0 0.1
LGWR wait for redo copy 34 0 0 0 0.0
async disk IO 256 0 0 0 0.1
direct path write 224 0 0 0 0.1
latch free 3 0 0 0 0.0

Buffer Pool Statistics for DB: PFIN Instance: PFIN Snaps: 1 -2
Free Write Buffer
Number of Cache Buffer Physical Physical Buffer Complete Busy
P Buffers Hit % Gets Reads Writes Waits Waits Waits
--- ---------- ----- ----------- ----------- ---------- ------- -------- ------
D 801,840 98.6 88,367,312 1,208,735 120,015 0 0 81

Instance Recovery Stats for DB: PFIN Instance: PFIN Snaps: 1 -2
Targt Estd Log File Log Ckpt Log Ckpt
MTTR MTTR Recovery Actual Target Size Timeout Interval
(s) (s) Estd IOs Redo Blks Redo Blks Redo Blks Redo Blks Redo Blks
- ----- ----- ---------- ---------- ---------- ---------- ---------- ----------
B 0 724 285084 3066279 50000 943713 2016916 50000
E 0 719 304628 2761652 50000 943713 1264652 50000


Buffer wait Statistics for DB: PFIN Instance: PFIN Snaps: 1 -2
Tot Wait Avg
Class Waits Time (s) Time (ms)
------------------ ----------- ---------- ---------
data block 80 153 1,916
undo block 1 0 0

Latch Sleep breakdown for DB: PFIN Instance: PFIN Snaps: 1 -2
Get Spin &
Latch Name Requests Misses Sleeps Sleeps 1->4
-------------------------- -------------- ----------- ----------- ------------
cache buffers chains 131,077,952 11,203 14 11189/14/0/0/0
shared pool 2,742,087 3,809 305 3504/305/0/0/0
library cache 5,913,161 3,041 193 2849/191/1/0/0
cas latch 5,066,536 1,615 5 1610/5/0/0/0
multiblock read objects 1,598,936 1,569 11 1558/11/0/0/0
cache buffers lru chain 205,270 223 3 220/3/0/0/0
library cache pin 4,330,280 164 1 163/1/0/0/0
redo allocation 600,288 91 2 89/2/0/0/0
simulator lru latch 142,882 32 5 27/5/0/0/0

The db cache and shared pool advisories

Size for Size Buffers for Est Physical Estimated
P Estimate (M) Factr Estimate Read Factor Physical Reads
--- ------------ ----- ---------------- ------------- ------------------
D 336 .1 80,955 7.40 10,243,118,844
D 672 .2 161,910 2.77 3,837,542,580
D 1,008 .3 242,865 1.45 2,009,014,952
D 1,344 .4 323,820 1.24 1,716,391,291
D 1,680 .5 404,775 1.12 1,544,017,791
D 2,016 .6 485,730 1.06 1,469,964,332
D 2,352 .7 566,685 1.04 1,434,520,697
D 2,688 .8 647,640 1.02 1,413,635,915
D 3,024 .9 728,595 1.01 1,398,379,557
D 3,328 1.0 801,840 1.00 1,384,631,257
D 3,360 1.0 809,550 1.00 1,383,108,635
D 3,696 1.1 890,505 0.99 1,367,832,098
D 4,032 1.2 971,460 0.98 1,355,681,127
D 4,368 1.3 1,052,415 0.97 1,345,645,439
D 4,704 1.4 1,133,370 0.96 1,334,951,727
D 5,040 1.5 1,214,325 0.96 1,324,726,352
D 5,376 1.6 1,295,280 0.95 1,316,520,099
D 5,712 1.7 1,376,235 0.95 1,311,108,604
D 6,048 1.8 1,457,190 0.94 1,304,899,627
D 6,384 1.9 1,538,145 0.94 1,300,211,211
D 6,720 2.0 1,619,100 0.94 1,296,098,654

Estd
Shared Pool SP Estd Estd Estd Lib LC Time
Size for Size Lib Cache Lib Cache Cache Time Saved Estd Lib Cache
Estim (M) Factr Size (M) Mem Obj Saved (s) Factr Mem Obj Hits
----------- ----- ---------- ------------ ------------ ------- ---------------
320 .6 323 15,813 315,393 1.0 70,659,810
384 .7 386 18,140 315,505 1.0 70,667,661
448 .8 452 20,603 315,582 1.0 70,674,776
512 .9 515 23,572 315,648 1.0 70,683,601
576 1.0 578 26,395 315,707 1.0 70,693,063
640 1.1 641 28,761 315,752 1.0 70,700,684
704 1.2 704 32,737 315,795 1.0 70,707,768
768 1.3 767 35,465 315,836 1.0 70,715,473
832 1.4 830 37,834 315,870 1.0 70,721,343
896 1.6 893 42,186 315,905 1.0 70,727,830
960 1.7 956 44,547 315,930 1.0 70,732,245
1,024 1.8 1,019 46,737 315,961 1.0 70,737,159
1,088 1.9 1,082 49,020 315,984 1.0 70,741,352
1,152 2.0 1,161 51,967 316,010 1.0 70,745,531

After going through the STATSPACK report, I feel that the database requires the following changes.
1> db cache = 1.2 Gb
2> shared pool = 225 Mb
3> db block size = 8K (need to check on this)
4> redo files should be much smaller than 1 Gb.
5> set multiple database writers

Appreciate your opinion.
Regards
Sanji

Tom Kyte
November 27, 2006 - 8:06 pm UTC

why?

why do you believe that.
explain your reasoning.

What does the information in the v$shared_pool_advice view mean?

Sergey, November 28, 2006 - 4:59 am UTC

Hi Tom,
I can't find full information about v$shared_pool_advice - only Note 255409.1 from MetaLink, and it is short.
May I ask you to give an example of how to read and understand this advisory view?

Tom Kyte
November 28, 2006 - 7:27 am UTC

use statspack, it'll generate a nice report for you from it.

take a snap
let your database do its normal work
take a snap

report on it.
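
A bare-bones version of that workflow, assuming statspack is already installed (spcreate.sql) and you are connected as the PERFSTAT owner:

execute statspack.snap

-- ... let the database run its normal workload for a while ...

execute statspack.snap

-- prompts for the begin/end snapshot ids and writes the report,
-- including the shared pool and buffer cache advisory sections
@?/rdbms/admin/spreport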

Opinion on the analysis

Sanji, November 28, 2006 - 10:57 am UTC

Referring back to my "Analysis" review of November 27, 2006:

1> As per the advisory, the estimated physical read factor wouldn't be affected "much" if we reduce the size of the cache. Allocating more space unnecessarily can thus be avoided, if that helps by latching fewer blocks in the cache.
2> For the same reason - why allocate unnecessary space when it's not required?
3> I need to investigate why the data block size was set to 4KB and check the segment allocation to deduce whether a bigger block size would be beneficial.
4> The Checkpoint not complete error is quite prominent in the alert log.
Tue Nov 28 01:44:22 2006
Thread 1 cannot allocate new log, sequence 51155
Checkpoint not complete
Current log# 2 seq# 51154 mem# 0: /pfin9/oradata/PFIN/redo/redo_2.log
Current log# 2 seq# 51154 mem# 1: /pfin6/oradata/PFIN/redo/redo_2b.log
Tue Nov 28 01:44:50 2006

Multiple smaller log files, along with multiple database writers (point 5), should solve the problem.

The disaster recovery environment has been configured such that the database is backed up using backup tablespace mode every half an hour. This generates a huge amount of redo and, subsequently, archives; the datafiles along with the archives are then propagated to the recovery site. During the complete process, the database stalls. I presume the database configuration was done this way because of the half-hourly backup strategy, and I strongly believe that Data Guard would be a better option in this scenario.

Thanks
Sanji

Tom Kyte
November 28, 2006 - 12:05 pm UTC

1) are we reading the same numbers? somewhere between the first two lines below is 1.2 gb:

D 1,008 .3 242,865 1.45 2,009,014,952
D 1,344 .4 323,820 1.24 1,716,391,291
...
D 3,328 1.0 801,840 1.00 1,384,631,257


so, you really want 25-50% more physical IO, if that is not "by much", I'm at a loss to describe "by much"

2) no big deal there - up to you.

3) correct, you have lots of "proving to do" before you "do anything", so I'm back to "why"

4) multiple dbwrs may or may not do a thing (if async io is already on, probably not).




I would change the backup tool then, rman would not cause additional redo to be generated

If you are backing up every half hour - this must be a tiny database, which reinforces my comment on #4.

Analysis

Sanji, November 28, 2006 - 11:32 am UTC

It's a 570GB database.
The reason I was recommending Data Guard was the disaster recovery strategy. In any case, RMAN would be the backup tool.

Per statspack, there are queries that use only literals and are executed quite a few times.

Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
1,084,205 1 1,084,205.0 2.5 5.15 5.75 1503174161
Module: oracle@orasrv1 (TNS V1-V3)
SELECT /*+ */ "A1"."VENDOR_GROUP","A1"."VENDOR","A1"."LOCATION_CODE","A1"."VENDOR_SNAME","A1"."ADDR1","A1"."CITY_ADDR5","A1"."STATE_PROV","A1"."POSTAL_CODE","A1"."PHONE_NUM","A1"."FAX_NUM","A1"."TAX_ID","A1"."VAT_REG_NBR","A1"."CREATE_DATE","A1"."DT_CHG" FROM "VIEW_VENDOR_ALL" "A1" WHERE "A1"."VENDOR_GROUP"='UR' AND "A

1,081,599 1 1,081,599.0 2.5 5.39 5.66 281584770
Module: oracle@orasrv1 (TNS V1-V3)
SELECT /*+ */ "A1"."VENDOR_GROUP","A1"."VENDOR","A1"."LOCATION_CODE","A1"."VENDOR_SNAME","A1"."ADDR1","A1"."CITY_ADDR5","A1"."STATE_PROV","A1"."POSTAL_CODE","A1"."PHONE_NUM","A1"."FAX_NUM","A1"."TAX_ID","A1"."VAT_REG_NBR","A1"."CREATE_DATE","A1"."DT_CHG" FROM "VIEW_VENDOR_ALL" "A1" WHERE "A1"."VENDOR_GROUP"='UR' AND "A

1,080,491 1 1,080,491.0 2.5 5.07 5.38 2919570497
Module: oracle@orasrv1 (TNS V1-V3)
SELECT /*+ */ "A1"."VENDOR_GROUP","A1"."VENDOR","A1"."LOCATION_CODE","A1"."VENDOR_SNAME","A1"."ADDR1","A1"."CITY_ADDR5","A1"."STATE_PROV","A1"."POSTAL_CODE","A1"."PHONE_NUM","A1"."FAX_NUM","A1"."TAX_ID","A1"."VAT_REG_NBR","A1"."CREATE_DATE","A1"."DT_CHG" FROM "VIEW_VENDOR_ALL" "A1" WHERE "A1"."VENDOR_GROUP"='UR' AND "A

SQL ordered by Reads for DB: PFIN Instance: PFIN Snaps: 7 -8
CPU Elapsd
Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
27,963 2 13,981.5 11.7 2.64 2.76 3978611445
Module: oracle@orasrv1 (TNS V1-V3)
SELECT /*+ */ DISTINCT "A1"."VENDOR_GROUP","A1"."VENDOR","A1"."LOCATION_CODE","A1"."VENDOR_SNAME","A1"."ADDR1","A1"."CITY_ADDR5","A1"."STATE_PROV","A1"."POSTAL_CODE","A1"."PHONE_NUM","A1"."FAX_NUM","A1"."TAX_ID","A1"."VAT_REG_NBR","A1"."CREATE_DATE","A1"."DT_CHG" FROM "VIEW_VENDOR_USA" "A1" WHERE "A1"."VENDOR_GROUP"='U

21,239 1 21,239.0 8.9 2.08 28.75 4248051120
Module: MSACCESS.EXE
SELECT "COMPANY" ,"FISCAL_YEAR" ,"ACCT_UNIT" ,"ACCOUNT" ,"SUB_ACCOUNT" ,"DB_BEG_BAL" ,"CR_BEG_BAL" FROM "LAWSON"."GLAMOUNTS" WHERE (("COMPANY" = 1 ) AND ("FISCAL_YEAR" = 2003 ) )

There are repeated occurrences of such queries, so there is another issue of literals that need to be replaced by bind variables.

Any suggestions, Tom?

Thanks
Sanji

Tom Kyte
November 28, 2006 - 12:14 pm UTC

looks more like a query in need of some SERIOUS tuning - over a million logical IO's, unless that pulls hundreds of thousands of records or more, I'd be concerned about that too.

Automatic Shared Memory Management

prabu, November 28, 2006 - 4:10 pm UTC

Hi Tom,

You put things in a new light all the time. Thanks much for that.

How would you view 10g's "Automatic Shared Memory Management" in this regard [the default is 128 MB of SGA]?

How efficient/trustworthy is this automatic memory management?

PS: the spell check is a very good option

Tom Kyte
November 28, 2006 - 8:05 pm UTC

in what regard?

Auto Shared Memory Management

Prabu, November 29, 2006 - 10:31 am UTC

Hi Tom,
I am looking for the logic by which Oracle adjusts the shared memory elements [pools]. Each element of the memory pool has a "specific" purpose to serve. How does Oracle balance this?

select pool, sum(bytes)/1024/1024 Mbytes from v$sgastat group by pool;


1) What are the considerations by which it shrinks/expands the other pools? Any fixed ratios/formulas?

2) How efficient is Automatic Shared Memory Management compared to manual mode?

3) What kind of statistics does Oracle collect to age data out of the SGA? Meaning, is it round robin, or is it intelligent enough to collect usage statistics (frequency of data usage, number of users who accessed the data, etc.)? I understand that in an OLTP environment this would be an unwanted overhead, but what about warehousing/DSS environments?

Looking forward to your explanation

regards,
Prabu

Tom Kyte
November 30, 2006 - 9:11 am UTC

undocumented
ever changing
AUTOMATIC