A reader, April 30, 2007 - 12:19 pm UTC
April 30, 2007 - 3:46 pm UTC
Bug in DBMS_STATS.AUTO_SAMPLE_SIZE
Imtiaz Rahim, May 01, 2007 - 1:25 am UTC
Dear Tom,
I have used DBMS_STATS.GATHER_TABLE_STATS(.....,estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,method_opt => 'FOR ALL COLUMNS SIZE AUTO').
But after that the performance degraded tremendously. I checked Metalink and they mentioned that there is a bug in AUTO_SAMPLE_SIZE.
My question to you is: could you tell me what might be a good sample size for analyzing? Are there any constraints based on the size of the table? I ran it on our Data Warehouse, where tables range from 100 to more than 100,000,000 rows.
What might be the best practice?
Thanks a lot in advance.
Imtiaz
May 01, 2007 - 9:55 am UTC
they mentioned there is a bug - how about "more details" - a note #, a bug# whatever.
I would question gathering histograms on all columns - in general, they should be the exception, not the rule.
for a really large table like that, a very small % works nicely - even 1%. You'll find histograms to be very expensive to gather and will likely want to fine tune your approach there.
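As a rough sketch of that advice (the schema, table, and column names below are hypothetical), you might gather with a small fixed sample and no histograms, then add a histogram only on a column you know to be skewed:

```sql
-- gather basic stats on a large table with a 1% sample, no histograms
BEGIN
   DBMS_STATS.GATHER_TABLE_STATS(
      ownname          => 'DW_OWNER',                -- hypothetical schema
      tabname          => 'BIG_FACT',                -- hypothetical table
      estimate_percent => 1,                         -- small sample for a very large table
      method_opt       => 'FOR ALL COLUMNS SIZE 1',  -- basic column stats, no histograms
      cascade          => TRUE );

   -- second pass: histogram only on a column known to be skewed
   DBMS_STATS.GATHER_TABLE_STATS(
      ownname          => 'DW_OWNER',
      tabname          => 'BIG_FACT',
      estimate_percent => 1,
      method_opt       => 'FOR COLUMNS STATUS_CODE SIZE 254' );  -- hypothetical skewed column
END;
/
```

The two-pass approach keeps the expensive histogram gathering confined to the handful of columns where skew actually matters.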
BUG# 3150750
A reader, May 01, 2007 - 11:21 am UTC
Dear Tom,
The bug detail is Bug# 3150750. We are using 9.2.0.5.0.
Could you advise what would be the best practice for dbms_stats? If you advise against gathering histograms, then what method should we use for a typical Data Warehouse?
Another question: we have a 51% PGA hit ratio. The queries use mostly full table scans; what might be the way to improve the PGA hit ratio? My SGA is 12 GB and my PGA is 1 GB.
At one time, at best 15 - 20 sessions are running.
Thanks in advance.
Imtiaz
May 01, 2007 - 11:36 am UTC
given that 9.2.0.8 is the current release and that bug was tagged as fixed in 9.2.0.6 - perhaps..... well, you know.
Define "typical" :) that is a rhetorical question, there is no such thing as "typical". You know your data, you know what is likely to be used in a predicate and what is likely to be "skewed". It would be those attributes you would consider.
Full table scans and PGA memory usage are not really related - I don't know why you pulled them together.
On a data warehouse, where large sorts and hash joins are probable, a larger pga aggregate target might well be useful.
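On a system like that, raising the target is a one-line change (the 4 GB figure below is purely illustrative - size it from the advisor, not from this example):

```sql
-- pga_aggregate_target is dynamic; no restart needed
ALTER SYSTEM SET pga_aggregate_target = 4G SCOPE = BOTH;
```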
PGA
Imtiaz Rahim, May 01, 2007 - 11:46 am UTC
Oh Tom,
Sorry for the poor explanation. What I meant was: the queries use full table scans, and the results are sorted in the PGA. As most of the queries use aggregate functions, I would like to increase the hit ratio of my PGA. As I recall, the allocation is done on a per-session basis. So I would like to know how I can determine an optimal PGA size to attain a PGA hit ratio of ~100%.
Are there any data dictionary tables I could use to do so?
Thanks
Imtiaz
May 01, 2007 - 12:01 pm UTC
statspack - there is a PGA advisor in there.
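Outside of statspack, the same advisory data is exposed in the v$pga_target_advice view; a query along these lines shows the estimated cache hit percentage at various candidate target sizes:

```sql
-- estimated PGA cache hit % at each candidate target size
SELECT ROUND( pga_target_for_estimate / 1024 / 1024 ) AS target_mb,
       estd_pga_cache_hit_percentage                  AS est_hit_pct,
       estd_overalloc_count
  FROM v$pga_target_advice
 ORDER BY pga_target_for_estimate;
```

Look for the smallest target where est_hit_pct flattens out and estd_overalloc_count is zero.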
Great info.
Imtiaz Rahim, May 01, 2007 - 12:02 pm UTC
Thanks Tom. Your views are as helpful as always.
Thanks a lot.
Imtiaz
Benefit of ANALYZE vs DBMS_STATS
Jaroslav Tachovsky, March 26, 2013 - 3:35 pm UTC
Hi,
let me provide a scenario where ANALYZE was the only option to compute statistics (tested on 10g). We had a testing database and used "alter system set fixed_date=..." to simulate the flow of time. When we set fixed_date below user_tables.last_analyzed, dbms_stats did not compute statistics (it thought it already had newer ones). ANALYZE worked fine.
March 27, 2013 - 3:41 pm UTC
but now you have a totally invalidated test, since analyze and dbms_stats gather different things. there is that.
if you had an issue with fixed date, you should have simply set fixed date higher to gather the stats and then back to whatever you wanted.
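A sketch of that workaround (the date and schema name below are hypothetical): return fixed_date to the real clock, gather with dbms_stats, then restore the simulated date:

```sql
ALTER SYSTEM SET fixed_date = NONE;   -- back to the real system clock

BEGIN
   DBMS_STATS.GATHER_SCHEMA_STATS( ownname => 'TEST_OWNER' );  -- hypothetical schema
END;
/

ALTER SYSTEM SET fixed_date = '2005-01-15-00:00:00';  -- restore simulated date
```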
seems a much simpler approach compared to rewriting your statistics gathering script into a script that doesn't do what you'll be doing in real life, doesn't it?
do not use analyze. it won't result in the same plans that dbms_stats would, 100% of the time.