what are histograms and how should we gather them
Thx, September 18, 2001 - 1:03 am UTC
September 18, 2001 - 7:49 am UTC
Histograms are more in-depth statistics concerning the distribution of data in a table's column.
They are useful when you have an index on a column that contains skewed data. For example, suppose you have a column where 50% of the values are 0 and the remaining 50% are "unique" (eg: the value in the column is either 0 or some unique value).
Well, if you submit the query "select * from t where column = 0" -- you do NOT want to use the index (not selective enough). On the other hand, a query such as "select * from t where column > 1000" might use an index (depends on the values). If you compute histograms for the column -- the optimizer can figure out when and when NOT to use the index.
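A minimal sketch of gathering such a histogram with DBMS_STATS (the table name T and column name SKEWED_COL are placeholders for illustration):

```sql
-- Gather table statistics plus a histogram (up to 254 buckets)
-- on the skewed column, so the optimizer can see that the value 0
-- is unselective while other values are highly selective.
begin
   dbms_stats.gather_table_stats(
      ownname    => user,
      tabname    => 'T',
      method_opt => 'FOR COLUMNS SIZE 254 SKEWED_COL' );
end;
/
```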
A reader, September 18, 2001 - 4:12 am UTC
Thanks, Tom.
To the first part of my question: I always use bind variables
in my application. Does it make sense to use histograms in my case? How much overhead in parsing and memory is there when I use histograms?
To the third part: I wanted to do a full database import into a database with a different block size, and got an error that Oracle could not import statistics. I had to delete all statistics in the original database; after that, the import was successful. This was the first time I had to delete statistics before an export -- previously I could export statistics successfully.
September 18, 2001 - 8:48 am UTC
Yes, histograms can still make sense.
If you have a compound index on (a,b) and the optimizer understands the distribution of the values in A and B it might optimize:
select * from t where b = ?
differently (9i INDEX SKIP SCAN for example). There is not a significant overhead to using histograms.
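A sketch of the access path meant here (table, column, and index names are hypothetical):

```sql
-- Compound index on (a, b).  If A has few distinct values and the
-- optimizer knows this from the statistics, a predicate on B alone
-- may be satisfied by an INDEX SKIP SCAN (9i and later): one logical
-- probe of the index per distinct value of A, instead of a full scan.
create index t_ab_idx on t ( a, b );

select * from t where b = :b;
```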
As for the last part, "I have an error" is very vague. One would need to know the error in order to diagnose the problem (at the very least). I doubt it had to do with block sizes; rather, it would be a "product issue" (aka a bug).
dbms_utility and monitoring
David, September 18, 2001 - 5:32 am UTC
Am I right that dbms_utility.analyze_schema cannot use the info in user_tab_modifications, but dbms_stats can?
September 18, 2001 - 8:49 am UTC
dbms_stats can, via the GATHER STALE option for gathering stats on stale tables. dbms_utility is not aware of this facility (it was added after dbms_utility was written). In all cases, dbms_stats would be preferred over dbms_utility to compute statistics.
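A minimal sketch of the stale-only gather (this assumes table monitoring is enabled, so that user_tab_modifications is being populated):

```sql
-- Gather statistics only for tables whose monitored modification
-- counts mark their existing statistics as stale.
begin
   dbms_stats.gather_schema_stats(
      ownname => user,
      options => 'GATHER STALE' );
end;
/
```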
first_rows
Andrej, September 18, 2001 - 5:51 am UTC
When is the setting of first_rows in init.ora relevant and useful for me?
September 18, 2001 - 8:49 am UTC
Well -- it is always "relevant"; if it is set, it will definitely impact you.
As for useful, I don't know. I personally would not use it; I would use CHOOSE, as I stated, and if I wanted first_rows optimization, do an ALTER SESSION to get it.
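That is, rather than setting it instance-wide in the init.ora, enable it only for the sessions that need it:

```sql
-- Scope first_rows optimization to a single session instead of
-- changing the optimizer goal for the whole instance.
alter session set optimizer_mode = first_rows;
```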
import/export and statistics for sys schema
Olaf, September 18, 2001 - 9:17 am UTC
Is it right that when I have statistics for the SYS/SYSTEM schemas, I cannot do an export/import?
September 18, 2001 - 9:22 am UTC
SYS/SYSTEM should not have stats.
SYS is never exported...
Olaf, September 18, 2001 - 9:37 am UTC
I always thought that statistics for all users are stored in the SYS schema and made available through, for example, the USER_TABLES views. How can I transfer a user's statistics with a schema export?
September 18, 2001 - 11:33 am UTC
EXP can sometimes extract some statistics. See the server utilities guide under exp for details.
DBMS_STATS can be used to copy stats and put them into another database as well.
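A sketch of the DBMS_STATS route (the stat table name STATS_TAB is a placeholder):

```sql
-- In the source database: stage the schema's statistics in an
-- ordinary table that exp can transport like any other table.
exec dbms_stats.create_stat_table( user, 'STATS_TAB' );
exec dbms_stats.export_schema_stats( user, 'STATS_TAB' );

-- Move STATS_TAB with exp/imp, then in the target database:
exec dbms_stats.import_schema_stats( user, 'STATS_TAB' );
```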
How to run these DBMS procedures?
A reader, September 18, 2001 - 9:52 am UTC
I get an error when I try to run the dbms_space utilities. Am I missing a role or privilege? SCOTT has the DBA role; I also tried with the SYSTEM user, same error. Please advise.
SQL> CONN SCOTT/TIGER@DBDEMO
Connected.
SQL> EXEC SHOW_SPACE ('EMP');
BEGIN SHOW_SPACE ('EMP'); END;
*
ERROR at line 1:
ORA-06550: line 1, column 7:
PLS-00201: identifier 'SHOW_SPACE' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
September 18, 2001 - 11:34 am UTC
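The PLS-00201 above simply means SHOW_SPACE does not exist in the current schema: show_space is not a supplied package but a user-written helper around DBMS_SPACE, and must be created before it can be called. A simplified sketch of such a procedure (for a table segment the current user owns):

```sql
create or replace procedure show_space( p_segname in varchar2 )
as
   l_total_blocks   number;
   l_total_bytes    number;
   l_unused_blocks  number;
   l_unused_bytes   number;
   l_file_id        number;
   l_block_id       number;
   l_last_block     number;
begin
   -- report allocated vs unused space for the named table segment
   dbms_space.unused_space(
      segment_owner             => user,
      segment_name              => p_segname,
      segment_type              => 'TABLE',
      total_blocks              => l_total_blocks,
      total_bytes               => l_total_bytes,
      unused_blocks             => l_unused_blocks,
      unused_bytes              => l_unused_bytes,
      last_used_extent_file_id  => l_file_id,
      last_used_extent_block_id => l_block_id,
      last_used_block           => l_last_block );
   dbms_output.put_line( 'Total Blocks.....' || l_total_blocks );
   dbms_output.put_line( 'Unused Blocks....' || l_unused_blocks );
end;
/
```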
A problem with statistics
Ravi, February 11, 2003 - 9:11 am UTC
Hi Tom,
The following query,
select count(*)
from t1, t2, t3
where t1.f2 = t2.f2 (+)
and t2.f3 = t3.f3 (+);
gives a result of
COUNT(*)
----------
50
After analyzing my schema using DBMS_STATS.GATHER_SCHEMA_STATS, weirdly, the same query returns the value 1.
COUNT(*)
----------
1
In table T1, the column f2 is NULL except for one row. So, apparently, the outer join is not being performed in the above query; a normal join is performed instead.
select /*+rule*/
count(*)
from t1, t2, t3
where t1.f2 = t2.f2 (+)
and t2.f3 = t3.f3 (+);
gives the original value of 50.
Is this a bug in Oracle? Or can you please explain why Oracle is behaving this way? Also, can you please help me to solve this? I'm using Oracle 9.2.0.
Thanks in advance
Ravi
February 11, 2003 - 4:46 pm UTC
unless you give an entire test case -- there is not much I can really say -- I don't know the indexes, the volumes of data, or whatnot.
But yes, I would put this in the category of "erroneous behavior".
Time to file a TAR with support once you get the test case together.
to delete statistics
reader, August 14, 2003 - 6:38 pm UTC
To gather statistics, I use
dbms_stats.gather_table_stats('DEMO','T');
Is there a proc to delete statistics for a particular table? Thanks.
August 14, 2003 - 8:03 pm UTC
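Yes -- DBMS_STATS has a matching delete procedure. A minimal example for the table gathered above:

```sql
-- Remove the table, column, and index statistics for DEMO.T
exec dbms_stats.delete_table_stats( 'DEMO', 'T' );
```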
DBMS_STATS
Oleksandr Alesinskyy, August 15, 2003 - 4:48 am UTC
I dare say that I have had a very bad experience with DBMS_STATS and Oracle9i. The worst was when a very
straightforward query started to produce ORA-3113 after applying DBMS_STATS to the underlying objects. Gathering the same statistics with DBMS_UTILITY or the ANALYZE command (in all cases 'FOR TABLE FOR ALL INDEXES FOR ALL INDEXED COLUMNS') did not cause this problem. There were other issues as well. I opened a TAR, but without much success. So I would not recommend using DBMS_STATS, despite all of Oracle's recommendations.
August 15, 2003 - 8:36 am UTC
nothing to do with dbms_stats.
everything to do with the plan generated based on the statistics gathered.
you are killing the messenger here. next time, analyze might do it to you and dbms_stats won't. ORA-3113 means "i'm sorry, something really wrong happened processing this query plan". It is NOT dbms_stats that did it. It is a bug, but not a bug in dbms_stats.
don't kill the messenger.
Real time stats gathering
Rajeshwaran, Jeyabal, December 07, 2020 - 2:09 pm UTC
Team:
was reading about "real time stats" gathering here -
https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/optimizer-statistics-concepts.html#GUID-769E609D-0312-43A7-9581-3F3EACF10BA9
However, the demo below shows it is not working as described; kindly advise.
demo@PDB1> select banner_full from v$version;
BANNER_FULL
-------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
demo@PDB1> create table t as select * From all_objects;
Table created.
demo@PDB1> select distinct notes
2 from user_tab_statistics
3 where table_name ='T';
NOTES
-------------------------
demo@PDB1> select distinct notes
2 from user_tab_col_statistics
3 where table_name ='T';
NOTES
-------------------------------------------------------------------------
STATS_ON_LOAD
demo@PDB1> insert into t select * from all_objects where rownum <=1000;
1000 rows created.
demo@PDB1> commit;
Commit complete.
demo@PDB1> select distinct notes
2 from user_tab_statistics
3 where table_name ='T';
NOTES
-------------------------
demo@PDB1> select distinct notes
2 from user_tab_col_statistics
3 where table_name ='T';
NOTES
-------------------------------------------------------------------------
STATS_ON_LOAD
demo@PDB1>
December 08, 2020 - 5:11 am UTC
Engineered system?
Real time stats gathering
Rajeshwaran Jeyabal, September 21, 2021 - 12:52 pm UTC
September 23, 2021 - 5:43 am UTC
Not sure what you mean - the parameter means it's either on or off
Real time stats gathering
Rajeshwaran Jeyabal, September 23, 2021 - 5:55 am UTC
September 23, 2021 - 10:15 am UTC
As the post says:
It is now disabled by default in Oracle Database 19c RU10 on-premises Exadata systems, but it can be enabled using the new parameter.
Real time stats gathering
Rajeshwaran Jeyabal, September 23, 2021 - 12:50 pm UTC
September 24, 2021 - 7:36 am UTC
Correct
Real time statistics on "delete"
Rajeshwaran Jeyabal, September 27, 2021 - 2:36 pm UTC
Team,
Can you please help us understand why no real-time statistics are gathered for conventional DML deletes?
demo@PDB19> select banner_full from v$version;
BANNER_FULL
--------------------------------------------------------------------------------
Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
Version 19.12.0.0.0
demo@PDB19> select * from v$cell;
CELL_PATH CELL_HASHVAL CON_ID CELL_TYPE
------------------------------ ------------ ---------- ----------
192.168.174.11;192.168.174.12 3115627269 0 EXADATA
192.168.174.13;192.168.174.14 1184623385 0 EXADATA
192.168.174.15;192.168.174.16 3875978273 0 EXADATA
192.168.174.17;192.168.174.18 3810989880 0 EXADATA
192.168.174.19;192.168.174.20 922794634 0 EXADATA
192.168.174.9;192.168.174.10 1518238859 0 EXADATA
6 rows selected.
demo@PDB19> show parameter optimizer_real_time_statistics
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_real_time_statistics boolean FALSE
demo@PDB19> alter session set optimizer_real_time_statistics=true;
Session altered.
demo@PDB19> drop table t purge;
Table dropped.
demo@PDB19> create table t(
2 n1 number primary key,
3 n2 number ,
4 n3 varchar2(80) );
Table created.
demo@PDB19>
demo@PDB19> insert /*+ append */ into t(n1,n2,n3)
2 with rws as ( select rownum n from dual connect by level <= 1000 )
3 select n, ceil(sqrt(n)),
4 to_char(to_date(n,'j'),'jsp')
5 from rws;
1000 rows created.
demo@PDB19> commit;
Commit complete.
demo@PDB19>
demo@PDB19> delete from t where n2 > 800;
0 rows deleted.
demo@PDB19> commit;
Commit complete.
demo@PDB19> delete from t where n1 > 800;
200 rows deleted.
demo@PDB19> commit;
Commit complete.
demo@PDB19> exec dbms_stats.flush_database_monitoring_info;
PL/SQL procedure successfully completed.
demo@PDB19> select num_rows,blocks,monitoring
2 from user_tables
3 where table_name ='T';
NUM_ROWS BLOCKS MON
---------- ---------- ---
1000 8 YES
demo@PDB19>
demo@PDB19> select num_rows,blocks,stale_stats,notes
2 from user_tab_statistics
3 where table_name ='T';
NUM_ROWS BLOCKS STALE_S NOTES
---------- ---------- ------- -------------------------
1000 8 YES
demo@PDB19>
demo@PDB19> select column_name,num_nulls,num_distinct,low_value,high_value
2 from user_tab_columns
3 where table_name ='T';
COLUMN_NAM NUM_NULLS NUM_DISTINCT LOW_VALUE HIGH_VALUE
---------- ---------- ------------ ---------- ----------
N1 0 1000 C102 C20B
N2 0 32 C102 C121
N3 0 1000 6569676874 74776F2068
756E647265
642074776F
demo@PDB19>
demo@PDB19> select column_name,num_nulls,num_distinct,low_value,high_value,notes
2 from user_tab_col_statistics
3 where table_name ='T';
COLUMN_NAM NUM_NULLS NUM_DISTINCT LOW_VALUE HIGH_VALUE NOTES
---------- ---------- ------------ ---------- ---------- -------------------------
N1 0 1000 C102 C20B STATS_ON_LOAD
N2 0 32 C102 C121 STATS_ON_LOAD
N3 0 1000 6569676874 74776F2068 STATS_ON_LOAD
756E647265
642074776F
demo@PDB19>
September 29, 2021 - 6:32 am UTC
My understanding is that this is by design and not a bug.
Real time stats in general is trying to obtain good values for:
- total number of rows,
- column min/max values
*without* needing to read any existing table data.
For example, if the existing min/max for a column is 1-100 and you update those values to 20-120, then you'll probably see the RTS min/max as 1-120. Why? Because it's easy to track that 120 is higher than the existing max of 100, but it's impossible to know that 20 is now the new minimum *without* scanning the table.
Because we're trying to track total number of rows and min/max, we're going to do our best to keep those in sync. So...
- if you insert, we can adjust total, min and max in sync, because an insert can only add rows and widen the range
- if you update, total doesn't change and we can keep 'max' in sync, but we potentially lose 'min'
- if you delete, then we could only change 'total', which would leave it out of sync with min and max, so we don't do it. It's better for us to keep all three in sync.
- But if you update *and* delete, then we probably will track the delete, because we've now lost the 'min' sync anyway.
- Similarly, if you have histograms, we're going to use them in preference, because they are likely to be better data.
In all cases, we're doing our best to keep the stats as "intelligent" as possible. It's always going to be a "best effort", because things like "delete every row" are exceptional cases.
And of course, what we can achieve with RTS will change with each release as we enhance the feature. In 21c, you'll see we can potentially adjust distinct values as well.