Thanks for the question, Andrew.
Asked: December 24, 2005 - 12:02 pm UTC
Last updated: December 26, 2005 - 10:28 am UTC
Version: 9.2.0
You Asked
Hi Tom,
I was reading your "Expert Oracle Database Architecture", where it says (p. 409): "The second technique that works with ON COMMIT PRESERVE ROWS global temporary tables is to use GATHER_TABLE_STATS directly on that table. You would populate the global temporary table ... and then execute GATHER_TABLE_STATS on that global temporary table."
This sounds quite simple, but in my opinion there is a caveat to this technique.
Consider:
Sess #1> select * from v$version where rownum = 1;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - Production
Sess #1> create global temporary table gtt (x int)
2 on commit PRESERVE rows;
Table created.
Sess #1> insert into gtt
2 select rownum
3 from all_objects
4 where rownum <= 1000;
1000 rows created.
Sess #1> commit;
Commit complete.
Sess #1> select count(*) from gtt;
COUNT(*)
----------
1000
Now, let's start another session and issue the same query:
Sess #2> select count(*) from gtt;
COUNT(*)
----------
0
So far so good -- no rows, which is the expected result. Now, let's return to our first session, gather statistics on the temporary table, and look closer:
Sess #1> exec dbms_stats.GATHER_TABLE_STATS (ownname => user, tabname => 'GTT')
PL/SQL procedure successfully completed.
Sess #1> set autotrace traceonly exp
Sess #1>
Sess #1> select * from gtt;
1000 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1000 Bytes=3000)
1 0 TABLE ACCESS (FULL) OF 'GTT' (Cost=4 Card=1000 Bytes=3000)
And this is good, too. We gathered statistics, so the CBO can make better decisions now. It's easy to see that the cardinality estimate was 100% accurate here.
But what about our second session?
Sess #2> set autotrace on exp
Sess #2>
Sess #2> select * from gtt;
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1000 Bytes=3000)
1 0 TABLE ACCESS (FULL) OF 'GTT' (Cost=4 Card=1000 Bytes=3000)
Here, things get worse. Our query returned no rows, but AUTOTRACE shows that the estimated cardinality is 1000, just as if we were running it in the first session.
So, if you suggest the following technique:
1. Insert data into temporary table
2. Gather statistics on this table.
3. Run queries against temporary table
-- then I have a problem with it, because the example above shows that, in a concurrent environment, this approach would be negatively affected.
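For reference, the three steps would look something like this in SQL (a minimal sketch against the GTT table from the example above; the final query is illustrative):

insert into gtt select rownum from all_objects where rownum <= 1000;
commit;
exec dbms_stats.gather_table_stats( ownname => user, tabname => 'GTT' )
select * from gtt where x > 500;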
Suppose we have two concurrent sessions and they both use the same temporary table. What will happen if the following events occur in the order specified?
12:00 am. Session 1 loaded 1,000,000 rows into the temp table.
12:01 am. Session 2 loaded 100 rows into the same temp table.
12:02 am. Session 2 finished gathering statistics on this temp table.
12:03 am. Session 1 finished gathering statistics.
12:04 am. Session 2 issued some query against the temp table, and the CBO saw the wrong statistics, because they had just been re-gathered by session 1.
12:05 am. Session 1 issued its own query...
So, as far as I can see, there's a possible caveat in this technique. I would personally prefer using DYNAMIC_SAMPLING, or maybe the CARDINALITY hint, to make the CBO happy with temporary tables.
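For example (a sketch only -- the sampling level of 2 and the cardinality estimate of 100 are illustrative values you would pick to match your data, and CARDINALITY is an undocumented hint):

select /*+ dynamic_sampling(gtt 2) */ * from gtt;
select /*+ cardinality(gtt 100) */ * from gtt;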
And what do you think of this?
Thanks in advance and best regards,
Andrew.
and Tom said...
The goal was to populate the global temporary table with REPRESENTATIVE statistics - statistics that are going to be "in general representative of what you expect in this table".
<quote>
This problem is particularly noticeable with temporary tables. There is simply no way to analyze them and gather statistics. You could use DBMS_STATS.SET_TABLE_STATS to put in representative statistics, but that would be problematic if you didn't know what the representative statistics should be, or if they changed from run to run. (In Oracle8i and earlier, this was your only option.) Also, the defaults for temporary tables are different from the defaults for regular tables.
</quote>
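Setting statistics by hand with SET_TABLE_STATS would look something like this (a sketch; the row and block counts are made-up "representative" values you would have to choose yourself):

begin
   dbms_stats.set_table_stats
   ( ownname => user,
     tabname => 'GTT',
     numrows => 1000,
     numblks => 4 );
end;
/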
If there is no such "representative set of statistics", then you are in a "rock and a hard place" sort of situation. You could use dynamic sampling:
<quote>
Interestingly, the temporary table appears to be much larger and the real table smaller. Enter dynamic sampling. Using this feature, the optimizer will, at query optimization time, dynamically gather just enough statistics on the referenced object to come up with a more meaningful, correct plan - in the most likely event that your temporary tables do not typically have 8,168 rows (if they do, you won't need this).
For the OPTIMIZER_DYNAMIC_SAMPLING parameter, the valid values are 0 through 10, with 0 meaning don't do any sampling and 10 being the most aggressive. Level 1 is the default in Oracle9i Release 2 if the OPTIMIZER_FEATURES_ENABLED parameter is set to 9.2 or above. Otherwise, level 0 is the default. Table 6-5 describes each setting for this parameter.
</quote>
but even so - that only happens during a hard parse - so if YOU hard parse with a global temporary table filled with 1,000 rows - and when I run the same query I have only 1 - it'll use the plan generated for 1,000 rows.
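If you want to try dynamic sampling, a sketch (the level of 2 is illustrative; remember it only kicks in when the statement is hard parsed):

alter session set optimizer_dynamic_sampling = 2;
select * from gtt;  -- sampled only at hard parse; later sessions reuse this plan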
The article "Query Plans with Temporary Tables"
https://www.oracle.com/technetwork/issue-archive/2014/14-sep/o54asktom-2279184.html
discusses this in more detail.