Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Guglielmo.

Asked: February 14, 2018 - 7:41 pm UTC

Last updated: February 20, 2018 - 1:15 am UTC

Version: 12.1.0.2

Viewed 1000+ times

You Asked


I have an upcoming production migration I need help with. Table A has hundreds of million of rows. The migration is supposed to update a column in table A, but instead of doing an UPDATE it creates table B (one-to-one with table A) and then does a CTAS to create table C, then drops table B.

The migration developer did not bother to take statistics on table B. The CTAS has a parallel join which is using broadcast instead of hash. It's adding an extra 12 hours to the migration.

Management doesn't want to touch the migration code to add statistics collection or even add SQL hints. I want to tell Oracle to use a parallel hash join (not broadcast) but the sql id for the CTAS does not exist until the migration kicks off. Table B does not exist until the migration kicks off either.

Is it possible to create a sql plan baseline for a query that doesn't exist yet, or to create stats for a table that doesn't exist yet? I know there is an undocument procedure that rewrites incoming SQL code, that could be an option as well.

I know that this is solving the wrong problem, but that's what I need help with.

and Connor said...

Maybe a SQL translation

SQL> exec dbms_sql_translator.create_profile('CTAS');

PL/SQL procedure successfully completed.

SQL> exec dbms_sql_translator.register_sql_translation('CTAS','create table t as select * from dba_objects','create table t as select * from dba_objects where rownum <= 10');

PL/SQL procedure successfully completed.

SQL> alter session set sql_translation_profile = CTAS;

Session altered.

SQL> alter session set events = '10601 trace name context forever, level 32';

Session altered.

SQL> create table t as select * from dba_objects;

Table created.

SQL> select count(*) from t;

  COUNT(*)
----------
        10

1 row selected.


Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, February 19, 2018 - 1:09 pm UTC

Thank you very much.

It's a good thing 12c has this function as a documented pl/sql package. I remember seeing an undocumented procedure in 11g but I couldn't remember the name.

One thing I was wondering while pondering this type of solution was how much overhead there would be. But in any case I would only need this for the duration of the data migration.

Another option I was thinking about was turning on dynamic sampling instance-wide for the duration of the migration, but I would like the stats to run with auto_sample_size and I am not sure if there is a DYNAMIC(..) parameter that will reproduce that stats collection algorithm (where it builds a hash as it goes without sorting rows.)

Have a great day.


Connor McDonald
February 20, 2018 - 1:15 am UTC

One thing I was wondering while pondering this type of solution was how much overhead there would be.

Compared to the time it will take to run a big CTAS....negligible. It is a parse-time operation.

but I would like the stats to run with auto_sample_size

That is the default anyway, so unless you've tweaked preferences we'll be using the optimized NDV stats algorithm

online stats gathering

Rajeshwaran Jeyabal, February 20, 2018 - 8:39 am UTC

....
but instead of doing an UPDATE it creates table B (one-to-one with table A) and then does a CTAS to create table C, then drops table B.
The migration developer did not bother to take statistics on table B.
....


Since you were on Oracle 12c, creating a table by default do stats gathering (dont need an addition table scan for stats gathering) because of the new feature "online stats gathering".

Exception is only for the partition tables where partition level stats wont be available by default.

....
The CTAS has a parallel join which is using broadcast instead of hash. It's adding an extra 12 hours to the migration.
....


Again in 12c, new feature called "Adaptive plans" that has the ability to changing the parallel distribution methods at run time (HYBRID-HASH distribution) either to go with Broadcast or HASH distribution based on the resultset size and PX servers available.

Does these features don't help here ?

More to Explore

Administration

Need more information on Administration? Check out the Administrators guide for the Oracle Database