Skip to Main Content
  • Questions
  • Dropping a Column - Why can't we do this operatio in Parallel.

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, koshal.

Asked: March 14, 2011 - 9:51 am UTC

Last updated: August 31, 2011 - 8:54 am UTC

Version: 10.2.0.4

Viewed 10K+ times! This question is

You Asked

We had a huge Table around 360GB. Inserting that much data into the Table with Parallel degree of 96 took 5 hours. When we tried to Drop the Column it took over 36 hours and did not completed. We cancelled, in order for the session to get back to normal state it took 60 hours. We noticed it was not performing Drop Statement in Parallel. Why can't we have this feature in Oracle? Similarly when we add a column with default value it takes for ever and does not perform in Parallel. These are basic stuff why can't these operation be parallelized. Was this taken care in 11G?


Thanks a lot for your response.

and Tom said...

360gb is a rather smallish table.

It is highly unlikely you have 24 cpus (I say that, because if you spent that much for hardware, but your IO throughput is such that you need 5 hours to move a mere 360gb....) - so your parallel 96 was probably very high as well.


You don't want to have to rewrite the entire table, you should have just set it unused. We'd reclaim your space if you ever reorganized the table.

There is also the fact that you loaded this column and then you turn around and want to drop it???? Why would you do that? It seems the mistake here is not that "drop column isn't parallel" but rather that you loaded this column at all.

Set unused would have been instantaneous.

DBMS_REDEFINITION could have been used (in parallel) to "reorganize" the column away.

But not loading it in the first place is absolutely the right answer don't you think? Your IO isn't very fast if it took that long to write that much data - do you really want to have to read it all from your disk array, pull it over to the server, modify it, and write it back?


In 11g, you can add a column to a table with a default value instantly. We don't even touch the table anymore. We just change the structure in the dictionary and put the default value in the dictionary once. When we access the row later - we can tell (because the column won't appear on the block at all) - if the default value should be returned or not - so we don't have to update the block at all when we add it.




ops$tkyte%ORA11GR2> alter table big_table add x varchar(4000) default rpad('*',4000,'*') not null;

Table altered.

Elapsed: 00:00:00.12
ops$tkyte%ORA11GR2> select count(*) from big_table;

  COUNT(*)
----------
   1000000


Rating

  (5 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Default values in 11g - Fantastic

A reader, March 14, 2011 - 2:40 pm UTC

That is really smart thinking by Oracle on Default values. I love the way Default values are handled in 11G. One good reason why you need to migrate to 11G.

Thanks a lot for that info

11g new-column defaults are buggy, though

Mike, March 15, 2011 - 1:49 pm UTC

I saw an intermittent problem producing incorrect results (!) (!!) that was traced back to the new 11g add-column-with-default-without-updating-the-physical-rows feature. Oracle support suggested that, even though there was a patch available for our specific issue, we might want to disable the feature rather than risk hitting one of the many other bugs associated with it.

Tom Kyte
March 15, 2011 - 2:09 pm UTC

sigh. did you know there is a bug with select - period? More than one probably. should you not use select? As we are now in the 4th generation of 11g...

Setting to unused

A reader, March 16, 2011 - 2:27 pm UTC

1. The problem with setting it to unused is
a) You dont get the space back
b) It would count against 1000 column limit
2. When we issue a Drop statement why can't Oracle create another Table/Index and once it is done change the pointers to the new table in Data Dictionary. Something like how Index is Rebuilt is done and how exchange of partition works. Understood that entities would be locked for some time.

Reason for Dropping a column.
a) Change of business rules.
b) Change in Data Structure of source system. It is beyond one's control
c) Say out of 360GB data. A text column has 200 GB and if the Application does not need that text column any more, it makes perfect sense to Drop that Column than setting it unused and scanning 360GB

Tom Kyte
March 16, 2011 - 4:11 pm UTC

1) you just loaded a 360gb table. Now you want to rewrite it all over again. The right answer for this was "don't load that column in the first place"

See (c) below for the "you don't get space back" bit

If you are bumping up against the 1000 column limit, you got bigger fish to fry in my opinion.

2) that would be called dbms_redefinition, and you can certainly do that and it was mentioned in the original answer.


a) fine, unused satisfies that
b) see (a)
c) but dropping the column won't change the high water mark now will it. You'll still be scanning 360gb of data. You still need a reorganization if you want to "reclaim" space.

Drop should have an option

koshal patibandla, March 29, 2011 - 5:11 pm UTC

I still feel Alter table drop column column_name should have an optional whether it want to
1)
a. Reorganize the Table
b. Whether the drop option needs to be done in Parallel or not
2) Optimizer should be smart enough to guess which route it has to travel by the size of the Table. Taking lot of time and leaving empty space here and there is dumbest thing to do on a large table for a column that existed right from the begining.Drop column definitely need to have more options.




Tom Kyte
March 30, 2011 - 1:34 am UTC

1) we do have the ability to choose on the reorganization thingy - we call the one that does not 'rewrite' the table "SET UNUSED"

2) this makes no sense to me, I don't know what you are trying to communicate to us here. What does the OPTIMIZER have to do with anything here???

Taking lot of time and leaving empty space here and
there is dumbest thing to do on a large table for a column that existed right
from the begining.


and I'll say it again because you don't seem to be hearing it - SET UNUSED, it gets rid of the column, it happens instantly. If you want the column to totally disappear after that, in a non-blocking, online fashion - there is this thing called dbms_redefinition.

DROP COLUMN does NOT make the table any smaller, you would still have to reorganize it if your goal was to "get rid of empty space". So, if your goal is:

a) get rid of column
b) compact all space
c) in a fashion that end users are not affected, no downtime

then - the right answer would be set unused+dbms_redefinition.

the options all exist.

We do adding-column in 10gR2 this way and it is fast

A reader, August 30, 2011 - 6:11 pm UTC

In 10gR2, we can add a new column to a big table with default value using the alter statement below

alter big_table add (new_col_name integer default 0 not null );

But it will take very very long time to execute. If we split the task into multiple statements showed blow, it will very fast:
Step 1. Adding the column
alter big_table add (new_col_name integer);
Step 2. Update it to default value in parallel
alter session enable parallel dml;
update /*+ parallel( big_table, 24) */ big_table t set t.new_col_name = 0;
commit;

Step 3. set the table default option
alter big_table MODIFY new_col_name DEFAULT 0;


The Step 1 and Step 3 are instantly. The most times are spent on the Step 2. At least this approach works for us for the tables with 150M records (avg_row_length = 596).

I don't the performance of the statement below in 10gR2. Instantly? I don't think so.

Tom Kyte
August 31, 2011 - 8:54 am UTC

Instantly? I don't
think so.


what did you mean by that?


I wrote:

In 11g, you can add a column to a table with a default value instantly. We don't even touch the table anymore.


More to Explore

DBMS_REDEFINITION

More on PL/SQL routine DBMS_REDEFINITION here