Skip to Main Content
  • Questions
  • add a column and update the column with a big table

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Lilian.

Asked: May 12, 2002 - 10:33 pm UTC

Last updated: September 16, 2003 - 11:18 am UTC

Version: 8.1.6

Viewed 1000+ times

You Asked

Hi Tom:

Thank you very much for your help as always.

I have a big table called client that currently has over 8 millions records, now I need to add one more column called phone, and have to update phone number to each row. the table has one pk as personid, I created a temp table which includs three colums personid, phone and ind(for check), wrote procedure following as:

declare
v_nCount number(10,0);
v_nTotal number(10,0);
v_nNumber number(10,0);

begin
v_nCount := 0;

select count(*) into v_nTotal from Test_table;

while v_nTotal < v_nCount

Loop
begin

update main_table a
set a.phone = (select b.phone from test_table b where b.personid = a.personid and ind is not null and rownum < 1000);


v_nCount := sql%rowcount;

update test_table
set ind = 1
where rownum < 1000
and ind is null;

v_nCount := v_nCount + v_nNumber;

commit;

end;
end loop;

I really like to have your suggestion before i run this procedure, whether the procedure will less impact our database performance, redo, archive log.
what is the best way to update the big table.

Thank you in advance,
an


and Tom said...

To update 8 million rows and set an UNINDEXED column to some value -- I would most certainly do it in a single update.


update ( select a.phone new_phone, b.phone set_to_phone
from main_table a, test_table b
where a.personid = b.person_id )
set new_phone = set_to_phone;


And that is it. Period, no complex logic.

Your logic appears flawed to me -- UNLESS there is exactly one row for each and every row in main_table in test_table. If there is one row more or less in either table -- if the set of personid's is NOT EXACTLY the same -- you will miss some updates.

Your counting is not making sense to me either -- you ALWAYS set v_nCount to sql%rowcount (number of rows last updated) and add v_nNumber to it -- but that is never set.

Your update of main_table ALWAYS updates ALL ROWS in main_table!!!! It is most definitely wrong.


Just update the join, skip the "fancy" logic, do it the right way. Just make sure to have sized your rollback large enough to do this operation.

Rating

  (7 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

update 8 million column

an, May 13, 2002 - 9:25 am UTC

Thank you very much for your quick response. Maybe my logic mislead.
Here is my problem, there is a table client that has about 8 millions records stored already, now we need to add one column called phone to the table, and load the phone to each row and match each client. What is the best way to do it?

Thank you very much,
an


Tom Kyte
May 13, 2002 - 10:07 am UTC

You were not misleading in the question. My answer is, was and will always be:

To update 8 million rows and set an UNINDEXED column to some value -- I would
most certainly do it in a single update.


update ( select a.phone new_phone, b.phone set_to_phone
from main_table a, test_table b
where a.personid = b.person_id )
set new_phone = set_to_phone;



for this one.

Your logic in your approach is fundementally flawed and doesn't make sense. Just do the update of 8 million rows in a single statement.

Only one thing to add...

Connor, May 13, 2002 - 11:53 am UTC

For the original poster...

A similar experience (an update to a massive table) led our team to a slightly different variant on the same solution. We used rowid ranges (as derived by probing the primary keys) so that we could parallelise the update even though the table was not partitioned.

But the underlying philosophy Tom suggests is the same - use simple SQL to do it.

how about indexed?

Doug, May 13, 2002 - 5:02 pm UTC

You've emphasized UNINDEXED column. How about an indexed one?

Tom Kyte
May 13, 2002 - 7:00 pm UTC

drop the index, do the bulk update, build the index in parallel with nologging.

thanks

an, May 13, 2002 - 9:34 pm UTC

Hi Tom:

Thank you millions! I got your point later today, I never thought it was as simple as you pointed. I dived into very hard way at beginning.
Thanks again.


Good SQL ---but will it work if Table A has no PK ?

Ajeet, September 15, 2003 - 10:13 pm UTC

This is a good solution but seems it will not work if table
Main_Table (in the sql below) has no Primary key..

Is there a work around of this...

update ( select a.phone new_phone, b.phone set_to_phone
from main_table a, test_table b
where a.personid = b.person_id )
set new_phone = set_to_phone;

Sometime back I tried and I get this error..i am missing something basic...can anyone help me here.

Thanks


Tom Kyte
September 16, 2003 - 7:50 am UTC

if there isn't a primary/unique key, you are back to a correlated subquery AND there had better be uniqueness else the query will fail in the end no matter what (as the results of the update would be ambigous)

my suggstion - add the unique constraint and update the join.

short of that

update main_table a
set phone = (select phone from test_table b where b.person_id = a.person_id)
where exists (select phone from test_table b where b.person_id = a.person_id);

but remember, if person_id is not unique in B -- that'll not work, we'd have no idea WHICH row to use (so the unique constraint needs to be there anyhow)

Error was this

A reader, September 15, 2003 - 10:14 pm UTC

ORA-01779: cannot modify a column which maps to a non key-preserved table...
sorry about this..forgot to put in my earlier post.

Thanks
Ajeet

Tom's update is most useful ... the rest? ... hmm, scary.

Gabriel, September 16, 2003 - 11:18 am UTC

To Ajeet,

Come on man ... don't just stop and ask on first error message ... just a bit of investigation would have shown:

/*
** For the “special” update method …
*/
update ( select a.phone new_phone, b.phone set_to_phone
from main_table a, test_table b
where a.person_id = b.person_id )
set new_phone = set_to_phone;

1. You get ORA-01779 when test_table doesn't have the PK on person_id (hey, this is what the message for ORA-01779 is actually saying!)
2. If test_table has the PK on person_id, the update will work whether the PK on main_table.person_id exists or not ... or data is unique by person_id!
3. Any method of enforcing uniqueness on test_table (person_id) will do … in other words, an unique index would have been enough. Personally, I like to declare and name things … hence I would have the unique or primary key constraint declared (whichever is appropriate) … the constraint would get me the unique index as well.

/*
** for the “classic” update method …
*/
update main_table a
set phone = (select phone from test_table b where b.person_id = a.person_id)
where exists (select phone from test_table b where b.person_id = a.person_id);

This will work in all cases (constraints defined in both tables, constraints defined in one only, constraints not defined in both) … except when data in test_table is not unique by person_id. In this case one would get this message:

“ORA-01427: single-row subquery returns more than one row”

Note: I make distinction between having a unique constraint on a table (which also guarantees uniqueness of data) and data being unique. Of course, when the constraint is missing, data may be unique now (hence the update will work) … but not tomorrow (when the update will fail). Anyone heard about the brilliant application, which passed all QA tests (‘cause they had “nice”, “it-will-always-be-like-this-take-my-GUI-APP-architect-word-for-it” data) … but failed on first production day? Love those … they keep me employed.

Gabriel


More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library