update 8 million column
an, May 13, 2002 - 9:25 am UTC
Thank you very much for your quick response. Maybe my logic mislead.
Here is my problem, there is a table client that has about 8 millions records stored already, now we need to add one column called phone to the table, and load the phone to each row and match each client. What is the best way to do it?
Thank you very much,
an
May 13, 2002 - 10:07 am UTC
You were not misleading in the question. My answer is, was and will always be:
To update 8 million rows and set an UNINDEXED column to some value -- I would
most certainly do it in a single update.
update ( select a.phone new_phone, b.phone set_to_phone
from main_table a, test_table b
where a.personid = b.person_id )
set new_phone = set_to_phone;
for this one.
Your logic in your approach is fundementally flawed and doesn't make sense. Just do the update of 8 million rows in a single statement.
Only one thing to add...
Connor, May 13, 2002 - 11:53 am UTC
For the original poster...
A similar experience (an update to a massive table) led our team to a slightly different variant on the same solution. We used rowid ranges (as derived by probing the primary keys) so that we could parallelise the update even though the table was not partitioned.
But the underlying philosophy Tom suggests is the same - use simple SQL to do it.
how about indexed?
Doug, May 13, 2002 - 5:02 pm UTC
You've emphasized UNINDEXED column. How about an indexed one?
May 13, 2002 - 7:00 pm UTC
drop the index, do the bulk update, build the index in parallel with nologging.
thanks
an, May 13, 2002 - 9:34 pm UTC
Hi Tom:
Thank you millions! I got your point later today, I never thought it was as simple as you pointed. I dived into very hard way at beginning.
Thanks again.
Good SQL ---but will it work if Table A has no PK ?
Ajeet, September 15, 2003 - 10:13 pm UTC
This is a good solution but seems it will not work if table
Main_Table (in the sql below) has no Primary key..
Is there a work around of this...
update ( select a.phone new_phone, b.phone set_to_phone
from main_table a, test_table b
where a.personid = b.person_id )
set new_phone = set_to_phone;
Sometime back I tried and I get this error..i am missing something basic...can anyone help me here.
Thanks
September 16, 2003 - 7:50 am UTC
if there isn't a primary/unique key, you are back to a correlated subquery AND there had better be uniqueness else the query will fail in the end no matter what (as the results of the update would be ambigous)
my suggstion - add the unique constraint and update the join.
short of that
update main_table a
set phone = (select phone from test_table b where b.person_id = a.person_id)
where exists (select phone from test_table b where b.person_id = a.person_id);
but remember, if person_id is not unique in B -- that'll not work, we'd have no idea WHICH row to use (so the unique constraint needs to be there anyhow)
Error was this
A reader, September 15, 2003 - 10:14 pm UTC
ORA-01779: cannot modify a column which maps to a non key-preserved table...
sorry about this..forgot to put in my earlier post.
Thanks
Ajeet
Tom's update is most useful ... the rest? ... hmm, scary.
Gabriel, September 16, 2003 - 11:18 am UTC
To Ajeet,
Come on man ... don't just stop and ask on first error message ... just a bit of investigation would have shown:
/*
** For the special update method
*/
update ( select a.phone new_phone, b.phone set_to_phone
from main_table a, test_table b
where a.person_id = b.person_id )
set new_phone = set_to_phone;
1. You get ORA-01779 when test_table doesn't have the PK on person_id (hey, this is what the message for ORA-01779 is actually saying!)
2. If test_table has the PK on person_id, the update will work whether the PK on main_table.person_id exists or not ... or data is unique by person_id!
3. Any method of enforcing uniqueness on test_table (person_id) will do
in other words, an unique index would have been enough. Personally, I like to declare and name things
hence I would have the unique or primary key constraint declared (whichever is appropriate)
the constraint would get me the unique index as well.
/*
** for the classic update method
*/
update main_table a
set phone = (select phone from test_table b where b.person_id = a.person_id)
where exists (select phone from test_table b where b.person_id = a.person_id);
This will work in all cases (constraints defined in both tables, constraints defined in one only, constraints not defined in both)
except when data in test_table is not unique by person_id. In this case one would get this message:
ORA-01427: single-row subquery returns more than one row
Note: I make distinction between having a unique constraint on a table (which also guarantees uniqueness of data) and data being unique. Of course, when the constraint is missing, data may be unique now (hence the update will work)
but not tomorrow (when the update will fail). Anyone heard about the brilliant application, which passed all QA tests (cause they had nice, it-will-always-be-like-this-take-my-GUI-APP-architect-word-for-it data)
but failed on first production day? Love those
they keep me employed.
Gabriel