Hello, Ask TOM Team.
Scenario:
1. We have to deploy a feature in one app. The change consists of avoiding duplicates (a combination of three database columns).
2. On the database side I have to add one column to a table (200+ million rows) and then update it so the app can validate the rows and avoid duplicated rows. At the end of the database maintenance I have to create an index to prevent duplicates on the database side.
So, here are the database steps (process 1):
1. Move duplicate rows to other tables (insert into) in another schema (6 or 7 tables with the same structure). -- 2 hrs
2. Delete those rows in the original schema. -- 2 hrs
3. Update the column in one table (almost all rows in the table). --40 mins
4. Create a unique index. -- 12 mins
All of these tasks can be done online because rows can still be inserted. The problem is that while the rows are being deleted (step 2), users can still insert duplicate data into the related tables, and the delete script may fail at some point because of an FK constraint.
Process 2
I have the option to run the update first so the app has the updated data to avoid duplicates, even though the database index will not exist yet. At that point duplicate rows will no longer be inserted into the database, and I can run the rest of the steps without issue. The problem is that this update takes 40 minutes to complete. Alternatively, I could rename the old table, create a new table with the original name, insert the data (avoiding the update), and rename the new table back to the original name. This is much faster, but it is not an online process.
Questions:
1. What can I do to avoid the FK constraint error? (Process 1)
2. How can I do Process 2 online?
Thanks.
Regards,
It's unclear to me what the new column is for. Is this necessary to prevent duplicates? Or a flag the app uses to know there are no duplicates?
So here are some general thoughts.
To stop new duplicates appearing between copying and deleting the current duplicates, you can create an unvalidated unique constraint. Note this requires a non-unique index.
For example:
create table t ( c1 int );
insert into t values ( 1 );
insert into t values ( 1 );
commit;
create index i on t ( c1 );
alter table t add constraint u unique ( c1 )
using index i
novalidate;
insert into t values ( 1 );
-- ORA-00001: unique constraint (CHRIS.U) violated on table CHRIS.T columns (C1)
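Once you've moved and deleted the existing duplicates, you can validate the constraint so the database guarantees no duplicates remain at all. Continuing the example above:

```sql
-- After the duplicate rows in T have been removed,
-- promote the constraint from NOVALIDATE to VALIDATE.
alter table t modify constraint u validate;
```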
2 hours for the inserts to complete feels like a long time - even for 200M+ rows. If this is taking 2 hours, I would expect there to be ways you can speed this process up.
What can I do to avoid the FK constraint error? (Process 1)

Add a WHERE clause to the delete to check that child rows do NOT EXIST!
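Something along these lines, for example. The table and column names here are hypothetical: p is the parent table, c is a child table referencing it, and dup_rows holds the keys of the duplicates you identified in step 1:

```sql
-- Only delete duplicates that have no child rows;
-- parents with children are skipped instead of raising ORA-02292.
delete from p
where  p.id in ( select id from dup_rows )
and    not exists (
         select null
         from   c
         where  c.p_id = p.id
       );
```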
If this is complicated you can use the LOG ERRORS clause to capture the failed rows. This allows the delete to complete and remove rows without children.
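A sketch of the LOG ERRORS approach, using the same hypothetical parent table p and staging table dup_rows; DBMS_ERRLOG creates the ERR$_P logging table:

```sql
-- Create the error log table once (defaults to ERR$_P).
exec dbms_errlog.create_error_log ( dml_table_name => 'P' )

-- Rows that hit an FK violation land in ERR$_P
-- instead of failing the whole statement.
delete from p
where  p.id in ( select id from dup_rows )
log errors into err$_p ( 'dedup run' )
reject limit unlimited;
```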
However, this does bring the bigger question of how you plan on handling the child rows after the process completes.
How can I do Process 2 online?

DBMS_REDEFINITION has an EXECUTE_UPDATE procedure which enables you to do online updates.
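A sketch of what this looks like; the table, column, and expression are placeholders for your own, and this assumes Oracle Database 12.2 or later, where EXECUTE_UPDATE was added:

```sql
begin
  dbms_redefinition.execute_update (
    'update big_table
     set    dedup_col = <your expression>
     where  dedup_col is null'
  );
end;
/
```

Under the covers this runs the update in batches against a redefinition copy of the table, so readers and writers are not blocked for the full 40 minutes.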