Some details
Maxim, April 26, 2021 - 5:58 am UTC
Hello! Thank you very much for accepting our question! I'll try to add some details to make the problem easier to understand. There is a table T1 in DB1:
create table t1
(
  id           number primary key,
  some_column1 varchar2(50),
  some_column2 varchar2(50)
);
Rows are inserted using the sequence sq_t1.nextval for the id value:
insert into t1
(
  id,
  some_column1,
  some_column2
)
values
(
  sq_t1.nextval,
  'r',
  's'
);
There is an after-insert trigger on that table that copies some of the rows (those with a specific value of some_column2) into a stage table in DB1. A process wakes every 5 minutes, reads the rows from the stage table in DB1, and sends them over a database link to a stage table in DB2. In DB2, a job runs every 5 minutes, checks the stage table, and invokes a procedure that inserts the data from the stage table into a table t1 (a copy of table t1 in DB1) and into a bunch of child tables.
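For context, the capture trigger described above might look roughly like this (a sketch only; the trigger name, the stage table name stage_t1, and the filter value 'X' are all hypothetical, not taken from our real system):

create or replace trigger trg_t1_capture
after insert on t1
for each row
when (new.some_column2 = 'X')  -- only rows with a specific value are staged
begin
  -- copy the inserted row, including its DB1 id, into the stage table
  insert into stage_t1 (id, some_column1, some_column2)
  values (:new.id, :new.some_column1, :new.some_column2);
end;
/

Note that the staged row carries the DB1 id, which is what the DB2 side should have been reusing.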
The problem is that rows are inserted into the parent table t1 (the copy of table t1) using its own sequence for the id column, while at the same time the child tables are populated with the id values of the rows in DB1.
There are no FK constraints on the child tables, so the process went unnoticed for years.
INSERT INTO t1 -- a table in DB2
(id,
 some_column1,
 some_column2)
VALUES
(sq_t1.nextval, -- bug: generates a new DB2 id instead of reusing the DB1 id
 data1_from_row_from_stage_table,
 data2_from_row_from_stage_table);

INSERT INTO t1_child -- a table in DB2
(id,
 id_t1,
 date_column)
VALUES
(sq_t1_child.nextval,
 id_from_row_from_stage_table, -- the DB1 id, which no longer matches t1.id in DB2
 sysdate);
Now that we have discovered this bug, we are looking for the best way to fix it. We see two options:
1) update the PK in t1 in DB2 based on the values in t1 in DB1 (but the table is big - about 200 million rows) and stop using a sequence for its id values
2) keep using the sequence and update the id_t1 columns in the child tables (the easier way, but the first is preferable for us)
We would appreciate your advice on this matter. Does option 1 seem insane?
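For reference, option 1 amounts to a correlated update along these lines (a sketch only; the database link name db1_link is hypothetical, and it assumes some combination of staged columns uniquely identifies a row - which may well not hold in practice). At ~200 million rows, a single-statement update like this would be very expensive:

-- option 1 as one big update (slow at this scale; matching columns are assumptions)
update t1 dst
set    dst.id = (select src.id
                 from   t1@db1_link src
                 where  src.some_column1 = dst.some_column1
                 and    src.some_column2 = dst.some_column2);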
May 06, 2021 - 5:09 am UTC
200 million rows is big, but not *that* big, so you should be able to correct this.
However, an update is probably a very slow way to do it - it will be quicker to recreate the tables, correcting the IDs as you go.
The process for this would be:

In DB1
- unload a copy of T1 containing just the columns you need to match rows in DB2
- load that data into DB2 as (say) T1_TEMP

In DB2
- create T1_NEW by joining T1 and T1_TEMP using a CTAS
- do the same with T1_CHILD_NEW
- apply indexes, foreign keys, etc. once those new tables are loaded
- drop the old tables and rename the _NEW tables to the correct names
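The DB2 steps above might look like this (a sketch only; t1_temp's contents, the match_col join condition, and the constraint names are assumptions - in your real tables you'd match on whatever column set uniquely identifies a row):

-- rebuild t1 with the correct DB1 ids via CTAS
-- assumes t1_temp holds (id, match_col) unloaded from DB1,
-- and that match_col uniquely identifies a row
create table t1_new as
select tmp.id,               -- correct id from DB1
       t1.some_column1,
       t1.some_column2
from   t1
join   t1_temp tmp
  on   tmp.match_col = t1.some_column1;  -- hypothetical match condition

-- t1_child already stores the DB1 id in id_t1, so it may only need
-- a plain rebuild (or none at all) once the parent ids are fixed
create table t1_child_new as
select * from t1_child;

-- then add constraints and swap the tables in
alter table t1_new add constraint t1_pk primary key (id);
drop table t1;
rename t1_new to t1;

The CTAS approach also lets you build the new tables with nologging/parallel if your environment allows it, which is a large part of why it beats a 200-million-row update.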
Our thanks :)
Maxim, May 26, 2021 - 5:11 am UTC
Thank you very much for your advice. We've tried it in our test environment, and it worked :)
May 26, 2021 - 1:08 pm UTC
Great to hear, thanks for letting us know :)