Altering table columns to a smaller size

Question and Answer

Tom Kyte

Thanks for the question.

Asked: February 01, 2003 - 6:51 pm UTC

Last updated: February 01, 2003 - 6:51 pm UTC

Version: 8.1.7

Viewed 1000+ times

You Asked

Hi Tom,

We have a big customer table which is referenced by many other tables. The table's primary key is NUMBER(12). However the foreign key columns of the some of the tables which is referencing this customer table were defined as NUMBER. (Seen as NUMBER(38) in data dictionary). This datatype difference causes a data conversion during the checking of the foreign keys and thus a performance loss. In order to prevent it, we tried to alter the NUMBER columns to the same type of our customer table's primary key - NUMBER(12). Since Oracle doesn't allow us to do this unless the column values are NULL, we had to insert them with their rowids to a temporary table, setting them to NULL, altering the column to NUMBER(12), and populating the original values from the temporary table again. This is a very slow process. So that we can't even see its end after 8 hours of working and we had to stop it. My question is, could you please offer us a faster way of doing this operation?

(The total size of the tables we have to alter is about 220 million rows; we are running all DML in parallel and nologging mode, and we are creating the temporary table with CREATE TABLE ... AS SELECT.)
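
(For illustration, for each child table the process looks roughly like this -- the table and column names here are just placeholders:)

-- save the current values keyed by rowid
create table tmp_cust_fk as
    select rowid rid, cust_id from child_table;

-- precision can only be lowered while the column is empty
update child_table set cust_id = null;
alter table child_table modify ( cust_id number(12) );

-- restore the original values by rowid
update child_table c
   set c.cust_id = ( select t.cust_id
                       from tmp_cust_fk t
                      where t.rid = c.rowid );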

Thanks for your help...

and Tom said...

really -- do you have hard numbers to quantify this "performance difference"?

Funny thing is that all numbers are -- well -- numbers in Oracle.

A number(38) is not stored any differently than a number(12).

A number(38) is not "converted" to a number(12) for comparison. A number(12) is not "converted" to a number(38) for comparison.

It is not like comparing a C int to a C long or a float to a double...
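
If you want to see that for yourself, a quick check with DUMP (using a throwaway table, separate from the timed test below) shows the stored bytes are identical regardless of the declared precision:

create table t ( a number(12), b number(38), c number );
insert into t values ( 123456789012, 123456789012, 123456789012 );

select dump(a) dump_a, dump(b) dump_b, dump(c) dump_c from t;
-- all three dumps come back as the same Typ=2 byte string --
-- the precision is a rule about what you may store, not a storage format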

Consider this test:

    p1(number(12)) -> c1(number)
    p2(number(12)) -> c2(number(12))

we'll compare:

o bulk insert
o singleton inserts
o bulk delete from parent (with on delete cascade)
o singleton delete from parent (with on delete cascade)

The script is:

create table p1 ( x number(12) primary key);
create table c1 ( x number references p1 on delete cascade);
create index c1_idx on c1(x);

create table p2 ( x number(12) primary key);
create table c2 ( x number(12) references p2 on delete cascade );
create index c2_idx on c2(x);


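-- load both parents with rownum keys from all_objects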
insert into p1 select rownum from all_objects;
insert into p2 select rownum from all_objects;
commit;

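-- trace on from here; tkprof will summarize each statement below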
alter session set sql_trace=true;
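-- bulk inserts into the children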
insert into c1 select * from p1;
insert into c2 select * from p2;

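-- singleton inserts: 5 child rows per parent row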
begin
    for x in ( select * from p1 )
    loop
        for i in 1 .. 5
        loop
            insert into c1 values ( x.x );
        end loop;
        commit;
    end loop;
end;
/
begin
    for x in ( select * from p2 )
    loop
        for i in 1 .. 5
        loop
            insert into c2 values ( x.x );
        end loop;
        commit;
    end loop;
end;
/
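-- bulk delete from parent p1 (cascades to c1)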
delete from p1 where mod(x,2) = 0;
commit;

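-- singleton deletes from p1 (each cascades to c1)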
begin
    for x in ( select * from p1 )
    loop
        delete from p1 where x = x.x;
    end loop;
end;
/
commit;



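-- bulk delete directly against child c2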
delete from c2 where x < 12000;
commit;

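-- bulk delete from parent p2 (cascades to c2)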
delete from p2 where mod(x,2) = 0;
commit;

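-- singleton deletes from p2 (each cascades to c2)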
begin
    for x in ( select * from p2 )
    loop
        delete from p2 where x = x.x;
    end loop;
end;
/
commit;


And tkprof says.....



insert into c1 select * from p1

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.01       0.01          0          0          0          0
Execute      1     1.64       1.74          0        242      49429      23939
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        2     1.65       1.75          0        242      49429      23939

insert into c2 select * from p2


call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     1.72       1.78          0        239      49427      23939
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        2     1.72       1.78          0        239      49427      23939


well, for the bulk insert, C1 and C2 are for all intents and purposes the same! A difference of 0.07 cpu seconds is not meaningful here (run it a couple of times and c2 would best c1 sometimes, c1 best c2 the others). Not only that -- but C1 is the NUMBER type and C2 is the NUMBER(12)...

Now, onto the single row inserts:



INSERT INTO C1 VALUES ( :b1 )

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute 119695    36.61      38.81          0       1205     657138     119695
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total   119696    36.61      38.81          0       1205     657138     119695

********************************************************************************
INSERT INTO C2 VALUES ( :b1 )


call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute 119695    37.66      38.90          0       1210     657146     119695
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total   119696    37.66      38.90          0       1210     657146     119695


Once again, no measurable difference (although C1 again bests C2 -- maybe there is in fact a slight edge to being unconstrained??)

Looking at the deletes, bulk delete first:



delete from p1 where mod(x,2) = 0

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     3.29       4.41         61         40      66705      11969
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        2     3.29       4.41         61         40      66705      11969


delete from p2 where mod(x,2) = 0

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     3.55       5.04        276         40      66699      11969
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        2     3.55       5.04        276         40      66699      11969

well, no smoking gun there... looking at single row deletes:

DELETE FROM P1 WHERE X = :b1


call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute  11970     8.24       8.08          0      23956     818445      11970
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    11971     8.24       8.08          0      23956     818445      11970

********************************************************************************
DELETE FROM P2 WHERE X = :b1

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.01       0.00          0          0          0          0
Execute  11970     8.09       8.65          0      23956     818471      11970
Fetch        0     0.00       0.00          0          0          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    11971     8.10       8.65          0      23956     818471      11970

Finally -- C2 bested C1 -- but not by much, and it is totally arbitrary.


So, my suggestions:

a) benchmark these things, get hard and fast numbers before you go about doing long painful changes

b) stop doing your conversion -- it is quite simply "not worth it"






