You Asked

I'm performing a character set conversion from US7ASCII to AL32UTF8 using expdp/impdp.

I've found that that the segment sizes of tables without CLOBS remain about the same, but segments underlying CLOB columns
roughly double in size.

Is this expected behaviour?

and Connor said...

Yes this is expected.

Under a single byte characterset (eg US7ASCII) then everything (as the name suggests) is stored as a single byte.

When you move to AL32UTF8 (which btw is an excellent thing!) then

a) characters in a VARCHAR2 are stored with as many bytes "as needed". Thus all of your existing single bytes from US7ASCII are going to still be single bytes. As you store more "exotic" characters, they'll use 1,2,3 or even 4 bytes.

b) CLOBs are different. They move from single byte to a format known as UCS2, your existing single byte characters will now all be stored as 2 bytes.

More details on UCS here https://en.wikipedia.org/wiki/Universal_Coded_Character_Set

Is this answer out of date? If it is, please let us know via a Comment

More to Explore

Administration

Need more information on Administration? Check out the Administrators guide for the Oracle Database

Breadcrumb

Question and Answer

You Asked

and Connor said...

More to Explore

Administration

Increase in size of CLOB objects during character set conversion.

Breadcrumb

Question and Answer

You Asked

and Connor said...

More to Explore

Administration