Skip to Main Content
  • Questions
  • Why is CLOB Segment Size Twice That of the Actual Fize Sizes

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Beth.

Asked: January 14, 2020 - 6:10 am UTC

Last updated: January 20, 2020 - 3:02 am UTC

Version: Oracle 19c

Viewed 1000+ times

You Asked

I created a table with a SecureFiles CLOB in Oracle 19c. I have a program that populates the CLOB from text files on the file system. After executing the populate routine, I looked at the size of the CLOB segment and it is 2560 Mb. The sum of the file sizes themselves is approximately half that size.

So it appears that the segment size is double the file sizes. I altered the RETENTION parameter on the CREATE TABLE command to use RETENTION MIN 60, to see if this decreases the amount of space used by the CLOB. This had no effect. The segment size was still 2560 Mb.

My question is: I know that the Oracle documentation states that the UNDO for CLOBs is stored in the CLOB segment itself. But why is it double the size of the files? And is there anything I can do to decrease the amount of storage? Is there a way to determine how much of the CLOB segment space is the actual CLOB value versus the UNDO space?

and Connor said...

It is not related to your retention, but your characterset.

If you have a variable width characterset (UTF-8 etc) for you database, then we will store clobs in a UCS-2 (UTF-16) format in all cases.

You can see this outside the database as well:

Here I have a standard text file:

C:\>dir c:\temp\file1.txt
 Volume in drive C is OS
 Volume Serial Number is 66D0-4AB5

 Directory of c:\temp

15/01/2020  03:00 PM         1,000,000 file1.txt


I open that file and choose to save it as UTF-16 format, and the resulting file is now:

C:\>dir c:\temp\file1*
 Volume in drive C is OS
 Volume Serial Number is 66D0-4AB5

 Directory of c:\temp

15/01/2020  03:00 PM         1,000,000 file1.txt
15/01/2020  03:09 PM         2,000,000 file1u.txt



Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Options ??

J. Laurindo Chiappa, January 16, 2020 - 1:24 pm UTC

yes, probably it can be the root cause... The options/alternatives, afaik, could be :

1. use BLOB instead of CLOB : BLOB is a stream of bytes, WITHOUT character set information

or

2. change NATIONAL CHARACTER SET and use NCLOB instead of CLOB

Right ?
Connor McDonald
January 20, 2020 - 3:02 am UTC

If your database is UTF based, then large objects are stored in UCS-2.

More to Explore

Administration

Need more information on Administration? Check out the Administrators guide for the Oracle Database