Outdated
Wat Hughes, May 16, 2018 - 8:11 pm UTC
dbms_lob.getlength works for me on CLOB values.
May 18, 2018 - 2:16 am UTC
Can you elucidate on what you mean by "works"?
getlength always "works" but the question was about getting a byte length rather than character length.
A reader, July 04, 2018 - 9:50 pm UTC
Exactly right
P Jordan, November 04, 2019 - 4:55 pm UTC
DBMS_LOB.GETLENGTH returns character length, not bytes.
cloblengthb is nice function, but wrong
Pavol Babel, December 15, 2021 - 8:23 am UTC
Oracle store CLOB data type in UCS-2, rather than UTF-8 even in case when charatcer set is AL32UTF8. That means fixed 2-byte representation of every character.
So cloblenthb is obviously wrong, at least it does not tell anything of your CLOB size in bytes.
December 16, 2021 - 6:54 pm UTC
I reached out to Sergiusz Wolicki, our NLS/character set expert about this:
They are stored in UTF-16 but only if the database character set is multibyte. In single byte databases, CLOBs are stored in the database character set, so each character is one byte. As the national character set is always multibyte, NCLOBs are always in UTF-16.
UTF-16 means that the supplementary characters in an AL32UTF8 database (code points > U+0FFFF) are stored as surrogate pairs. Each AL32UTF8 character is then mapped to two UTF-16 codes. So, each original character may occupy either one or two 2-byte codes, kind of 2 or 4 bytes, depending on whether its Unicode code point is below or above U+0FFFF.
To byte or not to byte
Name, December 17, 2021 - 2:54 pm UTC
So cloblenthb is obviously wrong, at least it does not tell anything of your CLOB size in bytes.
This is discussion statement.
If you need the length in bytes for external exchange (for example http content-length), you need to calculate the bytes in charset of that exchange protocol.
If you want to roughly estimate the space occupied by CLOB, for fixed-width database use the dbms_lob.getlength(). For variable-width database use the dbms_lob.getlength()*2. Do not use length(CLOB) in database AL32UTF8, since it counts supplementary characters as 1 char.
The support note about CLOBs states that UCS2 for CLOB was used for 9x database. Since 10g is used the endian independent AL16UTF16 charset. Globalization support guide up to 12c mentioned UCS2.
January 11, 2022 - 3:54 am UTC
Agreed.
Unless your clobs are littered with emojis etc :-), then getlength*2 is going to be close enough to get a good estimate
AdamC, February 28, 2024 - 8:56 pm UTC
create or replace function cloblengthb(in_clob in clob ) return number
as
clob_lengthc NUMBER :=0;
clob_search_range NUMBER :=100;
clob_start NUMBER :=0;
clob_lengthb NUMBER := 0;
BEGIN
clob_lengthc := dbms_lob.getlength(in_clob);
FOR x in 1 .. (ceil(clob_lengthc/clob_search_range))
LOOP
clob_start:=(((x-1)*clob_search_range)+1);
clob_lengthb := clob_blength + lengthb(dbms_lob.substr(in_clob, clob_search_range, clob_start));
END LOOP;
return clob_lengthb;
end cloblengthb;
/
HTH - I have used this to get an accurate answer for the size of a clob in bytes in the past.
February 29, 2024 - 3:04 am UTC
thanks for the code!
Small typo in provided code
Sasa Petkovic, February 29, 2024 - 7:39 am UTC
Hi,
Just noticed typo in last provided code.
BEGIN
clob_lengthc := dbms_lob.getlength(in_clob);
FOR x in 1 .. (ceil(clob_lengthc/clob_search_range))
LOOP
clob_start:=(((x-1)*clob_search_range)+1);
clob_lengthb := clob_blength + lengthb(dbms_lob.substr(in_clob, clob_search_range, clob_start));
END LOOP;
return clob_lengthb;
i guess it should be ?
clob_lengthb := clob_blengthb + ...
Best
Sasa
Then I did typo :)
Sasa Petkovic, February 29, 2024 - 7:41 am UTC
should be
clob_lengthb := clob_lengthb + ...
March 01, 2024 - 1:42 am UTC
:-)