Skip to Main Content
  • Questions
  • Problems with dbms_lob.writeappend and multibyte characters

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Dieter.

Asked: December 12, 2022 - 8:45 am UTC

Last updated: December 13, 2022 - 3:01 am UTC

Version: 19.3

Viewed 10K+ times! This question is

You Asked

I'm having problems with the dbms_lob.writeappend procedure when using multibyte characters like the smiley in the attached LiveSQL Link.
It seems as if the amount parameter of the API would now work as expected when using the LENGTH or LENGTHB of a VARCHAR2 argument but work as expected when using dbms_lob.getlength of the VARCHAR2 parameter.
What does the function actually expect as amount and what is the proper way to use it?

FUNCTION temp_test_dbms_lob_error(text IN VARCHAR2) RETURN VARCHAR2 IS
 c CLOB;
 t VARCHAR2(32767);
BEGIN
 dbms_lob.createtemporary(lob_loc=>c, cache=>TRUE);
 dbms_lob.writeappend(lob_loc=>c, amount=>LENGTH(text), buffer=>text);
 t := dbms_lob.substr(lob_loc=>c);
 dbms_lob.freetemporary(lob_loc=>c);
 RETURN t;
END;



with LiveSQL Test Case:

and Connor said...

As per the docs: https://docs.oracle.com/en/database/oracle/oracle-database/21/arpls/DBMS_LOB.html

"Number of bytes (for BLOBs) or characters (for CLOBs) to write"

but of course, the key thing there is what do we mean by 'character'? Well that depends on the characterset of the database. Assuming the default (AL32UTF8) or any multibyte charcterset, then CLOBs are stored in UCS2 format, ie, 2 bytes per characters, thus for such a database, you'd be wanting to use length2

This means the 4 bytes for the smiley equates to 2 characters.

SQL> with t as (
  2    select UTL_RAW.CAST_TO_VARCHAR2('F09F988A') smiley from dual)
  3  select 
  4    dbms_lob.getlength(smiley) ,
  5    length(smiley),
  6    lengthc(smiley),
  7    lengthb(smiley),
  8    length2(smiley),
  9    length4(smiley)
 10  from t

DBMS_LOB.GETLENGTH(SMILEY) LENGTH(SMILEY) LENGTHC(SMILEY) LENGTHB(SMILEY) LENGTH2(SMILEY) LENGTH4(SMILEY)
-------------------------- -------------- --------------- --------------- --------------- ---------------
                         2              1               1               4               2               1


Its why I generally go with dbms_lob.getlength

Great test case btw - thanks!

Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Globalization Support Guide/Unicode

A reader, December 13, 2022 - 7:35 am UTC

In the docs 9i-11g charset of CLOB is UCS2, since 12c AL32UTF8. The note 257772.1 says AL16UTF16 is used since 10g. The main difference between UCS2 and AL16UTF16 is depending on the platform endian LE/BE.
Docs 21c:
Data in CLOB columns is stored in the AL16UTF16 character set when the database character set is multibyte, such as UTF8 or AL32UTF8.

dbms_lob.getlength() has the clob argument type, that is, it entails an additional implicit conversion. It's better to use append() instead of writeappend(..getlength()). And it's even easier to use concatenation ||.

The description of DBMS_LOB should include a warning about the specific of complementary characters and what are characters for POSITION/AMOUNT arguments. SQL Reference should indicate that substr, length, etc. are unsafe when used with CLOB. Moreover, the result may depend on plsql_optimization_level.

declare
  procedure p(lbl varchar2, cl clob) is
    v varchar2(32767 byte);
  begin
    v := substr(cl,1,4);
    dbms_output.put_line(lbl||'len='||length(v)||' lenb='||lengthb(v)||' hex='||utl_raw.cast_to_raw(v)||' str='||v);
  exception when others then
    dbms_output.put_line(lbl||sqlerrm);
  end;
begin
  pragma inline(p, 'yes'); -- default YES with plsql_optimize_level=3
  p('error  : ', to_clob('xyz'||chr(4036991128)));
  pragma inline(p, 'no'); -- default  NO  with plsql_optimize_level<3
  p('3 of 4 : ', to_clob('xyz'||chr(4036991128)));
end;
/

error  : ORA-22831: Offset or offset+amount does not land on character boundary
3 of 4 : len=3 lenb=3 hex=78797A str=xyz

Documentation

Dieter, December 13, 2022 - 9:56 am UTC

Thank you for the feedback and a few more thoughts on this:

1) Would you not agree that the documentation of dbms_lob.writeappend should explain in detail that this api can only be used when using dbms_lob.getlength to determine the length of the varchar2 argument?

2) I just also noticed that thinks are even more confusing when taking into consideration that character encoding of the session. When for eample settings NLS_LANG to a character encoding like we8mswin1252 the LENGHT function works as expected. I do understand why this is the case, but makes using this api correctly even harder.

3) I'm still puzzled on why dbms_lob.getlength and LENGTHC are not returning the same number of characters and would kindly ask you to to elaborate a bit more on this?

singlebit characterset

A reader, December 13, 2022 - 2:23 pm UTC

> I just also noticed that thinks are even more confusing when taking into consideration that character encoding of the session.

CLOB are hadled in database charset, not session charset. If your database is AL32UTF8 and you have set the client encoding to single byte, you cannot pass supplemetary characters from the client, but you can use already existing data and expressions such as chr(4036991128).

> I'm still puzzled on why dbms_lob.getlength and LENGTHC are not returning the same number of characters and would kindly ask you to to elaborate a bit more on this?

lengthC uses a complete unicode characters. Why you think it should be same as AL16UTF16/UCS2? Length2 uses UCS2 which same as AL16UTF16 in terms of length. Also, if ncharset is AL16UTF16, length(nvarchar2) returns same as dbms_lob.getlength(clob).

with t(c, n) as (select chr(4036994985), cast(chr(4036994985) as nvarchar2(4)) from dual)
select c, rawtohex(c) cr, length(c) cl, length2(c) cl2, lengthc(c) clc,
       n, rawtohex(n) nr, length(n) nl, length2(n) nl2, lengthc(n) nlc
  4  from t;

C    CR        CL CL2 CLC N    NR        NL NL2 NLC
---- -------- --- --- --- ---- -------- --- --- ---
🧩  F09FA7A9   1   2   1  🧩  D83EDDE9   2   2   1


More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library