Skip to Main Content
  • Questions
  • Nvarchar to Varchar2 conversion (UTF8 to AL32UTF8)

Breadcrumb

Question and Answer

Sergiusz Wolicki

Thanks for the question, Enrico Raphael.

Asked: November 12, 2018 - 6:41 am UTC

Last updated: November 15, 2018 - 10:18 pm UTC

Version: 11g

Viewed 10K+ times! This question is

You Asked

We are planning to convert all NVarchar fields to Varchar2 fields as we're going to change our character set and since Oracle recommends AL32UTF8 character set encoding. My question is it 100% sure that all characters from Nvarchar (UTF8) can be converted to Varchar (AL32UTF8) using below.

Update Table1
set Varchar2_field = to_char(Nvarchar2_field);

OR

Update Table
set Varchar2_field = convert(NVARCHAR2_field,'AL32UTF8','UTF8');


We are talking about million records needs to be updated. And before we proceed with our plan we just want to make sure that everything will be converted right. If it will be converted, we would like to ask also if how can we validate it correctly that all data has been converted correctly. Thank you!

and we said...

Generally, you should be able to convert from NVARCHAR2 in UTF8 to VARCHAR2 in AL32UTF8 just by writing:

Update Table1 set Varchar2_field = Nvarchar2_field


TO_CHAR will be added implicitly. You can add it explicitly, of course. However, do not use CONVERT.

All valid UTF8 character values should be converted without problems. (This will be a bit pattern conversion, so Unicode version mismatch between AL32UTF8 and UTF8 does not matter.) However, if you have any illegal byte sequences in the NVARCHAR2 columns, they will be lost. For example, if an NVARCHAR2 column incorrectly contains binary data (e.g. encryption result from the deprecated DBMS_OBFUSCATION_TOOLKIT), this data will be lost in conversion.

To verify validity of your existing data, you can try the CSSCAN tool ( https://docs.oracle.com/cd/E11882_01/server.112/e10729/ch12scanner.htm#NLSPG485 ). It is obsolete in 12.1 but still available in 11g. Use parameters FROMNCHAR=UTF8 TONCHAR=AL16UTF16. All data should be reported as convertible. No lossy data should be reported.

In a database >= 12.1 and in 11g as well you can test the conversion on a clone copy of your database. Before executing the UPDATEs, issue the following statement:

alter session set nls_nchar_conv_excp=true


This will cause the UPDATE statements to fail with "ORA-12713: Character data loss in NCHAR/CHAR conversion", if invalid data is encountered. Without the parameter change, conversion will go through but depending on a particular invalid byte sequence either replacement characters (U+FFFD) will be used for the sequence or the sequence will be skipped. The UPDATE's error_logging_clause ( https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_10008.htm#BCEFBFCD ) should come handy in identifying the offending data but you need to create the error log table manually with mandatory description/control columns only. Do not include original table columns to avoid a recursive ORA-12713 from the error log code.


Is this answer out of date? If it is, please let us know via a Comment

More to Explore

Administration

Need more information on Administration? Check out the Administrators guide for the Oracle Database