Urgent Help on converting UTF8 to UCS2

Tom Kyte

Thanks for the question, chet.

Asked: November 11, 2001 - 8:20 pm UTC

Last updated: August 11, 2004 - 9:33 am UTC

Version: 8.1.7

You Asked

I want to move data from Oracle into MS SQL Server. That is, I have a string in UTF8 encoding and need to convert it to UCS2. How do I do this in Java?
Your help is greatly appreciated.
Thanks




and Tom said...

It just happens automagically. I read this in the "National Language Support
Guide" book, chapter 6, section "JDBC Class Library".

http://docs.oracle.com/cd/A81042_01/DOC/server.816/a76966/ch6.htm#8036
=======================================

The [JDBC] library always accepts US7ASCII, UTF8 or WE8ISO8859P1 encoded
string data from the input stream of the JDBC drivers. It also accepts
UCS2 for the JDBC server-side driver. The JDBC Class Library converts
the input stream to UCS2 before passing it to the client applications.
If the input stream is in UTF8, the JDBC Class Library converts the UTF8
encoded string to UCS2 by using the bit-wise operation defined in the
UTF8-to-UCS2 conversion algorithm. If the input stream is in US7ASCII or
WE8ISO8859P1, it converts the input string to UCS2 by casting the bytes
to Java JDBC characters.

At database connection time, the JDBC Class Library sets the server
NLS_LANGUAGE and NLS_TERRITORY parameters to correspond to the locale of
the Java VM that runs the JDBC driver. This operation is performed on
the JDBC OCI and JDBC Thin drivers only, and ensures that the server and
the Java client communicate in the same language.
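As the passage says, a string fetched through JDBC already arrives as a Java String, so no manual step is normally needed. For the case where you hold raw UTF8 bytes yourself, here is a minimal sketch of the conversion in plain Java. The class and method names are made up for illustration, and treating UCS2 as little-endian UTF-16 bytes is an assumption (the two coincide only for characters in the Basic Multilingual Plane).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any Oracle or JDBC API.
class Utf8ToUcs2 {
    // Decode UTF-8 bytes into a Java String (held in memory as UTF-16 code units).
    static String decodeUtf8(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Re-encode the String as little-endian UTF-16 bytes, without a byte order mark.
    // Assumption: the target wants this byte layout for its UCS2 data.
    static byte[] toUtf16Le(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // U+65E5 encoded as three UTF-8 bytes.
        byte[] utf8 = {(byte) 0xE6, (byte) 0x97, (byte) 0xA5};
        String s = decodeUtf8(utf8);
        byte[] ucs2 = toUtf16Le(s);
        System.out.println("chars=" + s.length() + " bytes=" + ucs2.length);
    }
}
```

In most programs only the String is needed: pass it to the target driver's setString and let that driver do its own encoding.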



Comments

Utf8 and JDBC

Heriawan, September 08, 2002 - 10:09 pm UTC

Could you give a more detailed example of how Java, using Oracle JDBC, can store multilingual double-byte text (e.g. Chinese, Japanese) in an Oracle database whose character set is UTF8?

Tom Kyte
September 09, 2002 - 8:02 am UTC

Just do it -- it just does it.

An example on this ascii screen would look something like this:

@#$@!~#@$#@!

Not very interesting. The code is the code is the code, regardless of whether you have ascii text or chinese.

java.sql.SQLException: Fail to convert between UTF8 and UCS2

Vikas Khanna, September 10, 2003 - 10:00 am UTC

Hi Tom,

This is regarding a critical issue we are facing. We hope you can suggest a solution.

We are using the Oracle Thin driver to read from an Oracle 7.3 database. The database contains information in Japanese characters (UTF-8 character set).

For some of the records the application throws the following exception when calling the getString function of ResultSet:

java.sql.SQLException: Fail to convert between UTF8 and UCS2

Please suggest how to read the data that triggers this error.
Could it be caused by invalid characters in the database that the conversion cannot handle?

For that column we also used the Substr() function to fetch the value character by character, and found that the exception occurs at the 21st character; the first 20 characters caused no problem.

Please suggest what can be done. We hope to get a solution ASAP.

With Kind regards,
Vikas Khanna

Tom Kyte
September 10, 2003 - 7:40 pm UTC

sorry -- i don't really program data, definitely do not have a 7.3 database with japanese characters.

no ideas.
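The thread never established the cause of this exception. One possibility, which is only an assumption here, is that the column holds a byte sequence that is not valid UTF-8. If the raw bytes of the value can be obtained (how to get them from the driver is outside this sketch), a strict decoder from the standard java.nio.charset package can report where the first bad sequence starts. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; it does not reproduce the Oracle driver's own conversion.
class Utf8Validator {
    // Returns the byte offset where the first malformed UTF-8 sequence starts,
    // or -1 if the whole array decodes cleanly.
    static int firstInvalidOffset(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        // UTF-8 never yields more chars than input bytes, so this buffer is large enough.
        CharBuffer out = CharBuffer.allocate(data.length + 1);
        // endOfInput = true so that a truncated trailing sequence is also reported.
        CoderResult result = decoder.decode(in, out, true);
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        byte[] suspect = {'a', 'b', (byte) 0xFF, 'c'};
        System.out.println("first invalid offset: " + firstInvalidOffset(suspect));
    }
}
```

Note that the result is a byte offset, not a character position, so it will not line up directly with a Substr() position when earlier characters are multi-byte.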

Conversion of data from UTF8 Character set to WE8ISO8859P1

Thara Harish, August 10, 2004 - 2:34 pm UTC

Hi Tom,

The data in the database is stored in the UTF8 character set, and I want to convert it to the WE8ISO8859P1 character set.
I need a routine, or material that would help me develop one, to convert UTF8 character set data to WE8ISO8859P1.

For example: Míguel Bógan Tést Colonia Peñuelas should be converted to the WE8ISO8859P1 character set or its ASCII equivalent.

Please help me ASAP.

Thanks,
Thara

Tom Kyte
August 10, 2004 - 3:49 pm UTC

set the character set part of your NLS_LANG to WE8ISO8859P1 and simply "select it" and it'll be converted.

clients specify their characterset, we convert to it.

if you want this to be a permanent conversion -- export (with the nls_lang set) and import it into a weiso database.

beware -- it is a "lossy" conversion, meaning that you will definitely have "characters changed" on you as utf8 has more characters than weiso... can represent.
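The lossy nature of that conversion can be illustrated outside the database with a small Java sketch. It uses Java's ISO-8859-1 charset as a stand-in for WE8ISO8859P1 (Oracle's name for ISO 8859-1); the class and method names are hypothetical, and the substitution character Oracle itself uses may differ from Java's '?'.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing which strings survive a conversion to ISO-8859-1.
class LossyCheck {
    // True if every character of s has a representation in ISO-8859-1.
    static boolean fitsLatin1(String s) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(s);
    }

    // Encode to ISO-8859-1 and decode back. String.getBytes replaces each
    // unmappable character with the charset's replacement byte, which is '?' here.
    static String roundTrip(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String spanish = "Pe\u00F1uelas";   // n with tilde, present in ISO-8859-1
        String japanese = "\u65E5\u672C";   // two CJK characters, absent from ISO-8859-1
        System.out.println(fitsLatin1(spanish) + " " + roundTrip(spanish).equals(spanish));
        System.out.println(fitsLatin1(japanese) + " " + roundTrip(japanese).equals(japanese));
    }
}
```

Accented Western European text such as the example above fits and round-trips unchanged; anything outside ISO-8859-1 is replaced and cannot be recovered afterwards.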

Re: Conversion of data from UTF8 Character set to WE8ISO8859P1

A reader, August 11, 2004 - 9:33 am UTC

Hi Thara Harish,

You can also use the CONVERT function, i.e.

SELECT CONVERT(<colname>, <destination charset>, <source charset>) FROM <tablename>