Question and Answer

Tom Kyte

Thanks for the question, Venkatesh.

Asked: February 23, 2003 - 8:15 pm UTC

Last updated: May 13, 2005 - 9:12 am UTC

Version: 8.1.6

Viewed 1000+ times

You Asked

Hi Tom,
I need to export a JA16SJIS database and import it into a UTF8 database. The problems encountered are as follows.

On the JA16SJIS source database, set NLS_LANG=AMERICAN_AMERICA.UTF8, then perform the export, which results in ...

Export: Release 8.1.6.0.0 - Production on Sun Feb 23 12:42:17 2003

(c) Copyright 1999 Oracle Corporation. All rights reserved.

Connected to: Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production
With the Partitioning option
JServer Release 8.1.6.0.0 - Production
Export done in UTF8 character set and JA16SJIS NCHAR character set
server uses JA16SJIS character set (possible charset conversion)

About to export specified tables via Conventional Path ...
Current user changed to EXHELP2YMS
. . exporting table HPSYS_COMENT
EXP-00015: error on row 0 of table HPSYS_COMENT, column TOKUBETU_COMENT, datatype 1
EXP-00001: data field truncation - column length=600, buffer size=600 actual size=0
Export terminated successfully with warnings.

The export does not work despite setting a high buffer value, and there is sufficient disk space to perform the export (Metalink suggests insufficient space as a possible cause).
*********************************************************************

Solution: I set NLS_LANG to JA16SJIS and exported successfully.
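
For reference, a sketch of that working export step (the password, dump file name, and shell syntax are placeholders; the schema and table names come from the log above; on Windows use "set" instead of "export"):

export NLS_LANG=AMERICAN_AMERICA.JA16SJIS
exp userid=exhelp2yms/<password> tables=HPSYS_COMENT file=hpsys_coment.dmp log=hpsys_coment_exp.log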

Importing into the target UTF8 DB: with NLS_LANG set to JA16SJIS, it throws an error like

IMP-00019: row rejected due to ORACLE error 1401
IMP-00003: ORACLE error 1401 encountered
ORA-01401: inserted value too large for column
Column 1 JR0038

When NLS_LANG is set to UTF8, import reports that conversion is not possible. This I understand, as the import utility does not perform conversion from one multibyte character set to another. But why does the former method fail? I set the buffer to a very large value, but it still fails.

Please suggest a solution and an explanation for the same.
Thanks in advance.
Venki

and Tom said...

Well, export and import actually do perform the character set conversions -- whenever possible the CLIENT, not the database, does them.

But, that aside -- is the column JR0038 actually large enough? Something that takes 30 bytes in one database might take 60 bytes in the other. Have you looked at the actual size of that column to see if it might really need to be increased?
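
One way to check, as a sketch (the table name here is hypothetical, since the question does not say which table owns JR0038):

-- maximum bytes the column's data currently uses, on the source (JA16SJIS) database;
-- "cust_master" is a made-up table name
SELECT MAX(LENGTHB(jr0038)) AS max_bytes_stored FROM cust_master;

-- the defined width, for comparison
SELECT table_name, data_length FROM dba_tab_columns WHERE column_name = 'JR0038';

-- a rough estimate of what the same data would need in UTF8
SELECT MAX(LENGTHB(CONVERT(jr0038, 'UTF8'))) AS max_bytes_in_utf8 FROM cust_master;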


In fact, looking at note <Note:99120.1>, I would definitely say this is the cause of BOTH errors -- in JA16SJIS this string takes X bytes, in UTF8 it takes X+N bytes, and X+N exceeds the MAXIMUM string length.

<quote>
Explanation 1
-------------
In the UTF8 characterset a lot of characters (like the ones from the extended
ASCII set) need more than 1 byte to be stored. For example in WE8ISO8859P1 and
similar single byte charactersets the character "ê" takes 1 byte of storage but
it takes 2 bytes to store the same character in UTF8.

A definition of a VARCHAR2 column gives the size in bytes, not in characters.
That means a varchar2(2) in a WE8ISO8859P1 database can store the string "êe"
without problems. A similar varchar2(2) column in a UTF8 database would not be
able to store it since it would take 3 bytes (2 for ê and 1 for the simple e)
to store that string.

When export starts it simply takes a look at the definition of a table and
expects data of that size. However, if the characterset in the NLS_LANG variable
is set to UTF8, the strings selected from the database will be converted into
UTF8. If we follow the above example again the following would happen: The
WE8ISO8859P1 database has got a varchar2(2) column that happily stores the
string "êe" in 2 bytes. Export starts and expects the strings to be of a
maximum size of 2 bytes. When the rows are returned they are converted into
UTF8, meaning that for the row with "êe" 3 bytes are presented to exp instead
of the expected 2 bytes. When this happens the EXP-15 and EXP-1 errors are
returned.


Solution 1
----------
Set the characterset in NLS_LANG to a single byte characterset that is correct
for your platform. There is no reason why this shouldn't be possible.
</quote>
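
A tiny demo of what the note describes, runnable in a UTF8 database (the table name is made up):

CREATE TABLE charset_demo (c VARCHAR2(2));   -- 2 bytes, since the default length semantics is BYTE
INSERT INTO charset_demo VALUES ('êe');      -- fails with ORA-01401: 'ê' needs 2 bytes in UTF8, 'e' needs 1
SELECT LENGTHB('êe') FROM dual;              -- returns 3 in a UTF8 database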



So, you followed the workaround, but upon IMPORT you see the other side effect -- the string in the dmp file is still TOO LARGE to fit in the column (it was too large for exp to convert; it is too large for imp to import).

You must widen this column to hold this data.
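
A sketch of that fix on the target UTF8 database, using the table and column named in the export log (the new width is only an example - size it from the real data; the same applies to whichever table owns JR0038):

ALTER TABLE hpsys_coment MODIFY (tokubetu_coment VARCHAR2(1200));

-- or, if the target database is 9i or later, use character semantics instead of guessing a byte width:
ALTER TABLE hpsys_coment MODIFY (tokubetu_coment VARCHAR2(600 CHAR));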


Comments

Character Set Scanner Utility - CSSCAN

Logan Palanisamy, February 24, 2003 - 5:13 pm UTC

How about using the Character Set Scanner Utility (CSSCAN) available in 9i?

http://download-west.oracle.com/docs/cd/A91202_01/901_doc/server.901/a90236.pdf

You specify the FROMCHAR and TOCHAR sets, and it will tell you whether the data is convertible. If it is not, it will tell you which tables and which columns have the problem data; you then fix the data or increase the column width, etc.
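
For the original poster's case, a scan could look something like this (a sketch: the connect string and log name are placeholders, and the scanner's repository must be installed first by running csminst.sql):

csscan system/<password> FULL=Y FROMCHAR=JA16SJIS TOCHAR=UTF8 LOG=ja16sjis_to_utf8 CAPTURE=Y

The report and exception files named by LOG then show, per table and column, which data would be truncated or lossy after conversion.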

Hope that helps

TH8TISASCII to AL32UTF8

Alvin, May 11, 2005 - 11:00 pm UTC

I was given an export dump file that has a Thai character set (TH8TISASCII), and our current database character set is AL32UTF8.

We are currently using Oracle 9i. What I did was change the environment variable in the OS, in .bash_profile, to

export NLS_LANG=THAI_THAILAND.TH8TISASCII

and tried to import it into a schema. According to the Oracle documentation, the database character set is as follows:

AL32UTF8   Unicode 3.1 UTF-8 Universal character set   MB, ASCII, EURO


My import errors are:

import done in TH8TISASCII character set and AL16UTF16 NCHAR character set
import server uses AL32UTF8 character set (possible charset conversion)
export server uses TH8TISASCII NCHAR character set (possible ncharset conversion)
. importing DBA_HO's objects into B2BTHAI
. . importing table "SUPPLIER"
IMP-00019: row rejected due to ORACLE error 1401
IMP-00003: ORACLE error 1401 encountered
ORA-01401: inserted value too large for column
Column : AAAGQ9AAJAAAbPDAAA
Column : 14196
Column : -1
Column : 9

My NLS_DATABASE_PARAMETERS

PARAMETER VALUE
------------------------------ ------------------------------
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CHARACTERSET AL32UTF8
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_RDBMS_VERSION 9.2.0.4.0




Tom Kyte
May 12, 2005 - 7:43 am UTC

You are going from a single byte characterset to a multi-byte characterset.

Every character in the import file might take 1, 2, 3 or more bytes in the database.

So, you could have a varchar2(10) that held 10 characters in the single byte characterset - it might only be able to hold 2 or 3 characters in the multi-byte representation.

It is a fact of multi-byte charactersets. You need to make the columns able to hold more bytes. You can use

column_name varchar2(30 CHAR)

to let it hold 30 characters in the characterset, instead of 30 bytes.
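
For example (a sketch; the table and column names are made up):

CREATE TABLE name_demo
( short_name VARCHAR2(30),        -- 30 bytes: may hold as few as 10 multibyte characters
  long_name  VARCHAR2(30 CHAR)    -- 30 characters, however many bytes each one needs
);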

I read Note:99120.1

Alvin, May 12, 2005 - 10:30 pm UTC

So basically I have two alternatives:

1. Fastest way - since I'm importing the data for development testing (a dev database), it would be much faster and less painful to just create another database with a similar character set.

2. Extract the DDL and painstakingly change all the varchar columns.

(Desperately lazy and hoping) Can't I just change NLS_LENGTH_SEMANTICS and then import the dump file?

Tom Kyte
May 13, 2005 - 9:12 am UTC

Yes -- as long as the tables do not exist and import creates them, this will consider varchar2(20) to be varchar2(20 char).


Now, the maximum length of a varchar2 is 4000 bytes -- regardless (and char is 2000 bytes -- regardless). Be aware of that; it means a long text string could still be problematic.
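
A sketch of that lazy route (assumptions: the target tables are dropped first so imp re-creates them, the instance uses an spfile for SCOPE=BOTH, and the changed parameter is picked up by the new session that imp opens):

ALTER SYSTEM SET nls_length_semantics = CHAR SCOPE = BOTH;

-- now run imp as before; the tables it creates will treat VARCHAR2(20) as VARCHAR2(20 CHAR),
-- still subject to the 4000-byte VARCHAR2 / 2000-byte CHAR hard limits Tom mentions above.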

