Import and export between dbs with different character sets

Question and Answer

Tom Kyte

Thanks for the question, alvaro.

Asked: August 04, 2000 - 10:30 am UTC

Last updated: January 14, 2013 - 11:12 am UTC

Version: Oracle Server 7.3.4

Viewed 100K+ times!

You Asked

I have a db with the US7ASCII character set; in this db I can't store characters like ñ. For example, the word "niño" is stored as ni?o.
I have another db with the WE8ISO8859P1 character set (Oracle 8.1.5 Enterprise Edition on NT). I want to know whether there are any problems when I do an export/import from the US7ASCII db to the WE8ISO8859P1 db. How will the db with the WE8ISO8859P1 character set recognize ni?o? Will export/import make the conversion without problems?


Thanks,

Acueto

and Tom said...

the data will remain the same -- it'll still be ni?o, since it's stored with 7 bits in the US7ASCII database. The 8-bit database won't add the extra bit.

It will be exactly as if a client using US7ASCII connected to the WE8ISO8859P1 database and inserted the data directly -- the client character set is obeyed on the way in and out. Data will be converted from US7ASCII to WE8ISO8859P1 on the way in (no change) and from WE8ISO8859P1 to US7ASCII on the way out (strip that high bit).
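The round trip Tom describes can be sketched outside the database (a Python illustration of the 7-bit behavior, not Oracle's actual conversion routine):

```python
# Sketch of the conversion Tom describes (not Oracle's code):
# US7ASCII has only 7-bit code points, so a character such as
# n-tilde is replaced on the way into a US7ASCII client/database.
word = "niño"

# "Converting" to US7ASCII: anything above code point 127 has no
# mapping and becomes a replacement character -- the ? the user sees.
us7ascii = word.encode("ascii", errors="replace").decode("ascii")
print(us7ascii)  # ni?o

# Going the other way (7-bit data into an 8-bit character set)
# changes nothing: the lower 128 code points are identical in both.
```

Once the ñ has been flattened to ?, no later conversion can recover it -- which is why moving the ni?o data into the WE8ISO8859P1 database leaves it as ni?o.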




Comments

What will happen with ASCII character 164?

Mario Goren mlgoren@bellsouth.net, March 27, 2001 - 12:09 pm UTC

It is not clear whether I will see the character after import into the other character set.
Is it the same with US7ASCII to WE8DEC?

I got an IMP-00016 error

lisa, April 10, 2002 - 10:29 am UTC

The export file from Oracle 7.3.4 is in US7ASCII format. Our software was upgraded to a newer version with Oracle 8.1.7 and Oracle9iAS, and the new database is required to use the NLS_LANG setting "UTF8". When I took the export file (US7ASCII) and imported it into the new db (UTF8), it gave me an IMP-00016 error: required character set conversion (type 1 to 871) not supported.

How do I make the exported dump load into the new Oracle 8.1.7?
I am aware that they are not in the same format, but I can't change them. The old db dump is US7ASCII; the new db requires UTF8 so we can use the web browser feature.

Do you know how we can get by with this?



Tom Kyte
April 10, 2002 - 11:16 am UTC

Set your NLS_LANG environment variable to AMERICAN_AMERICA.US7ASCII before doing the import and that'll work. You must have it set to .UTF8 right now and that is the cause of the error.



This is a documented (albeit confusingly documented) restriction. Page 2.53 of the utilities guide has:

<quote>
Character Set Conversion

The following sections describe character conversion for CHAR and NCHAR data.

CHAR Data

Up to three character set conversions may be required for character data during an export/import operation:

1. Export writes export files using the character set specified in the NLS_LANG
environment variable for the user session. A character set conversion is
performed if the value of NLS_LANG differs from the database character set.

2. If the character set in the export file is different than the Import user session character set, Import performs a character set conversion to its user session character set. Import can perform this conversion only if the ratio of the width of the widest character in its user session character set to the width of the smallest character in the export file character set is 1.


3. A final character set conversion may be performed if the target database's
character set is different from Import's user session character set.

</quote>

You are hitting the issue in #2: the session's character set (UTF8) has a maximum character width larger than 1. The export data was exported with US7ASCII -- that width is 1. The ratio of these widths is not 1 -- hence IMP cannot do the conversion (and that's what it is trying to do) -- we have to let the database do the conversion to UTF8.

Setting the NLS_LANG should clear that right up. I've tested to ensure this is so.
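The width rule in #2 is easy to see concretely (a Python sketch; the codecs stand in for the Oracle character sets, this is an illustration of the arithmetic, not IMP's code):

```python
# IMP compares the widest character in the import session's
# character set to the width of characters in the export file's set.
# US7ASCII: every character is 1 byte.  UTF8: up to 3 bytes
# (4 for AL32UTF8 supplementary characters).
ascii_width = len("A".encode("ascii"))        # 1 byte
utf8_widest = len("\u20ac".encode("utf-8"))   # euro sign: 3 bytes

print(ascii_width, utf8_widest)  # 1 3

# The ratio 3:1 != 1, so IMP refuses (IMP-00016) and the database
# itself must convert -- hence importing with NLS_LANG set to the
# export file's character set, US7ASCII.
```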



ora-12712 error

lisa, April 10, 2002 - 11:48 am UTC

Thank you for your quick reply, but when I tried to change the character set of the Oracle 8.1.7 db (which has the UTF8 character set), I got the following error:


H:\Oracle>SVRMGRL

Oracle Server Manager Release 3.1.7.0.0 - Production

Copyright (c) 1997, 1999, Oracle Corporation. All Rights Reserved.

Oracle8i Release 8.1.7.0.0 - Production
JServer Release 8.1.7.0.0 - Production

SVRMGR> CONNECT INTERNAL
Connected.
SVRMGR> SHUTDOWN IMMEDIATE;
Database closed.
Database dismounted.
ORACLE instance shut down.
SVRMGR> STARTUP MOUNT;
ORACLE instance started.
Total System Global Area 175650844 bytes
Fixed Size 75804 bytes
Variable Size 121839616 bytes
Database Buffers 53657600 bytes
Redo Buffers 77824 bytes
Database mounted.
SVRMGR> ALTER SYSTEM ENABLE RESTRICTED SESSION;
Statement processed.
SVRMGR> ALTER DATABASE OPEN;
Statement processed.
SVRMGR> ALTER DATABASE CHARACTER SET US7ASCII;
ALTER DATABASE CHARACTER SET US7ASCII
*
ORA-12712: new character set must be a superset of old character set
SVRMGR> ALTER DATABASE NATIONAL CHARACTER SET US7ASCII;
ALTER DATABASE NATIONAL CHARACTER SET US7ASCII
*
ORA-12712: new character set must be a superset of old character set
SVRMGR>



Tom Kyte
April 10, 2002 - 12:23 pm UTC

No, that's not what I was saying to do.

Set the ENVIRONMENT variable NLS_LANG. You have it set already, to UTF8

$ setenv NLS_LANG AMERICAN_AMERICA.UTF8

$ imp userid=scott/tiger tables=emp ignore=y

Import: Release 9.0.1.2.0 - Production on Wed Apr 10 12:21:59 2002

(c) Copyright 2001 Oracle Corporation. All rights reserved.


Connected to: Oracle9i Enterprise Edition Release 9.0.1.2.0 - Production
With the Partitioning option
JServer Release 9.0.1.2.0 - Production

IMP-00016: required character set conversion (type 1 to 871) not supported
IMP-00000: Import terminated unsuccessfully



set it to this:
$ setenv NLS_LANG AMERICAN_AMERICA.US7ASCII

$ !imp
imp userid=scott/tiger tables=emp ignore=y

Import: Release 9.0.1.2.0 - Production on Wed Apr 10 12:22:12 2002

(c) Copyright 2001 Oracle Corporation. All rights reserved.


Connected to: Oracle9i Enterprise Edition Release 9.0.1.2.0 - Production
With the Partitioning option
JServer Release 9.0.1.2.0 - Production

Export file created by EXPORT:V07.03.04 via conventional path
. importing SCOTT's objects into SCOTT
. . importing table "EMP"
......


and do the import -- that's all you need to do. don't touch the database!


Encrypted data problem

Rahul, August 26, 2002 - 7:44 am UTC

Hi Tom

We have Oracle 8.1.6 installed, and in the db we are storing some encrypted passwords based on the DES standard. The encryption is done on the Java side; the Java program converts a plain string to a Unicode string, which is stored in the db.

The problem is that when we take an export of that schema and import it on the same server, which has the UTF8 character set, all the passwords are corrupted after importing.

I'm doing both operations from the server only, so I don't think there is any character set problem.

Please suggest something.

Regards
Rahul.

Tom Kyte
August 26, 2002 - 8:57 am UTC

sounds like you are using a varchar2 or char to store this BINARY DATA.

That would be an inappropriate type for storing encrypted data. Use RAW instead and your worries go away (it is raw, binary data after all -- you wouldn't store a date in a string or a number, right? Same thing here: use the right type for the data).
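Why the VARCHAR2 route corrupts ciphertext can be sketched like this (a Python illustration with a made-up 8-byte "ciphertext" -- not real DES output, and not Oracle's conversion code):

```python
# Hypothetical 8-byte ciphertext -- arbitrary bytes, some > 127:
cipher = bytes([0x1A, 0xD2, 0x07, 0x72, 0x8F, 0x3D, 0xF0, 0xDD])

# Storing it in a VARCHAR2 means the database treats it as TEXT in
# some character set -- say WE8ISO8859P1 (latin-1 here):
as_text = cipher.decode("latin-1")

# A character set conversion to US7ASCII -- exactly what exp/imp can
# trigger -- mangles every high-bit byte into '?':
converted = as_text.encode("ascii", errors="replace")

print(converted != cipher)  # True: the ciphertext is destroyed

# A RAW column is never character-set converted, so none of this
# can happen to it.
```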

Alternative

Rahul, August 27, 2002 - 4:06 am UTC

Hi Tom

You are right, we are storing the encrypted passwords in a VARCHAR2 column.
But as far as I know, during export/import Oracle doesn't touch whatever data is in the column, so why is it not storing the encrypted data as it was at the time of export?
I would also appreciate it if you could tell me another workaround, because I don't want to change the datatype of the column at this stage.

Regards
Rahul.


Tom Kyte
August 27, 2002 - 8:34 am UTC

You are wrong -- during import/export the data can and often does go through character set translation.


Unless

a) source database has CHARACTER SET X
b) client running export has their NLS_LANG set to X
c) client running import has their NLS_LANG set to X
d) target database has CHARACTER SET X

you won't get very far. You do not mention what the character sets were, so I have to assume that somewhere in A, B, C, D -- X changed to Y and the data was changed.

You have a bug in your program -- that bug is you are storing binary data in a totally improper datatype. You should really fix your bug.

Agree but...

Rahul, August 29, 2002 - 12:43 am UTC

Hi Tom

I fully agree with you that we are storing encrypted data in the wrong datatype.
But I want to tell you one more thing: in another project we do the same thing, but not from the Java side -- we use DBMS_OBFUSCATION_TOOLKIT to encrypt/decrypt the passwords and store them in a VARCHAR2 column, and that database has never shown this kind of error in imp/exp.
So please clear up my doubt: what could be the reason?

Regards
Rahul.


Tom Kyte
August 29, 2002 - 7:38 pm UTC

You had the fortunate luck of having the NLS_LANG on the client set equal to that of the database, and the two databases had the same character set.  Period.

Consider this test done on UNIX where the default NLS_LANG is us7ascii:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace function crypt( p_str in varchar2 ) return varchar2
  2  as
  3      l_data  varchar2(255);
  4          l_return varchar2(255);
  5  begin
  6      l_data := rpad( p_str, (trunc(length(p_str)/8)+1)*8, chr(0) );
  7
  8      dbms_obfuscation_toolkit.DESEncrypt
  9          ( input_string => l_data,
 10            key_string   => 'MagicKey',
 11            encrypted_string=> l_return );
 12          return l_return;
 13  end;
 14  /

Function created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( id int,  x varchar2(80) );
Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( 2, crypt('How Now Brown Cow') );
1 row created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> commit;
Commit complete.

ops$tkyte@ORA817DEV.US.ORACLE.COM> host unsetenv NLS_LANG; exp userid=/ tables=t

Export: Release 8.1.7.4.0 - Production on Thu Aug 29 19:35:05 2002

(c) Copyright 2000 Oracle Corporation.  All rights reserved.


Connected to: Oracle8i Enterprise Edition Release 8.1.7.4.0 - Production
With the Partitioning option
JServer Release 8.1.7.4.0 - Production<b>
Export done in US7ASCII character set and WE8ISO8859P1 NCHAR character set
server uses WE8ISO8859P1 character set (possible charset conversion)
</b>
About to export specified tables via Conventional Path ...
. . exporting table                              T          1 rows exported
Export terminated successfully without warnings.

ops$tkyte@ORA817DEV.US.ORACLE.COM> update t set id = 1;
1 row updated.

ops$tkyte@ORA817DEV.US.ORACLE.COM> commit;
Commit complete.

ops$tkyte@ORA817DEV.US.ORACLE.COM> host imp userid=/ full=y ignore=y

Import: Release 8.1.7.4.0 - Production on Thu Aug 29 19:35:05 2002

(c) Copyright 2000 Oracle Corporation.  All rights reserved.


Connected to: Oracle8i Enterprise Edition Release 8.1.7.4.0 - Production
With the Partitioning option
JServer Release 8.1.7.4.0 - Production

Export file created by EXPORT:V08.01.07 via conventional path<b>
import done in WE8ISO8859P1 character set and WE8ISO8859P1 NCHAR character set
export client uses US7ASCII character set (possible charset conversion)</b>
. importing OPS$TKYTE's objects into OPS$TKYTE
. . importing table                            "T"          1 rows imported
Import terminated successfully without warnings.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> column dump_x format a40
ops$tkyte@ORA817DEV.US.ORACLE.COM> select id, dump(x,16) dump_x from t;

        ID DUMP_X
---------- ----------------------------------------
         1 Typ=1 Len=24: 1a,<b>d2</b>,7,72,8f,3d,f0,dd,77,
           5d,f0,91,53,57,f1,3a,dd,d0,f4,ca,3f,54,6
           ,ac

         2 Typ=1 Len=24: 1a,<b>4f</b>,7,72,3f,3d,3f,59,77,
           5d,3f,3f,53,57,3f,3a,59,3f,6f,45,3f,54,6
           ,3f


ops$tkyte@ORA817DEV.US.ORACLE.COM>


<b>see how the data is DEFINITELY different.  wrong datatype = bug in your code</b>


 

Conversion of some spec. char to UTF8 not correct

Amra Sivcevic, April 09, 2003 - 10:26 am UTC

Hi Tom,

Our problem is connected to the original question.
We have problem converting data from WE8ISO8859P1 to UTF8
on Sun Solaris 8(64 bit).

Database 1 - Oracle 815 AMERICAN_AMERICA.UTF8 character set.
Database 2 - Oracle 815 AMERICAN_AMERICA.WE8ISO8859P1 character set.

We are supposed to import some tables with European special characters into Database 1.
We exported Table1 from Database2 using env variable
NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1
and imported Table1 into Database1 using env variable
NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1.

We exported Table1 from Database2 using env variable
NLS_LANG=AMERICAN_AMERICA.UTF8
and imported Table1 into Database1 using env variable
NLS_LANG=AMERICAN_AMERICA.UTF8 and got
the same results.

Most special characters appear to be stored properly except one special character (ASCII 132 -- the European low quotation mark); a dump of that special character from Database 1 shows 239,191,189, which is not correct. If we use the CONVERT function we cannot get 132.

Please can you explain how we can get ASCII 132 properly stored from Database 2 into Database 1, and what NLS_LANG we should use on the server and client?

Thanks in advance
Amra

Tom Kyte
April 09, 2003 - 10:59 am UTC

Well -- they are "correct", it is just that you are not "happy" with the results.

Why isn't it "correct"? We are using the prescribed, dictated mapping of a single-byte character to a multi-byte character.
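For what it's worth, those dump bytes can be decoded (a Python sketch of the byte arithmetic; reading byte 132 as Windows-1252 is an assumption about where the data originally came from, since 132 is only a visible quotation mark in that code page, not in ISO 8859-1):

```python
# 239,191,189 decimal is EF BF BD hex -- the UTF-8 encoding of
# U+FFFD, the Unicode REPLACEMENT CHARACTER, i.e. "I could not
# map this character":
assert bytes([239, 191, 189]) == "\ufffd".encode("utf-8")

# Byte 132 (0x84) is an invisible C1 control in ISO 8859-1; the low
# quotation mark at 132 exists only in Windows-1252.  Decoded as
# Windows-1252, it maps cleanly to U+201E and converts to UTF-8:
ch = bytes([132]).decode("windows-1252")
print(ch, ch.encode("utf-8").hex())  # „ e2809e
```

So if the source data was really Windows-1252 text stored in a WE8ISO8859P1 database, tagging the export session accordingly (for example, a WE8MSWIN1252 client character set) would be the direction to investigate.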

How to convert from RDB to Oracle

Abdul Wahab, June 15, 2003 - 8:46 am UTC

Dear Tom,
I have been given the task of migrating data from RDB running on VAX into Oracle 8.1.7 on Tru64 UNIX with the character set AR8ISO8859P6. FTP was used to transfer the files (*.rdb) containing the data onto a Windows machine. Now how can I use these files to convert the data and store it in the Oracle database? To make matters worse for me, and challenging for you, the data is bilingual, containing both English and Arabic.
thanks in advance


Tom Kyte
June 15, 2003 - 10:04 am UTC

(looking forward to a followup... gateways come with iAS; if you have iAS, you don't buy them separately, btw. Many people have them and don't even realize it. On Windows you DO have the generic connectivity -- via ODBC -- gateway with the database regardless, so you CAN in fact create a dblink from Windows to RDB on VAX given an ODBC driver -- anyway)

you have to dump the data out. Unless you have RDB running on Windows, having those binary datafiles from VAX is less than useful. It would be like taking the Unix Oracle datafiles and giving them to SQL Server on Windows. They would be big old blobs of data, nothing more.



How to convert from RDB to Oracle

Abdul Wahab, June 15, 2003 - 8:47 am UTC

(Duplicate of the question above, with one addition:)
P.S. Please don't ask me to use expensive tools like Oracle Transparent Gateway and the like.

An unsupported feature

Tamas Szecsy, August 21, 2003 - 5:55 am UTC

For those who have character conversion/migration problems, there is a Metalink note that follows. You could:
1. export the data in the old character set
2. go to the new instance, alter the database character set to the old one
3. import the data
4. alter the character set back to the new one.

I did this on Windows 2000 for a WE8ISO8859P1 >> EE8ISO8859P2 conversion. It worked, though as you may note in the text below, it is not supported by Oracle. This is only viable on 8.1.x destination databases and upward.

Metalink note
===============
NOTE:66320.1
Title: V8: Changing the database or national character set

PURPOSE

To explain how to change the database or national character set of
an existing Oracle8 database without having to recreate the database.

SCOPE & APPLICATION

The method described here is documented in the Oracle 8.1.x documentation.
It is not documented for version 8.0.x, but it can be used there.

Before using this method it is essential to do a full backup of the
database.

This method does not work in Oracle7.

Note that changing the database or the national character set as described
in this document does not change the actual character codes, it only changes
the character set declaration. If you want to convert the contents of the
database from one character set to another you must use the Oracle Export
and Import utilities. This is needed, for example, if the source character
set is not a subset of the target character set.

RELATED DOCUMENTS

<Note:13856.1> V7: CHANGING DATABASE CHARACTER SET
<Note:62107.1> The National Character Set in Oracle8

Oracle8i National Language Support Guide



CHANGING THE DATABASE OR NATIONAL CHARACTER SET
================================================

In Oracle7 it is possible to update the "SYS.PROPS$" table to change the
database character set. The method is unsupported but it is externally
published (see <Note:13856.1>). It is used by Oracle Installer to install
seed databases.

This method is not legal in Oracle8 because character set information is
stored in many other places in the Data Dictionary besides SYS.PROPS$, for
example with each table column, PL/SQL argument, etc.

In Oracle8 there is another way of changing the database or national character
set. The method uses two commands, which are documented in the Oracle8i
National Language Support Guide:

ALTER DATABASE [<db_name>] CHARACTER SET <new_character_set>
ALTER DATABASE [<db_name>] NATIONAL CHARACTER SET <new_NCHAR_character_set>

The database name is optional. The character set name should be specified
without quotes, for example:

ALTER DATABASE CHARACTER SET WE8ISO8859P1

To change the database character set perform the following steps. Not all
of these steps are absolutely necessary, but they are highly recommended:

SVRMGR> SHUTDOWN IMMEDIATE; -- or NORMAL

<do a full database backup>

SVRMGR> STARTUP MOUNT;
SVRMGR> ALTER SYSTEM ENABLE RESTRICTED SESSION;
SVRMGR> ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
SVRMGR> ALTER SYSTEM SET AQ_TM_PROCESSES=0;
SVRMGR> ALTER DATABASE OPEN;
SVRMGR> ALTER DATABASE CHARACTER SET <new_character_set>;
SVRMGR> SHUTDOWN IMMEDIATE; -- OR NORMAL
SVRMGR> STARTUP;

To change the national character set replace the ALTER DATABASE CHARACTER
SET command with ALTER DATABASE NATIONAL CHARACTER SET. You can issue both
commands together if you wish.

The ALTER DATABASE [NATIONAL] CHARACTER SET command will return:

ORA-01679: database must be mounted EXCLUSIVE and not open to activate

- if you do not enable restricted session
- if you startup the instance in PARALLEL/SHARED mode
- if you do not set the number of queue processes to 0
- if you do not set the number of AQ time manager processes to 0
- if anybody is logged in apart from you.

This error message is misleading. The command requires the database to be
open but only one session, the one executing the command, is allowed.

The above method will only work if the old character set is US7ASCII (or if
you change the character set to itself). If the old character set is neither
US7ASCII nor equal to the new character set, the ALTER DATABASE [NATIONAL]
CHARACTER SET command will return:

- in Oracle 8.1.5 and above:

ORA-12712: new character set must be a superset of old character set

- in Oracle 8.0.5 and 8.0.6:

ORA-12710: new character set must be a superset of old character set

- in Oracle 8.0.3 and 8.0.4:

ORA-24329: invalid character set identifier

**************************** INTERNAL_USE ********************************

In Oracle 8.1 an undocumented option INTERNAL_USE has been added to
the ALTER DATABASE [NATIONAL] CHARACTER SET command:

ALTER DATABASE [<db_name>]
[NATIONAL] CHARACTER SET INTERNAL_USE <new_character_set>

If included in the statement this option will switch the character set
verification off allowing any database character set to be changed.

*** Improper use of the option can lead to corruption of the database. ***



I can store ñ in US7ASCII

Adrian, June 03, 2004 - 11:22 am UTC

From the original question:-

>> I have a db with us7ascii character set, in this db I 
>> can´t store characters like ñ. For example the 
>> word "niño" is stored like ni?o.

I can store this character and am guessing this perceived problem is due to a client NLS_LANG issue...

SQL*Plus: Release 9.2.0.5.0 - Production on Thu Jun 3 15:57:06 2004

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Enter password:

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.5.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production

SQL> create table t ( x char(1) );

Table created.

SQL> insert into t values ( 'ñ' );

1 row created.

SQL> select ascii('ñ') from dual;

ASCII('ñ')
----------
       241

SQL> insert into t values ( chr(241) );

1 row created.

SQL>
SQL> select * from t;

X
-
ñ
ñ

SQL> select value from v$nls_parameters where parameter = 'NLS_CHARACTERSET';

VALUE
----------------------------------------------------------------
US7ASCII

I'm having a similar problem exporting German characters to WE8ISO8859P15 from my US7ASCII database, but at least I know the characters are in my source data (I'm about to add a post on the problem).

Regards
Adrian

 

Tom Kyte
June 03, 2004 - 1:59 pm UTC

you cannot *safely* store that character in that us7ascii database.

all it takes is a client (say windows) that connects to that db with a different client character set. then conversion kicks in -- and they will see "garbage".

very important to use the proper characterset for your data.

US7ASCII to WE8ISO8859P15 German char conversion

Adrian, June 03, 2004 - 11:59 am UTC

Tom

I have read all your NLS threads and many Metalink ones too, and cannot get a solution to my problem of exporting German chars from a US7ASCII Solaris database and importing them into a WE8ISO8859P15 W*?!&WS XP database. I have tried many permutations, as follows:

Attempt 1
---------
Export with UNIX NLS_LANG set to AMERICAN_AMERICA.US7ASCII
Import with DOS NLS_LANG set to AMERICAN_AMERICA.US7ASCII

I get the message "import done in US7ASCII...<snip> import server uses WE8ISO8859P15 character set ( possible charset conversion )."

The German characters are converted and I confirm this as a data issue by running a SELECT ASCII(lower_a) FROM german_chars; and get 100 instead of 228. 

Note: LOWER_A is a column in my GERMAN_CHARS dummy table that contains "ä" in the source data ( and in the .dmp file as confirmed with an od -x ). Unfortunately, it stores "d" in the target table.

Attempt 2
---------
Export with UNIX NLS_LANG set to AMERICAN_AMERICA.WE8ISO8859P15.
Import with DOS NLS_LANG set to AMERICAN_AMERICA.WE8ISO8859P15.

I get the message "import done in WE8ISO8859P15 characterset...<snip>." No mention of any conversion ( as this was done during export ).

The German characters are still converted ( ä converts to a small d ( ascii 100 instead of 228 ). An od -x on the .dmp file confirms that characterset of export file was WE8... and the ä is bound to the INSERT statement.

Attempts 3 & 4
--------------
Same as attempt 2, but one without setting NLS_LANG on DOS and one setting it to the appropriate value according to the codepage (AMERICAN_AMERICA.US8PC437). They both have the same result as attempt 2, but the US8PC437 version gives me a possible charset conversion message.

Attempts 5 & 6
--------------
Same as attempts 3 and 4 but using US7ASCII .dmp file. Same result with extra warning about export being US7ASCII and possible conversion.

So despite all the combinations, I cannot get German characters that exist in a US7ASCII database into a supposed superset database with WE8ISO8859P15. 

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1224836384599,

A quote of yours from the above thread states "Going the other way, from 7bit to 8bit won't encounter any issues since the 7bit US7ASCII is a subset of the 8bit WE8ISO8859P1."

I'm not having this experience.

Any suggestions because this is driving me mad !

Regards
Adrian

PS My source data...
di_import@dhdi SQL> select * from german_chars;

L U L U L U D
- - - - - - -
ä Ä ö Ö ü Ü ß

di_import@dhdi SQL> select value from v$nls_parameters where parameter = 'NLS_CHARACTERSET';

VALUE
----------------------------------------------------------------
US7ASCII


And my target data...

SQL> select * from german_chars;

L U L U L U D        
- - - - - - -
d D v V | \ _      

SQL> select value from v$nls_parameters where parameter = 'NLS_CHARACTERSET';

VALUE                    
----------------------------------------------------------------
WE8ISO8859P15

And to prove my target client can read German chars...
SQL> create table t as select chr(228) as x from dual;

Table created.

SQL> select x from t;

X
-
ä

 

Tom Kyte
June 03, 2004 - 2:02 pm UTC

exporting German chars from a US7ASCII SOLARIS


You cannot have german characters in a 7bit database. Well, you can, but you cannot expect them to be "OK".

As soon as you cross character set boundaries - bamm, there they go.


You won't be able to use export/import.

You MIGHT be able to use SQLLDR and a flat file dump.

but as soon as you cross character set boundaries with any of the tools (like exp/imp) -- character set conversion kicks in and that 8bit data just isn't supposed to be there in that 7bit database.

see
http://asktom.oracle.com/~tkyte/flat/index.html
and try dumping the data to a flat file.



Don't I know it ?

Adrian, June 04, 2004 - 4:19 am UTC

>> very important to use the proper characterset for your data

Tell me about it :o(

Then tell the DBA who created the database about it ;o)

I never create databases with US7ASCII - it's just too restrictive these days. I was a little shocked to see a US7ASCII database with German data in it, but by the time I arrived on the job it was way too late to revert...

Thanks for your help. I'm going to test a characterset conversion at the database level on a test DB, but from what you say about the theoretically impossible but technically possible storage of German characters in a US7ASCII database and the effect this has on conversions, I'm not too hopeful.

Regards
Adrian

Tom Kyte
June 04, 2004 - 8:18 am UTC

ahh, that would be another approach -- if a german characterset is a superset of us7ascii, you might be able to convert it (the database).

see chapter 3 of your NLS guide.

I might have got lucky...

Adrian, June 04, 2004 - 7:29 am UTC

It looks like ALTER DATABASE CHARACTERSET WE8ISO8859P15 might work for me here.

From my test US7ASCII DB:-

First the test setup
--------------------
SQL>
SQL> CREATE TABLE t ( x VARCHAR2(4000) );

Table created.

SQL> 
SQL> DECLARE
  2      s LONG;
  3  BEGIN
  4      FOR i IN 1 .. 255 LOOP
  5         s := s || CHR(i);
  6      END LOOP;
  7      INSERT INTO t VALUES ( s );
  8      COMMIT;
  9  END;
 10  /

PL/SQL procedure successfully completed.

SQL>
SQL> SELECT * FROM t;

X                                                                               
--------------------------------------------------------------------------------
                                                                       

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ
[\]^_`abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ ¡¢£¤¥¦§¨©ª
«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùú
ûüýþÿ

Then the migration
------------------
SQL> conn sys@test as sysdba
Connected.
SQL> 
SQL> alter system enable restricted session;

System altered.

SQL> alter system set job_queue_processes = 0;

System altered.

SQL> alter system set aq_tm_processes = 0;

System altered.

SQL> alter database character set WE8ISO8859P15;

Database altered.

SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area   93395628 bytes
Fixed Size                   453292 bytes
Variable Size              75497472 bytes 
Database Buffers           16777216 bytes 
Redo Buffers                 667648 bytes
Database mounted.
Database opened.

Finally the test 
----------------

( testing that ASCII values of chars haven't changed )

SQL> SELECT value FROM v$nls_parameters WHERE parameter = 'NLS_CHARACTERSET';

VALUE
----------------------------------------------------------------
WE8ISO8859P15

SQL> 
SQL> DECLARE
  2      n PLS_INTEGER := 0;
  3      a PLS_INTEGER;
  4  BEGIN
  5      FOR r IN ( SELECT x FROM t ) LOOP
  6         FOR i IN 1 .. LENGTH(r.x) LOOP
  7            a := ASCII( SUBSTR( r.x, i, 1 ) );
  8            IF i != a THEN
  9           n := n + 1;
 10           DBMS_OUTPUT.PUT_LINE( 'Character in position ' || i ||
 11                         ' has changed to ascii ' || a );
 12            END IF;
 13         END LOOP;
 14      END LOOP;
 15      DBMS_OUTPUT.PUT_LINE( n || ' incorrect conversions.' );
 16  END;
 17  /
0 incorrect conversions.

PL/SQL procedure successfully completed.

This looks pretty promising for me. The csscan utility showed up the 8 bit chars as "lossy conversions", but they seem to have substituted OK.

Can you see any flaws in the test ?

Thanks
Adrian 

Tom Kyte
June 04, 2004 - 8:54 am UTC

changing the database character set won't actually touch the data so it won't change anything.

all you need to do is ensure that by EXPORTING from this instance and IMPORTING into the other -- nothing "bad" happens.
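The distinction here -- re-declaring the character set versus converting the data -- can be sketched in a few lines (Python as illustration; latin-1 and iso8859-15 stand in for WE8ISO8859P1 and WE8ISO8859P15):

```python
# The stored byte for a-umlaut in a single-byte Latin character set:
stored = bytes([228])

# ALTER DATABASE CHARACTER SET only changes the *label* on the
# stored bytes; the bytes themselves are untouched.  Byte 228 means
# the same thing under either label:
as_p1  = stored.decode("latin-1")     # read as WE8ISO8859P1-ish
as_p15 = stored.decode("iso8859-15")  # reinterpreted after the ALTER
print(as_p1 == as_p15 == "ä")  # True

# exp/imp, by contrast, CONVERTS: the same character moving into a
# UTF8 database would become different bytes entirely:
print(stored != "ä".encode("utf-8"))  # True (UTF-8 is c3 a4)
```

This is also why Adrian's test came out clean: P1 and P15 differ in only a handful of code points (the euro sign block), so byte-for-byte the ä, ö, ü data already meant the right thing under the new label.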

Everything seems good...

Adrian, June 04, 2004 - 10:01 am UTC

Tom

Good suggestion. I tried the exp from migrated DB to existing WE8 DB and got the same successful results. INSERTs are fine too. So it seems that all my lossy conversions are fortunately the same !

With the ALTER DATABASE not touching the data, makes this a very quick solution too.

Regards
Adrian


export import between WE8ISO8859P1 and US7ASCII databases

Amal, September 13, 2004 - 11:19 am UTC

Hi Tom,

I have two databases, one in WE8ISO8859P1 and the other in US7ASCII. I am trying to import a table's data from the WE8ISO8859P1 database into the US7ASCII database. But data like Primus® HF-1050 is getting converted to Primus? HF-1050 in the US7ASCII database.

I have tried setting the NLS_LANG parameter to both US7ASCII and WE8ISO8859P1 and doing the export/import, but no luck.

Is it that US7ASCII does not support special characters like ®, or is there something that I have missed?

Your response would be immensely helpful.

Thanks

Tom Kyte
September 13, 2004 - 1:57 pm UTC

yes, because the ® isn't ascii -- it doesn't fit into 7 bits.


You get the lower 128 characters, period, with us7ascii.

Amal, September 14, 2004 - 10:22 am UTC

Thanks a lot Tom! You have confirmed my hunch.

import of data

Reader, January 03, 2005 - 4:42 am UTC

Tom,

We like to export schema level export from AL32UTF8 database and import into US7ASCII database. The database versions and o.s versions are same (oracle 9.2.0.1 and Sun Solaris 2.8). The source database does not contain any Asian language data. All the data are only in English.

How does this need to be done? Are there any character set conversion issues? Please give details.

Rgds

Tom Kyte
January 03, 2005 - 8:48 am UTC

sure there will be issues. there are hundreds of characters possible in AL32UTF8 that have no corresponding representation in US7ASCII -- so be prepared to have lots of data changed on the way in.

You might use the character set scanner utility to see what might be impacted:

http://docs.oracle.com/docs/cd/B10501_01/server.920/a96529/ch11.htm#1656


export/import

binoj, March 31, 2005 - 11:36 pm UTC

Can I do an export/import using the 8.0.6 version of exp/imp?


Tom Kyte
April 01, 2005 - 8:18 am UTC

against 8.0.6, sure.

export/import

binoj, March 31, 2005 - 11:38 pm UTC

Can I do an export/import of CLOB/BLOB using the 8.0.6 version of exp/imp?


Tom Kyte
April 01, 2005 - 8:22 am UTC

with 8.0.6, sure.

import/export and various charactersets

A reader, June 28, 2005 - 3:12 pm UTC

The answers to various issues concerning import/export and character sets are very clear and helpful. - sreedhar

Lossy data exp/imp from US7ASCII 8i to AL32UTF8 10g

Paru, July 30, 2007 - 2:26 pm UTC

Hello Tom,

What happens to the existing data i.e. lossy data if I were to use export/import from US7ASCII 8i to AL32UTF8 10g instead of using CSSCAN and CSALTER?

Will it alter my data or will that be untouched? Can you please explain?
Tom Kyte
July 30, 2007 - 5:49 pm UTC

from us7ascii to utf will not lose anything.
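The reason this direction is safe can be shown with a plain byte comparison: every 7-bit US7ASCII character is encoded byte-for-byte identically in UTF-8. A minimal Python sketch of that subset relationship (an illustration only, not an Oracle utility):

```python
# ASCII is a strict subset of UTF-8: the 128 ASCII characters have the
# same single-byte encoding in both, so US7ASCII data passes into an
# AL32UTF8 database unchanged.
ascii_text = "".join(chr(i) for i in range(128))

assert ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print("all 128 ASCII code points encode identically in UTF-8")
```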

CSALTER ...

Chaman, November 27, 2007 - 9:31 am UTC

Hello Tom,
On my Oracle 10gR2 DB, I want to convert from WE8MSWIN1252 to AL32UTF8. I ran CSSCAN and it generated a scan report. I found 2 tables with CLOB columns that might be truncated. Since it's a test DB, I don't mind.

I ran the CSALTER command, it gave the following messages:

0 rows created.
Function created.
Function created.
Procedure created.
This script will update the content of the Oracle Data Dictionary.
Please ensure you have a full backup before initiating this procedure.
Would you like to proceed (Y/N)?Y
old 6: if (UPPER('&conf') <> 'Y') then
new 6: if (UPPER('Y') <> 'Y') then
Checking data validility...
Unrecognized convertible date found in scanner result
PL/SQL procedure successfully completed.
Checking or Converting phrase did not finish successfully
No database (national) character set will be altered
CSALTER finished unsuccessfully.
PL/SQL procedure successfully completed.
0 rows deleted.
Function dropped.
Function dropped.
Procedure dropped.

I checked, and the value of the VALUE column for the row with name='CSALTER_FINISHED' is FALSE.

I read on this site that CSALTER requires a clean CSSCAN.

http://ora.seiler.us/2007/06/csalter-requires-clean-csscan.html

What should I do now ?

Thanks

ARABIC

Basma, December 03, 2007 - 8:15 am UTC

Dear Tom,
I exported a dump file from pc1, and when I import it to pc2 the Arabic letters appear like ???. I tried to make sure that the NLS_LANG is the same on the two PCs and tried the exp/imp again, but no use.
If I make them all AR8MSWIN1256:
during exp it says the export is done in AR8MSWIN1256 but the server is using WE8MSWIN1252.
If I make them all WE8MSWIN1252:
during imp it says the import is done in WE8MSWIN1252 but the server is using AR8MSWIN1256.

Schema Refresh

Rama Subramanian G, January 03, 2008 - 5:23 am UTC

Hi Tom,

Greetings on New Year 2008.

We had a schema refresh by exporting a schema from a production database with the UTF8 character set to a development database with the UTF8 character set. While the multi-byte data in the tables came across alright, some multi-byte characters in a function (part of the function code) did not. Furthermore, one package did not get imported at all.

While you would naturally expect the log files of the import and export, unfortunately these are no longer available, and we are at a loss to understand what might have happened.

Would it be possible for you, from your experience, to hint at the potential causes? Especially since the Unicode characters in the tables are intact while the ones in the function are not.

Thanks & regards
Rama Subramanian G
Tom Kyte
January 03, 2008 - 11:22 am UTC

you have the original code. just export that schema with rows=n and you should be able to reproduce immediately.

characterset migration

Marcelo, March 20, 2009 - 12:25 pm UTC

I am migrating the database from 8i to 10g and the characterset from WE8ISO8859P1 to AL32UTF8.
I created the 10g database, generated the export file for one schema on server8i, and ftp'd this file to server10g, and now I want to run the import.  Below you can see my configuration, but I have trouble understanding the possible problems (data loss, truncation, etc.) with the character set migration.  Could I run into problems with my configuration?

server8i > uname -a
HP-UX server8i B.11.11 U 9000/800 3178118772 unlimited-user license

server8i > locale
LANG=
LC_CTYPE="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_MESSAGES="C"
LC_ALL=

server8i > sqlplus $BMC_USER/$BMC_PASSWORD

SQL*Plus: Release 8.1.7.0.0 - Production on Ve Mar 20 16:57:27 2009

(c) Copyright 2000 Oracle Corporation.  All rights reserved.


Connecté à :
Oracle8i Enterprise Edition Release 8.1.7.4.0 - 64bit Production
With the Partitioning option
JServer Release 8.1.7.4.0 - 64bit Production

SQL> select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';

VALUE
----------------------------------------
WE8ISO8859P1

SQL> exit
Déconnecté de Oracle8i Enterprise Edition Release 8.1.7.4.0 - 64bit Production
With the Partitioning option
JServer Release 8.1.7.4.0 - 64bit Production

server8i > echo $NLS_LANG
FRENCH_FRANCE.WE8ISO8859P1



server10g > uname -a
SunOS server10g 5.10 Generic_127112-06 i86pc i386 i86pc

server10g > locale
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

server10g > sqlplus xx/xxx

SQL*Plus: Release 10.2.0.3.0 - Production on Fri Mar 20 16:53:51 2009

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';

VALUE
--------------------------------------------------------------------------------
AL32UTF8

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options

server10g> echo $NLS_LANG
AMERICAN_AMERICA.WE8MSWIN1252





Tom Kyte
March 24, 2009 - 10:38 am UTC

anytime you export in character set 1 and import in character set 2, it is going to warn you of possible characterset conversions.


your nls lang on the export should be the source database character set
your nls lang on the import should be the target database character set

to minimize the conversions

Lossy data

Prem S, December 08, 2009 - 9:20 am UTC

We have a DB that is US7ASCII; however, it has 8-bit data.
We are looking to export and import into a DB with a character set of AL32UTF8.

CSSCAN results indicate lossy conversions for data that is mostly Western European characters.

We also ran CSSCAN to evaluate changing the character set of the source to WE8ISO8859P1, but it still indicates lossy conversions, no different from the CSSCAN result for UTF8 above.

In a previous post you indicated that changing the character set from US7ASCII to UTF8 loses nothing. Is that with the assumption that there is indeed only 7-bit data?


What would be the best course of action, since the data in the source is really not US7ASCII?

Please advise.
Tom Kyte
December 10, 2009 - 12:49 pm UTC

you have garbage right now, you are not really sure what you have.

You told us "we shall store us7ascii data" and your clients apparently said "we shall send you us7ascii data" and then they did not.

any character set conversion is going to be "lossy" because there is no meaningful way to interpret what you have.


converting to utf8 isn't going to fix anything - you have garbage in there right now.

Your best probable bet in this case, if you don't have too much data, would be a dump (which you code) by a client claiming to be us7ascii, and a reload (which you code - sqlldr could be used) by a client claiming to be using the right character set for the data you are loading.
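The dump-and-reload idea rests on reinterpreting the stored bytes under the character set the clients actually used. A hedged Python sketch of the situation, assuming (purely for illustration) that the mislabeled data is really Latin-1 (WE8ISO8859P1):

```python
# Sketch of the "garbage" situation: 8-bit bytes were stored in a
# database that claims US7ASCII.  If you can determine what the bytes
# REALLY are (assumed here: Latin-1), you can dump them raw and reload
# them declaring the true character set.
raw = bytes([0x6E, 0x69, 0xF1, 0x6F])   # what is physically stored

# Interpreted as the character set the clients actually used:
assert raw.decode("latin-1") == "niño"

# Interpreted as US7ASCII (what the database believes), it is invalid:
try:
    raw.decode("ascii")
except UnicodeDecodeError:
    print("not valid 7-bit data -- the declared character set is wrong")
```

The hard part in practice is the first step: establishing which character set the clients really sent, since the database itself no longer knows.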

In response to 'characterset migration'

Manchester United, September 30, 2010 - 2:20 am UTC

hi Tom,

You mentioned above regarding a migration from 7bit to 8bit characterset to do:
"anytime you export in character set 1 and import in character set 2, it is going to warn you of possible characterset conversions.

your nls lang on the export should be the source database character set
your nls lang on the import should be the target database character set

to minimize the conversions"

But in metalink doco, "NLS considerations in Import/Export - Frequently Asked Questions [ID 227332.1]" it suggests (preferred way) is to set the NLS_LANG to the characterset of the source database on the import. (Whilst it does not suggest setting NLS_LANG to the target database being incorrect)
From your experience, is one way "better" than the other?

I'm guessing the key setting is the NLS_LANG on the export.
Tom Kyte
September 30, 2010 - 7:37 am UTC

It is sort of six of one, half a dozen of the other. Either the client can do the conversion or the server can do the conversion - although, as the note points out, the server can do more conversions than the client, so if you are encountering the IMP-16 issue they talk of - you would set the NLS_LANG during exp/imp to that of the source database.

it works either way. The point is: use the nls_langs of the source and target databases - do NOT introduce a THIRD characterset into the equation.


Here is what I've seen: source database - nls_lang = X, target database - nls_lang = Y, client - nls_lang = Z.

So, X converts to Z, and Z converts to Y,

instead of:

X converts to Y.


You want X converts to Y at most - you do not want to introduce Z into the equation.
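The cost of introducing Z can be demonstrated with encodings alone. A Python sketch where X and Y can both hold accented characters but the intermediate Z is 7-bit (the specific character sets are illustrative assumptions, not a prescription):

```python
# Two conversions through an intermediate character set Z can destroy
# data that a direct X -> Y conversion would have preserved.
# X = Latin-1 source, Y = UTF-8 target, Z = 7-bit ASCII client.
source = "crème brûlée"            # representable in both X and Y

# Direct: X -> Y (what you want)
direct = source.encode("utf-8").decode("utf-8")
assert direct == source

# Via Z: X -> Z -> Y (what a mis-set NLS_LANG gives you)
via_z = source.encode("ascii", errors="replace").decode("ascii")
assert via_z == "cr?me br?l?e"     # accented characters are gone for good
```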

character set conversion

A reader, October 26, 2010 - 3:06 pm UTC

Hi Tom,

1) In the globalization support guide I could find "The ALTER DATABASE CHARACTER SET statement does not perform any data conversion, so it can be used if and only if the new character set is a strict superset of the current
character set."

Can you please elaborate this? I thought even if "alter database character set" command could do data conversion, the new character set will have to be superset of older one always.
2) Can a "ALTER DATABASE CHARACTER SET AL32UTF8;" command change DB character set anytime? If yes then we should never need to recreate a DB with new character set.
Tom Kyte
October 26, 2010 - 8:27 pm UTC

1) if it could do conversion - it could CONVERT character set A into character set B. It doesn't do that however, it doesn't convert anything.

So it only works if every character in character set A has exactly, precisely the same representation in character set B - that is, A is completely contained in B (the euro symbol for example in A would be stored using the same bits and bytes in B as it does in A - as would EVERY character)


2) is AL32UTF8 a superset of EVERY other character set?

No, it isn't (simple proof - the euro symbol is represented using a different byte value in different single byte character sets - and the bytes used in the various single byte character sets are used by the others for different characters. Meaning the byte "X" is used for a euro symbol in character set A and the byte "X" is used for 'something' in character set B. In AL32UTF8 - that byte "X" is used for maybe a euro symbol and maybe 'something' and maybe something else....)
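That byte-level disagreement is easy to verify outside the database. A Python sketch using three encodings that all contain the euro sign but assign it different bytes (the encodings stand in for the single-byte character sets Tom describes):

```python
# The euro sign has a different byte value in different single-byte
# character sets, and yet another encoding in UTF-8 -- which is why
# AL32UTF8 is not a *binary* superset of those sets, even though it can
# represent all of their characters.
euro = "€"

assert euro.encode("cp1252") == b"\x80"         # Windows-1252
assert euro.encode("iso8859_15") == b"\xa4"     # ISO 8859-15
assert euro.encode("utf-8") == b"\xe2\x82\xac"  # UTF-8: three bytes
```

Since the stored bytes would have to change, ALTER DATABASE CHARACTER SET (which changes no bytes) cannot be used for such a migration; a real conversion is required.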

character set conversion

A reader, October 26, 2010 - 9:59 pm UTC

Hi Tom,

Can a "ALTER DATABASE CHARACTER SET <new character set name>;" command change DB character set anytime? If yes then we should never need to recreate a DB with new character set.

Also same query is for "ALTER DATABASE NATIONAL CHARACTER SET <new character set name>;"
Tom Kyte
October 27, 2010 - 5:52 am UTC

see the globalization guide for information regarding the use of this command.

http://docs.oracle.com/docs/cd/E11882_01/server.112/e10729/ch2charset.htm#i1007203

character set conversion

A reader, October 27, 2010 - 11:18 pm UTC

Hi Tom,

I have a feeling after reading 9i globalization support guide is:
since the "alter database character set" command cannot convert data between character sets at all, the prerequisite is that the new character set must be a strict superset of the old one so that no conversion is required; otherwise we could have data corruption.

Is my understanding correct?
If yes then I think this process is better than exporting data,recreating database with new character set and then import that data through conversion. Isn't it?
Tom Kyte
October 28, 2010 - 12:01 pm UTC

for the most part, yes


If yes then I think this process is better than exporting data,recreating database with new character set and then import that data through conversion. Isn't it?


Not if the new character set is a strict superset it wouldn't be. If the new characterset is a strict superset, using the alter database is trivial and doesn't require unloading and reloading.

And even if it isn't a strict superset, you can run csscan (character set scanner) to see if your data conforms to the new character set - and if not, what rows would be a problem. It might be easier just to "patch" the few occurrences rather than dump and reload.

csscan

A reader, November 01, 2010 - 6:10 am UTC

Hi Tom,

You are saying that by running csscan after doing the "alter database character set" command, I can find which rows are incompatible with my new character set. Will the individual rows then be corrupted? How can I recover the individual rows?
Tom Kyte
November 01, 2010 - 3:31 pm UTC

no, i didn't say that, you run it BEFORE.

it'll identify rows that have a problem, so that after you convert (or before if you like) you can MODIFY those rows to fix them as you see fit.
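Conceptually, the scan is just a per-row representability check against the target character set. A toy Python sketch of the idea (csscan itself is an Oracle utility and does far more; the sample rows here are made up):

```python
# Toy version of a character set scan: flag rows whose data cannot be
# represented in the target character set.
rows = ["plain ascii", "Primus® HF-1050", "euro: €"]

def representable(s: str, charset: str) -> bool:
    """True if every character of s has an encoding in charset."""
    try:
        s.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

# Scanning against a 7-bit target:
problem_rows = [r for r in rows if not representable(r, "ascii")]
print(problem_rows)  # the two rows containing non-ASCII characters
```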

unicode

A reader, November 15, 2010 - 9:39 pm UTC

Hi Tom,

1) If I use UTF8 as the database character set, is it still necessary to use a national character set? I think UTF8 is a superset of all 8-bit ASCII character sets. Isn't it?
2) What happens if I use WE8ISO8859P1 as DB character set and UTF8 as national character set?
Tom Kyte
November 16, 2010 - 3:47 am UTC

1) you'll always have a national character set.

UTF8 is not a superset of ALL 8 bit 'ASCII' - the euro symbol for example is represented using different numbers in different 8 bit charactersets.

2) not sure what to say here.... Other than varchar2 would use we8iso... and nvarchar2 would use utf8


http://docs.oracle.com/docs/cd/E11882_01/server.112/e10729/ch2charset.htm#i1007017

unicode

A reader, November 16, 2010 - 6:53 am UTC

Hi Tom,

Thanks, I find the character set topic a little confusing.

1) If I use US7ASCII for DB character set and UTF8 for national character set, do I need a "alter database character set" still? or will it be done as per requirement?
2) For situation above if my client uses 16 bit character set on his system then what will happen? Will I have to convert the DB and national character sets to 16 bit to make it compatible with client?
Tom Kyte
November 17, 2010 - 5:01 am UTC

1) how can I answer that?


If I drive my car 55mph and turn left, do I still need to fill up my tank?

when you can answer that, I can answer you.


If you have a database with US7ASCII/UTF8 - and

a) you want to change the character set, you would need to do an alter database
b) you DO NOT want to change the character set, you would NOT need to do an alter database


2) what is a '16 bit' characterset?

Characterset

Abhisek, November 16, 2010 - 8:21 am UTC

Hi Tom,

Suppose we have to set up a database that will support customers from different parts of the world: China, Japan, Germany, US, UK...

Now my question is: Is there any character set which handles all multi-lingual requests? Does it also depend on the database server (are the servers multilingual as well)?

How do you normally solve this requirement when records come from different character sets? Do we have a different database with a different character set for each?

Hope I am clear. Please suggest
Tom Kyte
November 17, 2010 - 5:10 am UTC

see right below, Michel beat me to it.

to Abhisek

Michel Cadot, November 16, 2010 - 8:26 am UTC


Now my question is: Is there any character set which handles all multi-lingual requests?
AL32UTF8

Database Globalization Support Guide
Chapter 2 Choosing a Character Set (but I recommend to first read chapter 1)
http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch2charset.htm#i1007681

Regards
Michel

character set

A reader, November 16, 2010 - 9:48 pm UTC

Tom:

I was not sure i understood you above.

If you are migrating from a WE8ISO8859P1 database to UTF-8, you can do that with the "ALTER character set" command?

There is no need to create a new db with UTF-8 and then export/import?
Tom Kyte
November 17, 2010 - 7:46 am UTC

Sam - (you should know what I'm going to say I would think....)


as long as you have done your research and verified for yourself that WE8ISO is a 100% subset of UTF-8 - so is it?

OR

if you have run the characterset scanner to see if your database - after altering the character set - would still be "ok" in UTF8 -

OR

if you run the characterset scanner and are prepared to deal with the exceptions it reports out

then sure. No need to recreate the database.

To the reader

Michel Cadot, November 17, 2010 - 5:06 am UTC


If you are migrating from a WE8ISO8859P1 database to UTF-8, you can do that with the "ALTER character set" command?

No.

You can also read the link I provided in my previous post.

Regards
Michel

Thanks

Abhisek, November 17, 2010 - 5:27 am UTC

Thanks Michel for the wonderful explanation with references to the documentation.

Thanks to you Tom as well ..

Character Set

A reader, November 17, 2010 - 6:43 am UTC

Hi Tom,

I meant to say: suppose my DB character set is US7ASCII and my national character set is UTF8. Now if my client uses a 16-bit encoded character set, such as a Japanese character set, on his system, then what will happen? Will I have to convert the DB and national character sets to a 16-bit encoded character set (like AL16UTF16) to make them compatible with the client?

Tom Kyte
November 17, 2010 - 8:35 am UTC

UTF8 can deal with Japanese data.

US7ASCII sure cannot, but the other can.


http://docs.oracle.com/docs/cd/B19306_01/server.102/b14225/ch2charset.htm#sthref178

You might not be happy, though, since the mapping from JA16SJIS into UTF8 is not perfect. For example:

<QUOTE>

Conversion between UTF-8 and any multibyte character set has some overhead. There is no data loss from conversion, with the following exceptions:

* Some multibyte character sets do not support user-defined characters during character set conversion to and from UTF-8.

* Some Unicode characters are mapped to more than one character in another character set. For example, one Unicode character is mapped to three characters in the JA16SJIS character set. This means that a round-trip conversion may not result in the original JA16SJIS character.

</quote>

Ora-12716

A reader, November 20, 2010 - 8:19 am UTC

Hi Tom,

I was trying to convert to UTF8 from US7ASCII. My database has tables with CLOB columns in the SYS and SYSTEM schemas. It is not allowing me to change the character set; I got the
ORA-12716: Cannot ALTER DATABASE CHARACTER SET when CLOB data exists
error message.

1) I heard that the "alter database character set" statement is possible only when the migration is between single-byte encoded character sets or between multibyte encoded character sets. Have I violated that rule?
2) Now how can I migrate the CLOB data?
Should I use export-drop-import steps to convert it from single byte to multibyte? But would a SYS table export/import work that way?
Tom Kyte
November 20, 2010 - 9:32 am UTC

what sys tables are involved, alert log should tell you

Ora-12716

A reader, November 20, 2010 - 10:34 am UTC

Hi Tom,

Actually for the above case I got the below log message in the alert log:

SYS.EXTERNAL_TAB$ (PARAM_CLOB) - CLOB populated
ORA-12716 signalled during: alter database character set UTF8...


Now how do I migrate US7ASCII to UTF8 including CLOB values?
Tom Kyte
November 21, 2010 - 9:06 am UTC

I'm feeling my way as we go along on this one (on the road right now, don't have the resources to set up a controlled test).

what happens if you drop your external tables (use dbms_metadata to get their DDL, drop them, and see if - with the table gone - it is "ok" with it)?

If you update after tomorrow, I'll be back home and can look in more depth

Character set conversion while export

Arindam, December 22, 2010 - 3:20 am UTC

Hi Tom,

I am getting the below message while taking export of a database user:

With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)


Should I be worried about this server character set? Which character set will my data be imported from? Is it AL32UTF8 or US7ASCII?
Tom Kyte
December 22, 2010 - 2:28 pm UTC

you should be worried, yes.


You have a multibyte database - some characters take 2 or more bytes to store. Your database is capable of storing thousands and thousands of unique characters.

You are exporting with a 7 bit character set - capable of storing... 128 characters uniquely.


You have probably created a useless set of data in your dmp file.


It would be imported from us7ascii.

Character set conversion while export

A reader, December 22, 2010 - 3:32 pm UTC

Hi Tom,

Should I still worry if my new database, into which the import will be done, has the same US7ASCII as its DB character set and AL16UTF16 as its national character set? Or will there be any character conversion between the database and the server along the way?
Tom Kyte
December 22, 2010 - 5:10 pm UTC

If your target database is us7ascii+al16utf16 and the source database was AL32UTF8 - then you must be prepared to have a different set of data in target than in source.

Should you worry? You answer that - you are going to convert a ton of AL32UTF8 into us7ascii - is that a good thing or bad thing?

Character set conversion

A reader, December 23, 2010 - 6:23 am UTC

Hi Tom,

I could not understand what is meant by "Export done in US7ASCII character set" and "server uses AL32UTF8 character set". I can see my source database is AL32UTF8:
SQL> select * from NLS_DATABASE_PARAMETERS where PARAMETER like '%CHARACTERSET%';

PARAMETER
------------------------------
VALUE
--------------------------------------------------------------------------------
NLS_CHARACTERSET
AL32UTF8

NLS_NCHAR_CHARACTERSET
AL16UTF16

Then why does it say "Export done in US7ASCII character set"? Is that the default character set in which the export is done?
Tom Kyte
December 23, 2010 - 11:12 am UTC

because in the environment your client (exp) was run, the client said "I want to use US7ASCII as my character set"


I'll make some assumptions:

a) you ran the export on some unix variant
b) the default client character set on unix is US7ASCII
c) you did NOT set the NLS_LANG to the character set you wanted to use
d) therefore it used the default

$ export NLS_LANG=AMERICAN_AMERICA.AL32UTF8


do that before you export.

Character set conversion while export

A reader, December 23, 2010 - 12:28 pm UTC

Hi Tom,

So do all Unix operating systems use the US7ASCII character set by default?
Tom Kyte
December 24, 2010 - 12:05 pm UTC

pretty much, but it doesn't really matter in your case - because you definitely know "your default is us7ascii" and that is pretty much all that matters - for you.

Character set conversion while export

A reader, December 24, 2010 - 1:07 pm UTC

Hi Tom,

So you are saying that my database characters have been converted from US7ASCII to AL32UTF8. Now suppose my target DB character set is also AL32UTF8; would those characters then be converted into AL32UTF8 from US7ASCII again this time? Would that be a problem, since I think the characters were modified and are different from the source values?
Tom Kyte
December 25, 2010 - 11:46 am UTC

No, when you did the export you converted FROM the database character set to US7ASCII.

when you import, they will be converted from US7ASCII into the database character set.


If your two databases both had character set X and you do the export using character set Y and Y <> X - then you have almost certainly changed the data and the data in the target database will be different from the source database.
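That round trip through a mismatched export character set can be simulated directly. A Python sketch in which both databases use a character set that holds accented data (standing in for X) while the export client uses 7-bit ASCII (standing in for Y; both choices are illustrative assumptions):

```python
# If both databases use character set X but the export client uses
# Y != X, the round trip X -> Y -> X silently changes the data.
y = "ascii"                  # the export/import client character set
source_row = "São Paulo"     # representable in X, not in Y

exported = source_row.encode(y, errors="replace")  # db (X) -> dump file (Y)
imported = exported.decode(y)                      # dump file (Y) -> db (X)

assert imported == "S?o Paulo"
assert imported != source_row   # target now differs from source
```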

find charset from expdp file

Ravi B, March 28, 2011 - 4:41 pm UTC

Hi Tom,

One of our clients sent us a export dmp created using expdp. We ran into issues with

ORA-02374: conversion error loading table...
ORA-12899: value too large for column ...

Is there a way we can find out the character set of the source database from an export dmp file created using the expdp utility?

Thanks!

full exp from 9i , full imp to 10g

A reader, September 13, 2011 - 2:02 pm UTC

Hello Tom,
Can I do a full exp from 9i and a full import into 10g with ignore=Y, using the system user in both 9i and 10g?

Doubts I have:
9i and 10g have different architectures, so the SYSTEM schema will be different; even with ignore=Y, we are trying to force the 9i system schema onto the 10g system schema.

Is doing a full exp/imp from 9i to 10g recommended?
Tom Kyte
September 14, 2011 - 3:57 pm UTC

I much prefer just upgrading - doing a manual exp/imp is just about the slowest way to do it. The only slower way I can think of would be to manually type the data into sqlplus (and depending on how fast you type - that might actually be faster).

Can you do it? Yes.

The question is "should you" :)

we do not export sys things. It can work, but it is really slow and not the easiest thing on the planet to do.

import hangs 10g

A reader, September 14, 2011 - 10:18 pm UTC

Thanks for your response.

Main issue, and hence the question:
Can I do a full exp from 9i and a full import into 10g with ignore=Y using the system user in 9i and 10g?

It hangs on a particular table at the enable-constraints stage at the end. It tries to enable the PK for a 4 million row table; I see it is enabled, but it hangs at that statement.

Even trying to import just the table using the tables clause gives the same issue. exp was done using 9i and imp using 10g.

I am seeing the db file sequential read wait event.
Tom Kyte
September 15, 2011 - 7:40 am UTC

it is not hanging, it is taking its time. It is creating that index - using logging, using no parallel.


I'm telling you, I've told you, you could probably type the data in faster than a full import.

the db file sequential read is you doing physical IO on the index to create the index itself.

import only table also same result

A reader, September 15, 2011 - 10:38 am UTC

Even if I try to just import the single table with 4 million rows, I have the same issue. And that is not a full imp, only of the specific table, which has 30 FKs, including 6 self-referencing FKs to its PK column (a single column named ID).

Also, do triggers fire during import?

Thanks
Tom Kyte
September 16, 2011 - 1:22 pm UTC

triggers only fire if you precreate the table and put triggers on it and then import.


so no, triggers do not usually fire.


I'll say it again:

it is not hanging, it is taking its time. It is creating that index - using logging, using no parallel.



You should consider importing just the data - no indexes, no constraints - and then adding the indexes (including the primary key) yourself using parallel if you have the resources and nologging (and a big sort_area_size - you could consider using manual memory management in the session creating the indexes)

Or, you could just upgrade the database and be done with it.

NLS_LANG

sam, January 06, 2013 - 9:59 pm UTC

Tom:

Since it is very important to set the NLS_LANG value properly based on the machine's OS, how do you normally determine the proper value for it?

Is it usually set based on the OS the machine is running on (i.e. Windows, Linux, Unix)?

Which file holds the value for this parameter?

I do not understand why Oracle does not set this automatically during installation.
Tom Kyte
January 14, 2013 - 11:12 am UTC

I do not understand why oracle does not set this automatically during
installation.


we do, on platforms that support such a concept.

On unix - you install the database and tell us what characterset to use for the database. You would then have to set the environment correctly since you do not "install" the client software in each account - they have to set up their environment appropriately (and should very much likely be using the characterset of the database - the one you chose)

this is the way of unix.

On windows - when you install the client - you are actually installing the client for the end users - we can pop something into the registry (something unix does not have - in unix, you set the environment).




on windows the registry is the file used for this setting although it can be overridden in the environment (and you might do that because you connect to more than one database and they have different character sets)

on unix, you just set your environment - typically in a login script like .bashrc or such.



the characterset is something the client decides on what to use - it could be the same as the database - but it doesn't have to be, it is the decision of the application and end users.

character set

A reader, January 21, 2013 - 7:09 pm UTC

Tom:

1) If I understand you correctly, you are saying that the NLS_LANG value is already handled by Oracle in the registry when I install the Oracle client software on a Windows machine, and I do not have to do anything really; I just install my application on the client and have it use tnsnames.ora for the connection.

I assume Oracle checks what the Windows OS character set is and sets NLS_LANG to the same value, because there can also be multiple values for the Windows character set (Unicode, WE8MSWIN1252, etc.) that the machine is set to.

Is this correct?

2) If I have a new 11g DB running on RHEL with CS of AL32UTF8.

and I have Oracle HTTP Server/mod_plsql (Oracle 11g web tier) running on another RHEL machine; would you set the DAD PlsqlNLSLanguage (character set) to AL32UTF8 or UTF-8?

Can I also set NLS_LANG instead of the PlsqlNLSLanguage parameter in each DAD?

3) Would the web pages' (PL/SQL Web Toolkit stored procedures) charset declaration change from "WE8ISO8859P1" to "AL32UTF8" or "UTF-8" when you migrate from a WE8ISO8859P1 to an AL32UTF8 DB?

I do not think browsers recognize "AL32UTF8", because it is an Oracle character set name and not an IANA charset name like UTF-8.