Skip to Main Content
  • Questions
  • Migrating database from US7ASCII to UTF8

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Pratap.

Asked: December 12, 2001 - 9:31 am UTC

Last updated: October 06, 2011 - 9:05 am UTC

Version: 8.1.6

Viewed 10K+ times! This question is

You Asked

Hi Tom

We have two databases -

USER1 - DB1 - US7ASCII
USER2 - DB2 - UTF8

Some of the tables in USER1 and USER2 are exactly the same.
So for these tables, to migrate data from USER1 to USER2 we
are using insert scripts and copy commands (wherever there are long columns).

Do you see any problems in the migrating the data in such a manner when the database character sets are different?

Secondly how do I confirm that the database from where I am migrating the data is in a consistent state and no transactions are going on there.

Thanks

Pratap

and Tom said...

from USER1 to USER2 should be no problem as long as the data in db1 is all 7bit. If some 8bit data is stored there, it will be converted to 7bit data first.

Oracle provides for a consistent query, the results will be consistent with respect to the point in time your query began. You need to do nothing to enable this facility. read:

</code> http://docs.oracle.com/cd/A81042_01/DOC/server.816/a76965/c23cnsis.htm#2611 <code>

...

Rating

  (14 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

About the second question

Pratap, December 12, 2001 - 10:48 pm UTC

Thanks Tom, but I when I start the migration I need to be sure -

1. That all the currently running transactions are complete
2. And nobody else start any new transaction when the migration is going on.
(We would shut down the web site from where the transactions are done, but still there are background processes and unfinished transactions going on)

Tom Kyte
December 13, 2001 - 8:37 am UTC

1) do a shutdown normal
2) do a startup enable restricted session, now only DBA's can log in.

US7ASCII to UTF8

A reader, April 05, 2005 - 12:14 pm UTC

Oracle 9.2.0.6

What are the considerations that should be taken into account when migrating a 9.2.0.6 US7ASCII database to UTF8? How much testing is needed? Where is all this documented? I know about the Character Set Scanner utility (csscan). Can you please point me to some docs/notes that explain this in detail?

Thanks

Tom Kyte
April 05, 2005 - 12:38 pm UTC

well, you might need more storage...

a varchar2(20) that held

'01234567890123456789'

might not be able to hold that (each character takes 1,2,3 or more bytes in multibyte character sets)

so that description field might not hold N characters anymore.

you might have to use varchar(80 CHAR) instead of just (80) (defaults to bytes..)


</code> https://docs.oracle.com#index-GLO <code>


US7ASCII

A reader, April 05, 2005 - 2:10 pm UTC

Hm, but UTF8 is a strict superset of US7ASCII, right? In other words, if I run the CSSCAN utility and it tells me that all my data is 7-bit ascii, I can do the ALTER DATABASE CHARACTER SET command.

"each character takes 1,2,3 or more bytes in multibyte character sets)"

But each 7-bit ASCII character in the US7ASCII character set still takes just 1 byte in the AL32UTF8 character set, right?

Thanks

Tom Kyte
April 05, 2005 - 6:46 pm UTC

It is that right now if someone stuffs 8bit or bigger data into the table, it gets turned into 7bit data.

In the future, it won't.


Soooo right now, everything is 7bit data.

Tomorrow, same application, same inputs - not same number of bytes, in fact not same bytes anymore as the data isn't converted from whatever to 7bit.




CLOB data

VA, July 20, 2005 - 3:32 pm UTC

I did the ALTER DATABASE CHARACTER SET AL32UTF8; command on a US7ASCII database and got

12716, 00000, "Cannot ALTER DATABASE CHARACTER SET when CLOB data exists"
// *Cause: CLOB data changes representation to Unicode when converting to a
// multibyte character set and must be migrated
// *Action: Remove CLOB data as listed in the alert file. CLOB data can be migrated
// by methods such as import/export


Not sure how to handle this. Some of my application schemas have CLOB columns, even the data dictionary (sys.snap$, sys.deflob and HTML DB's flows_010600) schemas have a bunch of CLOB columns.

Does this mean I cant use the ALTER DATABASE command to migrate my characterset to AL32UTF8?

Thanks

Tom Kyte
July 21, 2005 - 7:44 am UTC

have you run the character set scanner to see what information it might relate to you

</code> http://docs.oracle.com/docs/cd/B10501_01/server.920/a96529/ch11.htm#1005049 <code>

Metalink

VA, July 21, 2005 - 8:57 am UTC

Metalink has a note on this Note 66320.1

Basically, it acknowledges that the 9i dictionary itself has CLOB columns so the ALTER DATABASE CHARACTER SET command will not work even on a brand new database with no user data.

It describes how to do this.

Basically, do a
select owner,table_name,count(*)
from dba_tab_columns
where data_type='CLOB'
and owner!='SYS'
group by owner,table_name;

Put all those owner.table_name into a exp parfile with tables=(...), export them, truncate them (if truncate complains about foreign keys, use DELETE).

Now the ALTER DATABASE CHARACTER SET works fine.

Then re-import all the tables with ignore=y constraints=n grants=n indexes=n

Worked like a charm for me

Data Guard

VA, July 26, 2005 - 1:40 pm UTC

If I issue the ALTER DATABASE CHARACTER SET <new character set>; command on the primary database, would it properly change the characterset of my physical standby database?

Thanks

Tom Kyte
July 26, 2005 - 3:19 pm UTC

that alter just does sql to the dictionary, it'll be replicated over by the redo log transfer.


one question about CSSCAN when I do conversion from WE8ISO8859P1 to AL32UTF8 unicode character set.

Jacky, April 06, 2009 - 11:54 pm UTC

I have one question about CSSCAN when I do conversion from WE8ISO8859P1 to AL32UTF8 unicode character set.

Below is the test case.

create table aaa (c1 varchar2(10));
insert into aaa values ('sys.¿Med');

After csscan the data, get the application exception data as below

------------------------------------------------
[Application data individual exceptions]
User : JACKY
Table : AAA
Column: C1
Type : VARCHAR2(10)
Number of Exceptions : 1
Max Post Conversion Data Size: 9
ROWID Exception Type Size Cell Data(first 30 bytes)
------------------ ------------------ ----- ------------------------------
AAAYtgAAEAAAEKPAAA lossy conversion sys.¿Med
------------------ ------------------ ----- ------------------------------

But after use convert method and check it, the data can be converted successfully after use below sql

select case when convert(convert(C1, 'AL32UTF8', 'WE8ISO8859P1') ,'WE8ISO8859P1', 'AL32UTF8') = C1 then 0 else
-999999+111111 END from aaa;

Below is dump of data

1. dump in AL32UTF8 character set
select dump(convert(c1,'AL32UTF8')) from aaa
DUMP(CONVERT(C1,'AL32UTF8'))
----------------------------------------------------
Typ=1 Len=9: 115,121,115,46,194,133,77,101,100
2. dump in WE8ISO8859P1 character set
select dump(c1) from aaa
DUMP(C1)
----------------------------------------------------
Typ=1 Len=8: 115,121,115,46,133,77,101,100
Is report of cssacn not correct?

Thanks


Tom Kyte
April 13, 2009 - 11:09 am UTC

chr(133) is not a character in the WE8ISO8859P1 character set - it is a "..." in windows character sets - but not we8iso. You have bad data in there.


http://arjudba.blogspot.com/2009/03/difference-between-we8iso8859p1-and.html

csalter/csscan do not get the change to the physical standby

Sreekanth Bandi, November 12, 2009 - 7:17 am UTC

I have changed the character set from WE8ISO8859P1 to WE8MSWIN1252 using cscan/csalter scripts, but the physical standby database character set is still showing as WE8MSWIN1252. The Oralce Support says that recreating the standby database is the only supported way to change the character set on the standby database.

Converting NLS_CHARACTERSET

Jim, September 27, 2011 - 11:00 am UTC

Hi Tom

Having difficulty understanding process in going from
NLS_CHARACTERSET WE8MSWIN1252 to
NLS_CHARACTERSET AL32UTF8

Environment:
Windows 2003 64-bit running oracle 10g 64-bit
windows 2008 r2 64-bit running oracle 11g r2 64-bit

Assumptions:
CScan has been run and all issues have been resolved for conversion

Two scenarios:

1. I need to convert a 10g database in place from WE8MSWIN1252 to AL32UTF8

Can I just run CSALTER on the database and the conversion will happen or do I need to do more ?

2. I need to move/convert the 10g database (WE8MSWIN1252) into an 11g database (AL32UTF8)

Can I just create the 11g database as AL32UTF8 and then export the 10g
schema into the 11g database ?

Thanks for Your Time
Jim
Tom Kyte
September 27, 2011 - 1:06 pm UTC

1) you can

http://www.oracle-base.com/articles/10g/CharacterSetMigration.php

shows a good example of using the tools


2) you could do that using datapump, yes. Or, you can alter the existing 10g database and then upgrade it.

Converting NLS_CHARACTERSET - Thanks

Jim, September 28, 2011 - 9:55 am UTC

Thanks for your time and input Tom

I will review the document

Jim

Converting NLS_CHARACTERSET-Follow-up

Jim Cox, September 29, 2011 - 12:57 pm UTC

Hi Tom

after running Csscan on my database i have "Convertible" data. Any idea on what that means and what would happen if i now run CSALTER ?

Example Csscan output:


[Application Data Conversion Summary]

Datatype Changeless Convertible Truncation Lossy
--------------------- ----------- ---------------- ---------------- ----------------
VARCHAR2 9,737,307 1,957 0 0
CHAR 0 0 0 0
LONG 0 0 0 0
CLOB 131,273 127,978 0 0
VARRAY 1,575 0 0 0
--------------------- ----------- ---------------- ---------------- ----------------
Total 9,870,155 129,935 0 0
Total in percentage 98.701% 1.299% 0.000% 0.000%

[Distribution of Convertible, Truncated and Lossy Data by Table]

USER.TABLE Convertible Truncation Lossy
--------------------------------- ---------------- ---------------- ----------------
CDRL.DAUDITMORE 21 0 0
CDRL.DAUDITNEW 118 0 0
CDRL.DAVPROP 11,784 0 0

(rest if report truncated)




Thanks
Jim
Tom Kyte
September 30, 2011 - 5:54 pm UTC

Thanks

Jim Cox, October 05, 2011 - 10:14 am UTC

Hi Tom


Thanks for the info and your time on this. Will read it. I found this nice document about Migrating Database Character Sets to Unicode by Yan Li. It was also very helpful

http://www.nyoug.org/Presentations/2009/Li_Unicode.pdf

Jim

NLS_LANG Registry Setting After Conversion

Jim Cox, October 05, 2011 - 1:27 pm UTC

Hi Tom

I just coompleted a sucessful change to NLS_CHARACTERSET

I went from WE8MSWIN1252 to AL32UTF8
using CSSCAN,CSALTER and DataPump Export/Import

My question is about the NLS_LANG Registry setting
on the server and client side

It currently is set to AMERICAN_AMERICA.WE8MSWIN1252

Do I need to change it on the server and client side
to now reflect AL32UTF8 or can it stay the same ?

Thanks Jim
Tom Kyte
October 06, 2011 - 9:05 am UTC

You would want to set it on the clients to the character set they are expecting to display and input on their screens. What are those clients going to input into the database - that is the character set you would use for them.


sort of a non-answer, but only because the answer is "it depends"
http://docs.oracle.com/docs/cd/E11882_01/server.112/e10729/ch2charset.htm#i1006853

probably you want al32utf8 on the client

NLS_LANG Registry Setting After Conversion-Thanks

Jim Cox, October 06, 2011 - 10:25 am UTC

Thanks again Tom

Appreciate the assist

Jim