Thanks for the question, Sunil.
Asked: March 06, 2018 - 6:55 pm UTC
Last updated: March 15, 2018 - 4:18 am UTC
Version: 12c
Viewed 1000+ times
You Asked
Hi Tom,
I'm using Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production.
Need some suggestion regarding how to handle special characters.
My doubt with converting any special characters to normal character except below valid characters.
All mentioned below are valid characters
a to z , A to Z , 0 to 9
`~!@#%$^&*()_-+={}[]|\:;"'<>,.?/
While executing the below statement
SELECT convert( 'RÃÁB̂GÃIÈ IÑTËRMNÇPL (DE)', 'US7ASCII' ) FROM dual;
Output : R?AB?G?IE I?TERMNCPL (DE)
As per my requirement the output should be : RAABGAIE INTERMNCPL (DE) --(How can i achieve this one).
1. What could be the reason to convert it ? symbol.
2. why it converts à to ? but Á to A.
3. Does it converts unrecognized special characters to ? always or anything else.
Please help me to clear the doubts.
Thanks in Advance!!
Sunil.
and we said...
Sunil,
You are trying to use CONVERT for a purpose that it is not designed for. Any Oracle character set (except Unicode AL32UTF8/UTF8/AL16UTF16) can have a mapping table of replacement characters. When the character set is a target of a character set conversion, this table tells the conversion routines (e.g. CONVERT) what to do with incoming source characters that do not exist in the character set and thus cannot be converted directly. If a source character cannot be converted directly but it exists in the mapping table, its corresponding (mapped) replacement character from the table is put into the target string.
Most character sets do not really use this feature maybe except for a few alternative forms of some characters. You can use the Oracle Locale Builder utility to check replacement character tables of the supported character sets. For historical reasons, the US7ASCII character set does define plain Latin letters as replacements for a number of accented Latin letters, mainly those existing in the EE8ISO8859P2 character set. The mapping table does not cover all possible accented characters, not even all characters defined in WE8ISO8859P1. With Unicode gaining popularity, the tendency is to store original strings and not try to squeeze them into legacy character sets.
If you really need a string with accents stripped down, what you need is basically an NLS_BASE_LETTER function, which we do not unfortunately have. If the number of accented characters that you have to replace is manageable, your best choice is to use the TRANSLATE function to do the character replacement. Alternatively, you can keep the original text but use one of the _AI collations (linguistic sorts) with comparison and matching expressions in your SQL queries. In Oracle Database 12.2, you can say:
SELECT * FROM mytab1 WHERE mycol1 LIKE 'A%' COLLATE BINARY_AI
and both 'ÃB' and 'ÁB' will match. (In 12.1, you need to set the NLS_SORT and NLS_COMP session parameters, which may be cumbersome.)
Rating
(3 ratings)
Is this answer out of date? If it is, please let us know via a Comment