Follow-up
Alex, October 29, 2020 - 4:47 pm UTC
Hi Sergiusz,
Your answer was exactly what i encountered:
(1) NLSSORT: Yes, ițve seen the ugly NUL in the output which is the reason of this topic actually; Adding the NUL character (from reasons i don't understand) is a show-stopper for my needs, although all the other conversions it's doing are PERFECT.
(2) option 2 (translate); there are 2 drawbacks here:
(a) TRANSLATE requires a 1-1 mapping so if i want to translate German 'ö' to 'oe', i'm out of luck.
(b) TRANSLATE requires me to enter ALL European (national) characters, both upper-case and lower-case, so it ends up being a long (and probably incomplete) list.
This approach is what i've actually implemented, TRANSLATE + big chain of single characters translated into their English correspondent (it's not optimal, but it's honest work).
Cheers,
To Sergiusz
J. Laurindo Chiappa, October 30, 2020 - 4:43 pm UTC
Yes, NLSSORT adds an extra CHR(0), but I´m able to remove it easily with REPLACE, and to my needs (I´m using Portuguese, and the Portuguese alphabet can be directly converted) NLS_SORT works perfectly, this way :
SQL> select utl_raw.cast_to_varchar2(nlssort('À NOITE,VOVÔ MÜLLER VÊ O ÍMÃ CAIR NO PÉ E VOVÓ SÓ PÕE AÇÚCAR NO CHÁ SEM ÂNIMO', 'nls_sort=binary_ai')) as result from dual;
the result is :
RESULT -------------------------------------------------------------------------------------------
a noite,vovo muller ve o ima cair no pe e vovo so poe acucar no cha sem animo
==> This phrase is just a Portuguese pangram, ie, a sentence using all letters of the particular language's alphabet.. You will not see here, but sqlplus shows a small square at the end, representing the extra CHR(0) inserted)...
Adding a replace :
SQL> select replace(utl_raw.cast_to_varchar2(nlssort('À NOITE,VOVÔ MÜLLER VÊ O ÍMÃ CAIR NO PÉ E VOVÓ SÓ PÕE AÇÚCAR NO CHÁ SEM ÂNIMO', 'nls_sort=binary_ai')), chr(0)) as result from dual;
the result is perfect - afaik this will not work for non-Latin alphabets such as German, but for LAs it works perfectly...
Regards,
Chiappa
To Chiappa
Dieter, October 30, 2020 - 7:23 pm UTC
There are two problems with your approach:
- Upper / lower case is not preserved, as required by the original post.
- The cast behaves inconsistently, see 2nd line of my example ...
WITH a AS
(SELECT 'À NOITE,VOVÔ MÜLLER VÊ O ÍMÃ CAIR NO PÉ E VOVÓ SÓ PÕE AÇÚCAR NO CHÁ SEM ÂNIMO' text
FROM dual
UNION ALL
SELECT '¡¿ !? Åû -> Æ æ œ <- õč ÄÖÜ äöü -> ß <-'
FROM dual)
SELECT text,
REPLACE(utl_raw.cast_to_varchar2(nlssort(text,
'nls_sort=binary_ai')),
chr(0)) AS RESULT
FROM a;
To Dieter
J. Laurindo Chiappa, November 01, 2020 - 3:53 pm UTC
Yes, as I said the NLSSORT solution works only for the base symbols in Portuguese alphabet - for sure, all special symbols (such as Euro, macron, superscrit ordinals, etc) will not work/will not be transalated... Same for decomposable symbols, such as the latin ligature, what must be decompost in A and E - so, Sergiusz will not be able to use it but for Other readers, using only base Latin symbols, this solution can be feasible/usable...
Regards,
Chiappa