Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, shivashankar.

Asked: October 21, 2015 - 7:34 pm UTC

Last updated: November 02, 2020 - 3:00 am UTC

Version: 11.2.0.3

Viewed 10K+ times! This question is

You Asked

Hello,

There are numerous characters Teradata doesn't like. On Oracle, I want to do global replacement (at least at the table level) by finding all the characters that are 4 byte and few other 1,2 and 3 byte character (based on my choice) by a space. Is there a way to do this?

If an option is available using Unix command like sed, tr that helps too.

Thanks,



and Chris said...

Lengthb returns the size of a string in bytes. To identify all the strings containing multibyte characters, check whether lengthb > length:

select * from (
  select 'àéíóú' s from dual union all
  select 'aeiou' s from dual 
)
where length(s) < lengthb(s);

S        
----------
àéíóú   


To find all the individual multibyte characters, split the strings so each character is a row. You can do this using the connect by level (or similar) row generation trick. Then a single character substr for each row. Then check the size:

select c, length(c), lengthb(c) from (
  select substr(s, level, 1) c
  from (
    select 'àéíóúbcdef' s from dual
  )
  connect by level <= length(s)
)
where lengthb(c) > 1;

C     LENGTH(C) LENGTHB(C)
---- ---------- ----------
à             1          2
é             1          2
í             1          2
ó             1          2
ú             1          2

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

It's only me?

Hugo, October 31, 2020 - 3:46 pm UTC

SQL*Plus: Release 19.0.0.0.0 - Production on Sáb Oct 31 12:42:30 2020
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Release 11.2.0.4.0 - 64bit Production
SQL> select c, length(c), lengthb(c) from (
  select substr(s, level, 1) c
  from (
    select 'àéíóúbcdef' s from dual
  )
  connect by level <= length(s)
)
where lengthb(c) > 1;
No rows selected

Connor McDonald
November 02, 2020 - 3:00 am UTC

What characterset is your database using?

It might be a single byte characterset that includes the characters you're using.

use select N'àéaíóúbcdef' s from dual;

bias, September 21, 2022 - 1:01 pm UTC

Use select N'àéaíóúbcdef' s from dual ;
ie, use data type as NVarchar2 that can store Unicode Chars