Skip to Main Content
  • Questions
  • removing control characters from text

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, shrenit .

Asked: May 02, 2000 - 1:20 pm UTC

Last updated: November 18, 2018 - 10:36 pm UTC

Version:

Viewed 50K+ times! This question is

You Asked

Is there a routine in Oracle that can test for and remove
unwanted characters in a text string, ie control characters?




and we said...

Yes, we can use REPLACE and TRANSLATE to do this. Lets say the characters you wanted to remove where 'SAT' (to remove control characters like TABS, CR, LF you would use chr(9) || chr(13) || chr(10). CHR is a function that takes the ASCII code and returns that character -- 9 = tab, 13 = CR and so on). We could then code:

ops$tkyte@8i> select username, replace( translate( username, 'SAT', 'S' ), 'S', '' )
2 from all_users
3 where rownum < 5
4 /

USERNAME REPLACE(TRANSLATE(USERNAME,'SA
------------------------------ ------------------------------
SYS Y
SYSTEM YEM
OUTLN OULN
DBSNMP DBNMP


So, that gets rid of SAT in all of the strings for us (and would get rid of control characters just as easily.

Since this can be cumbersome to code and we would not really want to put the string of control characters into all of our queries, we can make this more generic as follows:

ops$tkyte@8i> variable undesirable varchar2(255)
ops$tkyte@8i> exec :undesirable := 'SAT'

PL/SQL procedure successfully completed.

ops$tkyte@8i> select username,
2 replace(
3 translate( username, :undesirable, substr(:undesirable,1,1) ),
4 substr(:undesirable,1,1), '' )
5 from all_users
6 where rownum < 5
7 /

USERNAME REPLACE(TRANSLATE(USERNAME,:UN
------------------------------ ------------------------------
SYS Y
SYSTEM YEM
OUTLN OULN
DBSNMP DBNMP


So, here we get the same effect but 'SAT' and 'S' are not hard coded into the query. This is OK but still hard to code and we would still have to code these variables into our applications. We would prefer to control this from the database (in case the set of control characters changes over time). To achive this, we create a view:


ops$tkyte@8i> create or replace view bad_chars
2 as
3 select 'SAT' allbad, 'S' onebad
4 from dual
5 /

View created.

And now we can code:

ops$tkyte@8i> select username, replace( translate( username, allbad, onebad ), onebad, '' )
2 from all_users, bad_chars
3 where rownum < 5
4 /

USERNAME REPLACE(TRANSLATE(USERNAME,ALL
------------------------------ ------------------------------
SYS Y
SYSTEM YEM
OUTLN OULN
DBSNMP DBNMP

That is easier but we can go one step further:


ops$tkyte@8i> create or replace function stripbad( p_str in varchar2 ) return varchar2
2 is
3 undesirable varchar2(25) default 'SAT';
4 begin
5 return
6 replace(
7 translate( p_str, undesirable, substr(undesirable,1,1) ),
8 substr(undesirable,1,1), '' ) ;
9 end;
10 /

Function created.

ops$tkyte@8i> select username, stripbad( username ) stripped
2 from all_users
3 where rownum < 5;

USERNAME STRIPPED
------------------------------ --------------------
SYS Y
SYSTEM YEM
OUTLN OULN
DBSNMP DBNMP


So, now it is like we have a builtin function stripbad that gets rid of our control characters. If you select a couple of rows at a time -- the PL/SQL function will perform quite adequately. If you select thousands of records at a time, you will want to use the view + translate/replace directly for performance.



Rating

  (56 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Suppose characters to strip are not known?

Dave, February 12, 2003 - 3:07 pm UTC

Thanks, Tom, for the great example.

You were looking for strings that had "SAT" in them. Suppose, that I do not know the "bad" characters? All I know is that the bad characters have ascci values less than 32 and greater than 126. Does your method still work? And for my problem, I just need to identify the rows with these bad characters. I don't need to replace them.

Tom Kyte
February 12, 2003 - 4:17 pm UTC

ops$tkyte@ORA920> create table bad ( str varchar2(255) primary key ) organization index;

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> declare
  2          l_str varchar2(255);
  3  begin
  4          for i in 0 .. 31
  5          loop
  6                  l_str := l_str || chr(i);
  7          end loop;
  8          for i in 127..255
  9          loop
 10                  l_str := l_str || chr(i);
 11          end loop;
 12          insert into bad values ( l_str );
 13  end;
 14  /

PL/SQL procedure successfully completed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> drop table t;

Table dropped.

ops$tkyte@ORA920>
ops$tkyte@ORA920> create table t ( data varchar2(20) );

Table created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> insert into t values ( 'Hello World' );

1 row created.

ops$tkyte@ORA920> insert into t values ( 'bad ' || chr(31) );

1 row created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> select * from t where translate( data, (select * from bad), chr(0) ) != data;

DATA
--------------------
bad


will do that -- just need to put into "bad" something that represents what you are looking for. 

can I do this for all columns of a number of tables in a schema

Ajeet Ojha, July 18, 2003 - 5:01 pm UTC

Hi Tom,
Can I find a particular set of special charactters in all the tables of a schema,How can I change this script to do this.

Thanks
Ajeet

is this a good way --it can be make more generic

Ajeet, July 20, 2003 - 9:08 pm UTC

declare
l_cnt number := 0 ;
begin
for x in (select column_name,table_name from user_tab_columns where data_type in ('CHAR','VARCHAR2' )) loop
execute immediate 'select count(*) from ' || x.table_name || ' where instr (' || x.column_name|| ',chr(0) ) > 0 '
into l_cnt ;
if l_cnt > 0 then
dbms_output.put_line(l_cnt||'~'||x.table_name||'~'||x.column_name) ;
end if ;
end loop ;
end ;
/

You are unbelievable

Ajeet, July 20, 2003 - 9:50 pm UTC

This is what I needed.How can you write such generic scripts..You are unbelievable.

Thanks for your help again.
Ajeet

Thank you very much.

A reader, July 21, 2003 - 6:52 am UTC

Just exactly what I needed. The flat file generated additional lines when it exceeded 255 chars or the next comma position in a comma-delimited file whichever is latest.

Using this snippet, the control characters were removed.
Thanks.

Thank you very much.

Oliver Dimalanta, July 21, 2003 - 6:53 am UTC

Just exactly what I needed. The flat file generated additional lines when it exceeded 255 chars or the next comma position in a comma-delimited file whichever is latest.

Using this snippet, the control characters were removed.
Thanks.

replace two strings

Pingu_SAN, August 21, 2003 - 6:13 am UTC

Hi

How can we replace two strings without using twice replace(), for example

select
replace(replace(value, '%s', sequence#), '%t', thread#)
from v$log_history a, v$parameter b
where b.name='log_archive_format';

I have to use twice replace, is there a way (using decode or case) to replace two or even more strings in a go?


Cheers

Tom Kyte
August 21, 2003 - 6:10 pm UTC

nope

mixed case

Sandeep, September 15, 2003 - 12:17 pm UTC

8.1.7.4
need to look for a particular word in a string and change it to a particular case...
like say the column we are working with is job_descrip for an emloyee....and job_descrip has data in mixed case...
so we need to replace all occurences of word oracle (in any case ex: oRacle or ORacLE etc. to OrACLE)...
How can I write as sql to do this....I was trying REPLACE...but didn't work....cause have more than one case here and all the cases are not known....too many combinations...

Thanks..



Tom Kyte
September 15, 2003 - 2:40 pm UTC

is there a reasonable max limit to the number of terms in the string to be replaced you would expect ever??

Can it be done with using upper.

Shailandra, September 15, 2003 - 3:00 pm UTC

Hi Tom,

Can the above question be done by using upper function rather than looking for every possible combination of string.

SQL> select username, replace( translate( upper(username), 'SAT', 'S' ), 'S', 
'' )
  2    from     all_users
  3   where rownum < 5
  4  /
 

Tom Kyte
September 15, 2003 - 4:14 pm UTC

yes, but that would destroy the case of everything else -- which I thought they wanted to preserve.

Snippet Error

reader, May 07, 2004 - 5:31 am UTC

Hi Tom,

I tried to execute pl/sql code given by you. It is giving following error

15:04:05 atlas@INFOD> @find_value
Enter value for giv_str: sachin
old 3: giv_str varchar(200):=upper('&giv_str');
new 3: giv_str varchar(200):=upper('sachin');
Table Name ColumnName
----------------------------- -----------------------------
declare
*
ERROR at line 1:
ORA-00932: inconsistent datatypes
ORA-06512: at line 12

Kindly suggest

Tom Kyte
May 07, 2004 - 8:33 am UTC

sooo, care to share the *entire* example

there is more than one plsql block on this page.
I called none of them find_value
the only script i have on this page with line 12 wouldn't/couldn't do that.
none of them print out "columnName"

and so on and so on...

Wanted to keep only valid characters in string

parag, May 07, 2004 - 6:30 am UTC

Hi Tom,

Thanks for your great answer. I have practically little different question in same context

I want to keep only a-z, A-Z, '-', '_', '$' characters in text string. All other characters in the string I want to remove it. I do ascii values of the characters to be removed. How can I do it in easy way using SQL ?

regards & thanks
parag jayant patankar


Tom Kyte
May 07, 2004 - 8:35 am UTC

with 10g regular expressions, this will be easy.

in 9i, you would have to build a string of the characters you want to remove.

parag, May 07, 2004 - 7:17 am UTC

Hi Tom,

Thanks for your great answer. I have practically little different question in same context. Samll correction in earlier thread.

I want to keep only a-z, A-Z, '-', '_', '$' characters in text string. All other characters in the string I want to remove it. I do not know ascii values of the characters to be removed. How can I do it in easy way using SQL ?

regards & thanks
parag jayant patankar




Ignore example

parag, May 07, 2004 - 9:27 am UTC

Hi Tom,

Sorry. I put wrong example in your name. This is an example given in same thread and link to metalink "Tip of the day".

I had tried your pl/sql coding. Working very fine.

thanks & regards


Pl answer my question !!!!

parag, May 07, 2004 - 9:45 am UTC

Hi Tom,

Will you pl help me regarding keeping certain characters a-z, A-Z, 0-9, "-", "_" in a string using sql ????? ( already posted a question in the same thread )

regards & thanks

Tom Kyte
May 07, 2004 - 11:43 am UTC

i told you?!?

you'd have to build a string of everything but the ones you wanted to keep in 9i to do this efficiently.


ops$tkyte@ORA9IR2> create or replace package demo_pkg
  2  as
  3          g_bad_chars varchar2(255);
  4  end;
  5  /
 
Package created.
 
ops$tkyte@ORA9IR2> create or replace package body demo_pkg
  2  as
  3  begin
  4          for i in 1 .. 255
  5          loop
  6                  if ( chr(i) not between 'a' and 'z' )
  7                     and
  8                     ( chr(i) not between 'A' and 'Z' )
  9                     and
 10                     ( chr(i) not between '0' and '9' )
 11                     and
 12                     ( chr(i) not in ( '-', '_' ) )
 13                  then
 14                          g_bad_chars := g_bad_chars || chr(i);
 15                  end if;
 16          end loop;
 17  end;
 18  /
 
Package body created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop table t;
 
Table dropped.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table t ( str varchar2(30) );
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t values ( '5%@%52345-fafdA_5@$#' );
 
1 row created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> begin
  2          for x in ( select translate( str, 'a'||demo_pkg.g_bad_chars, 'a' ) str from t )
  3          loop
  4                  dbms_output.put_line( x.str );
  5          end loop;
  6  end;
  7  /
552345-fafdA_5
 
PL/SQL procedure successfully completed.
 

How to find value for ' ( single quote )

pjp, July 28, 2004 - 8:23 am UTC

Hi Tom,

How to find value of single quote "'" ? for e.g. B'BAY that I want to replace with spaces as mentioned by you in this thread. I had tried following

select to_number('''') from dual
*
ERROR at line 1:
ORA-01722: invalid number

thanks & regards

Tom Kyte
July 28, 2004 - 12:49 pm UTC

replace( string, '''', ' ' )

will replace it, like '%''%' will find it (as would instr( string, '''' ) > 0)

** Query **

pjp, July 28, 2004 - 9:47 am UTC

Tom,

1. I got an answer what I want to do for question I had asked you in this thread. I should do ascii('''') to get an value am I correct ? ( I got ascii value 39 for single quote "'")

2. secondly I am trying translate the characters by pl/sql code as mentioned in this thread but I am not able to remove single quote character from character string. For other characters pl/sql code working very fine. If I am running from SQL to remove/translate character it is getting removed. Do you guess what is the reason ? ( I am running pl/sql code on AIX )

3. I am trying to do test removing character from sql by following

19:03:21 atlas@ATP1P1> select * from parag;

A
-------------------------
DD B'AY

I am storing all unwanted characters in v9spd900 table in v0010s varchar2(999) fields for ascii range 1 to 255. If i do

19:08:29 atlas@ATP1P1> select translate(a, v0010s, ' ') from parag, v9spd900;

TRANSLATE(A,V0010S,'')
-------------------------
DD BAY

My question is why "'" character not replaced with ' ' (blank space ) why it removed character ?

thanks & regards



Tom Kyte
July 28, 2004 - 1:08 pm UTC

2) cannot guess, you did not give an example.


select replace( a, '''', ' ')

** Pl expalin ***

pjp, July 29, 2004 - 1:27 am UTC

Tom,

Thanks for your reply. I am trying to replace unwanted characters by blank space character ' ' by techniques suggeted by you in this thread. In table v9spd900 I am storing values in a table by following pl/sql code : -

declare
l_chr varchar2(999);
begin
for i in 1..255
loop
   if (chr(i) not between 'A' and 'Z') and
      (chr(i) not between 'a' and 'z') and
      (chr(i) not between '0' and '9') and
      (chr(i) not in ('_', '-', ' ','/','\','$')) then
      l_chr := l_chr||chr(i);
   end if;
end loop;
     insert into v9spd900 values (l_chr);
     commit;
end;
/

Then I am using following SQL to replace unwanted characters by '  ' ( one blank space ).

  1* select translate(a, v0010s, ' ') from parag, v9spd900
SQL> /

TRANSLATE(A,V0010S,'')
-------------------------
DD BAY

I want output as "DD B AY". My questions is

Why this SQL not able to replace unwanted characters by ' ' blank space, it just removed unwanted character ?

thanks & regards 

Tom Kyte
July 29, 2004 - 8:07 am UTC

translate( a, v0010s, rpad( ' ', length(v0010s) )

you need to fill out the translation table....

Will you pl reply my query above ?

pjp, July 29, 2004 - 7:43 am UTC

Hi Tom,

Will you pl reply to my query of translate asked above ?

best regards
pjp

A reader, July 29, 2004 - 10:09 am UTC

for i in 1..255
loop
if (chr(i) not between 'A' and 'Z') and
(chr(i) not between 'a' and 'z') and
(chr(i) not between '0' and '9') and
(chr(i) not in ('_', '-', ' ','/','\','$')) then
l_chr := l_chr|| ' ' ;
else
l_chr := l_chr||chr(i) ;
end if;
end loop;


Stripping special characters, redux!

Duke Ganote, July 29, 2004 - 1:50 pm UTC

A parallel question was "How would you go about stripping special characters from a partnumber...I want to strip everything except A-Z, a-z, 0-9."
at </code> http://asktom.oracle.com/pls/ask/f?p=4950:61:17787707607021855365::::P61_ID:595323463035 <code>

How to find records having unwanted characters ?

Parag Jayant Patankar, November 09, 2004 - 1:16 am UTC

Hi Tom,

I am having string of 1000 characters that is nothing but complete transaction for our Internet Banking. I want to detect strings having unwanted characters and write to log file.

Earlier I was replacing all unwanted characters in SQL
by
select translate(string1, v01010s, rpad( ' ', length(v0010s) )
from transaction, v9spd900
/

where v9spd900 is having all unwanted characters.

Kindly suggest me a simple way in SQL ( not in PL/SQL )to detect records having unwanted characters ?

Is it also possible to detect the exact position/poistions of unwanted characters in a string ( e.g 80th Character, 976th Character ...etc ) in SQL ( not in PL/SQL ) ?

best regards & thanks
pjp

Tom Kyte
November 09, 2004 - 8:49 am UTC

you've already done the work for me here, you have posted the "simple" way in sql to do this.

the exact position of the "first" occurence, sure -- but all occurences becomes problematic.


why the fear of plsql? a plsql function there and here to do something truly useful (it is what I would use to report back the "errors", something like:

select string1,
case
when translate(string1, v01010s, rpad( ' ', length(v0010s) ) <> string1
then plsql_function(string1)
else null
end error_positions
from transaction, v9spd900
/




Question

Parag Jayant Patankar, November 09, 2004 - 8:57 am UTC

Hi Tom,

Pl explain also my question asked in thread

Kindly suggest me a simple way in SQL ( not in PL/SQL )to detect records having unwanted characters ?

regards & thanks
pjp

Tom Kyte
November 09, 2004 - 9:24 am UTC

page up -- you ANSWERED it already yourself?


select translate(string1, v01010s, rpad( ' ', length(v0010s) )
from transaction, v9spd900
/


now, just compare translate(...) to string1 and keep the records where string1 <> translate(...)

Thanks

Hubertus Krogmann, December 02, 2004 - 8:00 am UTC

We have a colum globaltext filled with text from 4 other colums by a perl script.
This colum is big and not needed anymore, except by a mainframe program using it for fulltextsearch.
a simple a||b||c||d as globaltext returns some nasty CR LF which the mainframe cannot use ...

the replace/translate(...)) was very helpful even 4.6 year old things DO help ;-)


Show "only" leading and trailing spaces as #

A reader, April 21, 2005 - 8:25 am UTC

Hello Sir,
I want to show the leading and trailing spaces if any as # in a string ( It does not have any # inside the string. Also rule is just for trailing and leading spaces,not spaces in between the string)

Example :
1) "mystring " ---> mystring###
2) " mystring" ---> ##mystring
3) " my string " --> ##my string##

If I do a simple replace that does not work for example 3


Tom Kyte
April 21, 2005 - 11:56 am UTC

ops$tkyte@ORA9IR2.US.ORACLE.COM> select '"'||x||'"', rpad( rtrim(lpad( ltrim(x), length(x), '#')), length(x), '#' ) y
  2    from t;
 
'"'||X||'"'  Y
------------ ----------
"   x   "    ###x###
"x   "       x###
"  x"        ##x
 

Thanks Sir.

A reader, April 21, 2005 - 3:46 pm UTC


Re: leading and trailing spaces as #

Gabe, April 21, 2005 - 9:45 pm UTC

The one possible problem with that solution is if the string is made up only of spaces ... it returns null ... in case they expect the spaces replaced rather than removed.

Tom Kyte
April 22, 2005 - 10:20 am UTC

good point

nvl(ltrim(x),'#')

in the inside should fix that.

How to handle this scenario

TS, April 28, 2005 - 4:58 pm UTC

Tom,

I've a column value like this:-

abcdef^M
ghijklmn^M
opqrstuvwxyz^M
<blank line>
<blank line>

After ^M there's a long blank space on each line
followed by 2 blank lines at the end. I used replace,translate functions to remove the control characters and replace it with '!', But how do I remove the blank spaces after each line and bring the entire column value into a single line.


Tom Kyte
April 28, 2005 - 5:26 pm UTC

are there chr(10)'s in there you want to remove?

sounds like you might want to trim the replace of that string's chr(10) with ''

Thanks

TS, April 29, 2005 - 10:42 am UTC

Tom,

Worked like a champ.Thanks a ton.

TS

questions how to printing blank spaces

feng, May 26, 2005 - 1:30 am UTC

hi,
I got problems to adding some extra blank spaces before I print the entire text.
however, rpad and lpad seem to be negative to control blank space

Tom Kyte
May 26, 2005 - 8:44 am UTC

don't know what you mean.... is this about dbms_output?

set serveroutput on format wrapped


ops$tkyte@ORA9IR2> exec dbms_output.put_line( '            x' );
x
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set serveroutput on format wrapped
ops$tkyte@ORA9IR2> exec dbms_output.put_line( '            x' );
            x
 
PL/SQL procedure successfully completed.
 
 

replacing null values in a column with char

Reader, May 30, 2005 - 8:32 am UTC

Greeting Tom,

I have followin results from this query:

SQL> SELECT * from emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369            CLERK           7902 17-DEC-80          2                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81          3        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81          3        500         30
      7566            MANAGER         7839 02-APR-81          4                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81          5       1400         30
      7698            MANAGER         7839 01-MAY-81          7                    30
      7782            MANAGER         7839 09-JUN-81                               10
      7788 SCOTT      ANALYST         7566 09-DEC-82       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844            SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
      7900            CLERK           7698 03-DEC-81        950                    30
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934            CLERK           7782 23-JAN-82       1300                    10
      7955            
15 rows selected.

I have some NULL values in ename column, is there any way that i can replace  NULL value and print as many X as the length of this column?

Thank you in advance... 

Tom Kyte
May 30, 2005 - 9:22 am UTC

nvl( ename, rpad('x',10,'x') )

HTML Codes

A reader, May 03, 2006 - 11:50 am UTC

Tom,

A front-end application when saved a text field copied from MS-Word onto a long column in the database, it saved with HTML Codes (codes that start with '&#', and end with ';' - not HTML Tags).

This is how it looks like in the application:

Case 1)
    Enter and maintain...

This is how it looks like from the database:

SQL> select descr from t;

Case 1)

&#61485;    Enter and maintain...

Is there a simple way to strip those HTML codes? or do we have to write a function to do so?  I'm using 8i.

I appreciate your help. 

Tom Kyte
May 03, 2006 - 1:25 pm UTC

it just be "text" to us - nothing special here.

Just like if you wanted to remove the word "Enter" - you will have to remove the word "&#524532;" when it occurrs.

HTML Codes

A reader, May 03, 2006 - 1:47 pm UTC

Thanks for the answer but there could be lots of HTML codes stored in that columns and all of them may be different. The only thing they have in common is the start '&#' and the end ';' characters. Everything else can be different in length and value.


Tom Kyte
May 03, 2006 - 2:50 pm UTC

they are just character strings to us, they are just character strings to you. They are just character strings

No different then "hello world"

sorry - no magic, I couldn't think of a way to get TEXT to do this (it sees the clob as a string, no filtering/translation necessary)

HTML Codes

A reader, May 04, 2006 - 9:38 am UTC

Just wonder if there's an equivalent of CHR() and ASCII() in PL/SQL for HTML Codes?

Thanks.

Tom Kyte
May 04, 2006 - 4:43 pm UTC

nope, they are just character strings! these entities in these strings are just strings themselves - nothing "special" about them.

#143213; is not any different in this string than "hello world" is.

How to do this efficiently

A reader, November 15, 2008 - 3:05 pm UTC

Hi Tom

I have a table with 48 million rows and database is production database. one particular filed called "Description" which is varchar2(200) .

almost 9000 rows have an data issue - the issue is some users have entered a trailing space using the front end application and now this is causing issue due to some reasons.

so we need to find all such records and update the description field with ltrim(rtrim(description)) .

that 9000 count are just a guess , as if we try to select such fields from the table .the query is taking very long time as description field has no index on it and we can not create an index easily as this table is an standard oracle appls table , and second thing is database is production so we can not take a long outage to resolve this issue without giving many reasons.

is there a way to do this efficiently with minimum loss of service.

i tried your find_string function in test enviromnet where data volume is same as production (99% of production) but it takes huge time..I tried to use parallel hint it also takes quite some time in finding all such records.

could you pls suggest another way to solve this problem.

Thanks
Tom Kyte
November 18, 2008 - 6:50 pm UTC

if it is just a few thousand out of millions, just do an update

update t set c = rtrim(ltrim(c)) where c like '% ';

that'll only 'lock' a few thousand records.

worked well

A reader, November 19, 2008 - 9:59 pm UTC

Hi Tom,

worked well in test database and it took only 4 minutes to update 8200 rows.

I fall into another issue - for few records the space was leading as well as trailing so i changed the condition like

where c like '% ' or c like ' %' ;

could I have done it in another better way.just curious.

also finally we will be able to hear you live in India and it is great for us that you are even coming to Mumbai .

Thanks
Tom Kyte
November 24, 2008 - 11:23 am UTC

Mumbai and Bangalore....


you could have used regular expressions, but the OR you used is

a) probably faster
b) the way I would have done it myself

trim

Chris Gould, November 24, 2008 - 1:30 pm UTC

Just curious - any particular reason for using
rtrim(ltrim(mystring)) 
instead of just
trim(mystring) 
?

Tom Kyte
November 24, 2008 - 7:36 pm UTC

because I'm old.

trim is still relatively new ;)

chennai plan

Raaghid, November 25, 2008 - 10:22 am UTC

Any plan for chennai. I am a big fan of you, want to attend your session or speech.
Tom Kyte
November 25, 2008 - 12:40 pm UTC

Just Mumbai and Bangalore on this trip.

no chennai

Reene, November 26, 2008 - 10:59 am UTC

Its better as chennai is too hot , Mumbai has become pleasent weather wise , Banglore is anyway best in india as for as weather goes!

Special characters

Anupam, December 10, 2008 - 3:14 am UTC

Hi Tom ,
I am selecting data from our DB .Data in this DB are entered by the Data entry team .Sample is like below

AD,5040259,111407_sc1_b_
Switch to the version of Mail.html,629025,13036905,SC1,12596113

AD,5521965,639127 101408 ca sw1 text,639127,13052857,SW1,12610618
AD,5292780,032108_236957 sports/fantasy_new/nascar

HEAD SBC Users (Dial or DSL)

,236957,12476063,HEAD,12090138

AD,3535157,236957 news/hotzone HEAD SBC Users (Dial or DSL),236957,12781352,HEAD,12382062

Third column in this example is varchar2(500) .While entering data in this column people push "enter" once or twice ,so while retrievig the data we are getting those spaces and blank lines which are creating problems in processing the data.

So our main aim is to delete the the blank lines in between and get the all the column data in one row.
Like if we are selecting 5 columns then those 5 columns should be in one row of flat file without blanks .

It will be very much helpfull if you can provid any suggestion on doing this .

Thanks And Regards,
Anupam


Tom Kyte
December 10, 2008 - 9:50 am UTC

I have no idea where this data resides. If this is in a file, fix the file. Is this in a row in a table - where? what?

details are very lacking.

Control characters

A reader, February 11, 2009 - 10:46 am UTC

Hi Tom,
We have a table in db which stores id and a html string .

create table test
(id number(10),
html varchar2(2000)
);

The HTML column contains the html script .

now I am trying to dump data from this table .
select id||chr(01)||trim(replace(translate(html,chr(10),'$'),'$',''))
from test;

but the problem here in the spool file is that the records are broken into multiple lines , which will impact the way down streamer handles the data.The cause for this multiple lines is we have more number of special characters in html fields .

We need to dump data so that one line is not broken into multiple lines .

Plz help.

Thanks And Regards,
Anupam
Tom Kyte
February 11, 2009 - 1:04 pm UTC

well, chr(1) is not a character to start with...

but you can set linesize and column cname format aNNNNN

set linesize 4000
column txt format a4000
select id||chr(01)||trim(replace(translate(html,chr(10),'$'),'$','')) TXT
from test;


Any suggestions

A, March 03, 2009 - 5:39 am UTC

Hello Tom,
I want to replace the string 'ABC' with a blank, in a comma seperated string. But the string 'ABC' might be at the begining or at end or in the middle of the string. How can I achieve in with a REPLACE() ?

Following is the data:

'ABC, 123, PQR'
'LMN, ABC, XYZ'
'781, UUU, MMM, ABC'

I want to achieve:

'123, PQR'
'LMN, XYZ'
'781, UUU, MMM'


Thanks


Tom Kyte
March 03, 2009 - 4:52 pm UTC

ops$tkyte%ORA10GR2> select x, regexp_replace( x, '(ABC, |, ABC$|ABC)', '' ) x_rep from t;

X                              X_REP
------------------------------ ----------------------------------------
ABC, 123, PQR                  123, PQR
LMN, ABC, XYZ                  LMN, XYZ
781, UUU, MMM, ABC             781, UUU, MMM
ABC

CONTROL CHARACTERS

A reader, March 03, 2009 - 8:03 pm UTC

Excellent stuff

Excellent !!!

A, March 04, 2009 - 4:20 am UTC

Thanks Tom ... Too good

Getting Control Character

Saradhi, June 12, 2009 - 2:07 pm UTC

Here i am loading data from flatfile to temp table,but when i query the table, i am seeing control character for one column.

Tom Kyte
June 12, 2009 - 2:56 pm UTC


control characters are data in a file as far as sqlldr is concerned

but since you didn't give us

a) a create table
b) your control file
c) a sample of your data

to reproduce with, we cannot help you by suggesting ways to avoid the issue.

Getting rid of control characters

Duke Ganote, June 12, 2009 - 3:31 pm UTC

Tom Kyte
June 12, 2009 - 4:24 pm UTC

yeah, but they said

... i am seeing control
character for one column.
...


which I assume means "the entire column is just this control character and I want it to SKIP that column" or something.

In short - we don't really know what they are trying to do

strip

A reader, June 13, 2009 - 8:25 am UTC


why 29 replace calls is faster than 1 translate call?

Renat, February 25, 2010 - 5:18 am UTC

I want to write a fast function that replaces non-printable characters with ascii codes 0-31 except 11,12,15 to printable one '�'
I'm generate xml as string from the data in database, and some strings have unwanted characters in it. I've found that 29 REPLACE calls is faster then one TRANSLATE call. Why?

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set timing on

declare
   n number := 10000;
   a varchar2(32000);
   b varchar2(32000):= rpad('#',n,'#')||rpad('-',n,'-');
   l_bad_chars varchar2(32) := '';
begin
   for i in 0..10 loop
      l_bad_chars := l_bad_chars || chr(i);
   end loop;
   
   l_bad_chars := l_bad_chars || chr(13);
   l_bad_chars := l_bad_chars || chr(14);

   for i in 16..31 loop
      l_bad_chars := l_bad_chars || chr(i);
   end loop;
   
  for i in 1..1000 loop
    a := translate(b, l_bad_chars, '�');
  end loop;
end;
/

declare
   n number := 10000;
   a varchar2(32000);
   b varchar2(32000):= rpad('#',n,'#')||rpad('-',n,'-');
begin
  for i in 1..1000 loop
     a := b;
   
     for i in 0..10 loop
        a := replace(a, chr(i), '�');
     end loop;
     
     a := replace(a, chr(13), '�');
     a := replace(a, chr(14), '�');
     
     for i in 16..31 loop
        a := replace(a, chr(i), '�');
     end loop;
  end loop;
end;
/

SQL> SQL>   2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22  
PL/SQL procedure successfully completed.

Elapsed: 00:00:08.79
SQL> SQL>   2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21  
PL/SQL procedure successfully completed.

Elapsed: 00:00:03.04
SQL> SQL> 

Tom Kyte
March 01, 2010 - 9:34 am UTC

ops$tkyte%ORA10GR2> set timing on
ops$tkyte%ORA10GR2> declare
  2     n number := 10000;
  3     a varchar2(32000);
  4     b varchar2(32000):= rpad('#',n,'#')||rpad('-',n,'-');
  5     l_bad_chars varchar2(32) := '';
  6     l_cpu number;
  7  begin
  8     for i in 0..10 loop
  9        l_bad_chars := l_bad_chars || chr(i);
 10     end loop;
 11
 12     l_bad_chars := l_bad_chars || chr(13);
 13     l_bad_chars := l_bad_chars || chr(14);
 14
 15     for i in 16..31 loop
 16        l_bad_chars := l_bad_chars || chr(i);
 17     end loop;
 18    l_cpu := dbms_utility.get_cpu_time;
 19    for i in 1..1000 loop
 20      a := translate(b, l_bad_chars, '?');
 21    end loop;
 22    dbms_output.put_line( (dbms_utility.get_cpu_time-l_cpu) || ' cpu hsecs' );
 23  end;
 24  /
11 cpu hsecs

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.11
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
  2     n number := 10000;
  3     a varchar2(32000);
  4     b varchar2(32000):= rpad('#',n,'#')||rpad('-',n,'-');
  5     l_cpu number;
  6  begin
  7    l_cpu := dbms_utility.get_cpu_time;
  8    for i in 1..1000 loop
  9       a := b;
 10
 11       for i in 0..10 loop
 12          a := replace(a, chr(i), '?');
 13       end loop;
 14
 15       a := replace(a, chr(13), '?');
 16       a := replace(a, chr(14), '?');
 17
 18       for i in 16..31 loop
 19          a := replace(a, chr(i), '?');
 20       end loop;
 21    end loop;
 22    dbms_output.put_line( (dbms_utility.get_cpu_time-l_cpu) || ' cpu hsecs' );
 23  end;
 24  /
77 cpu hsecs

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.78



In newer releases:

ops$tkyte%ORA11GR2> @test
7 cpu hsecs

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.17
49 cpu hsecs

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.51



and even in really old releases:

ops$tkyte%ORA9IR2> @test
ops$tkyte%ORA9IR2> set timing on
ops$tkyte%ORA9IR2> declare
  2     n number := 10000;
  3     a varchar2(32000);
  4     b varchar2(32000):= rpad('#',n,'#')||rpad('-',n,'-');
  5     l_bad_chars varchar2(32) := '';
  6     l_cpu number;
  7  begin
  8     for i in 0..10 loop
  9        l_bad_chars := l_bad_chars || chr(i);
 10     end loop;
 11
 12     l_bad_chars := l_bad_chars || chr(13);
 13     l_bad_chars := l_bad_chars || chr(14);
 14
 15     for i in 16..31 loop
 16        l_bad_chars := l_bad_chars || chr(i);
 17     end loop;
 18    --l_cpu := dbms_utility.get_cpu_time;
 19    for i in 1..1000 loop
 20      a := translate(b, l_bad_chars, '?');
 21    end loop;
 22    --dbms_output.put_line( (dbms_utility.get_cpu_time-l_cpu) || ' cpu hsecs' );
 23  end;
 24  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.12
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> declare
  2     n number := 10000;
  3     a varchar2(32000);
  4     b varchar2(32000):= rpad('#',n,'#')||rpad('-',n,'-');
  5     l_cpu number;
  6  begin
  7    --l_cpu := dbms_utility.get_cpu_time;
  8    for i in 1..1000 loop
  9       a := b;
 10
 11       for i in 0..10 loop
 12          a := replace(a, chr(i), '?');
 13       end loop;
 14
 15       a := replace(a, chr(13), '?');
 16       a := replace(a, chr(14), '?');
 17
 18       for i in 16..31 loop
 19          a := replace(a, chr(i), '?');
 20       end loop;
 21    end loop;
 22    --dbms_output.put_line( (dbms_utility.get_cpu_time-l_cpu) || ' cpu hsecs' );
 23  end;
 24  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.71



I cannot reproduce anything close to what you show - 8 or 3 seconds would be horriblly long in any case.

I'd have to guess that your machine is just overloaded perhaps and wall clock time (instead of cpu time) is very unreliable since you are affected by what else is happening on that machine at that time.

REPLACE vs TRANSLATE

Renat, March 04, 2010 - 3:40 am UTC

Thank you, Tom!

I suppose this really slow and inconvenient results is due to XEN virtualization. Our developer servers are on XEN so may be this is the point. I'll do further investigation and report later.

TRANSLATE is faster on production server as you said.
Tom Kyte
March 04, 2010 - 9:54 am UTC

I'll ask that other readers test this out for us and report back the results...

I cannot find a system to reproduce your findings

replace characters

A reader, March 04, 2010 - 11:16 am UTC


Sanju, August 18, 2010 - 7:00 am UTC

Hi Tom,

Below is the code which is trying to remove text anything between "<" and">" character.
------------------------------------------
declare
vString varchar(2000) := '<messageFormat></messageFormat><msgLength>551</msgLength><pr>';
vLocator number(10) := null;
v1 number(10) := null;
v2 number(10) := null;
vreplace varchar(1000) := null;
begin

select instr(vString, '<', 1, 1) into vLocator from dual;

while vLocator > 0 loop
select instr(vString, '<', 1, 1) into v1 from dual;
select instr(vString, '>', 1, 1) into v2 from dual;
dbms_output.put_line('Current String Value: '||vString);
dbms_output.put_line('Starting index for "<": ' || v1);
dbms_output.put_line('End index for ">": ' || v2);
select substr(vString,
instr(vString, '<', 1, 1),
instr(vString, '>', 1, 1))
into vreplace
from dual;
dbms_output.put_line('String to replace: '||vreplace);
select replace(vString,
substr(vString,
instr(vString, '<', 1, 1),
instr(vString, '>', 1, 1)))
into vString
from dual;
select instr(vString, '<', 1, 1) into vLocator from dual;
dbms_output.put_line ('after replace: '||vString);
end loop;
dbms_output.put_line(vString);
end;

------------------------------------------

And the output is as follows:
------------------------------------------
Current String Value: <messageFormat></messageFormat><msgLength>551</msgLength><pr>
Starting index for "<": 1
End index for ">": 15
String to replace: <messageFormat>
after replace: </messageFormat><msgLength>551</msgLength><pr>
Current String Value: </messageFormat><msgLength>551</msgLength><pr>
Starting index for "<": 1
End index for ">": 16
String to replace: </messageFormat>
after replace: <msgLength>551</msgLength><pr>
Current String Value: <msgLength>551</msgLength><pr>
Starting index for "<": 1
End index for ">": 11
String to replace: <msgLength>
after replace: 551</msgLength><pr>
Current String Value: 551</msgLength><pr>
Starting index for "<": 4
End index for ">": 15
String to replace: </msgLength><pr
after replace: 551>
551>
------------------------------------------
As you can see the output is "551>". I am not able to understand you the last character is not removed.

Special characters are coming in place of spaces

srinivas Rao, September 08, 2011 - 7:57 am UTC

Hi Tom,

In my table for column i have "He llo" data but while getting in excel i am getiing special character instead of spaces. Please let me know if there is any resolution for this. i want value as "He llo". Actually i am pulling the output of the query into BLOB variable and from there i am populating it into excel.

Thanks
Srinivas Rao
Tom Kyte
September 08, 2011 - 5:38 pm UTC

you do not have spaces there then - you have something that your terminal normally doesn't print.

use

select dump(column) from table where....


you'd expect to see the number 32 where there was a space like this:

ops$tkyte%ORA11GR2> select dump('he  llo' ) from dual;

DUMP('HELLO')
---------------------------------------
Typ=96 Len=7: 104,101,32,32,108,108,111


but you won't, it'll be something else.


since this is a blob, you can use


dbms_lob.substr(...)

to pick out the bit you want to see and sqlplus will show you the hex codes - you'll be looking for 0x20 to be there (but it won't, it'll be something else)

How to find any or all the special characters in a column or table??

Geek, September 09, 2011 - 8:21 am UTC

Hi Tom,

I have a table and I want to find if there is a special character in any of the column or if possible the whole table, and rows with any special characters should be returned, anything other than regular numbers and characters should be returned. Is there a way/SQL for this in Oracle 9.2? I tried to find solution myself and in many forums but I couldn't and at last posting it here. Thing is I don't want to input one particular character. Your response is much appreciated.

Thanks in advance,
Geek
Tom Kyte
September 09, 2011 - 9:37 am UTC

define special characters - define special characters PRECISELY - don't just say "not normal characters" or something like that. Be really really special.

because to me, there are no special characters, they are all just characters in a given character set.

How to find any or all the special characters in a column or table??

Geek, September 09, 2011 - 11:40 am UTC

ÿøõôóê뜡¢¥½¯£..etc I meant are special characters..

These kind of characters take additional bytes, depending on database its 2 or 3 bytes per character.

I want to know this because I am facing an issue with 'ORA-01401: inserted value too large for column' error while inserting into a table, and I have checked each and every column in select statement and maximum length of the data, and maximum length of the data is less than the column size of the table I am inserting into, but it still errors out, So I want to check if there are special characters that are taking additional bytes and i want to find the rows containing these characters. Please correct me if I am wrong.
Tom Kyte
September 10, 2011 - 2:56 pm UTC

define them all - etc doesn't cut it. Those all look VALID and not very special to me.

Either

a) define the set of "non-special" characters

or

b) define the set of "special" characters.


You can query "where length(column) <> lengthb(column)" to find candidate rows, use DUMP() to inspect them.

And I'm assuming that you are using a multi-byte database characterset? If not, that lengthb/length thing won't work.

Removing unmatched Double Quotes

Thiru, November 07, 2012 - 5:33 pm UTC

Hi Tom,

I am trying to find out how to remove unmatched double quotes from a column value using a sql. Can you please help me out ?

For example:

if a value in a column is: 111 Laurel drive "Texas

This has a unmatched double quote. Need to find out all the column values that has odd number of doube quotes.

Thaks for your time.
T
Tom Kyte
November 08, 2012 - 7:32 am UTC

ops$tkyte%ORA11GR2> with data
  2  as
  3  (select '111 Laurel drive "Texas' x from dual union all
  4   select '111 Laurel drive "Texas"' x from dual union all
  5   select '111 "Laurel drive "Texas"' x from dual union all
  6   select '111 "Laurel" drive "Texas"' x from dual
  7  )
  8  select x,
  9         substr(x,1,instr(x,'"')-1) || substr(x,instr(x,'"')+1) newx
 10    from data
 11   where mod(length(x)-length(replace(x,'"','')),2) = 1
 12  /

X                          NEWX
-------------------------- ----------------------------------------------------
111 Laurel drive "Texas    111 Laurel drive Texas
111 "Laurel drive "Texas"  111 Laurel drive "Texas"




you didn't say what quote to remove, so I removed the first one I find...

REmoving unwanted lines

capeb, December 06, 2012 - 6:00 am UTC

Hi,
I wanted to remove unwanted lines from a text file which will be sent as attachment through a mail.
Since there are many unwanted characters, is there a way to remove those by using some query

A reader, October 24, 2014 - 1:27 am UTC

create table bad (str varchar2(255) primary key) organization index;

declare
l_str varchar2(255);
begin
for i in 0 .. 31
loop
l_str := l_str || chr(i);
end loop;
for i in 127..255
loop
l_str := l_str || chr(i);
end loop;
insert into bad values ( l_str );
end;
/

select dump(str) from bad;
Typ=1 Len=33: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,127


Can you help me to figure out why there is stop at 127, where is the 128 to 255?
Sergiusz Wolicki
November 18, 2018 - 10:36 pm UTC

Most probably, your database character set is not a single-byte character set. I am guessing it is AL32UTF8, which is multibyte. Bytes 128-255 are not valid character codes in isolation in AL32UTF8. Therefore, CHR will not work with them.

You can check your database character set with this query:

SELECT value FROM nls_database_parameters WHERE parameter = 'NLS_CHARACTERSET'