Robert, July 19, 2013 - 6:15 pm UTC
Dan Blum, July 19, 2013 - 7:07 pm UTC
Unfortunately if you have a number longer than 16 digits that REGEXP_REPLACE will replace the first 16 digits. To restrict it to 16-digit numbers, period, you need something like
select regexp_replace( x, '[[:digit:]]{16}([^[:digit:]])',
'xxxxxxxxxxxxxxxx\1' ) from t;
Only works on the first occurance of a 16 digit number
Ken Hughes, July 19, 2013 - 8:29 pm UTC
This works great on the first occurrence of a 16 digit number. But it does not affect additional occurrences of the number. How should I modify the statement to include all occurrences of a 16 digit number in a record returned by the select statement?
replace only occurenaces of a 16 digit number in a string
Shimmy, July 19, 2013 - 8:35 pm UTC
Both the SQL don't give the results
SQL> insert into t values ( 'this is NOT A credit card number 1234567890123456789 This is credit card number 1234567890123456 done');
1 row created.
SQL>
SQL> select X, regexp_replace( x, '([[:digit:]]{16})', 'xxxxxxxxxxxxxxxx' ) formated
2 from t
3 WHERE X LIKE '%NOT%';
X
----------------------------------------------------------------------------------------------------
FORMATED
----------------------------------------------------------------------------------------------------
this is NOT A credit card number 1234567890123456789 This is credit card number 1234567890123456 done
this is NOT A credit card number xxxxxxxxxxxxxxxx789 This is credit card number xxxxxxxxxxxxxxxx done
SQL>
SQL>
SQL> select X, regexp_replace( x, '[[:digit:]]{16}([^[:digit:]])', 'xxxxxxxxxxxxxxxx' ) formated
2 from t
3 WHERE X LIKE '%NOT%';
X
----------------------------------------------------------------------------------------------------
FORMATED
----------------------------------------------------------------------------------------------------
this is NOT A credit card number 1234567890123456789 This is credit card number 1234567890123456 done
this is NOT A credit card number 123xxxxxxxxxxxxxxxxThis is credit card number xxxxxxxxxxxxxxxxdone
Just trying to understand regular expressions. For some reason it's always confusing how to define the pattern.
It does not replace the number if it is at the end of the record
Ken Hughes, July 19, 2013 - 8:56 pm UTC
My earlier comment was slightly incorrect. The syntax you provided will replace all occurrences of a 16 digit number except on that is at the end of the record. The syntax looks for a 16 digit number with a character after it. The number at the end of the record does not have any characters beyond it.
Dealing with edge cases
tony, July 20, 2013 - 7:20 am UTC
I modified Ken's solution so that it works with card numbers at the beginning/end of the input string:
with t as (
select
'1234567890123456 <= CCN CCN => 1234567890123456 CCN => 2345678901234567' x
from dual
union all
select
'1234567890 <= NOCCN NOCCN => 1234567 NOCCN => 2345678' x
from dual
union all
select
'12345678901234567 <= NOCCN NOCCN => 12345678901234567 NOCCN => 23456789012345678' x
from dual
)
select regexp_replace( x,
'([^[:digit:]]|^)[[:digit:]]{16}([^[:digit:]]|$)',
'\1xxxxxxxxxxxxxxxx\2' ) from t
Multiple
Jonathan Taylor, July 22, 2013 - 2:09 pm UTC
Care has to be taken if the text is multi-line (contains new line characters). The "m" modifier to REGEXP_SUBSTR should make it work (but it DOESN'T seem to).
I suspect that the "([^[:digit:]]|^)" part is confusing things (or perhaps a bug).
This should match a non-digit OR the start-of-line.
with t as (
select
'1234567890123456 <= CCN CCN => 1234567890123456 CCN => 2345678901234567
1234567890123456 <= CCN CCN => 1234567890123456 CCN => 2345678901234567' x
from dual
union all
select
'1234567890 <= NOCCN NOCCN => 1234567 NOCCN => 2345678' x
from dual
union all
select
'12345678901234567 <= NOCCN NOCCN => 12345678901234567 NOCCN => 23456789012345678' x
from dual
)
select regexp_replace( x,
'([^[:digit:]]|^)[[:digit:]]{16}([^[:digit:]]|$)',
'\1xxxxxxxxxxxxxxxx\2',1,0,'m' ) from t
The only way I could get it working is dealing with the 3 cases separately:-
1. <non digit><16 digits><non digit>
2. <start of line><16 digits><non digit>
3. <non digit><16 digits><end of line>
with t as (
select
'1234567890123456 <= CCN CCN => 1234567890123456 CCN => 2345678901234567
1234567890123456 <= CCN CCN => 1234567890123456 CCN => 2345678901234567' x
from dual
union all
select
'1234567890 <= NOCCN NOCCN => 1234567 NOCCN => 2345678' x
from dual
union all
select
'12345678901234567 <= NOCCN NOCCN => 12345678901234567 NOCCN => 23456789012345678' x
from dual
)
select regexp_replace(
regexp_replace(
regexp_replace( x
, '([^[:digit:]])[[:digit:]]{16}([^[:digit:]])'
, '\1xxxxxxxxxxxxxxxx\2',1,0,'m'
)
,'^[[:digit:]]{16}([^[:digit:]])'
,'xxxxxxxxxxxxxxxx\1',1,0,'m'
)
,'([^[:digit:]])[[:digit:]]{16}$'
,'\1xxxxxxxxxxxxxxxx',1,0,'m'
)
from t
@ Ken
Rajeshwaran, July 23, 2013 - 1:05 pm UTC
@Ken, does this helps you.
rajesh@ORA10G> select x,
2 regexp_replace(x,'(\D+|^)\d{16}(\D+|$)','\1xxxxxxxxxxxxxxxx\2') val
3 from t ;
X VAL
---------------------------------------- -------------------------------------
I am 16digit 1234567890123456 I am 16digit xxxxxxxxxxxxxxxx
I am not 16digit 12345678901234567 I am not 16digit 12345678901234567
12345678901234567 12345678901234567
1234567890123456 xxxxxxxxxxxxxxxx
4 rows selected.
Elapsed: 00:00:00.03
rajesh@ORA10G>
Solution works great in select but fails in triger
Ken Hughes, July 23, 2013 - 8:00 pm UTC
DDL:
drop table kendev1.testdata;
create table kendev1.testdata (testpk number, test1 varchar2(400));
insert into testdata values (1,'I am 16digit 1234567890123456');
insert into testdata values (2,'I am not 16digit 12345678901234567');
insert into testdata values (3,'12345678901234567');
insert into testdata values (4,'1234567890123456');
insert into testdata values (5,'this is a credit card number 1234567890123456 this is a phone number 2538887777 this is a zip code 98030 this is another credit card number 2345678901234567');
insert into testdata values (6,'this is a phone number 2538887777 this is a zip code 98030');
commit;
col test1 for a40
col val for a40
set line 200
set pages 99
select testpk, test1,
regexp_replace(test1,'(\D+|^)\d{16}(\D+|$)','\1xxxxxxxxxxxxxxxx\2') val
from testdata;
TESTPK TEST1 VAL
---------- ---------------------------------------- ----------------------------------------
1 I am 16digit 1234567890123456 I am 16digit xxxxxxxxxxxxxxxx
2 I am not 16digit 12345678901234567 I am not 16digit 12345678901234567
3 12345678901234567 12345678901234567
4 1234567890123456 xxxxxxxxxxxxxxxx
5 this is a credit card number 12345678901 this is a credit card number xxxxxxxxxxx
23456 this is a phone number 2538887777 xxxxx this is a phone number 2538887777
this is a zip code 98030 this is another this is a zip code 98030 this is another
credit card number 2345678901234567 credit card number xxxxxxxxxxxxxxxx
6 this is a phone number 2538887777 this i this is a phone number 2538887777 this i
s a zip code 98030 s a zip code 98030
6 rows selected.
NOTE: All occurrences of a 16 digit number have been masked with xxxxxxxxxxxxxxxx
drop table test_save;
create table test_save as select * from testdata where 1=2;
CREATE OR REPLACE TRIGGER strip_credit_card_numbers
before insert or update on kendev1.test_save for each row
DECLARE
v_credit_card varchar2(4000);
begin
-- This statement will succeed if there is a 16 digit number anywhere in :new.test1.
select 'x' into v_credit_card from dual where regexp_like( :new.test1, '[[:digit:]]{16}([^[:digit:]])');
if v_credit_card = 'x' then
:new.test1 := regexp_replace(:new.test1,'(\D+|^)\d{16}(\D+|$)','\1xxxxxxxxxxxxxxxx\2');
end if;
EXCEPTION WHEN NO_DATA_FOUND THEN
null;
end;
/
show errors trigger strip_credit_card_numbers
insert into test_save (select TESTPK, test1 from testdata);
col test1 for a80
select TESTPK, test1 from test_save;
TESTPK TEST1
---------- -------------------------------------------------------------------------------
1 I am 16digit 1234567890123456
2 I am not 16digit 12345678901234567
3 12345678901234567
4 1234567890123456
5 this is a credit card number xxxxxxxxxxxxxxxx this is a phone number 2538887777
this is a zip code 98030 this is another credit card number xxxxxxxxxxxxxxxx
6 this is a phone number 2538887777 this is a zip code 98030
6 rows selected.
NOTE: Records #1 & #4 did not get their 16 digit numbers masked with xxxxxxxxxxxxxxxx.
July 31, 2013 - 6:49 pm UTC
why would you use select to make an assignment?
what isn't working?
ops$tkyte%ORA11GR2> create table t ( a int, x varchar2(2000));
Table created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create or replace trigger t
2 before insert on t for each row
3 begin
4 if regexp_like( :new.x, '(\D+|^)\d{16}(\D+|$)' )
5 then
6 :new.x := regexp_replace( :new.x, '(\D+|^)\d{16}(\D+|$)','\1xxxxxxxxxxxxxxxx\2');
7 end if;
8 end;
9 /
Trigger created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> insert into t values (1,'I am 16digit 1234567890123456');
1 row created.
ops$tkyte%ORA11GR2> insert into t values (2,'I am not 16digit 12345678901234567');
1 row created.
ops$tkyte%ORA11GR2> insert into t values (3,'12345678901234567');
1 row created.
ops$tkyte%ORA11GR2> insert into t values (4,'1234567890123456');
1 row created.
ops$tkyte%ORA11GR2> insert into t values (5,'this is a credit card number 1234567890123456 this is a phone number 2538887777 this is a zip code 98030 this is another credit card number 2345678901234567');
1 row created.
ops$tkyte%ORA11GR2> insert into t values (6,'this is a phone number 2538887777 this is a zip code 98030');
1 row created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select * from t;
A
----------
X
-------------------------------------------------------------------------------
1
I am 16digit xxxxxxxxxxxxxxxx
2
I am not 16digit 12345678901234567
3
12345678901234567
4
xxxxxxxxxxxxxxxx
5
this is a credit card number xxxxxxxxxxxxxxxx this is a phone number 2538887777
this is a zip code 98030 this is another credit card number xxxxxxxxxxxxxxxx
6
this is a phone number 2538887777 this is a zip code 98030
6 rows selected.
Fixed the problem.
Ken Hughes, July 23, 2013 - 8:15 pm UTC
CREATE OR REPLACE TRIGGER strip_credit_card_numbers
before insert or update on kendev1.test_save for each row
DECLARE
v_credit_card varchar2(4000);
begin
-- This statement will succeed if there is a 16 digit number anywhere in :new.test1.
select 'x' into v_credit_card from dual where regexp_like( :new.test1, '(\D+|^)\d{16}(\D+|$)');
if v_credit_card = 'x' then
:new.test1 := regexp_replace(:new.test1,'(\D+|^)\d{16}(\D+|$)','\1xxxxxxxxxxxxxxxx\2');
end if;
EXCEPTION WHEN NO_DATA_FOUND THEN
null;
end;
/
July 31, 2013 - 6:52 pm UTC
in addition to the inappropriate use of a select to perform an assignment - the exception handler is in the wrong place too. there should be a begin/end block around the select (which shouldn't be there at all :) ). and the exception should be just for the select - not the entire block of code.
what if one day, for whatever reason, regexp_replace threw a NO_DATA_FOUND - you wouldn't want to ignore that.
exception blocks should be around ONLY the code you are expecting the exception from!
but best to just remove the select, the ONLY time you need to use a select to make an assignment in plsql is when you are using decode (in which case, time to move to case :) )
This is a great service and I really appreciate it a lot !
Ken Hughes, July 25, 2013 - 6:33 pm UTC
Thank you for the great work you have doe for me on this issue :)
Won't catch all cards
djb, July 27, 2013 - 3:39 am UTC
Hi Ken, Dean here... You do know that not all credit card numbers are 16 digits, right? My Amex cards are 15 digits.
One remark
chris, August 01, 2013 - 2:46 pm UTC
Just to be complete.
The proposed solution wont work in every case. Consider:
select
regexp_replace(
'0123456789123456#0123456789123456'
,'(\D+|^)\d{16}(\D+|$)'
,'\1xxxxxxxxxxxxxxxx\2'
) val
from dual
VAL
"xxxxxxxxxxxxxxxx#0123456789123456"
Improved version
chris, August 01, 2013 - 2:51 pm UTC
select
regexp_replace(
'0123456789123456#0123456789123456'
,'\d{16}|(\D+|^)\d{16}(\D+|$)'
,'\1xxxxxxxxxxxxxxxx\2'
) val
from dual
VAL
"xxxxxxxxxxxxxxxx#xxxxxxxxxxxxxxxx"
Notice: as far as i know this only works based on a speciality of the oracle regexp engine, not to go for the largest match of an or-expression, in contrary to other implmentations
Sory wrong
chris, August 01, 2013 - 2:55 pm UTC
sorry last one was wrong as it wont work for sequences of digits longer than 17 (it would replace what it must mot).
In my opinion, the problem of a single separator cant be solved with a single regexp.
One has to split and rebuild the string.
Hmm
chris, August 01, 2013 - 3:03 pm UTC
Assuming chr(0) will never be inside the values
select
replace (
regexp_replace(
regexp_replace (
'#0123456789123456#0123456789123456##01234567891234567#0123456789123456'
, '(\d)(\D)(\d)'
, '\1\2'||chr(0)||'\3'
)
,'(\D|^)\d{16}(\D|$)'
,'\1xxxxxxxxxxxxxxxx\2'
) , chr(0))val
from dual
VAL
"#xxxxxxxxxxxxxxxx#xxxxxxxxxxxxxxxx##01234567891234567#xxxxxxxxxxxxxxxx"
August 02, 2013 - 7:19 pm UTC
I am loving this thread :)
I'm thinking about turning this entire thing into a presentation - "there is no such thing as a small change - or a small requirement"
Clean String of non numeric and non alpha characters
Tony Fernandez, August 06, 2013 - 7:32 pm UTC
Tom,
I tried this expression:
regexp_replace( my_string, '[^[:alnum:]]', '' )
to remove characters other than 0-9, A-Z and a-z.
Please correct if there is anything wrong with it.
Kindly,