Use of regular expression....
Shimmy, December 01, 2015 - 3:29 pm UTC
Hi Chris,
When Tom Kyte used to comment of Regular expression, he used to say that we should use regular expression as the last resort because it's highly CPU intensive.
Can't we achieve something like the one below?
create table stk_t (
email varchar2(320) check (
email LIKE '%@%.%' AND email NOT LIKE '@%' AND email NOT LIKE '%@%@%'
)
);
insert into stk_t values ('abc@xyz.com');
insert into stk_t values ('abc@xyz.co.uk');
insert into stk_t values ('ab.com');
insert into stk_t values ('abc@xyz');
December 02, 2015 - 9:00 am UTC
While you could avoid regexes, as stated in a later review this allows spaces an other invalid characters.
Parsing Email addresses - Potholes
John Gasch, December 01, 2015 - 6:02 pm UTC
Tricky problem, with lots of potholes to consider...
Chris - your solution isn't robust without anchor delimiters to force the entire input string to conform with the regex, along with the trailing addition to validate the string past the first dot.
This would help, but still is far from comprehensive:
regexp_like(email, '^[[:alnum:]]+@[[:alnum:]]+(\.[[:alnum:]]+)+$')
Shimmy - Your solution falls short because it allows the substrings matched by '%' to be null strings. So, this would match yours: 'abc@.' Also, the general wildcard '%' would allow spaces and invalid chars that wouldn't pass muster.
The following page addresses the subject in more detail, but even then it isn't fully comprehensive.
https://blogs.oracle.com/paulj/entry/email_regex_address_validation_aka
December 02, 2015 - 9:05 am UTC
Yes, the original regexp has limitations. A more comprehensive solution like you've shown is likely to reject valid email addresses though. e.g. what if the domain contains a hyphen?
As the post you've linked to says:
"Don't try to be clever with regexes, you might get it wrong"
If I were doing this for real, I would probably just validate that there's exactly one @ in the string an leave it at that.
Regarding regex performance
Mark, December 01, 2015 - 8:28 pm UTC
I would not expect regex performance to be an issue here, unless email addresses are added/modified at a very rapid (in my opinion, unrealistic) rate. Maybe in a bulk load this could have some performance impact...
December 02, 2015 - 9:08 am UTC
Yes, if it's single row inserts the overhead should be low.
Words in Email
Arvind Mishra, May 19, 2016 - 1:10 am UTC
I have two tables
a. Customer table having email address of a customer
b. Bad_words table with 'word' column containing bad words
How can I get all those customers whose email must not contain anything from Bad_words.word column? Can we do it ONLY by sql and without using pl/sql
May 19, 2016 - 1:35 am UTC
SQL> create table emails ( e varchar2(100));
Table created.
SQL> create table bad_words ( w varchar2(100));
Table created.
SQL>
SQL> insert into bad_words values ('ibm');
1 row created.
SQL> insert into bad_words values ('microsoft');
1 row created.
SQL> insert into bad_words values ('sap');
1 row created.
SQL>
SQL> insert into emails values ( 'joe@oracle.com');
1 row created.
SQL> insert into emails values ( 'jane.doe@ibm.com');
1 row created.
SQL> insert into emails values ( 'joe.smith@sap.com');
1 row created.
SQL> insert into emails values ( 'mike.jones@microsoft.com');
1 row created.
SQL> insert into emails values ( 'mary.beth@yahoo.com');
1 row created.
SQL> insert into emails values ( 'suzy@gmail.com');
1 row created.
SQL>
SQL> select *
2 from emails, bad_words
3 where instr(e,w) > 0 ;
E W
------------------------------ ------------------------------
jane.doe@ibm.com ibm
mike.jones@microsoft.com microsoft
joe.smith@sap.com sap
3 rows selected.
SQL>
SQL>
More Potholes
AndyP, May 19, 2016 - 9:01 am UTC
Nice, but just be careful with that solution though. It would for example exclude "jonny.millsap@oracle.com", whereas presumably we really only want to exclude cases where the bad word is an actual "word". And that, as the discussion above illustrates, is the hard bit
Bad Words in Email
Arvind, May 19, 2016 - 10:25 am UTC
Thanks Everyone