Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Mitchell.

Asked: October 31, 2017 - 5:12 pm UTC

Last updated: November 09, 2017 - 11:37 am UTC

Version: 12.1.0.2

Viewed 1000+ times

You Asked

Tom:
I am trying to use this function to validate email addresses.
I am reading that the regexp functions have a limitation of 512 bytes.
Is there a way around this or a better way?
Thanks!
Mitchell

CREATE OR REPLACE FUNCTION is_valid_email_address(p_email IN VARCHAR2)
RETURN CHAR  IS
cemailregexp CONSTANT VARCHAR2(1000) := '^[a-z0-9!#$%&''*+/=?^_`{|}~-]+(\.[a-z0-9!#$%&''*+/=?^_`{|}~-]+)*@([a-z0-9]([a-z0-9-]*[a-z0-9])?\.)+([A-Z]{2}|arpa|biz|com|info|intww|name|net|org|pro|aero|asia|cat|coop|edu|gov|jobs|mil|mobi|museum|pro|tel|travel|post|fund|attorney|mortgage|network|properties|property|realtor|realty|re|rent|loans|finance|link|zone|builders|business|mls|estate|xyz|global|company|capital|icloud|win|kr|agency|homes|house|win|group|law|casa|cbre|work|team|investments|partners|management|consulting|services|legal|vet|bet|space|one|club|hockey|football|baseball)$';
BEGIN
IF regexp_like(p_email,cemailregexp,'i') THEN
RETURN 'Y';
ELSE RETURN 'N';
END IF;
EXCEPTION
WHEN OTHERS THEN
RETURN 'N';
END;
/


and Chris said...

Nope, the maximum length of regular expressions is 512 bytes as you say.

Is there a better way?

Don't bother!

Building a perfect email regex is effectively impossible. It's a lot of effort and you'll still end up annoying ~1% of your customers when it rejects their email address. Just check it contains an @ symbol and leave it at that.

But if you really must write an email regex validator, read this:

https://www.regular-expressions.info/email.html

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

DBMS_REDACT Email address pattern

Rajeshwaran Jeyabal, November 01, 2017 - 2:30 am UTC

Team,

How about using Email address pattern from DBMS_REDACT API ?

rajesh@ORA12C> variable x varchar2(50);
rajesh@ORA12C> exec :x :='hello_world@gmail.com';

PL/SQL procedure successfully completed.

rajesh@ORA12C> begin
  2     if ( regexp_replace ( :x ,dbms_redact.re_pattern_email_address,null,1,1,'i') is null ) then
  3             dbms_output.put_line('True');
  4     else
  5             dbms_output.put_line('False');
  6     end if;
  7  end;
  8  /
True

PL/SQL procedure successfully completed.

rajesh@ORA12C> exec :x :='#$hello_world@gmail.com';

PL/SQL procedure successfully completed.

rajesh@ORA12C> begin
  2     if ( regexp_replace ( :x ,dbms_redact.re_pattern_email_address,null,1,1,'i') is null ) then
  3             dbms_output.put_line('True');
  4     else
  5             dbms_output.put_line('False');
  6     end if;
  7  end;
  8  /
False

PL/SQL procedure successfully completed.

rajesh@ORA12C>

Chris Saxon
November 01, 2017 - 4:31 pm UTC

Nope. Top-level domains can be longer than four characters now. But this doesn't take these into account:

declare
  x varchar2(320) := 'fabio@disapproved.solutions';
begin
   if ( regexp_replace ( x ,dbms_redact.re_pattern_email_address,null,1,1,'i') is null ) then
          dbms_output.put_line('True');
   else
           dbms_output.put_line('False');
   end if;
end;
/

False

Redact-Problem?

Racer I., November 02, 2017 - 9:28 am UTC

Hi,

Maybe the limit of the redact-pattern should be brought to Oracles attention? In case someone uses this feature to hide email adresses via a redact-policy the new longer adresses would be visible or maybe get exported during supposedly anonymized exports.

The problem would appear to be deeper than changing the RE-constant :
http://www.morganslibrary.org/reference/pkgs/dbms_redact.html

since that one is only a string passed to the add_policy method. So a change of the constant would also need a refresh of the policies using it. A cleaner technique would require the redact system to store the constant as a placeholder and apply policies using the current value at runtime to track changes dynamically.

Chris Saxon
November 09, 2017 - 11:37 am UTC

The regular expression doesn't have the start and end of line anchors (^ and $). So it will still mask emails within a string. Just not necessarily the whole email:

grant create session, unlimited tablespace to u identified by u;

create table u.t (
  email varchar2(320)
);

insert into u.t values ('hello_world@gmail.com');
insert into u.t values ('#$hello_world@gmail.com');
insert into u.t values ('fabio@disapproved.solutions');
commit;

BEGIN
  DBMS_REDACT.add_policy(
    object_schema => 'U',
    object_name   => 'T',
    column_name   => 'EMAIL',
    policy_name   => 'redact_email',
    function_type => dbms_redact.regexp,
    regexp_pattern => dbms_redact.re_pattern_email_address,
    regexp_replace_string => dbms_redact.RE_REDACT_EMAIL_ENTIRE,
    regexp_position => dbms_redact.re_beginning,
    regexp_occurrence => dbms_redact.re_all,
    expression    => '1=1'
  );
END;
/

conn u/u

select * from t;

EMAIL                 
xxxx@xxxxx.com        
#$xxxx@xxxxx.com      
xxxx@xxxxx.comtions


The problem comes if you're trying to use this to validate that a string only contains an email address. Which it doesn't do.

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.