Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Joachim.

Asked: August 15, 2017 - 11:06 am UTC

Last updated: August 16, 2017 - 1:56 pm UTC

Version: 12.1

Viewed 1000+ times

You Asked

I want to globaly predefine regexp anchor characters ^ and $ and
support traditional like metacharacters _ and %

SELECT first_name
  FROM employees 
 WHERE regexp_like(first_name, 
     '^('                                     -- globally predefine begin of string 
     || regexp_replace( regexp_replace(
     'Daniel%|G_ra_d|% %|Patrick|Ste(v|ph)en' -- test pattern using traditional like metacharacters and regexp metacharacters
     ,'_','.')                                -- replace like metachareacter '_' with regexp pattern '.' 
     ,'%','.*')                               -- replace like metacharacter '%' with regexp pattern '.*' 
     || ')$'                                  -- globally predefine end of string 
 )
 ORDER BY first_name
/


1. question:

are the inner two regexp_replace statements executed once ore multiple times (also if substitution or bind variables are used)


2. question:

to support also queries for _ and % as literals the idea is to escape them: \_ and \%
(... as all the other regexp metacharacters can be escaped)

maybe you can help me with regexp_replace

'abc\_def' --> abs_def
'abc_def' --> abc.def

'abc\%def --> abc%def
'abc%def --> abc.*def


with LiveSQL Test Case:

and Chris said...

1. Regexp_replace is non-deterministic. So the database will call it once for each row it processes.

2. So you want to preserve underscore and percent if they've been escaped?

It's a bit clumsy, but you could use replace to map \_ and \% to some other characters first. Then replace _ and %. Finally the originally replaced characters back to _ and %:

with str as (
  select 'abc\_def_ghi\%jkl%mno' s from dual
)
  select replace (replace (replace (replace (replace (replace ( 
           s , '\_', '####') ,
           '\%', '@@@@') , 
           '_', '.'),
           '%', '.*'),
           '####', '_'),
           '@@@@', '%'
         ) repl, 
         s
  from   str;

REPL                  S                      
abc_def.ghi%jkl.*mno  abc\_def_ghi\%jkl%mno 


Obviously, you have to be sure #### and @@@@ aren't in your source!

The advantage of this is replace is deterministic. So the database can call it once for each input value instead of each row. So this can drop down to just six calls.

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Joachim Forster, August 15, 2017 - 9:15 pm UTC

finally i found a solution with regexp_replace
(escated \_ und \% are passed as \_ and \% to outer regexp which is ok too)

... unfortunately regexp_replace called for every row that is processed

3. question:

is there a reason why regexp_replace is non deterministic (compared to replace)?!?

with str as (
  select 'abc\_def_ghi\%jkl%mno' s from dual
)
  select regexp_replace ( regexp_replace ( 
            s,'([^\]|^)_','\1.') 
           ,'([^\]|^)%','\1.*')  
           repl
          ,s  
  from   str
/


REPL                    S
abc\_def.ghi\%jkl.*mno  abc\_def_ghi\%jkl%mno


Chris Saxon
August 16, 2017 - 1:56 pm UTC

Because the regular expression you write may not be deterministic!

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.