Hey Tom,
we have a case where
1. If first character is - (hypen) or space, then strip it out and give o/p
2. If their is an space before or after hypen, strip it out and give o/p
3. If their is an space before or after apostrophe, strip it out and give o/p
So, I decided to go with regullar expression, and my query is like this
create table regex_matching (
sno number(10) not null constraint regex_pk primary key,
c_pattern varchar2(50) not null);
insert into regex_matching values (1, '- My name');
insert into regex_matching values (2, ' My name');
insert into regex_matching values (3, 'My - name ');
insert into regex_matching values (4, 'Why ‘my’ name');
insert into regex_matching values (5, 'Why ‘ my ’ name');
insert into regex_matching values (6, 'Why ‘my ’ name');
insert into regex_matching values (7, 'Why my name');
select case
when REGEXP_COUNT(c_pattern, '^[-| ]{1,}',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, '^[-| ]{1,}', '')
when REGEXP_COUNT(c_pattern, ' ?- ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?- ?', '-')
when REGEXP_COUNT(c_pattern, ' ?‘ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?‘ ?', '‘')
when REGEXP_COUNT(c_pattern, ' ?’ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?’ ?', '’')
else c_pattern
end as "regexp_value"
from regex_matching where sno=1;
regexp_value
--------------------------------------------------------------------------------
My nameWith this query, I could achive desired results set for sno=1, 2 & 3. But with this query, for sno 4, 5 & 6 the o/p is
select case
when REGEXP_COUNT(c_pattern, '^[-| ]{1,}',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, '^[-| ]{1,}', '')
when REGEXP_COUNT(c_pattern, ' ?- ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?- ?', '-')
when REGEXP_COUNT(c_pattern, ' ?‘ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?‘ ?', '‘')
when REGEXP_COUNT(c_pattern, ' ?’ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?’ ?', '’')
else c_pattern
end as "regexp_value"
from regex_matching where sno=6
SQL> /
regexp_value
--------------------------------------------------------------------------------
Why???my???nameeven more suprising is, when line no. 4 is modified (the regexp_replace expression), the result changes totally. I understand that pattern matched is deleted, but what about the closing apostrophy
select case
when REGEXP_COUNT(c_pattern, '^[-| ]{1,}',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, '^[-| ]{1,}', '')
when REGEXP_COUNT(c_pattern, ' ?- ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?- ?', '-')
when REGEXP_COUNT(c_pattern, ' ?‘ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?‘ ?', '')
when REGEXP_COUNT(c_pattern, ' ?’ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?’ ?', '’')
else c_pattern
end as "regexp_value"
from regex_matching where sno=4;
regexp_value
--------------------------------------------------------------------------------
Whymynamewhere as their is no effect on o/p if line no 5 (the regexp_replace expression) is modified
select case
when REGEXP_COUNT(c_pattern, '^[-| ]{1,}',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, '^[-| ]{1,}', '')
when REGEXP_COUNT(c_pattern, ' ?- ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?- ?', '-')
when REGEXP_COUNT(c_pattern, ' ?‘ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?‘ ?', '‘')
when REGEXP_COUNT(c_pattern, ' ?’ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?’ ?', '')
else c_pattern
end as "regexp_value"
from regex_matching where sno=4;
regexp_value
--------------------------------------------------------------------------------
Why???my???name
So, my questions are
1. Is their any other better options to achieve this result (I mean other then regullar expression)
2. why opening apostrophe is replaced by question marks
3. why closing apostrophe has no effect on the o/p
I am using 12.2 version
There's a general issue in that the case expression will only process one of the when clauses. So if you have opening and closing quotes in the input, only one of these will get replaced.
You could do this by nesting replaces inside each other. e.g.
replace ( replace ( c_pattern, ' -', '-' ), '- ', '-' )
But that gets messy fast!
An alternative is a regex that searches for any of your desired characters with optional spaces. e.g.:
\s{0,1}([-‘’])\s{0,1}
Then use a backreference (\1) to return the matched character:
select c_pattern,
regexp_replace (
c_pattern,
'\s{0,1}([-‘’])\s{0,1}', '\1'
) strip_spaces,
case
when REGEXP_COUNT(c_pattern, '^[-| ]{1,}',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, '^[-| ]{1,}', '')
when REGEXP_COUNT(c_pattern, ' ?- ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?- ?', '-')
when REGEXP_COUNT(c_pattern, ' ?‘ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?‘ ?', '')
when REGEXP_COUNT(c_pattern, ' ?’ ?',1, 'i')>=1 then REGEXP_REPLACE(c_pattern, ' ?’ ?', '’')
else c_pattern
end as "regexp_value"
from regex_matching;
C_PATTERN STRIP_SPACES regexp_value
- My name -My name My name
My name My name My name
My - name My-name My-name
Why ‘my’ name Why‘my’name Whymy’ name
Why ‘ my ’ name Why‘my’name Whymy ’ name
Why ‘my ’ name Why‘my’name Whymy ’ name
Why my name Why my name Why my name
To strip the leading space or hyphen, you could call the result of this in another regex. Or if you're feeling brave try to do it all in one!
I'm not sure why you're getting the quotes appearing quotation marks or replacing them all. But it's almost certainly a character set conversion issue. Check how these are stored using dump or ascii to get the character code. And ensure your client can display this character!