Excellent
A reader, December 12, 2001 - 2:42 pm UTC
I'm sure I'll need it one day.
Funny that Oracle hasn't made it a built in yet.
Pattern Matching within PL-SQL
Pascal, December 13, 2001 - 4:27 am UTC
Hi Tom ...
Thank you very much for taking all the Trouble to answer my Query....
I went to </code>
http://www.savarese.org/oro/software/OROMatcher1.1.html <code>
and downloaded OROMatcher1.1
I am now trying to load ORO Classes into my Oracle Schema but look what i get instead....
First i jar-ed all the .class Files from Folder \com\oroinc\text\regex into myload.jar File
and then i used loadjava : as Follows :
C:\java\ORO\OROa\com\oroinc\text\regex>loadjava -user pascal/pascal@my_host:1521:sid -v -f -r -t -s -g public myload.jar
initialization complete
loading : Perl5StreamInput
creating : Perl5StreamInput
loading : OpCode
creating : OpCode
loading : Substitution
creating : Substitution
loading : PatternMatcherInput
creating : PatternMatcherInput
loading : Perl5Debug
creating : Perl5Debug
loading : Perl5Compiler
creating : Perl5Compiler
loading : Perl5MatchResult
creating : Perl5MatchResult
loading : Util
creating : Util
loading : Perl5Matcher
creating : Perl5Matcher
loading : Perl5Pattern
creating : Perl5Pattern
loading : Perl5Substitution
creating : Perl5Substitution
loading : PatternMatcher
creating : PatternMatcher
loading : MatchResult
creating : MatchResult
loading : Pattern
creating : Pattern
loading : Perl5Repetition
creating : Perl5Repetition
loading : PatternCompiler
creating : PatternCompiler
loading : StringSubstitution
creating : StringSubstitution
loading : MalformedPatternException
creating : MalformedPatternException
loading : CharStringPointer
creating : CharStringPointer
loading : META-INF/MANIFEST.MF
creating : META-INF/MANIFEST.MF
Error while granting to Perl5StreamInput
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to OpCode
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Substitution
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to PatternMatcherInput
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Perl5Debug
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Perl5Compiler
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Perl5MatchResult
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Util
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Perl5Matcher
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Perl5Pattern
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Perl5Substitution
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to PatternMatcher
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to MatchResult
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Pattern
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to Perl5Repetition
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to PatternCompiler
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to StringSubstitution
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to MalformedPatternException
ORA-04042: procedure, function, package, or package body does not exist
Error while granting to CharStringPointer
ORA-04042: procedure, function, package, or package body does not exist
resolver :
resolving: Perl5StreamInput
Error while resolving class Perl5StreamInput
ORA-04043: object Perl5StreamInput does not exist
resolving: OpCode
Error while resolving class OpCode
ORA-04043: object OpCode does not exist
resolving: Substitution
Error while resolving class Substitution
ORA-04043: object Substitution does not exist
resolving: PatternMatcherInput
Error while resolving class PatternMatcherInput
ORA-04043: object PatternMatcherInput does not exist
resolving: Perl5Debug
Error while resolving class Perl5Debug
ORA-04043: object Perl5Debug does not exist
resolving: Perl5Compiler
Error while resolving class Perl5Compiler
ORA-04043: object Perl5Compiler does not exist
resolving: Perl5MatchResult
Error while resolving class Perl5MatchResult
ORA-04043: object Perl5MatchResult does not exist
resolving: Util
Error while resolving class Util
ORA-04043: object Util does not exist
resolving: Perl5Matcher
Error while resolving class Perl5Matcher
ORA-04043: object Perl5Matcher does not exist
resolving: Perl5Pattern
Error while resolving class Perl5Pattern
ORA-04043: object Perl5Pattern does not exist
resolving: Perl5Substitution
Error while resolving class Perl5Substitution
ORA-04043: object Perl5Substitution does not exist
resolving: PatternMatcher
Error while resolving class PatternMatcher
ORA-04043: object PatternMatcher does not exist
resolving: MatchResult
Error while resolving class MatchResult
ORA-04043: object MatchResult does not exist
resolving: Pattern
Error while resolving class Pattern
ORA-04043: object Pattern does not exist
resolving: Perl5Repetition
Error while resolving class Perl5Repetition
ORA-04043: object Perl5Repetition does not exist
resolving: PatternCompiler
Error while resolving class PatternCompiler
ORA-04043: object PatternCompiler does not exist
resolving: StringSubstitution
Error while resolving class StringSubstitution
ORA-04043: object StringSubstitution does not exist
resolving: MalformedPatternException
Error while resolving class MalformedPatternException
ORA-04043: object MalformedPatternException does not exist
resolving: CharStringPointer
Error while resolving class CharStringPointer
ORA-04043: object CharStringPointer does not exist
synonym : Perl5StreamInput
synonym : OpCode
synonym : Substitution
synonym : PatternMatcherInput
synonym : Perl5Debug
synonym : Perl5Compiler
synonym : Perl5MatchResult
synonym : Util
synonym : Perl5Matcher
synonym : Perl5Pattern
synonym : Perl5Substitution
synonym : PatternMatcher
synonym : MatchResult
synonym : Pattern
synonym : Perl5Repetition
synonym : PatternCompiler
synonym : StringSubstitution
synonym : MalformedPatternException
synonym : CharStringPointer
loadjava: 38 errors
There i got stuck,
i have jdk 1.4 Beta on my Windwos 2000 Client ....
Oracle Back-end is on Solaris Machine.
My CLASSPAPH ENV variable is correctly set and i also added the /com path in the list.
Please, i would like more help on how t load these classes into Oracle so that i can test your pattern Matcher program.
Best Regards,
Pascal
December 13, 2001 - 9:39 am UTC
1. Download </code>
http://www.savarese.org/oro/downloads/download/OROMatcher-1.1.zip <code>
2. Unzip this file. Make sure that you have your un-zipping program create the directories.
You should have a tree like this:
C:\temp\OROMatcher-1.1.0a>tree
Folder PATH listing
Volume serial number is 71FAE346 1C7F:F8DB
C:.
+---com
¦ +---oroinc
¦ +---text
¦ +---regex
+---doc
¦ +---api
¦ ¦ +---images
¦ +---images
+---examples
+---images
3. Now we want to ignore the 'doc', 'examples', and 'images' directories. Put another way, we only want to include everything in the 'com' directory. I used the free utility called unzip.exe but use whatever you desire. Make sure that you have your zipping program includes the sub-directories. This command creates a file named ORO.ZIP.
C:\temp\OROMatcher-1.1.0a>zip -r oro.zip com
updating: com/ (256 bytes security) (stored 0%)
updating: com/oroinc/ (256 bytes security) (stored 0%)
updating: com/oroinc/text/ (256 bytes security) (stored 0%)
updating: com/oroinc/text/regex/ (256 bytes security) (stored 0%)
updating: com/oroinc/text/regex/CharStringPointer.class (188 bytes security) (deflated 50%)
updating: com/oroinc/text/regex/MalformedPatternException.class (188 bytes security) (deflated 30%)
updating: com/oroinc/text/regex/MatchResult.class (188 bytes security) (deflated 32%)
updating: com/oroinc/text/regex/OpCode.class (188 bytes security) (deflated 49%)
updating: com/oroinc/text/regex/Pattern.class (188 bytes security) (deflated 22%)
updating: com/oroinc/text/regex/PatternCompiler.class (188 bytes security) (deflated 51%)
updating: com/oroinc/text/regex/PatternMatcher.class (188 bytes security) (deflated 51%)
updating: com/oroinc/text/regex/PatternMatcherInput.class (188 bytes security) (deflated 58%)
updating: com/oroinc/text/regex/Perl5Compiler.class (188 bytes security) (deflated 51%)
updating: com/oroinc/text/regex/Perl5Debug.class (188 bytes security) (deflated 40%)
updating: com/oroinc/text/regex/Perl5Matcher.class (188 bytes security) (deflated 54%)
updating: com/oroinc/text/regex/Perl5MatchResult.class (188 bytes security) (deflated 49%)
updating: com/oroinc/text/regex/Perl5Pattern.class (188 bytes security) (deflated 40%)
updating: com/oroinc/text/regex/Perl5Repetition.class (188 bytes security) (deflated 36%)
updating: com/oroinc/text/regex/Perl5StreamInput.class (188 bytes security) (deflated 43%)
updating: com/oroinc/text/regex/Perl5Substitution.class (188 bytes security) (deflated 49%)
updating: com/oroinc/text/regex/StringSubstitution.class (188 bytes security) (deflated 50%)
updating: com/oroinc/text/regex/Substitution.class (188 bytes security) (deflated 38%)
updating: com/oroinc/text/regex/Util.class (188 bytes security) (deflated 64%)
4. Let's double-check the contents of that file.
C:\temp\OROMatcher-1.1.0a>unzip -v oro.zip
Archive: oro.zip
Length Method Size Ratio Date Time CRC-32 Name
-------- ------ ------- ----- ---- ---- ------ ----
0 Stored 0 0% 12-13-01 09:11 00000000 com/
0 Stored 0 0% 12-13-01 09:11 00000000 com/oroinc/
0 Stored 0 0% 12-13-01 09:11 00000000 com/oroinc/text/
0 Stored 0 0% 12-13-01 09:11 00000000 com/oroinc/text/regex/
1458 Defl:N 735 50% 10-21-99 19:25 bbe7f548 com/oroinc/text/regex/CharStringPointer.class
331 Defl:N 232 30% 10-21-99 19:26 bf322bfe com/oroinc/text/regex/MalformedPatternException.class
379 Defl:N 258 32% 10-21-99 19:25 33232905 com/oroinc/text/regex/MatchResult.class
2400 Defl:N 1220 49% 10-21-99 19:25 5ebe5019 com/oroinc/text/regex/OpCode.class
241 Defl:N 188 22% 10-21-99 19:25 173e9b29 com/oroinc/text/regex/Pattern.class
507 Defl:N 248 51% 10-21-99 19:25 4782e5ad com/oroinc/text/regex/PatternCompiler.class
574 Defl:N 280 51% 10-21-99 19:25 fa3a5349 com/oroinc/text/regex/PatternMatcher.class
2354 Defl:N 989 58% 10-21-99 19:25 48e548c6 com/oroinc/text/regex/PatternMatcherInput.class
13501 Defl:N 6569 51% 10-21-99 19:26 ee171492 com/oroinc/text/regex/Perl5Compiler.class
2599 Defl:N 1563 40% 10-21-99 19:26 7d5322dc com/oroinc/text/regex/Perl5Debug.class
11287 Defl:N 5148 54% 10-21-99 19:26 f850bc2a com/oroinc/text/regex/Perl5Matcher.class
1198 Defl:N 612 49% 10-21-99 19:26 2e63a3dd com/oroinc/text/regex/Perl5MatchResult.class
891 Defl:N 537 40% 10-21-99 19:26 a87088c4 com/oroinc/text/regex/Perl5Pattern.class
468 Defl:N 298 36% 10-21-99 19:26 9becfff7 com/oroinc/text/regex/Perl5Repetition.class
2218 Defl:N 1259 43% 10-21-99 19:25 51a33c0c com/oroinc/text/regex/Perl5StreamInput.class
2624 Defl:N 1329 49% 10-21-99 19:27 d52ea45f com/oroinc/text/regex/Perl5Substitution.class
1009 Defl:N 503 50% 10-21-99 19:27 47fc9bae com/oroinc/text/regex/StringSubstitution.class
362 Defl:N 224 38% 10-21-99 19:26 743dd420 com/oroinc/text/regex/Substitution.class
2844 Defl:N 1030 64% 10-21-99 19:27 900ecc52 com/oroinc/text/regex/Util.class
-------- ------- --- -------
47245 23222 51% 23 files
5. This ORO.ZIP file is what you need to load into the database.
C:\temp\OROMatcher-1.1.0a>loadjava -u scott/tiger -f -r -v oro.zip
arguments: '-u' 'scott/tiger' '-f' '-r' '-v' 'oro.zip'
creating : class com/oroinc/text/regex/CharStringPointer
loading : class com/oroinc/text/regex/CharStringPointer
creating : class com/oroinc/text/regex/MalformedPatternException
loading : class com/oroinc/text/regex/MalformedPatternException
creating : class com/oroinc/text/regex/MatchResult
loading : class com/oroinc/text/regex/MatchResult
creating : class com/oroinc/text/regex/OpCode
loading : class com/oroinc/text/regex/OpCode
creating : class com/oroinc/text/regex/Pattern
loading : class com/oroinc/text/regex/Pattern
creating : class com/oroinc/text/regex/PatternCompiler
loading : class com/oroinc/text/regex/PatternCompiler
creating : class com/oroinc/text/regex/PatternMatcher
loading : class com/oroinc/text/regex/PatternMatcher
creating : class com/oroinc/text/regex/PatternMatcherInput
loading : class com/oroinc/text/regex/PatternMatcherInput
creating : class com/oroinc/text/regex/Perl5Compiler
loading : class com/oroinc/text/regex/Perl5Compiler
creating : class com/oroinc/text/regex/Perl5Debug
loading : class com/oroinc/text/regex/Perl5Debug
creating : class com/oroinc/text/regex/Perl5Matcher
loading : class com/oroinc/text/regex/Perl5Matcher
creating : class com/oroinc/text/regex/Perl5MatchResult
loading : class com/oroinc/text/regex/Perl5MatchResult
creating : class com/oroinc/text/regex/Perl5Pattern
loading : class com/oroinc/text/regex/Perl5Pattern
creating : class com/oroinc/text/regex/Perl5Repetition
loading : class com/oroinc/text/regex/Perl5Repetition
creating : class com/oroinc/text/regex/Perl5StreamInput
loading : class com/oroinc/text/regex/Perl5StreamInput
creating : class com/oroinc/text/regex/Perl5Substitution
loading : class com/oroinc/text/regex/Perl5Substitution
creating : class com/oroinc/text/regex/StringSubstitution
loading : class com/oroinc/text/regex/StringSubstitution
creating : class com/oroinc/text/regex/Substitution
loading : class com/oroinc/text/regex/Substitution
creating : class com/oroinc/text/regex/Util
loading : class com/oroinc/text/regex/Util
resolving: class com/oroinc/text/regex/CharStringPointer
resolving: class com/oroinc/text/regex/MalformedPatternException
resolving: class com/oroinc/text/regex/MatchResult
resolving: class com/oroinc/text/regex/OpCode
resolving: class com/oroinc/text/regex/Pattern
resolving: class com/oroinc/text/regex/PatternCompiler
resolving: class com/oroinc/text/regex/PatternMatcher
skipping : class com/oroinc/text/regex/PatternMatcherInput
resolving: class com/oroinc/text/regex/Perl5Compiler
resolving: class com/oroinc/text/regex/Perl5Debug
resolving: class com/oroinc/text/regex/Perl5Matcher
skipping : class com/oroinc/text/regex/Perl5MatchResult
skipping : class com/oroinc/text/regex/Perl5Pattern
skipping : class com/oroinc/text/regex/Perl5Repetition
skipping : class com/oroinc/text/regex/Perl5StreamInput
resolving: class com/oroinc/text/regex/Perl5Substitution
skipping : class com/oroinc/text/regex/StringSubstitution
skipping : class com/oroinc/text/regex/Substitution
resolving: class com/oroinc/text/regex/Util
How can I DO PATTERN MATCHING from within PL-SQL using REGULAR Expression , version 8.1.6
Pascal, December 13, 2001 - 8:22 am UTC
Hi Tom,
I managed to use Loadava to load the Class Files below into Oracle DB , but some are INVALID as indicated below:
I still managed to load your sample Java Stored procedure though , because i note that it does not use any of those INVALID classes .
I also modified your Code a little bit by writing :
import java.sql.*
instead of import oracle.sql.*
The Classes are:
Match INVALID
Perl5Debug INVALID
Perl5Compiler
Perl5MatchResult
Util INVALID
Perl5Matcher
Perl5Pattern
Perl5Substitution INVALID
PatternMatcher
MatchResult
Pattern
Perl5Repetition
PatternCompiler
StringSubstitution INVALID
MalformedPatternException
CharStringPointer
META-INF/MANIFEST.MF
RegExp
Substitution INVALID
OpCode
PatternMatcherInput
Perl5StreamInput
Some Tests for your program:
SQL> select * from reg_test;
ID
------------------------------
+49179232332
+4917223232
+49160232323
+4917423232
+25574123232
+25574432432
+25574234343
+4915634343
+4916334343
+3467832323
+346023232
346122342
+346243232
+346545454
+346823232
+34686
+346222423
+34623345353
+346322332
+3464323232
exec reg_exp_compile( '\346[34]+' );
SQL> select *
2 from reg_test
3 where reg_exp_match(id) = 1;
ID
------------------------------
+346322332
+3464323232
I am stll Testing it if it works for more Complex RegExps.
Once again, thank you very very much for this.
Best regards,
Pascal
I got it Right This time around!!..Everything now works fine.....
Pascal, December 14, 2001 - 8:09 am UTC
Hi Tom,
Thanks very much for your followup to my Query...i managed to install zipped /com Folder and i loaded all the class Files successfully....
But i have one more question,
Where does the Pattern that i compile gets stored , before i match it with some string, because in this case the
reg_exp_compile and
reg_exp_match and not in a package and so how does Oracle Remember the Compiled string before it gets executed .
Best Regards,
Pascal
December 14, 2001 - 8:17 am UTC
It's in the static attribute "pattern" in the java class. I broke compile out from match for performance reasons -- rather then compiling the string for each call or comparing the last compiled string to the current match string on each call -- we just call compile once and match as many times as we like. Just makes it faster...
public final class RegExp
{
static PatternMatcher matcher;
static PatternCompiler compiler;
static {
compiler = new Perl5Compiler();
matcher = new Perl5Matcher();
}
static Pattern pattern;
static String last_test_pattern = null;
static PatternMatcherInput input = new PatternMatcherInput("");
public static final void compile( String test_pattern ) throws Exception
{
pattern = compiler.compile( test_pattern );
}
public static final int match(String test_input) throws Exception
{
input.setInput( test_input );
return matcher.contains(input, pattern) ? 1 : 0;
}
}
Combining reg_exp_compile and reg_exp_match in One CALL
pascal, December 27, 2001 - 4:10 am UTC
Hi Tom and Mark,
Happy New Year in advance to u Guyz!
I was thinking of a way to combine
reg_exp_compile and
reg_exp_match so that they could be called in a Single SQL statement , instead of one procedural call and one SQL.
How could this be achieved.
Thanks,
pascal
************************************************
exec reg_exp_compile( 'ALL.*OBJ' );
select object_name
from all_objects
where reg_exp_match(object_name) = 1
***********************************************
December 27, 2001 - 9:03 am UTC
You can -- and in fact the first version we wrote did that -- however, the overhead of doing this was considered too large.
You are going to call reg_exp_match many times, perhaps 10's of thousands or even more. Every bit of extra work that it must do is overhead that should be avoided.
You can add a method:
public final class RegExp
{
static PatternMatcher matcher;
static PatternCompiler compiler;
static {
compiler = new Perl5Compiler();
matcher = new Perl5Matcher();
}
static Pattern pattern;
static String last_test_pattern = null;
static PatternMatcherInput input = new PatternMatcherInput("");
public static final void compile( String test_pattern ) throws Exception
{
pattern = compiler.compile( test_pattern );
}
public static final int match(String test_input) throws Exception
{
input.setInput( test_input );
return matcher.contains(input, pattern) ? 1 : 0;
}
public static final int match2( String test_input, String test_pattern )
throws Exception
{
-- compare test_pattern to last_test_pattern --
-- if different
compile( test_pattern );
last_test_pattern = test_pattern
end if
return match( test_input )
}
}
/
The overhead of doing that string compare on every call adds up over time. We didn't like that so didn't do it that way (but apparently we left the last_test_pattern in there!)
Follow-up
Pascal, January 09, 2002 - 3:25 am UTC
Hi Tom amd Mark,
How are you guys doing..
Thanks for your previous follow-up on my question...
I am planning to use this regExp package as follows:
I have a DB Table which has a list of Operators and their corresponding Patterns , which matches all mobile numbers for a particular operator.
I have a c++ program which feeds the database with some short messages , and i need to compare each message so that i could extract an Operator by checking the originating phone numbers from the short messages ...and so, that means i have to compile and match fo each and every short message that i receive and there could be millios of them....
Do you think i could use this package for the above mentioned purpose without incurring performance problems?
Also, i don't have much experience with java, and so could you help me write that match_2 Compare logic if you have time please ?
Thanks in advance,
Best regards,
Pasko
January 09, 2002 - 8:50 am UTC
Millions -- I would not bank on this being "fast". You can try Native Compilation which will speed up the java (possibly by alot).
I would definitely suggest NOT using the match_2 method if you plan on doing this lots -- it really does add lots of overhead.
public static final int match2( String test_input, String test_pattern )
throws Exception
{
if (last_test_pattern == null || !test_pattern.isEqual(last_test_pattern))
{
compile(test_pattern);
last_test_pattern = test_pattern;
}
return match( test_input )
}
should do it.
I ordered ur Book !
Pascal, January 10, 2002 - 4:05 am UTC
Hi Tom/Mark,
Thanks again for your follow-up on my Qn.
I added match2 to regExp class you sent me earlier but compiled with errors on jdk 1.4 beta .
regExp.java:33: cannot resolve symbol
symbol : method isEqual (java.lang.String)
location: class java.lang.String
if (last_test_pattern == null || !test_pattern.isEqual(last_test_pattern
))
^
1 error
But i changed : your 'isEqual' method TO 'equals' and it compiled successfully :
if (last_test_pattern == null || !test_pattern.isEqual(last_test_pattern))
TO:
if (last_test_pattern == null || !test_pattern.equals(last_test_pattern))
I am still testing this to see if i get any performance improvements.
Thanks very much for your support .
Today i ordered your BOOK and i am looking forward to getting it soon , because it is not available in most of the English Bookshops here in Germany.
Best regards,
Pasko
How about using this .....
Pascal, February 08, 2002 - 11:36 am UTC
Hi Tom/Mark,
I went to this Link and found that they also have another Tool called PerlTools-1.2.zip
at :
</code>
http://www.savarese.org/oro/downloads/index.html#PerlTools <code>
So i downloaded it and wrote my simple Wrapper JSP.
It looks like :
create or replace and resolve java source named "PerlExp" as
import com.oroinc.text.regex.*;
import com.oroinc.text.perl.*;
import oracle.sql.*;
public final class PerlExp
{
static Perl5Util perl;
static {
perl = new Perl5Util();
}
public static final int match( String p_pattern , String p_input )
{
p_pattern = "/" + p_pattern + "/" ;
return perl.match(p_pattern , p_input) ? 1 : 0 ;
}
}
/
create or replace function perl_match(
p_pattern IN varchar2 , p_input IN varchar2 )
return number
as language java name 'PerlExp.match(java.lang.String , java.lang.String ) return int';
/
Soi could just code in a Siingle sQL as : from your Example:
select * from all_objects where
perl_match( 'ALL.*OBJ' , object_name ) = 1
What do you think is the Performance of this one compared to the Earlier procedure.
Best Regards,
Pascal
February 08, 2002 - 4:40 pm UTC
I don't know -- benchmark it for us and let us know ;)
I will guess:
the original implementation will be faster.
o it does not create a temporary string "/" + pattern + "/"
o it does not compile the pattern for each row.
Function-based Index for Java functions
Tony, December 24, 2002 - 1:43 am UTC
Do we have to create function-based index to speed up the performance in pattern serach using this Java function?.
How to make a deterministic Java function for the above case?
December 24, 2002 - 8:30 am UTC
pattern matching isn't something an index is going to help. You have to inspect each and every record -- no chance to short circut it.
what exactly would you INDEX? there isn't anything...
Error during compilation
Sasa Petkovic, February 17, 2003 - 4:19 am UTC
Hi Tom,
I successfully loaded all the classes from or.zip but got an error during compilation java source "RegExp".
There is an output:
CREATE OR REPLACE AND resolve java source named "RegExp" AS
*
ERROR at line 1:
ORA-29536: badly formed source: Encountered "PUBLIC" at line 4, column 1.
Was expecting one of:
<EOF>
"#sql" ...
"abstract" ...
"final" ...
"public" ...
"class" ...
"import" ...
"interface" ...
";" ...
"\\u001a" ...
Could you help please.
Sasa
February 17, 2003 - 11:22 am UTC
what version of sqlplus are you using?
and do it with SET ECHO ON
and show us the ENTIRE cut and paste -- you must either
o have a typo
o be using a client that doesn't know what to do with create or replace java code.
Sasa Petkovic, February 17, 2003 - 11:32 am UTC
Hi Tom,
I did as you recomended and got as follows:
SQL*Plus: Release 8.1.7.0.0 - Production on Mon Feb 17 17:25:27 2003
(c) Copyright 2000 Oracle Corporation. All rights reserved.
Connected to:
Oracle8i Enterprise Edition Release 8.1.7.3.0 - Production
With the Partitioning option
JServer Release 8.1.7.3.0 - Production
SQL>
SQL> CREATE OR REPLACE AND resolve java source named "RegExp" AS
2 import com.oroinc.text.regex.*;
3 import oracle.SQL.*;
4
5 PUBLIC final class RegExp
6 {
7 static PatternMatcher matcher;
8 static PatternCompiler compiler;
9
10 static {
11 compiler = NEW Perl5Compiler();
12 matcher = NEW Perl5Matcher();
13 }
14 static Pattern pattern;
15 static String last_test_pattern = NULL;
16 static PatternMatcherInput input = NEW PatternMatcherInput("");
17
18 PUBLIC static final void COMPILE( String test_pattern ) throws EXCEPTION
19 {
20 pattern = compiler.COMPILE( test_pattern );
21 }
22
23 PUBLIC static final INT match(String test_input) throws EXCEPTION
24 {
25 input.setInput( test_input );
26 RETURN matcher.contains(input, pattern) ? 1 : 0;
27 }
28 }
29 /
CREATE OR REPLACE AND resolve java source named "RegExp" AS
*
ERROR at line 1:
ORA-29536: badly formed source: Encountered "PUBLIC" at line 4, column 1.
Was expecting one of:
<EOF>
"#sql" ...
"abstract" ...
"final" ...
"public" ...
"class" ...
"import" ...
"interface" ...
";" ...
"\\u001a" ...
SQL>
Cut and paste whole input and output.
Regards,
Sasa
PS
Run from TOAD (calling sql plus from TOAD)
February 17, 2003 - 12:09 pm UTC
so, why did you change my example and capitalize the word public? Java is a case sensitive language.
Use the code I provided as I provided it and it'll work.
Change the case -- and you are on your own.
I'll betcha that neat "toad" thingy pretty printed that for you did it? well, java isn't plsql, "toad" isn't going to do good things to it. RETURN EXCEPTION -- all of those "shouting statements" will fail (why do people like keywords upper cased? I've always found that to be 100% ugly)
Built-in owa_pattern
Alexander, February 17, 2003 - 11:58 pm UTC
Maybe built-in package owa_pattern can be used? Comment from package:
The package OWA_PATTERN is a "regular expression" pattern matching
package. There are 3 fundamental subprograms in OWA_PATTERN.
The are: AMATCH, MATCH, and CHANGE.
TOAD capitalization effect
Sasa, February 18, 2003 - 1:18 am UTC
Hi Tom,
You are completely right. I didn't notice that I had capitalization effect option on in TOAD.
Now it works.Thanks again.
Regards,
Sasa
j., March 08, 2003 - 7:33 am UTC
hi tom,
I 've tried your sample - everything worked very well. but I've now two additional questions:
1st: what (in general) is the right way to remove all the database objects created for this? actually the tool I use (SQL navigator) doesn't show dependencies between stored programs "REG_EXP*" and their associated java classes.
2nd: I get the following output from USER_OBJECTS:
/10074bae_Substitution /184abd21_Perl5Repetition /22eba5d6_Perl5Matcher /2ca03ed7_MalformedPatternExce /74c0e441_Perl5Substitution /781a1e56_Perl5StreamInput /83c2e19d_Perl5Compiler /8a706168_StringSubstitution /a572ed52_PatternMatcher /af39100e_MatchResult /b3a68f05_Perl5Pattern /dbdcd9f3_PatternMatcherInput /dc521360_CharStringPointer /e8ec9c82_Perl5MatchResult /ea37ca8e_PatternCompiler /ede86355_Perl5Debug REG_EXP_COMPILE REG_EXP_MATCH RegExp RegExp com/oroinc/text/regex/OpCode com/oroinc/text/regex/Pattern com/oroinc/text/regex/Util
could you please explain, what has happened here, if there 's anything wrong? where did all classes except of Util, OpCode and Pattern get their (unexpected) names from, and is there any way to "rearrange" that?
thanks for your help.
and can be rearranged, that ALL classes include their pathes)?
anything wrong with the way I installed your example? could those entries with "/" be avoided/hidden
March 08, 2003 - 9:04 am UTC
1) use user dependencies -- but don't you know what you installed??? You drop that which you install and it goes away.
2) they are the short names (we need 30 characters) of the external interfaces available to you. You can use dbms_java.longname to get their "longer" name back.
How about this -- create a user regex. Put all this stuff in there -- and nothing more, nothing less. Then, if you want to remove it -- drop user regex cascade; Grant on that which you want exposed and the "ugly" object names don't exist as far as anyone else is concerned.
j., March 08, 2003 - 11:07 am UTC
30 characters only - I should have known.
thanks, DBMS_JAVA.LONGNAME helps very much to identify java classes through their fully qualified names.
but one problem is left (to me): I've already searched user_dependencies and didn't find links between PL/SQL code (REG_EXP_MATCH, REG_EXP_COMPILE) and java source (RegExp) there. so do one have to do a "code search" in order to check those dependencies?
March 08, 2003 - 11:17 am UTC
create or replace procedure reg_exp_compile(
p_test_input in varchar2)
as language java name 'RegExp.compile(java.lang.String) ';
/
See how it is done as a string there?
1 create or replace procedure reg_exp_compile(
2 p_test_input in varchar2)<b>
3* as language java name 'I_dontReally.ExistAnywhere(java.lang.String) ';</b>
ops$tkyte@ORA920> /
Procedure created.
ops$tkyte@ORA920> exec reg_exp_compile('xxx' );
BEGIN reg_exp_compile('xxx' ); END;
*
ERROR at line 1:
ORA-29540: class I_dontReally does not exist
ORA-06512: at "OPS$TKYTE.REG_EXP_COMPILE", line 0
ORA-06512: at line 1
it is sort of like dynamic SQL in that sense.
NullPointerException
A reader, April 25, 2003 - 12:25 pm UTC
I tried the following:
begin
reg_exp_compile('^[AB].*i$');
end;
select * from emp
where reg_exp_match(last_nm)=1
I got ORA-29532 NullPointerException
I changed this to
select * from emp
where last_nm is not null
reg_exp_match(last_nm)=1
and everything worked fine. It gave me all last names starting with A or B and ending with i.
So, apparenntly, the regexp package cannot handle a null input.
How can I fix this so that a null column doesnt match any regexp? So I dont have to keep adding this 'and column is not null' condition to any query using this regexp?
Thanks
April 25, 2003 - 3:45 pm UTC
just either add a check in the java or wrap yet another layer of plsql.
most efficient would be:
where decode( last_nm, null, 0, reg_exp_match(last_nm) ) = 1
as that would not even have the overhead of invoking the function (and your approach is "unsafe", we can and will flip flop the order of evaluation of the predicate AT WILL -- use the decode -- not the predicate - to ensure order of operations)
Flip-flop the order?
A reader, April 25, 2003 - 4:03 pm UTC
Oracle flip-flops the order of evaluation of predicates at will?
Doesnt it follow the basic short-circuit boolean evaluation?
So, if I have
where 1=0
and <bunch of other stuff>
This query should come out right away, since it is never going to match anything, right?
April 25, 2003 - 4:45 pm UTC
it is purposely "not said".
We can evaluate:
where A and B
as where A and then B or where B and then A.
You have NO real true 100% control over it.
Substituting?
A reader, April 25, 2003 - 4:51 pm UTC
I wrote a fancy regexp using this method and it is working like a charm. Saved me hours of work.
Logical extension: Can I use the Substitution methods in the ORO classes to perform my substitutions on the matches that I found? Using the \(...\) groupings in my compiled pattern and referencing them as \1, \2, etc in my UPDATE
i.e. something like
update table
set column=Substitute(my_pattern,'\1..\2..')
where reg_exp_match(...)=1
This would be awesome. Anyway to do this?
Thanks
April 25, 2003 - 7:39 pm UTC
You would have to look at the ORO classes and see if they support the "replace" sort of functionality.
Jakarta ORO
A reader, April 26, 2003 - 11:41 pm UTC
Seems the ORO software has moved to </code>
http://jakarta.apache.org <code>I downloaded ORO 2.0.7 and added the following method to your Java source
public static final String substitute(String p_input,String p_expression) throws Exception
{
Perl5Util util = new Perl5Util();
return util.substitute(p_expression,p_input);
}
Now I can do
UPDATE table
set column=substitute(column,'s/pattern/replacement/')
where ...
Also, changed your compile procedure to a function so that it can be selected from dual instead of having to run PL/SQL and then the SQL.
public static final String compile( String p_pattern ) throws Exception
{
try {
pattern = compiler.compile(p_pattern);
return "Pattern compiled successfully: " + p_pattern;
} catch (MalformedPatternException e) {
return "Error compiling pattern: " + e.getMessage();
}
}
Thanks.
April 27, 2003 - 8:22 am UTC
why
select regexp.compile( ? ) from dual;
is considered "easier" then
begin regexp.compile( ? ); end;
(especially when one takes more keystrokes to type......)
select from dual vs. pl/sql
A reader, April 27, 2003 - 9:15 pm UTC
Well, most front-end query apps here have no PL/SQL blocks in their code. They are 100% query apps, so this just makes it easier to provide a 'select from dual' interface to this feature rather than a 'begin ... end;' interface.
I guess its just a matter of personal preference...
fyi, how to make pattern_match work with backrefs
Dave Lowenstein, May 14, 2003 - 5:39 pm UTC
I searched high and low for how to make owa_pattern.match work with saving what it found in an array of back references (backrefs). I figure that asktom.oracle.com is a good place to put what I found. I pulled this out of thin air, as it wasn't in any docs i found.
Just like in Perl regular expressions, your backrefs need to be between parentheses to stand up and be counted. I don't know why i didn't think of that before. Here's a working example of what i spent a large portion of yesterday trying to make work. It finds any characters between two pipes and saves what it finds into an array:
declare
a owa_text.vc_arr;
b boolean;
c varchar(3000):='|abc|';
begin
b:=owa_pattern.match(line=>c, pat=>'|(\w*)|', backrefs=>a);
if a.exists(1)
then
for i in a.first..a.last
loop
dbms_output.put_line(a(i));
end loop;
end if;
end;
/
SQL>/
abc
PL/SQL procedure successfully completed.
Largest matching string
Arun Kalavath, July 11, 2003 - 4:23 am UTC
Hi Tom
I am trying to do something similar. I have a list of phone numbers and I want to return the largest matching number.
drop table t
2 /
Table dropped.
SQL>
SQL> create table t(pnum varchar2(15) )
2 /
Table created.
SQL> insert into t values ('0123456789')
2 /
1 row created.
SQL> insert into t values ('012345678')
2 /
1 row created.
SQL> insert into t values ('01234567')
2 /
1 row created.
SQL> insert into t values ('0123456')
2 /
1 row created.
SQL> insert into t values ('0123')
2 /
1 row created.
SQL>
SQL> create or replace procedure get_best_match (Arg_pnum in t.pnum%type )
2 as
3 result t.pnum%type;
4 BEGIN
5 SELECT MAX( pnum ) into result
6 FROM t
7 WHERE (pnum = substr( Arg_pnum, 1, 1 )
8 or pnum = substr( Arg_pnum, 1, 2 )
9 or pnum = substr( Arg_pnum, 1, 3 )
10 or pnum = substr( Arg_pnum, 1, 4 )
11 or pnum = substr( Arg_pnum, 1, 5 )
12 or pnum = substr( Arg_pnum, 1, 6 )
13 or pnum = substr( Arg_pnum, 1, 7 )
14 or pnum = substr( Arg_pnum, 1, 8 )
15 or pnum = substr( Arg_pnum, 1, 9 )
16 or pnum = substr( Arg_pnum, 1, 10 )
17 or pnum = substr( Arg_pnum, 1, 11 )
18 or pnum = substr( Arg_pnum, 1, 12)
19 or pnum = substr( Arg_pnum, 1, 13 )
20 or pnum = substr( Arg_pnum, 1, 14 )
21 or pnum = substr( Arg_pnum, 1, 15 ));
22 dbms_output.put_line (result);
23 end get_best_match;
24
25 /
Procedure created.
SQL> select * from t;
PNUM
---------------
0123456789
012345678
01234567
0123456
0123
SQL> set serveroutput on
SQL> exec get_best_match('0123');
0123
PL/SQL procedure successfully completed.
SQL> exec get_best_match('01231');
0123
PL/SQL procedure successfully completed.
SQL> exec get_best_match('0123456');
0123456
PL/SQL procedure successfully completed.
SQL> exec get_best_match('01234561');
0123456
PL/SQL procedure successfully completed.
SQL> exec get_best_match('01234562');
0123456
PL/SQL procedure successfully completed.
SQL> exec get_best_match('01234567');
01234567
PL/SQL procedure successfully completed.
I.e. I need to return the highest matching number. As you can see currently we use substring. Is there a better way to achive this?
Thank you
Arun
July 11, 2003 - 9:46 am UTC
ops$tkyte@ORA817DEV> variable n varchar2(20)
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV> exec :n := '0123'
PL/SQL procedure successfully completed.
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV> select substr( max(to_char(length(pnum),'fm00009') || pnum ), 6 )
2 from t
3 where :n like pnum||'%';
SUBSTR(MAX(TO_CH
----------------
0123
ops$tkyte@ORA817DEV>
ops$tkyte@ORA817DEV> exec :n := '01231'
PL/SQL procedure successfully completed.
ops$tkyte@ORA817DEV> /
SUBSTR(MAX(TO_CH
----------------
0123
ops$tkyte@ORA817DEV> exec :n := '0123456'
PL/SQL procedure successfully completed.
ops$tkyte@ORA817DEV> /
SUBSTR(MAX(TO_CH
----------------
0123456
ops$tkyte@ORA817DEV> exec :n := '01234561';
PL/SQL procedure successfully completed.
ops$tkyte@ORA817DEV> /
SUBSTR(MAX(TO_CH
----------------
0123456
ops$tkyte@ORA817DEV> exec :n := '01234562';
PL/SQL procedure successfully completed.
ops$tkyte@ORA817DEV> /
SUBSTR(MAX(TO_CH
----------------
0123456
ops$tkyte@ORA817DEV> exec :n := '01234567';
PL/SQL procedure successfully completed.
ops$tkyte@ORA817DEV> /
SUBSTR(MAX(TO_CH
----------------
01234567
Sorry, forgot version number
Arun Kalavath, July 11, 2003 - 4:26 am UTC
We use Oracle 8.1.7. Enterprise edition.
Thanks
Arun
Always does a full tablescan
Arun Kalavath, July 15, 2003 - 3:43 am UTC
Hi Tom
Thank you for this reply. Could you tell me why you used "substr( max(to_char(length(pnum),'fm00009') || pnum
), 6 )". As I understand it, this statments adds the length of the maximum value returned as a 'fm00009' to the max value and then we substring this from the 6th position. I got the same result in the same format when I did
select max(pnum) from t where :n like pnum||'%';
So what is the difference?
Question 2 :- I tried adding an index on pnum. But the query alway seems to do a full table scan. How can I get the query to use an index on pnum column? Or is this because I only got few records in the table?
SQL> create index xx on t (pnum);
Index created.
SQL> set autotrace traceonly
SQL> select max(pnum) from t where :n like pnum||'%';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'T'
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
1 consistent gets
0 physical reads
0 redo size
198 bytes sent via SQL*Net to client
311 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> drop index xx;
Index dropped.
SQL> create unique index xx on t (pnum);
Index created.
SQL> select max(pnum) from t where :n like pnum||'%';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'T'
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
1 consistent gets
0 physical reads
0 redo size
210 bytes sent via SQL*Net to client
311 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Thanks
Arun
July 15, 2003 - 9:41 am UTC
to to_char() trick was used to get the PNUM that had the maximum length, the max worked on the first characters -- getting the one with the maximum length.
I'm fairly sure "max" will work as well -- havent thought it all of the way out.
and of course it does a full scan -- it has to look at EACH and EVERY row. that -- and you are using the RBO and the RBO isn't very smart.
Consider if you used the CBO AND pnum was NOT NULL (or you added "and pnum is not null" to the query itself -- since nulls are not indexed in b*trees)
ops$tkyte@ORA920LAP> create table t
2 as
3 select rownum pnum, a.* from all_objects a;
Table created.
ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> create index t_idx on t(pnum);
Index created.
ops$tkyte@ORA920LAP> alter table t modify pnum NOT NULL;
Table altered.
ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> variable x varchar2(5)
ops$tkyte@ORA920LAP> exec :x := '1234'
PL/SQL procedure successfully completed.
ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> set autotrace on explain
ops$tkyte@ORA920LAP> select max(pnum) from t where :x like pnum ||'%';
MAX(PNUM)
----------
1234
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'T'
ops$tkyte@ORA920LAP> analyze table t compute statistics for table for all indexes for all indexed columns;
Table analyzed.
ops$tkyte@ORA920LAP> select max(pnum) from t where :x like pnum ||'%';
MAX(PNUM)
----------
1234
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FIRST ROW (Cost=2 Card=1532 Bytes=6128)
3 2 INDEX (FULL SCAN (MIN/MAX)) OF 'T_IDX' (NON-UNIQUE) (Cost=2 Card=1)
<b>the cbo will use the index as a skinny table and full scan it very efficiently</b>
Thank you Tom
Arun Kalavath, July 16, 2003 - 4:32 am UTC
A reader, September 12, 2003 - 10:13 am UTC
Can we achive(pattern matching) this in oracle 10g feature "Regular Expressions" ?
Thanks
September 12, 2003 - 10:51 am UTC
regular expressions are pattern matching, yes.
OWA_PATTERN
Tatiane, September 12, 2003 - 1:22 pm UTC
Sirs, Oracle already provides a package for handling regular expressions: OWA_PATTERN. I am using 9iR2. I imagine 10G will make it built-in in the database.
Tom, why the heck is it not listed anywhere in the online docs (like dbms_system is not either) ?
SQL> desc owa_pattern
FUNCTION AMATCH RETURNS NUMBER(38)
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN
FROM_LOC NUMBER(38) IN
PAT VARCHAR2 IN
FLAGS VARCHAR2 IN DEFAULT
FUNCTION AMATCH RETURNS NUMBER(38)
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN
FROM_LOC NUMBER(38) IN
PAT TABLE OF VARCHAR2(4) IN/OUT
FLAGS VARCHAR2 IN DEFAULT
FUNCTION AMATCH RETURNS NUMBER(38)
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN
FROM_LOC NUMBER(38) IN
PAT VARCHAR2 IN
BACKREFS TABLE OF VARCHAR2(32767) OUT
FLAGS VARCHAR2 IN DEFAULT
FUNCTION AMATCH RETURNS NUMBER(38)
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN
FROM_LOC NUMBER(38) IN
PAT TABLE OF VARCHAR2(4) IN/OUT
BACKREFS TABLE OF VARCHAR2(32767) OUT
FLAGS VARCHAR2 IN DEFAULT
FUNCTION CHANGE RETURNS NUMBER(38)
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN/OUT
FROM_STR VARCHAR2 IN
TO_STR VARCHAR2 IN
FLAGS VARCHAR2 IN DEFAULT
PROCEDURE CHANGE
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN/OUT
FROM_STR VARCHAR2 IN
TO_STR VARCHAR2 IN
FLAGS VARCHAR2 IN DEFAULT
FUNCTION CHANGE RETURNS NUMBER(38)
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
MLINE RECORD IN/OUT
ROWS TABLE OF VARCHAR2(32767) IN/OUT
NUM_ROWS NUMBER(38) IN/OUT
PARTIAL_ROW BOOLEAN IN/OUT
FROM_STR VARCHAR2 IN
TO_STR VARCHAR2 IN
FLAGS VARCHAR2 IN DEFAULT
PROCEDURE CHANGE
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
MLINE RECORD IN/OUT
ROWS TABLE OF VARCHAR2(32767) IN/OUT
NUM_ROWS NUMBER(38) IN/OUT
PARTIAL_ROW BOOLEAN IN/OUT
FROM_STR VARCHAR2 IN
TO_STR VARCHAR2 IN
FLAGS VARCHAR2 IN DEFAULT
PROCEDURE GETPAT
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
ARG VARCHAR2 IN
PAT TABLE OF VARCHAR2(4) IN/OUT
FUNCTION MATCH RETURNS BOOLEAN
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN
PAT VARCHAR2 IN
FLAGS VARCHAR2 IN DEFAULT
FUNCTION MATCH RETURNS BOOLEAN
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN
PAT TABLE OF VARCHAR2(4) IN/OUT
FLAGS VARCHAR2 IN DEFAULT
FUNCTION MATCH RETURNS BOOLEAN
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN
PAT VARCHAR2 IN
BACKREFS TABLE OF VARCHAR2(32767) OUT
FLAGS VARCHAR2 IN DEFAULT
FUNCTION MATCH RETURNS BOOLEAN
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
LINE VARCHAR2 IN
PAT TABLE OF VARCHAR2(4) IN/OUT
BACKREFS TABLE OF VARCHAR2(32767) OUT
FLAGS VARCHAR2 IN DEFAULT
FUNCTION MATCH RETURNS BOOLEAN
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
MLINE RECORD IN
ROWS TABLE OF VARCHAR2(32767) IN
NUM_ROWS NUMBER(38) IN
PARTIAL_ROW BOOLEAN IN
PAT VARCHAR2 IN
RLIST RECORD OUT
ROWS TABLE OF NUMBER(38) OUT
NUM_ROWS NUMBER(38) OUT
FLAGS VARCHAR2 IN DEFAULT
FUNCTION MATCH RETURNS BOOLEAN
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
MLINE RECORD IN
ROWS TABLE OF VARCHAR2(32767) IN
NUM_ROWS NUMBER(38) IN
PARTIAL_ROW BOOLEAN IN
PAT TABLE OF VARCHAR2(4) IN/OUT
RLIST RECORD OUT
ROWS TABLE OF NUMBER(38) OUT
NUM_ROWS NUMBER(38) OUT
FLAGS VARCHAR2 IN DEFAULT
September 12, 2003 - 2:12 pm UTC
owa_pattern, a very limited -- not too functional implementation -- is part of the web toolkit and is documented in the same place htp, htf, owa, owa_util, et. al. are.
in the ias documentation. it is part of the webserver.
owa_pattern , owa_text Installation
pasko, October 15, 2003 - 9:00 am UTC
Hi Tom,
i'm back again to my original query:
It seems there's a new Version again for OROMatcher stuff:
your initial version for reg_exp_match and reg_exp_compile still works though, with the addition of the new import statements as :
import org.apache.oro.text.regex.*;
import org.apache.oro.text.perl.*;
Download @:
</code>
http://jakarta.apache.org/oro/index.html <code>
They have even provided an executable which can directly be loaded in the Database with the loadjava Utility:
jakarta-oro-2.0.7.jar
------------------------------------------------------
A Question though about the owa_pattern and owa_match Packages :
-----------------------------------------------------
I note that, if i run the following scripts under the
$ORACLE_HOME/rdbms/admin Folder ,
@pubpat.sql
@pubtext.sql
@privpat.sql
@privtext.sql
i could also get most of the Regular Expressions functionality that i wanted.
The Limitations for this owa_pattern as listed in the Package are :
--------------------------
Possible future enhancements:
-----------------------------
* \B - match on non-"word" boundaries (between \w and \w, or \W and \W)
* "or" character matches:
change(text,'(Unix|unix)','UNIX') would change both occurances
* Using control character references:
A backslashed c followed by a single character, such as \cD, matches
the corresponding control character.
-- No support for:
-- \b == Backspace (in a character class)
-- \r == Carriage return
-- \f == Form feed
-- Modified support for:
-- \s == A whitespace charcter -> [ \t\n\r\f]
-- \S == A non-whitespace character
----------------------------------------------------
Are we allowed to install the packages as stand-alone Packages in other Applications which might require their usage ?
Thanks in advance.
October 15, 2003 - 6:20 pm UTC
in 8i and up -- they should just be "already there" -- they are part of the database, installed by default.
in 10g - the regex stuff is going to be fantastic.
How to pass Array of Pattern back and forth from Oracle to Java...
pasko, October 31, 2003 - 8:56 am UTC
Hi Tom,
I know regular Expressions are coming in 10G,
but in the meantime, i was just trying to figure out how to pass an Array of type: Pattern from your Example above:
i mean an array of Compiled Regular Expressions.
The Reason i am doing this is that i need to save the Compiled Patterns so that i could re-visit some of them again when doing the Matching without recompiling them again.
I know the solution in this case is optimzed for compile- once , match-many times using the same compiled Pattern.
Now i want to Optimize it further by caching some frequently Used Patterns and skip the Compilation process.
Something like this pseudo code :
// I get this from Oracle table
String[] tokens = { array_of_regular_Expressions } ;
// I need to keep track of some known number of
// Compiled patterns
Pattern[] patterns ; // Array of patterns
patterns = new Pattern[ tokens.length ];
for( int token=0; token < tokens.length; token++ ){
patterns[token] = compiler.compile( tokens[token] );
}
I know using some Examples on this site, i could pass
the String Array: tokens from Oracle to Java.
But how to pass an Array of Class Type : Pattern .
Thanks in advance.
November 01, 2003 - 11:42 am UTC
you would not pass an array of patterns? You would just make that a static data structure in the class itself. there is no SQL type that corresponds to "pattern", it is 100% java.
Just set up a hash table of "reg expression" to "compiled regex" and in the compile function -- just do a lookup in the hash table to see if you've compiled it and if so -- just set the pattern to that, else compile it, put it in the hash table and set the pattern to that.
Give the link..
Arangaperumal, July 12, 2004 - 7:29 am UTC
Hi Tom,
Kindly give the link to know about
OWA_PATTERN ( oracle 8.1.7).
I can not able find oracle document.
Thank you
July 12, 2004 - 11:30 am UTC
a working example
CHAG, September 21, 2004 - 5:51 pm UTC
declare
v_string_returned boolean;
BEGIN
v_string_returned :=owa_pattern.match('ABCn124x11xs', '\d');
END;
Non-greedy matching?
Mark, May 26, 2005 - 4:49 pm UTC
Hi Tom,
I have been toying with REGEXP_REPLACE to parse strings. The problem I am having is that this function seems to match greedily; I can't figure out how to make it match in a non-greedy way.
In case you are not familiar with the terms, this page gives a brief example explaining the difference between greedy and non-greedy:
http://www.channelping.com/practicalperl/?id=lectures/regex/greedy_vs_nongreedy
Example of what I'm trying to accomplish:
To extract this zip code from an address string, the following query appears to give just what I want:
SQL> select a, length(a) from (SELECT REGEXP_REPLACE('DEER PARK, NY 11729-2002 ', '^[^[:digit:]]*([[:digit:]-]+).*$', '\1') a from dual);
A LENGTH(A)
---------- ----------
11729-2002 10
However, try throwing a digit into the city name...
SQL> select a, length(a) from (SELECT REGEXP_REPLACE('DEER4 PARK, NY 11729-2002 ', '^[^[:digit:]]*([[:digit:]-]+).*$', '\1') a from dual);
A LENGTH(A)
- ----------
4 1
If this were PERL, we could replace the [^[:digit:]] with .*?, thus telling the parser to match all characters non-greedily up until the first digit. Unfortunately, attempting to do so with REGEXP_REPLACE yields this result:
SQL> select a, length(a) from (SELECT REGEXP_REPLACE('DEER PARK, NY 11729-2002 ', '^.*?([[:digit:]-]+).*?$', '\1') a from dual);
A LENGTH(A)
- ----------
2 1
So, is there any way to do a non-greedy match? (No city really has a digit in its name, but I hope I have illustrated the point.) I'm ready to load up jakarta.regexp, but I thought I'd ask the Oracle man first. :)
Thanks a lot,
Mark
May 26, 2005 - 7:40 pm UTC
I believe that right now in 10gr1, it is only the posix regex which is greedy, rumors of 10gr2 having perl mode too where it can be "not so greedy"
I know if no way to make it 'non-greedy'
Khalid Azmi, July 12, 2005 - 11:16 am UTC
How to find Mixed case values
A reader, August 19, 2005 - 6:49 pm UTC
Hello Sir.
How do I find values having mixed case in SQL.
I dont want all upper case or all lower case values.
Only values that have mixed casei.e values that have both capital as well as small letters. Text may also have some numeric values.
'ORACLE IS GOOD1' -- dont pick
'oracle is good2' -- dont pick
'Oracle is Good1' -- Pick this
'Oracle is gOod2' -- Pick this
create table mix (mix_col varchar2(100) ;
insert into mix values ('ORACLE IS GOOD1' ) ;
insert into mix values ('oracle is good2' ) ;
insert into mix values ('Oracle is Good1') ;
insert into mix values ('Oracle is gOod2') ;
August 20, 2005 - 4:57 pm UTC
ops$tkyte@ORA10G> select * from mix where upper(mix_col)<>mix_col and lower(mix_col)<>mix_col;
MIX_COL
-------------------------------------------------------------------------------
Oracle is Good1
Oracle is gOod2
Thanx
A reader, August 20, 2005 - 5:58 pm UTC
use of RegExp_Like
Kamran, September 23, 2005 - 7:03 am UTC
reference: All results used in these question are based on Oracle's default Scott.Emp table
All questions are about second parameter in Regexp_like function
/********** Question # 1**************************************/
select ename from emp
where regexp_like (ename, '[[A*B]$');
OR
select ename from emp
where regexp_like (ename, '[^[A*B]$');
above queries run but not the following one.
Now removing one bracket '[]' from each side:
select ename from emp
where regexp_like (ename, '[A*B$');
ERROR at line 2:
ORA-12726: unmatched bracket in regular expression
why such error is not raised in previous cases.
/***********************************************************/
/************** Question # 2 *******************************/
select ename from emp
where regexp_like (ename, '[A*B]');
using [A*B] or [A.B] or [A-B] results in same. is there any difference.
when to use [] and when not to use?
e.g. '^A' or '^[AL]' means starting with 'A'
where '^AL' or '^(AL)' mean starting with 'AL'
/***********************************************************/
/************** Question # 3 *******************************/
'A|N' means either starting with 'A' or ending with 'N'.
Result is:
ALLEN
MARTIN
ADAMS
How can we get names starting with 'A' and ending with 'N'?
Result should be:
ALLEN
Hopeful for a an elucidating response as ever.
September 23, 2005 - 9:37 am UTC
I'm not a regexp expert, but it would appear that [ starts the [] pair and the second [ is considered in the "set"
the second one is not "the same"
ops$tkyte@ORA10GR1> create table t ( x varchar2(5) );
Table created.
ops$tkyte@ORA10GR1> insert into t values ('*');
1 row created.
ops$tkyte@ORA10GR1> select * from t where regexp_like(x,'[A*B]');
X
-----
*
ops$tkyte@ORA10GR1> select * from t where regexp_like(x,'[A.B]');
no rows selected
Re: use of RegExp_Like
Anders, September 23, 2005 - 12:17 pm UTC
I'd like to take a stab at this if I may:
The character class markers [...] can not be nested, so '[' in a class just mean '['. Therefore, '[[A*B]$' just matches a string ended by either '[', 'A', '*' or 'B', and '[^[A*B]$' matches a string not ended by '[', 'A', '*' nor 'B'. But normally (outside a character class) '[' have the special meaning of starting a character class and since that have to be ended by a ']', the pattern '[A*B$' contains a syntax error. Had you wanted to match a string ending with the '[', 'A', '*' and 'B' in sequence, you could escape the '[' and '*' and write '\[A\*B$'. (This is just what Tom said).
The pattern '[A*B]' matches any string that contain either 'A', '*' or 'B', the pattern '[A.B]' matches any string that contain either 'A', '.' or 'B' and the pattern '[A-B]' matches any string that contain a character in the range between 'A' and 'B' which is just 'A' or 'B'. So the patterns are all different, but if your input string only contain letters (as emp.ename) you won't see the difference. But you could test it with something like "select v from (select chr(level) v from dual connect by level<=255) where regexp_like(v,'[a*b]')"
"when to use [] and when not to use?" You should use [abc] when you want to match any of a set of expressions. It is kind of a shorthand for (a|b|c). But note that the empty group (eg. the pattern '[]') is not allowed. ']' have no special meaning when placed in the very begining, just as '-' have not special meaning when placed at the begining or end of a group, so the pattern '[][-]' would match any string containing either ']', '[' or '-' (be cause the first ']' is a the start, the second '[' is inside and the '-' is at the end of a group).
'^A' matches any string stared by 'A', but '^[AL]' matches any string started by 'A' or 'L'.
Bot '^AL' and '^(AL)' matches any string started by the characters 'A' and 'L' in sequence by the second one also creates a grouping that you could refer to via a back reference.
'A|N' matches any string containing either 'A' or 'N'. As noted above [abc] is simelair to (a|b|c), so this pattern might also be written with the shorthand '[AN]'.
You could get names starting with 'A' and ending with 'N', by using the pattern '^A.*N$', but "ename like 'A%N'" would probably be faster (to write and execute).
You might want to check out:
</code>
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/ap_posix001.htm http://search.cpan.org/dist/perl/pod/perlre.pod#Version_8_Regular_Expressions <code>
(Oracles regexp documentation is lacking quite a bit, so you might get some inspiration/understanding by looking at the perl docs, but remember that oracles regexp is not perl-compatible so some perl-features is missing)
September 23, 2005 - 8:37 pm UTC
Thanks
Kamran, September 26, 2005 - 1:24 am UTC
All that reference stuff is very useful. Thanks Tom, and special thanks for Andres.
A reader, January 10, 2006 - 10:40 am UTC
Hi Tom,
My requirement is a little abstract and very difficult to explain but I will try my best.
We are working on a project right now where we are trying to analyze all records from a customer.
Consider a table like
create table names
(
fn1 varchar2(20),
fn2 varchar2(20),
ln1 varchar2(20),
ln2 varchar2(20)
);
insert into names ('ANTHONY', 'ANTHONY', 'JOHN', 'JOHN');
insert into names ('JOHNNY', 'KOHNNY', 'SOMELN', 'SOMELN2');
insert into names ('RACHEL', 'RACQUEL', 'KOLI', 'KOLI');
Now, the customer wants a way to retrieve records that follow some rules:
1. Records which are an exact match or where the first few characters match like the first record(comparison between fn1 and fn2 and ln1 and ln2).
2. Records which are similar matches like the second record where atleast the last five characters match in the firstname and the first six characters match in the lastname.
The inputs will be what to compare (firstname/lastname), how many consecutive characters in the name to compare (a numeric value).
Can we use some kind of pattern matching for this...
Please help.
Thanks.
regular expressions
mo, January 26, 2006 - 8:31 pm UTC
Tom:
I see a big reference that oracle 10g supports regular expressions like perl and awk.
1. Does not 8i and 9i support regular expression using owa_pattern.
2. Would you upgrade a database from 8i to 10g for the regular expression feature.
Thanks,
January 27, 2006 - 8:31 am UTC
1) rudimentary support in a plsql package is provided, as opposed to full posix (10gr1) and perl (10gr2) support compiled into the kernel.
2) that would not be my reason for doing that upgrade. It would be one of the thousands of other really big additions (and in particular the supportability - unless you freeze that 8i system, the OS it is on, the hardware, etc - you have a support issue)
regular expressions
mo, January 27, 2006 - 5:00 pm UTC
Tom:
What do you mean by full posix and perl support.
Do you mean owa_pattern provide baic regular expressions while regexp in 10g provides advanced capabilities like perl
January 28, 2006 - 12:53 pm UTC
I mean "posix is a standard, it describes how to do regular expressions, we do what they say. perl is a psuedo standard, it also describes how it believes regular expressions should be done. people like that, so we did that too"
owa_match, the plsql package is very rudimentary in its implementation
Best match
Serge Shmygelsky, February 23, 2006 - 9:28 am UTC
Hello Tom,
I was looking through this thread and found a post about matching dialled numbers with prefixes which is closely related to my needs. I've managed to do it in the following way:
SHMYG@REX> desc calls
Name Null? Type
------------------------------- -------- ----------------------------
1 IMSI VARCHAR2(15)
2 B_NUMBER VARCHAR2(10)
SHMYG@REX> desc zones
Name Null? Type
------------------------------- -------- ----------------------------
1 ZONE_PREFIX VARCHAR2(5)
2 ZONE_DES VARCHAR2(20)
SHMYG@REX> select * from calls;
IMSI B_NUMBER
--------------- ----------
1 124
1 1234
1 15
2 126
2 1
SHMYG@REX> select * from zones
2 /
ZONE_ ZONE_DES
----- --------------------
12 zone 12
123 zone 123
1 zone 1
SHMYG@REX> get a
1 select * from (
2 select calls.imsi, calls.b_number, zones.zone_des, length(regexp_substr(calls.b_number, '^' || zones.zone_prefix)) as zone_length,
3 row_number() over (partition by imsi, b_number order by length(regexp_substr(calls.b_number, '^' || zones.zone_prefix)) desc ) rn
4 from calls, zones
5 where length(regexp_substr(calls.b_number, '^' || zones.zone_prefix)) is not null
6 )
7* where rn = 1
SHMYG@REX> /
IMSI B_NUMBER ZONE_DES ZONE_LENGTH RN
--------------- ---------- -------------------- ----------- ----------
1 1234 zone 123 3 1
1 124 zone 12 2 1
1 15 zone 1 1 1
2 1 zone 1 1 1
2 126 zone 12 2 1
SHMYG@REX>
So it looks quite fine.
From your opinion, is this the right approach? Especially performance-wise assuming that CALLS table will have 30 mln records per day and I need to create a report containing number of calls per zone.
I'm using Oracle 10G.
Thanks in advance
Regex and Indexes
Barry Chase, March 28, 2007 - 3:15 pm UTC
Need flexibility of Regular Expressions but need indexing too..
I have a situation where I need to search for last name based upon user input, but also need to search for hyphenated names as well. I want to be able to use in index for performance, however I haven't found a means to do so without sacrificing either search capabilities or performance.
I need to be able to find :
Yost%
Yost-%
%-Yost%
%-Yost
but not %Yost%
The reason for the latter is to prevent simple searches that return undesirable sized result sets.
If I use just LIKE statements... I can create a func index and maintain performance, but lose the %-Yost% and %-Yost searches. If use regular expressions, I get all of the above but lose the index. Since one of the components is a user provided value, I can create a function index for the regular expression. Thus I get a full table scan.
So what are my options ?
REGEXP_LIKE ( REGEXP_REPLACE ( UPPER ( hds.last_name ), ' ', NULL )
, '^(' ||
replace(REGEXP_REPLACE ( UPPER ( replace(LTRIM ( :p_last_name, '-' ),'-','000000')), '\W', ''''),'000000','-') ||
')|(-' ||
replace(REGEXP_REPLACE ( UPPER ( replace(LTRIM ( :p_last_name, '-' ),'-','000000')), '\W', ''''),'000000','-') ||
')')
===========================
Barry Chase
Application Architect
Humana Inc.
Louisville, Kentucky USA
==========================
Per Abich, April 02, 2007 - 9:59 am UTC
Is there a way, to use one pattern expression to split a complex string in multiple parts? I have i.e. the following string:
IBS_01M_20071013_C
the _ is a sepeartor string.
In other languages I'd use the following pattern:
([[:alpha:]]{3-5})_([[:alpha]]{3})_([[:digit:]]{8})_([[:alpha:]])
and after parsing, I can simply address the data using an index.
some more help on this.
Vijay, August 13, 2007 - 10:18 am UTC
Dear Tom,
good day to you as always, I am trying to get some information using regexp_like but facing one small issue need your help on this.
The table I am working on has a column of varchar2(15) having numbers, the reason to have it varchar2 is the account numbers start with 0 (zero), I want to search all junk account number that have non numeric values so I used the below query
select acc_number from accounts where not regexp_like(acc_number,'[0-9]');
this query list all account number having non numeric values but if there are account number with . (decimal) these are not listed, so I have to use another query
select acc_number from accounts where regexp_like(acc_number,'[.]');
I am not sure why is '.' not being considered in first query, your help in this regard is much appreciated.
Thanks in advance for your time on this.
Regards,
Vijay Sehgal
August 15, 2007 - 11:06 am UTC
give example, not really following your text description all of the way.
if you are trying to find things that are ONLY numeric (0..9), I myself prefer the old fashioned:
ops$tkyte%ORA10GR2> select x, replace( translate(x,'0123456789','0'), '0', '' ) newx
2 from t
3 /
X NEWX
---------- ----------
123
12.3 .
12x3 x
ops$tkyte%ORA10GR2> select x
2 from t
3 where replace( translate(x,'0123456789','0'), '0', '' ) is null
4 /
X
----------
123
escape \.
Pasko, August 15, 2007 - 3:09 pm UTC
Hi Vijay,
you have to escape the period character '.'
like this :
---------------------------------------
select acc_number from accounts where regexp_like(acc_number,'[\.]');
---------------------------------------
You can also try to use the alphanumeric [:alnum:]
for example here is a small test case:
create table test(num varchar2(10))
insert into test values( '1');
insert into test values( '1123');
insert into test values( 'aa1123');
insert into test values( 'notnumber');
insert into test values( '123.45');
insert into test values( '0.45');
insert into test values( '.46');
commit;
SQL> select * from test where regexp_like(num,'([\.]|[:alnum:])');
NUM
----------
aa1123
notnumber
123.45
0.45
.46
Pattern Matching
Rich, August 15, 2007 - 4:09 pm UTC
Tom,
Again, thanks for posting this site.
I have a requirement where I need to convert a dollar figure into a ten digit zero padded field with no decimal point and the last two digits are always the decimal portion of the number. I created a fairly complex case statement that seems to work okay but was wondering if I used regular expressions I might be able to create a more concise way to accomplish this. Here is the statment
lpad(replace(case length(substr(to_char(sum(decode(trim(f.frick_value), 'B', a.wage_amount, Null))), instr(to_char(sum(decode(trim(f.frick_value), 'B', a.wage_amount, Null))), '.') + 1, 2))
when 0 then to_char(sum(decode(trim(f.frick_value), 'B', a.wage_amount, Null))) || '00'
when 1 then to_char(sum(decode(trim(f.frick_value), 'B', a.wage_amount, Null))) || '0'
when 2 then to_char(sum(decode(trim(f.frick_value), 'B', a.wage_amount, Null)))
else '0000000000'
end, '.', null), 10, '0') "BONUS"
Any suggestions would be appreciated.
Thanks.
August 20, 2007 - 7:22 pm UTC
do you mean this?
ops$tkyte%ORA10GR2> select x, to_char( x, 'fm00000000v00' ) from t;
X TO_CHAR(X,'
---------- -----------
123.45 0000012345
thanks for the help on this.
Vijay Sehgal, August 16, 2007 - 2:52 am UTC
Dear Tom,
good day to you as always and thanks for your help on this, I was trying to achieve the same functionality which you have highlighted using replace and translate.
give example, not really following your text description all of the way.
select x, replace( translate x,'0123456789','0'), '0', '' ) newx from t
X NEWX
---------- ----------
123
12.3 .
12x3 x
Is it possible to achieve the same using regexp_like in one query.
Thanks again for your help and time.
Kind Regards,
Vijay Sehgal.
Hi Rich, try the 'fm' format modifier
Pasko, August 16, 2007 - 5:07 am UTC
Hi Rich from NJ,
before you get a better answer from Tom, i had some time and
this is what i came up with, using the fm format modifier.
you can also see examples from this site using this trick.
First i think this is what your Query does:
step 1:
----------------
extract two digits after decimal point from wage_amount
and check
if length of the extracted value is 0
then append '00' at the end of wage_amount
if length of the extracted value is 1
then append '0' at the end of wage_amount
if length of the extracted value is 2
then append nothing
else append ten '0's
----------------
step 2:
remove '.' from the step 1 result
----------------
step 3:
left pad result from step 3 with '0's until length 10
and then i simplified your query a bit:(Tom likes this :) )
select lpad (
replace (
case length (substr (wage_amount, instr (wage_amount, '.') + 1, 2) )
when 0 then wage_amount || '00'
when 1 then wage_amount || '0'
when 2 then wage_amount
else '0000000000'
end,
'.',
null
),
10,
'0'
) "BONUS"
from (select '0000000000' wage_amount
from dual) a
---
and here is what i think would work:
---
select wage_amount ,
replace( case when instr(wage_amount,'.') > 0
then to_char(wage_amount, 'fm00000000D00')
else
to_char(wage_amount, 'fm0000000000')
end,'.') as Bonus
from
( select '1234567890' wage_amount from dual
union all
select '34567890' wage_amount from dual
union all
select '12345678.9' wage_amount from dual
union all
select '12345678.09' wage_amount from dual
union all
select '0234567890' wage_amount from dual
union all
select '00345678.9' wage_amount from dual
union all
select '00045678.09' wage_amount from dual
)
>>Results:
WAGE_AMOUNT BONUS
----------- ------------
1234567890 1234567890
34567890 0034567890
12345678.9 1234567890
12345678.09 1234567809
0234567890 0234567890
00345678.9 0034567890
00045678.09 0004567809
Hope this helps.
By the way, i am the original poster of the first Question
in this thread.
We learn a lot from Tom.
Rich from NJ, USA
Etbin Bras, August 16, 2007 - 5:10 am UTC
how about (simplified for clarity)
lpad(to_char(round(100*nvl(:num,0),0)),10,'0')
using
sum(decode(trim(f.frick_value),'B',a.wage_amount,null))
as :num
regards
Etbin
difference between Pasko's and Etbin's solution
Pasko, August 16, 2007 - 5:20 am UTC
Hi Etbin,
i like your solution,
the only difference between mines and yours is:
SQL> select * from (
2 select wage_amount ,
3 replace( case when instr(wage_amount,'.') > 0
4 then to_char(wage_amount, 'fm00000000D00')
5 else
6 to_char(wage_amount, 'fm0000000000')
7 end,'.') as Bonus_Pasko,
8 lpad(to_char(round(100*nvl(wage_amount,0),0)),10,'0') Bonus_Etbin
9 from
10 ( select '1234567890' wage_amount from dual
11 union all
12 select '34567890' wage_amount from dual
13 union all
14 select '12345678.9' wage_amount from dual
15 union all
16 select '12345678.09' wage_amount from dual
17 union all
18 select '0234567890' wage_amount from dual
19 union all
20 select '00345678.9' wage_amount from dual
21 union all
22 select '00045678.09' wage_amount from dual
23 )
24 )
25 where Bonus_Pasko <> Bonus_Etbin;
WAGE_AMOUNT BONUS_PASKO BONUS_ETBIN
----------- ------------ ----------
34567890 0034567890 3456789000
0234567890 0234567890 2345678900
Regards,
Pasko
Replacing Special Characters
Maverick439, September 13, 2007 - 2:13 pm UTC
Tom,
I need to search and replace all Special characters [characters, that cannot be entered using Keyboard] in Varchar2 field. Is this possible with Regexp_replace function? if so how? I am not sure how to ask regular expression to look for specific ASCII characters ..any help is greatly appreciated.
Thanks,
September 15, 2007 - 7:44 pm UTC
... [characters, that cannot be
entered using Keyboard] ...
that doesn't make sense, there are keyboards that can enter anything and using the ALT-key, I can enter anything I want really
# - I put this # there by holding alt and typing the number 35.
you would need to put together a better "requirement", then we can build a string that we can use with replace - to replace away anything you don't want.
try these using Oracle Regular Expressions
Pasko, September 16, 2007 - 4:58 am UTC
Hi Maverick,
may be you can try one of these on your Data and see
if they return what you expected:
[:print:] Printable characters
[:punct:] Punctuation characters
[:space:] Space characters (nonprinting),
such as
carriage return,
newline, vertical tab, and
form feed
Hope this helps.
Replacing Special Characters
Maverick439, September 17, 2007 - 10:01 am UTC
Tom, when i said characters those cannot be entered by keyboard, I meant, special characters like ¿ ¿ ` etc. these characters you cannot use Keyboard to get them[Atleast that's what I thought]. I got these special symbols in MS Word. Most frequently used is ` [Apostrophe].
I need to search for them and replace with nulls. Sorry for not being clear.
Any ideas?? [I'll try Pasko's suggestion. Thanks Pasko].
Special Characters
Maverick439, September 18, 2007 - 4:34 pm UTC
Ok, Atleast we are going somewhere. So, How do we eliminate those smart Quotes? I tried Pasko's suggestion, but it did not eliminate those[Doesn't mean I am doing it right though]..
Thanks for your help..
September 19, 2007 - 8:51 am UTC
created a file in word with 'smart quotes'
saved it as text
dumped it
221 and 222 are the 'codes'
od -c smart\ quotes.txt
0000000 221 s m a r t q u o t e s 222 \r \n
0000020 \r \n
0000022
so, if you
translate( str, chr(221)||chr(222), chr(39)||chr(39) )
you would turn those smart quotes into dumb ones
did you try this?
Pasko, September 19, 2007 - 9:12 am UTC
Hi Maverick,
did you try like this:
for example to get all non-printable characters:
select * from table
where regexp_like(column , '[^[:print:]]')
Special Characters
Maverick439, September 19, 2007 - 10:16 am UTC
Thanks Tom and Pasko for your help.
Tom, what do you mean when you said Dumped the text file?
Also, 221 and 222 are for all special characters?
Pasko, I tried your suggestion like this:
select column,regexp_replace(column,[^[:print]]',null) from table
it did not replace those characters..did I miss some thing?
September 19, 2007 - 1:11 pm UTC
I demonstrated what I did, I used a common unix utility "od" to dump out the file and cut and pasted the results...
In my test, using my charactersets installed with windows and my version of word - 221 and 222 were the decimal codes used to represent these so called smart quotes.
Not much of a help
Maverick439, September 19, 2007 - 1:28 pm UTC
Tom, I checked chr 221 and 222 and they are smart characters alright. But If I take your approach, I have to find chr() value for each and every special character out there and try to implement in a loop to replace them. I wish it would be simple like regexp(replace(column,'smartquote',null). I cannot use [[:print:]] posix notation as these are all printable characters..I tried
[^[:graph:]]..but no use with that either.
Thanks
September 20, 2007 - 9:20 pm UTC
this is why I'm trying to tell you that YOU have to identify what is special to YOU.
And what you want to replace them with. Eg: 221 and 222 should be replaced with 39
when you find the rest of them, you have to pretty much tell us what you want them to be replaced with.
Think about this please - these are only special to you, to the rest of the world, they are normal printable characters.
Special Characters
Maverick439, September 21, 2007 - 12:44 pm UTC
Thanks Tom. I understand what you are saying [now!!]. I don't know what are not allowed. I only know what are allowed and those are somewhere between 90 and 102 characters[a big list] that we can enter from keyboard. So, rest are all not allowed. I'll try to filter them someway.
Thanks for all the help.
try NOT ( all characters you know ) ..
Pasko, September 21, 2007 - 4:11 pm UTC
Hi Maverick,
may be it's better if you try to replace all the characters
you do not know :)
regexp_replace(column , '[^a-zA-Z0-9]' )
Special Characters
Maverick439, September 23, 2007 - 2:40 pm UTC
Actually I tried this way after struggling for a while:
select regexp_replace(Description,'[^][(A-Za-z0-9!@#$%&*() _+={}?^\|<>,.\''~`\\/":;\-][:spaces:]','')
into l_desc
from dual;
I know I wanted all those characters and only those. I was wondering if there was any POSIX notation for special characters. Actually there is ..[:PUNCT:] .but it includes every thing..
But this works for me.
Limited Use of Oracle's Regular Expression?
A reader, September 09, 2008 - 1:22 pm UTC
Based on the tutorial
http://www.oracle.com/technology/obe/10gr2_db_single/develop/regexp/regexp_otn.htm it appears that there are 4 functions that can be used for regular expressions:
REGEXP_LIKE
REGEXP_INSTR
REGEXP_REPLACE
REGEXP_SUBSTR
Now I want to write a SQL using regular expression to return strings that contain ONLY numbers. I try the following.
CREATE TABLE x (
col VARCHAR2(20)
);
INSERT INTO x VALUES ('123');
INSERT INTO x VALUES ('abc');
INSERT INTO x VALUES ('ABC');
INSERT INTO x VALUES ('abcdef');
INSERT INTO x VALUES ('abc11');
INSERT INTO x VALUES ('ab39c');
INSERT INTO x VALUES ('45abc');
INSERT INTO x VALUES ('abc!');
INSERT INTO x VALUES ('abc#');
INSERT INTO x VALUES ('abc%');
INSERT INTO x VALUES ('###');
COMMIT;
SELECT col
FROM x
WHERE regexp_like(col, '[0-9]');
COL
--------------------
123
abc11
ab39c
45abc
The SQL does not return the desired result. My guess is that because of the "like" function that I used, it returns all strings that contain one or more numbers. How do I use regular expression to return only what I ask it to match? In this case, I want strings that contain only numbers.
September 10, 2008 - 9:22 am UTC
example here:
http://laurentschneider.com/wordpress/2007/09/isnumber-in-sql.html but I would suggest something easier and faster maybe
where replace( translate( col, '0123456789','0000000000'), '0', '') is null
turn all digits into zero, turn all zeroes into nothing, if you end up with nothing - it was either nothing to begin with or was all digits.
expected pattern
Nicosa, September 10, 2008 - 10:27 am UTC
Hello,
this in answer to "A reader" (previous followup) :
I think the pattern you should have used is REGEXP_LIKE(col,'^[0-9]+$')
(Anyway, Tom's answer might still be better performance wise)
Re:Limited Use of Oracle's Regular Expression?
Brian Camire, September 10, 2008 - 10:47 am UTC
Here's another way:
SQL> SELECT
2 *
3 FROM
4 (
5 SELECT '123' AS COL FROM DUAL
6 UNION ALL SELECT 'abc' FROM DUAL
7 UNION ALL SELECT 'ABC' FROM DUAL
8 UNION ALL SELECT 'abcdef' FROM DUAL
9 UNION ALL SELECT 'abc11' FROM DUAL
10 UNION ALL SELECT 'ab39c' FROM DUAL
11 UNION ALL SELECT '45abc' FROM DUAL
12 UNION ALL SELECT 'abc!' FROM DUAL
13 UNION ALL SELECT 'abc#' FROM DUAL
14 UNION ALL SELECT 'abc%' FROM DUAL
15 UNION ALL SELECT '###' FROM DUAL
16 UNION ALL SELECT NULL FROM DUAL
17 )
18 WHERE
19 NOT REGEXP_LIKE(COL, '[^0-9]');
COL
------
123
In other words, "where COL does not contain a character that is not a digit".
Is this a bug?
A reader, September 10, 2008 - 12:38 pm UTC
CREATE TABLE x (
col VARCHAR2(20)
);
INSERT INTO x VALUES ('123');
INSERT INTO x VALUES ('abc');
INSERT INTO x VALUES ('ABC');
INSERT INTO x VALUES ('abcdef');
INSERT INTO x VALUES ('abc11');
INSERT INTO x VALUES ('ab39c');
INSERT INTO x VALUES ('45abc');
INSERT INTO x VALUES ('abc!');
INSERT INTO x VALUES ('abc#');
INSERT INTO x VALUES ('abc%');
INSERT INTO x VALUES ('###');
COMMIT;
-- ---------------------------------------------------------------------------------------------
-- After applying the translate function, why do we get NULLs for alphanumeric strings?
-- ---------------------------------------------------------------------------------------------
SELECT col, TRANSLATE(LOWER(col), LOWER('0123456789abcdefghijklmnopqrstuvwxyz'), '0') translated
FROM x;
COL TRANSLATED
------- ----------
123
abc
ABC
abcdef
abc11
ab39c
45abc
abc! !
abc# #
abc% %
### ###
-- ---------------------------------------------------------------------------------------------
-- Puzzling result - Why doesn't '123' get translated to '000' instead?
-- ---------------------------------------------------------------------------------------------
SELECT col, TRANSLATE(LOWER(col), LOWER('0123456789abcdefghijklmnopqrstuvwxyz'), '000') translated
FROM x;
COL TRANSLATED
------- ----------
123 00
abc
ABC
abcdef
abc11 00
ab39c
45abc
abc! !
abc# #
abc% %
### ###
September 11, 2008 - 11:11 am UTC
ops$tkyte%ORA10GR2> select translate( 'abc', 'abc', 'X' ) from dual;
T
-
X
ops$tkyte%ORA10GR2> select translate( 'abc', 'abc', 'XY' ) from dual;
TR
--
XY
ops$tkyte%ORA10GR2> select translate( 'abc', 'abc', 'XYZ' ) from dual;
TRA
---
XYZ
Your TRANSLATE(LOWER(col),
LOWER('0123456789abcdefghijklmnopqrstuvwxyz'), '0')
turned zero into zero and 1..9a..z into NULL - that is why.
-- Puzzling result - Why doesn't '123' get translated to '000' instead?
--
--------------------------------------------------------------------------------
-------------
SELECT col, TRANSLATE(LOWER(col),
LOWER('0123456789abcdefghijklmnopqrstuvwxyz'), '000') translated
because you turned 0 into 0, 1 into 0, 2 into 0 and 3..9a..z into NULL
RE: Is this a bug?
Brian Camire, September 10, 2008 - 3:29 pm UTC
This is the documented behavior of TRANSLATE. The documentation at
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions196.htm#i1501659 says:
TRANSLATE returns expr [the first argument] with all occurrences of each character in from_string [the second argument] replaced by its corresponding character in to_string [the third argument]. Characters in expr that are not in from_string are not replaced. If expr is a character string, then you must enclose it in single quotation marks. The argument from_string can contain more characters than to_string. In this case, the extra characters at the end of from_string have no corresponding characters in to_string. If these extra characters appear in char [this should be "expr"], then they are removed from the return value.So:
1. In your first example, all letters a-z and all digits except zero are being removed. For the alphanumeric strings, there are no zeroes, so there are no characters left, leaving an empty (null) string.
2. In your second example, '123' gets translated to '00' because a '0' (if there was one) would get translated to '0', the '1' gets translated to '0' (the second '0' in the to_string), and the '2' gets translated to '0' (the third '0' in the to_string. There's no character in the to_string corresponding to the '3' in the from string, so it gets removed.
Hope this helps.
appropriate indexing for regexp searching
Chris Gould, October 17, 2008 - 4:49 am UTC
I have a requirement similar to the original one in this thread. I have a database of customer phone numbers which are held as varchar2, many containing non-numeric characters (eg spaces, brackets, hyphens etc. example : +44 (208) 123-4566 )
We have a help-desk call routing system which tries to identify customers from the phone-number they call in on and routes the call accordingly. The call-routing system always gives the incoming call number as a string of all digits. eg 02081234566 in the case of the example above.
What I've done so far is to build a table specifically for this search using REGEXP_REPLACE to eliminate all non-digits from the stored numbers.
eg
create table cleansed_data
as select ACCOUNT_ID
,regexp_replace(phonenumber,'([^[:digit:]])',NULL) as phone_number
from CUST_CONTACTS
where phonenumber is not null;
Also, when I try to match the incoming call number against the cleansed data, I ignore the leading zeros on the incoming number and try to match against my stored numbers from the right-hand end of the stored-phone number
eg
select * from cleansed_data
where regexp_like(phone_number,'(&inphone)$')
(not showing the elimination of leading digits here).
My question is regarding what indexes (if any) would be appropriate on my cleansed_data table to make the REGEXP_LIKE search more efficient. (I have about 3million rows in the table being searched).
October 17, 2008 - 9:36 pm UTC
why create the table - why not just create a function based index on the column itself?
ops$tkyte%ORA10GR2> create table t ( phonenumber varchar2(40), data char(1000) );
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index t_idx on t(regexp_replace(phonenumber,'([^[:digit:]])',NULL));
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'T', numrows=> 1000000 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> variable x varchar2(40)
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @at
ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from t where regexp_replace(phonenumber,'([^[:digit:]])',NULL) = :x;
Execution Plan
----------------------------------------------------------
Plan hash value: 470836197
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10000 | 976K| 5 (0
| 1 | TABLE ACCESS BY INDEX ROWID| T | 10000 | 976K| 5 (0
|* 2 | INDEX RANGE SCAN | T_IDX | 4000 | | 1 (0
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access( REGEXP_REPLACE ("PHONENUMBER",'([^[:digit:]])',NULL)=:X)
No auxiliary table to maintain, no muss, no fuss.
Chris Gould, October 23, 2008 - 4:59 am UTC
But I want to use regexp_like to match on the string starting from the right-hand end (the $ anchor in the regexp_like example I gave). Won't your solution still match from the left-hand end of the string?
The phone number I have stored may or may not include the country-code for example. I still want a match regardless of whether that's stored or not.
So the incoming number I'm trying to match might be 2223333444 (I eliminate the leading zeros from the incoming number).
The numbers I have stored might be 00442223333444 or equally +44-222-333-4444 or may omit the country-code 0222 333-4444
I need the incoming number to match with all 3 examples here. (I have other code later which tries to narrow the list down by considering other attributes of the account).
October 23, 2008 - 1:39 pm UTC
if you want to use a fbi- you will come up with a function that can be indexed.
can you come up with such a function? it can use regular expressions and all..
let me ask you - can you create a function that can be indexed that would help you.
I indexed the data you had pre-created, is that sufficient for your needs. You know how an index works - can you use this technique.
Chris Gould, October 23, 2008 - 5:05 am UTC
Sorry : bad examples in my last post.
What I meant to say was >>>
So the incoming number I'm trying to match might be 223334444 (I eliminate the leading zeros from the incoming number).
The numbers I have stored might be 0044223334444 or equally +44-22-333-4444 or may omit the country-code 022 333-4444
I need the incoming number to match with all 3 examples here. (I have other code later which tries to narrow the list down by considering other attributes of the account).
A reader, October 23, 2008 - 8:57 pm UTC
Thanks, as always. I benefit so much from this site . . .
I don't mean to nit-pick and possibly shave off only a hundredth of a percent,
but I believe your Sept 10 suggestion to "A reader" could be optimized/simplified further by:
where translate(col,'~0123456789','~') is null
(Really any non-numeric leading character could be used instead of the ~.)
B
Chris Gould, October 27, 2008 - 8:14 am UTC
Hang on - it was you that suggested using a function-based index, not me! I only wanted to know if it was possible to create an index on my table of cleansed data that would work when I used REGEXP_LIKE(), using an expression that matched from the right-hand end of a string.
I can't see a way of creating a function-based index that would be of any help with this search. Hence my original question : is there ANY index I could create that would be of use?
October 27, 2008 - 9:30 am UTC
Ok, not being a heavy user of regular expression, I misinterpreted the "(&x)$", totally.
This is a combination of "index a function and do a fast leading wildcard search"
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:9455353124561 http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:37336026927381#37360199308431 So, here goes:
ops$tkyte%ORA10GR2> create table t ( phonenumber varchar2(30) );
Table created.
<b>Your table...</b>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> begin
2 ctx_ddl.create_preference('SUBSTRING_PREF', 'BASIC_WORDLIST');
3 ctx_ddl.set_attribute('SUBSTRING_PREF', 'SUBSTRING_INDEX','TRUE');
4 end;
5 /
PL/SQL procedure successfully completed.
<b>for fast leading edge wildcard searching...</b>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace procedure clean_phone_number( p_rowid in rowid, p_return in out nocopy varchar2 )
2 as
3 l_string t.phonenumber%type;
4 begin
5 select phonenumber into l_string from t where rowid = p_rowid;
6 p_return := regexp_replace(l_string,'([^[:digit:]])',NULL);
7 exception
8 when no_data_found then raise program_error;
9 end;
10 /
Procedure created.
<b>this'll be the procedure we index...</b>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> begin
2 ctx_ddl.create_preference('my_user_datastore', 'user_datastore');
3 ctx_ddl.set_attribute('my_user_datastore', 'procedure', '"OPS$TKYTE"."CLEAN_PHONE_NUMBER"');
4 ctx_ddl.set_attribute('my_user_datastore', 'output_type', 'VARCHAR2');
5 end;
6 /
PL/SQL procedure successfully completed.
<b>tell text to use our procedure....</b>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index search_idx on T( phonenumber )
2 indextype is ctxsys.context parameters
3 ('wordlist SUBSTRING_PREF datastore my_user_datastore sync(on commit)');
Index created.
<b>and create the index. Now:</b>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t values ( '0044223334444' );
1 row created.
ops$tkyte%ORA10GR2> insert into t values ( '+44-22-333-4444' );
1 row created.
ops$tkyte%ORA10GR2> insert into t values ( '022 333-4444' );
1 row created.
ops$tkyte%ORA10GR2> commit;
Commit complete.
ops$tkyte%ORA10GR2> variable x varchar2(30)
ops$tkyte%ORA10GR2> exec :x := '223334444';
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from t where contains( phonenumber, '%' || :x ) > 0;
PHONENUMBER
------------------------------
0044223334444
+44-22-333-4444
022 333-4444
Chris Gould, October 27, 2008 - 10:29 am UTC
Brilliant! Thanks for the extremely prompt and comprehensive reply. I'm also not that familiar with regular expressions, but they've been very useful so far in this data-cleansing / matching problem I've been working on.
followup
Chris Gould, October 28, 2008 - 9:11 am UTC
Tom -
just as a follow-up, I tried out the Oracle-text based solution and the query performance using the index was very good. However I hit several performance problems maintaining the data in the table. Creating the index on table with 800,000 rows took over 30mins. And inserting a new row into the table with the index in place sometimes took several minutes (the insert itself was milli-seconds, but the commit took minutes). I checked that it wasn't a case of waiting for locks - it wasn't.
There seem to be several documented cases on Metalink related to poor performance inserts/updates on tables with Oracle-Text indexes.
What I've done instead is similar to your original suggestion. I've created a function and a function-based index and use the LIKE operator to do the match.
The function uses REGEXP_REPLACE to eliminate non-digits but also reverses the order of the digits so that func('123-456-789') = '987654321'
I then use the LIKE operator in my where clause instead of the REGEXP_LIKE, since I now can match from the left-hand end of the indexed-value (I don't need the REGEXP "$" to match from the right-hand end).
eg.
create or replace function f_revstr(i_string in varchar2)
return varchar2 DETERMINISTIC
AS
v_rev varchar2(4000);
v_len pls_integer;
v_clean varchar2(4000);
begin
v_clean := regexp_replace(i_string,'([^[:digit:]])',NULL);
v_len := length(v_clean);
for i in reverse 1 .. v_len
loop
v_rev := v_rev||substr(v_clean,i,1);
end loop;
return v_rev;
end f_revstr;
/
Index :
create index phone_numbers_fbi on phone_numbers(f_revstr(orig_phone_number));
Query:
select * from phone_numbers
where f_revstr(orig_phone_number) like f_revstr('12-34-56 789')||'%'
/
This seems to do the trick. The index creates in 20 seconds, inserts/updates are more or less instantaneous and queries using the index take only milliseconds. Having read up a bit more on Oracle Text (aka interMedia in your book "Expert Oracle") what I want to do is only a fraction of the capability of Oracle Text, so probably a simpler solution is more appropriate here.
String matching
Prakash, February 13, 2009 - 7:45 am UTC
Tom,
I need to match 2 character columns in 2 tables and the match is fuzzy. The field is 12 characters long and if ANY 6 consecutive characters are the same then I need to consider it a match.
For example:
A8UI0$WER76$ and HGDU0$WER78*
KL6WER78*YIZ and UL6WER78*MBV are matches because atleast 6 consecutive characters match in the 2 columns.
I can use INSTR and SUBSTR to match a 6 characters but that only works if I the match positions are fixed. My requirement if that I need to match any 6 consecutive characters in the string starting anywhere - it could be starting from the first character to the 7th character.
Is there a way to do this in SQL using any in-built functions including Regular expressions?
I have learned so much from your site over the years and I would like to provide my sincerest Thanks for this amazing service you provide to the Oracle community.
Prakash
February 16, 2009 - 11:06 am UTC
well, most applications would involve a cartesian join, this does too - but "a pair of mini cartesian joins"
ops$tkyte%ORA10GR2> create table t1 ( x varchar2(12) );
Table created.
ops$tkyte%ORA10GR2> create table t2 ( y varchar2(12) );
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t1 values ( 'A8UI0$WER76$' );
1 row created.
ops$tkyte%ORA10GR2> insert into t2 values ( 'HGDU0$WER78*' );
1 row created.
ops$tkyte%ORA10GR2> insert into t1 values ( 'KL6WER78*YIZ' );
1 row created.
ops$tkyte%ORA10GR2> insert into t2 values ( 'UL6WER78*MBV' );
1 row created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with data as (select level l from dual connect by level <= 7)
2 select 1, x, l, substr( x, l, 6 ) ss
3 from t1, data
4 union all
5 select 2, y, l, substr( y, l, 6 ) ss
6 from t2, data
7 order by 1, 2, 3
8 /
1 X L SS
---------- ------------ ---------- ------
1 A8UI0$WER76$ 1 A8UI0$
1 A8UI0$WER76$ 2 8UI0$W
1 A8UI0$WER76$ 3 UI0$WE
1 A8UI0$WER76$ 4 I0$WER
1 A8UI0$WER76$ 5 0$WER7
1 A8UI0$WER76$ 6 $WER76
1 A8UI0$WER76$ 7 WER76$
1 KL6WER78*YIZ 1 KL6WER
1 KL6WER78*YIZ 2 L6WER7
1 KL6WER78*YIZ 3 6WER78
1 KL6WER78*YIZ 4 WER78*
1 KL6WER78*YIZ 5 ER78*Y
1 KL6WER78*YIZ 6 R78*YI
1 KL6WER78*YIZ 7 78*YIZ
2 HGDU0$WER78* 1 HGDU0$
2 HGDU0$WER78* 2 GDU0$W
2 HGDU0$WER78* 3 DU0$WE
2 HGDU0$WER78* 4 U0$WER
2 HGDU0$WER78* 5 0$WER7
2 HGDU0$WER78* 6 $WER78
2 HGDU0$WER78* 7 WER78*
2 UL6WER78*MBV 1 UL6WER
2 UL6WER78*MBV 2 L6WER7
2 UL6WER78*MBV 3 6WER78
2 UL6WER78*MBV 4 WER78*
2 UL6WER78*MBV 5 ER78*M
2 UL6WER78*MBV 6 R78*MB
2 UL6WER78*MBV 7 78*MBV
28 rows selected.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with data as (select level l from dual connect by level <= 7)
2 select t1_data.x, t2_data.y, count(*)
3 from (
4 select 1, x, l, substr( x, l, 6 ) ss
5 from t1, data
6 ) t1_data,
7 (
8 select 2, y, l, substr( y, l, 6 ) ss
9 from t2, data
10 ) t2_data
11 where t1_data.ss = t2_data.ss
12 group by t1_data.x, t2_data.y
13 /
X Y COUNT(*)
------------ ------------ ----------
KL6WER78*YIZ HGDU0$WER78* 1
KL6WER78*YIZ UL6WER78*MBV 3
A8UI0$WER76$ HGDU0$WER78* 1
Excellent
Prakash, February 17, 2009 - 11:46 am UTC
Tom,
Thank you for the solution. Exactly what I was looking for. You come through again and again for thousands of users across the globe. I don't know of another person providing the kind of service you are providing for any other technology/solution. You are truly Peerless...
Prakash
regexp_substr Unexpected Result (10.1.0.4)
Richard, March 13, 2009 - 1:45 pm UTC
Given:
create table t ( ref_codes varchar2(20));
insert into t values ('A00000');
insert into t values ('100000');
insert into t values ('10000A');
insert into t values ('B0000A- 1');
insert into t values ('B0000A-311');
insert into t values ('B0000A/CTK');
insert into t values ('B0000A, 11');
insert into t values ('10000009');
insert into t values ('B0000A A00000 100000');
insert into t values ('B0000A A00000 ');
insert into t values ('B0000A A00001');
insert into t values ('B0000A/A00000');
commit;
I need to extract up to 3 reference codes that are 6 characters in length when separated by a non-alphanumeric character.
select ref_codes,
regexp_substr(ref_codes,'[[:alnum:]][[:digit:]]{4}[[:alnum:]]',1) ref1,
regexp_substr(ref_codes,'[[:alnum:]][[:digit:]]{4}[[:alnum:]]',2) ref2,
regexp_substr(ref_codes,'[[:alnum:]][[:digit:]]{4}[[:alnum:]]',3) ref3
from t;
Expected Output:
REF_CODES REF1 REF2 REF3
A00000 A00000
100000 100000
10000A 10000A
B0000A- 1 B0000A
B0000A-311 B0000A
B0000A/CTK B0000A
B0000A, 11 B0000A
10000009
B0000A A00000 100000 B0000A A00000 100000
B0000A A00000 B0000A A00000
B0000A A00001 B0000A A00001
B0000A/A00000 B0000A A00000
However, records 7-11 come out as:REF_CODES REF1 REF2 REF3
10000009 100000 000000 000009
B0000A A00000 100000 B0000A A00000 A00000
B0000A A00000 B0000A A00000 A00000
B0000A A00001 B0000A A00001 A00001
B0000A/A00000 B0000A A00000 A00000
What do I need to add to the pattern string?
March 13, 2009 - 3:16 pm UTC
sorry, not a regexp expert (hate em actually)...
I see exactly why you get the output you are (it is exactly what I would have expected - you expect the regexp to move on to BEYOND where you last stopped, it doesn't, it picks up at the next character after you started...)
how about you specify in text, in the form of a requirement, what you want - forget regexp (you have some code that doesn't work, not useful, we need a specification that dictates exactly, precisely what you want in the form of a requirement, a specification - not code, really not code that does not work).
it is likely you'll need a series of "regexp substr", then "regexp substr AFTER getting rid of the first regexp substr stuff" - eg: repexp replace - then substr
Huh?
A reader, March 13, 2009 - 3:41 pm UTC
Hi Tom,
I'm a bit surprised when you say you 'hate' regexp. That's quite a statement.
I know you hate 'when others not followed by a raise', you hate triggers and I agree fully on that and fully understand your 'crusade' against them (making their caveats very clear), but the regexp_replace saves me from, lets say: about 42 substr/instr/translate etc. function calls, making my code hardly readable (or understandable), while using regexp turns it all into one call in a SQL statement.
What makes you hate the regexp builtin? What are it's shortcomings in your opinion? Any caveats?
I agree, it's a puzzle sometimes, but isn't puzzling fun?
Or is the regexp 'just not your kinda thing'?
Ah...so many questions do rise now ;-)
Oh, and most of times I use regexp_replace, that's one difference. But still a regexp.
Best regards,
Martijn
March 13, 2009 - 4:05 pm UTC
Ok, I strongly dislike complex regular expressions. Because I've never gotten my head around them entirely. It is a puzzle without commentary most of the times - If there was a long commentary saying "this is what this does and how it does it" (like I many times do with my 'tricky queries'), maybe it would be different.
so, maybe it is the lack of commentary that bothers me :)
reg exps and comments
rc, March 13, 2009 - 4:17 pm UTC
Some regular expressions flavors like perl's make it possible to include comments. Maybe that would help?
March 16, 2009 - 8:46 am UTC
well, everything can be commented and yes - that would definitely help almost everything :)
A reader, March 13, 2009 - 4:50 pm UTC
"so, maybe it is the lack of commentary that bothers me :)"
Without clear documentation, specifications and honest comments and an open mind/ healthy discussion, we're all lost, so let's keep on preventing that from happening, therefore: 'why?' should always be meta-question #1 ;-)
But I'm glad anyway you only 'dislike' regexps, and not 'hate' them anymore, almost thought I took a wrong path :-)
Thanks &
Best regards,
Martijn
re: regexp_substr Unexpected Result (10.1.0.4)
Richard, March 13, 2009 - 11:08 pm UTC
Requirement spec:
The reference codes field is 20 characters and may contain up to 3 valid reference codes; valid reference codes are 6 characters each.
Each code may be separated by space, comma, or /.
The field may contain a combination of 1 valid reference code and extra info following a /, -, or space.
The field may contain a non-standard reference code of 8 or more characters; these can be treated as nulls.
March 16, 2009 - 9:01 am UTC
... Each code may be separated by space, comma, or /. ....
each code may be, or will be - and by "a" space, comma or '/' or by any sequence of space+comma+'/' (eg: / ,,,,,, ,////// ////////,,,,,,,,,,,/ would be a valid, single separator.)
and if "...and extra info following a /, -, or space. ...." is true, how could '/' and ' ' be separators?
Richard, March 16, 2009 - 10:52 am UTC
As originally posted, these are all the formats contained in this field:
insert into t values ('10000A');
insert into t values ('B0000A- 1');
insert into t values ('B0000A-311');
insert into t values ('B0000A/CTK');
insert into t values ('B0000A, 11');
insert into t values ('10000009');
insert into t values ('B0000A A00000 100000');
insert into t values ('B0000A A00000 ');
insert into t values ('B0000A A00001');
March 17, 2009 - 10:23 am UTC
I just need you to answer a question - in text.
As originally posted was a small snippet, rarely (if ever) do we get test cases that test all of the boundary conditions. Your specifications are in conflict with each other.
...
Each code may be separated by space, comma, or /.
The field may contain a combination of 1 valid reference code and extra info following a /, -, or space.
.......
Ok, then
insert into t values ('B0000A/CTK');
is the same as
insert into t values ('B0000A CTK');
and ctk should be considered a field all in it's own right? So, what if you have
insert into t values ('B0000A/CTK B0000B/CTK B0000C/X');
Now we have, by your rules, six fields, the first three of which are the only valid ones to look at - you would get as output just the B0000A & B0000B items, B0000C would not be considered
So, that is probably not what you wanted - and Just because your example doesn't contain that exact insert doesn't mean it won't be a possible value
So, before I spend any time (this is going to be doable I believe), I want you to take a couple of seconds and just spell it out - I know to you it is *obvious* because it is your data, you live with it, you look at it every day. To me it is just data, I need to understand what the real rules are.
I believe the separator is really just "a space", not " ,/"
Re: String matching
Parthiban Nagarajan, March 17, 2009 - 11:57 pm UTC
Hi Tom
The RegEx pack in Oracle is awesome. Just now I started learning RegEx. And here is the alternate solution for "String matching - PRAKASH from Dallas" using RegEx.
create table t1 ( x varchar2(12) );
create table t2 ( y varchar2(12) );
insert into t1 values ( 'A8UI0$WER76$' );
insert into t2 values ( 'HGDU0$WER78*' );
insert into t1 values ( 'KL6WER78*YIZ' );
insert into t2 values ( 'UL6WER78*MBV' );
SQL> col src format a24
SQL> select t1.x||t2.y src,
case when regexp_like(t1.x||t2.y, '(.{6}).*\1') then 1 else 0 end as match,
substr(regexp_substr(t1.x||t2.y, '(.{6}).*\1'),1,6) matched_str
from t1, t2;
SRC MATCH MATCHED_STR
------------------------ ---------- ------------------
A8UI0$WER76$HGDU0$WER78* 1 0$WER7
A8UI0$WER76$UL6WER78*MBV 0
KL6WER78*YIZHGDU0$WER78* 1 WER78*
KL6WER78*YIZUL6WER78*MBV 1 L6WER7
The SQL became very much simpler only because of the power of RegEx.
Comments:
---------
. -> Find a character
.{6} -> Find six consecutive characters
(.{6}) -> Remember those six consecutive characters
.* -> Find any character of arbitrary length
\1 -> Find the 1st remembered pattern again
Tom,
You are a genius in Oracle and its SQL. Common people cannot understand SQL. And developers of a little knowledge (
like me ;) cannot understand some complex and tricky SQL written by geniuses like you,
even with comments to describe it. Guys with RegEx background dont need the description. They see the world as patterns. They can speak and understand the RegEx language. The thing is you dont have
time to MASTER RegEx. You have been kept away from RegEx because of some other busy things (For eg: This ASKTOM site). We are learning Oracle from you and we dont want to miss RegEx tricks from you. When you miss RegEx then WE miss a Guru for RegEx.
March 18, 2009 - 8:15 am UTC
... cannot understand some complex and tricky SQL written by geniuses like you, even with comments to describe it. ...
sorry, that just cannot be true.
how about you take your solution and mine and scale it up and see which "performs"... I have a suspicion I know the answer...
comments
rc, March 18, 2009 - 5:34 am UTC
Sure everything can be commented but is it possible to include comments inside the Oracle reg exp?
March 18, 2009 - 8:19 am UTC
why would it have to be - OR EVEN SHOULD be "in the code, right in the middle of it all"?
I never use comments IN my sql (rarely I guess I should say, I comment out bits sometimes)
I preface my sql with a comment, it would come as a comment block
for /* this is my loop */ ( i = 0 /* set i to zero */; i++ /*increment i */; i<10 /* we need to do this for i in 0..9 */ )
{
I wouldn't recommend that, and I wouldn't recommend a comment "in" a regular expression.
comments
rc, March 18, 2009 - 10:29 am UTC
March 19, 2009 - 9:53 am UTC
that is fine, all I was saying is "it wouldn't really matter to me if it doesn't support comments in the regular expression, I'm of the old school that uses comment blocks above code"
regular expression
A reader, April 27, 2009 - 10:20 am UTC
Hi Tom
I am still trying to understand the 10g regexp* functions.
I have a question on this .
I have a table where I have data like 'JUN10' , 'AUG09','SEPT11' , 'OCTOBER' ;
so I want to write a query using regexp_like to get the rows where I have 2 last characters are always numeric (number) .
APPS@DEVL1-2504>create table t ( ss varchar2(10)) ;
Table created.
Elapsed: 00:00:01.53
insert into t values ('Jun10' ) ;
1 row created.
Elapsed: 00:00:01.57
commit ;
Commit complete.
Elapsed: 00:00:01.67
insert into t values ('ABC') ;
so the query should give me only one row
that is JUN10.
can you please give some simple examples of using regexp
thanks
April 27, 2009 - 2:27 pm UTC
ops$tkyte%ORA10GR2> create table t ( ss varchar2(10)) ;
Table created.
ops$tkyte%ORA10GR2> insert into t values ('Jun10' ) ;
1 row created.
ops$tkyte%ORA10GR2> insert into t values ('ABC') ;
1 row created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from t where regexp_like( ss, '.*[0-9][0-9]$' );
SS
----------
Jun10
You'll want to read through various regular expression tutorials (they are not an "Oracle-ism", they are widely used in many bits of software)
http://www.google.com/search?q=regular+expression+tutorial
great
A reader, April 28, 2009 - 3:43 am UTC
Thanks Tom, really good stuff, i will read thorugh the suggested URLs and try to understand it better.
Potential solution
Scott, July 09, 2009 - 5:21 am UTC
I'm facing a similar battle to Maverick back in 07.
Something I found since when I was inserting this information, it seemed to take up more length than I was expecting.
10gr2> select length('that¿more'), lengthb('that¿more') from dual;
LENGTH('THAT¿MORE') LENGTHB('THAT¿MORE')
------------------- --------------------
9 10
Well, this identifies the information... next step is removing. Depends on what you need to do...
CONVERT
Scott, July 09, 2009 - 10:47 pm UTC
10gr2> select convert('aéñàueoüûç','US7ASCII','UTF8') yes from dual
/
YES
----------
ae?aueouuc
Even better.
Teasing out congruent sets of numbers
J from Vancouver, November 05, 2009 - 3:36 pm UTC
Hello Tom (all),
Here's my setup:
drop table jt;
create table jt (idn number(2), str varchar2(30));
insert into jt (idn,str) values (1,'10');
insert into jt (idn,str) values (2,'Test 200');
insert into jt (idn,str) values (3,'Test 3000 ok');
insert into jt (idn,str) values (4,'4000 Test');
insert into jt (idn,str) values (5,'Test 5.00 - ok');
insert into jt (idn,str) values (6,'Test -60.0 ');
insert into jt (idn,str) values (7,'Test 7 but not 1');
insert into jt (idn,str) values (8,'Test 8 but not 2 anywhere');
insert into jt (idn,str) values (9,'Test number nine');
insert into jt (idn,str) values (10,' Test.10');
insert into jt (idn,str) values (11,'11.1.2.3');
commit;
So what I'd like to get is the IDN and numeric value
of STR (including leading minus signs and decimal points),
provided that there is no more than ONE distinct
set of congruent numbers. For example:
IDN NUM_VAL
--- -------
1 10
2 200
3 3000
4 4000
5 5.0
6 -60.0
10 .10
Rows 7 and 8 are excluded because they have two sets of congruent numbers. Row 9 is excluded because the string has no numerics. Row 11 is excluded because 11.1.2.3 is not a number.
So I wrote a little "isnumeric" function which traps an implicit TO_NUMBER:
create or replace function isnumeric
(p_string in varchar2)
return varchar2
as
l_number number;
begin
l_number := nvl(p_string,'x');
return 'Y';
exception
when others then
return 'N';
end;
/
And wrote up a regular expression query which ALMOST gets me there:
select idn,
rtrim(regexp_replace(trim(str),'[^-.0123456789]'),'-') num_val
from jt
where isnumeric(rtrim(regexp_replace(trim(str),'[^-.0123456789]'),'-')) = 'Y';
IDN NUM_VAL
--- -------
1 10
2 200
3 3000
4 4000
5 5.00
6 -60.0
7 71
8 82
10 .10
But I can't figure out how to exclude IDN's 7 and 8.
My regular expression concatenates the numbers together.
Any thoughts as to how I can identify sets of congruent numbers?
I tried combinations of TRANSLATE and REPLACE, but I'm still stuck.
Many Thanks!
- J
November 11, 2009 - 1:13 pm UTC
ops$tkyte%ORA11GR2> select str,
2 trim(regexp_replace(str,'[^-.0123456789]', ' ')) str2,
3 isnumeric(regexp_replace(str,'[^-.0123456789]', ' ')) flag
4 from jt;
STR STR2 F
------------------------------ -------------------- -
10 10 Y
Test 200 200 Y
Test 3000 ok 3000 Y
4000 Test 4000 Y
Test 5.00 - ok 5.00 - N
Test -60.0 -60.0 Y
Test 7 but not 1 7 1 N
Test 8 but not 2 anywhere 8 2 N
Test number nine N
Test.10 .10 Y
11.1.2.3 11.1.2.3 N
11 rows selected.
Teasing out congruent sets of numbers
J from Vancouver, November 06, 2009 - 12:31 pm UTC
Oh duh.
I just needed to add a "space" to the regexp.
From: [^-.0123456789]
To: [^-.0123456789 ]
Whoo-Hoo! =oP
Cheers!
- J
Reg Ex to split the name
A reader, January 12, 2010 - 12:50 pm UTC
I am trying to use Regular expression to split the full name into First Name , Middle Name , Last Name . Some of the records may not have a middle name . Tried few variations .. no luck so far.
SQL> With V_Data As
2 ( SELECT 'Walt F Disney' Fn FROM Dual
3 Union All
4 SELECT 'Walt Disney' Fn FROM Dual
5 )
6 Select fn ,
7 Regexp_Substr(Fn,'[^ ]+', 1, 1) Firstname ,
8 Regexp_Substr(Fn, ' [^ ]+ ') Middle ,
9 Regexp_Substr(Fn,'[^ ]+', 1, 3)Lastname
10 From V_Data;
FN FIRSTNAME MIDDLE LASTNAME
------------- ------------- ------------- -------------
Walt F Disney Walt F Disney
Walt Disney Walt
SQL>
SQL> With V_Data As
2 ( SELECT 'Walt F Disney' Fn FROM Dual
3 Union All
4 SELECT 'Walt Disney' Fn FROM Dual
5 )
6 Select fn ,
7 Regexp_Substr(Fn,'[^ ]+', 1, 1) Firstname ,
8 Regexp_Substr(Fn, ' [^ ]+ ') Middle ,
9 Regexp_Substr(Fn,'[^ ]+', 1, 2)Lastname
10 From V_Data;
FN FIRSTNAME MIDDLE LASTNAME
------------- ------------- ------------- -------------
Walt F Disney Walt F F
Walt Disney Walt Disney
January 18, 2010 - 3:43 pm UTC
ops$tkyte%ORA10GR2> with data
2 as
3 (select 'Tom Kyte' fn from dual
4 union all
5 select 'Thomas Jeffrey Kyte' fn from dual
6 union all
7 select ' Thomas J Kyte ' fn from dual
8 )
9 select piece1, case when piece3 is null then null else piece2 end, nvl(piece3,piece2)
10 from (
11 select substr( trim(fn), 1, instr(trim(fn)||' ',' ')-1) piece1,
12 substr( trim(fn), instr(trim(fn)||' ',' ',1,1)+1, instr(trim(fn)||' ',' ',1,2)-instr(trim(fn)||' ',' ',1,1)-1 ) piece2,
13 substr( trim(fn), instr(trim(fn)||' ',' ',1,2)+1, instr(trim(fn)||' ',' ',1,3)-instr(trim(fn)||' ',' ',1,2)-1 ) piece3
14 from data
15 )
16 /
PIECE1 CASEWHENPIECE3ISNUL NVL(PIECE3,PIECE2)
------------------- ------------------- -------------------
Tom Kyte
Thomas Jeffrey Kyte
Thomas J Kyte
if your data is predicable enough, that'll work.
Enhancements to fuzzy comparision between two columns
Logan Palanisamy, February 23, 2010 - 5:07 pm UTC
This is in response to posting by "Parthiban Nagarajan from Coimbatore, India" dated March 17, 2009.
His solution is pretty elegant, except for the following two cases:
1. Repeating string within the same column
2. End of the first column and the beginning of the second column match the end of the second column.
This is easily solved by introducing a space, (or some other character that is not normally part of the column) between the columns when they concatenated.
SQL>select * from t1;
X
------------
A8UI0$WER76$
KL6WER78*YIZ
012345012345
abcdefghi123
SQL>select * from t2;
Y
------------
HGDU0$WER78*
UL6WER78*MBV
123456123456
456abc123456
SQL>select t1.x, t2.y,
2 case when regexp_like(t1.x||t2.y, '(.{6}).*\1') then 1 else 0 end as match,
3 substr(regexp_substr(t1.x||t2.y, '(.{6}).*\1'),1,6) matched_str
4 from t1, t2;
X Y MATCH MATCHED_ST
------------ ------------ ---------------- ----------
A8UI0$WER76$ HGDU0$WER78* 1 0$WER7
A8UI0$WER76$ UL6WER78*MBV 0
A8UI0$WER76$ 123456123456 1 123456 <-- False match due to repeating strings within Y
A8UI0$WER76$ 456abc123456 0
KL6WER78*YIZ HGDU0$WER78* 1 WER78*
KL6WER78*YIZ UL6WER78*MBV 1 L6WER7
KL6WER78*YIZ 123456123456 1 123456 <-- False match due to repeating strings within Y
KL6WER78*YIZ 456abc123456 0
012345012345 HGDU0$WER78* 1 012345 <-- False match due to repeating strings within X
012345012345 UL6WER78*MBV 1 012345 <-- False match due to repeating strings within X
012345012345 123456123456 1 012345 <-- False match due to repeating strings within X
012345012345 456abc123456 1 012345 <-- False match due to repeating strings within X
abcdefghi123 HGDU0$WER78* 0
abcdefghi123 UL6WER78*MBV 0
abcdefghi123 123456123456 1 123456 <-- False match due to repeating strings within Y
abcdefghi123 456abc123456 1 123456 <-- False match. The last three chars of the X and first three chars of Y match the last six chars of Y
16 rows selected.
-- The false matches go away once the two columns are treated separately with a space, assuming space is not embedded in the columns. \s is the perl notation for space in regex.
SQL>
SQL>select t1.x, t2.y,
2 case when regexp_like(t1.x|| ' ' || t2.y, '(.{6}).*\s.*\1') then 1 else 0 end as match,
3 substr(regexp_substr(t1.x|| ' ' || t2.y, '(.{6}).*\s.*\1'),1,6) matched_str
4 from t1, t2;
X Y MATCH MATCHED_ST
------------ ------------ ---------------- ----------
A8UI0$WER76$ HGDU0$WER78* 1 0$WER7
A8UI0$WER76$ UL6WER78*MBV 0
A8UI0$WER76$ 123456123456 0
A8UI0$WER76$ 456abc123456 0
KL6WER78*YIZ HGDU0$WER78* 1 WER78*
KL6WER78*YIZ UL6WER78*MBV 1 L6WER7
KL6WER78*YIZ 123456123456 0
KL6WER78*YIZ 456abc123456 0
012345012345 HGDU0$WER78* 0
012345012345 UL6WER78*MBV 0
012345012345 123456123456 0
012345012345 456abc123456 0
abcdefghi123 HGDU0$WER78* 0
abcdefghi123 UL6WER78*MBV 0
abcdefghi123 123456123456 0
abcdefghi123 456abc123456 0
16 rows selected.
Regex to split names
Logan Palanisamy, February 24, 2010 - 6:42 pm UTC
SQL>with data
2 as
3 (select 'Tom Kyte' fn from dual
4 union all
5 select 'TomKyte' fn from dual
6 union all
7 select 'Thomas Jeffrey Kyte' fn from dual
8 union all
9 select ' Thomas J Kyte ' fn from dual
10 )
11 select
12 regexp_replace(fn, '^\s*([A-Z][a-z]*)\s*([A-Z][a-z]*)?\s*([A-Z][a-z]*).*$', '\1') first_name,
13 regexp_replace(fn, '^\s*([A-Z][a-z]*)\s*([A-Z][a-z]*)?\s*([A-Z][a-z]*).*$', '\2') middle_name,
14 regexp_replace(fn, '^\s*([A-Z][a-z]*)\s*([A-Z][a-z]*)?\s*([A-Z][a-z]*).*$', '\3') last_name
15 from data;
FIRST_NAME MIDDLE_NAME LAST_NAME
-------------------- -------------------- --------------------
Tom Kyte
Tom Kyte
Thomas Jeffrey Kyte
Thomas J Kyte
Regular Expression
Abhisek, September 29, 2010 - 10:12 am UTC
Hi Tom,
I have to use regular expression to remove the > and < braces from beginning and end only. e.g. if my text is '<DEMO > TEXT>' , it will be displayed as 'DEMO > TEXT'.
It will be great if we could translate them to < and > to look like '<DEMO > TEXT >
but any one will do.
Thanks a lot.
September 29, 2010 - 10:23 am UTC
good luck with that.
how do you plan on dealing with:
<demo > text> and this is something > something else, but this is less then - you know < - this <hello world>
what is the solution to that? how the heck to you know where a <> pair starts and ends??? what are your RULES - then and only then can any solution be provided or even considered.
Regular Expression
Abhisek, September 29, 2010 - 10:14 am UTC
I know it is possible using htf.escape_sc like
select htf.escape_sc('escape <this> one) from dual;
but I want it through regular expression. Please help.
September 29, 2010 - 10:25 am UTC
I probably won't be giving a regular expression - and no you cannot do it via escape_sc - that would change ALL of them - not just the ones you pointed at.
but first, you must - you MUST - provide your specific logic for how this works.
And it almost certainly won't be a regular express from me - it'll be an efficient, correct solution.
If escape_sc does solve your problem, then escape_sc is your answer.
regular expression
Abhisek, September 29, 2010 - 2:45 pm UTC
Hi Tom,
You are absolutely correct when you say that the escape_sc will change every < or > sign.
My requirement is I have to remove the < at the very first character and > is at the last character.. In the text I can have these signs but I am considered for first and last only.. e.g.
Input Text Output Text
<Hello World> Hello World
Hello World Hello World
<Hello World Hello World
<Hello > World> Hello > World
Hello <Test> World Hello <Test> World
Please do let me know if I am clear.
September 29, 2010 - 4:14 pm UTC
ops$tkyte%ORA11GR2> select str , ltrim(rtrim(str,'>'),'<')
2 from (
3 select '<Hello World>' str from dual union all
4 select 'Hello World' str from dual union all
5 select '<Hello World' str from dual union all
6 select '<Hello > World>' str from dual union all
7 select 'Hello <Test> World' str from dual )
8 /
STR LTRIM(RTRIM(STR,'>
------------------ ------------------
<Hello World> Hello World
Hello World Hello World
<Hello World Hello World
<Hello > World> Hello > World
Hello <Test> World Hello <Test> World
see, no regular expression, that would be the wrong way to do it.
Simply Awesome
ABhisek, September 29, 2010 - 4:54 pm UTC
This is really nice pretty way... Just was curious if we could replace the > and < signs with > and <
e.g.
Input Text Output_text
<Hello World> <Hello World>
<Hello > World> <Hello > World>
Thanks a lot.
September 29, 2010 - 5:52 pm UTC
ops$tkyte%ORA11GR2> select str , ltrim(rtrim(str,'>'),'<'),
2 decode( instr(str,'<'), 1, '<' ) ||
3 ltrim(rtrim(str,'>'),'<') ||
4 decode( instr(str,'>',-1), length(str), '>' )
5 from (
6 select '<Hello World>' str from dual union all
7 select 'Hello World' str from dual union all
8 select '<Hello World' str from dual union all
9 select '<Hello > World>' str from dual union all
10 select 'Hello <Test> World' str from dual )
11 /
STR LTRIM(RTRIM(STR,'> DECODE(INSTR(STR,'<'),1,'&
------------------ ------------------ --------------------------
<Hello World> Hello World <Hello World>
Hello World Hello World Hello World
<Hello World Hello World <Hello World
<Hello > World> Hello > World <Hello > World>
Hello <Test> World Hello <Test> World Hello <Test> World
or
ops$tkyte%ORA11GR2> select str , ltrim(rtrim(str,'>'),'<'),
2 case when str like '<%>'
3 then '<' || ltrim(rtrim(str,'>'),'<') || '>'
4 else ltrim(rtrim(str,'>'),'<')
5 end
6 from (
7 select '<Hello World>' str from dual union all
8 select 'Hello World' str from dual union all
9 select '<Hello World' str from dual union all
10 select '<Hello > World>' str from dual union all
11 select 'Hello <Test> World' str from dual )
12 /
STR LTRIM(RTRIM(STR,'> CASEWHENSTRLIKE'<%>'THEN'&
------------------ ------------------ --------------------------
<Hello World> Hello World <Hello World>
Hello World Hello World Hello World
<Hello World Hello World Hello World
<Hello > World> Hello > World <Hello > World>
Hello <Test> World Hello <Test> World Hello <Test> World
depending on whether you want it only when the text is enclosed in <>
Thanks a lot
Abhisek, September 30, 2010 - 1:33 am UTC
Hi Tom,
Thanks a lot for making life easier.
Meanwhile, I had to change a little bit to avoid SCAN SQL as it was looking for assignment variable.
select str , ltrim(rtrim(str,'>'),'<'),
case when str like '<%>'
then '&' ||'lt;' || ltrim(rtrim(str,'>'),'<') || '&' || 'gt;'
else ltrim(rtrim(str,'>'),'<')
end
from (
select '<Hello World>' str from dual union all
select 'Hello World' str from dual union all
select '<Hello World' str from dual union all
select '<Hello > World>' str from dual union all
select 'Hello <Test> World' str from dual )
Thanks a lot again.
September 30, 2010 - 7:31 am UTC
yeah, sorry about that - I set define off and that didn't make the cut and paste. I meant to have that in there...
How to use REGEXP_LIKE to find string with 'a','c', and 'z'
A reader, November 08, 2010 - 1:42 pm UTC
Hey Tom,
How to use REGEXP_LIKE to find string with 'a','c', and 'z'?
Thanks.
November 08, 2010 - 1:44 pm UTC
my answer would be to NOT use regexp_like to do that.
Just use instr - where instr(str,'a')>0 AND instr(str,'c')>0 AND instr(str,'z')>0
it will use slightly (kidding, a lot) less cpu than a regular expression would
Is there a way with regular expression
A reader, November 08, 2010 - 2:52 pm UTC
Tom,
Thank you for your prompt reply.
Is there a way to use ALL? I only see OR with regualr expression.
Thanks.
November 08, 2010 - 3:25 pm UTC
Not really worth my time - I've given you the *efficient* way to accomplish your goal - the *right* way to achieve your goal. I'd rather not chase "wrong ways" myself.
sorry.
REGEXP_SUBSTR
A reader, February 02, 2011 - 3:13 pm UTC
How do I achieve this
using REGEXP_SUBSTR
Requirement "portion of ATTRIBUTE up to and NOT including the first comma"
I tried the following but in for those cases where the values starts with 'comma' it is not working.
<code>
with lst as (select 'test,tes1' as val1,',t' val2 from dual)
select
val1 ,val2,
REGEXP_SUBSTR( val1 , '[^$,]+' ) as "regexp_val1 working",
REGEXP_SUBSTR( val2 , '[^$,]+' ) "regexp_val2 not working"
from lst;
VAL1 VAL2 regexp_val1 working regexp_val2 not working
------------------------------ ------------------------------ ------------------------------ -------
test,tes1 ,t test t
</code>
A reader, April 02, 2011 - 11:13 pm UTC
How would one achieve a Check constraint on a table column that stores string value which should only contain a string that beings with a number and should not contain '?' character?
April 12, 2011 - 11:35 am UTC
ops$tkyte%ORA11GR2> create table t
2 ( x varchar2(30) not null check( to_number(substr(x,1,1)) between 0 and 9 and instr( x, '?' ) = 0 )
3 )
4 /
Table created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> insert into t values ( '123abc' );
1 row created.
ops$tkyte%ORA11GR2> insert into t values ( '123?bc' );
insert into t values ( '123?bc' )
*
ERROR at line 1:
ORA-02290: check constraint (OPS$TKYTE.SYS_C0017241) violated
ops$tkyte%ORA11GR2> insert into t values ( 'y23abc' );
insert into t values ( 'y23abc' )
*
ERROR at line 1:
ORA-01722: invalid number
is one approach.
doubt regarding REGEXP_LIKE
Siby, July 05, 2011 - 2:11 pm UTC
Hi,
I have a regular expression which is used to check if an email address matches a certain pattern. This regular expression check goes in an infinite loop when I run it against the following value:
select 1 from dual
where
REGEXP_LIKE (
TRIM ('valenzuela@-call.abc.com-'),
'^[a-zA-Z]+(([-_''`.+!#$%&/=?^{}|~*])*\w*)*\@[A-Za-z0-9]+(-[A-Za-z0-9]+)*(\.[A-Za-z0-9]+(-[A-Za-z0-9]+)*){1,4}$')
I was under the impression that it would either pass or fail in case of a pattern match. Would you happen to know why this is happening so?
Thanks.
July 05, 2011 - 4:27 pm UTC
nope, you'll need a regular expression parser to help you with that and I'm definitely not one of those.
but not sure what you mean by infinite loop - that shouldn't happen (a hang), if it does, please contact support - it seems to fly for me:
ops$tkyte%ORA11GR2> select 1 from dual
2 where
3 REGEXP_LIKE (
4 TRIM ('valenzuela@call.abc.com'),
5 '^[a-zA-Z]+(([-_''`.+!#$%&/=?^{}|~*])*\w*)*\@[A-Za-z0-9]+(-[A-Za-z0-9]+)*(\.[A-Za-z0-9]+(-[A-Za-z0-9]+)*){1,4}$')
6 /
1
----------
1
ops$tkyte%ORA11GR2> edit test
ops$tkyte%ORA11GR2> @test
ops$tkyte%ORA11GR2> select 1 from dual
2 where
3 REGEXP_LIKE (
4 TRIM ('valenzuela@-call.abc.com-'),
5 '^[a-zA-Z]+(([-_''`.+!#$%&/=?^{}|~*])*\w*)*\@[A-Za-z0-9]+(-[A-Za-z0-9]+)*(\.[A-Za-z0-9]+(-[A-Za-z0-9]+)*){1,4}$')
6 /
no rows selected
same as above
Siby, July 05, 2011 - 2:43 pm UTC
Hi,
I put in the wrong example above. Please try executing the following query?
select 1 from dual
where
REGEXP_LIKE (
TRIM ('valenzuelabuentellojesusgerardo@-car.abc.com-'),
'^[a-zA-Z]+(([-_''`.+!#$%&/=?^{}|~*])*\w*)*\@[A-Za-z0-9]+(-[A-Za-z0-9]+)*(\.[A-Za-z0-9]+(-[A-Za-z0-9
]+)*){1,4}$')
Thanks.
July 05, 2011 - 4:28 pm UTC
i did, no problem for me, running 11.2.0.2
Gets tuck for me.
Siby, July 05, 2011 - 5:52 pm UTC
The last regular expression hangs on my database, running 11.1.0.7.0
select 1 from dual
where
REGEXP_LIKE (
TRIM ('valenzuelabuentellojesusgerardo@-car.abc.com-'),
'^[a-zA-Z]+(([-_''`.+!#$%&/=?^{}|~*])*\w*)*\@[A-Za-z0-9]+(-[A-Za-z0-9]+)*(\.[A-Za-z0-9]+(-[A-Za-z0-9
]+)*){1,4}$')
I'll see if i can get help from Oracle Support.
regexp is evil :-)
Sokrates, July 06, 2011 - 2:54 am UTC
@Siby
interesting, I
can reproduce your issue on 11.2.0.2 (linux x86-64):
11.2.0.2 > select 1 from dual
2 where
3 REGEXP_LIKE (
4 TRIM ('valenzuelabuentellojesusgerardo@-car.abc.com-'),
5 '^[a-zA-Z]+(([-_''`.+!#$%&/=?^{}|~*])*\w*)*\@[A-Za-z0-9]+(-[A-Za-z0-9]+)*(\.[A-Za-z0-9]+(-[A
6 -Za-z0-9]+)*){1,4}$')
7 /
(waiting for 5 minutes, then
kill -9 <PID>
on Server)
select 1 from dual
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 7270
Session ID: 6 Serial number: 1671
ERROR:
ORA-03114: not connected to ORACLE
Elapsed: 00:04:58.29
trace (events 10046, level 12) showed:
....
An invalid number has been seen.Memory contents are :
...
Regex is too complex
Steven Kladitis, July 07, 2011 - 10:49 am UTC
Using a popular regex tool, I got the following message on your regex and the test case.
The match attempt was aborted early because the regular expression is too complex.
The regex engine you plan to use it with may not be able to handle it at all and crash.
Look up "catastrophic backtracking" to learn how to avoid this situation.
Not sure what Oracle should do with regular expressions like that.
My 11gr2 test database became very busy for a long time, until I restarted it. I recommend using a tool like regex buddy to check out your regular expressions before running in any environment.
Regexp hanging
Siby, July 07, 2011 - 10:56 pm UTC
The regular expression works fine if I remove the "-" in the domain name. Putting just one hyphen causes all the problem.
Oracle repsonse for hanging REGEXP
Siby, July 08, 2011 - 8:09 am UTC
This was the response from Oracle Support:
This issue was already reported in
Bug 8579113: UPDATE HANGS USING REGEXP_REPLACE ON LOB COLUMN closed as not a bug
Bug 10036245: QUERY SPINS WHEN REGULAR EXPRESSION IS USED ( REGEXP_LIKE ) status 11 - potential duplicate of Bug 8579113
Bug 8579113 was closed as not a bug because the development considers that this is an usage issue as explained in REGEXP_LIKE Hangs and CPU Spikes After DB Upgrade To 11gr2 From 10.2.0.4 (Doc ID 1276379.1)
Bug 10036245: QUERY SPINS WHEN REGULAR EXPRESSION IS USED ( REGEXP_LIKE ) is still in work by development, but it seems that could be closed as duplicate of 8579113, so not an issue from Oracle point of view.
The SQL containing REGEXP_LIKE as below consumes high CPU and runs for ever:
SELECT * FROM dual WHERE REGEXP_LIKE('12aBC', '^(-{0,1}+{0,1})[0-9]+(.{0,1}[0-9]+)$')
The statement spins in the lxregmatch stack; and the only possible way to stop the statement is to kill the session.
As per bug 10036245 and bug 8579113, this is a usage issue.
REG_EXP LIKE
Deanna, August 08, 2011 - 1:03 pm UTC
There is a lot of good information within these posts about REGEXP_LIKE, thank you.
Why would reg_exp return invalid relational operator, is it because it is returning a boolean? I am trying to use REGEXP_LIKE in a SQL statement
The data column is a VARCHAR (2) and has values such as
abcd123
1345-9876
1234-455678-88997
abcdefg
I only want to return those fields that match the pattern:
##-####
I tried the following
select column from table where
(REGEXP_LIKE (column, '^\d{2}-\d{4}$')
I know its got to be something easy I am just not seeing it.
Any help you can provide is greatly appreciated
August 13, 2011 - 3:32 pm UTC
once you fix your parentheses, it works fine:
ops$tkyte%ORA11GR2> select dummy from dual where
2 (REGEXP_LIKE (dummy, '^\d{2}-\d{4}$')
3 /
(REGEXP_LIKE (dummy, '^\d{2}-\d{4}$')
*
ERROR at line 2:
ORA-00907: missing right parenthesis
ops$tkyte%ORA11GR2> select dummy from dual where
2 REGEXP_LIKE (dummy, '^\d{2}-\d{4}$')
3 /
no rows selected
Range parsing of text
J, February 16, 2012 - 9:52 pm UTC
Hello Tom.
Great thread - thanks so much.
Here's our challenge: Tidy up ranges by extracting
out the LOW_VAL, HIGH_VAL, UOM values from a text field.
And it's a royal mess!! Two of us have been tussling
with this for days - plsql developers here with
newbie (but enthusiastic) regex skills. The following
patterns are pretty representative of what we find.
Can you (or an ace regex friend) shed some light as
to how we can best approach this beast? We're still
learning how to "think regex" so any advice would be
greatly appreciated. We are on 11.2.0.3.
Thanks a bunch!
with x as
(select 'R1' nm,
null target_low_val,
null target_high_val,
null target_uom,
'baseline - no range info' notes
from dual union
select 'R1 S-7 FT',
null,
null,
null,
'letter "S" instead of a five'
from dual union
select 'R1-ABC 5.5FT',
5.5,
5.5,
'FT',
'min = max, no space before FT'
from dual union
select 'R1-ABC 5.5 FT',
5.5,
5.5,
'FT',
'min = max, extra spaces'
from dual union
select 'R1-ABC (0.5 FT - 2FT)',
0.5,
2,
'FT',
'sometimes they enclose w/parens'
from dual union
select 'R1 ABC 20-30 FT XYZ',
20,
30,
'FT',
'appended text'
from dual union
select 'R1/ABC 20-30 FTXYZ',
null,
null,
null,
'uom != FT'
from dual)
select nm, target_low_val, target_high_val,
target_uom, notes
from x;
February 17, 2012 - 5:28 am UTC
can we have an explanation of what the answer for the above example should be
and most importantly:
why it is the right answer - the rules for parsing and filling stuff in
Range parsing of text (2)
J, February 17, 2012 - 10:44 am UTC
Yes Sir! (thanks)
We see a couple of patterns. And this is how we have tried to translate into Regex'ese:
Please note: The value numbers are always positive (no numbers <= zero).
The RANGE pattern is as follows - luckily, the data entry for the range ALWAYS follows a [low value+dash+high value] pattern, with optional spaces between those three elements:
[descriptive text]
(might have spaces,dashes,slashes,parens) followed by
[one or more spaces] followed by
[zero or one open parens] followed by
[number - might have a decimal point] followed by
[zero or more spaces] followed by
[dash] followed by
[zero or more spaces] followed by
[number - might have a decimal point] followed by
[zero or one close parens] followed by
['FT'] followed by
[space or end-of-string]
The SINGLE-VALUE pattern is:
[descriptive text]
(might have spaces,dashes,slashes,parens) followed by
[one or more spaces] followed by
[number - might have a decimal point] followed by
[zero or more spaces] followed by
['FT'] followed by
[space or end-of-string]
The output would be:
[the entire string]
[the low value number]
[the high value number] - may be equal to the low value
['FT'] the UOM
Please let me know if you'd like to see a description of what we've been trying. Since the values we are after ALWAYS come on the back-half of the string, we figured to start by finding 'FT' and (hopefully) avoid the potential mess with the [descriptive text]. So we're trying backwards character-by-character evaluation anchored off of 'FT', and it's gruesome.
Hope this helps.
Thanks so very-very much!
February 17, 2012 - 9:00 pm UTC
I've sent an email to someone that likes to do regular expressions to see if they have an idea for you - anyone else is free to comment as well :)
try this
chris227, February 18, 2012 - 5:46 pm UTC
{code}
translate(regexp_substr(nm, '( |\()(\d+\.?\d*)', 1),'#(' ,'#') lv,
translate(regexp_substr(nm, '- *(\d+\.?\d*)', 1),'#- ','#') hv,
regexp_substr(nm, '(FT[^ ]*)', 1) uom,
nm,
target_low_val, target_high_val,
target_uom, notes
from x
{code}
try this follow up
chris227, February 18, 2012 - 5:52 pm UTC
sorry, did not know that editing is not possible here.
translate(regexp_substr(nm, '( |\()(\d+\.?\d*)', 1),'#(' ,'#') lv,
translate(regexp_substr(nm, '- *(\d+\.?\d*)', 1),'#- ','#') hv,
regexp_substr(nm, '(FT[^ ]*)', 1) uom,
nm
from x
LV HV UOM NM
- - - R1
20 30 FT R1 ABC 20-30 FT XYZ
- 7 FT R1 S-7 FT
0.5 2 FT R1-ABC (0.5 FT - 2FT)
5.5 - FT R1-ABC 5.5 FT
5.5 - FT R1-ABC 5.5FT
20 30 FTXYZ R1/ABC 20-30 FTXYZ
I did not know exactly what to do if uom is not FT, so i wrote it out. Just filter it from the result if not needed.
If # may be part of the data, take another character not being in or use regexp_replace instead e.g.
the part with # is nonsens
chris227, February 18, 2012 - 5:54 pm UTC
last statement on the use of # is nonsens, we are in the substring already, so there won't be a #
bug fix
chris227, February 18, 2012 - 6:26 pm UTC
i have done a "bugfix" in regard of the rule for after FT there must be space or end of line.
For R1-ABC (0.5 FT - 2FT) uom should be FT. But with the rule given above this could only be the first occurence of FT.
So FT- 2FT would not be matched.
But i think with the blueprint given it is possible to adjust it quit easy if necessary. (e.g. (FT |FT$|FT\))
select
translate(regexp_substr(nm, '( |\()(\d+\.?\d*)', 1),'#(' ,'#') lv,
translate(regexp_substr(nm, '- *(\d+\.?\d*)', 1),'#- ','#') hv,
regexp_substr(nm, '(FT |FT$)', 1) uom,
nm
from x
Range Parsing of Text(3)
J, February 21, 2012 - 9:51 am UTC
Tom and Chris.
THANK YOU!!! Awesome - this is waay more than we expected. Chris, this is a great bit of core code - we're going to parse this out (in English) to help us decipher what is actually going on in there, and then work it into our solution. BTW, just picked up "Mastering Regular Expressions", and the author does a pretty good job of explaining Regex'ese. Thanks again.
Splitting string with regular expression
Sangeetha, June 13, 2013 - 1:26 pm UTC
Hi Tom,
I have a scenario in which I need to migrate already existing data to a different structure. In our legacy system, there is a log column.Data is like this:
"
Log Time: 14/05/2009 08:31 Log By: Abubaker (26306) A/C Status: ETD 09:00 DOH (051) IAD
BAO R/H IDG FAULT , ENG RUN IS C/O , OCC IS INFORMED BEFORE 10 MINs , ONE HOUR DELAY REQUESTED,
Log Time: 14/05/2009 08:48 Log By: Abubaker (26306) A/C Status: ETD 09:00 DOH (051) IAD
GCU2 REPLACEMENT IN PROGRESS.
Log Time: 14/05/2009 09:20 Log By: Abubaker (26306) A/C Status: ETA 23:20 DOH (051) IAD
A/C DECLARED SERVICEABLE, OCC INFORMED AND REQUESTED TO EXPIDIT THE BOARDING."
I am trying to split this into another structure where, logtime, logged by, log desc are different columns and all these concatenated data will be separate rows in that table.So for this example , I will create 3 rows in the new table.
I tried many things with regular expression. But it is not coming correctly. Just wanted to know whether there is a way with regular expression?
Thanks and Regards
Sangeetha
June 18, 2013 - 3:18 pm UTC
there might be a way with regexp, but it isn't necessary at all. I never ever even consider using regexp unless and until there is no other way (very cpu intensive).
simple instr and substr will work fine here.
ops$tkyte%ORA11GR2> create table t ( x varchar2(4000) );
Table created.
ops$tkyte%ORA11GR2> insert into t values ( 'Log Time: 14/05/2009 08:31 Log By: Abubaker (26306) A/C Status: ETD 09:00
2 DOH (051) IAD
3 BAO R/H IDG FAULT , ENG RUN IS C/O , OCC IS INFORMED BEFORE 10 MINs , ONE HOUR
4 DELAY REQUESTED,' );
1 row created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select substr( x, instr( x, 'Log Time: ' )+10, 16 ),
2 rtrim( substr( x, instr( x, 'Log By: ' )+ 8, instr( x, 'A/C Status: ')-instr( x, 'Log By: ' )-8 ) ) ,
3 substr( x, instr( x, 'A/C Status: ')+12, 9 )
4 from t;
SUBSTR(X,INSTR(X
----------------
RTRIM(SUBSTR(X,INSTR(X,'LOGBY:')+8,INSTR(X,'A/CSTATUS:')-INSTR(X,'LOGBY:')-8))
-------------------------------------------------------------------------------
SUBSTR(X,
---------
14/05/2009 08:31
Abubaker (26306)
ETD 09:00
Thank You
Sangeetha, June 20, 2013 - 2:00 pm UTC
Hi Tom,
Thanks. Yes I did it with instr and substr.
Thanks again
Have a great day.
Sangeetha
pattern matching for strings,
A reader, October 01, 2013 - 7:26 pm UTC
Hello,
Table A has column job_name and table B has column table_name.
The values of Table A:
J_ICtoRZ_VehicleD_Staging
J_ICtoRZ_VehicleD_Merge_Optimized
J_ICtoRZ_VehicleBldCompF_Staging
J_ICtoRZ_VehicleBldCompF_Merge_Optimized
Values of Table B:
VEHICLE
VEHICLE_S
VEHICLE_BLD_COMP
VEHICLE_BLD_COMP_S
I want to join the two tables and get the appropriate tables for their corresponding job names. If the job name has "staging" keyword, then it need to pick the tablename that has "_S"
I want the output like:
J_ICtoRZ_VehicleD_Staging VEHICLE_S
J_ICtoRZ_VehicleD_Merge_Optimized VEHICLE
J_ICtoRZ_VehicleBldCompF_Staging VEHICLE_BLD_COMP
J_ICtoRZ_VehicleBldCompF_Merge_Optimized VEHICLE_BLD_COMP_S
How to achieve this?
Thanks,
Pattern matching,
A reader, November 07, 2013 - 5:54 pm UTC
Hello,
Table A has column job_name and table B has column table_name.
The values of Table A:
J_ICtoRZ_VehicleD_Staging
J_ICtoRZ_VehicleD_Merge_Optimized
J_ICtoRZ_VehicleBldCompF_Staging
J_ICtoRZ_VehicleBldCompF_Merge_Optimized
Values of Table B:
VEHICLE
VEHICLE_S
VEHICLE_BLD_COMP
VEHICLE_BLD_COMP_S
I want to join the two tables and get the appropriate tables for their corresponding job names. If
the job name has "staging" keyword, then it need to pick the tablename that has "_S"
I want the output like:
J_ICtoRZ_VehicleD_Staging VEHICLE_S
J_ICtoRZ_VehicleD_Merge_Optimized VEHICLE
J_ICtoRZ_VehicleBldCompF_Staging VEHICLE_BLD_COMP
J_ICtoRZ_VehicleBldCompF_Merge_Optimized VEHICLE_BLD_COMP_S
How to achieve this?
Thanks,