Skip to Main Content
  • Questions
  • Wildcards in Oracle Text query relaxation and escaping

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, David.

Asked: April 23, 2020 - 9:10 am UTC

Answered by: Chris Saxon - Last updated: April 29, 2020 - 4:18 pm UTC

Category: SQL - Version: Oracle Database 11g version 11.2.0.4.0

Viewed 100+ times

You Asked

Hi Tom,
my question is probably very basic but I found some trouble while using the query relaxation feature along with the '%' wildcard.

So I would like to escape all the chars in the query with {} and then as a suffix add the % to make a prefix search but got no luck.

With the escape chars I can't get any result, but when they aren't escaped it works just fine.

-- WORKS
contains(i.name,
            '<query>
                <textquery grammar="context">
                    <progression>
                        <seq>lawn%</seq>
                    </progression>
                </textquery>
                <score datatype="FLOAT" algorithm="DEFAULT"/>
            </query>', 1) > 0
-- DOESN'T
contains(i.name,
            '<query>
                <textquery grammar="context">
                    <progression>
                        <seq>{lawn}%</seq>
                    </progression>
                </textquery>
                <score datatype="FLOAT" algorithm="DEFAULT"/>
            </query>', 1) > 0


Can you briefly explain why?

Does the '%' auto-escape all the chars or should I manually write a custom function which will escape all the special chars with the single operator '\'?

with LiveSQL Test Case:

and we said...

I asked Roger Ford, PM for Oracle Text about this. His response is:

You cannot use { } in conjunction with %.

If you search for {lawn}% it will actually rewite internally to "{lawn} %" - note the space in front of the %. That causes it to become a two-word query which will match "the word lawn followed by any other word" - assuming it doesn't fail with "too many expansions", which it generally will.

The solution is to use backslash-escaping instead - \l\a\w\n\%
or even better decide whether you actually need to escape the ordinary characters in the first place.

This is covered in my blog post about my query parser:
https://blogs.oracle.com/searchtech/oracle-text-query-parser

and you rated our response

  (2 ratings)

Reviews

April 29, 2020 - 1:26 pm UTC

Reviewer: David

First thanks for the reply to you and mr. Ford!

I have already found this post and my parser is based on this one. But this parser the progRelax function does not work with non alphanumeric characters, if they are within a query they just get replaced and "marked" as unwanted.

So I need to write my own function but now I can not escape the }. Can you please explain to me in short how to escape all the reserved characters?




Chris Saxon

Followup  

April 29, 2020 - 4:18 pm UTC

Thanks.

I'm not sure exactly what you're trying to do. Could you clarify (with an example) please?

May 01, 2020 - 10:42 am UTC

Reviewer: A reader

Sorry!

Since the last word in my query should be "autocomplete" or at least it should propose the best fitting results, I need to use the wildcard "%".

As Mr. Ford says I can't use the multi-chars escaping method ({}), so I need to escape every special char with the backslash "\".

But I found some problems with the escaping of { and }. The official documentation says:

" To include the close brace character in an escaped query expression, use }}."

But if you run the following statements:

INSERT INTO inv_dev VALUES (10, '{}lawnmower');

-- throws an exception
SELECT * FROM inv_dev i
    WHERE CONTAINS(i.name,
     '<query> 
                <textquery grammar="context"> 
                    <progression> 
                        <seq>{}}law%</seq> 
                    </progression> 
                </textquery> 
                <score datatype="FLOAT" algorithm="DEFAULT"/> 
                <order> 
                    <orderkey> Score DESC </orderkey> 
                </order> 
            </query>', 1) > 0;

-- <seq>{}}law%</seq> or  <seq>\{}}law%</seq> or  <seq>%}}law%</seq>   DOESN'T WORK NEITHER


As I already said the example you sent to me works, because it just removes all the non alphnum chars.

More to Explore

Design

New to good database design? Check out Chris Saxon's full fundamentals class.