Skip to Main Content
  • Questions
  • Oracle Text ctxrule - MATCHES: special word work and not work?

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Hoang.

Asked: June 20, 2017 - 8:52 am UTC

Last updated: June 20, 2017 - 10:52 am UTC

Version: 12

Viewed 1000+ times

You Asked

Hi all,

I use CTXRULE to classify some text, but AB&B work and AT&T not work. Please support me on this:

Here is my search text
SQL> select text from test_lexer_special;

TEXT
--------------------
AB&B
{AT&T}
AT&T

SQL>

How I create index
SQL> EXEC CTX_DDL.CREATE_INDEX_SET('test_ctxrule_1');    
PL/SQL procedure successfully completed.

SQL> EXEC CTX_DDL.CREATE_PREFERENCE ('testmy_lexer', 'BASIC_LEXER');    
PL/SQL procedure successfully completed.

SQL> EXEC CTX_DDL.SET_ATTRIBUTE ('testmy_lexer', 'PRINTJOINS', '&');    
PL/SQL procedure successfully completed.

SQL> CREATE INDEX test_ctxrule_index1 ON test_lexer_special (text)
  2  INDEXTYPE IS CTXSYS.CTXRULE  
  3  PARAMETERS ('INDEX SET test_ctxrule_1 LEXER testmy_lexer STOPLIST CTXSYS.EMPTY_STOPLIST');

Index created.

SQL>

This works
SQL> SELECT *  FROM    test_lexer_special WHERE MATCHES(text, 'ab&b')> 0;

TEXT
--------------------
AB&B

SQL>

This does not work
SQL> SELECT *  FROM    test_lexer_special WHERE MATCHES(text, 'at&t')> 0;

no rows selected

SQL> select token_text,token_extra from DR$test_ctxrule_index1$I;

TOKEN_TEXT           TOKEN_EXTRA
-------------------- --------------------
AB                   {B}

SQL> 

Thanks.

and Chris said...

AT is still part of the stoplist! You can see this by reporting on the index with ctx_report.describe_index:

create table test_lexer_special (
  text varchar2(30)
);

set define off
insert into test_lexer_special values ('AB&B');
insert into test_lexer_special values ('{AT&T}');
insert into test_lexer_special values ('AT&T');

CREATE INDEX test_ctxrule_index1 ON test_lexer_special (text) INDEXTYPE IS CTXSYS.CTXRULE  
  PARAMETERS ('INDEX SET test_ctxrule_1 LEXER testmy_lexer STOPLIST CTXSYS.EMPTY_STOPLIST');

set long 2000000000
set head off
set pagesize 10000
select ctx_report.describe_index('TEST_CTXRULE_INDEX1') from dual;

===========================================================================
                             INDEX DESCRIPTION
===========================================================================
index name:                    "CHRIS"."TEST_CTXRULE_INDEX1"
index id:                      1284
index type:                    ctxrule

base table:                    "CHRIS"."TEST_LEXER_SPECIAL"
primary key column:            
text column:                   TEXT
text column type:              VARCHAR2(30)
language column:               
format column:                 
charset column:                
configuration column:          
Query Stats Enabled:           NO


status:                        INDEXED
full optimize token:           
full optimize count:           
docid count:                   3
nextid:                        4

===========================================================================
                               INDEX OBJECTS
===========================================================================
datastore:                     DIRECT_DATASTORE

filter:                        NULL_FILTER

section group:                 NULL_SECTION_GROUP

lexer:                         BASIC_LEXER

wordlist:                      BASIC_WORDLIST
   stemmer:                       ENGLISH
   fuzzy_match:                   GENERIC

stoplist:                      BASIC_STOPLIST
   stop_word:                     Mr
   stop_word:                     Mrs
   stop_word:                     Ms
   stop_word:                     a
   stop_word:                     all
   stop_word:                     almost
   stop_word:                     also
   stop_word:                     although
   stop_word:                     an
   stop_word:                     and
   stop_word:                     any
   stop_word:                     are
   stop_word:                     as
   stop_word:                     at
... etc. ...


So why has this happened? You specified an empty stoplist, right?

Well you've also specified an index set. These only apply to CTXCAT indexes. You're using CTXRULE. Unfortunately this doesn't throw an exception...

Specify the stoplist on its own and everything will be hunky dory:

drop INDEX test_ctxrule_index1;
CREATE INDEX test_ctxrule_index1 ON test_lexer_special (text) INDEXTYPE IS CTXSYS.CTXRULE  
  PARAMETERS ('STOPLIST CTXSYS.EMPTY_STOPLIST');
  
SELECT * FROM test_lexer_special WHERE MATCHES(text, 'at&t',1)> 0;

TEXT    
{AT&T}  
AT&T  

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Hoang Viet, June 20, 2017 - 12:00 pm UTC

Thanks Chris.

Bests.

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library