Home>Question Details



Zia -- Thanks for the question regarding "Fast wild-card searching", version 8.1.7

Submitted on 25-Mar-2005 20:05 Central time zone
Last updated 9-Nov-2009 14:24

You Asked

What is the best way to implement a solution to the following problem so that wild-card 
searching is very fast.

PROBLEM: Two column tabular data with ~100 million rows of the form given below.  
Searching is on the first column.  The number of search requests are in the order of 
100,000 in a twelve hour period with peak usage during a 4 hour period.

Column 1: varchar(25)
Column 2: varchar(15)

36IPZD7507761NJ, 3984399384
36IPZD7507762NJ, 3984399384
37IPZD7507761NJ, 3984388384
36IDDZ7507768NJ, 3984397384

SEARCH:   Could have leading, trailing or embedded wild-cards e.g. the following search 
strings should all match to the first row above.

36IPZD750776%
%750776%
36%7507761NJ 

and we said...

see the followup below by Ofir Manor for an even better concept..

think about it -- with leading %, you have to search the entire set, unless we break that 
field out.

We can use TEXT to index this.  There are two approaches:

a) store the string 36IPZD7507761NJ as  3 36 36I 36IP 36IPZ 36IPZD 36IPZD7 36IPZD75 
36IPZD750 36IPZD7507 36IPZD75077 3
6IPZD750776 36IPZD7507761 36IPZD7507761N 36IPZD7507761NJ 6 6I 6IP 6IPZ 6IPZD 6I
PZD7 6IPZD75 6IPZD750 6IPZD7507 6IPZD75077 6IPZD750776 6IPZD7507761 6IPZD750776
1N 6IPZD7507761NJ I IP IPZ IPZD IPZD7 IPZD75 IPZD750 IPZD7507 IPZD75077 IPZD750
776 IPZD7507761 IPZD7507761N IPZD7507761NJ P PZ PZD PZD7 PZD75 PZD750 PZD7507 P
ZD75077 PZD750776 PZD7507761 PZD7507761N PZD7507761NJ Z ZD ZD7 ZD75 ZD750 ZD750
7 ZD75077 ZD750776 ZD7507761 ZD7507761N ZD7507761NJ D D7 D75 D750 D7507 D75077
D750776 D7507761 D7507761N D7507761NJ 7 75 750 7507 75077 750776 7507761 750776
1N 7507761NJ 5 50 507 5077 50776 507761 507761N 507761NJ 0 07 077 0776 07761 07
761N 07761NJ 77 776 7761 7761N 7761NJ 76 761 761N 761NJ 61 61N 61NJ 1 1N 1NJ N
NJ J  and index that string as if it were words.

b) index a function that returns that big string -- saving us from having to persist the 
big string.


I'll demonstrate A and B here:


a)

ops$tkyte@ORA9IR2> create table t ( col1 varchar2(25), col2 varchar2(15), col1_shadow 
varchar2(4000) );
Table created.

extra column to hold the expanded string...
 
ops$tkyte@ORA9IR2> create view v as select col1, col2 from t;
View created.

just for my use -- to generate data below...

 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace procedure string_to_words( p_string in varchar2, 
p_return in out varchar2 )
  2  as
  3          l_n     number;
  4          l_piece varchar2(255);
  5  begin
  6          l_n := length(p_string);
  7          p_return := ' ';
  8          for i in 1 .. l_n
  9          loop
 10                  for j in 1 .. l_n-i+1
 11                  loop
 12                          l_piece := substr(p_string,i,j)||' ';
 13                          if ( nvl(instr( p_return, ' ' || l_piece ),0) = 0 )
 14                          then
 15                                  p_return := p_return || l_piece;
 16                          end if;
 17                  end loop;
 18          end loop;
 19  end;
 20  /
 
Procedure created.

turn the string into a "paragraph" of words
 
ops$tkyte@ORA9IR2> create or replace trigger t_trigger
  2  before insert or update on t for each row
  3  begin
  4          string_to_words(:new.col1, :new.col1_shadow );
  5  end;
  6  /
 
Trigger created.
 
ops$tkyte@ORA9IR2> insert into t values ( '36IPZD7507761NJ', '3984399384', null );
1 row created.
 
ops$tkyte@ORA9IR2> exec gen_data( 'V', 10000 );
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> commit;
Commit complete.
 
ops$tkyte@ORA9IR2> create index search_idx on T(col1_shadow)
  2  indextype is ctxsys.context parameters( 'datastore s2w_user_datastore lexer my_lexer 
memory 50m')
  3  /
Index created.
 
you would now accept the user index and build a string to pass into contains (BIND into 
contains in real life of course!).  You would break the string up on '%' and create an 
ANDed list of words

ops$tkyte@ORA9IR2> select col1,col2 from t where contains( col1_shadow, '{36IPZD750776}' 
) > 0 and (col1 like '36IPZD750776%');
 
COL1                      COL2
------------------------- ---------------
36IPZD7507761NJ           3984399384
 
 
ops$tkyte@ORA9IR2> select col1,col2 from t where contains( col1_shadow, '{750776}' ) > 0 
and (col1 like '%750776%');
 
COL1                      COL2
------------------------- ---------------
36IPZD7507761NJ           3984399384
 
ops$tkyte@ORA9IR2> select col1,col2 from t where contains( col1_shadow, '{36} and 
{7507761NJ}' ) > 0 and (col1 like '36%7507761NJ');
 
COL1                      COL2
------------------------- ---------------
36IPZD7507761NJ           3984399384


that'll use the index on col1_shadow in order to find all of the rows that might match 
-- and then use the SQL like clause to find the ones that do (order of terms matters, a 
leading % matters and the like clause cleans that up)

If you don't like the shadow column, you can do this (this is similar to what i do on 
this site -- I have an index that pulls data from three tables to construct a page)

ops$tkyte@ORA9IR2> create table t ( col1 varchar2(25), col2 varchar2(15) );
Table created.
 
ops$tkyte@ORA9IR2> grant select on t to ctxsys;
Grant succeeded.
 
ops$tkyte@ORA9IR2> @su ctxsys
ctxsys@ORA9IR2> create or replace procedure string_to_words( p_id in rowid, p_return in 
out varchar2 )
  2  as
  3          l_n     number;
  4          l_piece varchar2(255);
  5  begin
  6          for x in ( select * from ops$tkyte.t where rowid = p_id )
  7          loop
  8                  l_n := length(x.col1);
  9                  p_return := ' ';
 10                  for i in 1 .. l_n
 11                  loop
 12                          for j in 1 .. l_n-i+1
 13                          loop
 14                                  l_piece := substr(x.col1,i,j)||' ';
 15                                  if ( nvl(instr( p_return, ' ' || l_piece ),0) = 0 )
 16                                  then
 17                                          p_return := p_return || l_piece;
 18                                  end if;
 19                          end loop;
 20                  end loop;
 21          end loop;
 22  end;
 23  /
Procedure created.
 
ctxsys@ORA9IR2> grant execute on string_to_words to ops$tkyte;
Grant succeeded.
 
ctxsys@ORA9IR2> @connect /

ops$tkyte@ORA9IR2> begin
  2     ctx_ddl.drop_preference('s2w_user_datastore');
  3  end;
  4  /
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> begin
  2     ctx_ddl.create_preference('s2w_user_datastore',
  3                               'user_datastore' );
  4
  5     ctx_ddl.set_attribute( 's2w_user_datastore', 'procedure', 'STRING_TO_WORDS' );
  6  end;
  7  /
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> begin
  2    ctx_ddl.drop_preference('my_lexer');
  3  end;
  4  /
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> begin
  2    ctx_ddl.create_preference( 'my_lexer', 'BASIC_LEXER' );
  3    ctx_ddl.set_attribute( 'my_lexer', 'base_letter', 'YES');
  4    ctx_ddl.set_attribute( 'my_lexer', 'mixed_case', 'NO' );
  5  end;
  6  /
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> insert into t values ( '36IPZD7507761NJ', '3984399384' );
1 row created.
 
ops$tkyte@ORA9IR2> exec gen_data( 'T', 10000 );
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> commit;
Commit complete.
 
ops$tkyte@ORA9IR2> create index search_idx on T(col1)
  2  indextype is ctxsys.context parameters( 'datastore s2w_user_datastore lexer my_lexer 
memory 50m')
  3  /
Index created.
 
ops$tkyte@ORA9IR2> select * from t where contains( col1, '{36IPZD750776}' ) > 0 and (col1 
like '36IPZD750776%');
 
COL1                      COL2
------------------------- ---------------
36IPZD7507761NJ           3984399384
 
ops$tkyte@ORA9IR2> select * from t where contains( col1, '{750776}' ) > 0 and (col1 like 
'%750776%');
 
COL1                      COL2
------------------------- ---------------
36IPZD7507761NJ           3984399384
 
ops$tkyte@ORA9IR2> select * from t where contains( col1, '{36} and {7507761NJ}' ) > 0 and 
(col1 like '36%7507761NJ');
 
COL1                      COL2
------------------------- ---------------
36IPZD7507761NJ           3984399384
 

You'll want to check out parallel index creation and other options in the TEXT 
documentation as well for 100,000,000 records

 

Reviews    
5 stars Why not native Oracle Text functionality?   March 26, 2005 - 1pm Central time zone
Reviewer: Jack Wells from Boca Raton, FL USA
Whew!  Very nice and elegant.  I know I can use this somewhere (just need to find a place).

It is surprising, though, that Oracle Text doesn't already have a native "preference function" to 
use as an indextype which does a "string to words" conversion.  Seems so powerful and generic at 
the same time.  Why don't you submit this to Oracle TEXT team so they can include it in the next 
release!

Thanks again for all you do,
Jack
 


5 stars Why?   March 26, 2005 - 6pm Central time zone
Reviewer: A reader 
select * from t where contains( col1, '{36IPZD750776}' ) > 0 
and (col1 like '36IPZD750776%');

I dont understand the second predicate above. The first one says look for the string anywhere in 
the column. The second one says look for columns that start with the string. Doesnt that contradict 
each other? 


Followup   March 26, 2005 - 6pm Central time zone:

contradict?

no...

we are using the text index to find the rows that contain the string 36IPZD750776.
we use like to make sure it meets the specification the end user supplied.


if it contradicted -- there would be zero rows returned, but there were rows returned 

5 stars   March 26, 2005 - 9pm Central time zone
Reviewer: A reader 
I still dont get it. That query will return rows only for records where col1 starts with 
36IPZD750776 (the LIKE ensures that). But the user specified anything that contains the string 
36IPZD750776. The contains already does that. So why is the LIKE needed? 


2 stars Why not use a builtin functionality?   March 27, 2005 - 3am Central time zone
Reviewer: Ofir Manor from Israel
Hi Tom,
I wonder why you hand-code a functionality that is already supplied? If you want to make '%xxx%' 
queries faster, all you need to do is ask the text index to prepare for it, by indexing all the 
substrings as well. There is no need to manually write a procedural code to implement this 
feature... Did I miss anything?
begin
ctx_ddl.create_preference('SUBSTRING_PREF', 
                          'BASIC_WORDLIST');
ctx_ddl.set_attribute('SUBSTRING_PREF', 
                      'SUBSTRING_INDEX','TRUE');
end;
drop index search_idx;
create index search_idx on T(col1)
indextype is ctxsys.context parameters ('wordlist SUBSTRING_PREF MEMORY 50M');
..
Then this query will do:
select * from t where contains( col1,'%750776%')>0
There are some more details and preference attributes to play with here, plus an example:
http://download-west.oracle.com/docs/cd/B14117_01/text.101/b10730/cdatadic.htm#sthref750
How about that? 


Followup   March 27, 2005 - 10am Central time zone:

excellent.  What they've done there is slightly different than what I suggested.  They break the 
string into two pieces:

ops$tkyte@ORA9IR2> select * from DR$SEARCH_IDX$P order by length(pat_part1);
 
PAT_PART1       PAT_PART2
--------------- ---------------
                36IPZD7507761NJ
3               6IPZD7507761NJ
36              IPZD7507761NJ
36I             PZD7507761NJ
36IP            ZD7507761NJ
36IPZ           D7507761NJ
36IPZD          7507761NJ
36IPZD7         507761NJ
36IPZD75        07761NJ
36IPZD750       7761NJ
36IPZD7507      761NJ
36IPZD75077     61NJ
36IPZD750776    1NJ
 
13 rows selected.


and then query:

SELECT LTRIM(PAT_PART1||PAT_PART2) FROM "OPS$TKYTE"."DR$SEARCH_IDX$P"  WHERE PAT_PART2 LIKE :lkexpr

when there is a leading %


sweet, I did not know that -- or  (I too learn something new about Oracle every day -- and probably 
forget two things I used to know).


Thanks very much for the followup, that looks very promising. 

5 stars Thanks   March 27, 2005 - 6pm Central time zone
Reviewer: Zia Hydari from Boston, MA
I appreciate the response and the follow-up.

Thanks
Zia Hydari
hydari _at_ mit.edu


I am not a database developer but this piece forms a part of the bigger problem that I am trying to 
solve.  I hope to get a better buy-in now on my overall solution. 


5 stars   March 27, 2005 - 6pm Central time zone
Reviewer: A reader 
I still dont get it. That query will return rows only for records where col1 starts with 
36IPZD750776 (the LIKE ensures that). But the user specified anything that contains  the string 
36IPZD750776. The contains already does that. So why is the LIKE needed? 

 


Followup   March 27, 2005 - 6pm Central time zone:

No, the original poster said:

36IPZD750776%
%750776%
36%7507761NJ 


I wrote the contains, the contains I wrote did not have the leading edge check.  the LIKE did that 
for me. 

5 stars very interesting thread   March 27, 2005 - 11pm Central time zone
Reviewer: Doug from CT, USA
This is very interesting.. I am going to spend some time on this.  Can I go down a slight sidetrack 
and ask how these ctxsys indexes work under the covers?   


Followup   March 28, 2005 - 7am Central time zone:

alter session set sql_trace=true;
select * from t where contains(...) > 0;


edit trace file.....



in short, they work like an inverted list index. the text is parsed and stored with a blob per word 
(the word could be repeated over and over in the index over time).  

When you say "where contains(col,'text') > 0", it takes the text and queries this index table with 
the keyword/blob, the blob points to the documents of interest.

It is sort of like a bitmap index. 

5 stars most useful stuff   March 28, 2005 - 3am Central time zone
Reviewer: Matthias Rogel from Kaiserslautern, Germany
***** 


3 stars Re:   March 28, 2005 - 7am Central time zone
Reviewer: Jagjeet Singh from INDIA
Sir,

   I just want to know Why this apprach is not good.
   Because you did not suggested it.

   The only problem with index scanning is.. if % 
   is at the leading character. 

    We can just added an extra character before the value.

   -- put this logic in trigger ..    :new.col := 'X'||:new.col

  and on query  

 Select * from tab where col = 'X'||%750776%' ;

Thanks, 


Followup   March 28, 2005 - 8am Central time zone:

and what would that accomplish exactly?

everything starts with X -- all 100,000,000 records.  like 'X%....' would scan them all. 

3 stars How to search this string   March 28, 2005 - 7am Central time zone
Reviewer: satya from USA
Hello Tom,

    How do you solve this problem .

   I have create a function which will except two param and return varchar2 field


   First Param  is String  and second param is Keyword. 


   Let say  The following patters strings 

         A) XYZ:123456;ABC:99999
         B) ABC:99999;XYZ:123456;
         C) ABC:99999;XYZ:123456
         D) ABC:99999;
         E) ABC:99999


  Keywords are the following may be in the following order 

       i)  XYZ
       ii) ABC 

The user can pass the following  combinations
       
       1)   (A,i) or  (B,i)   REsult is   123456

       2)   (A,ii)  or (B,ii) or (c,ii)  or (D,ii) or (E,ii)  => 99999

I use the INSTR key word get the result but the function is returning value only for string B and D 


Could you please suggest me something 

Thanks 

Satya

 


Followup   March 28, 2005 - 8am Central time zone:

not understanding what (A,i) or (B,i) means.

create table
insert into table
your attempted query
better explanation of what you mean...

then we can take a look. 

2 stars here is again   March 28, 2005 - 8am Central time zone
Reviewer: satya from U
Hello Tom,

    How do you solve this problem .

   I have to create a function which will accept two param and return varchar2 field(keyword)


   First Param  is String  and second param is Keyword. 


   Let say  The following patters strings 

         A) XYZ:123456;ABC:99999
         B) ABC:99999;XYZ:123456;
         C) ABC:99999;XYZ:123456
         D) ABC:99999;
         E) ABC:99999


  Keywords are the following may be in the following order 

       i)  XYZ
       ii) ABC 

The user can pass the following  combinations
       
       1)   (A,i) or  (B,i)   REsult is   123456

       2)   (A,ii)  or (B,ii) or (c,ii)  or (D,ii) or (E,ii)  => 99999

    Where  A,B,C,D,E are strings  that is  "XYZ:123456;ABC:99999"

   Let us say the function is mykeyword(mystring varchar2, mykey varchar2) return  varchar2;

     SQL>  select mykeyword('XYZ:123456;ABC:99999','ABC') from dual ;  Expected answer will be 
99999
     SQL>  select mykeyword('ABC:99999;XYZ:123456;','ABC') from dual ;  Expected answer will be 
99999
     SQL>  select mykeyword('ABC:99999','ABC') from dual ;  Expected answer will be 99999
      

I use the INSTR key word get the result but the function is returning value only for string B and D 


Could you please suggest me something 

Thanks 

Satya 


Followup   March 28, 2005 - 9am Central time zone:

ops$tkyte@ORA9IR2> variable pat varchar2(5)
ops$tkyte@ORA9IR2> exec :pat := 'ABC'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select x, substr( a, 1, instr(a,';')-1 )
  2    from (
  3  select x, substr( x, instr( x, :pat || ':' )+4 ) || ';' a
  4    from t
  5   where instr(x,:pat||':') > 0
  6         )
  7  /
 
X                         SUBSTR(A,1,INSTR(A,';')-1)
------------------------- --------------------------
XYZ:123456;ABC:99999      99999
ABC:99999;XYZ:123456;     99999
ABC:99999;XYZ:123456      99999
ABC:99999;                99999
ABC:99999                 99999
 
ops$tkyte@ORA9IR2> exec :pat := 'XYZ';
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> /
 
X                         SUBSTR(A,1,INSTR(A,';')-1)
------------------------- --------------------------
XYZ:123456;ABC:99999      123456
ABC:99999;XYZ:123456;     123456
ABC:99999;XYZ:123456      123456
 

5 stars normalize?   March 28, 2005 - 8am Central time zone
Reviewer: Jon 
> 36IPZD7507761NJ, 3984399384
> 36IPZD7507762NJ, 3984399384
> 37IPZD7507761NJ, 3984388384
> 36IDDZ7507768NJ, 3984397384
> 
> 36IPZD750776%
> %750776%
> 36%7507761NJ 

Shouldn't col1 really be about 4 columns?  From the examples, it looks like the first two numbers 
mean something.  Break that into a column.  Then the next four seem to mean something else; another 
column.  7507761 looks like a part number; third column.  NJ is the state; fourth column.  Most 
column values are probably low in cardinality too so bitmap indexes would work well here.  

Once you have it in a more normalized structure, then you can use any tool to query the data in an 
efficient manner.   


5 stars Adding more details ...   March 28, 2005 - 8am Central time zone
Reviewer: Zia Hydari from Boston, MA
- There are NO create/delete/update operations during the 12 hour period when search needs to 
perform optimally.  

- There would be a nightly feed; my guesstimate is that we will have about 4 hours for any 
processing of the data (building the index etc).  

- Leading and trailing wild-card searches would make the highest percentage of searches with 
embedded wild-card expected to form a small percentage (<10%) of the total number of searches.

Is any tweaking possible based on this additional info?

Zia Hydari
hydari _at_ mit.edu 


Followup   March 28, 2005 - 9am Central time zone:

using the substr index with a simple contains would be the approach to benchmark. 

3 stars if the length varies   March 28, 2005 - 9am Central time zone
Reviewer: s 
Hello Tom,

    How do you solve this problem .

   I have to create a function which will accept two param and return varchar2 
field(keyword)


   First Param  is String  and second param is Keyword. 


   Let say  The following patters strings 

         A) XYZ:123456;ABC:99999
         B) ABC:99999;XYZ:123456;
         C) ABC:99999;XYZ:123456
         D) ABC:99999;
         E) ABC:99999


  Keywords are the following may be in the following order 

       i)  XYZ
       ii) ABC 

The user can pass the following  combinations
       
       1)   (A,i) or  (B,i)   REsult is   123456

       2)   (A,ii)  or (B,ii) or (c,ii)  or (D,ii) or (E,ii)  => 99999

    Where  A,B,C,D,E are strings  that is  "XYZ:123456;ABC:99999"

   Let us say the function is mykeyword(mystring varchar2, mykey varchar2) 
return  varchar2;

     SQL>  select mykeyword('XYZ:123456;ABC:99999','ABC') from dual ;  Expected 
answer will be 99999
     SQL>  select mykeyword('ABC:99999;XYZ:123456;','ABC') from dual ;  Expected 
answer will be 99999
     SQL>  select mykeyword('ABC:99999','ABC') from dual ;  Expected answer will 
be 99999
      

I use the INSTR key word get the result but the function is returning value only 
for string B and D 

Could you please suggest me something 

Thanks 

Satya 


Followup:  
ops$tkyte@ORA9IR2> variable pat varchar2(5)
ops$tkyte@ORA9IR2> exec :pat := 'ABC'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select x, substr( a, 1, instr(a,';')-1 )
  2    from (
  3  select x, substr( x, instr( x, :pat || ':' )+4 ) || ';' a
  4    from t
  5   where instr(x,:pat||':') > 0
  6         )
  7  /
 
X                         SUBSTR(A,1,INSTR(A,';')-1)
------------------------- --------------------------
XYZ:123456;ABC:99999      99999
ABC:99999;XYZ:123456;     99999
ABC:99999;XYZ:123456      99999
ABC:99999;                99999
ABC:99999                 99999
 
ops$tkyte@ORA9IR2> exec :pat := 'XYZ';
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> /
 
X                         SUBSTR(A,1,INSTR(A,';')-1)
------------------------- --------------------------
XYZ:123456;ABC:99999      123456
ABC:99999;XYZ:123456;     123456
ABC:99999;XYZ:123456      123456

 
Thanks Tom

  But Actually the values XYZ or not fixed lenght 

that is some time it XYZXYZ or XYZXY or XYZ
so the string will be 

 ABC:99999;XYZXYZ:123456 
 ABC:99999;XYZX:123456     

How do we solve in this case 

Thanks again      
   


Followup   March 28, 2005 - 10am Central time zone:

use length(:pat)+1 instead of 4 then? 

2 stars string search   June 10, 2005 - 2am Central time zone
Reviewer: Rahul from New Delhi
Hi Tom,
   Its never the case that you search for a topic and dont learn something new each time. This site 
is like the Bible for all the DBA's. I read this post and few other but could not find some thing 
specific to solve this problem.
   How do i search for a string in a schema. Mean how do i get the table and column details 
containing a particular string. Is their any utility availbale?

Thanks,
Rahul. 


Followup   June 10, 2005 - 8am Central time zone:

there is no "database grep" like that -- it is highly unusual to ask to search "every table and 
every column looking for a string" in structured information.

but you can always automate things:

create or replace procedure search_schema( p_string in varchar2 )
authid current_user
as
    l_cols long;
    l_where long;
    l_cursor sys_refcursor;
    l_cname  varchar2(4000);
begin
    dbms_application_info.set_client_info( '%' || p_string || '%' );
    for x in ( select table_name from user_tables )
    loop
        l_cols := 'case when 1=0 then ''x'' ';
        l_where := ' where ( 1=0 ';
        for y in ( select '"' || column_name || '"' column_name
                     from user_tab_columns
                    where table_name = upper(x.table_name)
                       and (data_type in ( 'CHAR', 'DATE', 'FLOAT', 'NCHAR', 'NUMBER', 'NVARCHAR2', 
'VARCHAR2' )
                           or data_type like 'INTERVAL%' or data_type like 'TIMESTAMP%' )
                )
        loop
            l_cols := l_cols || ' when ' || y.column_name ||
                      ' like sys_context(''userenv'',''client_info'') then ' ||
                      ' ''' || y.column_name || '''';
            l_where := l_where || ' or ' || y.column_name || ' like 
sys_context(''userenv'',''client_info'') ';
        end loop;
        open l_cursor for 'select ' || l_cols || 'else null end cname from ' ||
                           x.table_name || l_where || ') and rownum=1';
        fetch l_cursor into l_cname;
        if ( l_cursor%found )
        then
            dbms_output.put_line( x.table_name || ' ' || l_cname );
        end if;
        close l_cursor;
    end loop;
end;
/

(might not be perfect, hacked together from other bits of code) 

5 stars Thanks Tom   June 13, 2005 - 12am Central time zone
Reviewer: rahul from New Delhi
Hi Tom,
   Its impossible to belive that any problem can hold you for a long time. I dont remember any time 
when i have asked for a solution and didn't get one.
   Just one doubt could we have used Oracles "ultra search".
I have not read anything about it but the name suggest it can help for search. Just a thought.

Thanks,
Rahul. 


3 stars Searching a string in a Database schema   June 17, 2005 - 3am Central time zone
Reviewer: Abir Bhattacharya from India
Hi Tom 

This is in regards to the procedure you had supplied (search_schema) using sys_refcursor . I have 
got a similar requirement but my DB happens to be in Oracle 8i which is not supported in 8i. I 
tried using REF CURSOR by defining a type in a separate package and trying implement that in the 
stored proc itself . even if the proc do compile up but I am unable to get the desired result.

Code Snippet :


CREATE OR REPLACE PACKAGE Types AS 
  TYPE cursor_type IS REF CURSOR;
END Types;


CREATE OR REPLACE procedure search_schema( p_string in varchar2)
authid current_user
as
    l_cols long;
    l_where long;
    /*l_cursor sys_refcursor;*/
    l_cursor Types.cursor_type;
    l_cname  varchar2(4000);
begin
    dbms_application_info.set_client_info( '%' || p_string || '%' );
    for x in ( select table_name from user_tables )
    loop
        l_cols := 'case when 1=0 then ''x'' ';
        l_where := ' where ( 1=0 ';
        for y in ( select '"' || column_name || '"' column_name
                     from user_tab_columns
                    where table_name = upper(x.table_name)
                       and (data_type in ( 'CHAR', 'DATE', 'FLOAT', 'NCHAR','NUMBER', 'NVARCHAR2', 
'VARCHAR2' )
                           or data_type like 'INTERVAL%' or data_type like'TIMESTAMP%' )
                )
        loop
            l_cols := l_cols || ' when ' || y.column_name ||
                      ' like sys_context(''userenv'',''client_info'') then ' ||
                      ' ''' || y.column_name || '''';
            l_where := l_where || ' or ' || y.column_name || ' like
sys_context(''userenv'',''client_info'') ';
        end loop;
        open l_cursor for 'select ' || l_cols || 'else null end cname from ' || x.table_name || 
l_where || ') and rownum=1';
        fetch l_cursor into l_cname;
        if ( l_cursor%found )
        then
            dbms_output.put_line( x.table_name || ' ' || l_cname );
        end if;
        close l_cursor;
    end loop;
end; 


Followup   June 17, 2005 - 2pm Central time zone:

SQL> @test

Package created.


Procedure created.

SQL> set serveroutput on size 1000000
SQL> exec search_schema( 'KING' )
EMP "ENAME"

PL/SQL procedure successfully completed.

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle8i Enterprise Edition Release 8.1.7.4.1 - Production
PL/SQL Release 8.1.7.4.0 - Production
CORE    8.1.7.2.1       Production
TNS for 32-bit Windows: Version 8.1.7.4.0 - Production
NLSRTL Version 3.4.1.0.0 - Production


it works? 

5 stars memory for wordlist   September 6, 2005 - 5pm Central time zone
Reviewer: mohini from germany
9i (Release 2)
create index search_idx on T(col1)
indextype is ctxsys.context parameters ('wordlist SUBSTRING_PREF MEMORY 50M');

How does one come up with the memory part of this parameter..
didn't find any info in the docs.

Thanks 


3 stars Could you please help me   September 12, 2005 - 12pm Central time zone
Reviewer: sat from USA
Could you help me how to do this 


  Assume that a column conmtaing the following data 

  1) XYZ:123;XYZ:345;

     function should return ABC  key word is XYZ  

  2) XYZ:123;ABC:789;
        
       keyword's are XYZ ,ABC (valid keyword)

      function should return error if  more than KEY word found it is ERROR 
       error message ->  more than one key word found 

  3)  XYZ:123;A12:789;

      keyword's are XYZ and invalid keyword is A12 

    function should return error because A12 is not valid KEY word 



 I have to insert these errors in error table . 2 and 3 rd conditions are handle this way 

  if keyword in ('XYZ','ABC') then 
     if morethan onekey word is error 

     else 
         return the value  ex: condition 1
    end if 
 else 
       error message A12 is not valid key word.
 end if; 


Followup   September 12, 2005 - 2pm Central time zone:

didn't get the ABC comment in #1, ABC didn't come into play.

you'll have to parse the string and implement the logic.  this is just a procedural thing you need 
to write. 

4 stars How to search for rows that have % stored in a column   October 4, 2005 - 11pm Central time zone
Reviewer: A reader 
Tom,
    Let say you have a table with following data.

create table t (c varchar2(10));
insert into t values ('Test%Row4');
insert into t values ('Test%Row2');
insert into t values ('TestRow3');
insert into t values ('TestRow2');
insert into t values ('TestRow1');

How can I write a select statement using "like", to get the rows that have "%" stored in the column 
?
 Thank you so much in advance
 


Followup   October 5, 2005 - 7am Central time zone:

ops$tkyte@ORA9IR2> select * from t where c like '%\%%' escape '\';

C
----------
Test%Row4
Test%Row2


escape 'any character', set up an escape character for your like clause. 

4 stars How to search for rows that have % stored in a column   October 4, 2005 - 11pm Central time zone
Reviewer: A reader 
Tom,
    Let say you have a table with following data.

create table t (c varchar2(10));
insert into t values ('Test%Row4');
insert into t values ('Test%Row2');
insert into t values ('TestRow3');
insert into t values ('TestRow2');
insert into t values ('TestRow1');

How can I write a select statement using "like", to get the rows that have "%" stored in the column 
?
 Thank you so much in advance
 


4 stars Intersect two clobs   October 30, 2005 - 9pm Central time zone
Reviewer: SH from US
Reader:
you can use the below after indexing the field using Oracle Text: select * from t where 
CONTAINS(c,'%Test\%%')> 0 escape '\';
select '\&1' from dual;
'&
--
&1
====================================
Hi Tom,
Excellent site!
I have the following structure:

create table t1 (t1_id number, t1_seqnos clob);
create table t2 (n number);
insert into t2 values(453);
insert into t2 values(233214);
insert into t2 values(1324);
insert into t2 values(3);
insert into t1 values (1 ,',453,4999,233214,5555,0,99,3,');
insert into t1 values (2,',233214,134,');
insert into t1 values (3,',2,4,5,7,8,245,56,3465,');
insert into t1 values (4,',4444,7636,32,5,6,8,');
select * from t1;
           T1_ID T1_SEQNOS
---------------- ------------------------------------------------------------------------
               1 ,453,4999,233214,5555,0,99,3,
               2 ,233214,134,
               3 ,2,4,5,7,8,245,56,3465,
               4 ,4444,7636,32,5,6,8,
select * from t2; --sqlstr provided
               N
----------------
             453
          233214
            1324
               3

Please let know whats the BEST way to:
update T1 
set T1_SEQNOS = (comma separated set of numbers which are present in T1_SEQNOS and also present in 
sqlstr provided)
where T1_ID=&id;

eg. it should update T1.T1_SEQNOS = ',453,233214,3,' where T1_ID=1

I used dynamic sql to get the seqnos from T2 in a cursor and then check if each of those seqnos is 
present in T1.T1_SEQNOS. 

ctx_ddl.create_preference('adsc_lexer','BASIC_LEXER');
ctx_ddl.set_attribute('adsc_lexer','printjoins',',');
create index ind_1 on t1(T1_SEQNOS) indextype is ctxsys.context parameters('lexer adsc_lexer');
select 1 from t1 where T1_ID=1 and CONTAINS(T1_SEQNOS,'%453%')> 0;

But its giving wrong results:
> select * from t1;
           T1_ID T1_SEQNOS
---------------- -------------------------------------------
               1 ,453,4999,233214,5555,0,99,3,
               2 ,233214,134,
               3 ,2,4,5,7,8,245,56,3465,
               4 ,4444,7636,32,5,6,8,

> select T1_SEQNOS from t1 where T1_ID=1 and CONTAINS(T1_SEQNOS,'%,5,%')> 0;
T1_SEQNOS
------------------------------------------------------------,453,4999,233214,5555,0,99,3,

> select T1_SEQNOS from t1 where T1_ID=1 and CONTAINS(T1_SEQNOS,'%,134,%')> 0;
T1_SEQNOS
------------------------------------------------------------
,453,4999,233214,5555,0,99,3, 


4 stars CRM query   December 13, 2005 - 8am Central time zone
Reviewer: A reader 
Hi

I have seen in many CRM users issues queries such as

select *
from contacts
where surname like '%GORDON%

These sort of query is quite slow since we cannot use indexes. Is it a good idea in an OLTP system 
use CTXSYS.CONTEXT indextype for such queries?

 


Followup   December 13, 2005 - 9am Central time zone:

you certainly can - sure, just bear in mind that the ctxsys.context index will always be "a little 
behind". it is generally not maintained in real time. 

4 stars context index maintenance   December 13, 2005 - 9am Central time zone
Reviewer: A reader 
Hi

I am sort of aware the maintenance of context index is different however I am not able to find what 
sort of maintenance we need for these indexes in the documents.

I read this:

Index maintenance is necessary after your application inserts, updates, or deletes documents in 
your base table. Index maintenance involves synchronizing and optimizing your index.

However, if your application performs DML operations (inserts, updates, or deletes) on your base 
table, you must synchronize your index. You can synchronize your index manually with the 
CTX_DDL.SYNC_INDEX PL/SQL procedure.


What can go wrong with this sort of maintenance in a production system? 


Followup   December 13, 2005 - 12pm Central time zone:

nothing should "go wrong".

It'll obviously add some additional amount of workload (the extent of which you'll discover when 
you *test* this) 

5 stars Some Confusion   December 13, 2005 - 3pm Central time zone
Reviewer: Su Baba from Ca, USA
I'm confused as to why the first select statement not returning anything?


SQL> create table t ( col1 varchar2(25), col2 varchar2(15) );

Table created.

SQL> 
SQL> grant select on t to ctxsys;

Grant succeeded.

SQL> 
SQL> 
SQL> begin
  2  ctx_ddl.create_preference('SUBSTRING_PREF', 
  3                            'BASIC_WORDLIST');
  4  ctx_ddl.set_attribute('SUBSTRING_PREF', 
  5                        'SUBSTRING_INDEX','TRUE');
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> 
SQL> create index search_idx on T(col1)
  2  indextype is ctxsys.context parameters ('wordlist SUBSTRING_PREF MEMORY 50M');

Index created.

SQL> 
SQL> INSERT INTO t VALUES ('36IPZD7507761NJ', '3984399384');

1 row created.

SQL> INSERT INTO t VALUES ('36IPZD7507762NJ', '3984399384');

1 row created.

SQL> INSERT INTO t VALUES ('37IPZD7507761NJ', '3984388384');

1 row created.

SQL> INSERT INTO t VALUES ('36IDDZ7507768NJ', '3984397384');

1 row created.

SQL> commit;

Commit complete.

SQL> 
SQL> select * from t where contains( col1,'%750776%')>0;

no rows selected

SQL> drop index search_idx;

Index dropped.

SQL> 
SQL> create index search_idx on T(col1)
  2  indextype is ctxsys.context parameters ('wordlist SUBSTRING_PREF MEMORY 50M');

Index created.

SQL> 
SQL> select * from t where contains( col1,'%750776%')>0;

COL1                      COL2
------------------------- ---------------
36IDDZ7507768NJ           3984397384
37IPZD7507761NJ           3984388384
36IPZD7507762NJ           3984399384
36IPZD7507761NJ           3984399384

SQL> 

 


Followup   December 13, 2005 - 5pm Central time zone:

you did not sync the index after inserting the rows. 

4 stars sync index   December 13, 2005 - 7pm Central time zone
Reviewer: A reader 
you have to sync the index everytime you insert?! 


Followup   December 13, 2005 - 7pm Central time zone:

only if you want to have the index synced every time (documented in the text guide)

I sync my index every 30 minutes personally. 

4 stars sync index   December 14, 2005 - 2am Central time zone
Reviewer: A reader 
Hi

What does sync operation do....? Does it acquire a table lock or affects application operation? I 
have been looking the docs but doesnt say what happens 


Followup   December 14, 2005 - 8am Central time zone:

it reads the table that tells text what rows have been modified/added/removed and updates their 
contents in the index.

your application continues, the index is modified. 

5 stars What if You Can't Use Oracle Text?   December 14, 2005 - 4am Central time zone
Reviewer: John Gilmore from Southend on Sea, United Kingdom
These solutions all rely upon the use of Oracle Text.  Is there anything that can be done to 
improve the performance of matching strings like '%ABC' if you can only use ANSI SQL syntax? 


Followup   December 14, 2005 - 8am Central time zone:

if you can only use ansi sql syntax - the base syntax that is supported by more than one vendor, 
you have already doomed yourself to abysmal performance.

think of a world with no analytics, group by rollup, case statements, outer joins, and so on 
(because while they are all ansi, not everyone has them - you are basically stuck with something 
akin to SQL-89, maybe even as bold as SQL-89 with the declarative integrity stuff added in :(


if you were doing this "generically", I support you would have to store a shadow column whereby you 
stuffed the bytes in backwards upon insert so you can search for 

where backwards_string like 'CBA%'

or come up with some other equally "creative" technique. 

4 stars sync index   December 14, 2005 - 9am Central time zone
Reviewer: A reader 
Hi

If sync index needs to do a full table scan I guess for a latge table (more than 20 million) it 
wouldnt be a good idea to use context indexes?

Also, what happen if I make 1000 inserts, instance failure? Will sync able to sync the index? 


Followup   December 14, 2005 - 10am Central time zone:

but it won't need to do a full tablescan.

tell you what  - set up your example, run the sync with sql trace on and you can *see* what it 
does.


If you committed the 1,000 inserts, the inserts will be there and can be sync'ed the next time a 
sync is done (this is one of the primary features of a database.... that once you commit, it is 
there) 

4 stars WILD CHARD OPERATOR +   October 26, 2006 - 1am Central time zone
Reviewer: BARANITHARAN V from BANGALORE, INDIA.
WANT TE EXPLANATION ABOUT THE + OPERATOR WITH EXAMPLES. 


Followup   October 26, 2006 - 11am Central time zone:

I WANT TO LET YOU KNOW ABOUT CAPSLOCK - it is a key typically on the left hand side of your 
keyboard, it would appear that you hit it - hit it again so it doesn't appear that you are shouting 
at us.

so, where do you see "+" being a "wildcard" character?

%, sure, +? 

5 stars   May 14, 2007 - 6pm Central time zone
Reviewer: A reader 
This is what Oracle doc says about substring indexing. Can you tell us more about the bolded parts? Thank you.

BEGIN
   ctx_ddl.create_preference('SUBSTRING_PREF', 
                             'BASIC_WORDLIST');
   ctx_ddl.set_attribute('SUBSTRING_PREF', 
                         'SUBSTRING_INDEX','TRUE');
END;
/


Substring indexing has the following impact on indexing and disk resources:

- Index creation and DML processing is up to 4 times slower

- The size of the substring index created is approximately the size of the $X index on the word table.

- Index creation with substring_index enabled requires more rollback segments during index flushes than with substring index off. Oracle recommends that you do either of the following when creating a substring index:

* make available double the usual rollback or

* decrease the index memory to reduce the size of the index flushes to disk

Followup   May 14, 2007 - 9pm Central time zone:

look above at the example, you'll see how it stores the data and why this index option takes necessarily "more space" than normal.
5 stars indexing the table for a search   July 23, 2007 - 4pm Central time zone
Reviewer: saradha from NJ, USA
Tom,

Can you please give some advice on indexing a table
where any column or combination of columns from a table
will be used in the WHERE clause.

Basically, the search will be done from any one 
or multiple columns of the table.

In such cases, How do we index the table ?

The table:
 ID           NOT NULL NUMBER(12)
 NAME         NOT NULL VARCHAR2(80)
 DESCRIPTION  NOT NULL VARCHAR2(4000)
 FILE_NAME    NOT NULL VARCHAR2(256)
 FILE_SIZE    NOT NULL NUMBER(6)
 MIME_TYPE    NOT NULL VARCHAR2(80)
 CREATED_BY   NOT NULL NUMBER(12)
 CREATED      NOT NULL DATE
 FILE_PATH    VARCHAR2(1024)
 CATEGORY     VARCHAR2(80) 
 AUTHOR       VARCHAR2(512)

For example, the end user might search by AUTHOR
or by NAME or by CREATED and CREATED_BY.

thanks in advance.
saradha
 


Followup   July 24, 2007 - 9am Central time zone:

http://download.oracle.com/docs/cd/B19306_01/text.102/b14217/ind.htm#sthref101


probably a ctxcat index.

or, if the table is read only, read mostly (static reference information or loaded on a schedule), single column bitmap indexes on each column you anticipate using in the where clause.
4 stars Troubleshooting search_schema (10gR2)   February 28, 2008 - 10pm Central time zone
Reviewer: Stephan Uzzell from Columbia, MD USA
Hi Tom,

I found your search_schema routine a while ago and admired it. Now that I actually had an opportunity to use it, I'm running into an issue: on some tables in my schema, it is failing out
BEGIN search_schema(4773.12,'G'); END;

*
ERROR at line 1:
ORA-00939: too many arguments for function
ORA-06512: at "OPERA40303LATEST.SEARCH_SCHEMA", line 25
ORA-06512: at line 1


I've played with the code a little bit. I removed some datatypes I wasn't interested in; I added output for the tables that didn't contain the value (so I could see where it was failing); I added a second parameter to filter the table name so I wasn't looking at anything else.

Line 25 of my code is:
open l_cursor for 'select ' || l_cols || 'else null end cname from ' ||


I know what tables are having the issue (and I don't really think they contain what I need anyway), but I'd like to know why - and how to figure out why.

Any advice?

Thanks,
Stephan

Followup   March 1, 2008 - 10am Central time zone:

how many columns does this table have - likely hitting the limit in CASE.

You could work around that by breaking the case into "more than one"


ops$tkyte%ORA10GR2> declare
  2      l_stmt long := 'create table t ( c1 varchar2(20)';
  3  begin
  4      for i in 2 .. 126
  5      loop
  6          l_stmt := l_stmt || ', c' || i || ' varchar2(20)';
  7      end loop;
  8      l_stmt := l_stmt || ' )';
  9      execute immediate l_stmt;
 10  end;
 11  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec search_schema( 'foobar' )

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table t purge;

Table dropped.

ops$tkyte%ORA10GR2> declare
  2      l_stmt long := 'create table t ( c1 varchar2(20)';
  3  begin
  4      for i in 2 .. 127
  5      loop
  6          l_stmt := l_stmt || ', c' || i || ' varchar2(20)';
  7      end loop;
  8      l_stmt := l_stmt || ' )';
  9      execute immediate l_stmt;
 10  end;
 11  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec search_schema( 'foobar' )
BEGIN search_schema( 'foobar' ); END;

*
ERROR at line 1:
ORA-00939: too many arguments for function
ORA-06512: at "OPS$TKYTE.SEARCH_SCHEMA", line 26
ORA-06512: at line 1


ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace procedure search_schema( p_string in varchar2 )
  2  authid current_user
  3  as
  4      l_cols long;
  5      l_where long;
  6      l_cursor sys_refcursor;
  7      l_cname  varchar2(4000);
  8  begin
  9      dbms_application_info.set_client_info( '%' || p_string || '%' );
 10      for x in ( select table_name from user_tables )
 11      loop
 12          l_cols  := '''hit = ''';
 13          l_where := ' where ( 1=0 ';
 14          for y in ( select '"' || column_name || '" ' column_name
 15                       from user_tab_columns
 16                      where table_name = upper(x.table_name)
 17                         and (data_type in ( 'CHAR', 'DATE', 'FLOAT', 'NCHAR', 'NUMBER', 
'NVARCHAR2', 'VARCHAR2' )
 18                             or data_type like 'INTERVAL%' or data_type like 'TIMESTAMP%' )
 19                                           order by column_id
 20                  )
 21          loop
 22              l_cols := l_cols || ' || case when ' || y.column_name ||
 23                       ' like sys_context(''userenv'',''client_info'') then ' ||
 24                       ' ''' || y.column_name || ''' end';
 25              l_where := l_where || ' or ' || y.column_name ||
 26                       ' like sys_context(''userenv'',''client_info'') ';
 27          end loop;
 28          open l_cursor for 'select ' || l_cols || ' cname from ' ||
 29                             x.table_name || l_where || ') and rownum=1';
 30          fetch l_cursor into l_cname;
 31          if ( l_cursor%found )
 32          then
 33              dbms_output.put_line( x.table_name || ' ' || l_cname );
 34          end if;
 35          close l_cursor;
 36      end loop;
 37  end;
 38  /

Procedure created.

ops$tkyte%ORA10GR2> drop table t purge;

Table dropped.

ops$tkyte%ORA10GR2> declare
  2      l_stmt long := 'create table t ( c1 varchar2(20)';
  3  begin
  4      for i in 2 .. 200
  5      loop
  6          l_stmt := l_stmt || ', c' || i || ' varchar2(20)';
  7      end loop;
  8      l_stmt := l_stmt || ' )';
  9      execute immediate l_stmt;
 10  end;
 11  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> insert into t (c1,c100,c200) values ( 1,1,1);

1 row created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec search_schema( '1' )
T hit = "C1" "C100" "C200"

PL/SQL procedure successfully completed.


4 stars Yes, but...   March 1, 2008 - 7pm Central time zone
Reviewer: Stephan Uzzell from Columbia, MD USA
You were exactly right, Tom - it was the number of columns. Would you believe that table has 447 columns?

So we now get a slightly different error:
BEGIN search_schema('4773.12','G'); END;

*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at "OPERA40303LATEST.SEARCH_SCHEMA", line 21
ORA-06512: at line 1


The full case statement is too big to fit in l_cols. I tried making l_cols a clob to see if I could fit more that way, but the it wouldn't even compile:
Errors for PROCEDURE SEARCH_SCHEMA:
LINE/COL ERROR
-------- -----------------------------------------------------------------
27/9     PL/SQL: Statement ignored
27/27    PLS-00382: expression is of wrong type


Which is the open cursor line...

Whattaya think? Fixable?

Thanks as always,
Stephan

Followup   March 1, 2008 - 10pm Central time zone:

choices are:

a) convert it to use dbms_sql, you can put the query into an array, the array can be of enormous size.

b) change the code to make it fit into 32k (eg: don't list the column that caused the problem, just do the search, that'll likely fit in 32k)

c) get to 11g where native dynamic sql takes lobs
5 stars Text Index   September 28, 2009 - 11am Central time zone
Reviewer: Sameer from Pune, India
Dear Sir,

I am a great fan of your site and your 2 books, expert one on one and effective oracle by design. 
Each time I stumble upon an issue in Oracle I make it a point to check your site. I don't even have 
to do googling for oracle these days. Your site is an excellent source for anything and everything 
under the sun for Oracle. 

Thanks to your SQL tuning and tkprof discussion threads, I was able to improve the perf of our 
system from over a minute to under a second, without even posting the issue here. I do have an 
issue in Text indexing now, but this discussion has given me some good leads to experiment. 

Thanks Again.

- Sameer


3 stars how to handle conditional input params   October 28, 2009 - 1pm Central time zone
Reviewer: JGP from MD USA
Tom, The sql below runs fast in 10gR2 using domain index, but when I uncomment the two commented lines, it reverts back to full table scan and takes longer. Any insight you can provide?

select 
pharmacy_name
FROM pharmacy_test
WHERE 1=1
and
(
(:v_search_pharmacy_name is not null and 
contains (pharmacy_name, '%' || :v_search_pharmacy_name || '%') > 0
)
-- or
-- :v_search_pharmacy_name is null )
;
Here are the two explain plans (first w/o null check, second with null check):
*SELECT STATEMENT ALL_ROWSCost: 11,811 Bytes: 142,940 Cardinality: 4,084 *
* 1 TABLE ACCESS FULL TABLE REPORTUSER.PHARMACY_TEST Cost: 11,811 Bytes: 142,940 Cardinality: 
4,084* 


*SELECT STATEMENT ALL_ROWSCost: 14 Bytes: 1,435 Cardinality: 41 *
* 3 FILTER *
* 2 TABLE ACCESS BY INDEX ROWID TABLE REPORTUSER.PHARMACY_TEST Cost: 14 Bytes: 1,435 Cardinality: 
41 *
* 1 DOMAIN INDEX INDEX (DOMAIN) REPORTUSER.PHARMACY_TEST_IX1 Cost: 4*


Followup   October 28, 2009 - 4pm Central time zone:

well, it needs two plans here - one for when :bind is not null (use contains) and one when :bind is null (full scan)






ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2> drop table t;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t ( x int, y varchar2(2000) );
ops$tkyte%ORA10GR2> create index t_idx on t(y) indextype is ctxsys.context;
ops$tkyte%ORA10GR2> insert into t values ( 'hello world' );
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'T', numrows => 100000 );
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> variable y varchar2(2000)
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @trace
ops$tkyte%ORA10GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA10GR2> select * from t where :y is not null and contains(y,'%'||:y||'%') > 0
  2  union all
  3  select * from t where :y is null;

no rows selected

ops$tkyte%ORA10GR2> exec :y := 'ello'

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select * from t where :y is not null and contains(y,'%'||:y||'%') > 0
  2  union all
  3  select * from t where :y is null;

no rows selected


Note in the following (tkprof run with aggregate=no) the first query hits only the full scan tree 
and the second hits only the index access - the row source operation shows that.

select * from t where :y is not null and contains(y,'%'||:y||'%') > 0
union all
select * from t where :y is null

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          3          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          3          0           0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 563

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UNION-ALL  (cr=3 pr=0 pw=0 time=154 us)
      0   FILTER  (cr=0 pr=0 pw=0 time=8 us)
      0    TABLE ACCESS BY INDEX ROWID T (cr=0 pr=0 pw=0 time=0 us)
      0     DOMAIN INDEX  T_IDX (cr=0 pr=0 pw=0 time=0 us)
      0   FILTER  (cr=3 pr=0 pw=0 time=119 us)
      0    TABLE ACCESS FULL T (cr=3 pr=0 pw=0 time=112 us)
********************************************************************************
select * from t where :y is not null and contains(y,'%'||:y||'%') > 0
union all
select * from t where :y is null

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.01       0.01          0         15          0           0
Fetch        1      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.01       0.01          0         15          0           0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 563

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UNION-ALL  (cr=23 pr=0 pw=0 time=41548 us)
      0   FILTER  (cr=23 pr=0 pw=0 time=41511 us)
      0    TABLE ACCESS BY INDEX ROWID T (cr=23 pr=0 pw=0 time=41505 us)
      0     DOMAIN INDEX  T_IDX (cr=23 pr=0 pw=0 time=41482 us)
      0   FILTER  (cr=0 pr=0 pw=0 time=8 us)
      0    TABLE ACCESS FULL T (cr=0 pr=0 pw=0 time=0 us)

3 stars how to handle conditional input params   October 29, 2009 - 7am Central time zone
Reviewer: JGP from MD USA
So essentially what you're saying is split these into two separate queries. Problem is I have 6 optional variables that can be null. That would be a lot of nested if-else statements. Is there a better way than that?

Followup   October 29, 2009 - 10am Central time zone:

dynamic sql probably.

http://www.oracle.com/technology/oramag/oracle/09-jul/o49asktom.html


that is how I handle my advanced search on this site.
3 stars how to handle conditional input params   October 29, 2009 - 1pm Central time zone
Reviewer: JGP from MD USA
Thanks for the link.. very useful.Do you use Oracle Text and substring indexes for your Advanced Search as well?

Thanks,
JGP



Followup   October 29, 2009 - 2pm Central time zone:

not substring, I do keyword searching only...

but yes, I use text..
3 stars OK   November 3, 2009 - 4am Central time zone
Reviewer: Siva 
Hi Tom,
I need a query which serves the following purpose

" I want the table names and column names where columns have a value of "Insurance" "

Could you please help?


Followup   November 9, 2009 - 2pm Central time zone:

that is very vague.


you really want to search ALL TABLES, ALL COLUMNS in the entire database for a table/column name that has the value 'insurance' in it?

Really?

ctl-f for

database grep

on this page.
5 stars Already answered   November 7, 2009 - 5pm Central time zone
Reviewer: JerrQ from Killaloe
If you look back up the screen, you'll see Tom already answered your question.

Followup   June 10, 2005 - 8am Central time zone:

there is no "database grep" like that -- it is highly unusual to ask to search "every table and 
every column looking for a string" in structured information.

but you can always automate things:

create or replace procedure search_schema( p_string in varchar2 )
authid current_user
as
    l_cols long;
    l_where long;
    l_cursor sys_refcursor;
    l_cname  varchar2(4000);
begin
    dbms_application_info.set_client_info( '%' || p_string || '%' );
    for x in ( select table_name from user_tables )
    loop
        l_cols := 'case when 1=0 then ''x'' ';
        l_where := ' where ( 1=0 ';
        for y in ( select '"' || column_name || '"' column_name
                     from user_tab_columns
                    where table_name = upper(x.table_name)
                       and (data_type in ( 'CHAR', 'DATE', 'FLOAT', 'NCHAR', 'NUMBER', 'NVARCHAR2', 

'VARCHAR2' )
                           or data_type like 'INTERVAL%' or data_type like 'TIMESTAMP%' )
                )
        loop
            l_cols := l_cols || ' when ' || y.column_name ||
                      ' like sys_context(''userenv'',''client_info'') then ' ||
                      ' ''' || y.column_name || '''';
            l_where := l_where || ' or ' || y.column_name || ' like 
sys_context(''userenv'',''client_info'') ';
        end loop;
        open l_cursor for 'select ' || l_cols || 'else null end cname from ' ||
                           x.table_name || l_where || ') and rownum=1';
        fetch l_cursor into l_cname;
        if ( l_cursor%found )
        then
            dbms_output.put_line( x.table_name || ' ' || l_cname );
        end if;
        close l_cursor;
    end loop;
end;
/

(might not be perfect, hacked together from other bits of code) 



Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement