

Question and Answer

Tom Kyte

Thanks for the question, Elie.

Asked: March 05, 2001 - 3:30 pm UTC

Last updated: October 25, 2011 - 1:12 pm UTC

Version: 8.1.6.0.0


You Asked

Is there a way to SELECT from a table in a case insensitive manner? For example, I want to execute:

SELECT * FROM SCOTT.EMP WHERE ENAME = 'ToM kYTe';

The only way I know to do this is via:

SELECT * FROM SCOTT.EMP WHERE UPPER(ENAME) = 'TOM KYTE';

This is fine if the table has a comparatively small number of records. But when the table is large, applying the UPPER function is costly.

I thought of using function based indexes for this, but this only works in Enterprise, not in Standard.

And perhaps the Oracle interMedia Cartridge would help, but this seems to me like "overkill" for something most other relational databases do "out of the box", so to speak.

In Microsoft's SQL Server, for example, one need only set a certain character set and one is then able to do "mixed case" queries without any more fuss and without setting up any special indexes.

Is Oracle capable of doing such "mixed case" queries and, if so, how?

Thank you in advance, Tom, for any help.

Elie Grunhaus
Concentrix Corporation
egrunhaus@concentrix.com

and Tom said...

Well, I don't know if I would call SQLserver "most every other database", but.... There are dozens of things we do out of the box that they do not, as well -- I guess it just depends on what features you've grown accustomed to.

With that option in SQLserver it is all or nothing. Either every search is case insensitive or not. That is not necessarily a good thing either.

The ways to do the case insensitive search in Oracle would be:

o function based indexes as you are aware.
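eg, a minimal sketch of that route (the index name is mine; on 8i the query_rewrite settings shown further down this thread also need to be enabled):

create index emp_upper_ename_idx on scott.emp( upper(ename) );

-- the predicate must use the very same expression the index was built on
select * from scott.emp where upper(ename) = 'TOM KYTE';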

o a shadow column that is indexed and maintained by a trigger. eg:

alter table emp add upper_ename varchar2(30);

create trigger emp_trigger
before insert or update on emp for each row
begin
    :new.upper_ename := upper(:new.ename);
end;
/
create index emp_upper_ename on emp(upper_ename);


select * from emp where upper_ename = 'KING';

o intermedia as you are aware.
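eg, a minimal sketch of that route (assumes the ctxsys / interMedia Text schema is installed; the index name is mine):

create index emp_ename_ctx on scott.emp( ename ) indextype is ctxsys.context;

-- contains() searches case insensitively by default
select * from scott.emp where contains( ename, 'king' ) > 0;

-- note: a context index is not maintained synchronously with DML;
-- it has to be synchronized periodically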

o using an index organized table you maintain.

create table upper_ename
( ename varchar2(30),
  rid   rowid,
  primary key (ename, rid) ) organization index;

create trigger emp_trigger
after insert or update on emp for each row
begin
    if (updating and nvl(:old.ename,'x') <> nvl(:new.ename,'x'))
    then
        delete from upper_ename
         where ename = upper(:old.ename) and rid = :old.rowid;
        insert into upper_ename values
            ( upper(:new.ename), :new.rowid );
    elsif ( inserting )
    then
        insert into upper_ename values
            ( upper(:new.ename), :new.rowid );
    end if;
end;
/

and then query:

select * from
emp where rowid in ( select rid
from upper_ename
where ename = 'KING' );


or
select emp.*
from emp, upper_ename
where emp.rowid = upper_ename.rid
and upper_ename.ename = 'KING';

(that's much like a function based index, but your application must be aware of it).


Comments

Goodness

Andy, June 07, 2001 - 1:31 am UTC

What would I do without this ability to search through the archives!

does it really work ?

Reader, December 17, 2001 - 3:15 pm UTC

Does the CASE statement work in Forms and Reports 6.x and above?

or are there any settings required for it ?


Tom Kyte
December 17, 2001 - 8:03 pm UTC

They may (like PLSQL in the database in 8i) not recognize it. The workaround until they do is to

o hide the construct with dynamic sql
o use a view.
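For instance, a minimal sketch of the view approach (table, view and column names are just illustrative) -- the CASE expression lives only in the view text, which is parsed by the server, so the client tool never sees it:

create or replace view emp_sal_band_v
as
select empno, ename,
       case when sal >= 3000 then 'HIGH' else 'LOW' end sal_band
from emp;

-- Forms/Reports then simply queries the view:
select empno, ename, sal_band from emp_sal_band_v;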

From your book

Gururaj Kulkarni, December 18, 2001 - 3:27 pm UTC

Tom, 

I did the following steps to create and make use of a
function based index, following the approach you describe
in your book. I have the spooled file from
my SQL*Plus session for your reference.

And I would like to know:
1) Why is my query plan showing a full table access?
2) How will these parameters help?
   i)  QUERY_REWRITE_ENABLED = TRUE
   ii) QUERY_REWRITE_INTEGRITY = TRUSTED

SQL> create index emp_upper_idx on emp(upper(ename));

Index created.

SQL> analyze table emp compute statistics
  2  for table
  3  for all indexed columns
  4  for all indexes;

Table analyzed.

SQL> alter session set query_rewrite_enabled=true;

Session altered.

SQL> alter session set query_rewrite_integrity=trusted;

Session altered.

SQL> set autotrace on explain;
SQL> select ename,empno,sal from emp where upper(ename)='KING';

ENAME          EMPNO       SAL                                                                      
---------- --------- ---------                                                                      
KING            7839      5000                                                                      


Execution Plan
----------------------------------------------------------                                          
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=40)                                
   1    0   TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=1 Bytes=40)                                   



SQL> update emp set ename=initcap(ename);

14 rows updated.


Execution Plan
----------------------------------------------------------                                          
   0      UPDATE STATEMENT Optimizer=CHOOSE (Cost=1 Card=14 Bytes=196)                              
   1    0   UPDATE OF 'EMP'                                                                         
   2    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=196)                               



SQL> select ename,empno,sal from emp where upper(ename)='KING';

ENAME          EMPNO       SAL                                                                      
---------- --------- ---------                                                                      
King            7839      5000                                                                      


Execution Plan
----------------------------------------------------------                                          
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=40)                                
   1    0   TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=1 Bytes=40)                                   



SQL> spool off;

Thanks in advance   

Tom Kyte
December 18, 2001 - 4:22 pm UTC

There are only 14 rows in EMP. That is not enough to make the CBO use an index.

If we TRICK it into thinking there are 100,000 rows -- it'll use the index:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table emp as select * from scott.emp;

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> create index emp_upper_idx on emp(upper(ename));

Index created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table emp compute statistics
  2  for table
  3  for all indexed columns
  4  for all indexes;

Table analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter session set query_rewrite_enabled=true;

Session altered.

ops$tkyte@ORA817DEV.US.ORACLE.COM> alter session set query_rewrite_integrity=trusted;

Session altered.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace on explain;
ops$tkyte@ORA817DEV.US.ORACLE.COM> select ename,empno,sal from emp where upper(ename)='KING';

ENAME           EMPNO        SAL
---------- ---------- ----------
KING             7839       5000


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=33)
   1    0   TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=1 Bytes=33)



ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> exec dbms_stats.set_table_stats( user, 'EMP', numrows => 100000 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> select ename,empno,sal from emp where upper(ename)='KING';

ENAME           EMPNO        SAL
---------- ---------- ----------
KING             7839       5000


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=7143 Bytes=235719)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (Cost=2 Card=7143 Bytes=235719)
   2    1     INDEX (RANGE SCAN) OF 'EMP_UPPER_IDX' (NON-UNIQUE) (Cost=1 Card=7143)



ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace off


 

Continuing my previous feedback

Gururaj Kulkarni, December 19, 2001 - 12:09 pm UTC

Tom,

Thanks for the reply. I tried it and it works.
I tried another thing: I set numrows back to 14 and tried
executing the same query. This time, surprisingly, my explain plan shows it is using my function-based index.

1) Could you tell me why?

2) I wanted to know why you used the parameters
QUERY_REWRITE_ENABLED and QUERY_REWRITE_INTEGRITY before using the query in your book?


SQL> exec dbms_stats.set_table_stats(user,'EMP',numrows=>14);

PL/SQL procedure successfully completed.

SQL> set autotrace on explain
SQL> select ename,sal,empno from emp
  2  where upper(ename)='KING';

ENAME            SAL     EMPNO
---------- --------- ---------
King            5000      7839


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=40)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (Cost=2 Card=1 Byte
          s=40)

   2    1     INDEX (RANGE SCAN) OF 'EMP_UPPER_IDX' (NON-UNIQUE) (Cost
          =1 Card=1)
 

Tom Kyte
December 19, 2001 - 4:12 pm UTC

1) it is because the first call to dbms_stats, seeing that the number of blocks wasn't set in user_tables, set it to a high number. The second call to dbms_stats -- seeing that blocks wasn't set -- did not reset it (it is like you deleted the rows -- the number of BLOCKS won't go down).

So the cost of the FULL SCAN was high because the estimated number of blocks was high (even though there were just 14 rows -- a full scan reads to the high water mark which we set artificially high).

Use this:
exec dbms_stats.set_table_stats( user, 'EMP', numrows => 14, numblks => 1 );

and it'll go back to the full scan.


2) I used them because they are a prerequisite for having an FBI get used, and by default they are not enabled -- otherwise I would have had everyone scratching their heads wondering why. It was easier to demonstrate with them on -- right there, explicitly -- than to say "add this to your init.ora, bounce the instance, etc."...




Thank you

Gururaj Kulkarni, December 20, 2001 - 10:34 am UTC

Tom,

Thanks for the explanation. It works.


-Gururaj

Ron Chennells, January 24, 2002 - 5:53 am UTC

I'm working on an application that uses the shadow column
approach. I'd like to change it to a function based index
method. As an extra complication, the application currently
generates the queries dynamically using string concatenation
rather than bind variables. I'm considering using the
cursor_sharing=force parameter as a workaround.
My question: are there any interactions between the function
based index and the cursor_sharing parameter that I need to consider? Does the addition of the wildcard character as shown have any effect?

Typical case insensitive query.

WHERE cemp_surname_uc_c LIKE 'CH%'


Tom Kyte
January 25, 2002 - 8:09 am UTC

Yes, there can be -- not with FBI's in general, just with the 100% use of bind variables everywhere.

For example, where x like 'CH%' is a good candidate for an index typically. "where x like :x" might NOT be according to the optimizer. That query may need a hint in order to use the index (might not, depends).

There can be other more insidious side effects with cursor sharing and FBI's however. Suppose you create:

create index fbi on t( substr( x, 1, 5 ) );

Now, a query like:

select * from t where substr( x, 1, 5 ) = '12345'

should use an index right -- well, cursor sharing = force goes all out and turns that into:

where substr( x, :b1, :b2 ) = :b3

whoops -- we don't know that it is 1 and 5 in the substr anymore, so we cannot use the FBI.

If you have my book, I go into this in some detail in chapter 10 -- the downsides to cursor sharing = force; it can have other side effects. I consider it a band-aid, a temporary crutch that can help you until you actually fix the underlying problem -- not a solution.

Intermedia and Case insensitive

A Reader, November 13, 2002 - 2:16 pm UTC

Hi Tom,
At the beginning of this thread you mentioned that interMedia can be used to resolve the case-sensitivity issue when Oracle Standard Edition is used. Can you please give a little more detail about the steps, or point to a URL other than the Oracle documentation, where I can read and learn more? Our developers don't like the shadow-column/trigger solution.
Thank you very much for your help.



Tom Kyte
November 13, 2002 - 2:55 pm UTC

why other than the documentation? Is there an issue with it? (it does sort of cover the topic)

You can look on otn.oracle.com

you can search for ctxsys on this site.

SQL Server Character Sets

Andy Gilfrin, November 14, 2002 - 4:24 am UTC

My boss spent 3 weeks developing a procedure in our HR application -- I can't remember exactly what it did, but it was for a massive deal we were trying to get.

We had to prove we could do something or other to get the deal, so I turned up on site with a CD containing this procedure, all ready to impress, but it turned out they had a different case-sensitive sort order (what was originally described here as a character set), so this procedure wouldn't run because my boss had decided to go against our standards and use uppercase table names in his select/update statements.

I spent an hour trying to track down all of the uppercase characters and replace them, but it was an impossible task considering we had said we could demonstrate this straight out of the box.

In SQL Server 6.5 and 7 the sort order was set at the server level; in 2000 this changed to the database level, so different databases on the same server could use a mixture of sort orders.

The thing to remember is this: if you're trying to find 'Jones' in Oracle, you use 'Jones' in the where clause and you get back 'Jones'.

In SQL Server with a case-insensitive sort order (which is the default, as I remember) you get 'Jones', 'JONES', 'JoNes', etc.

So you now have the problem in SQL Server that you have to use a function to get an exact match -- i.e. a benefit in one way, restrictive in many others.






Why UPPER(col) and LOWER(col) return different values?

A Reader, February 19, 2003 - 12:36 pm UTC

create table t (c1 number,c2 varchar2(20));

insert into t values(1,'Ø 2 cm');
insert into t values(2,'Ø 2.5 mm');
insert into t values(3,'Ø 1,2 x 15,6 cm');
insert into t values(4,'Ø 10 mm');


select * from t where upper( c2 ) >= 'Ø 2';
C1 C2
---------- --------------------
1 Ø 2 cm
2 Ø 2.5 mm

select * from t where lower( c2 ) >= 'Ø 2';

C1 C2
---------- --------------------
1 Ø 2 cm
2 Ø 2.5 mm
3 Ø 1,2 x 15,6 cm
4 Ø 10 mm


Tom Kyte
February 19, 2003 - 3:23 pm UTC

what is "theta" -- I see a funky character there.

999 times out of 1000 this is due to a character set issue. What is your CLIENT character set (nls_lang) and what is your DATABASE character set? Betcha they are different -- to the point that whatever "theta" is, it is not a valid character in one or the other character set, and in the conversion all bets are off. The upper case "theta" may be very different from the lower case "theta".

Upper and Lower return different results

A Reader, February 19, 2003 - 4:28 pm UTC

I tried different languages and the results are similar -- the upper and lower functions return different results. In the real case, the client side is GERMAN and the server side is AMERICAN. LOWER seems to treat the "2" after the "theta" as a number and consider all the numerical digits in the string. UPPER seems to consider only the first numerical digit.
I tried the following:
select ascii(lower('Ø')) from dual;
which returns 248, and
select ascii(upper('Ø')) from dual;
which returns 216.
I still don't understand why the results for UPPER and LOWER functions are different.
I thought as long as we can get result from ascii function, it should be a valid character.

Thank you.

Tom Kyte
February 19, 2003 - 4:33 pm UTC

It is not the "2" -- it is the theta --

in your characterset (not LANGUAGE, character set) lower theta is ascii 248, upper theta is 216 -- that is, upper theta <> lower theta -- hence the return values are NATURALLY different.

pretend theta is 'a'

where 'A' || something >= 'A' || something_else

vs

where 'a' || something >= 'A' || something_else


the second one is always true, because 'a' > 'A'. The first one is only SOMETIMES true, depending on the values of something and something_else, because 'A' = 'A', not >

UPPER and LOWER return same results now.

A Reader, February 19, 2003 - 5:13 pm UTC

Thank you very much for the explanation. Now I get it. Putting the functions on both sides:
select * from t where lower(c2)>=lower('Ø 2');
or
select * from t where upper(c2)>=upper('Ø 2');

will always get the correct results.

Thanks for your help.

Rule Based Optimizer and FBI's

Sachin, April 02, 2003 - 6:27 pm UTC

Database: 8.1.7.4.0 Enterprise Edition
I know that to use Function Based Indexes (FBI's)
you have to use the Cost Based Optimizer.
We have a lot of web based queries that UPPER all the columns in the where clause... and we have the Rule Based Optimizer currently (we plan to go to the CBO at some point, but not right now)...
In the meantime, here is my solution -- please comment:
Have statistics computed on some of the big tables (involved in slow queries)... and

option 1: set the optimizer to "choose" for the whole database

option 2: set the optimizer to choose right before returning the recordset to the web page for a session.

The table that is losing indexes because of UPPERs has about 300,000 rows in it... and the queries do UPPER on almost all the columns (web based search)...
So a shadow column approach will add a lot of data... and it would be a lot of shadows :)



Tom Kyte
April 02, 2003 - 9:18 pm UTC

or, a different approach (for now):

just use a hint in those queries. A hint will use the CBO, and a hint against unanalyzed tables that produces the plan you want is a whole lot like the RBO (predictable, always the same boring plan). Just promise to remove the hints when you move to the CBO.


if you compute stats on SOME tables and leave the optimizer at choose -- any query that touches them will use the CBO.
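For instance, a minimal sketch of that hinted approach (table, column and index names are hypothetical; the query_rewrite settings discussed earlier in this thread still apply to the FBI):

create index cust_upper_lname_idx on customers( upper(last_name) );

-- the hint makes the CBO kick in for this one statement even though the
-- instance default is RULE; the table does not need to be analyzed
select /*+ index( c cust_upper_lname_idx ) */ *
  from customers c
 where upper(last_name) = :search_name;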

Hints CBO

Sachin, April 03, 2003 - 3:34 am UTC

"Just Promise to remove hints...while using CBO"
I am analyzing the tables involved....and adding the hint to use the "upper" FBI...
Why do we have to remove hints while using CBO....

Tom Kyte
April 03, 2003 - 7:58 am UTC


no - don't analyze the tables, if you are in choose and you have SOME tables analyzed, but not all -- and you haven't tested your application thoroughly -- you'll be setting yourself up for the worst week of your life....


You remove the hints when using the CBO so the CBO can do its job.

hints CBO

sachin, April 03, 2003 - 8:35 am UTC

Thanks...

I wasn't going to set the optimizer to choose, since you showed the "HINT" path... but I was under the impression that
if the database's optimizer mode is rule
and we use an index hint, the database will take the cost-based approach / use the hinted index only if the table is analyzed... I guess that is not true...

Thanks again....



Tom Kyte
April 03, 2003 - 8:38 am UTC

I was just reading your other note:

..
option 1: set the optimizer to "choose" for the whole database

option 2: set the optimizer to choose right before returning the recordset to
the web page for a session.

....

option 0 (do nothing) leaves the optimizer mode at choose.
options 1 and 2 do the same..

sooo, naturally I assumed. If you are using hints, you need no stats. If you have stats on everything, you need no hints.


the moment you put a hint in there -- cbo kicks in and the FBI is available.

Still UPPER has a 'gotcha'..I guess

Wachaspati Pandey, May 02, 2003 - 9:56 am UTC

I have Oracle8i Enterprise Edition Release 8.1.7.4.0, with UTF8 as the database character set.

I connect to it on UNIX through SQL*Plus after verifying that echo $NLS_LANG = 'AMERICAN_AMERICA.UTF8'

Once I log on to UNIX, I try this and get this error:

SQL> select upper('ï') from dual;
ERROR:
ORA-01756: quoted string not properly terminated

Also, if I have a table with data like this :
SQL> select * from demo;

COL1
----------
Adela拉

and I try this..
SQL> select upper(col1) from demo;

UPPER(COL1)
--------------
ADELA拉

I do not get my letter ï in capitals; only 'dela' was converted to caps.

Note: I am using the shadow column approach, and if the user writes Adela拉 to search against, I fail because
the database UPPER has produced "ADELA拉", which is surely not
the same as "ADELAϤ¥" -- the caps form I produce (in my ASP page) before comparing with the shadow column to achieve a case insensitive search.

Any help ideas...why is UPPER not working ???

As a side note, if the trial on UNIX is done with the client NLS character set set to WE8ISO8859P1 (though the DB is UTF8), we get the following:

$ export NLS_LANG='.WE8ISO8859P1';
$ sqlplus apps/p2p32apps1@p2pdb32 

SQL*Plus: Release 8.0.6.0.0 - Production on Thu May 1 20:39:07 2003

(c) Copyright 1999 Oracle Corporation.  All rights reserved.


Connected to:
Oracle8i Enterprise Edition Release 8.1.7.4.0 - Production
With the Partitioning option
JServer Release 8.1.7.4.0 - Production

SQL> create table semo (col1 varchar2(234));

Table created.

SQL> insert into semo values ('Adela拉') ;

1 row created.

SQL> select upper(col1) from semo;

UPPER(COL1)
------------
ADELAϤ¥     <================SEEMS to WORK !!!!!!


Really *PUZZLED* 

Tom Kyte
May 02, 2003 - 10:15 am UTC

you lied to us. You said "i'm going to give you UTF8 data" but then you used a non-utf set of data.

Your client is NOT a utf8 client. It is us7ascii or maybe we8iso ....


upper is working -- you just need to set your NLS_LANG to something appropriate for your client.

A reader, July 09, 2003 - 11:50 am UTC


Case insensitive search with LIKE '%XXX%'

Michael, October 23, 2003 - 5:15 am UTC

Hello Tom,

This is my test case:

SQL> l

create table t
(a varchar2(30) not null,
 b char(2000) default 'DUMMY1',
 c char(2000) default 'DUMMY2')
SQL> /

I want to search in the column a. The other columns are only there to make the table "fat".

SQL> insert into t(a)
     select dbms_random.string('X', 30)
     from all_objects
     where rownum <= 20000;

SQL> commit;

This table has 20480 blocks (8k blocks) in 91 extents.
Then I create a function based index for the case insensitive search:

SQL> create index t_upper_a_idx on t(upper(a));

This index has 128 blocks in 16 extents.

SQL> analyze table t compute statistics
     for table for all indexed columns for all indexes;

OK, now the table and index are ready.
Here are some instance parameters:

SQL> show parameter optimizer

NAME                                 TYPE        VALUE
------------------------------------ ----------- -------
optimizer_dynamic_sampling           integer     1
optimizer_features_enable            string      9.2.0
optimizer_index_caching              integer     90
optimizer_index_cost_adj             integer     25
optimizer_max_permutations           integer     2000
optimizer_mode                       string      CHOOSE

SQL> show parameter query

NAME                                 TYPE        VALUE
------------------------------------ ----------- ---------
query_rewrite_enabled                string      TRUE
query_rewrite_integrity              string      enforced

SQL> set autotrace traceonly
SQL> set timing on
SQL> select a, b from t
  2  where upper(a) like upper('ef%3');

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=16 Bytes=32304)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=5 Card=16 Bytes=32304)
   2    1     INDEX (RANGE SCAN) OF 'T_UPPER_A_IDX' (NON-UNIQUE) (Cost=2 Card=16)

Statistics
----------------------------------------------------------
         47  recursive calls
          0  db block gets
         12  consistent gets
          6  physical reads
          0  redo size
       2466  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

This looks good and is fast.

But:

SQL> select a, b from t
     where upper(a) like upper('%ef%3');

11 rows selected.

Elapsed: 00:00:41.08

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1929 Card=1000 Bytes=2019000)
   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=1929 Card=1000 Bytes=2019000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      20053  consistent gets
      18791  physical reads
          0  redo size
       2853  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

This is very slow. Why does Oracle not fast full scan the index???
When I add an INDEX_FFS hint it uses the index, but it's even slower (it seems to go from the index to the table for *every* row):

SQL> select /*+ INDEX_FFS (T, T_UPPER_A_IDX) */ a, b
     from t
     where upper(a) like upper('%ef%3');

11 rows selected.

Elapsed: 00:05:45.07

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=252 Card=1000 Bytes=2019000)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=252 Card=1000 Bytes=2019000)
   2    1     INDEX (RANGE SCAN) OF 'T_UPPER_A_IDX' (NON-UNIQUE) (Cost=7 Card=1000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      20119  consistent gets
      19750  physical reads
          0  redo size
       2853  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

Then I tried the following query:

SQL> select a, b from t
     where rowid in (select /*+ INDEX_FFS (T, T_UPPER_A_INDX) */ rowid
                from t
                where upper(a) like upper('%ef%3'));

11 rows selected.

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1013 Card=1000 Bytes=2050000)
   1    0   NESTED LOOPS (Cost=1013 Card=1000 Bytes=2050000)
   2    1     INDEX (FAST FULL SCAN) OF 'T_UPPER_A_IDX' (NON-UNIQUE) (Cost=13 Card=1000 Bytes=24000)
   3    1     TABLE ACCESS (BY USER ROWID) OF 'T' (Cost=1 Card=1 Bytes=2026)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        136  consistent gets
          0  physical reads
          0  redo size
       2853  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         11  rows processed

This is very fast! (Without the INDEX_FFS hint it's slow!)

But I don't really know why this query is fast, and I try to avoid using hints in production code/queries.
What to do?

Regards, Michael. 

Tom Kyte
October 23, 2003 - 12:46 pm UTC

use a bind variable. period.

one plan will be generated. the index scan.
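In SQL*Plus terms, a minimal sketch of that advice applied to the query above:

variable pat varchar2(40)
exec :pat := upper('%ef%3')

select a, b from t where upper(a) like :pat;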

Why not store the data in uppercase?

Andy, May 03, 2004 - 7:42 pm UTC

Tom, please help, the philistines are attempting to enforce uppercase on all text in all our descriptor columns. They say that it will simplify code and allow for easier integration between databases (i.e. sql server & oracle).
How would you counter the argument to store all text values in uppercase?

Tom Kyte
May 03, 2004 - 8:03 pm UTC

what is a descriptor column?

do you search on it?
do you join on it?

Clarification of Descriptor columns

Andy, May 03, 2004 - 8:18 pm UTC

Descriptor columns are free form text eg first_name, company_name, item_description, address lines, etc.
They will be searched on and used in lookups from the front end. They are also used in customer reports ie Invoices, Sales summaries, etc.

Tom Kyte
May 04, 2004 - 7:07 am UTC

they all (so far) look like mixed-case things to me that would either

o use TEXT for indexing in a case insensitive fashion
o use function based indexes so upper(column) can use an index


It would be so mainframe-ish to upper case it all.

function based index

reader, June 01, 2004 - 2:12 am UTC

Tom,

Our application has a query like the one below:

select count(0) from testmaster where 1=1 and upper(customer_fname) like '%SA%';

When we use the FBI, it is not using the index because '%' is used at the leading edge of the predicate.

Is there a way to rewrite the query, or some other solution, towards using the index to make the query run faster?

Thanks.




Tom Kyte
June 01, 2004 - 8:13 am UTC

ops$tkyte@ORA9IR2> create table t
  2  as
  3  select * from all_objects;
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx on t(upper(object_name));
 
Index created.
 
ops$tkyte@ORA9IR2> analyze table t compute statistics;
 
Table analyzed.
 
ops$tkyte@ORA9IR2> set autotrace on
ops$tkyte@ORA9IR2> select count(*) from t where 1=1 and upper(object_name) like '%SA%';
 
  COUNT(*)
----------
       882
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=1 Bytes=23)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=17 Card=1604 Bytes=36892)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        163  consistent gets
          0  physical reads
          0  redo size
        380  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off


works for me, I'd need an example -- including versions and all, of course

 

Case In sensitivity of oracle

A reader, June 02, 2004 - 2:09 pm UTC

Oracle doesn't have any of these features. They are trying to mimic them, but I feel they would have to re-write the RDBMS itself.

Tom Kyte
June 02, 2004 - 2:41 pm UTC

huh? what do you mean?

first, "features" is plural. I believe we were discussing a single "feature" here.

second, "feel" is like "in my opinion" -- and when you look at something factually, opinions just *don't cut it*



if you mean case insensitive search you have:

a) function based indexes, create index t_idx on t(upper(whatever));

b) Oracle text (eg: search for SENSITIVITY or SeNsItIvItY on this site, doesn't matter, we search by default case insensitive)

so there you have it -- case insensitive searching for the masses. And in 10g, you can go further.
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:16370675423662 <code>


so I guess that rewrite of the rdbms can be postponed.

(but coming from the same person -- who by the way won't even leave a name -- that is upset that I don't like Windows, and that I didn't answer a question the way they wanted.... oh well)
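For reference, a minimal sketch of the 10g route mentioned above (session-level settings; assumes 10gR2, where NLS_COMP=LINGUISTIC is available):

alter session set nls_comp = linguistic;
alter session set nls_sort = binary_ci;

-- comparisons in this session are now case insensitive
select * from scott.emp where ename = 'ToM kYTe';

-- note: a plain index on ename will generally not be used for such a
-- comparison; an index on nlssort( ename, 'nls_sort=binary_ci' ) can help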


function based index

reader, June 03, 2004 - 2:50 am UTC

Tom,

The version is 9.2.0.1. Even after analyzing the table, the FBI is not used. The details are given below. Why is the index not used? How can I optimize the query?

SQL> select count(customer_fname) from testmaster;

COUNT(CUSTOMER_FNAME)
---------------------
               517640

SQL> create index perf4 on testmaster(upper(customer_fname)) parallel 4 nologging;

Index created.


SQL> analyze table testmaster compute statistics;

Table analyzed.

SQL> select count(0) from testmaster where 1=1 and upper(customer_fname) like '%SA%';


Elapsed: 00:00:02.81

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=185 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'TESTMASTER' (Cost=185 Card=51764
           Bytes=569404)

Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
       3012  consistent gets
          0  physical reads
          0  redo size
        381  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed


Regards


 

Tom Kyte
June 03, 2004 - 8:20 am UTC

sigh, no table? perhaps the table is smaller than the index.

(full examples work best -- all of the facts)


and you do understand that with the leading '%', the best you can hope for would be an INDEX FAST FULL SCAN -- eg: a full scan of the index, since every single entry must be inspected.

function based index

reader, June 03, 2004 - 11:47 pm UTC

Tom,

I tried your example on the all_objects and dba_objects tables. Both contain almost the same number of rows. I see an index scan in one case and a full scan in the other. Kindly explain this phenomenon. Details are given below:

============================================================

SQL> create table t as select * from all_objects;

Table created.

SQL> create index t_idx on t(upper(object_name));

Index created.

SQL> analyze table t compute statistics;

Table analyzed.

SQL> select count(*) from t;

  COUNT(*)                                                                      
----------                                                                      
      6193                                                                      

SQL> select count(*) from t where 1=1 and upper(object_name) like
  2  '%SA%';

  COUNT(*)                                                                      
----------                                                                      
        67                                                                      

SQL>  set autotrace traceonly
SQL> select count(*) from t where 1=1 and upper(object_name) like
  2  '%SA%';


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=16)            
   1    0   SORT (AGGREGATE)                                                    
   2    1     INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=3 C          
          ard=310 Bytes=4960)                                                   
                                                                                




Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
         27  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
        379  bytes sent via SQL*Net to client                                   
        503  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL>  set autotrace off
SQL> drop table t;

Table dropped.

SQL> create table t as select * from dba_objects;

Table created.

SQL> create index t_idx on t(upper(object_name));

Index created.

SQL> analyze table t compute statistics;

Table analyzed.

SQL> select count(*) from t;

  COUNT(*)                                                                      
----------                                                                      
      6324                                                                      

SQL> select count(*) from t where 1=1 and upper(object_name) like
  2  '%SA%';

  COUNT(*)                                                                      
----------                                                                      
        67                                                                      

SQL> set autotrace traceonly
SQL> select count(*) from t where 1=1 and upper(object_name) like
  2  '%SA%';


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=16)            
   1    0   SORT (AGGREGATE)                                                    
   2    1     TABLE ACCESS (FULL) OF 'T' (Cost=6 Card=316 Bytes=5056)           




Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
         80  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
        379  bytes sent via SQL*Net to client                                   
        503  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

======================================================= 

Tom Kyte
June 04, 2004 - 8:10 am UTC

your dba_objects fits in 80 blocks???

these are teeny tiny tables, get some real data in there.

function based index

reader, June 16, 2004 - 2:47 am UTC

Tom,

It is more than 80 blocks. Also, let me know when an index can be larger than its table, with an example.

Regards.

Tom Kyte
June 16, 2004 - 12:35 pm UTC

Ok, we are in "autotrace lies" mode.


do you have cursor sharing = force set (i'll bet you "yes" is the answer).

explain plan cannot/does not see "cursor sharing". explain plan shows you what would happen if you used those literals.

but, you are not using those literals. If you used SQL_TRACE, I'd bet the REAL plan in the trace shows the index (hence the 80 consistent gets and no db block gets -- with a true full scan we would expect some db block gets)

tkprof it and you'll see, you'll also see the literals replaced with bind variables.
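A minimal sketch of that check (the trace file lands in user_dump_dest; names depend on your setup):

alter session set sql_trace = true;

select count(*) from t where 1=1 and upper(object_name) like '%SA%';

alter session set sql_trace = false;

-- then run tkprof against the generated trace file to see the plan that
-- was really used (and the bind variables, if cursor sharing rewrote the SQL)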

function based index

reader, June 18, 2004 - 12:33 am UTC

Tom,

cursor_sharing is only "exact"
SQL> show parameter cursor_sh%

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cursor_sharing                       string      EXACT

The tkprof report is given below:

select count(*) from t where 1=1 and upper(object_name) like
  '%SA%'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.04       0.08          0         80          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.04       0.08          0         80          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 21  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE 
     67   TABLE ACCESS FULL T 


Also, I asked for an instance where an index is larger than its table. Kindly give an example.

Regards 

Tom Kyte
June 18, 2004 - 10:37 am UTC

then your copy of the table is 80 blocks and your example is "erroneous" somewhere.

if a full scan results in 80 gets like that, your table cannot be bigger than 80 blocks. You say it is. a mistake has been made in your testing somewhere.

Tom, Oracle Text is a separate tool.

A reader, June 18, 2004 - 1:28 pm UTC

That is why I said Oracle needs to re-write the RDBMS. A function based index is not a substitute for what Oracle Text can do.

Tom Kyte
June 18, 2004 - 1:38 pm UTC

you make no sense whatsoever.

Oracle text can do case insensitive searches.
function based indexes can do case insensitive searches.
Oracle10g can make case insensitive searches using fbi's "invisible".

We have 100% support for case insensitive searches.

So again, what is your point.


Oracle text is not a separate "tool"
Oracle text is a feature of SE, EE and PE.
Oracle text is no more separate than a b*tree index.

So again, what is your point.


I never said "fbi's are substitutes for oracle text". they are a tool, text is a tool, b*trees are tools, insert is a tool, update is a tool.

not really sure what your "point" is.

Thanks for the update

A reader, June 18, 2004 - 1:48 pm UTC

I like Oracle Text better than the function based index.
Can you share some ideas on Advanced Queues as well -- its benefits and applications?



Tom Kyte
June 18, 2004 - 1:55 pm UTC

AQ is somewhat off topic here in this thread. but if you plug

AQ

into my search field, you'll see lots of articles about it, and if you check out:

</code> http://docs.oracle.com/docs/cd/B10501_01/appdev.920/a96587/qintro.htm#65672 <code>

you'll find a nice "benefits and applications"

function based index

reader, June 21, 2004 - 11:50 pm UTC

Tom,

You are right... the number of blocks is 78, not more than 80. Sorry for the error. But the copy built from all_objects (76 blocks) uses the index, whereas a full scan is done when querying the copy built from dba_objects (78 blocks). Why? Please explain.

Also, please give me an instance of an index being larger than its table.

I give below the details....

===============================================

SQL> create index t_idx on t(upper(object_name));

Index created.

SQL> analyze table t compute statistics;

Table analyzed.

SQL> select blocks from user_tables where table_name='T';

    BLOCKS
----------
        76

SQL> select count(*) from t where 1=1 and upper(object_name) like
    2  '%SA%';


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=16)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=3 C
          ard=310 Bytes=4960)

SQL>  set autotrace off
SQL> drop table t;

Table dropped.

SQL> create table t as select * from dba_objects;

Table created.

SQL> create index t_idx on t(upper(object_name));

Index created.

SQL> analyze table t compute statistics;

Table analyzed.


SQL> select blocks from user_tables where table_name='T';

    BLOCKS
----------
        78

SQL> select count(*) from t where 1=1 and upper(object_name) like
    2  '%SA%';


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=16)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'T' (Cost=6 Card=316 Bytes=5056)

================================================

Regards
 

Tom Kyte
June 22, 2004 - 7:52 am UTC

the cost of full scanning a really really small table is more expensive than the cost of full scanning a moderately sized table.

for tables of this trivial size, the optimizer is doing the right thing.




an instance of an index being larger than the table? Easy -- single-column tables, for example: the index stores key + rowid. The rowid is just the row's address -- it takes no space in the table itself, but it is physically stored in the index.

or an index that is very sparse over time due to lots of updates. it can happen.

function based index

reader, June 21, 2004 - 11:54 pm UTC

Tom,

While copying, the following lines were missed from the example given above, for the creation of table "t" from all_objects.

=====================================

SQL> create table t as select * from all_objects;

Table created.

======================================================= 

index

reader, June 22, 2004 - 11:47 pm UTC

Tom,

Index being larger than table---you said....
"an index that is very sparse over time due to lots of updates. it can happen"

How do we detect the above? And what should we do in that case -- should we rebuild or recreate?

Regards.





Tom Kyte
June 23, 2004 - 8:38 am UTC

you would analyze the index and discover that most all of it is del_lf_rows. It would be extremely rare (so rare, I've only seen them in the lab), but I know they've been sighted out in the field as well.

a coalesce would probably be more than sufficient.

a rebuild would be the second choice.

I would not drop/create, as there is always the chance that something bad happens between the drop and the create and we end up with an index that's gone missing.
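A minimal sketch of that check and fix (the index name is illustrative):

analyze index t_idx validate structure;

-- index_stats holds the results for the index just analyzed
select lf_rows, del_lf_rows from index_stats;

-- if most of the leaf rows are deleted, a coalesce is usually enough
alter index t_idx coalesce;

-- second choice:
-- alter index t_idx rebuild;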

A reader, September 23, 2004 - 1:43 pm UTC

Tom,

I am a bit confused.

see below. I have a function based index on first_name

insert into emp(first_name) values ('TOM');


If I query select first_name from emp
where first_name = 'Tom'

should it return the row that has 'TOM' in first_name
?

It's not working.

Pls advise


Tom Kyte
September 24, 2004 - 9:22 am UTC

you applied no function? why would it?


if you have "create index i on t(upper(first_name))"

you would query

where upper(first_name) = 'TOM'


if you just query first_name = 'TOM' -- well, there are NO first names that are 'TOM' in that table, only 'Tom'

Checking for Case

Vinnie, September 13, 2005 - 4:09 pm UTC

Is there any way to check an attribute in a table to see if there are any lower case characters?

Tom Kyte
September 13, 2005 - 4:31 pm UTC

where upper(col) <> col ?

Case insensitivity of UTF-8

Markku Uttula, October 18, 2005 - 7:40 pm UTC

I'm sorry to bring up a thread that's over a year old, but...

In my PL/SQL stored procedure, I've been wondering for quite a while now why lower('Маркку') <> 'маркку' ? (but instead lower('Маркку') = 'ðœð°ñ€ðºðºñƒ' which (I think) doesn't make too much sense when it's supposed to still be UTF-8 and what it comes to doesn't look that way)

The characters are cyrillic (but I've understood that that shouldn't be a problem with UTF-8) and I sincerely hope they get through the right way :)

I believe this is just some sort of character set problem (or I haven't been using lower() the right way). I've so far failed to find any good sources for my problem online, and this was the closest I got. Hopefully some light will eventually be shed on my darkness.

In case you wish to know, NLS_CHARACTERSET=AL32UTF8, NLS_NCHAR_CHARACTERSET=UTF8, NLS_LANGUAGE=FINNISH, NLS_SORT=FINNISH, NLS_NCHAR_CONV_EXCP=FALSE

Tom Kyte
October 19, 2005 - 6:46 am UTC

and what does the CLIENT have as their characterset.

More on case insensitivity & UTF-8

Markku Uttula, October 18, 2005 - 7:43 pm UTC

Of course I managed to forget the most important part; the server is 9.2.0.1.0

Client character set

Markku Uttula, October 22, 2005 - 10:48 am UTC

That proved to be interesting. Mainly for the reason that, after a bit of poking around, the answer to "what does the CLIENT have as their characterset" appears to be "I don't have the faintest idea" :)

All NLS_LANG-properties in registry (yes, I'm using Windows, naughty me) were set to "NA"... However, if I changed them to "AL32UTF8" (I also tried "FINNISH_FINLAND.AL32UTF8" and a couple of other alternatives), nothing seems to have changed; the query still returns the wrong result (wrong in a sense that it ain't what I was expecting).

So... How can I confirm and modify the character set the client uses? I usually use Benthic Software's Golden and PLEdit, and sometimes SQL*Plus - in case this differs between clients.

Tom Kyte
October 22, 2005 - 11:01 am UTC

sorry - I don't think I'll be able to decipher this - because I cannot even really tell "what" I'm looking at (I just see squiggles on the screen above).

I'd probably start by trying to find out what character(s) are "different from what I was expecting" and trying to research that.

Character representation in UTF8, case insensitivity

Markku Uttula, October 22, 2005 - 1:32 pm UTC

'М' is UTF8 representation of Unicode character /u041C (Cyrillic Capital Letter Em). When I ask lower('М'), I'd expect to get 'м' which is Unicode character /u043C (Cyrillic Small Letter Em). Instead of the expected result Oracle gives me 'ðœ' which (I think) means nothing in Unicode but is correct in a sense that both of the bytes making up this character have been correctly lowered.

I was wondering whether I really shouldn't be using lower here[*], or whether there is something else I'm doing fundamentally wrong?

[*] For example, in PHP, there is the mb_strtolower-function to be used instead of strtolower if there's a possibility that the string to be handled contains multibyte characters.

Tom Kyte
October 23, 2005 - 4:41 am UTC

You can glance at
</code> http://docs.oracle.com/docs/cd/B14117_01/server.101/b10749/ch7progrunicode.htm#sthref790 <code>

but there is no special "lower" function that I'm aware of.

nls_lower ?

Alberto Dell'Era, October 23, 2005 - 5:56 am UTC

There's NLS_LOWER:
</code> http://download-west.oracle.com/docs/cd/B14117_01/server.101/b10759/functions088.htm#sthref1476 <code>

In 10.2.0.1, database charset AL32UTF8:

dellera@ORACLE10> select 'match' from dual where nls_lower(unistr ('\041C'),'nls_sort=finnish') = unistr ('\043C');

'MATCH'
---------------
match

dellera@ORACLE10> select 'match' from dual where nls_lower(unistr ('\041C')) = unistr ('\043C');

'MATCH'
---------------
match

But even good-old lower seems to give the same results:

dellera@ORACLE10> select 'match' from dual where lower (unistr ('\041C')) = unistr ('\043C');

'MATCH'
---------------
match

Maybe the two-byte char got incorrectly inserted by the client as two single-byte characters ? That would be consistent with "both of the bytes making up this character have been correctly lowered".

Tom Kyte
October 23, 2005 - 1:39 pm UTC

right, but that is for when the character set of the client is "wrong", not what you wanted....


the character set stuff is designed to be "transparent" from the client perspective if everything is set up right..

Alberto Dell'Era, October 23, 2005 - 2:10 pm UTC

> the character set stuff is designed to be "transparent" from
> the client perspective if everything is set up right..

Yes, absolutely, that's what my trivial test case shows :)

I was just guessing that the issue of the poster is exactly that, say the client charset set wrong that turns N-byte chars into N chars when the string is inserted ;)

Markku Uttula, October 23, 2005 - 6:16 pm UTC

I finally figured out what I was doing wrong. As I suspected, there was a fundamental flaw in my approach...

I was actually attempting to query UTF8 encoded data from the database instead of using the characters themselves. This caused the client to encode the data I was querying as UTF8, which naturally was not what I wanted.

Meaning... instead of actually using character /u041C I was using the UTF8 encoded form М which was correctly (but unexpectedly by me) inserted into the database as two different characters; /u00D0 and /u009C (I think)...

Special thanks for guiding me through this problem. Sorry for taking your time, but this seems to be the only way for me to live and learn :)

Tom Kyte
October 24, 2005 - 12:28 am UTC

.... but this seems to be the only way for me to live and learn ...

but isn't that true of us all.

Index on date columns

Vishal Tandon, October 25, 2005 - 9:29 am UTC

select *
from table1
where mod_date > extract_date

mod_date and extract_date are both columns of the same table, table1. This table has more than 10 million records. Can
Oracle9i use an index on these date columns? Can you recommend an index for this SQL statement?

Tom Kyte
October 26, 2005 - 7:21 am UTC

create index t_idx on table1( mod_date-extract_date );


select * from table1 where (mod_date-extract_date) > 0;




Got a problem with character sets and case conversion ...

ASCII sufferer., February 01, 2006 - 8:49 am UTC

Hey Tom,

I was hoping I could get some insight from you on this ...
I feel I have a good idea of what's going on, and why .. but I'm not sure about a "good" solution to the problem.

I've recommended a number of times we "bite the bullet" and upgrade our character set ... but management is hesitant to do it ... so alas, we're kinda "stuck" where we are ... *sigh*

In any case, it's probably easier with a sample case than anything else ... (and I'm pretty sure this'll be consistent on an Oracle 10 database ... I recall seeing similar behaviour when we were testing for our upgrade ...)

The following is on Oracle 8i (8.1.7) ... version specs further down (including NLS stuff) ...

===================
Script:
=================
set pause off

drop table t;

create table t
( a varchar2(1) );

insert into t values ( chr(199) );

commit;

select * from t;

select version from v$instance;

select * from NLS_DATABASE_PARAMETERS;

select a, lower(a), nls_lower(a), nls_lower(a, 'nls_sort=french') from t;

select 199 - 128, chr(199-128), ascii(lower(chr(199-128))), chr(ascii(lower(chr(199-128)))) from dual;

select chr(199), chr(199+32) from dual;

select rnum, chr(rnum), chr(rnum + 32)
from ( select rownum rnum
from all_objects
where rownum < 256 )
where rnum between 192 and 223;
=========================
Output:
===============
Table dropped.


Table created.


1 row created.


Commit complete.


A
-
Ç

1 row selected.


VERSION
-----------------
8.1.7.4.0

1 row selected.


PARAMETER VALUE
------------------------------ ----------------------------------------
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CHARACTERSET US7ASCII
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-YYYY
NLS_DATE_LANGUAGE AMERICAN
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZH:TZM
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZH:TZM
NLS_DUAL_CURRENCY $
NLS_COMP BINARY
NLS_NCHAR_CHARACTERSET US7ASCII
NLS_RDBMS_VERSION 8.1.7.4.0

18 rows selected.


A L NL NL
- - -- --
Ç g g g

1 row selected.


199-128 C ASCII(LOWER(CHR(199-128))) C
---------- - -------------------------- -
71 G 103 g

1 row selected.


C C
- -
Ç ç

1 row selected.


RNUM CH CH
---------- -- --
192 À à
193 Á á
194 Â â
195 Ã ã
196 Ä ä
197 Å å
198 Æ æ
199 Ç ç
200 È è
201 É é
202 Ê ê
203 Ë ë
204 Ì ì
205 Í í
206 Î î
207 Ï ï
208 Ð ð
209 Ñ ñ
210 Ò ò

RNUM CH CH
---------- -- --
211 Ó ó
212 Ô ô
213 Õ õ
214 Ö ö
215 × ÷
216 Ø ø
217 Ù ù
218 Ú ú
219 Û û
220 Ü ü
221 Ý ý
222 Þ þ
223 ß ÿ

32 rows selected.

==============
So yeah ... here's what I know (please correct me if I got any of this wrong):

1) Character set: US7ASCII doesn't really support French characters -- that is, anything with an ASCII code > 127.
2) Ç shows as ASCII code 199, and is therefore "converted" on the fly to, essentially, MOD(199, 128) = 71.
3) ASCII code 71 = G; lower case G = g.
4) This data was "forced" into our database over 10 years ago by developers who are no longer with us; we've kind of "inherited" this.
5) Our developers are currently trying to convert a string containing French characters to lower case ... and obviously, it's not working.
6) I've recommended to managers a number of times that we need to upgrade our character set (and that said conversion is going to be "messy" - to say the least), as trying to convert this stuff and not lose any data is going to require some ... uh .. "finesse"?
7) In the meantime, an "immediate" workaround is needed. I've noticed that a "shift" of 32 seems to work for most "upper case" characters that we have ... we're considering writing our own "upper/lower" case routine to do this for us ...

I already know what you're going to say ...

"Yuck" .. ;p

But I was hoping for any other suggestions/etc., that you might have that could help us in avoiding any major issues as we move forward with this .... "thing" ...

Thanks in advance!!


Tom Kyte
February 02, 2006 - 3:52 am UTC

I have no good solution/ideas for you - your observations are correct. You've got 7bit characters that have had 8bit data stuffed into them. Any character set conversions are going to be extremely problematic.

That's what I was afraid of ...

A reader, February 02, 2006 - 7:56 am UTC

That's kewl ... thank-you ...
Just wanted to make sure there wasn't something I was missing .. :)

Just one more question:

spooling the data out via a query + spool command, then bringing it back into a converted database via sql*loader *should* work, correct??


Tom Kyte
February 02, 2006 - 12:07 pm UTC

unloading the data with NO CHARACTERSET conversion (nls_lang of client = database)

and reloading with proper 8bit characterset, should work.

don't know if I would use spool, more likely a custom program....
see http://asktom.oracle.com/~tkyte/flat/index.html
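For the reload side, a minimal sketch of what the SQL*Loader control file might look like, assuming the target database uses WE8ISO8859P1 and the unloaded file is emp.dat (file name, table and column names here are just placeholders). The CHARACTERSET clause tells SQL*Loader what the bytes in the data file really are, so they are loaded without being mangled:

-- emp.ctl (placeholder names throughout)
LOAD DATA
CHARACTERSET WE8ISO8859P1
INFILE 'emp.dat'
INTO TABLE emp
FIELDS TERMINATED BY '|'
( empno, ename )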

Thank-you!!

A reader, February 02, 2006 - 1:43 pm UTC

Cannot thank-you enough for that little bit of info!!

Thank-you .. thank-you .. thank-you ...
uh ..

Thank-you ..
;)


Great solution

Jim, February 02, 2006 - 2:03 pm UTC

Doing some work trying to get some data out of an Oracle DB (US 7 bit ascii) where all sorts of things have been stuffed into it. The vendor has put us in a sticky position (should give you a hint as to which vendor) since they recommend the client and database be on the same NLS settings so data can be forced in. (says to do it that way in their FAQ and not to worry because you shouldn't be sending the data to non-sticky systems via direct database access)

So your response above means we can't use import and export, but should do it via flat files, correct?

Tom Kyte
February 02, 2006 - 2:34 pm UTC

import and export would not be good here because import and export are very careful to OBEY the characterset - conversion would have to kick in with them.

Getting the field names

Tony, February 03, 2006 - 12:37 pm UTC

Tom,

The following is the SQL code for one of the values. What I want is to pull out only the column names that are used in this statement.

The columns used in the expression all come from a single table.

Do you have any idea how to get this...


CASE WHEN NVL(V_MED_PLASTIC_ID,'0') = '065' AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('88000','88020','88029','88030','88037','88040','88045','88049','88050','88060','90020','90021','90022','90024','72031','72037','92050') AND SUBSTR(NVL(V_MED_GROUP_CODE,'~~'),1,2) NOT IN ('50','52','70','71','72') AND ( NVL(SUBSTR(DECODE(V_MED_DWH_USER_CODE,'MSN',NULL,V_MED_DWH_USER_CODE),2,1),'MSN') IN ('F','G','X')) THEN 'MAIL' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,'0') = '065' AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('88000','88020','88029','88030','88037','88040','88045','88049','88050','88060','90020','90021','90022','90024','72031','72037','92050') AND SUBSTR(NVL(V_MED_GROUP_CODE,'~~'),1,2) NOT IN ('50','52','70','71','72') AND NVL(SUBSTR(DECODE(V_MED_DWH_USER_CODE,'MSN',NULL,V_MED_DWH_USER_CODE),1,3),'~') = '1RP' THEN 'DIRECT SALES VISA CLASSIC' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,'0') = '065' AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('88000','88020','88029','88030','88037','88040','88045','88049','88050','88060','90020','90021','90022','90024','72031','72037','92050') AND SUBSTR(NVL(V_MED_GROUP_CODE,'~~'),1,2) NOT IN ('50','52','70','71','72') AND ( NVL(SUBSTR(DECODE(V_MED_DWH_USER_CODE,'MSN',NULL,V_MED_DWH_USER_CODE),3,1),'MSN') IN ('C')) THEN 'STUDENT' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,'0') = '065' AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('88000','88020','88029','88030','88037','88040','88045','88049','88050','88060','90020','90021','90022','90024','72031','72037','92050') AND SUBSTR(NVL(V_MED_GROUP_CODE,'~~'),1,2) NOT IN ('50','52','70','71','72') AND NVL(SUBSTR(DECODE(V_MED_DWH_USER_CODE,'MSN',NULL,V_MED_DWH_USER_CODE),1,3),'~') = 'LPP' THEN 'PREEMBOSSED'
ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,'0') = '065' AND NVL(V_MED_GROUP_CODE,'~') IN ('88000','88020','88029','88030','88037','88040','88045','88049','88050','88060','90020','90021','90022') THEN 'CitiOne Free' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '065' AND NVL(V_MED_GROUP_CODE,'~') IN ('90024') THEN 'Visa Card Extra' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '065' AND SUBSTR(NVL(V_MED_GROUP_CODE,'~~'),1,2) IN ('50','52','70','71','72') AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('72031','72037') THEN 'GROUP' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '065' AND NVL(V_MED_GROUP_CODE,'~') IN ('72031', '72037', '92050') THEN 'DUO' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '067' THEN 'HANSA ROSTOCK' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '068' THEN 'SV DUISBURG' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '069' THEN '1.FC KAISERSLAUTERN' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '070' THEN 'HAMBURGER SV' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '071' THEN 'BORUSSIA M''GLADBACH' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '072' THEN 'BORUSSIA DORTMUND' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '091' THEN 'BERLET' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '092' THEN 'INNOVA' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '093' THEN 'RUDER' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '095' THEN 'MEDIAMARKT' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '096' THEN 'SATURN' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '097' THEN 'TECHNOLAND' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '099' THEN 'HOERCO' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '100' THEN 'VOBIS' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '110'OR NVL(V_MBS_LOGO,0) = '123' THEN 'EBAY'
ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '111' THEN 'PREMIUM CLASSIC' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '113' THEN 'GUENSTIGER.DE' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) IN ('085','086','087','088','089','090') OR NVL(V_MBS_LOGO,0) = '042' THEN 'VISA BAHN' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '066' AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('90027','90134') AND NVL(SUBSTR(DECODE(V_MED_DWH_USER_CODE,'MSN',NULL,V_MED_DWH_USER_CODE),1,3),'~') = '4RG' THEN 'DIRECT SALES VISA GOLD' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '066' AND NVL(V_MED_GROUP_CODE,'~') IN ('90027') THEN 'CITI BEST' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '066' AND NVL(V_MED_GROUP_CODE,'~') IN ('90134') THEN 'CITI GOLD' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '066'OR NVL(V_MBS_LOGO,0) = '035' THEN 'VISA GOLD OTHER' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '109' OR NVL(V_MBS_LOGO,0) = '135' THEN 'AADVANTAGE' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '102' OR NVL(V_MBS_LOGO,0) = '055' THEN 'VODAFONE RED' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '103' OR NVL(V_MBS_LOGO,0) = '056' THEN 'VODAFONE SILVER' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '104' OR NVL(V_MBS_LOGO,0) = '057' THEN 'VODAFONE GOLD' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '105' OR NVL(V_MBS_LOGO,0) = '058' THEN 'VODAFONE PLATINUM' ELSE CASE WHEN (NVL(V_MED_PLASTIC_ID,0) = '121' or NVL(V_MED_PLASTIC_ID,0) = '118') AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('90027','90134') AND NVL(SUBSTR(DECODE(V_MED_DWH_USER_CODE,'MSN',NULL,V_MED_DWH_USER_CODE),1,3),'~') = '2RM' THEN 'DIRECT SALES MC CLASSIC'
ELSE CASE WHEN (NVL(V_MED_PLASTIC_ID,0) = '121' or NVL(V_MED_PLASTIC_ID,0) = '118') AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('90027','90134') AND NVL(SUBSTR(DECODE(V_MED_DWH_USER_CODE,'MSN',NULL,V_MED_DWH_USER_CODE),3,1),'MSN') IN ('C') THEN 'MC STUDENT' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '122' AND NVL(V_MED_GROUP_CODE,'~') IN ('90027') THEN 'MC CITI BEST' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '123' THEN 'Ultima' ELSE CASE WHEN (NVL(V_MED_PLASTIC_ID,0) = '121' or NVL(V_MED_PLASTIC_ID,0) = '118') AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('90027','90134') AND NVL(SUBSTR(V_MED_DWH_USER_CODE,1,3),'~') = 'DPM' THEN 'PREGENERATED' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '122' AND NVL(V_MED_GROUP_CODE,'~') NOT IN ('90027','90134') AND NVL(SUBSTR(DECODE(V_MED_DWH_USER_CODE,'MSN',NULL,V_MED_DWH_USER_CODE),1,3),'~') = '5RQ' THEN 'DIRECT SALES MC GOLD' ELSE CASE WHEN (NVL(V_MED_PLASTIC_ID,0) = '121' or NVL(V_MED_PLASTIC_ID,0) = '118') OR NVL(V_MBS_LOGO,0) = '059' THEN 'MASTERCARD CLASSIC OTHER' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '122' OR NVL(V_MBS_LOGO,0) = '060' THEN 'MASTERCARD GOLD OTHER' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '065' OR NVL(V_MBS_LOGO,0) IN ('022','023','024','032','132') THEN 'VISA CLASSIC OTHER' ELSE CASE WHEN NVL(V_MED_PLASTIC_ID,0) = '0' OR NVL(V_MBS_LOGO,0) = '0' THEN 'MSN' ELSE 'OTHERS' END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END END


Back here again? :)

Markku Uttula, February 03, 2006 - 1:35 pm UTC

For some reason, whenever I'm having trouble with character sets and I google for a solution, I end up in this thread. And very often, I find the solution (or a path to one) here.

However, I have a few questions on some things in here...

"You remove the hints when using the CBO so the CBO can do its job."

Have I misunderstood something? If we want to use the CBO, are hints considered more than "hints"? I've understood that they are not binding directives for the engine, and can be dismissed if a better execution path is found?

Then another thing for anyone who has trouble with character sets: I found Oracle's NLS_LANG FAQ very useful. It can be found at http://www.oracle.com/technology/tech/globalization/htdocs/nls_lang%20faq.htm

I think there's a lot to globalisation, and people very often choose the wrong path to doing things (like me; I had my NLS_LANG set to .AL32UTF8 since it allowed me to do many things the way I expected them to work. However, I was wrong, and had to fix a lot of things).

Another question (in addition to CBO & hints) ... currently I have my web server's DADs set up as follows (in regards to globalisation needs):

nls_lang = .AL32UTF8
cgi_env_list = REQUEST_CHARSET=UTF-8,REQUEST_IANA_CHARSET=UTF-8

This seems to work fine, but Apache warns me any time it's started that: "[alert] mod_plsql: Wrong language for NLS_LANG .AL32UTF8 for ... DAD". Should I provide it with language and territory too? I understood that what I've given it would be sufficient (after all, I don't know the actual language and/or territory of a connecting user, so we might as well use the defaults in this case).

And finally, I definitely must send my congrats to Tony; I haven't seen a "turboquery" like the one you presented in a very long time :) I honestly am *not* making fun of you - I understand that sometimes doing things like that *is* necessary, but I wouldn't like to be the developer it gets thrown at without a warning... hope it's documented thoroughly... for the sake of the mental health of "the next developer".

Tom Kyte
February 03, 2006 - 5:25 pm UTC

hints, if well formed and expressed fully, will be obeyed. It is when they are only halfway specified, or incorrectly specified, that they can appear to be "ignored".

So the answer is yes, no, maybe - depends on how correctly and strongly the hint was specified.



Getting the field names

Tony, February 05, 2006 - 3:50 am UTC

Tom,

Can you please please guide me on this?

Please see the message mentioned above from me.

Thx
Tony

Tom Kyte
February 06, 2006 - 12:35 am UTC

I see a really big huge query above that I don't have time to digest and comprehend whilst quickly reading what should be reviews. It is *huge* and a problem in PARSING SQL - you need an algorithm, a program - not really something I can just say "use database feature X".

You have to write some code I guess.

Problem will be that "me, as a human being, with just that text - I cannot tell you what the column names are"

....
CASE WHEN NVL(V_MED_PLASTIC_ID,'0') = '065'


is v_med_plastic_id a database column - maybe yes, maybe no, just maybe.

You would have to write a sql parser, a non-trivial sort of operation.

Getting the field names

Tony, February 07, 2006 - 7:22 am UTC

Tom,

I am really sorry if I wasted your valuable time.

What exactly I want is,
If you have any CASE statement/DECODE or any SQL expression,
how can I find out which database columns are used in that expression?

Example: CASE when empid=7788 then 'Yes' else 'No' end

In the above CASE statement we used empid, which is a database field. We can tell because this is a very small CASE statement.

But is there any way we can get this information for big sql statements where lots of database fields are used?

Please guide me.

Thanks in advance.

Tony


Tom Kyte
February 08, 2006 - 12:48 am UTC

you didn't waste my time, I didn't really spend much of it.

...
Exa : CASE when empid=7788 then 'Yes' else 'No' end
.....

I cannot say that empid is a database column. You cannot say it definitely either. It could be a function, it could be a plsql variable.

But anyway - like I said, you would have to write a SQL parser to do this!

something that could take any CASE statement, tokenize it, and then try to figure out what are POSSIBLE database fields and what are definitely not.

What is your real need here, how about you tell us why you want to do this, maybe we can help you achieve your goal that way (writing a parser, that won't happen)
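Just to give a feel for it, here is a deliberately naive token-matching sketch (not a parser): it pulls identifier-looking tokens out of an expression with regular expressions and intersects them with USER_TAB_COLUMNS. It assumes 11g or later (for REGEXP_COUNT), the table name 'T' is only a placeholder, and it will happily report PL/SQL variables or literals that happen to match a column name - which is exactly the ambiguity described above:

variable expr varchar2(4000)
exec :expr := q'[CASE WHEN NVL(V_MED_PLASTIC_ID,'0') = '065' THEN 'MAIL' ELSE 'OTHERS' END]';

select distinct tok
from  ( select upper( regexp_substr( :expr, '[A-Za-z_][A-Za-z0-9_$#]*', 1, level ) ) tok
        from   dual
        connect by level <= regexp_count( :expr, '[A-Za-z_][A-Za-z0-9_$#]*' ) )
where tok in ( select column_name
               from   user_tab_columns
               where  table_name = 'T' );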

Getting the field names

Tony, February 08, 2006 - 3:51 am UTC

Tom,

Thanks,

I am creating a data dictionary, which will show me the data as:

Dimension_nm |SQL_Expr |T_Dt_Base_fld |Row_Fld
-------------|-------------|--------------|-----------
DIM_PROD |CASE WHEN ...| V_T_PROD |V_PROD
..........
.......

------------------------------------------------------

In this :
Dimension nm : Dimension Name
SQL_Expr : The SQL expression which is used to build the Dimension
T_Dt_Base_fld : The Fields used in the SQL Expression
Row_Fld : The row table field from which T_Dt_Base_fld is created; this can be done because there is a direct link between the row table field and T_Dt_Base_fld


This is what my requirement is. The dimension SQL expressions are very long, there are already almost 5,000 dimensions, and more are yet to be created in the future, so please give me some suggestions/guidance on how to achieve this.

Thanks in advance.

Tony


Tom Kyte
February 08, 2006 - 8:08 am UTC

sorry, only thing you can do is "write a sql parser". There is no magic for this one that I know of.

You have an arbitrary sql string, you would need to parse it.

Getting the field names

Tony, February 09, 2006 - 3:48 am UTC

Tom,

Thanks, Tom, for your time.

Can you please help me to write the SQL parser for this?

Just give me the idea and the steps involved in it.

Thanks in advance

Tony

Tom Kyte
February 10, 2006 - 10:54 am UTC

nope, I don't happen to have a sql parser lying about. They are quite large pieces of code to write actually....

Getting the field names

Tony, March 13, 2006 - 3:46 am UTC

Hi Tom,

I tried to get some information about the SQL parser but was not very successful.

Can you guide me where or from which book I can get this information?

Many thanks in advance.

Tony

Tom Kyte
March 13, 2006 - 10:05 am UTC

what information about the parser do you want??


I don't know what you mean - your subject seems to say "I would like to get the names of the columns in the query" - if so, you'd need to say "what language" you are using so one could point you to the correct API call for your language.

After you prepare (parse) a sql statement, there are typically API calls in your programming language you can make.

Getting the field names

TONY, March 14, 2006 - 4:22 am UTC

Tom,

I am using oracle. It is SQL.

I am looking for
1) Identify various database objects such as tables, fields, and functions in SQL statements.
2) Continuously check multiple SQL statements even if there are syntax errors in some of these statements.

Example :
If empno and ename are database fields in my database,

then in the statement
"Case when empno=10 then 'Go' when ename = 'TONY' then 'Cancel' end "

if I want all the database objects from the above query, I need the SQL parser.

Can you please tell me which book I can get this information from.

Many thanks in advance.

TONY





Tom Kyte
March 14, 2006 - 10:55 am UTC

you must be using some "programming language" other than SQL - since SQL is not a procedural language.


I believe we've had this discussion before though - you want to get the "column names where ever they appear in a sql statement"

And as I said before "no such feature exists in the database, you would have to write your OWN parser and resolver using whatever language YOU choose"

We can take a query in the form

select <stuff>
from ....

and tell you the column names <stuff> represents for the result set, but there is nothing to tell you the names of every referenced column in that query.
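For completeness, that "column names of the result set" part can be done generically with DBMS_SQL. A small sketch follows (the query text is just an example); note it lists only the SELECT-list names, not every column referenced inside the statement:

set serveroutput on

declare
    l_cursor  integer := dbms_sql.open_cursor;
    l_count   integer;
    l_desc    dbms_sql.desc_tab;
begin
    -- parse the statement, then ask Oracle to describe its select list
    dbms_sql.parse( l_cursor, 'select empno, ename from emp', dbms_sql.native );
    dbms_sql.describe_columns( l_cursor, l_count, l_desc );
    for i in 1 .. l_count
    loop
        dbms_output.put_line( l_desc(i).col_name );
    end loop;
    dbms_sql.close_cursor( l_cursor );
end;
/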

Getting the field names

Tony, March 15, 2006 - 5:34 am UTC

Tom,

Oh, sorry Tom,

Actually there is a portal where users will see the data dictionary.

I am not fully aware of what language to use to get this stuff done.

Java is one of the options.

First I need to understand the approach, then I can decide which language to use.

Anyhow, thanks very much for your help.

Tony

Tom Kyte
March 15, 2006 - 5:03 pm UTC

you are programming this portal. Whatever language you are using to program this portal must have some SQL api (else - we wouldn't even be having this discussion).




Table Access by Rowid

Su Baba, April 14, 2006 - 1:28 pm UTC

I'm simulating an example from one of your books. This is used for a case insensitive search on multiple columns. In the execution plan generated below, I was surprised to find that Oracle does not try to access table x by using the rowids it retrieved from fast_x. Is a hash join really a much more efficient way to accomplish the task?

Also is this still the way to do case insensitive search with a predicate like "LIKE '%xxx%'" in Oracle9i? In Oracle10g?


SQL> select * from v$version;

BANNER
------------------------------------------------------
Oracle9i Release 9.2.0.7.0 - Production
PL/SQL Release 9.2.0.7.0 - Production
CORE    9.2.0.7.0       Production
TNS for 32-bit Windows: Version 9.2.0.7.0 - Production
NLSRTL Version 9.2.0.7.0 - Production


CREATE TABLE x AS
SELECT * FROM all_objects;

ANALYZE TABLE x COMPUTE STATISTICS;

CREATE TABLE fast_x
PCTFREE 0
CACHE
AS
SELECT LOWER(object_type) || '/' || LOWER(object_name) || '/' || LOWER(owner) search_string,
       rowid row_id
FROM   x;

ANALYZE TABLE fast_x COMPUTE STATISTICS;


EXPLAIN PLAN FOR
SELECT object_id, owner, object_type, object_name
FROM   x
WHERE  rowid IN (SELECT row_id
                 FROM   fast_x
                 WHERE  search_string LIKE LOWER('%mem%'));

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------

--------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |  1216 |   168K|    71 |
|*  1 |  HASH JOIN           |             |  1216 |   168K|    71 |
|   2 |   SORT UNIQUE        |             |       |       |       |
|*  3 |    TABLE ACCESS FULL | FAST_X      |  1216 | 60800 |    19 |
|   4 |   TABLE ACCESS FULL  | X           | 22642 |  2034K|    35 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("X".ROWID="FAST_X"."ROW_ID")
   3 - filter("FAST_X"."SEARCH_STRING" LIKE '%mem%')

Note: cpu costing is off

18 rows selected.


 

Tom Kyte
April 14, 2006 - 1:55 pm UTC

there are other ways (such as using text and a substring prefix index).

In this case - it is thinking it will get 1,216 out of 22,642 rows - and decided to full scan. You may have to first-rows optimize this instead of the default all-rows (used by the CBO with CHOOSE or ALL_ROWS).
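Purely as an illustration of that, one could hint the query first-rows and compare plans (the FIRST_ROWS(n) hint is standard; the value 10 here is arbitrary):

EXPLAIN PLAN FOR
SELECT /*+ first_rows(10) */ object_id, owner, object_type, object_name
FROM   x
WHERE  rowid IN (SELECT row_id
                 FROM   fast_x
                 WHERE  search_string LIKE LOWER('%mem%'));

select * from table(dbms_xplan.display);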

Su Baba, April 14, 2006 - 2:10 pm UTC

Doesn't a text index require re-synching? Would that be appropriate for a highly transactional system?

Tom Kyte
April 14, 2006 - 4:46 pm UTC

that is one of the issues with it.

but in a transactional system, I don't see you searching like this anyway - are you sure it is "highly transactional"?

see
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:37336026927381


substring prefix index

Su Baba, April 14, 2006 - 2:18 pm UTC

Oh...can you also talk about substring prefix index? I just did a search on this site and found nothing on this topic. Thanks.

Converting to Upper Case

Jay, May 07, 2007 - 2:09 pm UTC

Hello Tom,

Good day to you! I have a query that has a parameter. The user can enter multiple comma separated values as a parameter.

Eg: Where State_Code in ('dc','va','ky')

Could you please give me an idea how to convert all of the values to upper case? Is this possible, given that the UPPER function takes only a single string?

Upper('va') works.
Upper('va','dc') fails :-(

Thanks in advance for your time and your help.

Have a good day.




Tom Kyte
May 10, 2007 - 7:04 pm UTC

when you have a quote in a string, you use

a) two quotes
b) new 10g feature for quoted identifiers
c) BINDS
d) BINDS
e) BINDS instead of literals.


if you use binds, NO PROBLEM AT ALL.

if you use literals, you have to obviously escape the embedded quotes.

so, use binds.
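A tiny illustration of (a) and (b), using a made-up value with an embedded quote:

select 'O''Brien'    doubled_quotes,
       q'[O'Brien]'  q_quoted_10g
from   dual;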

Converting list of values to upper case

András Gábor, May 11, 2007 - 8:01 am UTC

You could use a subquery to select all values from a collection and convert them one by one like this:

SQL> SELECT UPPER(column_value) 
  FROM TABLE(sysman.varchar2_table('dc','va','ky'))
;

UPPER(COLUMN_VALUE)
-------------------
DC
VA
KY

3 rows selected


If you don't use 10g or just don't like the type SYSMAN.VARCHAR2_TABLE, you can create your own and use that:

CREATE TYPE varchar2_table IS TABLE OF VARCHAR2(<max_length>)
/

Note: I actually found the sysman-owned type by looking for an appropriate one in DBA_SOURCE and DBA_TYPES views:

SELECT *
  FROM dba_source
 WHERE (owner, name) IN
        (SELECT owner, type_name
           FROM dba_types
          WHERE typecode = 'COLLECTION'
            AND owner LIKE 'SYS%')
   AND LOWER(text) LIKE '%varchar%'
 ORDER BY 1, 2, 3, 4
;

Thank you!

Jay, May 11, 2007 - 9:28 am UTC

Thanks a lot for your time Tom and András Gábor.

Have a good day!

case insensitive search,

A reader, October 25, 2011 - 10:10 am UTC

Hello,

We have a requirement to perform both regular and case-insensitive search on the same set of data. The application will let the database know about the choice the user has picked (case sensitive or insensitive) and it has to return the data accordingly.

I am confused about how to provide this kind of solution. It appears a column now requires two indexes: 1. regular and 2. function based. The regular index will be used by the regular search and the function based index by the case-insensitive search.

Question: Can a column have two indexes? Indexes are expensive. Is it a good idea to have two indexes on a column? How can we make the CBO understand to use the FBI instead of the regular index when a case-insensitive search happens?

I want to see if any other solution is available for the business requirement.

Thanks,

Tom Kyte
October 25, 2011 - 12:06 pm UTC

Can a column have two indexes?

no, but fortunately you will be ok.

create index i1 on t(c);
create index i2 on t(nlssort(c,...));

are on two different expressions - so it works out ok

ops$tkyte%ORA11GR2> create table t ( data varchar2(20) );

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into t values ( 'Hello' );

1 row created.

ops$tkyte%ORA11GR2> insert into t values ( 'HeLlO' );

1 row created.

ops$tkyte%ORA11GR2> insert into t values ( 'HELLO' );

1 row created.

ops$tkyte%ORA11GR2> insert into t values ( 'hello' );

1 row created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index t_idx1 on
  2  t( nlssort( data, 'NLS_SORT=BINARY_CI' ) );

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index t_idx2 on t(data);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> variable x varchar2(25)
ops$tkyte%ORA11GR2> exec :x := 'hello';

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace on explain
ops$tkyte%ORA11GR2> select * from t where data = :x;

DATA
--------------------
hello


Execution Plan
----------------------------------------------------------
Plan hash value: 3020041354

---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |     1 |    12 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| T_IDX2 |     1 |    12 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DATA"=:X)

Note
-----
   - dynamic sampling used for this statement (level=2)

ops$tkyte%ORA11GR2> alter session set nls_comp=ansi;

Session altered.

ops$tkyte%ORA11GR2> alter session set nls_sort=binary_ci;

Session altered.

ops$tkyte%ORA11GR2> select * from t where data = :x;

DATA
--------------------
Hello
HeLlO
HELLO
hello


Execution Plan
----------------------------------------------------------
Plan hash value: 4055207394

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |    99 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |     1 |    99 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX1 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(NLSSORT("DATA",'nls_sort=''BINARY_CI''')=NLSSORT(:X,'nls_sort=''
              BINARY_CI'''))

Note
-----
   - dynamic sampling used for this statement (level=2)

ops$tkyte%ORA11GR2> set autotrace off





Indexes are expensive.

that is far far far too simplistic. You cannot think of it in those terms.


If you have a 1,000,000 row table and you need to find 10 rows from it quickly using few resources - an index will be the least expensive approach won't it?

It is a tradeoff - a bit of a hit on modification in order to retrieve better. Remember - you insert a row once and hopefully you query it over and over and over again and again and again....

follow up,

A reader, October 25, 2011 - 12:50 pm UTC

That was a very good example. thanks.

So the setting of nls_comp and nls_sort is how the CBO knows to pick the FBI? Is there any other option (without setting NLS parameters dynamically)?

Appreciate your help.

Thanks,

Tom Kyte
October 25, 2011 - 1:12 pm UTC

you could just write the query to use the nlssort function directly.

ops$tkyte%ORA11GR2> select * from t where nlssort(data,'nls_sort=binary_ci') = nlssort(:x,'nls_sort=binary_ci' );

DATA
--------------------
Hello
HeLlO
HELLO
hello


Execution Plan
----------------------------------------------------------
Plan hash value: 4055207394

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |    99 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |     1 |    99 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX1 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(NLSSORT("DATA",'nls_sort=''BINARY_CI''')=NLSSORT(:X,'nls_sort=''
              BINARY_CI'''))



but most people would find it easier to set a setting in the application rather than have to figure out "should I run this query or that query"

The alter makes it quite "transparent" so you could use a single set of code and just flip a switch on and off when you wanted to.

like search

santosh, March 29, 2012 - 9:17 am UTC

We have a case where our application triggers a LIKE search for an exact match.

Similar to the below queries

select * from emp where empno like '%123%'
and field1..=
and field2..=
...
...

select * from emp where empno = 123
and field1..=
and field2..=
...
...

Ideally, the generated query should be like the second query above instead of the first.

Only now do we have a performance problem. Query 2 returns quickly compared to Query 1.
Users never encountered issues before.

Query 1 goes for an index range scan and query 2 goes for an index skip scan (or some other scan), returning no output.

My assumption is that users had faster performance only because they probably had a small percentage of matching records earlier; now we have none, and we need to wait for a full scan of the index.
