Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Madhusudhana Rao.

Asked: March 23, 2001 - 11:29 pm UTC

Last updated: June 01, 2018 - 7:29 am UTC

Version: Oracle 8.1.6

Viewed 10K+ times! This question is

You Asked

Hi Tom,

In comparison to an Index on CHAR field and Index on NUMERIC field
pls let me know which Index(whether CHAR or NUMERIC) would be faster.

Thanx & Regards,
Madhusudhana Rao.P

and Tom said...

They would be the same. Numbers are stored in a packed decimal like format in Oracle -- they are character strings.

I would suggest not using CHAR but varchar2 instead as a CHAR is simply a varchar2 that is always blank padded out to the maximum length (there is no efficiency of char over varchar2, char just takes more space)

Rating

  (8 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Index On Char Column

Ron, April 01, 2002 - 3:14 pm UTC

Tom,
I have a char(5) column in my Table, but when we are storing the variable length data in this column like 'ABC','ABCD','ABCDE','AB'.We have an index on this column and most of our selects are based on this column, As I am using TRIM function on this column it is not using the index.I know changing it to varchar2 is one solution.But I can not do it now..Is there any alternate way to force the sql to use the index on this column..

Thanks in advance..

Ron

Tom Kyte
April 01, 2002 - 8:36 pm UTC

read
</code> http://asktom.oracle.com/~tkyte/article1/index.html <code>

char's stink, don't use them again.

Index On Char Column

Ron, April 01, 2002 - 3:18 pm UTC

Sorry I missed the Version in the above question.
My Database version is 8.0.5

Tom Kyte
April 01, 2002 - 8:38 pm UTC

Oh, sorry then, you need 8i EE and up for my suggestion above.

You'll be wanting to code:

select * from table where that_column = rpad( :bind_variable, 5 );

in the meanwhile. that'll let you use the index (rpad the thing to be compared to, don't rtrim the column in the table)

Dave, April 01, 2002 - 11:11 pm UTC

Is it worth noting that in the Performance Guide and Methods at ...
</code> http://206.204.21.139/otndoc/oracle9i/901_doc/server.901/a87503/data_acc.htm#2416 <code>
... under "Performance tips for Bitmap Indexes" it states that "Fixed-length datatypes are more amenable to a compact bitmap representation than variable length datatypes."?

I'll admit that I haven't benchmarked any difference but I'm guessing that Oracle developers must have done so in order to make such a statement.

Incidentally, just above this it states that ...

"...a B-tree index on (marital_status, region, gender) is useless for queries that only access region and gender....". Ooops.

Tom Kyte
April 02, 2002 - 8:08 am UTC

The entire thing is:

<quote>
Performance Tips for Bitmap Indexes

To get optimal performance and disk space usage with bitmap indexes, consider the following tips:

* To make compressed bitmaps as small as possible, declare NOT NULL constraints on all columns that cannot contain null values.

* Fixed-length datatypes are more amenable to a compact bitmap representation than variable length datatypes.

This is because Oracle needs to consider the theoretical maximum number of rows that will fit in a data block when creating bitmap indexes.
</quote>

Amenable is a "fuzzy term" if you ask me. They are trying to save space in a bitmap index (small thing) in a datawarehouse (big thing). With the bitmap index -- the internal algorithms can make the best use of the bitmaps (utilization) with a fixed field (the math is easier) but the trade off is -- the field itself is "wider"

Tell me -- would you use char(80) on a department name field where the average size is 30 characters and you have a billion rows in the table? I myself would use varchar2(80) and reap the savings in size in the table itself. The bitmap index on the char(80) would necessarily be larger as well if you think about it.

You can benchmark this with a bitmap index on a varchar2 and another on the same field using rpad( , some_fixed_length ) to see which is better for you. I didn't see any noticable differences (other then the rpad'ed one was bigger)

And yes, it looks like they forgot to update the perf guide with respect to index skip scanning.

Dave, April 02, 2002 - 8:47 am UTC

Indeed, the use of a Char data type to store 80 characters in the example you give is crazy talk.

I think you're unlikely to try and use such a field width as a key value in a data warehouse fact table of course. IMO -- you're not going to start a campaign against the use of that, are "u" ;) -- a Char data type is a reasonable alternative for short (say less than 10 characters) key values.

Tom Kyte
April 02, 2002 - 10:20 am UTC

I disagree. I would NEVER use char(10) in any condition.

I do not believe it is reasonable in any circumstance, no.

Why is a char(80) or a char(30) or a char(20) "crazy". I'm not talking crazy -- I'm talking "i have an 80 character field that has low cardinality that people frequently query on".

It is NOT a key field, (key fields would not be candidates for bitmaps would they). It is an attribute. Sure, you could factor out the attribute to a fact table -- but you still need to index it, and teach the people how to join to it and so on.


Dave, April 02, 2002 - 2:38 pm UTC

Well there's two issues here. Would you use Char data types or VarChar2's in general, and is there any difference in performance of bitmap indexes on the two types. I take it for granted that the second point is not relevant if a bitmap index is inappropriate to the data.

The amount of space you waste with a Char varies, of course, from near to num_rows*column_length at one extreme to zero at the other, and in the example you gave it is obviously such a large wastage that I contend that it would indeed be crazy to pick Char(80) over VarChar2(80), where you would be wasting 50bytes * 1,000,000,000. Specifying Char(10) would be approaching crazy (or "bad practice" if that's too strong a word) if your average length was 5, and much less significant if your average length was 9. If it's a choice between a varchar2(2) and a Char(2) I'd care a lot less. The craziness here has nothing to do with the cardinality, only the space wastage.

So it's a matter of diminishing returns as the average length approaches the maximum, and I would consider whether -- for a column on which it is appropriate to use a bitmap index -- there is a point at which increased efficiency in the bitmap index structure overcomes the waste of disk space in the table. "Never" is too strong a word for my taste, and it seems trivial to suggest that a char(10) is indeed reasonable if your average length is 10.



Tom Kyte
April 02, 2002 - 3:14 pm UTC

The craziness is that people NEVER remember to pad out the things to be compared to, so they run:

select * from t where fixed_length_column = :x;

Now, you have a problem -- that finds no data. It is non-intuitive and makes the application dependent on the field length - that should be coded as:

fixed_length_column = rpad( :x, 10 );

but later, someone makes that field char(11) or char(15) and now the apps broken.

I would NEVER use a char, period. Never, the data errors that pop up due to improper use of that type on retrieval have wasted more of my cycles then I could ever pick up in savings on a bitmap index ;) (eg: I've wasted more time explaining why char's are bad, why they do what they do, why the query in question returns the wrong answer then I could ever recoup from using them...)

I rarely say never -- but this is one of the times when I do. Never.

Dave, April 03, 2002 - 8:33 am UTC

Well, time spent educating is never time wasted. That's what I tell functional staff anyway, and I appreciate you taking the time on this.

So sorry to keep on like this, but I must be doing something different here as I can't reproduce your case, which as I understand it is that Char bind variables must be RPad'ed to the correct length to match Char column types, but VarChar2 bv's do not in order to match VarChar2 column types.

Here's what I did ...

SQL> set sqlprompt '' feedback on sqlcontinue '' sqlnumber off
/* Create a test table with sample rows */
create table char_test (fixed_length_column char(4), variable_length_column varchar2(4));

Table created.

insert into char_test values ('','');

1 row created.

insert into char_test values ('AA','AA');

1 row created.

select * from char_test;

FIXE VARI
---- ----

AA   AA

2 rows selected.

/* See how SQL handles different cases */
select * from char_test where fixed_length_column = '';

no rows selected

select * from char_test where variable_length_column = '';

no rows selected

select * from char_test where fixed_length_column = 'AA';

FIXE VARI
---- ----
AA   AA

1 row selected.

select * from char_test where variable_length_column = 'AA';

FIXE VARI
---- ----
AA   AA

1 row selected.

/* See if it's different with defined data types */
variable v_fixed char(4)
variable v_variable varchar2(4)
variable v_fixed_row_count number
variable v_variable_row_count number

begin 
:v_fixed := '';
:v_variable := '';
execute immediate 'select count(0) from char_test where fixed_length_column = :s_string'
   into :v_fixed_row_count
   using :v_fixed;
execute immediate 'select count(0) from char_test where variable_length_column = :s_string'
   into :v_variable_row_count
   using :v_variable;
end;
/

PL/SQL procedure successfully completed.

print :v_fixed_row_count

V_FIXED_ROW_COUNT
-----------------
                0

print :v_variable_row_count

V_VARIABLE_ROW_COUNT
--------------------
                   0

begin 
:v_fixed := 'AA';
:v_variable := 'AA';
execute immediate 'select count(0) from char_test where fixed_length_column = :s_string'
   into :v_fixed_row_count
   using :v_fixed;
execute immediate 'select count(0) from char_test where variable_length_column = :s_string'
   into :v_variable_row_count
   using :v_variable;
end;
/

PL/SQL procedure successfully completed.


print :v_fixed_row_count

V_FIXED_ROW_COUNT
-----------------
                1

print :v_variable_row_count

V_VARIABLE_ROW_COUNT
--------------------
                   1


begin 
:v_fixed := 'AA ';
:v_variable := 'AA ';
execute immediate 'select count(0) from char_test where fixed_length_column = :s_string'
   into :v_fixed_row_count
   using :v_fixed;
execute immediate 'select count(0) from char_test where variable_length_column = :s_string'
   into :v_variable_row_count
   using :v_variable;
end;
/

PL/SQL procedure successfully completed.


print :v_fixed_row_count

V_FIXED_ROW_COUNT
-----------------
                1

print :v_variable_row_count

V_VARIABLE_ROW_COUNT
--------------------
                   0

So ...

Case1
Char(4)'' doesn't find Char(4)''
VarChar2(4)'' doesn't find VarChar(4)''

Case2
Char(4)'AA' finds Char(4)'AA'
VarChar2(4)'AA' finds VarChar2(4)'AA'

Case3
Char(4)'AA' doesn't find VarChar(4)'AA'
VarChar2(4)'AA' doesn't find Char(4)'AA'

Case4
Char(4)'AA ' finds Char(4)'AA'
VarChar2(4)'AA ' doesn't find VarChar2(4)'AA'

To summarize, it seems to me that it is the VarChar2 that is more difficult to handle (based on Case4). I discount Case 3 because in PL/SQL the variable should be defined as ...
   v_variable char_test.fixed_length_column%Type;
... anyway, thus getting the same type and length even if the column definition changes.

So, is there an init parameter thing going on here? Is it an ANSI compatibility thing? 

Tom Kyte
April 03, 2002 - 9:51 am UTC

ops$tkyte@ORA8I.WORLD> create table t ( x varchar2(5), y char(5) );

Table created.

ops$tkyte@ORA8I.WORLD> variable a varchar2(5)
ops$tkyte@ORA8I.WORLD> exec :a := 'a'

PL/SQL procedure successfully completed.

ops$tkyte@ORA8I.WORLD> 
ops$tkyte@ORA8I.WORLD> insert into t values ( 'a', 'a' );

1 row created.

ops$tkyte@ORA8I.WORLD> 
ops$tkyte@ORA8I.WORLD> select * from t where x = :a;

X     Y
----- -----
a     a

ops$tkyte@ORA8I.WORLD> select * from t where y = :a;

no rows selected

ops$tkyte@ORA8I.WORLD>

There is the simple example.  99.999% of all programs will bind using a VARCHAR -- not a char (odbc, jdbc, proc, oci, whatever -- especially C based things with null terminated strings).

 

Dave, April 03, 2002 - 10:06 am UTC

Ah hah, I see now. So if there's not a mechanism for declaring a variable to inherit the data type of the column, or there is some potential for this in the future, then VarChar2's are pretty much mandatory. It all makes sense to me now.

I've previously had problems with supporting OLTP apps that used VarChar2's, as bad coding by third-party COTS suppliers was allowing trailing spaces to be appended unnecessarily to character strings, something that's not an issue with Char's. It's been quite a few years since I worked on OLTP.

Tom Kyte
April 03, 2002 - 10:24 am UTC

In PLSQL there is a mechanism

table.column%type


In a 3gl such as C, Java, "vb" and so on -- this is impossible. If someone changed the database, you would have to RECOMPILE the apps with the changed definition.

A reader, May 31, 2018 - 9:52 am UTC

If there is an index on numeric column ,will use of quotes impact performance in indexed column having datatype as number ?
Connor McDonald
June 01, 2018 - 7:29 am UTC

You can see that we perform a conversion of the string to a number during optimization, so ultimately it appears to be the same query

SQL> create table t as select * from dba_objects;

Table created.

SQL>
SQL> set autotrace traceonly explain
SQL>
SQL> select * from t where object_id = 1234;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   132 |   447   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |   132 |   447   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=1234)

SQL>
SQL> select * from t where object_id = '1234';  <<=== quotes

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   132 |   447   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |   132 |   447   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=1234)   <<=== no quotes


But having said that - I don't like this kind of mixing of datatypes. What if one day we change our approach, so the optimizer would change it internally to:

where to_char(object_id) = '1234'

Then you might be impacted etc. Keep your datatypes clean.