Skip to Main Content

Breadcrumb

Easter

Question and Answer

Tom Kyte

Thanks for the question.

Asked: April 09, 2002 - 9:38 pm UTC

Answered by: Tom Kyte - Last updated: June 01, 2006 - 9:04 am UTC

Category: Database - Version: 901

Viewed 10K+ times! This question is

You Asked

Hi Tom

I am using Oracle 901 on solaris.
My question is as folllows:
Between the Number and Varchar datatype which one is better performance wise . What are the pros & cons of having numbers or char/varchar as primary/joining keys ? Is the conception of searches on number types faster than varchar TRUE?


Thanks





and we said...

You want my opinion --

you should choose a datatype based on the data that is to be contained therein, not based on anything else.

Are you going to put numbers in there? then by all means - -use a number.

Is it going to be a date? better use a DATE or TIMESTAMP.

Can it be an arbitrary string of letters, numbers, etc -- use a varchar2.


Numbers are stored as varying length strings in the database. You'll find in general it is pretty much the same using either/or. Pick the datatype SOLELY based on the data contained therein.

and you rated our response

  (2 ratings)

We're not taking reviews currently, so please try again later if you want to add a review.

Reviews

Followup

April 10, 2002 - 7:37 pm UTC

Reviewer: A reader

In one of your articles on keeping the varchar2 datatypes not very huge such as varchar2(4000) you stated.
Quoting your example :
"Ok, so the code prepares a query that select 10 columns that are varchar2.
The developer, for performance would like to array fetch (very important). They would like to array fetch say 100 rows (very very typical). So, we have 4000
times 10 times 100 = alsmost 4meg of RAM the developer must allocate!!"

Now if we have a number datatype " number " which has a default precision of 38. It would to the
maximum take 22 Bytes.
And if we have a column varchar2(38) it would 38KB.

If we do the above array fetch 22*10*100= 22KB
38000*10*100= 38MB of RAM

So this is a huge difference in the amount of memory being allocated to a number vs varchar2 datatype.

Please comment if Im right in stating the above.

Thanks

Tom Kyte

Followup  

April 10, 2002 - 8:48 pm UTC

I was talking PRO*C and other 3gls and query tools -- not plsql in that case.

with PLSQL, it will dynamically allocate just enough to hold what it needs. Still NO EXCUSE not to define the columns with meaningful, proper lengths.


I don't get your math though. a varchar2(38) would be 38 bytes -- not 38Kb



A number and varchar2 data type in the where condition

May 31, 2006 - 4:03 pm UTC

Reviewer: A reader

Tom,

We have a query that has the following where condition

select ...
from ...
a.col1 = b.col2
and ...

where a.col1 is a number and b.col2 is varchar2(20)

Assuming we are getting 5% of the data from the tables will indexes present on columns a.col1 and b.col2 be used?
Will function basaed indexes help in this situation - a function based index either on to_char(a.col1) or to_number(b.col2)

Appreciate your comments on this topic.

Thank you


Tom Kyte

Followup  

June 01, 2006 - 9:04 am UTC

impossible to comment.

need a better, more complete example.

for I can come with examples such that

a) yes, it would be useful
b) no, it would not be useful
c) it really would not matter one way or the other


there are only two answers to all questions!!!

answer 1: "why"
answer 2: "it depends"


ops$tkyte@ORA10GR2> create table a ( x number constraint x_unique unique, col1 number constraint col1_unique unique );
Table created.

ops$tkyte@ORA10GR2> create table b ( y number constraint y_unique unique, col2 varchar2(20) constraint col2_unique unique  );
Table created.


Suppose the query is:


ops$tkyte@ORA10GR2> set autotrace traceonly explain
ops$tkyte@ORA10GR2> select *
  2    from a, b
  3   where a.x = 10
  4     and b.y = 11
  5     and a.col1 = b.col2
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 2215097824

-------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |     1 |    51 |     2   (0)|
|   1 |  NESTED LOOPS                |          |     1 |    51 |     2   (0)|
|   2 |   TABLE ACCESS BY INDEX ROWID| B        |     1 |    25 |     1   (0)|
|*  3 |    INDEX UNIQUE SCAN         | Y_UNIQUE |     1 |       |     1   (0)|
|*  4 |   TABLE ACCESS BY INDEX ROWID| A        |     1 |    26 |     1   (0)|
|*  5 |    INDEX UNIQUE SCAN         | X_UNIQUE |     1 |       |     0   (0)|
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B"."Y"=11)
   4 - filter("A"."COL1"=TO_NUMBER("B"."COL2"))
   5 - access("A"."X"=10)


<b>Unlikely to consider an index on COL1/COL2 - when you have such nice alternatives on X and Y..

Or maybe you do this:</b>


ops$tkyte@ORA10GR2> select *
  2    from a, b
  3   where a.col1 = b.col2
  4     and b.y = 5
  5  /

Execution Plan
----------------------------------------------------------
Plan hash value: 2711371623

-------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |     1 |    51 |     3   (0
|   1 |  NESTED LOOPS                |             |     1 |    51 |     3   (0
|   2 |   TABLE ACCESS BY INDEX ROWID| B           |     1 |    25 |     2   (0
|*  3 |    INDEX UNIQUE SCAN         | Y_UNIQUE    |     1 |       |     1   (0
|   4 |   TABLE ACCESS BY INDEX ROWID| A           |     1 |    26 |     1   (0
|*  5 |    INDEX UNIQUE SCAN         | COL1_UNIQUE |     1 |       |     0   (0
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B"."Y"=5)
   5 - access("A"."COL1"=TO_NUMBER("B"."COL2"))

<b>and since Oracle converts implicitly the string to a number, the existing index on A.COL1 is naturally used...

However, if you:</b>

ops$tkyte@ORA10GR2> select *
  2    from a, b
  3   where a.col1 = b.col2
  4     and a.x = 5
  5  /

Execution Plan
----------------------------------------------------------
Plan hash value: 18212380

-------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |     1 |    51 | 30850   (1)|
|   1 |  NESTED LOOPS                |          |     1 |    51 | 30850   (1)|
|   2 |   TABLE ACCESS BY INDEX ROWID| A        |     1 |    26 |     2   (0)|
|*  3 |    INDEX UNIQUE SCAN         | X_UNIQUE |     1 |       |     1   (0)|
|*  4 |   TABLE ACCESS FULL          | B        |     1 |    25 | 30848   (1)|
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."X"=5)
   4 - filter("A"."COL1"=TO_NUMBER("B"."COL2"))

<b>then EITHER:</b>

ops$tkyte@ORA10GR2> select *
  2    from a, b
  3   where to_char(a.col1) = b.col2
  4     and a.x = 5
  5  /

Execution Plan
----------------------------------------------------------
Plan hash value: 950355497

-------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |     1 |    51 |     3   (0
|   1 |  NESTED LOOPS                |             |     1 |    51 |     3   (0
|   2 |   TABLE ACCESS BY INDEX ROWID| A           |     1 |    26 |     2   (0
|*  3 |    INDEX UNIQUE SCAN         | X_UNIQUE    |     1 |       |     1   (0
|   4 |   TABLE ACCESS BY INDEX ROWID| B           |     1 |    25 |     1   (0
|*  5 |    INDEX UNIQUE SCAN         | COL2_UNIQUE |     1 |       |     0   (0
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."X"=5)
   5 - access("B"."COL2"=TO_CHAR("A"."COL1"))

<b>an explicit conversion of the number to a string could be used - BUT - it might change the answer!!! since you can store a number in a string using many many many representations:

0001
01
1
1.00000
+1
1e-0

those are all the number "1" of course if you to_number them.  However a to_char of a number 1 will result in "one string" and that one string won't match them all

But of course, if you can create a function based index on B.COL2 like this:


create index I on b(to_number(col2))

that implies col2 is REALLY A NUMBER

and you have made one of the the worst mistakes you could ever make in your life - putting a number in a string.

And the solution to that of course is


FIX IT</b>