
Question and Answer

Tom Kyte

Thanks for the question, Charanjiv.

Asked: March 03, 2001 - 5:26 am UTC

Answered by: Tom Kyte - Last updated: August 16, 2011 - 5:30 pm UTC

Category: Database - Version: 8.1.5


You Asked

Hi,

I am a little confused here. As far as I understood, the VARCHAR2 datatype was supposed to use only as much memory as the data in it. But the following lines from the "PL/SQL User's Guide and Reference", Version 8.1.5 for NT, tell a different story.

" However, for a VARCHAR2(< 2000) variable, PL/SQL preallocates enough memory to hold a maximum-size value. So, for example, if you assign the same 500-byte value to a VARCHAR2(2000) variable and to a VARCHAR2(1999) variable, the latter uses 1499 bytes more memory. "

Does this mean there is no difference between a CHAR(1999) and a VARCHAR2(1999)?

Charanjiv



and we said...

A varchar2 datatype, when stored in a database table, uses only the space allocated to it. If you have a varchar2(1999) and put 50 bytes in the table, we will use 52 bytes (leading length byte).

A char datatype, when stored in a database table, always uses the maximum length and is blank padded. If you have a char(1999) and put 50 bytes into it, it will consume 2001 bytes (a leading length field is present on chars as well).

In the database -- a CHAR is a VARCHAR that is blank padded to its maximum length.
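You can see the storage difference for yourself with vsize() (the table and column names below are made up for illustration):

```sql
create table storage_demo ( c char(1999), v varchar2(1999) );

insert into storage_demo values ( 'hello', 'hello' );

-- vsize() reports the bytes of storage each value consumes in the row
select vsize(c) char_bytes, vsize(v) varchar2_bytes
  from storage_demo;

-- expect char_bytes = 1999 (blank padded) and varchar2_bytes = 5
```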


In PL/SQL, what they are saying is that, for optimization purposes, a varchar2 of less than 2000 bytes will allocate a fixed amount of memory: it preallocates the maximum size upon instantiation. This cuts down on reallocations as different-sized strings are assigned to it. On the other hand, a varchar2 larger than 2000 bytes will truly be varying sized; as you assign different-sized strings to it, it will grow dynamically as needed.

One takes a little more RAM (if you never use the entire string) and performs a little better; the other uses only the RAM it needs, but if you cause the string to grow frequently it will not perform as well, due to the reallocations needed to resize the buffer.


There are great differences between a char(1999) and a varchar2(1999): the CHAR is always 1999 characters long, physically AND logically. The varchar2 is physically 1999 bytes (preallocated, in PL/SQL) but logically can be any size between 0 and 1999. The char is always blank padded; not so the varchar2.

You can measure this for yourself to see this effect:

tkyte@TKYTE816> declare
2 x varchar2(1999);
3 l_start number default dbms_utility.get_time;
4 begin
5 for i in 1 .. 100
6 loop
7 for j in 1..1999
8 loop
9 x := rpad('x',j,'x');
10 end loop;
11 end loop;
12 dbms_output.put_line
13 ( 'Elapsed: ' ||
round( (dbms_utility.get_time-l_start)/100, 2 ) );
14 end;
15 /
Elapsed: .99

PL/SQL procedure successfully completed.

tkyte@TKYTE816> declare
2 x varchar2(2001);
3 l_start number default dbms_utility.get_time;
4 begin
5 for i in 1 .. 100
6 loop
7 for j in 1..1999
8 loop
9 x := rpad('x',j,'x');
10 end loop;
11 end loop;
12 dbms_output.put_line
13 ( 'Elapsed: ' ||
round( (dbms_utility.get_time-l_start)/100, 2 ) );
14 end;
15 /
Elapsed: 1.21

PL/SQL procedure successfully completed.


The use of the varchar2(1999) was slightly more efficient in this worst-case, contrived example, designed to demonstrate a point. In general, you will never notice this difference, as you probably don't make assignments of every single size to a string 199,900 times in real life.




Reviews

Char VS Varchar

April 03, 2002 - 4:43 pm UTC

Reviewer: kiran from cincinnati, OH USA

In all your replies to CHAR and VARCHAR2 questions you mentioned that VARCHAR2 supports all the CHAR properties.
I noticed a weird problem with VARCHAR2.
I have a table test1(tname char(20), my_name varchar2(20));
I inserted two records into this table:
('kiran','kiran')
('kiran ','kiran ') --- notice there is a space after kiran.
Now I am selecting from the table using the SQL below:
select * from test1 where tname = 'kiran';
---this returned the above two records
but when I select from the same table using the SQL below:
select * from test1 where my_name = 'kiran'
---it returned only one record (the first one). I tested this on 7.3 and also on 8i. Could you explain why this is happening?
Another interesting thing: select * from test1 where tname like 'kiran' is not returning any records. When the "=" is working fine, why is the like not working?

Tom Kyte

Followup  

April 03, 2002 - 6:26 pm UTC

varchar2's can store trailing blanks -- IF You put them there.

the answer is simple.

'a ' <> 'a'

tried this in 8i (r2 and r3) 

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( x char(10), y varchar2(10) );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( 'x', 'x ' );

1 row created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where x like 'x';

X          Y
---------- ----------
x          x

ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where y like 'x';

no rows selected

works as expected for me. 

Is there a Leading Length field in RAW Datatype ?

April 04, 2002 - 2:32 am UTC

Reviewer: Pascal from Hamburg , Germany

Hi Tom,

Thanks for a Great answer!

If i could just post a followup on that expecting a short YES/NO answer-

Is there a Leading Length field in RAW Datatype ?

and

Is it true that when I use the RAW type in my tables as an automatically generated primary key -- by selecting sys_guid() from dual, from within a trigger -- then I would not be able to create materialized views with columns of RAW type?


Thanks in advance



Tom Kyte

Followup  

April 04, 2002 - 11:47 am UTC

you can either

select length( raw_field )/2 from t

or 

select utl_raw.length(raw_field) from t

to get the length.  



ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( pk raw(16) primary key, data int );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create materialized view t_mv
  2  as
  3  select * from t;

Materialized view created.

seems to work. 

Are PL/SQL LONG variables similar?

April 04, 2002 - 4:14 pm UTC

Reviewer: Brent from Hartford, CT, USA

I was searching for documentation on the memory behavior with respect to LONG variables, and found the same snippet at the top of this question for VARCHARs, but nothing about LONGs.

Do the memory allocation rules operate the same way for LONG variables, with the 2000-byte threshold? Or if it is different for LONGs, what are the differences?

Thanks.

(I know Tom knows what I mean but just to clarify for the readers -- I am referring to the PL/SQL LONG datatype which has a 32K maximum size, and not the database LONG column type which has a 2GB limit).



Tom Kyte

Followup  

April 04, 2002 - 4:21 pm UTC

From standard.sql:

...
type VARCHAR2 is NEW CHAR_BASE;
pragma PACK(VARCHAR2);

subtype VARCHAR is VARCHAR2;
subtype STRING is VARCHAR2;

subtype LONG is VARCHAR2(32760);

subtype RAW is VARCHAR2;
subtype "LONG RAW" is RAW(32760);

subtype ROWID is VARCHAR2(256);
....

a plsql long is just a varchar2 in disguise. So, yes, they are just a varchar2 that is bigger than 2000.

Can preallocate varchar2(>2000)?

April 04, 2002 - 5:55 pm UTC

Reviewer: benedict from australia

Thanks for the informative answer Tom.

Supposing I was building up a large varchar2 variable (declared >2000) characters by appending text to itself... e.g.
declare
    big varchar2(30000);
begin
    for i in 1 .. 100 loop
        big := big || ' something';
    end loop;
end;

I understand from your answer that since it's been declared larger than 2000, pl/sql will reallocate memory each time it gets longer. My question is: is it possible to force pl/sql to allocate tons of room to begin with? And does it make a big enough difference for me to care?

Will pl/sql reallocate if the varchar2 gets smaller?
e.g. will

...
begin
big := rpad(null, 30000);
big := null;
...

do the trick?

Cheers

Tom Kyte

Followup  

April 04, 2002 - 7:57 pm UTC

Based on this test, I would say 'yes, it'll have an effect'.

But -- only if you do this really degenerate case like I did.  I mean -- how many times do you make that string grow a byte at a time??

Also -- setting to NULL seems to be "free" -- it shrinks it right back.

I ran this a couple of times:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_no_init
  2  as
  3      x varchar2(30000);
  4  begin
  5      for j in 1..30000
  6      loop
  7        x := rpad('x',j,'x');
  8      end loop;
  9  end ;
 10  /

Procedure created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_init
  2  as
  3      x varchar2(30000) := rpad( 'x', 30000, 'x' );
  4  begin
  5      for j in 1..30000
  6      loop
  7        x := rpad('x',j,'x');
  8      end loop;
  9  end ;
 10  /

Procedure created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_init_with_null
  2  as
  3      x varchar2(30000) := rpad( 'x', 30000, 'x' );
  4  begin
  5          x := null;
  6      for j in 1..30000
  7      loop
  8        x := rpad('x',j,'x');
  9      end loop;
 10  end ;
 11  /

Procedure created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> declare
  2      l_start number default dbms_utility.get_time;
  3  begin
  4      for i in 1 .. 5 loop p_big_no_init; end loop;
  5      dbms_output.put_line
  6      ( 'Elapsed: ' ||
  7          round( (dbms_utility.get_time-l_start)/100, 2 ) );
  8  
  9      l_start := dbms_utility.get_time;
 10      for i in 1 .. 5 loop p_big_init; end loop;
 11      dbms_output.put_line
 12      ( 'Elapsed: ' ||
 13          round( (dbms_utility.get_time-l_start)/100, 2 ) );
 14  
 15      l_start := dbms_utility.get_time;
 16      for i in 1 .. 5 loop p_big_init_with_null; end loop;
 17      dbms_output.put_line
 18      ( 'Elapsed: ' ||
 19          round( (dbms_utility.get_time-l_start)/100, 2 ) );
 20  end;
 21  /
Elapsed: 22.85
Elapsed: 22.06
Elapsed: 24.69

PL/SQL procedure successfully completed.
 

What about when constants are declared in PL/SQL?

April 10, 2002 - 11:10 am UTC

Reviewer: jon from Hartford, CT USA

If you declare a constant in PL/SQL, would it be better to size it as VARCHAR2(2001), so that you would get the actual size, instead of 2000 bytes allocated? Or is it smart enough to recognize that constants can't be changed and allocate the correct amount of memory?

In other words, is
usps_code_connecticut constant varchar2(2001) := 'CT';
any different than
usps_code_connecticut constant varchar2(2) := 'CT';
?

Tom Kyte

Followup  

April 10, 2002 - 11:26 am UTC

regardless of whether it was constant or not -- you should use varchar2(2).

The varchar2(2001) will allocate storage for 2 characters when it is initialized.

The varchar2(2) will allocate storage for 2 characters when it is initialized.

Now, if you had coded:


usps_code_connecticut constant varchar2(1000) := 'CT';

then 2001 might be better. But the real fix would be to declare it properly as a varchar2(2).

Constants.

January 21, 2003 - 3:54 pm UTC

Reviewer: Kashif from Houston, TX

Hi Tom,

Is using the CONSTANT keyword a performance benefit? Any examples would be helpful. Also, can you explain whether a declaration such as:

l_filename constant varchar2 (7) default to_char (l_start_date, 'MONYYYY');

is advisable/logical. The constant and default keywords are used in the same declaration. Thanks in advance.

Kashif

Tom Kyte

Followup  

January 21, 2003 - 5:34 pm UTC

it is a compiler directive saying "don't let them modify it".

I've never seen any sort of performance benefit. It is still a plsql variable.

Assigning CHAR type to VARCHAR2

March 04, 2003 - 12:05 pm UTC

Reviewer: CSR from India

Hi Tom,

Is there any problem if we assign a CHAR type variable directly to a VARCHAR2 type variable?
I have a table and SP like the ones below.

chandu@chandra> desc t_clips_audit_info
Name                        Null?    Type
--------------------------- -------- ---------------
AUDIT_SEQUENCE              NOT NULL NUMBER(10)
CALL_TYPE                            CHAR(3)
CALL_RETURN_ERR                      CHAR(3)
ERROR_DATE_TIME                      DATE
REQUESTOR_ID                         VARCHAR2(20)
CALL_STRING                          VARCHAR2(250)
CALL_RETURN                          VARCHAR2(2000)


PROCEDURE sp(
ic_or_number IN VARCHAR2,
in_on_number IN NUMBER,
in_amt IN NUMBER,
oc_is_valid OUT CHAR,
oc_is_or_valid OUT CHAR)
As

vc_call_string t_clips_audit_info.call_string%TYPE := NULL;
vc_call_return t_clips_audit_info.call_return%TYPE := NULL;

begin

--------- many lines. here -------

-- #.1
vc_call_return := vc_call_return || ' ' ||'oc_is_obgr_obgt_valid: '||
oc_is_obgr_obgt_valid;

--
end ;
/

When I execute the above SP from SQL*Plus (exec sp), it gives me proper results.
But the Java developers are getting the error shown below.

The error is: ORA-06502: PL/SQL: numeric or value error Code: -6502

After debugging, I found the exception occurred at #.1.

If I use SUBSTR(oc_is_obgr_obgt_valid, 0, 1) in the code, I don't get any errors.

Could you please advise?


Tom Kyte

Followup  

March 04, 2003 - 6:32 pm UTC

The Java developers are binding a string that is too small to hold the result. CHARs are blank padded to the maximum length, and the Java guys' bind variable isn't big enough, I guess.

one would actually need a small, yet complete testcase to say exactly.


almost forgot to say - i hate char, don't use them, just use varchar2.

CHAR(1) vs VARCHAR2(1)

March 04, 2003 - 8:30 pm UTC

Reviewer: Trevor

Tom...
I was told once that if you only need a single char, use CHAR(1), since using VARCHAR2(1) has overhead due to Oracle having to maintain how long the value in a VARCHAR2 field is.

Is this a load of rubbish, or is there some truth to it?

Tom Kyte

Followup  

March 05, 2003 - 7:21 am UTC

that is rubbish.

A char(n) is a varchar2(n) that is blank padded to its maximum length. The implementation of a char physically on disk is identical to a varchar2 -- there is a length byte for a char(1) just as there is for a varchar2(1).
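A quick way to verify this yourself (the table name here is made up for illustration):

```sql
create table len_demo ( c char(1), v varchar2(1) );

insert into len_demo values ( 'x', 'x' );

-- dump() shows the internal datatype code and stored length;
-- both columns report Len=1 -- only the datatype code (Typ) differs
select dump(c) c_dump, dump(v) v_dump from len_demo;
```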




Binned

March 05, 2003 - 6:12 pm UTC

Reviewer: Trevor

Thanks Tom


VARCHAR2(10) vs. VARCHAR2(2000) for a 10 char string

June 03, 2003 - 10:28 am UTC

Reviewer: Tajana from Austria, Vienna

Tom,

clear about the difference between VARCHAR2 and CHAR.

But what about defining VARCHAR2(10) and, let's say VARCHAR2(2000) for a column that is expected not to be "wider" than 10 characters ?

One advantage of defining such a column actually as what it is (VARCHAR2(10)) for me is, that I can easily list a table containing many of such columns in SQL*Plus (Worksheet) without having to scroll right, till my fingers get wounded.

Disadvantage: If my text gets wider, I'll have to redefine the column; means: I have to maintain that.

Defining such a column as - let's say: 2000, would mean: No or less maintenance.

But much more interesting is:
Are there any
space
index
optimizer issues , or
any other important issues with this ?
Any other arguments for defining a 10 char column as VARCHAR2(10) and not VARCHAR2(2000) ?
Or: Any arguments why it is better to generally define VARCHAR2 columns to its maximum ?

Thanks in advance,

Tajana

Tom Kyte

Followup  

June 03, 2003 - 11:16 am UTC

well, the real issue is -- what is the business requirement. What the heck is in that field and what is its length.

If you say "use a varchar2(2000)" then you know what will happen? someone will put 2000 characters in there someday. Is that OK?

varchar2(10) says "this field accepts 10 bytes of data"

Since you can make a field larger easily with ALTER, maintenance isn't an excuse.
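For example, widening is a one-line, dictionary-only change (reusing the test1 table from the review above; the new width of 40 is just for illustration):

```sql
-- existing rows are untouched; only the column definition changes
alter table test1 modify ( my_name varchar2(40) );
```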


You define a field to be correct and proper. Period. This field is a varchar2(10).


think about the poor client tools that say "oh look -- a varchar2(2000) and an array size of 100. We need to allocate 200,000 bytes of storage (instead of 1,000)."



VARCHAR2(10) vs. VARCHAR2(2000) for a 10 char string

June 04, 2003 - 7:09 am UTC

Reviewer: Tajana from Vienna, Austria

Tom,

In German you would say: you are running against open doors.
Meaning: we are of the same opinion.

My intention was to prevent my development colleagues from defining hundreds of VARCHAR2(255) columns (which they did before I "jumped" into the project, and are still trying to do).
So I needed arguments.
You gave me one -- the one about client tools, array size and allocation.

But before I discuss that with colleagues, I would like to understand it in a little more detail:
Do you mean memory allocation by the client tool at the client ?
Can you explain that matter a little bit further, please ?

Any other considerations (optimizer, indexes ?)

Thanks a lot,

Tajana

Tom Kyte

Followup  

June 04, 2003 - 8:35 am UTC

there is only one reason necessary:

o it is the only sensible and professional thing to do.


but the technical reasons are:

o it is called data integrity. If you make it 255, you are saying that field legitimately contains up to 255 bytes of data. If you make it 10, you are saying otherwise. If you make it 255, someone WILL -- not might -- WILL put 255 bytes in there some day. Then what?

o say you have 20 varchar2(255) columns instead of 20 right-sized ones. Pull that up in an ad hoc tool. Have you done the end user a serious disservice here? They'll spend the next 5 minutes formatting data -- data that should have been, could have been, formatted already.

o say you move onto the next project -- leaving all of these 255 byte columns. Who is going to tell the next guys "c1 is a 10 character field, c2 is 12, c3 is 44" and so on.

o when the client application, which wants to run fast, decides "I'll array fetch 100 rows at a time", it'll have to allocate 100 * 255 * 20 bytes of data for the buffering of such data (remember, no one KNOWS c1 is just 10 characters, it is 255 characters). That is as opposed to allocating just what you really need.


I've got an idea for these guys -- don't use numbers or dates either. What if later on you want to store "hello world" in that date column. (only kidding)



June 04, 2003 - 10:22 am UTC

Reviewer: Dave from Colorado

In the same kidding spirit as "don't use numbers or dates either", you could use VarChar2(2000) for all your columns, and then put a check constraint on the column "length(my_column) < 21". Ta Da! Same as VarChar2(20)!!!

I like that one..

June 04, 2003 - 11:30 am UTC

Reviewer: Jose Cleto from Santo Domingo, Dominican Republic

It reads like "American funnies Scripts"

Funnies scripts

June 05, 2003 - 8:32 am UTC

Reviewer: Tajana from Vienna, Austria

Thanks, Tom.
============
This was what I needed.

Jose,
It reads like:
What funny ideas developers sometimes have, and what a DBA / data designer has not only to clean up after them, but also to convince them of in the first place ...
(Tenor: "Don't change this, my procedure won't work at all anymore!!! And, by the way, it is already implemented at the customer site!")

In this case: European funnies scripts, but I am sure, you have your own ones too ;-)

Dave,
I will evaluate your suggestion and present it to my dev colleagues ;-)
We will see what they are going to say...

Cheers, Tajana

Sorry if I miss the point...

July 30, 2003 - 10:45 am UTC

Reviewer: Jose Cleto from Dominican Republic

We have all fallen into this situation:

- "Let's do poor requirements and analysis and then go with the code"
- "I know that is better, but this is faster"
- "We don't have time for that now"

And the list goes on...

The truth, the whole truth and nothing but the truth is that developers, DBAs, and the like are all paid to do our jobs as well as we can.

Why pay like a pro if someone will work like a novice? We must work all the time to get to the next level. Computing is a "science", like math or chemistry; we have disciplines -- CMM, OOP, database design, structured programming, UML, etc. -- and all of these exist to guide us in doing our work, but only we ourselves can do the work.

Declaring a VARCHAR2(2000) to hold just 20 is silly. So, Tajana, your job, "if you decide to accept it", is to convince your developers to stay as near as possible to reality and make use of all the things they are supposed to know.

BTW: In my case it is "Dominican Funniest Scripts"

Char is returning values that are not padded?

September 11, 2003 - 10:31 am UTC

Reviewer: Vivek

Tom, I created a table with a column of char(10). When I select a row with a where clause on "col" not padded with spaces, it still returns the row. I would think the row should not be found if I do not pad. Am I wrong?

SQL> create table x(col char(10));

Table created.

SQL> insert into x values('0');

1 row created.

SQL> insert into x values('1     ');

1 row created.

SQL> insert into x values('A');

1 row created.

SQL> select length(col),'x'||col||'x' from x;

LENGTH(COL) 'X'||COL||'X
----------- ------------
         10 x0         x
         10 x1         x
         10 xA         x

SQL>
SQL> DECLARE
  2     x   CHAR (14);
  3  BEGIN
  4     DBMS_OUTPUT.put_line ('0');
  5
  6     SELECT col
  7       INTO x
  8       FROM x
  9      WHERE col = '0';
 10
 11     DBMS_OUTPUT.put_line ('1');
 12
 13     SELECT col
 14       INTO x
 15       FROM x
 16      WHERE col = '1';
 17
 18     DBMS_OUTPUT.put_line ('A');
 19
 20     SELECT col
 21       INTO x
 22       FROM x
 23      WHERE col = 'A';
 24      DBMS_OUTPUT.put_line ('NO Errors');
 25  EXCEPTION
 26     WHEN NO_DATA_FOUND THEN
 27        DBMS_OUTPUT.put_line ('no_data_found');
 28        RAISE;
 29     WHEN OTHERS THEN
 30        DBMS_OUTPUT.put_line ('others');
 31        RAISE;
 32  END;
 33  /
0
1
A
NO Errors

PL/SQL procedure successfully completed.

SQL> spool off


thanks 

Tom Kyte

Followup  

September 11, 2003 - 6:10 pm UTC

character string literals like that are coerced to the proper datatype automagically

'A' is really 'A '

it is the ansi defined behaviour.
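The blank-padded comparison rules can be seen directly (the table name here is made up; the behaviour is the ANSI-defined semantics Tom describes):

```sql
create table pad_demo ( col char(10) );

insert into pad_demo values ( 'A' );   -- stored blank padded: 'A' plus 9 spaces

-- CHAR column vs. text literal: blank-padded semantics, the shorter
-- operand is conceptually padded before comparing, so the row is found
select * from pad_demo where col = 'A';

-- compare against a VARCHAR2 expression instead and non-padded
-- semantics apply: 'A' plus 9 spaces <> 'A', so no rows come back
select * from pad_demo where col = cast( 'A' as varchar2(10) );
```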

number vs date in datawarehouse

November 03, 2003 - 4:07 am UTC

Reviewer: A reader

Hi

I have seen many Datamarts/Datawarehouses using number for date columns. For example instead of having

2003-AUG-01 23:00:00

they have

20030801230000 in number

Is this good practice? I find it such a hassle, because when I have to calculate aggregates with time functions I always have to convert that to a date first!

I have never seen this in OLTP but many times in DWH. Is there some directive to do so, or what?

Tom Kyte

Followup  

November 03, 2003 - 7:26 am UTC



In fact, I write about this very topic in my book "Effective Oracle By Design" where I state very loudly

USE NUMBERS FOR NUMBERS

USE DATES FOR DATES

don't use anything else, use the correct proper datatypes.


Not only do you spend half your time converting that number to a date -- but I'll assure you that one day you'll get a date in there that IS NOT A VALID DATE, and then what?

Not only that but tell me, how many values are there between

20031231115959

and

20040101000000



well, the optimizer thinks there could be 8,869,884,041 values -- but we all know there are no values between those two dates!!! You can totally mess up the optimizer with things like BETWEEN, "<" and ">" by doing this silly thing of putting dates into numbers or strings.....

Can you tell? I'm dead set against this "common practice".



November 03, 2003 - 11:51 am UTC

Reviewer: Dave from Colorado

"Date folding" is a related good reason, as if another one were needed, for using date types for dates, especially in a DW environment ...

http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96520/qr.htm#50228

November 03, 2003 - 12:09 pm UTC

Reviewer: Dave from Colorado

By the way, there seems to be some very poor advice in that documentation, in the "Note:" section.

example from your new book

November 14, 2003 - 4:14 am UTC

Reviewer: A reader

Hi

you have an example showing using proper datatypes in chapter about effective schema design

I did a small test using my own script and saw something not predicted, opposite from your example


create table test_date
(
varchar_fecha varchar2(9),
number_fecha number,
date_fecha date
)
pctfree 70
pctused 10;

insert /*+ APPEND */ into test_date
select to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ),
to_number( to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ) ),
to_date( '19980101', 'YYYYMMDD' ) + rownum
from dba_source
where rownum <= trunc(sysdate) + 36500 - to_date( '19940101', 'YYYYMMDD' );

create index test_date_idx1 on test_date(varchar_fecha);

create index test_date_idx2 on test_date(number_fecha);

create index test_date_idx3 on test_date(date_fecha);

analyze table test_date compute statistics for all indexes for table for all columns;

set autotrace trace exp stat

select * from test_date
2 where number_fecha between 20011231 and 20050101;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1318 Bytes=26360)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=6 Card=1318 Bytes=26360)
2 1 INDEX (RANGE SCAN) OF 'TEST_DATE_IDX2' (NON-UNIQUE) (Cost=5 Card=1318)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
166 consistent gets
0 physical reads
0 redo size
50919 bytes sent via SQL*Net to client
8528 bytes received via SQL*Net from client
75 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1098 rows processed



select * from test_date
2 where date_fecha between to_date('20011231', 'YYYYMMDD') and to_date('20050101', 'YYYYMMDD');

1098 filas seleccionadas.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=46 Card=1097 Bytes=21940)
1 0 TABLE ACCESS (FULL) OF 'TEST_DATE' (Cost=46 Card=1097 Bytes=21940)

Statistics
----------------------------------------------------------
0 recursive calls
15 db block gets
545 consistent gets
7 physical reads
0 redo size
50919 bytes sent via SQL*Net to client
8528 bytes received via SQL*Net from client
75 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1098 rows processed


as you can see, the proper datatype (in this case date) is performing worse than the number datatype... interesting, no?





Tom Kyte

Followup  

November 14, 2003 - 8:26 am UTC

My point was the information the optimizer sees

look Card=1097 when using dates -- and reality says 1098 rows.
     Card=1318 when using numbers -- and reality says 1098 rows.


using the number gave the optimizer bad information here.  That said (i was demonstrating something wholly different there - that the optimizer is getting bad bad data) -- I do not observe the same thing as you:


ops$tkyte@ORA920PC> create table test_date
  2  (
  3   varchar_fecha varchar2(9),
  4   number_fecha number,
  5   date_fecha date
  6  )
  7  pctfree 70
  8  pctused 10;
 
Table created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> insert /*+ APPEND */ into test_date
  2  select to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ),
  3         to_number( to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ) ),
  4         to_date( '19980101', 'YYYYMMDD' ) + rownum
  5    from dba_source
  6   where rownum <= trunc(sysdate) + 36500 - to_date( '19940101', 'YYYYMMDD' );
 
40104 rows created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx1 on test_date(varchar_fecha);
 
Index created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx2 on test_date(number_fecha);
 
Index created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx3 on test_date(date_fecha);
 
Index created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> analyze table test_date compute statistics for all indexes for table for all
  2  columns;
 
Table analyzed.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> set autotrace traceonly
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> select * from test_date where number_fecha between 20011231 and 20050101;
 
1098 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=1318 Bytes=26360)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=21 Card=1318 Bytes=26360)
   2    1     INDEX (RANGE SCAN) OF 'TEST_DATE_IDX2' (NON-UNIQUE) (Cost=5 Card=1318)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        163  consistent gets
         14  physical reads
          0  redo size
      36016  bytes sent via SQL*Net to client
       1302  bytes received via SQL*Net from client
         75  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1098  rows processed
 
ops$tkyte@ORA920PC> select * from test_date where date_fecha between to_date('20011231', 'YYYYMMDD') and to_date('20050101', 'YYYYMMDD');
 
1098 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=1096 Bytes=21920)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=17 Card=1096 Bytes=21920)
   2    1     INDEX (RANGE SCAN) OF 'TEST_DATE_IDX3' (NON-UNIQUE) (Cost=4 Card=1096)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        163  consistent gets
          0  physical reads
          0  redo size
      36016  bytes sent via SQL*Net to client
       1302  bytes received via SQL*Net from client
         75  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1098  rows processed
 
ops$tkyte@ORA920PC> set autotrace off
ops$tkyte@ORA920PC>


It seems unusual to me that the query that selects fewer rows (and is estimated to select fewer rows) would use a FULL SCAN (given that both indexes by definition would have the same clustering factor). You must have lots of non-standard optimizer init.ora settings going on there.

so, use the 10053 trace event (in chapter 6) and show us your optimizer settings.  also, a version is ALWAYS relevant. 

For Dave from Colorado...

November 14, 2003 - 8:59 am UTC

Reviewer: Jan M. van Mourik from Houston, TX

You thought you were kidding, Dave?
How about this:

CREATE TABLE BFX_EMPGROUP (
EMPLOYEE_GROUPID VARCHAR2(150) NOT NULL,
NAME VARCHAR2(150) NOT NULL,
START_DATE DATE,
END_DATE DATE);

ALTER TABLE BFX_EMPGROUP ADD (
CHECK (LENGTH(EMPLOYEE_GROUPID)<=50),
CHECK (LENGTH(NAME)<=50));

From PeopleSoft HR...

November 25, 2003 - 10:05 am UTC

Reviewer: A reader


is there any reason to use CHAR

December 01, 2003 - 11:18 am UTC

Reviewer: A reader

Tom, is there any situation in which it is better to use a CHAR datatype instead of VARCHAR2? I am trying to convince developers to stop using it but they seem to think that if the number of characters in a string column will always be the same, like in state abbreviations, then it is better to use CHAR. Is this a valid argument?

Tom Kyte

Followup  

December 01, 2003 - 5:52 pm UTC

nope.

ask them for the proof or at least the reason why they think "better"

please disregard previous post

December 01, 2003 - 4:30 pm UTC

Reviewer: A reader

I found the answer in the following posting-

http://asktom.oracle.com/pls/ask/f?p=4950:8:856236758633628034:::::

correction to link above

December 01, 2003 - 4:32 pm UTC

Reviewer: A reader

sorry for this, the link above points back to this page. The link where I found the answer was here:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:123212348063, <code>

GR8

December 26, 2003 - 4:48 am UTC

Reviewer: Bipin Ganar from INDIA

Hi Tom,
Thanx for the explanation.
I understand the difference between CHAR and VARCHAR2, but I am still not clear about CHAR vs. VARCHAR.
So, can you highlight the difference between CHAR and VARCHAR?


Tom Kyte

Followup  

December 26, 2003 - 10:07 am UTC

there are none. varchar2 and varchar are the same right now.

Converting many CHAR columns to VARCHAR columns

February 17, 2004 - 10:52 am UTC

Reviewer: Sha from USA

Hi Tom,
I need to convert many char columns to varchar columns.

We have about 150 tables, each with approximately 2 columns of type CHAR and 500,000 rows. I need to convert all columns of datatype CHAR to VARCHAR2.

Please suggest the best way to do this.

Regards,
Sha

Tom Kyte

Followup  

February 17, 2004 - 10:55 am UTC

you'll have to

a) alter the tables to change the types
b) update each of the tables setting column = rtrim(column) to remove the trailing blanks.

or, rebuild them all.

Table update slow

February 18, 2004 - 11:47 pm UTC

Reviewer: Sha from USA

Hi Tom,
In our schema we have constraints, indexes and master-detail relationships. I need to convert the CHAR columns to the VARCHAR2 datatype and later RTRIM the new VARCHAR2 columns.

e.g.
Let's say TMASTER has 3 columns A, B and X, where X is of datatype CHAR(10), and TMASTER has multiple indexes on its columns, including an index on X.

TMaster( A,B,X char(10) )

TDETAIL has 3 columns P, Q and Y, where Y is of datatype CHAR(10), and TDETAIL has multiple indexes on its columns, including an index on Y.

X is primary key in TMASTER and Y is foreign key in TDETAIL.

TDETAIL (P,Q, Y char(10))

TMASTER has 50,000 records while TDETAIL has 1 Million records.

I did the following:
1) Dropped the constraints on both tables.
2) Removed the indexes on the X and Y columns only; all other indexes for TMASTER and TDETAIL are in place.
3) Altered table TMASTER and then TDETAIL.
4) Ran Update X = RTRIM(X) on TMASTER.
5) Ran Update Y = RTRIM(Y) on TDETAIL.

Updating column Y, which has 1 million rows, took 8 hours.

1) Is this an efficient way to do this job? I have around 150 tables with a similar change.
2) How do I reduce the time to update the column to a couple of minutes rather than 8 hours?
3) I would guess Oracle should take only a couple of minutes to RTRIM 1 million rows. Is something holding my table update back, and how should I figure that out?
4) Did I miss anything, or should I do things in a particular order? How do I know what is making my table update slow?

I am new to oracle.

Thanks in advance,
Sha

Tom Kyte

Followup  

February 19, 2004 - 10:27 am UTC

you did something wrong -- not sure what (i would guess you didn't really get the indexes off).

to update 1,000,000 rows (unindexed) is a couple of minutes.


big_table@ORA920PC> create table t
  2  as
  3  select cast( object_name as char(30) ) x, big_table.*
  4  from big_table
  5  where rownum <= 1000000
  6  /

Table created.

big_table@ORA920PC> set timing on
big_table@ORA920PC> alter table t modify x varchar2(30);

Table altered.

Elapsed: 00:00:00.32
big_table@ORA920PC> select min(length(x)) from t;

MIN(LENGTH(X))
--------------
30

Elapsed: 00:00:02.00
big_table@ORA920PC> update t set x = rtrim(x);

1000000 rows updated.

Elapsed: 00:01:26.51
big_table@ORA920PC> select min(length(x)) from t;

MIN(LENGTH(X))
--------------
1

Elapsed: 00:00:03.64
big_table@ORA920PC>



So, I don't know what you did wrong (were the logs big enough? any "checkpoint not complete" messages? are you sure the indexes were really "gone"?).

Why char as parameter not restricting?

March 19, 2004 - 4:08 pm UTC

Reviewer: dharma from CA USA

I have been debugging some procedures from an Oracle 7.3 system, now on 9i Release 2, where I found one of your BIG NO-NOs: it had a "when others then null", and the procedure has been there for 4 years.
What I found to be causing the error looks strange.

A CHAR declared without a length defaults to CHAR(1).

But if I have a parameter in the procedure declared as CHAR, how is it able to receive values larger than 1 byte?

DECLARE
PROCEDURE p(p_a IN CHAR) IS
b CHAR;
BEGIN
Dbms_output.put_line('Before assign'||p_a);
b:=p_a; --raises error if assigned;
Dbms_output.put_line('After assign'||b);
END p;
BEGIN
p('12');
END;

If the line b := p_a is commented out, it works fine. Shouldn't it throw an error for assigning parameter p_a a value of length greater than 1?
Am I missing something?

Thanks,
-dharma

Tom Kyte

Followup  

March 20, 2004 - 9:46 am UTC

the parameter's size is defined at runtime, as specified by the actual INPUT parameter.

if you called

declare
x char(10);
y char(20);
begin
p( x ) ;
p( y ) ;
end;

the size of "p_a" would be 10 and then 20. sooo, look back in the code to the invokers of this routine

(and stop using the dreaded CHAR type !!! hate it)
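Tom's point can be seen directly with dbms_output. This is a small sketch (variable names invented) showing an unconstrained CHAR parameter taking its size from each caller:

```sql
declare
    x char(10) := 'a';
    y char(20) := 'a';

    procedure p( p_a in char )
    is
    begin
        -- length() counts the blank padding, so it reveals
        -- the size the parameter took on for this call
        dbms_output.put_line( 'length of p_a = ' || length(p_a) );
    end;
begin
    p( x );     -- p_a is char(10) for this call
    p( y );     -- p_a is char(20) for this call
    p( '12' );  -- a 2-character literal: p_a is simply length 2
end;
/
```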

Char(1) vs Varchar(1) Revisited

May 12, 2004 - 5:07 pm UTC

Reviewer: Jeff Westman from Walnut Creek, CA, USA

So, Tom ... Am I understanding this correctly?

Situation #1
CHAR(1) = null, VARCHAR(1) = null
The char takes up 3 bytes (length id + space), and the varchar takes up 2 bytes (length id only)

Situation #2
CHAR(1) = 'X', VARCHAR(1) = 'X'
The char takes up 3 bytes still (length id + 'X'), and the varchar takes up 3 bytes (length id + 'X')

So [if I understand this right] it seems best to always use a VARCHAR2 since you will use a minimum of 2 bytes either way.

When would you want to use a CHAR then???!

Thanks in advance,

Jeff

Tom Kyte

Followup  

May 13, 2004 - 9:34 am UTC

situation #1: both take zero or one byte -- the NULL flag byte. (zero bytes if they are a trailing null column in the row)

situation #2: both take three bytes -- the NULL flag byte, one byte for the length, one byte for the data.


I would never use char, in fact, I never do.

What if CHAR(1) is a NOT NULL column

June 30, 2004 - 3:23 am UTC

Reviewer: Sea from Australia

I have a not null column which is a flag. The value is either 'Y' or 'N'. There will never be any null values. In this case, does it mean it does not really matter whether to use CHAR or VARCHAR?


Tom Kyte

Followup  

June 30, 2004 - 10:06 am UTC

no, they would be 100% the same -- physically on disk, and query wise.

soooo, no good reason to change the rule, i would still use varchar2.
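For the skeptical, a tiny sketch (table name made up) that shows the two flag columns are byte-for-byte identical once a one-character NOT NULL value is in them:

```sql
create table t_flag ( c char(1) not null, v varchar2(1) not null );

insert into t_flag values ( 'Y', 'Y' );

-- vsize reports the stored size, dump the stored bytes;
-- both should come back identical for the two columns
select vsize(c), vsize(v), dump(c), dump(v) from t_flag;
```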

number vs date in data warehouse

June 30, 2004 - 1:55 pm UTC

Reviewer: CH from WA

Tom,
Is it better to use sequential numbers for date_key instead of keys like 20040630? That means you always have to join to the date dimension table whenever there is a constraint on date.
Thank you very much for your always eye-opening replies.


Tom Kyte

Followup  

June 30, 2004 - 2:05 pm UTC

(or just use a date?)

Number vs Date

June 30, 2004 - 6:11 pm UTC

Reviewer: Wayne

Tom,
Using DATE is much better than using number keys if the field is not null and always has good values (e.g. trans_date, populated with sysdate at insert time). But otherwise, dates can have bad values (which are not desirable to display as-is in the DW). We can turn them into null, but then in many queries we have to ask the users to apply outer joins (to a TIME_DIM dimension table) even when we don't have the original values.

I saw some people put a default date there, such as '4-JUL-1776' or '31-DEC-9999', but calculation could be a problem: if people run avg(sysdate - birthday), we might get wrong results.

Any suggestions?

Thanks,

Tom Kyte

Followup  

June 30, 2004 - 8:33 pm UTC

if the field is a date and you put in a number -- what good is that? you cannot do any analysis?



Please DO NOT post irrelevant question

June 30, 2004 - 8:13 pm UTC

Reviewer: A Reader

The topic is "Char Vs Varchar".
Please DO NOT post questions that are irrelevant to the topic.

Thanks!

Date Key

June 30, 2004 - 9:58 pm UTC

Reviewer: Wayne

Tom,
Some "date" source fields are not stored as dates, but rather as varchar2 in 'mm/dd/yyyy' format. I know it's bad, and bad data can slip in, like '2/29/2003' or '6/30/2037'.
1. We don't want to store them "as is" in the DW; it's humiliating (when we find somebody is 900 years old) and difficult to use (conversion to dates might fail);
2. So we store them as dates and set the field to null if they are bad or unrealistic dates;
3. Users want to know, for those null fields, which were originally null (we didn't collect the info) and which were set to null (meaning bad value).
4. So we have to put a date_key (number) there and use 0 for unavailable, -1 for bad data.

I know we can let users go to a staging table where these data are stored and before they are used to update the real DW table, but sometimes users just don't want to and they set the rules.

Your insights?

Thanks,
Wayne



Tom Kyte

Followup  

July 01, 2004 - 10:22 am UTC

i would have stored a date field AND the source data when bad

that is, convert the date and if the conversion fails, store the string. so a null date with a null bad field is truly missing, a null date with a non-null bad field is bad data and they can see what the bad data was exactly.
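A minimal sketch of that approach (all table, column and function names here are invented, not from the original posts): convert what converts, and preserve the raw string only when the conversion fails.

```sql
create table dw_fact (
    trans_date     date,           -- null when the source was missing or bad
    trans_date_bad varchar2(10)    -- original string, populated only on bad data
);

create or replace function try_date( p_str in varchar2 ) return date
is
begin
    return to_date( p_str, 'mm/dd/yyyy' );
exception
    when others then
        return null;   -- deliberate here: a failed conversion means "bad date"
end;
/

-- load: a null date with a null bad column is truly missing,
-- a null date with a non-null bad column is bad source data
insert into dw_fact
select try_date( src_date ),
       case when try_date( src_date ) is null then src_date end
  from staging_table;
```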

import of char values into varchar2 fields

July 05, 2004 - 9:16 am UTC

Reviewer: Dorothea from Germany

Dear Tom,

I have the following problem: I have to make a full import from a database with char fields into a database with only varchar2 fields. The problem is of course, that all the trailing blanks are as well imported. How is it possible to get rid of the trailing blanks (for quite a big database)?

thanks a lot in advance!


Tom Kyte

Followup  

July 05, 2004 - 10:26 am UTC

you could put triggers on the tables prior to the import to trim them, else you have to update them.

we put in exactly what we are told to put in, nothing more and in this case, nothing less.
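The trigger route might be sketched like this (trigger, table and column names are placeholders, not from the import in question):

```sql
-- trims trailing blanks as each imported row arrives
create or replace trigger trg_trim_before_import
before insert on some_table
for each row
begin
    :new.some_col := rtrim( :new.some_col );
end;
/

-- drop the trigger once the import finishes; the alternative is
-- a one-shot cleanup afterwards:
--   update some_table set some_col = rtrim( some_col );
```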

thanx

July 06, 2004 - 5:43 am UTC

Reviewer: Dorothea from Germany

This was a really exact answer! And it helped. Thanx.
Dorothea

Varchar vs Varchar2

October 19, 2004 - 1:52 pm UTC

Reviewer: John from USA

Tom,

What is the difference between Varchar and Varchar2 datatypes? I read somewhere that, we should always use Varchar2 in preference to Varchar.

Pl let me know.


Tom Kyte

Followup  

October 19, 2004 - 1:56 pm UTC

they are and have been the same -- but varchar2 will remain the same, varchar is subject to change (if ansi changes how it should behave)

Help on Datatype!!!!!!!

October 28, 2004 - 10:10 am UTC

Reviewer: A reader

Hi Tom,
The other day one of the developers asked me to help him create a table for reports. This table will be used for recording SQL queries. So I created a table, and one of the columns is Repsql varchar2(2000 char).
My question is: from the performance and space point of view, which datatype should I use from the following:
varchar2(2000 char);
char(2000 varchar2);
This datatype I was not allowed to create:
varchar2(2000 varchar2);
It generates the error: ORA-00905: missing keyword

Also, a point to note is that the queries can be huge. I could not explain to my developer, nor could I remember the link on your site which explains the difference between
varchar2(2000) and varchar2(2000 char).
Another thing: I did a DESC on v$sql and saw the field SQL_TEXT to be just varchar2(1000). So is this size sufficient for the above table?
Oracle 9iR2 on Win 2k Advanced Server.

Thanks in advance as always.


Tom Kyte

Followup  

October 28, 2004 - 1:37 pm UTC

varchar2 is the only type i use for strings

it would be varchar2( 2000 char ) (for 2000 CHARACTERS)

2000 char lets you hold 2000 characters -- in a multi-byte character set, that might be 4000 bytes (2 bytes/character), for example

2000 byte lets you hold 2000 bytes of data, which might be as little as 2000/6 characters.
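The difference is easy to see with LENGTH vs LENGTHB (this demo table is made up): LENGTH counts characters, LENGTHB counts bytes.

```sql
create table t_sem (
    a varchar2(2000 char),   -- capacity measured in characters
    b varchar2(2000 byte)    -- capacity measured in bytes
);

insert into t_sem values ( 'abc', 'abc' );

-- for plain ASCII these match; in a multibyte character set such as UTF8,
-- lengthb() can be several times length() for non-ASCII data
select length(a), lengthb(a), length(b), lengthb(b) from t_sem;
```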

The max size of varchar2 in tables and procedures

November 12, 2004 - 11:32 am UTC

Reviewer: Sean from NJ, USA

Hi Tom,

The max size of the varchar2 datatype is 4000 bytes, but in a stored procedure I can declare a varchar2 variable up to 32767 bytes. Yet chapter 3 of the Application Developer's Guide only states that the max size of varchar2 is 4000. Why are the max sizes different?

Thanks so much for your help.

Oracle 9204

Sean





Tom Kyte

Followup  

November 12, 2004 - 1:31 pm UTC

the max size of a varchar2 column ON DISK is 4000 bytes.

the max size of a varchar2 column IN MEMORY is 32k.

they just are different; not sure there is a "reason", other than that in plsql you sometimes need more than 4000 characters for variables. On disk you have clobs.
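A quick sketch of the two limits side by side; the commented-out insert is hypothetical, shown only to indicate where the 4000-byte disk limit would bite:

```sql
declare
    l_big varchar2(32767);   -- legal in pl/sql, far beyond the column limit
begin
    l_big := rpad( 'x', 32767, 'x' );
    dbms_output.put_line( length( l_big ) );   -- 32767

    -- binding all of l_big into a varchar2(4000) column would fail
    -- (ORA-01461 in this era of the database); a clob column would accept it
    -- insert into t ( c ) values ( l_big );
end;
/
```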

clob in plsql

November 12, 2004 - 1:41 pm UTC

Reviewer: Sean from NJ, USA

Hi Tom,

We can use clobs in plsql too if it is larger than 32K.

Thanks so much for your help.


Tom Kyte

Followup  

November 12, 2004 - 3:33 pm UTC

in 9i yes, you can.

Shouldn't this..

November 12, 2004 - 3:33 pm UTC

Reviewer: A reader

"the varchar is physically 1999 bytes but logically can be any size in between 0 and 1999."

be..

"the varchar is logically 1999 bytes but physically can be any size in between 0 and 1999."

Tom Kyte

Followup  

November 12, 2004 - 3:43 pm UTC

nope, meant it the first way when talking about plsql (only plsql).


a varchar2 less than 2000 characters in plsql is pre-allocated physically to the maximum length.

varchar2(50) -- is 50 bytes

varchar2(50) := 'x'
is logically 1 byte in my character set, physically however takes 50 bytes of RAM.




verification

November 12, 2004 - 6:12 pm UTC

Reviewer: Menon

"varchar2(50) := 'x'
is logically 1 byte in my character set, physically however takes 50 bytes of
RAM.
"

Is there a way to verify this?
Thanx.

Tom Kyte

Followup  

November 12, 2004 - 8:48 pm UTC

did you read the original answer? that was the crux of the question in the first place.

sorry

November 13, 2004 - 10:33 am UTC

Reviewer: A reader

did not read the original answer..
however, I was looking for a more "direct" proof. I guess
there is none - you can only see the effect indirectly
by measuring elapsed times in the degenerate case.

Anyway this was more of an "academic" question.

Tom Kyte

Followup  

November 13, 2004 - 11:03 am UTC

<b>fair enough, here is the "direct" proof:</b>



ops$tkyte@ORA9IR2> create or replace
  2  procedure create_procedure( p_name in varchar2,
  3                              p_num_vars in number,
  4                              p_size_vars in number )
  5  as
  6      l_stmt long := 'create or replace procedure ' || p_name || ' as ';
  7  begin
  8
  9      for i in 1 .. p_num_vars
 10      loop
 11          l_stmt := l_stmt || chr(10) || ' v' || i || ' varchar2('||p_size_vars||') := ''x'';';
 12      end loop;
 13
 14      l_stmt := l_stmt || chr(10) || 'begin null; end;';
 15
 16      execute immediate l_stmt;
 17  end;
 18  /
 
Procedure created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec create_procedure( 'p1', 1000, 1999 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec create_procedure( 'p2', 1000, 2001 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec p1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec p2
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                VALUE
------------------------------ ----------
session pga memory max             286044
 
ops$tkyte@ORA9IR2> exec p1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V       DIFF
------------------------------ ---------- ----------
session pga memory max            2314240    2028196
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                VALUE
------------------------------ ----------
session pga memory max             286044
 
ops$tkyte@ORA9IR2> exec p2
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V       DIFF
------------------------------ ---------- ----------
session pga memory max             364120      78076
 
ops$tkyte@ORA9IR2>
 

excellent!

November 13, 2004 - 11:05 am UTC

Reviewer: Menon

completely forgot about the pga memory stats:) Thanx Tom.

Difference

January 12, 2005 - 5:14 am UTC

Reviewer: Reader from India

what is difference between varchar,and varchar2 datatype in oracle

Tom Kyte

Followup  

January 12, 2005 - 9:28 am UTC

nothing right now.

PL/SQL Variable Sizes

March 29, 2005 - 6:03 pm UTC

Reviewer: Andrew from Wellington, NZ

My team is tired of seeing value errors caused by PL/SQL developers not allocating a big enough string for a message (ORA-6502 numeric or value error: character string buffer too small), so someone has suggested enforcing all VARCHAR2s in PL/SQL to be 2000+. This way there is never a downside to memory usage, only a small downside to performance due to re-allocating modified strings.

Assume for a second that none of these varchar2 variables are used to store/retrieve information from the database (assume we use plsql records for that).

Is there any downside to enforcing all VARCHAR2 variables to be 2000+? The only one could be: what about PL/SQL collections? We use index-by tables and associative arrays (but not varrays) to pass around information. But because these collections cannot have their sizes pre-allocated, I am guessing that PL/SQL does the right thing and doesn't waste any space (in memory) when extending a collection, i.e.

TYPE typet IS TABLE OF VARCHAR2(2000);
t typet := typet('x','y','z');

Does this take up 3 bytes or 6000 bytes?


Tom Kyte

Followup  

March 29, 2005 - 8:24 pm UTC

there is a downside in that you no longer verify the string is the right size either.  That doesn't seem like the correct fix to me.

Use of %TYPE, %ROWTYPE maybe.

Looks "safe" enough memory wise, but doesn't sound like a sound programming concept (well, it isn't frankly).


ops$tkyte@ORA9IR2> create or replace package demo_pkg
  2  as
  3          type array is table of varchar2(2000);
  4          g_data array := array();
  5  end;
  6  /
 
Package created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> disconnect
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                VALUE
------------------------------ ----------
session pga memory max             286044
 
ops$tkyte@ORA9IR2> begin
  2          dbms_output.put_line( demo_pkg.g_data.count );
  3  end;
  4  /
0
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             360092           74,048
 
ops$tkyte@ORA9IR2> begin
  2          demo_pkg.g_data.extend(500);
  3  end;
  4  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             360092           74,048
 
ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 50
  3          loop
  4                  demo_pkg.g_data(i) := 'x';
  5          end loop;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             360092           74,048
 
ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 500
  3          loop
  4                  demo_pkg.g_data(i) := 'x';
  5          end loop;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             360092           74,048
 
ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 500
  3          loop
  4                  demo_pkg.g_data(i) := rpad('x',200,'x');
  5          end loop;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             491164          205,120
 
ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 500
  3          loop
  4                  demo_pkg.g_data(i) := rpad('x',2000,'x');
  5          end loop;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max            1867420        1,581,376
 
 

string literal padded with trailing blanks

April 15, 2005 - 3:29 pm UTC

Reviewer: A reader

Hi Tom,
Our problem is related to one of your comments earlier -

character string literals like that are coerced to the proper datatype automagically
'A' is really 'A '
it is the ansi defined behaviour

We have a select like this in our jsp (java server page):
select '250' as TYPE_ID, ... from ...

At one client site, this is returning '250 ' in a java String variable. According to the developers, it always returned '250' (without the trailing blanks). I noted that this particular client uses utf8 as their db characterset, whereas all others use us7ascii. Is this behaviour of padded trailing blanks because of utf8 characterset? Is there anything we could do on the jdbc side that will globally suppress the trailing blanks from strings rather than we having to change our java code all over to do a trim?

Tom Kyte

Followup  

April 15, 2005 - 3:31 pm UTC

other thought first -- what is cursor_sharing set to on this customer site?

cursor_sharing?

April 15, 2005 - 5:55 pm UTC

Reviewer: A reader

Sorry for the delay, Tom, but the client is outside the U.S. and I don't have SYSDBA access to their db. I guess cursor_sharing=similar or force could be causing this? I will ask them to set it to exact and see. Thanks much.

Tom Kyte

Followup  

April 15, 2005 - 6:26 pm UTC

ops$tkyte@ORA9IR2> alter session set cursor_sharing=exact;
 
Session altered.
 
ops$tkyte@ORA9IR2> select '123' from dual d1;
 
'12
---
123
 
ops$tkyte@ORA9IR2> alter session set cursor_sharing=force;
 
Session altered.
 
ops$tkyte@ORA9IR2> select '123' from dual d2;
 
'123'
--------------------------------
123
 
ops$tkyte@ORA9IR2>


see the column widths?  with cursor sharing force/similar, the second query becomes select :sys_bv0 from dual and you no longer know "3" is the answer


and "show parameter" does not need sysdba

in fact, just run select '123' from dual and see what you see in sqlplus! 

Difference in performance?

May 09, 2005 - 10:09 am UTC

Reviewer: A reader

I just stepped into a shop where the developers have been strictly instructed to use char rather than varchar2 because of "performance benefits". Any truth to this?

PS - No metrics, no test cases...surprised?

Thanks in advance!

Tom Kyte

Followup  

May 09, 2005 - 10:16 am UTC

yes, char will make you run slower, consume more ram and make the pockets of your hardware vendor better padded.

Other than that, they are great.

recommendation -- never use char()

Documentation to this fact...

May 18, 2005 - 11:00 am UTC

Reviewer: A reader

Tom,

Does Oracle have any documentation stating that varchar2 is prefered to char and why? A colleague is convinced that using varchar2 instead of char columns will degrade the performance of the database. He says that using char columns will write everything in nice, neat groups and prevent fragmentation. Being from Missouri, I need to show him to convice him otherwise.

Thanks!

Tom Kyte

Followup  

May 18, 2005 - 11:39 am UTC

<quote>
Does Oracle have any documentation stating that varchar2 is prefered to char and
why?
</quote>

no, it is your choice and my voice on it is "USE VARCHAR2 and FORGET CHAR even exists"

a char is simply a fixed-width varchar2 -- with blanks at the end.

a char(80) will take a length byte (like a varchar2) and consume a full 80 bytes of storage on disk.

it will make your indexes as big as they can be.
it will make your tables as large as they can be.
I will never use them.
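One rough way to put numbers on this (the demo tables are invented): load the same short strings into a char(80) column and a varchar2(80) column and compare the stored sizes.

```sql
create table t_char    ( x char(80) );
create table t_varchar ( x varchar2(80) );

insert into t_char    select 'row ' || rownum from all_objects where rownum <= 1000;
insert into t_varchar select 'row ' || rownum from all_objects where rownum <= 1000;

-- the char rows are blank padded to a full 80 bytes each;
-- the varchar2 rows consume only the length of the actual data
select 'char(80)' what, avg( vsize(x) ) avg_bytes from t_char
union all
select 'varchar2(80)', avg( vsize(x) ) from t_varchar;
```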

Performence between data types

July 11, 2005 - 9:56 am UTC

Reviewer: Sanjaya Balasuriya from Colombo, Sri Lanka

Hi Tom,

1.
When used for storing text (characters and numbers), is there any performance-wise difference among number, char and varchar2?

2.
How can we compare number, char and varchar2 when it comes to efficiency of storage management ?

Thanks in advance.

Tom Kyte

Followup  

July 11, 2005 - 11:36 am UTC

1) well, the only thing that makes SENSE To store text is "varchar2"

the only thing that makes SENSE to store a number is, well, a number

same with a date. The question doesn't even make sense to me -- if you have numbers, use a NUMBER type.

2) you don't, you won't, you use the right type for your needs, period, plain and simple.

Case for CHAR

July 27, 2005 - 7:56 pm UTC

Reviewer: Kevin Kirkpatrick from St. Louis, MO

I have a friend (wow, is that a cliche or what) who works in a shop where the primary context in which data will be extracted is into fixed-width flat files processed by legacy COBOL programs. Thus, the majority of SQL written against the database will pull fixed-width fields. Given that space is not an issue, and that a single-order-of-magnitude-or-less performance degradation is not an issue, might this be a case where even Tom Kyte would consider using CHARs in place of VARCHAR2s?

Consider:
SELECT FIELD1, FIELD2
FROM TEMP

vs.
SELECT RPAD(FIELD1,10,' '), RPAD(FIELD2,15,' ')
FROM TEMP

Thanks!

Tom Kyte

Followup  

July 27, 2005 - 8:44 pm UTC

Nope, because eventually someone will ask "what is the length of 'bob'" and getting 20 back is not what they expect. Or they use a varchar in their program, or they use any one of the thousands of ad-hoc tools, and it uses varchars.

Nope, still would not.

CHAR x VARCHAR2 usage

January 05, 2006 - 3:40 pm UTC

Reviewer: Leonardo Pinheiro from Brasil

Hi there,
Thinking about the CHAR and VARCHAR2 datatypes, I figured out one scenario that might favor the CHAR datatype over VARCHAR2. I would like to confirm it with Tom or any other DBA who has already tested it using block dumps.

If we have a column that gets updated a lot, and those updates change the data length, we might have a lot more I/O with VARCHAR2 due to the possibility of block reallocation inside the datafile. For instance:

create table test (A varchar2(2000))

insert into test values ('a') -- let's say 1000 of these inserts

update test set A='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'

We would have to move all the blocks forward in the datafile to fit the new data for each row. Is that performance issue considerable, or is it just rubbish?

thanks!

Tom Kyte

Followup  

January 05, 2006 - 4:46 pm UTC

you would not move all of the blocks "forward"...

When you update a row - and make it grow in size such that is no longer fits on a block - Oracle will MIGRATE the row.

That is, the original block will just have a "forwarding address"- a rowid pointer to where the row is right now.

So, no, it won't cause the entire "end of the table to move" to make room -- it will, however, incur an extra IO on retrieval via an index (the index will point to the block where the row started; we go to that block, discover the row has "moved", or migrated, and go to the new block to get it).

This is why you would consider setting pctfree. To reserve room on the block for future updates and to avoid the row migration.
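Setting it is a one-liner at create time; the 40 below is purely illustrative, and the right value depends on how much your rows actually grow:

```sql
-- reserve roughly 40% of every block for future row growth
create table t_grow ( a varchar2(2000) ) pctfree 40;

-- after a round of lengthening updates, migrated rows show up
-- in CHAIN_CNT once the table is analyzed:
--   analyze table t_grow compute statistics;
--   select chain_cnt from user_tables where table_name = 'T_GROW';
```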

January 06, 2006 - 4:26 am UTC

Reviewer: Raj from IN

Tom,
in the case of the above reply, can I have my tablespace segment space management set to AUTO and not worry about setting pctfree, pctused, etc.?

Tom Kyte

Followup  

January 06, 2006 - 1:51 pm UTC

which above reply, there are lots of them.

ASSM (automatic segment space management) does not help in the setting of pctfree, you still need to do that.

Blank Padds in Char Data Type

January 26, 2006 - 10:41 pm UTC

Reviewer: Sanjai kumaar NM from THAILAND

I have two tables with the following descriptions:
Table TEST1 (X CHAR(10)) and Table TEST2 (Y VARCHAR2(10))
Both tables contain some identical values. When I do a join like
SELECT X, Y FROM TEST1, TEST2 WHERE TEST1.X = TEST2.Y
it does not return any rows. Whereas if I issue a select statement separately on the individual tables, like
SELECT X FROM TEST1 WHERE X = 'ABC12' AND
SELECT Y FROM TEST2 WHERE Y = 'ABC12'
these select statements return rows.
What makes the difference?
Thanks in advance.


Tom Kyte

Followup  

January 27, 2006 - 8:33 am UTC

select dump(x) from test1 where x = 'ABC12';
select dump(y) from test2 where y = 'ABC12'

what does that produce (cut and PASTE)
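If those dumps come back showing trailing blanks on X (likely, since X is a CHAR(10)), one hedged sketch of a workaround while the real fix (converting the char columns to varchar2) is pending:

```sql
-- matches despite the blank padding, but note that wrapping the
-- column in a function defeats any ordinary index on test1.x
select x, y
  from test1, test2
 where rtrim( test1.x ) = test2.y;
```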


Documentation on CHAR deprecation

April 21, 2006 - 8:03 am UTC

Reviewer: Nice

Hello Tom,

Will the CHAR datatype be deprecated? Is there any official announcement on this?
The first reference to the desupport of CHAR seems to have surfaced on Metalink in December 2001.
CHAR is part of the ANSI standard, so it seems rather unlikely to go.

Tom Kyte

Followup  

April 21, 2006 - 8:46 am UTC

I would doubt it would be.

Where do you see a reference to the desupport of this type?

VARCHAR2(<2000) Vs VARCHAR2(>=2000)

June 22, 2006 - 1:33 pm UTC

Reviewer: VKOUL from WA USA

Hi Tom,

"In PLSQL, what they are saying is that for optimization purposes -- a varchar2
less then 2000 bytes will allocate a fixed amount of memory. It will
preallocate the maximum size upon instantiation. This cuts down on the
reallocations as different sized strings are assigned to it. On the other hand,
a varchar2 greater then 2000 characters will truely be varying sized and as you
assign different sized strings to it it will grow dynamically as needed.

One takes a little more ram (if you never use the entire string) and performs a
little better, the other uses only what ram it needs but if you cause the string
to grow frequently will not perform as well due to the reallocs to resize the ..."

Does this behaviour still hold in Oracle 9i and 10g ?

Thanks.

Strange thing happening

October 02, 2006 - 6:58 am UTC

Reviewer: abz

Oracle 9.2.0.5

SQL> DESC CI_BILL;
 Name                                      Null?    Type
 ----------------------------------------- -------- --------
 BILL_ID                                   NOT NULL CHAR(12)
 BILL_CYC_CD                               NOT NULL CHAR(4)
 WIN_START_DT                                       DATE
 ACCT_ID                                   NOT NULL CHAR(10)
 BILL_STAT_FLG                             NOT NULL CHAR(2)

SQL> L
  1  select COUNT(bill_stat_flg), SUM(DECODE(bill_stat_flg, 'C', 1, 0))
  2* from ci_bill where bill_stat_flg='C'
SQL> /

COUNT(BILL_STAT_FLG) SUM(DECODE(BILL_STAT_FLG,'C',1,0))
-------------------- ----------------------------------
             4651756                                  0


I think both of them should be equal. What's going wrong
here? 

Tom Kyte

Followup  

October 02, 2006 - 7:57 am UTC

you got that neato char datatype working for you there.  I hate char, I will not use char, I will never recommend char, and this - this is a classic reason WHY.


ops$tkyte%ORA10GR2> create table t ( x char(2) );

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t values ( 'C' );

1 row created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select decode( x, 'C', 'I am ''C''!', 'I am certainly NOT ''C''' ),
  2         decode( x, 'C ', 'I am ''C ''!', 'I am certainly NOT ''C ''' )
  3    from t;

DECODE(X,'C','IAM''C'' DECODE(X,'C','IAM''C''!
---------------------- -----------------------
I am certainly NOT 'C' I am 'C '!


You have NO 'C' in your table, you have 'C '.  Decode works with varchar types by default, not char types.  When you compare a char to a varchar, the varchar better be padded out OR IT WILL NOT WORK as you "expect" (but it will work entirely as ANSI asked for it to work)

ops$tkyte%ORA10GR2> variable x varchar2(2)
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec :x := 'C'

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from t where x='C';

X
--
C

ops$tkyte%ORA10GR2> select * from t where x=:x;

no rows selected



When you compare a literal to it (oh my gosh, please do NOT think I'm saying to use literals in your code - I'm really not, bind stuff that varies from execution to execution!!!!) the rules state "the literal shall be promoted to the type we compare to".  Hence the char(1) that is 'C' becomes a char(2) with 'C ' in it.

However, when you put 'C' into a varchar2, you are now comparing a varchar2 to a char and the results are very different.


My suggestion would be to (I am deadly serious) implement an engineering change request on your system to "remove that type, it wastes space, it causes confusion, it provides negative value". 
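When the CHAR columns cannot be changed (as with a third-party schema), the comparison can be made reliable by blank-padding the bind, or by casting it so both sides use blank-padded comparison semantics. A hedged sketch against the t(x char(2)) table from the example above:

```sql
-- Assuming the t ( x char(2) ) table from the example above:
variable x varchar2(2)
exec :x := 'C'

-- pad the bind out to the column width...
select * from t where x = rpad(:x, 2);

-- ...or cast it to CHAR so blank-padded semantics apply
select * from t where x = cast(:x as char(2));
```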

very strange

October 02, 2006 - 8:40 am UTC

Reviewer: abz

Oh, it's really very confusing.
Problem is that we are using a 3rd party product
called CC&B from SPL www.splwg.com.

That product uses CHAR datatypes for 99 percent of its columns.

Can you please tell me how I can convince them?

OR
Although you have explained it very well, is there
a document like ALL ABOUT CHAR DATATYPE which I can
circulate to all the developers in the organization, so
that they know what to expect when working
with CHAR datatypes?

Thanks


Tom Kyte

Followup  

October 02, 2006 - 9:03 am UTC

well, the documentation actually covers this. I cover it too in my book Expert Oracle Database Architecture - chapter on datatypes - but I cover it from the perspective of "the char type is just a waste, do not use it"

CHAR vs VARCHAR

January 15, 2007 - 12:02 pm UTC

Reviewer: Deepak from India

Hi Tom,

First of all please excuse me because I am going to ask a very trivial question. But it raises a few points in my mind.

Have done this simple exercise...

SQL> conn scott/tiger
Connected.
SQL> create table char_vs_varchar(
2 col1 char(5),
3 col2 varchar2(5));

Table created.

SQL> insert into char_vs_varchar values('TEST','TEST');

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM char_vs_varchar
2 WHERE COL1=COL2;

no rows selected

SQL> SELECT * FROM char_vs_varchar
2 WHERE TRIM(COL1)=COL2;

COL1 COL2
----- -----
TEST TEST

SQL> SELECT * FROM char_vs_varchar
2 WHERE COL1='TEST';

COL1 COL2
----- -----
TEST TEST

Have the following queries...

I know that Oracle is doing the right thing as it right pads the string for the CHAR column with ' '.

>> But is it always recommended not to use this type of comparison in queries?

>> Shall we always insert fixed-length strings into CHAR columns? In other words, shall we decide on the CHAR data type based on the possible length of the string that will go in, choosing CHAR only for FIXED-length strings?

>> In the last example, why does Oracle choose to trim the CHAR column value?

Please help me understand the basics.
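Editor's note on the three questions above: the behaviour follows from Oracle's two comparison rules. When both operands are CHAR (or a literal is compared to a CHAR), blank-padded semantics apply and trailing spaces are ignored; when either operand is a VARCHAR2, nonpadded semantics apply and 'TEST ' does not equal 'TEST'. A sketch against the table above:

```sql
-- char = varchar2 -> nonpadded semantics: 'TEST ' <> 'TEST', no rows
select * from char_vs_varchar where col1 = col2;

-- char = literal -> blank-padded semantics: trailing blanks ignored
select * from char_vs_varchar where col1 = 'TEST';

-- casting the varchar2 to char restores blank-padded semantics
select * from char_vs_varchar where col1 = cast(col2 as char(5));
```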


Common defined types

February 20, 2007 - 2:18 pm UTC

Reviewer: Andrew Markiewicz from Madison, WI

Tom.
My question/response is similar to a previous post on this thread about requiring all varchar2 to be >2000. Your response was
'Looks "safe" enough memory wise, but doesn't sound like a sound programming concept (well, it isn't
frankly).'

We were considering making a package spec to contain predefined types so that programmers could use them instead of defining their own type (most likely differently) in each package. Similar to the STANDARD package Oracle defines, but including some standard PL/SQL table definitions.

Ex:
create or replace package typ
as
TYPE hashInt is TABLE of integer
index by varchar2(32000);
 
TYPE hashVc is TABLE of varchar2(32000)
index by varchar2(32000);

TYPE hashDt is TABLE of date
index by varchar2(32000);

TYPE vaInt  is VARRAY(10000) of integer;
TYPE vaVc   is VARRAY(10000) of varchar2(32000);
TYPE vaDt   is VARRAY(10000) of date;



end;


And using the types would merely be,
...
xyz  typ.hashDt;
abc  typ.vaInt;


The main benefits are not having to redefine a similar type everywhere it is needed and homogeneous code.
My main concern was the memory consumption but it seems Oracle handles that dynamically, which is good for memory conservation.
I would not advocate the use of these types if there was a requirement to limit the varchar2 size and trap an appropriate error as part of an application, but rather just as a convenient method of using simple arrays.

Are there any drawbacks you can think of, or is this also not a sound programming technique?

Thanks.

Tom Kyte

Followup  

February 20, 2007 - 3:16 pm UTC


If you treat everything as a 32k string, everything will be a 32k string.

You lose the strong typing - which is generally considered "not a good thing".


February 21, 2007 - 12:08 pm UTC

Reviewer: Andrew Markiewicz

But I am not suggesting that everything be a varchar2(32000). There is a plsql table defined for integers and dates as well so the data type is preserved. The main goal is to avoid having to define a type every time you just want a simple array of strings (or integers or dates).

An additional consideration could be to make a UDT VARRAY(32767). This adds the benefit of using the type constructor within function calls or even SQL.

create or replace type vcarr as VARRAY(32767) of varchar2(32767)
/

set echo on
col column_value format a20

select * from table(vcarr('x','y','z'));

COLUMN_VALUE
--------------------
x
y
z

declare
    x   vcarr := vcarr();

    procedure proc1(p_arr in vcarr)
    is
    begin
        for i in p_arr.FIRST .. p_arr.LAST
        loop
            dbms_output.put_line('i='|| i || ' ' || p_arr(i));
        end loop;
    end;

begin
    proc1(vcarr('a','b','c','d'));
    proc1(vcarr('hello', 'world'));

    x.EXTEND(5000);
    for i in 1 ..5000
    loop
        x(i) := to_char(i);
    end loop;
    proc1(x);

end;
/
i=1 a
i=2 b
i=3 c
i=4 d
i=1 hello
i=2 world
i=1 1
i=2 2
i=3 3
i=4 4
i=5 5
i=6 6
i=7 7
i=8 8
i=9 9
i=10 10
i=11 11
i=12 12
i=13 13
i=14 14
i=15 15
i=16 16
i=17 17
...
i=4999 4999
i=5000 5000

PL/SQL procedure successfully completed.


For the case of a type (or plsql table) I just abhor the idea of creating types like:

TYPE vc100 of varchar2(100)
TYPE vc500 of varchar2(500)
TYPE vc4000 of varchar2(4000)
...

when plsql handles the memory allocation dynamically (at a minor performance hit). Why not just one type per native data type (varchar2, integer, date,...)?
Again, for simple arrays that do not require constraints on the size of the data.
Unless of course Oracle has some designs on creating real arrays for plsql....

Andrew
Tom Kyte

Followup  

February 21, 2007 - 12:31 pm UTC

if you have a table of names (that should be 30 characters or less)
and a table of addresses (that should be 120 characters or less)

and you use a table of 32k - you'll have names that are 32k and addresses as well.


it is called "strong typing", think of the datatype as a DATA INTEGRITY check (because, after all, that is exactly what it is)

No, you won't convince me to use a few generic types for everything.

I might not convince you not to - but that is another story.


You asked my opinion, I'm of the opinion that creating the correct type for your code is the right approach.


.... for simple arrays that do not require constraints on the size of the data. ...


not sure where that occurs, or why it is onerous to create a type specific to an API.
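A middle ground between one generic 32k type and a new type at every call site is a package of named subtypes: each domain keeps its own length constraint, yet each is defined only once. A hedged sketch (all names here are illustrative, not from the original thread):

```sql
-- Illustrative only: one definition per domain, strong typing preserved
create or replace package app_types
as
    subtype name_t    is varchar2(30);
    subtype address_t is varchar2(120);

    type name_tab    is table of name_t    index by pls_integer;
    type address_tab is table of address_t index by pls_integer;
end app_types;
/

-- usage sketch:
-- declare
--     l_names app_types.name_tab;
-- begin
--     l_names(1) := rpad('x', 31, 'x');  -- raises VALUE_ERROR: the
--                                        -- 30-character constraint is enforced
-- end;
```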

February 21, 2007 - 1:55 pm UTC

Reviewer: Andrew Markiewicz from Madison, WI

Fair enough.
That's why I'm debating the issue internally. I'm not convinced it's a great idea myself, just convenient.

As always, thanks for your input.

is there any difference between varchar and varchar2?

March 14, 2007 - 10:51 am UTC

Reviewer: Jerhyn from Seoul, South Korea

this place is so amazing, i always appreciate you tom.

while i read this topic, i was wondering what the difference is between varchar and varchar2.

could you tell me? or just let me know where i can find out about them.

Tom Kyte

Followup  

March 14, 2007 - 2:05 pm UTC

nothing right now, they are the same right now, they might differ in the future.

future

March 14, 2007 - 3:19 pm UTC

Reviewer: marc from usa

How far in the future did they think about? Oracle has been saying "for future use" since version 7. Right?
Tom Kyte

Followup  

March 15, 2007 - 7:23 am UTC

at least.

Char vs Varchar2, like vs =

March 28, 2007 - 2:03 pm UTC

Reviewer: Jason Tsai from Santa Clara, CA USA

Tom,

This is the question regarding the reply you made for the second post in this thread from kiran.

I tried your test case in both 9.2.0.3 and 10.2.0.1, but got different results for predicate containing char datatype column and like operator:

In 9.2.0.3:

invmgmt_dev@SCORDEV3(95,1290)> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
PL/SQL Release 9.2.0.3.0 - Production
CORE    9.2.0.3.0       Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.3.0 - Production
NLSRTL Version 9.2.0.3.0 - Production

Elapsed: 00:00:00.01
invmgmt_dev@SCORDEV3(95,1290)> create table t ( x char(10), y varchar2(10) );

Table created.

Elapsed: 00:00:00.01
invmgmt_dev@SCORDEV3(95,1290)> insert into t values ( 'x', 'x ' );

1 row created.

Elapsed: 00:00:00.00
invmgmt_dev@SCORDEV3(95,1290)> select * from t where x like 'x';

X          Y
---------- ----------
x          x

Elapsed: 00:00:00.00
invmgmt_dev@SCORDEV3(95,1290)> select * from t where y like 'x';

no rows selected

Elapsed: 00:00:00.01


In 10.2.0.1:

jason@ORA10GR2(143,8689)> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

5 rows selected.

Elapsed: 00:00:00.09
jason@ORA10GR2(143,8689)> create table t ( x char(10), y varchar2(10) );

Table created.

Elapsed: 00:00:00.01
jason@ORA10GR2(143,8689)> insert into t values ( 'x', 'x ' );

1 row created.

Elapsed: 00:00:00.00
jason@ORA10GR2(143,8689)> select * from t where x like 'x';

no rows selected

Elapsed: 00:00:00.01
jason@ORA10GR2(143,8689)> select * from t where y like 'x';

no rows selected

Elapsed: 00:00:00.01


After searching Metalink, I found bug no. 4397534, which indicates the behavior in 9.2.0.3 is a bug.

Which behavior is correct?

Thanks,

Jason
Tom Kyte

Followup  

March 30, 2007 - 11:31 am UTC

why would I choose an answer other than what was stated in the bug report - that 9i had a bug that was corrected in 10g?

What is this behaviour of Oracle 10gR2

November 03, 2007 - 2:52 am UTC

Reviewer: Sagar from Mumbai, India

Why does a varchar column show as varchar2 in the desc output?

SQL> ed
Wrote file afiedt.buf

  1* create table tvar ( col1 varchar(100),col2 varchar2(4000 ))
SQL> /

Table created.

SQL> desc tvar
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 COL1                                               VARCHAR2(100)
 COL2                                               VARCHAR2(4000)

Tom Kyte

Followup  

November 05, 2007 - 11:15 am UTC

as stated above... and as documented:

in response to "what is the difference between varchar and varchar2"


... nothing right now, they are the same right now, they might differ in the future. ...

date as varchar

February 19, 2008 - 3:19 pm UTC

Reviewer: dobromil kautsky na stroji husgvarna

hello tom,

i know you say that dates should be stored in date columns but there is one case where i am not sure about how to achieve that.

imagine you need to store a date value whose precision can be anything between a year and a second. so the column may contain values like this:

'2008'
'2008-10'
'2008-10-03'
'2008-10-03 12:20'

etc.

we used to store these dates as a date column plus a number column where the number specified the precision of the date, but working with such a structure turned out to be quite cumbersome.

now we store these dates as a varchar ('yyyymmdd.....') - everything is much simpler now, but i still don't feel well about it - i wonder if there is a way to achieve this functionality with date datatype.

thank you
Tom Kyte

Followup  

February 20, 2008 - 7:58 am UTC

when you store 2008 in yyyymmdd - what do you store?

01-jan-2008 00:00:00 -> 2008               
01-oct-2008 00:00:00 -> 2008-10
03-oct-2008 00:00:00 -> 2008-10-03
03-oct-2008 12:20:00 -> 2008-10-03 12:20


and how is it better/different then using a date as shown above.

if you store 2008 as 20080000000000 - then you have just stored the logical equivalent of a date column and a number column - just using one field (eg: how would that be any more or less cumbersome than what you already had)

dates

February 20, 2008 - 8:20 pm UTC

Reviewer: dobromil kautsky na stroji husgvarna

when you store 2008 in yyyymmdd - what do you store?

only the part of the date that we know is stored - so for year it's '2008', for a month it's '200810', etc.

if i only used a date column i would lose an important piece of information - the precision.

sometimes we do not know the exact date, we only know the year, or the year and month, etc. so storing 2005 as 2005-01-01 would be misleading for the users (that's why we used to have the additional number column specifying the precision).

for example - in the table containing statutory body members there are two date columns which tell us from when to when the person was a member of the statutory body of a company. when we want to select all records valid for a given date, we can write:

select * from statutory_body_members where :date_str between start_date and end_date||'z'

for this to work we would have to have different rules for storing the value of '2008' in start_date (2008-01-01) and in end_date (2008-12-31 23:59:59) (which of course would be possible but after several years of working with both ways of storing these dates, i find the 'varchar' way much simpler to use - especially due to the fact that only one column is needed to store the date)



Tom Kyte

Followup  

February 20, 2008 - 8:53 pm UTC

you said you store yyyymmddhh24miss, but you don't. You use a variable mask.


The varchar approach might be "easier" for you, but throws off the computed cardinalities. Your choice.


Wonder how many of your dates are invalid.... Data types are integrity checks too.

I'm not a fan of it (using the wrong type)


what if you used two columns again, the start date and the interval (duration).
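Tom's remark above that strings throw off the computed cardinalities can be seen with a small sketch: to the optimizer, the string range '20061231' to '20070101' spans a large swath of possible character values, while the equivalent DATE range spans a single day. This is illustrative only; exact estimates vary by version and statistics:

```sql
-- Illustrative sketch: the same one-day range as a string vs a date
create table d ( str_dt varchar2(14), real_dt date );

insert into d
select to_char(dt,'yyyymmddhh24miss'), dt
  from (select date '2000-01-01' + level - 1 dt
          from dual connect by level <= 3000);
commit;

exec dbms_stats.gather_table_stats(user, 'D')

-- compare the estimated row counts in the plans for these two queries:
select count(*) from d where str_dt  between '20061231' and '20070101';
select count(*) from d where real_dt between date '2006-12-31' and date '2007-01-01';
```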

February 20, 2008 - 9:24 pm UTC

Reviewer: dobromil kautsky na stroji husgvarna

Wonder how many of your dates are invalid.... Data types are integrity checks too.

a simple check constraint can take care of that.


I'm not a fan of it (using the wrong type)

i know - that's why i do not feel well about our use of varchars

what if you used two columns again, the start date and the interval (duration).

all operations (that we perform) on these columns would become more complicated. (eg. the one thing that we do the most - store the date, display the date to the user - now all we do is store the string as it comes and then show it again - with dates you need to keep track of where to store the "start date" format of the date and where to store the "end date", then when reading the date you need to format it with a function because again there is no way to pass a date containing only the year as a date)

it would be best if oracle was capable of storing this information in a date column, unfortunately it only distinguishes between a date and a date with time - maybe a new datatype - just 3 more bits and so much work could be saved (at least at our company :-) )
Tom Kyte

Followup  

February 20, 2008 - 10:40 pm UTC

.... a simple check constraint can take care of that. ...

so you've done that right?

with a start_date and a duration, you would not have to store the "format" anywhere.

the duration would cover that already.

start_date, start_date + interval



It does not distinguish between a date and a date with time, ALL DATES have time, all of them.
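Tom's "start date plus duration" suggestion can be sketched with the INTERVAL datatypes available since 9i; the end date is then derived, never stored inconsistently. Table and column names here are illustrative:

```sql
-- Illustrative: store the start and the duration; derive the end
create table membership (
    person_id  number,
    start_date date,
    duration   interval year to month
);

insert into membership values ( 1, date '2006-01-01', interval '1' year );

select person_id,
       start_date,
       start_date + duration as end_date
  from membership;
```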


February 21, 2008 - 7:09 am UTC

Reviewer: dobromil kautsky na stroji husgvarna

i am not sure whether i understand how this start_date + interval could help me. could you please show me sample data? for example for a person who was a member of the statutory body from '200204' to '2007'. if i understand you well, you suggest using two columns for each of these dates?


Tom Kyte

Followup  

February 21, 2008 - 7:41 am UTC

show me how you do that with your current data first.


it'll be very similar.

provide your test data. Let us see examples of people that were there in 2005, another in October 2006 and someone for one day in 2007.

sample data

February 22, 2008 - 7:20 am UTC

Reviewer: dobromil kautsky na stroji husgvarna

here is a sample of real data (not very representative because 95% of strt_dt's and end_dt's in this table are yyyymmdd, but it's those that are not that cause "problems"..)


create table SB_SAMPLE
(
ROW_ID NUMBER(20) not null,
ROW_VER NUMBER(10),
ROW_DT DATE not null,
COMPANY_ID NUMBER(20) not null,
TYPE_CD VARCHAR2(90),
PERSON_ID NUMBER(20) not null,
STRT_DT VARCHAR2(30),
END_DT VARCHAR2(30)
)
;

insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (700, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 1700, '04', 887200, null, '19960318');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1100, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 2500, '04', 884700, null, null);
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1200, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 2800, '04', 167000, '19950601', '199807');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1600, 2, to_date('22-06-2007', 'dd-mm-yyyy'), 3600, '04', 371300, '1994', '199708');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1900, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 4000, '04', 1009200, '1992', '19971231');
commit;
Tom Kyte

Followup  

February 22, 2008 - 7:28 am UTC

you did not answer my question.

please show how to retrieve any bit of data with your where clause.

whether it be year/yearmonth/yearmonthday/whatever.

I want to find those people I referred to above, how would *you* do it.

February 22, 2008 - 8:11 am UTC

Reviewer: dobromil kautsky na stroji husgvarna

ok, so basically we access these dates in 3 ways:

1) just display them (or store), so it's just
"select strt_dt from ..." - our nls_date_format is set
to 'yyyymmddhh24miss' so accessing date columns and varchar columns containing dates is transparent to the middle tier.

2) select records that were valid on a given date (this is usually (almost always) part of a more complex filter - eg: all members for a given set of companies and a given date):

select * from sb_sample where company_id in (1,2,...) and :date_str_param between nvl(strt_dt,'0') and end_dt||'z'

3) select records whose dates fall within a given range (used for monitoring changes of the data). eg:

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and strt_dt<=:date_str_param_to||'z'

(in 99% of cases the query is run with :date_str_param_to set to current date)


Tom Kyte

Followup  

February 22, 2008 - 12:06 pm UTC

so, it would be IDENTICAL.

You would store a start date, eg:

01-jan-2007

and an interval

1 year

now, start_date is obviously 01-jan-2007
and end_date is that plus 1 year.

So, how much harder is that than the string? (eg: none)


select * from sb_sample where company_id in (1,2,...)
and to_date(:date_str_param,fmt)
between nvl(strt_dt,some_date_way_in_past)
and nvl(strt_dt+interval,some_date_way_in_future)


February 22, 2008 - 12:22 pm UTC

Reviewer: dobromil kautsky na stroji husgvarna

i do not think it's the same.

1) how would you store a records with dates like this:

i) start_date = '2005', end_date = '2006-03-02'
ii) start_date= '2005-03', end_date = '2007-04-02'
iii) start_date = '2005-01-06', end_date = '2009'

2) how woud you rewrite the third select:

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'

to achieve the same functionality.

3) bear in mind that the same selects need
to be run on end dates too

i think what you sugest is functionally
equivalent to just having start_date
and end_date as two date columns,

Tom Kyte

Followup  

February 22, 2008 - 12:56 pm UTC

'2005', end_date = '2006-03-02'

start date of 01-jan-2005 and interval of your end date minus start date


Look: you have START DATES and END DATES
I'm saying store the START DATE and a DURATION

now, we both have start dates and end dates

how to rewrite this?

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'

come on - you can do that given a date that is a start_date and another date that is an end_date equal to start_date plus duration.

instead of strt_dt||'z' you nvl(start_date, to_far_in_future)
instead of strt_dt, well, you just have start_date
instead of :date_str_param_to||'z' you have nvl(to_date(:bind),date_in_future)

actually, you could lose the strt_dt||'z' couldn't you - the null trick doesn't count since you just use strt_dt next.



All you need is

a) start_date
b) end_date

that is all, that is what you have, you are playing games with strings to get them to sort. We need not play games with dates, they just do the right thing in the first place.

February 22, 2008 - 12:33 pm UTC

Reviewer: dobromil kautsky na stroji husgvarna

and one thing - when i insert '2008' i need to select '2008', if i was to use what you suggest, how would i do that?
i can't display '2008-01-01' to the user, when the value that was inserted (=the most precise value we know) was '2008' - this is very important.
Tom Kyte

Followup  

February 22, 2008 - 12:57 pm UTC

that sounds like you want to store the display format when you save the data perhaps.

look - you apparently very much like your tricks with 'z' and everything. So be it. You can do it all with dates, you can do it just as easily with dates, and if you need a specific format for display purposes, by all means - go for it, that is pretty easy.

February 22, 2008 - 1:17 pm UTC

Reviewer: A reader

that sounds like you want to store the display format when you save the data perhaps.

exactly - so there we are again. that's what we moved from - using a date + precision (or format whatever) - so instead of one column per date, i need two.

look - you apparently very much like your tricks with 'z' and everything. So be it

it's not about what i like or what i do not like - all i need is a solution that will be as "correct" as possible and at the same time, working with the data should be simple and straightforward.


instead of strt_dt||'z' you nvl(start_date, to_far_in_future)


well those two are not equivalent at all!
see

:date_param = '2006-05-01'
strt_dt = '2006';

while
:date_param<=strt_dt||'z' is TRUE
:date_param<=nvl('2006-01-01','2222-01-01') is FALSE

the thing is, every time i use a date after '<', '<=', or as the second operand of 'between' i need to make it the "latest second" in the given period (be it a year, month, day or whatever - so something<'2008' must be something<'20081231235959', while something>'2008' is the same as something>'2008' (or '20080101' - it's the same in this case) - but the same column can (and does) appear both in "less than" and "greater than" conditions)

if there is a simple way to achieve this with dates, i want to know it! - of course it's simple if you know that all records will hold 'yyyymmdd', or 'yyyymmmddhh24miss', but when every record can contain a different "precision" it's not that easy - so far i have not been able to come up with a way to achieve this with dates while keeping the simplicity of the solution anywhere near the varchar way..

it does not please me, believe me. i would love to have a date datatype remembering the exact value inserted.




Tom Kyte

Followup  

February 22, 2008 - 3:03 pm UTC

.... exactly - so there we are again. that's what we moved from - using a date + precision (or format whatever) - so instead of one column per date, i need two. ...

and so what, you are storing multiple bits of data encoded differently into a single field for your ease, your convenience, this is no different than using a field to store attribute_1 or attribute_2 - depending on how you feel.

You are just encoding data specially to save yourself a keystroke. And adding the need for your 'z' tricks and such - pretending dates are strings and having to munge them to make them work.


...
it's not about what i like or what i do not like - all i need is a solution that will be as "correct" as possible and at the same time, working with the data should be simple and straightforward....

Ok, now here you are just plain wrong.

If I have:

a) start date
b) end date

in Oracle dates - which I do if I either store a start and end date (YOU ARE, just in a string, not in a date) or start date and interval - getting all of the data BY DATE is easy, straightforward, not cumbersome.

When you put it in a string, now you have to teach people the zen and art of the letter 'z'.

Your approach is cumbersome and requires training. Using DATES, using simple DATES - anyone should be able to retrieve the data.

And selecting

to_char(start_date, start_date_fmt), to_char(end_date, end_date_fmt)

does not seem cumbersome (heck - think 'view'). And inserting the format is not either (umm, you must have some editing in place, right? you don't after all accept 20081 as valid input do you? how do you prevent that - well, just replace all of that code with: collect a format and insert it. The DATE will be validated by the datatype, your format is one of YYYY YYYYMM YYYYMMDD YYYYMMDDHH24MISS - easy for a check constraint)


I still am not seeing your perceived complexity in getting data out that has DATES in it.

I don't know why you used start date there, you would use start date and end date to get things that happen in a range.

You had:

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'


(which frankly, the 'z' trick is confusing - unless you invented it)

if you want overlaps, you only need:

(start_date <= :p_ending) AND (end_date >= :p_beginning)

that is more clear than 'z'
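The check constraint Tom mentions for a stored display format could look like the sketch below. It assumes (hypothetically) that the poster's columns were converted to a real DATE plus a small format column; the strt_fmt name is illustrative:

```sql
-- Hypothetical: strt_dt as a real DATE, strt_fmt as its display precision
alter table sb_sample add ( strt_fmt varchar2(16) );

alter table sb_sample add constraint sb_strt_fmt_chk
  check ( strt_fmt in ('YYYY','YYYYMM','YYYYMMDD','YYYYMMDDHH24MISS') );

-- display exactly what was entered:
-- select to_char(strt_dt, strt_fmt) from sb_sample;
```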

February 22, 2008 - 3:30 pm UTC

Reviewer: A reader

if you want overlaps, you only need:

(start_date <= :p_ending) AND (end_date >= :p_beginning)


did i say that overlaps are a problem? not at all - it's the other query that takes advantage of varchar storage:


You had:

select * from sb_sample where
strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'


so please - show me how to rewrite this simple query using dates instead of varchar.



Tom Kyte

Followup  

February 22, 2008 - 3:35 pm UTC

tell us what that query does.

put it out in specification form. I'm not following the 'z' logic frankly.

because you know, it looks like "find me things that overlap with my range", apparently - it isn't.

In fact, it is what you said:

3) select records whose dates fall within a given range (used for monitoring changes of the data). 
eg:

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and 
strt_dt<=:date_str_param_to||'z'

(in 99% of cases the query is run with :date_str_param_to set to current date)


taking what you said in that, my predicate is what you use with dates.

February 22, 2008 - 3:42 pm UTC

Reviewer: dobromil kautsky na stroji husgvarna

ok, an example:

the user enters a range of dates: "from" and "to"
(again - they can enter just a year, year+month, date, date with time - but that's not important now)

now they want all records whose start_date (the same goes for end dates) is in that range, so, if they enter

'2006-07-21' - '2007-02-03'

they want all records whose start_date was between those two dates (including records where the start_date is '2006', '2006-07',etc.)

or they can enter '2006-10' - '2007'
and the same applies.

so if the date is '2006' it must be returned for all ranges that overlap with this year at least by a second.

Tom Kyte

Followup  

February 22, 2008 - 4:02 pm UTC

and that is what my predicate would return.

Your example has a from/to
Your front end gets a from/to

Your front end adjusts those from/to (just like you add your 'z')

So, if they enter from 2006 to 2007, then yes, your front end would be

2006-01-01-00:00:00 2007-12-31-23:59:59

I see that as being no different than 'z', your front end accepts data, and does the right thing - regardless.

so, if you have dates, this is not hard - certainly *not impossible or impracticable*

but like I said, you seem extremely happy with what you have, I'm not going to change that - you have 'z', you like it.

I look at this and say "the front end would easily be able to accept input dates - validate these inputs - and bind the correct value."

Heck, you could even bind STRINGS if you wanted.... A simple CASE statement in the where clause would convert them.

for the start date and then a different one for end date
case when length(:bind) = 4 /* must be yyyy */ then to_date ....


February 22, 2008 - 4:30 pm UTC

Reviewer: dobromil kautsky na stroji husgvarna

it has nothing to do with the front end and how the parameters are sent
look:

alter session set nls_date_format = 'yyyymmddhh24miss';

create table dates (
val_txt varchar2(20) not null,
val_dt date not null
);

insert into dates values ('2006','20060101');
insert into dates values ('200610','20061001');
insert into dates values ('20061021','20061021');
commit;



-- user input:
-- from = '2006-02-01', to '2006-10-02'
-- correct result:
select * 
  from dates 
 where val_txt||'z'>='20060201'
   and val_txt<'20061002'

-- wrong result
select *
  from dates
 where val_dt between '20060201' and '20061002235959'



it's all about the need for '2006' to fulfill "between '20060201' and '20061002'"

i can do that with dates - but a simple >= becomes a long chain of "or's"
Tom Kyte

Followup  

February 22, 2008 - 4:37 pm UTC

I said - the database table would always have

a) start_date
b) a way to derive END DATE - either by putting in there or putting an interval, you choose.

so, the table would have start_date, end_date

never 2006

it would have the first day of 2006, last day of 2006 - or first day of 2006 and an interval of one year.


AND IT IS ONE AND - no ORS

I'm not getting it.

start_date
end_date/interval

simple where clause

done.
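
A minimal sketch of the design just described (hypothetical table and column names; the end date is derived once, at insert time, from the precision the user entered):

```sql
create table events (
  start_dt date not null,  -- first instant of the interval the user entered
  end_dt   date not null   -- last instant, derived from the input's precision
);

-- user typed '2006'    -> start_dt = 2006-01-01 00:00:00, end_dt = 2006-12-31 23:59:59
-- user typed '200610'  -> start_dt = 2006-10-01 00:00:00, end_dt = 2006-10-31 23:59:59

-- "which records overlap :from_dt .. :to_dt" - one AND, no ORs
select *
  from events
 where start_dt <= :to_dt
   and end_dt   >= :from_dt;
```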

February 23, 2008 - 3:35 am UTC

Reviewer: dobromil kautsky na stroji husgvarna

i'm so stupid!

i think we still misunderstand each other - i have no problem with representing the interval (start_date, end_date), but to work with the one column (be it start_date, or end_date).

when i need to filter records based on their start_date (it is not related to end_date or any other column) - however i store the start_date representing '2006', i need to make sure that when i compare this column to another date value with ">" or ">=" (like in start_date>=:param), this start_date must not act as 2006-01-01 but as 2006-12-31 23:59:59, while at the same time, when i compare it with "<" or "<=", it must act as 2006-01-01.
(like in the example where the user wanted all records whose start_date is between '2006-02-01' and '2006-10-02' - when they search this interval, they want to see the records with start_date = '2006' too, which is '2006-01-01' when stored as a date and does not match this condition)

i was so biased against having each date (that needs this variable precision) in two columns that, not finding a simple solution in sql, i did not even think of using a function on which i could build an index (which is what i do with strt_dt||'z' anyway) and then use

getlastsecof(start_date,start_date_precision)>=:param

just like i used strt_dt||'z'
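
That function-based-index idea can be sketched roughly like this (getlastsecof and the val_precision column are hypothetical; the precision codes 'Y'/'M'/'D' are an assumed encoding):

```sql
create or replace function getlastsecof( p_dt date, p_precision varchar2 )
return date deterministic is   -- deterministic, so it can back an index
begin
    return case p_precision
             when 'Y' then add_months(trunc(p_dt, 'yyyy'), 12) - 1/86400  -- dec 31 23:59:59
             when 'M' then add_months(trunc(p_dt, 'mm'),    1) - 1/86400  -- last sec of month
             else          trunc(p_dt) + 1 - 1/86400                      -- last sec of day
           end;
end;
/

-- assumes the table also stores the precision the user entered
create index dates_lastsec_idx
    on dates( getlastsecof(val_dt, val_precision) );
```

Predicates of the form getlastsecof(val_dt, val_precision) >= :param can then use this index, just as the strt_dt||'z' expression did.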

maybe i needed to hear that storing the date in two columns is ok (more ok than using the string) to be able to see the obvious.

thanks :-)
Tom Kyte

Followup  

February 24, 2008 - 11:41 am UTC

... but to work with the one column (be it
start_date, or end_date). ..

and I find that to be a bad practice. You are encoding multiple attributes into a single string, needing to use the "z" trick. It is all tricks, and tricks are not good practice.


When you start encoding data in an attribute - you are doing something not good in general (in fact, I'm hard pressed to come up with a case whereby encoding data magically into an attribute was a good thing - I cannot).

You have to teach people how to query your data. That in itself is a bad thing. This 'z' trick - is just that, a neat trick - but it indicates a deeper issue - the data is munged in there in a tricky way.

That is why I do not like it, I like straightforward, simple, correct...

10g and varchar2 optimization

June 03, 2008 - 8:24 pm UTC

Reviewer: Andres from Estonia

about the original answer and especially the "more direct proof" above ( http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1542606219593#28397955862861 ).
It seems that the threshold at which the memory is allocated at once versus dynamically has moved in 10g from 1999/2000 to 4000/4001 (despite the documentation still claiming it to be 1999/2000). I mean, declaring
x varchar2(4000) := 'x'; takes 4000 bytes of memory
and
x varchar2(4001) := 'x'; takes 1 byte.
Now, when plsql deals with varchar2(4000) columns that are 50% empty, 49% less than 100 bytes and 1% several thousand bytes, would it be wiser to declare the variables as varchar2(4001) instead of table.column%type?
Tom Kyte

Followup  

June 04, 2008 - 9:24 am UTC

this is why I like to post these snippets - things change and it is nice to be able to verify that the advice of yesterday still does OR does not apply...

as for the last bit, it'll always be a trade off between performance and memory.

4001 - allocate less overall memory, but do so dynamically - might impose a runtime ping

4000 - allocate more overall memory, but run as fast as you can..
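
A hedged sketch of how one could observe that trade-off (assumes SELECT access to v$mystat/v$statname and the 10g behavior the reviewer describes; the exact numbers will vary by version and platform):

```sql
declare
    type t4000 is table of varchar2(4000) index by binary_integer;
    type t4001 is table of varchar2(4001) index by binary_integer;
    l_fixed   t4000;
    l_dynamic t4001;
    l_start   number;

    function session_pga return number is
        l_val number;
    begin
        select ms.value into l_val
          from v$mystat ms, v$statname sn
         where ms.statistic# = sn.statistic#
           and sn.name = 'session pga memory';
        return l_val;
    end;
begin
    l_start := session_pga;
    for i in 1 .. 10000 loop
        l_fixed(i) := 'x';       -- each element would preallocate the maximum size
    end loop;
    dbms_output.put_line( 'varchar2(4000) delta: ' || (session_pga - l_start) );

    l_start := session_pga;
    for i in 1 .. 10000 loop
        l_dynamic(i) := 'x';     -- each element allocates only what it needs
    end loop;
    dbms_output.put_line( 'varchar2(4001) delta: ' || (session_pga - l_start) );
end;
/
```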

Char vs. varchar storage for Null

June 19, 2008 - 7:00 pm UTC

Reviewer: Nitin from Boston, MA USA

Tom, This is extremely informative - but I couldn't definitively derive the answer to something that's key to some database size issues we're facing. I get that character data pads up to the defined size - so characters are bad. But then there are third party apps that you can't change over to varchar :-)
So the question is - if I predominantly only have NULL values in a character column, will it still take up the entire padded space storage wise? Or are NULLs a special condition under which char and varchar take up the same amount of storage space on disk?
Tom Kyte

Followup  

June 19, 2008 - 8:04 pm UTC

if null, they will consume just the space needed to indicate "null"

ops$tkyte%ORA10GR2> create table t ( x char(1000) );

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t select null from all_objects;

49738 rows created.

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2> exec show_space( 'T' );
Unformatted Blocks .....................               0
FS1 Blocks (0-25)  .....................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               5
Full Blocks        .....................              75
Total Blocks............................              88
Total Bytes.............................         720,896
Total MBytes............................               0
Unused Blocks...........................               0
Unused Bytes............................               0
Last Used Ext FileId....................               4
Last Used Ext BlockId...................          22,753
Last Used Block.........................               8

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> truncate table t;

Table truncated.

ops$tkyte%ORA10GR2> insert into t select 'x' from all_objects;

49738 rows created.

ops$tkyte%ORA10GR2> exec show_space( 'T' );
Unformatted Blocks .....................               0
FS1 Blocks (0-25)  .....................               2
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               1
FS4 Blocks (75-100).....................               4
Full Blocks        .....................           7,103
Total Blocks............................           7,296
Total Bytes.............................      59,768,832
Total MBytes............................              57
Unused Blocks...........................              64
Unused Bytes............................         524,288
Last Used Ext FileId....................               4
Last Used Ext BlockId...................          56,457
Last Used Block.........................              64

PL/SQL procedure successfully completed.


Char Vs Varchar2

January 08, 2009 - 4:29 am UTC

Reviewer: A reader

Hi Tom,

Could you please let me know, if there is any benefit in having CHAR(1) instead of VARCHAR2(1) in the database? i.e., in relation to performance, memory structure, execution plan etc.

Regards,
Tom Kyte

Followup  

January 08, 2009 - 9:24 am UTC

nope, they would be identical.


so my advice is "do not use char(), it would just cause confusion, pretend char() does not exist"

testcase for timestamp datatype

May 25, 2009 - 6:09 am UTC

Reviewer: jvs from India

Dear Tom,
good day to you as always. If you can, please comment on the below test case to prove that the timestamp datatype should be used to store timestamp values, and not the varchar2 datatype.

create table testts (ts_str varchar2(24),ts_ts timestamp);

insert into testts select to_char(to_timestamp('31121995135022833459','ddmmyyyyhh24missff') + numtodsinterval(rownum,'DAY'),'ddmmyyyyhh24missff6') ts_str,
to_timestamp('31121995135022833459','ddmmyyyyhh24missff') + numtodsinterval(rownum,'DAY') from all_objects;

commit;

create index idx_ts_ts on testts(ts_ts);

create index idx_ts_str on testts(ts_str);


execute dbms_stats.gather_table_stats(user,'TESTTS',estimate_percent => null,cascade => true);

select * from testts where ts_str between '01011996135022833459' and '02021996135022833459';
-- Card and consistent gets will be high

select * from testts where to_timestamp(ts_str,'ddmmyyyyhh24missff9') between to_timestamp('01011996135022833459','ddmmyyyyhh24missff9') and to_timestamp('02021996135022833459','ddmmyyyyhh24missff9');
-- Card may be fine but consistent gets will be high

select * from testts where ts_ts between to_timestamp('01011996135022833459','ddmmyyyyhh24missff') 
and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');
-- Card and consistent gets will be fine.


I have not pasted the autotrace output, but overall, is the test case fine?

Thanking you for your time on this.

Regards,
jvs
Tom Kyte

Followup  

May 26, 2009 - 9:46 am UTC

... good day to you as always. If you can, please comment on the below test case to prove that the timestamp datatype should be used to store timestamp values, and not the varchar2 datatype. ...

you need no testcase for that.

it is called common sense, logic, doing it correctly.


if you use a string/number to store a date you will get corrupt data some day (you will, not "you might", but rather YOU WILL. Someone will insert feb-29, 2009 - some day. IT WILL happen)

if you use a string/number to store a date, you will consume MORE storage. It takes more space.

If you use a string/number to store a date, you will be plagued with conversion errors.

if you use a string/number to store a date, you WILL confuse the optimizer - causing it to incorrectly estimate cardinalities. This will result in incorrect plans and poor performance (funny, you know what type of systems typically do this extremely poor practice? data warehouses - precisely the systems whereby good query plans are VITAL)

ops$tkyte%ORA10GR2> create table testts (ts_str varchar2(24),ts_ts timestamp);

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into testts select
  2  to_char(to_timestamp('31121995135022833459','ddmmyyyyhh24missff') +
  3  numtodsinterval(rownum,'DAY'),'ddmmyyyyhh24missff6') ts_str,
  4  to_timestamp('31121995135022833459','ddmmyyyyhh24missff') +
  5  numtodsinterval(rownum,'DAY')
  6  from dual connect by level <= 65000;

65000 rows created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index idx_ts_ts on testts(ts_ts);

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index idx_ts_str on testts(ts_str);

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> execute dbms_stats.gather_table_stats(user,'TESTTS');

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @at
ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from testts where ts_str between '01011996135022833459' and '02021996135022833459';

Execution Plan
----------------------------------------------------------
Plan hash value: 3830092671

------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |  2181 | 69792 |    28   (4)| 00:00:
|*  1 |  TABLE ACCESS FULL| TESTTS |  2181 | 69792 |    28   (4)| 00:00:
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TS_STR"<='02021996135022833459' AND
              "TS_STR">='01011996135022833459')

<b>full scan, prompted by the optimizer guessing 2,181 rows... there are lots of possible strings between those two strings...</b>


ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from testts where to_timestamp(ts_str,'ddmmyyyyhh24missff9') between
  2  to_timestamp('01011996135022833459','ddmmyyyyhh24missff9') and
  3  to_timestamp('02021996135022833459','ddmmyyyyhh24missff9');

Execution Plan
----------------------------------------------------------
Plan hash value: 3131276641

------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |   163 |  5216 |    34  (21)| 00:00
|*  1 |  FILTER            |        |       |       |            |
|*  2 |   TABLE ACCESS FULL| TESTTS |   163 |  5216 |    34  (21)| 00:00
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24missff9')
              <=TO_TIMESTAMP('02021996135022833459','ddmmyyyyhh24missff9
   2 - filter(TO_TIMESTAMP("TS_STR",'ddmmyyyyhh24missff9')>=TO_TIMESTAMP
              ('01011996135022833459','ddmmyyyyhh24missff9') AND
              TO_TIMESTAMP("TS_STR",'ddmmyyyyhh24missff9')<=TO_TIMESTAMP
              22833459','ddmmyyyyhh24missff9'))

<b>full scan prompted by lack of index - and if you say "function based index" then I say "so, you store a string and then to_timestamp it in an index - why? why???? why do it wrong??"

and the estimated card= value is still off - better, but off</b>


ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from testts where ts_ts between
  2  to_timestamp('01011996135022833459','ddmmyyyyhh24missff')
  3  and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');

Execution Plan
----------------------------------------------------------
Plan hash value: 2730030766

------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |    10 |   320 |     3
|*  1 |  FILTER                      |           |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| TESTTS    |    10 |   320 |     3
|*  3 |    INDEX RANGE SCAN          | IDX_TS_TS |    10 |       |     2
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24missff')<
              AMP('02021996135022833459','ddmmyyyyhh24missff'))
   3 - access("TS_TS">=TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24
              AND "TS_TS"<=TO_TIMESTAMP('02021996135022833459','ddmmyyyy

<b> use the right type, get the right cardinality, get the right plan</b>


ops$tkyte%ORA10GR2> set autotrace off
ops$tkyte%ORA10GR2> select count(*) from testts where ts_ts between
  2  to_timestamp('01011996135022833459','ddmmyyyyhh24missff')
  3  and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');

  COUNT(*)
----------
        33


thanks for your inputs

May 26, 2009 - 12:20 pm UTC

Reviewer: jvs from India

Dear Tom,
good day to you. Thanks a lot for your time and help on this - I can now use the test case to prove the point to people who don't have the common sense to understand proper data type usage.

Thanks and Regards,
jvs

char > varchar2 Referential integrity

May 26, 2009 - 5:58 pm UTC

Reviewer: FMedina from Spain

Hello Tom,

I have the following problem:

-- 1/
CREATE TABLE FIELD_CHAR_MASTER(FIELD_PARENT CHAR(12));

-- 2/
CREATE TABLE FIELD_VARCHAR2_DETAIL(FIELD_CHILD VARCHAR(12));

-- 3/
ALTER TABLE FIELD_VARCHAR2_DETAIL ADD (
CONSTRAINT MASTER_DETAIL FOREIGN KEY (FIELD_CHILD)
REFERENCES FIELD_CHAR_MASTER (FIELD_PARENT));

-- 4/
ORA-02267: column type incompatible with referenced column type


The FIELD_CHAR_MASTER table is big and is used by a lot of objects and queries. It has other child tables with the CHAR(12) datatype. It is not "easy" to alter these tables.

Is there any way to implement that referential integrity?

Should I use the CHAR datatype in the future if another child table is necessary?

[ Database Version: 9.2.0.8.0 ]

Thanks in advance
Tom Kyte

Followup  

May 27, 2009 - 8:04 am UTC

you should never have used the char() type at all - but since you did and since it is the parent, you'll need to modify your CHILD table to have an appropriate type.

Of course you should use a char type in the future if referential integrity is to be supported, you have no other choice (other than to change everything else to be a varchar2)
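
A sketch of the modification Tom describes - since the parent key is CHAR(12), the child column must match before the constraint can be created (this assumes FIELD_CHILD's existing values are compatible with blank-padded CHAR semantics):

```sql
-- align the child column's type with the CHAR(12) parent
alter table field_varchar2_detail modify (field_child char(12));

-- the foreign key from step 3 now succeeds
alter table field_varchar2_detail add (
  constraint master_detail foreign key (field_child)
  references field_char_master (field_parent));
```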

CHAR to VARCHAR in Stored Procedure param?

June 23, 2009 - 2:12 pm UTC

Reviewer: Nigel Olding from California

When I changed my stored procedure parameter to use the datatype of the column in the table from which the data was being retrieved, my CHAR(1) column was returned as a maximum-sized VARCHAR2 to my calling PL/SQL.

Using this table...

create table NTEST_CHAR1
(
COLUMN1 CHAR(1) not null,
COLUMN2 CHAR(1) not null,
COLUMN3 CHAR(1) not null
);

...with this data...

insert into ntest_char1 n
values ('1','A','a');

...this procedure...

CREATE OR REPLACE PROCEDURE test_char1(p_col2 OUT ntest_char1.column2%TYPE) IS
BEGIN
SELECT n.column2 INTO p_col2 FROM ntest_char1 n WHERE rownum = 1;

END test_char1;

...returns a max-sized VARCHAR2...

begin
declare l_col2 varchar(32000);

begin

test_char1(l_col2);

dbms_output.put_line(length(l_col2));

end;
end;

Output is:
32000

Is this the expected behaviour? Shouldn't it be 1 character?
If I declare the stored proc param as OUT VARCHAR2, it is returned as 1 character...

Oracle 9.2.0.1.0

Thanks,
Tom Kyte

Followup  

June 26, 2009 - 9:34 am UTC

plsql parameters are always unconstrained types - this is the way it was designed to work, yes.

Reason #3213 to never never never use the CHAR type

char(1) vs. varchar2(1)

June 25, 2009 - 5:26 pm UTC

Reviewer: David from Austin, TX

Tom,
if char(1) and varchar2(1) are really the same thing in terms of storage and performance, what's the reason to keep the CHAR data type in Oracle?

Thanks,
David

Tom Kyte

Followup  

June 26, 2009 - 10:53 am UTC

none, except the standards.

char(1) vs. varchar2(1)

June 26, 2009 - 1:46 pm UTC

Reviewer: David from Austin, TX

when responding to my question about the reasons to keep CHAR data type in Oracle, you said
"none, except the standards.". What Standards are you referring to? Also, do you think it's appropriate to eliminate the use of CHAR in table design in my own organization? Thanks, David

Tom Kyte

Followup  

June 26, 2009 - 2:30 pm UTC

ANSI-ISO SQL Standards


I have written many times "pretend char does not exist, use only varchar2"

June 30, 2009 - 2:03 pm UTC

Reviewer: Chuck Jolley from OKC, OK USA

When we upgraded from 8i to 10g we converted all of our char columns and pl/sql variables to varchar2, because some of our code broke in relation to chars and it seemed too much trouble to go through tens of thousands of lines of code and hundreds of tables testing for obscure data conversion and comparison errors just to upgrade.
Our tables and code base started in Oracle 7.
The process was mostly scriptable and relatively painless.
The only place we had to do hand modifications were a few lines of code in our front end that expected padded data as a workaround for the behavior of chars in the first place.

Not a high cost to never have to worry about something again.


varchar2 (char) to varchar2 (byte)

July 21, 2009 - 11:46 pm UTC

Reviewer: Stuart from Auckland, New Zealand

Hi Tom,

I have a 10.2.0.4 database (character set=UTF8, nls_length_semantics=byte), where the developer created a table as follows:

create table xxxx
( column_a_id number,
column_b_text varchar2(255 char),
column_c_text varchar2(255 char),
...
...

Unfortunately, the user querying a view on this base table shows the data like it is of CHAR datatype (with lots of trailing spaces). Researching this, I found the tool has a bug which reads as follows: ( http://support.sas.com/kb/20/088.html )

"When you access an Oracle database from SAS® in single-byte encoding and set the database characterset to UTF8, the SAS/ACCESS Interface to Oracle might return incorrect byte lengths for character columns that have lengths defined in CHAR semantics (lengths are usually much larger than needed).

This problem occurs when using SAS/ACCESS Interface to Oracle for CHAR and VARCHAR2 columns and reflects byte lengths based on the database characterset and not the client-side (SAS) encoding. For example, for a VARCHAR2(3 CHAR) column in an Oracle UTF8 database, the byte length surfaced in SAS is tripled (9). So all the CHAR and VARCHAR2 column lengths are tripled in SAS even though the client-side encoding is single byte. "

So Oracle is fine. Doing a length() function on the data shows the correct length, which confirms I'm hitting this bug.

Rather than adding the suggested hotfix to all the PC's, or changing the table structure to varchar2(byte), I'd prefer to update the existing view on this table.

Question 1:
Is it possible to do a conversion in the SQL for the view creation where it converts varchar2 char to byte?

Question 2:
Although our developer is adamant we don't store any multibyte data, I'm loath to go about changing the varchar2 datatype semantics across our schema. Just out of interest, if I did go down this track (I'd install the hotfix rather than this option), would there be any ramifications apart from the possibility of truncating any data that might be multi-byte?

If I have to install the hotfix, then so be it.
Tom Kyte

Followup  

July 26, 2009 - 6:27 am UTC

Question 1:
Is it possible to do a conversion in the SQL for the view creation where it
converts varchar2 char to byte?


when you tested, what did you discover? In any case - you would need to substr AND cast in the view. varchar2(255 char) in a UTF8 database can be a lot larger than varchar2(255 byte)
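
A sketch of such a view (hypothetical names; substrb guards against a multi-byte value overflowing the byte-sized target, and the cast changes the length semantics the client sees):

```sql
create or replace view xxxx_v as
select column_a_id,
       cast( substrb(column_b_text, 1, 255) as varchar2(255 byte) ) as column_b_text,
       cast( substrb(column_c_text, 1, 255) as varchar2(255 byte) ) as column_c_text
  from xxxx;
```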


Although our developer is adamant we don't store any multibyte data

so, they have run a scan of the entire database, right? Else, they do not know.

You would/could truncate data - yes.

varchar2 (char) to varchar2 (byte)... continued

July 22, 2009 - 12:01 am UTC

Reviewer: Stuart from Auckland, New Zealand

Further to the previous post,...

I've seen a few base tables which have a mix of:
column_name_a varchar2(99 byte)
column_name_b varchar2(99 char)

So it looks like the hotfix won't work, as it specifically states in the caveats that the hotfix will fail with tables created with mixed semantics.

So it looks like I either
a. update the tables to be varchar2 byte semantics
b. somehow update the views to convert varchar2 char to byte semantics

If you can advise on my previous question, it would be much appreciated.
Tom Kyte

Followup  

July 26, 2009 - 6:28 am UTC

it'll be up to you as to the approach you want to take on this one.

Modifying varchar2 column to increase size

September 11, 2009 - 2:58 pm UTC

Reviewer: KK from india

Tom,

We have a requirement to increase the size of a varchar2 column. The table is a partitioned table with billions of rows. As varchar2 allocates space according to the inserted value, will this operation affect only the data dictionary? I mean, will this change be quick - modifying the data dictionary alone and not touching any of the tablespaces which hold the data right now?
Tom Kyte

Followup  

September 14, 2009 - 1:07 pm UTC

it won't touch any existing data, it just modifies the data dictionary.
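
For example (hypothetical table and column names) - widening the column is a dictionary-only change and returns immediately regardless of table size:

```sql
-- allowed and instant; shrinking below the existing data, by contrast,
-- would require the column to be empty or a reorganization
alter table big_part_tab modify (some_col varchar2(4000));
```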

strange behaviour of varchar and char

October 07, 2009 - 7:12 am UTC

Reviewer: MAYANK GUPTA from Delhi,INDIA

Hi Tom,
I used the following steps:

drop table a cascade;
create table a (a varchar(5),b char(5));

insert into a values('abc','abc');
insert into a values('xyz ','xyz '); --trailing space after z in both cases.

select * from a where a=b;

POSTGRES:

a | b
------+-------
abc | abc
xyz | xyz
(2 rows)


ORACLE 10g:

No data Found

and if I use the same datatype for both columns, then Oracle returns results

Regards:
Mayank Gupta
cont: 09711065568
Tom Kyte

Followup  

October 08, 2009 - 7:32 am UTC

<quote src=Expert Oracle Database Architecture>

It is for this reason – the fact that a CHAR/NCHAR is really nothing more than a VARCHAR2/NVARCHAR2 in disguise – that makes me of the opinion there are really only two character string types to ever consider – VARCHAR2 and NVARCHAR2. I have never found a use for the CHAR type in any application. Since a CHAR type always blank pads the resulting string out to a fixed width, we discover rapidly that it consumes maximum storage both in the table segment as well as any index segments. That would be bad enough, but there is another important reason. It creates confusion in applications that need to retrieve this information (many cannot “find” their data after storing it). The reason has to do with the rules of character string comparison, and the strictness with which they are performed. Let’s use ‘Hello World’ string in a simple table to demonstrate:

ops$tkyte@ORA10G> create table t
  2  ( char_column      char(20),
  3    varchar2_column  varchar2(20)
  4  )
  5  /
Table created.
ops$tkyte@ORA10G> insert into t values ( 'Hello World', 'Hello World' );
1 row created.
ops$tkyte@ORA10G> select * from t;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World
ops$tkyte@ORA10G> select * from t where char_column = 'Hello World';
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World
ops$tkyte@ORA10G> select * from t where varchar2_column = 'Hello World';
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World


So far they look identical, they appear to be the same - but in fact, there have been some implicit conversions happening - the CHAR(11) literal was promoted to a CHAR(20) and blank padded when compared to the CHAR column. We see for a fact that these two strings are materially different:
ops$tkyte@ORA10G> select * from t where char_column = varchar2_column;
no rows selected

They are not equal to each other. We would have to either blank pad out the VARCHAR2_COLUMN to be 20 bytes in length or we would have to trim the trailing blanks from the CHAR_COLUMN:
ops$tkyte@ORA10G> select * from t where trim(char_column) = varchar2_column;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World
ops$tkyte@ORA10G> select * from t where char_column = rpad( varchar2_column, 20 );
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World


NOTE: there are many ways to blank pad the varchar2_column, such as using the CAST() function.


The problem comes in with applications. Many of them will in fact utilize varying length strings when they bind inputs – with the resulting “no data found” that is sure to follow. For example:
ops$tkyte@ORA10G> variable varchar2_bv varchar2(20)
ops$tkyte@ORA10G> exec :varchar2_bv := 'Hello World';
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> select * from t where char_column = :varchar2_bv;
no rows selected
ops$tkyte@ORA10G> select * from t where varchar2_column = :varchar2_bv;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World

So here, the search for the VARCHAR2 string worked but the CHAR column did not. The varchar2 bind variable will not be promoted to a CHAR(20) as the character string literal is. At this point, many programmers arrive at the opinion that "bind variables don't work, we have to use literals". That would be a very bad decision indeed. So, the solution is to bind using a CHAR type:
ops$tkyte@ORA10G> variable char_bv char(20)
ops$tkyte@ORA10G> exec :char_bv := 'Hello World';
 
PL/SQL procedure successfully completed.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select * from t where char_column = :char_bv;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World
ops$tkyte@ORA10G> select * from t where varchar2_column = :char_bv;
no rows selected

Only partially - if you mix and match VARCHAR2 and CHAR, you'll be running into this issue constantly. Not only that, but the developer now has to build the field width into the application if they opt for the rpad() trick to convert the bind variable into something comparable to the CHAR field (it is preferable, of course, to pad out the bind variable rather than TRIM the database column!). Now if the size of the field changes - the application is impacted, it must change its field width.

It is for these reasons – the fixed width storage, which tends to make the tables and related indexes much larger than normal, coupled with the bind variable issue that makes me avoid the CHAR type in all circumstances. I cannot even make an argument for it in the case of the 1 character field – for in that case, it is really of no material difference. The VARCHAR2(1) and CHAR(1) are identical in all aspects. There is no compelling reason to use the CHAR type in that case and in order to avoid any confusion – I just say “no” for even the CHAR(1) field.


</quote>

pga_aggregate_target

October 12, 2009 - 4:33 am UTC

Reviewer: Gurudarshan G P from india

Hi Tom,
I would like to ask where the PGA is allocated when the user process starts - on the server side or on the client side? And if we don't set a pga_aggregate_target value, how do we check the allocated PGA memory?

Awaiting your earliest reply.



What about MD5 sums that are always 32 characters?

November 18, 2009 - 12:07 pm UTC

Reviewer: Owen from Denver, USA

I'm one of those ignorant developers who has the misguided notion of char being better for fixed lengths. Please set me straight:

We're storing MD5 checksums which are always, no matter what, exactly 32 characters. Is it still better to use varchar2(x) instead of char(32)? The column is currently defined as a varchar2(300) because the dba says 'it doesn't matter just as long as it is long enough' - is this true? Even if varchar2 wins over char, it seems to me it needs to be varchar2(32) because 32 needs only 5 bits to store/handle the length but 300 needs 9. We use WHERE clauses against this column and there are over 100M rows, so we need to squeeze the performance as much as possible. Right now it's slower than we need it to be.

PS - There's no index because the values are unique and a lot of other columns need to be returned so such an index would have 1:1 with the table and take practically just as much space. Is this a correct analysis?

Thanks for any tips!
Tom Kyte

Followup  

November 23, 2009 - 3:08 pm UTC

a char(n) is simply a varchar2(n) blank padded to be always N characters.

So, if you store something that is always 32 characters - a char(32) and a varchar2(32) are *identical*. Identical on disk, identical in memory.

Which leads me to turn the question around - is it better to use a char(32)? What do you think you would gain by it?

I believe you should not use char(n) ever - for the simple reason that it makes you forget they exist and you are not ever tempted to use them incorrectly.


forget about bits, nothing happens at the bit level here. The leading length indicator will be 1 byte for 255 or less and 2 bytes for anything else.


I believe your DBA is mistaken however in their statement - you want a datatype to be as RESTRICTIVE as possible, the only thing you'll get by using varchar2(300) to hold a field that can only be varchar2(32) is - someday someone will put 300 characters in there - and then what? data integrity - goodbye.


The stupid thing is - we could alter a varchar2(32) to be a varchar2(300) if the need to do so arises. However, to alter a varchar2(300) to the correct length of 32 would take a reorganization of the entire table. 300 was wrong, 32 is correct and a char(32) would provide no benefit over a varchar2(32) and since you never really want to remember char exists - do not use it.

... PS - There's no index because the values are unique and a lot of other columns
need to be returned so such an index would have 1:1 with the table and take
practically just as much space. Is this a correct analysis?
...

If this is an md5 checksum and the values must be unique - I don't really CARE how much space it takes. What will happen to your application when (not if, but when) a duplicate value occurs? You know that the space of 32-character md5 checksums is FINITE, right (it is big, but finite)? Your inputs can and will generate duplicate hashes - it will happen someday - and then what?

If the values must be unique - enforce it with a unique constraint. Space be darned.

If you access this data by the checksum - index it for performance reasons.
If you need this to be unique - put the constraint on.
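Putting those two recommendations together, a minimal sketch (assuming a hypothetical table T with a CHECKSUM column):

```sql
-- enforce uniqueness; the constraint creates a unique index
-- that also serves lookups by checksum
alter table t
  add constraint t_checksum_uk unique (checksum);

-- queries of this form can now use that index:
-- select ... from t where checksum = :x
```

One constraint gets you both the data integrity and the access path.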

Assigning a clob value to varchar2

July 19, 2010 - 4:49 am UTC

Reviewer: Ian from Sydney, Australia

Hi Tom,

A little off-topic but....

On 10gR2, I have a column in a table stored as a CLOB; it holds a SQL "select" string. I want to retrieve the value in this column and assign it to a varchar2 in order to execute the SQL string dynamically. I get an error assigning the CLOB to a varchar2 when I do this within a FOR LOOP.

SQL> desc master_heading_filters;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 MAHF_ID                                   NOT NULL NUMBER(10)
 MAHF_CURR_HEADING_ID                      NOT NULL NUMBER(10)
 MAHF_NEXT_QUESTION                                 VARCHAR2(20)
 MAHF_ORDER                                         NUMBER
 MAHF_QUERY                                         CLOB
 MAHF_VALID_QUERY                                   VARCHAR2(1)
 MAHF_CREATED_DATE                                  DATE
 MAHF_MODIFIED_DATE                                 DATE

SQL> select length(mahf_query) length
  2  from   master_heading_filters mahf
  3  where  mahf.mahf_id = 1042;

    LENGTH
----------
     15163

SQL> 
SQL> declare
  2    v_sqlStr  varchar2(32000 char);
  3    v_exists  varchar2(1);
  4    
  5    type cur_type is ref cursor;
  6    c cur_type;
  7  
  8    v_runStr varchar2(1000) := '  select mahf_query
  9    from   master_heading_filters mahf
 10    where  mahf.mahf_id = 1042';
 11    
 12    cursor c1 
 13    is
 14    select mahf_query
 15    from   master_heading_filters mahf
 16    where  mahf.mahf_id = 1042;  
 17     
 18  begin
 19    open c for v_runStr; 
 20      loop 
 21        fetch c into v_sqlStr;
 22        EXIT WHEN c%NOTFOUND;
 23        dbms_output.put_line('01 length(v_sqlStr)='||length(v_sqlStr) );
 24      end loop;
 25    close c;
 26  
 27    for rec in c1 loop
 28      v_sqlStr := rec.mahf_query;
 29      dbms_output.put_line('02 length(v_sqlStr)='||length(v_sqlStr) );
 30    end loop;
 31    
 32  end ;
 33  /
01 length(v_sqlStr)=15163
declare
*
ERROR at line 1:
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at line 28

My question is, how come it works for the ref cursor and not the other? As far as I can see they are the same.

What am I missing here?

Thanks!

Tom Kyte

Followup  

July 19, 2010 - 2:05 pm UTC

ops$tkyte%ORA10GR2> create table master_heading_filters( mahf_query clob );

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
  2          l_str long := rpad( '*', 15163, '*' );
  3  begin
  4          insert into master_heading_filters values (l_str);
  5  end;
  6  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> declare
  2    v_sqlStr  varchar2(32000 char);
  3    v_exists  varchar2(1);
  4
  5    type cur_type is ref cursor;
  6    c cur_type;
  7
  8    v_runStr varchar2(1000) := '  select mahf_query
  9    from   master_heading_filters mahf';
 10
 11    cursor c1
 12    is
 13    select mahf_query
 14    from   master_heading_filters mahf;
 15
 16  begin
 17    open c for v_runStr;
 18      loop
 19        fetch c into v_sqlStr;
 20        EXIT WHEN c%NOTFOUND;
 21        dbms_output.put_line('01 length(v_sqlStr)='||length(v_sqlStr) );
 22      end loop;
 23    close c;
 24
 25    for rec in c1 loop
 26      v_sqlStr := rec.mahf_query;
 27      dbms_output.put_line('02 length(v_sqlStr)='||length(v_sqlStr) );
 28    end loop;
 29
 30  end ;
 31  /
01 length(v_sqlStr)=15163
02 length(v_sqlStr)=15163

PL/SQL procedure successfully completed.



I need a way to reproduce. Character set information will be crucial as well.
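As an aside, when a CLOB really can exceed what a PL/SQL varchar2 can hold, a common defensive pattern is to take only as much as the variable allows. A hedged sketch using DBMS_LOB.SUBSTR against the poster's table (the id value is taken from the question above):

```sql
declare
  v_sqlStr varchar2(32767);
begin
  -- take at most 32767 bytes, starting at offset 1
  select dbms_lob.substr( mahf_query, 32767, 1 )
    into v_sqlStr
    from master_heading_filters mahf
   where mahf.mahf_id = 1042;
  dbms_output.put_line( 'length=' || length(v_sqlStr) );
end;
/
```

This silently truncates anything past 32767 bytes, so it is only appropriate when truncation is acceptable or the LOB length is checked first.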

Performance of VARCHAR2 vs CHAR

July 01, 2011 - 4:06 am UTC

Reviewer: Nand Kishore Sagi from Hyderabad, India

Hi Tom,

In our application the account_id is of the format "abc-def-ghi" that is the length of the account_id would always be 11 characters (including hyphens).

In this scenario what datatype would you suggest that we use for both at table level and at PL/SQL level? Would there be a performance impact if we declared the columns as VARCHAR2(11 char) vs CHAR(11 char)?

Thanks a lot for your help.

Thanks and Regards
Nand Kishore Sagi
Tom Kyte

Followup  

July 01, 2011 - 9:22 am UTC

varchar2(11) is what I would use.

a char(11) is simply a varchar2(11) that is blank padded if it is less than 11 characters.

Since you are always putting 11 characters in there - they are exactly and precisely the same - 100% the same. Both would be stored identically.


And I don't like char at all - ever.

Hi...

August 16, 2011 - 10:31 am UTC

Reviewer: Sambhav from USA


Dear Tom,

How does Oracle internally process the queries below:

Select * from XYZ where Column_name between
'12345' and '987';

Select * from XYZ where Column_name between
'12345' and 'ABCE';

Column_name's datatype is VARCHAR2.

When I run these I don't get the expected output, but if the column is a NUMBER datatype it works fine.

We wanted to know how Oracle internally compares string values.

Thanks
Sambhav

Tom Kyte

Followup  

August 16, 2011 - 5:30 pm UTC

It compares strings using your collating sequence - based on your NLS settings.

But in general, it compares strings in the way all computer based software does. You are getting the expected output from a string compare, you just don't realize it.

ops$tkyte%ORA11GR2> select * from dual where '100' >= '12';

no rows selected

ops$tkyte%ORA11GR2> select * from dual where 100 >= 12;

D
-
X

ops$tkyte%ORA11GR2> 



So, in the above, when you compare '100' to '12', they are compared character by character until the strings differ - then we can tell which one is "greater".

Since '1' = '1', we move on to compare '0' to '2'. '0' (the string) is less than '2' (the string). Hence, in a character string comparison, '100' is LESS THAN '12' - for the same reason that 'BAA' is less than 'BC' (I used A=0, B=1, C=2 and just did a replace on those characters).

If you understand why 'BAA' is less than 'BC', you can understand why '100' is less than '12'. '100' is NOT A NUMBER, it is a string, and it is compared to any other string as a string.

This is not how "Oracle" compares strings - this is how pretty much all software on the planet compares strings.
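If the intent really is a numeric comparison of numeric-looking strings, the usual fix is to convert explicitly - a sketch against the poster's table (this assumes every stored value is a valid number; otherwise TO_NUMBER raises ORA-01722):

```sql
-- compare as numbers, not strings
select *
  from XYZ
 where to_number(Column_name) between 987 and 12345;
```

The better long-term fix is to store numbers in a NUMBER column in the first place, so no conversion is needed at all.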

@Tom

August 17, 2011 - 10:38 am UTC

Reviewer: Sambhav from USA

Thanx a lot Tom

Update on the Performance Test of Kevin

December 04, 2014 - 9:15 am UTC

Reviewer: Adrian Shum from Hong Kong

Just an update for the performance test raised by Kevin O'Keeffe:

Once I slightly modified the test so that the original insert operation inserts 20-character data, and the update also uses a 20-character value (to simulate the only seemingly reasonable use case for CHAR - a column that always contains fixed-width data), there is no obvious performance difference between CHAR and VARCHAR2:

Original Test Result using Kevin's test:
Elapsed: 00:00:01.616 <- Update using VARCHAR2
Elapsed: 00:00:00.435 <- Update using CHAR

Updated Test Result with fixed width data:
Elapsed: 00:00:00.322 <- Update using VARCHAR2
Elapsed: 00:00:00.342 <- Update using CHAR

Obviously the earlier performance difference was caused by the reallocation of storage. However, given that the only case where it is reasonable to use CHAR is when the data is always fixed width, there is no benefit in using CHAR over VARCHAR2 based on my testing results.

I will say, forget about CHAR. Stick to VARCHAR2 (at least for Oracle)
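For anyone who wants to reproduce a comparison like this, a minimal sketch of the kind of test described - note the table names, row count, and timing approach here are assumptions for illustration, not the original script:

```sql
set timing on

create table t_vc ( x varchar2(20) );
create table t_ch ( x char(20) );

-- load both tables with fixed-width 20-character data
insert into t_vc select rpad('x',20,'x') from dual connect by level <= 100000;
insert into t_ch select rpad('x',20,'x') from dual connect by level <= 100000;
commit;

-- update with another fixed-width 20-character value;
-- with no size change, neither column type needs row migration
update t_vc set x = rpad('y',20,'y');
update t_ch set x = rpad('y',20,'y');
```

With fixed-width values on both sides of the update, the elapsed times should come out close, as in the results above.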

