Home>Question Details



Charanjiv -- Thanks for the question regarding "Char Vs Varchar", version 8.1.5

Submitted on 3-Mar-2001 5:26 Central time zone
Last updated 23-Nov-2009 15:08

You Asked

Hi, 

I am a little confused here, as far as I understood, the VARCHAR2 datatype was supposed 
use memory equaling to the data in it. But the followig lines from "PL/SQL User’s Guide 
and Reference" Version 8.1.5 for NT tells a different story.

" However, for a VARCHAR2(< 2000) variable, PL/SQL preallocates enough memory to hold a 
maximum-size value. So, for example, if you assign the same 500-byte value to a 
VARCHAR2(2000) variable and to a VARCHAR2(1999) variable, the latter uses 1499 bytes more 
memory. " 

Does this mean there is no difference between a CHAR(1999) and VARHAR2(1999) 

Charanjiv 

 

and we said...

A varchar2 datatype, when stored in a database table, uses only the space allocated to 
it.  If you have a varchar2(1999) and put 50 bytes in the table, we will use 52 bytes 
(leading length byte).

A char datatype, when stored in a database table, always uses the maximum length and is 
blank padded.  If you have char(1999) and put 50 bytes into it, it will consume 2001 
bytes (leading length field is present on char's as well).

In the database -- a CHAR is a VARCHAR that is blank padded to its maximum length.


In PLSQL, what they are saying is that for optimization purposes -- a varchar2 less then 
2000 bytes will allocate a fixed amount of memory.  It will preallocate the maximum size 
upon instantiation.  This cuts down on the reallocations as different sized strings are 
assigned to it.  On the other hand, a varchar2 greater then 2000 characters will truely 
be varying sized and as you assign different sized strings to it it will grow dynamically 
as needed.

One takes a little more ram (if you never use the entire string) and performs a little 
better, the other uses only what ram it needs but if you cause the string to grow 
frequently will not perform as well due to the reallocs to resize the buffer.


There are great differences between a char(1999) and varchar2(1999) as the CHAR is always 
1999 characters long physically AND logically.  the varchar is physically 1999 bytes but 
logically can be any size in between 0 and 1999.  The char is always blank padded, not so 
for the varchar.

You can measure this for yourself to see this effect:

tkyte@TKYTE816> declare
  2      x varchar2(1999);
  3      l_start number default dbms_utility.get_time;
  4  begin
  5      for i in 1 .. 100
  6      loop
  7          for j in 1..1999
  8          loop
  9              x := rpad('x',j,'x');
 10          end loop;
 11      end loop;
 12      dbms_output.put_line
 13      ( 'Elapsed: ' || 
             round( (dbms_utility.get_time-l_start)/100, 2 ) );
 14  end;
 15  /
Elapsed: .99

PL/SQL procedure successfully completed.

tkyte@TKYTE816> declare
  2      x varchar2(2001);
  3      l_start number default dbms_utility.get_time;
  4  begin
  5      for i in 1 .. 100
  6      loop
  7          for j in 1..1999
  8          loop
  9              x := rpad('x',j,'x');
 10          end loop;
 11      end loop;
 12      dbms_output.put_line
 13      ( 'Elapsed: ' || 
              round( (dbms_utility.get_time-l_start)/100, 2 ) );
 14  end;
 15  /
Elapsed: 1.21

PL/SQL procedure successfully completed.


The use of the varchar2(1999) was slightly more efficient in this worst case, fake case 
to demonstrate a point.  In general, you will never notice this different as you probably 
don't make assignments of every single size to a string 199,900 times in real life.

 

Reviews    
4 stars Char VS Varchar   April 3, 2002 - 4pm Central time zone
Reviewer: kiran from cincinnati, OH USA
In all your replys to Char and varchar2 questions you mentioned that Varchar2 supports all the char 
properties.
I noticed a weired problem with Varchar2.
I have a table test1(tname char(20),my_name varchar2(20));
I inserted two records into this table
('kiran','kiran')
('kiran ','kiran ')---Notice there is a space after kiran.
Now I am selecting from the table using the below sql
select * from test1 where tname = 'kiran'; 
 ---this returned the above two records
  but when I am select the same table using the below sql
select * from test1 where my_name = 'kiran'  
  ---it returned only one record(1st one). I tested this on 7.3 and also on 8I could you explain 
why it is happening?
another interesting thing is select * from test1 where tname like 'kiran' is not returning any 
recors.When the "=" is working fine, why the like is not working? 


Followup   April 3, 2002 - 6pm Central time zone:

varchar2's can store trailing blanks -- IF You put them there.

the answer is simple.

'a ' <> 'a'

tried this in 8i (r2 and r3) 

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( x char(10), y varchar2(10) );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( 'x', 'x ' );

1 row created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where x like 'x';

X          Y
---------- ----------
x          x

ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where y like 'x';

no rows selected

works as expected for me. 

5 stars Is there a Leading Length field in RAW Datatype ?   April 4, 2002 - 2am Central time zone
Reviewer: Pascal from Hamburg , Germany
Hi Tom,

Thanks for a Great answer!

If i could just post a followup  on that expecting a short YES/NO  answer-

Is there a Leading  Length field  in  RAW  Datatype ?

and  

Is it True that when i use RAW  type in my tables as an automatically generated Primary Key - by 
selecting sys_guid() from dual ,  from within a Trigger  - then i would not be able to  Create  
Materialized  Views with columns of RAW type ?


Thanks in advance

 


Followup   April 4, 2002 - 11am Central time zone:

you can either

select length( raw_field )/2 from t

or 

select utl_raw.length(raw_field) from t

to get the length.  



ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( pk raw(16) primary key, data int );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create materialized view t_mv
  2  as
  3  select * from t;

Materialized view created.

seems to work. 

4 stars Are PL/SQL LONG variables similar?   April 4, 2002 - 4pm Central time zone
Reviewer: Brent from Hartford, CT, USA
I was searching for documentation on the memory behavior with respect to LONG variables, and found 
the same snippet at the top of this question for VARCHARs, but nothing about LONGs.

Do the memory allocation rules operate the same way for LONG variables, with the 2000-byte 
threshold?  Or if it is different for LONGs, what are the differences? 

Thanks.

(I know Tom knows what I mean but just to clarify for the readers -- I am referring to the PL/SQL 
LONG datatype which has a 32K maximum size, and not the database LONG column type which has a 2GB 
limit).

 


Followup   April 4, 2002 - 4pm Central time zone:

From standard.sql:

...
  type VARCHAR2 is NEW CHAR_BASE;
  pragma PACK(VARCHAR2);

  subtype VARCHAR is VARCHAR2;
  subtype STRING is VARCHAR2;

  subtype LONG is VARCHAR2(32760);

  subtype RAW is VARCHAR2;
  subtype "LONG RAW" is RAW(32760);

  subtype ROWID is VARCHAR2(256);
....

a plsql long is just a varchar2 in disguise.  So, yes, they are just a varchar2 that is bigger then 
2000. 

4 stars Can preallocate varchar2(>2000)?   April 4, 2002 - 5pm Central time zone
Reviewer: benedict from australia
Thanks for the informative answer Tom.

Supposing I was building up a large varchar2 variable (declared >2000) characters by appending text 
to itself... e.g.
declare
   big varchar(30000);
begin
  for i := 1..100 loop
    big := big || ' something';
  end loop;
end;

I understand from your answer that since it's been declared larger than 2000, pl/sql will 
reallocate memory each time it gets longer. My question is: is it possible to force pl/sql to 
allocate tons of room to begin with? And does it make a big enough difference for me to care?

Will pl/sql reallocate if the varchar2 gets smaller? 
e.g. will

...
begin
  big := rpad(null, 30000);
  big := null;
...

do the trick? 

Cheers 


Followup   April 4, 2002 - 7pm Central time zone:

Based on this test, I would say 'yes, it'll have an effect'.

But -- only if you do this really degenerate case like I did.  I mean -- how many times do you make 
that string grow a byte at a time??

Also -- setting to NULL sees to be "free" -- it shrinks it right back.

I ran this a couple of times:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_no_init
  2  as
  3      x varchar2(30000);
  4  begin
  5      for j in 1..30000
  6      loop
  7        x := rpad('x',j,'x');
  8      end loop;
  9  end ;
 10  /

Procedure created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_init
  2  as
  3      x varchar2(30000) := rpad( 'x', 30000, 'x' );
  4  begin
  5      for j in 1..30000
  6      loop
  7        x := rpad('x',j,'x');
  8      end loop;
  9  end ;
 10  /

Procedure created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_init_with_null
  2  as
  3      x varchar2(30000) := rpad( 'x', 30000, 'x' );
  4  begin
  5          x := null;
  6      for j in 1..30000
  7      loop
  8        x := rpad('x',j,'x');
  9      end loop;
 10  end ;
 11  /

Procedure created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> declare
  2      l_start number default dbms_utility.get_time;
  3  begin
  4      for i in 1 .. 5 loop p_big_no_init; end loop;
  5      dbms_output.put_line
  6      ( 'Elapsed: ' ||
  7          round( (dbms_utility.get_time-l_start)/100, 2 ) );
  8  
  9      l_start := dbms_utility.get_time;
 10      for i in 1 .. 5 loop p_big_init; end loop;
 11      dbms_output.put_line
 12      ( 'Elapsed: ' ||
 13          round( (dbms_utility.get_time-l_start)/100, 2 ) );
 14  
 15      l_start := dbms_utility.get_time;
 16      for i in 1 .. 5 loop p_big_init_with_null; end loop;
 17      dbms_output.put_line
 18      ( 'Elapsed: ' ||
 19          round( (dbms_utility.get_time-l_start)/100, 2 ) );
 20  end;
 21  /
Elapsed: 22.85
Elapsed: 22.06
Elapsed: 24.69

PL/SQL procedure successfully completed.
 

4 stars What about when constants are declared in PL/SQL?   April 10, 2002 - 11am Central time zone
Reviewer: jon from Hartford, CT USA
If you declare a constant in PL/SQL, would it be better to size it as VARCHAR2(2001), so that you 
would get the actual size, instead of 2000 bytes allocated?  Or is it smart enough to recognize 
that constants can't be changed and allocate the correct amount of memory.

In other words, is 
usps_code_connecticut constant varchar2(2001) := 'CT';
any different than
usps_code_connecticut constant varchar2(2) := 'CT';
? 


Followup   April 10, 2002 - 11am Central time zone:

regardless of whether it was constant or not -- you should use varchar2(2).  

The varchar2(2001) will allocate storage for 2 characters when it is initialized.

The varchar2(2) will allocate storage for 2 characters when it is initialized.

Now, if you had coded:


usps_code_connecticut containt varchar2(1000) := 'CT';

then 2001 and might be better.  But the real fix would be to declare it properly as a varchar2(2). 

4 stars Constants.   January 21, 2003 - 3pm Central time zone
Reviewer: Kashif from Houston, TX
Hi Tom,

Is using the CONSTANT keyword a performance benefit? Any examples would be helpful. Also, can you 
explain whether a declaration such as:

l_filename constant varchar2 (7) default to_char (l_start_date, 'MONYYYY');

is advisable/logical. The constant and default keywords are used in the same declaration. Thanks in 
advance.

Kashif 


Followup   January 21, 2003 - 5pm Central time zone:

it is a compiler directive saying "don't let them modify it".

I've never seen any sort of performance benefit.  It is still a plsql variable. 

5 stars Assiging CHAR type to VARCHAR2   March 4, 2003 - 12pm Central time zone
Reviewer: CSR from India
Hi Tom,

Is there any problem if we assign CHAR type variable directly to a VARCHAR2 type variable.
I have table and SP like below.

chandu@chandra> desc t_clips_audit_info
 Name                         Null?    Type
 --------------------------- -------- ---------------
 AUDIT_SEQUENCE              NOT NULL NUMBER(10)
 CALL_TYPE                            CHAR(3)
 CALL_RETURN_ERR                      CHAR(3)
 ERROR_DATE_TIME                      DATE
 REQUESTOR_ID                         VARCHAR2(20)
 CALL_STRING                          VARCHAR2(250)
 CALL_RETURN                          VARCHAR2(2000)


PROCEDURE sp(
                ic_or_number    IN  VARCHAR2,
                in_on_number    IN  NUMBER,
                in_amt          IN  NUMBER,
                oc_is_valid     OUT CHAR,
                oc_is_or_valid  OUT CHAR)
As

 vc_call_string t_clips_audit_info.call_string%TYPE := NULL;
 vc_call_return t_clips_audit_info.call_return%TYPE := NULL;

 begin

--------- many lines. here -------

-- #.1
      vc_call_return :=    vc_call_return || ' ' ||'oc_is_obgr_obgt_valid: '||
                                              oc_is_obgr_obgt_valid;
                       
--
 end ;
/

When I execute the above SP from SQL*> with a sql client(exec sp) it is giving me proper results.
But java developers are getting the error showing below.

error is :ORA-06502: PL/SQL: numeric or value errorCode:-6502

After debugging I found the exception at occurred the #.1.

If I use SUBSTR(oc_is_obgr_obgt_valid,0,1) in the code, I don't get any errors.

Could you please suggest us.
 


Followup   March 4, 2003 - 6pm Central time zone:

The java developers are binding a string that is too small to hold the result.  chars are blank 
padded to the maximum length and the java guys bind variable isn't big enough I guess.

one would actually need a small, yet complete testcase to say exactly.


almost forgot to say - i hate char, don't use them, just use varchar2. 

5 stars CHAR(1) vs VARCHAR2(1)   March 4, 2003 - 8pm Central time zone
Reviewer: Trevor 
Tom...
I was told once that if you only need a single char
use CHAR(1) since using VARCHAR2(1) has overhead
due to the Oracle having to maintain how long
the value is in a VARCHAR2 field

Is this a load of rubbish or is there some truth
to it? 


Followup   March 5, 2003 - 7am Central time zone:

that is rubbish.

A char(n) is a varchar2(n) that is blank padded to it's maximum length.  The implementation of a 
char physically on disk is identical to a varchar2 -- there is a length byte for a char(1) just as 
there is for a varchar2(1).


 

5 stars Binned   March 5, 2003 - 6pm Central time zone
Reviewer: Trevor 
Thanks Tom
 


3 stars VARCHAR2(10) vs. VARCHAR2(2000) for a 10 char string   June 3, 2003 - 10am Central time zone
Reviewer: Tajana from Austria, Vienna
Tom,

clear about the difference between VARCHAR2 and CHAR.

But what about defining VARCHAR2(10) and, let's say VARCHAR2(2000) for a column that is expected 
not to be "wider" than 10 characters ?

One advantage of defining such a column actually as what it is (VARCHAR2(10)) for me is, that I can 
easily list a table containing many of such columns in SQL*Plus (Worksheet) without having to 
scroll right, till my fingers get wounded.

Disadvantage: If my text gets wider, I'll have to redefine the column; means: I have to maintain 
that.

Defining such a column as - let's say: 2000, would mean: No or less maintenance. 

But much more interesting is: 
Are there any
    space 
    index
    optimizer issues , or 
any other important issues with this ? 
Any other arguments for defining a 10 char column as VARCHAR2(10) and not VARCHAR2(2000) ?
Or: Any arguments why it is better to generally define VARCHAR2 columns to its maximum ?

Thanks in advance,

Tajana 


Followup   June 3, 2003 - 11am Central time zone:

well, the real issue is -- what is the business requirement.  What the heck is in that field and 
what is its length.

If you say "use a varchar2(2000)" then you know what will happen?  someone will put 2000 characters 
in there someday.  Is that OK?

varchar2(10) says "this field accepts 10 bytes of data"

Since you can make a field larger easily with ALTER, maintenance isn't an excuse.


You define a field to be correct and proper.  Period.  This field is a varchar2(10).


think about the poor client tools that say "oh look - a varchar2(2000) and an array size of 100.  
We need to allocate 200,000 bytes of storage (instead of 1000).

 

5 stars VARCHAR2(10) vs. VARCHAR2(2000) for a 10 char string   June 4, 2003 - 7am Central time zone
Reviewer: Tajana from Vienna, Austria
Tom,

in german you would say: You a running against my open doors. 
Means: We are of the same opinion.

My intention was to prevent development colleagues to define hundreds of VARCHAR2(255) columns 
(which they did, before I "jumped" in the project, and still trying to do).
So I needed arguments.
You gave me one - that with client tools, arraysize and allocation. 

But before I discuss that with colelagues, I would like to know this a little bit more in detail:
Do you mean memory allocation by the client tool at the client ? 
Can you explain that matter a little bit further, please ?

Any other considerations (optimizer, indexes ?)

Thanks a lot,

Tajana  


Followup   June 4, 2003 - 8am Central time zone:

there is only one reason necessary:

o it is the only sensible and professional thing to do. 


but the technical reasons are:

o it is called data integrity.  if you make it 255, you are saying that field legitimately contains 
upto 255 bytes of data.  If you make it 10, you are saying otherwise.  If you make it 255, someone 
WILL -- not might -- WILL put 255 bytes in there some day.  Then what?

o say you have 20 255 varchar2 columns instead of 20 right sized ones.  Pull that up in an ad hoc 
tool.  Have you done the end user a serious dis-service here?  they'll spend the next 5 minutes 
formatting data - data that should have been, could have been formatted already.

o say you move onto the next project -- leaving all of these 255 byte columns.  Who is going to 
tell the next guys "c1 is a 10 character field, c2 is 12, c3 is 44" and so on.  

o when the client application, which wants to run fast, decides "I'll array fetch 100 rows at a 
time", it'll have to allocate 100 * 255 * 20 bytes of data for the buffering of such data 
(remember, no one KNOWS c1 is just 10 characters, it is 255 characters).  That is as opposed to 
allocating just what you really need.


I've got an idea for these guys -- don't use numbers or dates either.  What if later on you want to 
store "hello world" in that date column.  (only kidding)

 

5 stars   June 4, 2003 - 10am Central time zone
Reviewer: Dave from Colorado
In the same kidding spirit as "don't use numbers or dates either", you could use VarChar2(2000)  
for all your columns, and then put a check constraint on the column "length(my_column) < 21". Ta 
Da! Same as VarChar2(20)!!!  


5 stars I like that one..   June 4, 2003 - 11am Central time zone
Reviewer: Jose Cleto from Santo Domingo, Dominican Republic
It reads like "American funnies Scripts" 


5 stars Funnies scripts   June 5, 2003 - 8am Central time zone
Reviewer: Tajana from Vienna, Austria
Thanks, Tom.
============
This was what I needed.

Jose, 
It reads like:
What funny ideas developers sometimes have, and what a DBA / data designer has not only to clean 
after them, but also to convince them of on the first place ...
(Tenor: "Don't change this, my procedure won't work at all anymore !!! And, by the way, it is yet 
implemented at the cust site !")

In this case: European funnies scripts, but I am sure, you have your own ones too ;-)

Dave,
I will evaluate your suggestion and present it to my dev colleagues ;-)
We will see, what they are gonna to say...

Cheers, Tajana 


5 stars Sorry if I miss the point...   July 30, 2003 - 10am Central time zone
Reviewer: Jose Cleto from Dominican Republic
We all have fallen in this situation:

-  "Let's do poor req. and anal. and then go with the code"
-  "I know that is better, but this is faster"
-  "We don't have time for that now"

An the list goes on...

The true, all the true and nothing but the true is that developers, DBA, and the like all we, are 
payed to do our job the better we can.

Why pay like a pro if some one will work like a novice.  We most work all the time to get the next 
level.  Computers is a "science", like math, chemestry, we have discipline CMM, OOP, Database 
Design, Structure programming, UML, etc., all this exists is for guide us in doing our work, but 
only our self could make our work.

Making the king of VARCHAR2(2000) to hold just 20 is silly. So mr, Tajana, Your job "if You decide 
to accept it", is to convince your developers to be as near as possble to the reallity and make use 
of all the things they are suppose to know.

BTW.: In my case is "Dominican Funniest scripts"  


3 stars Char is returing vaules that are not padded?   September 11, 2003 - 10am Central time zone
Reviewer: Vivek 
Tom, I created a table with a column of char(10).  When I select a row with a where clause of the 
"col" not padded with spaces it still returns the row.  I would think the row should not be found 
if I do not pad.  Am I wrong?

SQL> create table x(col char(10));

Table created.

SQL> insert into x values('0');

1 row created.

SQL> insert into x values('1     ');

1 row created.

SQL> insert into x values('A');

1 row created.

SQL> select length(col),'x'||col||'x' from x;

LENGTH(COL) 'X'||COL||'X
----------- ------------
         10 x0         x
         10 x1         x
         10 xA         x

SQL>
SQL> DECLARE
  2     x   CHAR (14);
  3  BEGIN
  4     DBMS_OUTPUT.put_line ('0');
  5
  6     SELECT col
  7       INTO x
  8       FROM x
  9      WHERE col = '0';
 10
 11     DBMS_OUTPUT.put_line ('1');
 12
 13     SELECT col
 14       INTO x
 15       FROM x
 16      WHERE col = '1';
 17
 18     DBMS_OUTPUT.put_line ('A');
 19
 20     SELECT col
 21       INTO x
 22       FROM x
 23      WHERE col = 'A';
 24      DBMS_OUTPUT.put_line ('NO Errors');
 25  EXCEPTION
 26     WHEN NO_DATA_FOUND THEN
 27        DBMS_OUTPUT.put_line ('no_data_found');
 28        RAISE;
 29     WHEN OTHERS THEN
 30        DBMS_OUTPUT.put_line ('others');
 31        RAISE;
 32  END;
 33  /
0
1
A
NO Errors

PL/SQL procedure successfully completed.

SQL> spool off


thanks 


Followup   September 11, 2003 - 6pm Central time zone:

character string literals like that are coerced to the proper datatype automagically

'A' is really 'A        '

it is the ansi defined behaviour. 

5 stars number vs date in datawarehouse   November 3, 2003 - 4am Central time zone
Reviewer: A reader 
Hi

I have seen many Datamarts/Datawarehouses using number for date columns. For example instead of 
having

2003-AUG-01 23:00:00 

they have

20030801230000 in number

Is this good practice? I find it so hassle because when I have to calculate aggreates with time 
functions I always have to convert that to date first!

I have never seen this in OLTP but many times in DWH, is there some directive to do so or what? 


Followup   November 3, 2003 - 7am Central time zone:


In fact, I write about this very topic in my book "Effective Oracle By Design" where I state very 
loudly

USE NUMBERS FOR NUMBERS

USE DATES FOR DATES

don't use anything else, use the correct proper datatypes.


Not only do you spend 1/2 your time converting that number to a date -- but I'll assure you that 
one day, you'll get a date in there that IS NOT A VALID DATE and then what?

Not only that but tell me, how many values are there between

20031231115959

and

20040101000000



well, the optimizer thinks there could be 8869884041 values -- but we all know there are no values 
between those two dates!!!  You can totally mess up the optimizer with things like "between" and 
"<", ">" doing this silly thing of putting dates into numbers or strings.....

can you tell ?  i'm dead set against this "common practice"

 

5 stars   November 3, 2003 - 11am Central time zone
Reviewer: Dave from Colorado
"Date folding" is related good reason, as if another one were needed, for using date types for 
dates, especially in a DW environment ...

http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96520/qr.htm#50228


5 stars   November 3, 2003 - 12pm Central time zone
Reviewer: Dave from Colorado
By the way, there seems to be some very poor advice in that documentation, in the "Note:" section. 


4 stars example from your new book   November 14, 2003 - 4am Central time zone
Reviewer: A reader 
Hi

you have an example showing using proper datatypes in chapter about effective schema design

I did a small test using my own script and saw something not predicted, opposite from your example


create table test_date
(
 varchar_fecha varchar2(9),
 number_fecha number,
 date_fecha date
)
pctfree 70
pctused 10;

insert /*+ APPEND */ into test_date
select to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ),
       to_number( to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ) ),
       to_date( '19980101', 'YYYYMMDD' ) + rownum
  from dba_source
 where rownum <= trunc(sysdate) + 36500 - to_date( '19940101', 'YYYYMMDD' );

create index test_date_idx1 on test_date(varchar_fecha);

create index test_date_idx2 on test_date(number_fecha);

create index test_date_idx3 on test_date(date_fecha);

analyze table test_date compute statistics for all indexes for table for all columns;

set autotrace trace exp stat

select * from test_date
  2   where number_fecha between 20011231 and 20050101;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1318 Bytes=26360)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=6 Card=1318 Bytes=26360)
   2    1     INDEX (RANGE SCAN) OF 'TEST_DATE_IDX2' (NON-UNIQUE) (Cost=5 Card=1318)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        166  consistent gets
          0  physical reads
          0  redo size
      50919  bytes sent via SQL*Net to client
       8528  bytes received via SQL*Net from client
         75  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1098  rows processed



select * from test_date
  2   where date_fecha between to_date('20011231', 'YYYYMMDD') and to_date('20050101', 'YYYYMMDD');

1098 filas seleccionadas.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=46 Card=1097 Bytes=21940)
   1    0   TABLE ACCESS (FULL) OF 'TEST_DATE' (Cost=46 Card=1097 Bytes=21940)

Statistics
----------------------------------------------------------
          0  recursive calls
         15  db block gets
        545  consistent gets
          7  physical reads
          0  redo size
      50919  bytes sent via SQL*Net to client
       8528  bytes received via SQL*Net from client
         75  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1098  rows processed


as you can see the proper datatype in this case date is working worse than number datatype... 
interesting no?



 


Followup   November 14, 2003 - 8am Central time zone:

My point was the information the optimizer sees

look Card=1097 when using dates -- and reality says 1098 rows.
     Card=1318 when using numbers -- and reality says 1098 rows.


using the number gave the optimizer bad information here.  That said (i was demonstrating something 
wholly different there - that the optimizer is getting bad bad data) -- I do not observe the same 
thing as you:


ops$tkyte@ORA920PC> create table test_date
  2  (
  3   varchar_fecha varchar2(9),
  4   number_fecha number,
  5   date_fecha date
  6  )
  7  pctfree 70
  8  pctused 10;
 
Table created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> insert /*+ APPEND */ into test_date
  2  select to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ),
  3         to_number( to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ) ),
  4         to_date( '19980101', 'YYYYMMDD' ) + rownum
  5    from dba_source
  6   where rownum <= trunc(sysdate) + 36500 - to_date( '19940101', 'YYYYMMDD' );
 
40104 rows created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx1 on test_date(varchar_fecha);
 
Index created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx2 on test_date(number_fecha);
 
Index created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx3 on test_date(date_fecha);
 
Index created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> analyze table test_date compute statistics for all indexes for table for all
  2  columns;
 
Table analyzed.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> set autotrace traceonly
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> select * from test_date where number_fecha between 20011231 and 20050101;
 
1098 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=1318 Bytes=26360)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=21 Card=1318 Bytes=26360)
   2    1     INDEX (RANGE SCAN) OF 'TEST_DATE_IDX2' (NON-UNIQUE) (Cost=5 Card=1318)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        163  consistent gets
         14  physical reads
          0  redo size
      36016  bytes sent via SQL*Net to client
       1302  bytes received via SQL*Net from client
         75  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1098  rows processed
 
ops$tkyte@ORA920PC> select * from test_date where date_fecha between to_date('20011231', 
'YYYYMMDD') and to_date('20050101', 'YYYYMMDD');
 
1098 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=1096 Bytes=21920)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=17 Card=1096 Bytes=21920)
   2    1     INDEX (RANGE SCAN) OF 'TEST_DATE_IDX3' (NON-UNIQUE) (Cost=4 Card=1096)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        163  consistent gets
          0  physical reads
          0  redo size
      36016  bytes sent via SQL*Net to client
       1302  bytes received via SQL*Net from client
         75  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1098  rows processed
 
ops$tkyte@ORA920PC> set autotrace off
ops$tkyte@ORA920PC>


It seems unusual to me that the query that selects less rows (is estimated to select less rows) 
would use a FULL SCAN (given that both indexes by definition would have the same cluster factors).  
You must have lots of non-standard optimizer init.ora settings going on there.

so, use the 10053 trace event (in chapter 6) and show us your optimizer settings.  also, a version 
is ALWAYS relevant. 

5 stars For Dave from Colorado...   November 14, 2003 - 8am Central time zone
Reviewer: Jan M. van Mourik from Houston, TX
You thought you were kidding, Dave?
How about this:

CREATE TABLE BFX_EMPGROUP (
    EMPLOYEE_GROUPID  VARCHAR2(150) NOT NULL,
    NAME              VARCHAR2(150) NOT NULL,
    START_DATE        DATE,
    END_DATE          DATE);

ALTER TABLE BFX_EMPGROUP ADD ( 
    CHECK (LENGTH(EMPLOYEE_GROUPID)<=50),
    CHECK (LENGTH(NAME)<=50));

From PeopleSoft HR... 


5 stars   November 25, 2003 - 10am Central time zone
Reviewer: A reader 


5 stars is there any reason to use CHAR   December 1, 2003 - 11am Central time zone
Reviewer: A reader 
Tom, is there any situation in which it is better to use a CHAR datatype instead of VARCHAR2?  I am 
trying to convince developers to stop using it but they seem to think that if the number of 
characters in a string column will always be the same, like in state abbreviations, then it is 
better to use CHAR.  Is this a valid argument?  


Followup   December 1, 2003 - 5pm Central time zone:

nope.

ask them for the proof or at least the reason why they think "better" 

5 stars please disregard previous post   December 1, 2003 - 4pm Central time zone
Reviewer: A reader 
I found the answer in the following posting-

http://asktom.oracle.com/pls/ask/f?p=4950:8:856236758633628034:::::


5 stars correction to link above   December 1, 2003 - 4pm Central time zone
Reviewer: A reader 
sorry for this, the link above points back to this page. The link where I found the answer was 
here:

http://asktom.oracle.com/pls/ask/f?p=4950:8:856236758633628034::NO::F4950_P8_DISPLAYID,F4950_P8_CRIT
ERIA:123212348063,


4 stars GR8   December 26, 2003 - 4am Central time zone
Reviewer: Bipin Ganar from INDIA
Hi Tom,
Thanx for the Difference. 
I got the difference for char and varchar2 But Still not clear about Char and varchar.
So, can you highlight more on difference between Char and varchar 
 


Followup   December 26, 2003 - 10am Central time zone:

there are none.  varchar2 and varchar are the same right now. 

5 stars Converting many CHAR columns to VARCHAR columns   February 17, 2004 - 10am Central time zone
Reviewer: Sha from USA
Hi Tom,
I need to convert many char columns to varchar columns.

We have about 150 tables with appox 2 columns of type char and 500,000 rows each. I need to convert 
all columns of datatype char to varchar2.

Please suggest the best way to do this.

Regards,
Sha 


Followup   February 17, 2004 - 10am Central time zone:

you'll have to 

a) alter the tables to change the types
b) update each of the tables setting column = rtrim(column) to remove the trailing blanks.

or, rebuild them all. 

5 stars Table update slow   February 18, 2004 - 11pm Central time zone
Reviewer: Sha from USA
Hi Tom,
In our schema we have constraints,Indexes and Master Details relationships. I need to update the 
CHAR columns to VARCHAR datatype and later RTrim the new VARCHAR.

e.g
Lets say TMaster has 3 columns A, B and X, say X is of datatype CHAR(10) and TMASTER has indexs 
multiple indexs on columns including index on X.

TMaster( A,B,X char(10) )

TDETAIL has 3 columns P,Q and Y, say Y is on datatype CHAR(10) and TDETAIL has indexs multiple 
indexs on columns including index on Y.

X is primary key in TMASTER and Y is foreign key in TDETAIL.

TDETAIL (P,Q, Y char(10))

TMASTER has 50,000 records while TDETAIL has 1 Million records.

I did the follwing
1) Drop constraints on both tables
2) Removed Indexs on X and Y columns only all other Indexes for TMASTER and TDETAIL are inplace.
3)Alter tables TMaster and than TDetail.
4) Update X = Rtrim(X) on TMASTER 
5) Update Y = Rtrim(Y) on TDETAIL 

To update column Y it took 8 hours which has 1 Millions rows.

1) Is this an efficient way to do this job because I have around 150 tables with similar change.
2) How do I reduce the time to update the column atleast in couple of minutes and not 8 hours.
3) I guess the oracle should be fast ( couple of minutes ) to update 1 million columns with Rtrim, 
Is there something holding my table to get updated, how should I figure that out.
4) Did I miss out anything or should I do things in a particular order, how do I know what is 
making my table update slow.

I am new to oracle.

Thanks in advance,
Sha 


Followup   February 19, 2004 - 10am Central time zone:

you did something wrong -- not sure what (i would guess you didn't really get the indexes off).

to update 1,000,000 rows (unindexed) is a couple of minutes.


big_table@ORA920PC> create table t
  2  as
  3  select cast( object_name as char(30) ) x, big_table.*
  4  from big_table
  5  where rownum <= 1000000
  6  /
 
Table created.
 
big_table@ORA920PC> set timing on
big_table@ORA920PC> alter table t modify x varchar2(30);
 
Table altered.
 
Elapsed: 00:00:00.32
big_table@ORA920PC> select min(length(x)) from t;
 
MIN(LENGTH(X))
--------------
            30
 
Elapsed: 00:00:02.00
big_table@ORA920PC> update t set x = rtrim(x);
 
1000000 rows updated.
 
Elapsed: 00:01:26.51
big_table@ORA920PC> select min(length(x)) from t;
 
MIN(LENGTH(X))
--------------
             1
 
Elapsed: 00:00:03.64
big_table@ORA920PC>



So, don't know what you did wrong (logs big enough?  any checkpoint not complete messages, are you 
sure the indexes were "gone"). 

5 stars Why char as parameter not restricting?   March 19, 2004 - 4pm Central time zone
Reviewer: dharma from CA USA
I have been debugging some procedures from a Oracle 7.3 system which is in 9i rel 2 now where I 
found one of your BIG NO NO. it had a "when others then null" and the procedure has been there for 
4 years. 
What I found to be causing the error looks strange.

The defn of CHAR without any length is defaulted CHAR(1)

But If I have a parameter in the procedure as CHAR how is it able to get the values larger than 1 
byte?

DECLARE
  PROCEDURE p(p_a IN CHAR) IS
    b CHAR;
  BEGIN
    Dbms_output.put_line('Before assign'||p_a);
    b:=p_a; --raises error if assigned; 
    Dbms_output.put_line('After assign'||b);
  END p;
BEGIN
  p('12');
END;

if the line b:=a is commented it works fine. should'nt it be throwing an error for assigning 
parameter p_a a value of length greater than 1.
Am I missing something.

Thanks,
-dharma 


Followup   March 20, 2004 - 9am Central time zone:

the parameters size is defined at runtime as specified by the INPUT parameter.

if you called

declare
   x char(10);
   y char(20);
begin
   p( x ) ;
   p( y ) ;
end;

the size of "p_a" would be 10 and then 20. sooo, look back in the code to the invokers of this 
routine

(and stop using the dreaded CHAR type !!! hate it) 

3 stars Char(1) vs Varchar(1) Revisited   May 12, 2004 - 5pm Central time zone
Reviewer: Jeff Westman from Walnut Creek, CA, USA
So, Tom ... Am I understanding this correctly?

Situation #1
  CHAR(1) = null, VARCHAR(1) = null
The char takes up 3 bytes (length id + space), and the varchar takes up 2 bytes (length id only)

Situation #2
  CHAR(1) = 'X', VARCHAR(1) = 'X'
The char takes up 3 bytes still (length id + 'X'), and the varchar takes up 3 bytes (length id + 
'X')

So [if I understand this right] it seems best to always use a VARCHAR2 since you will use a minimum 
of 2 bytes either way.

When would you want to use a CHAR then???!

Thanks in advance,

Jeff 


Followup   May 13, 2004 - 9am Central time zone:

situation #1: both take zero or one byte -- the NULL flag byte.  (zero bytes if they are a trailing 
null column in the row)

situation #2: both take three bytes -- the NULL flag byte, one byte for the length, one byte for 
the data.


I would never use char, in fact, I never do. 

4 stars What if CHAR(1) is a NOT NULL column   June 30, 2004 - 3am Central time zone
Reviewer: Sea from Australia
I have a not null column which is a flag.  The value is either 'Y' or 'N'.  There will never be any 
null values.  In this case, does it mean it does not really matter whether to use CHAR or VARCHAR?
 


Followup   June 30, 2004 - 10am Central time zone:

no, they would be 100% the same -- physically on disk, and query wise.

soooo, no good reason to change the rule, i would still use varchar2. 

5 stars number vs date in data warehouse   June 30, 2004 - 1pm Central time zone
Reviewer: CH from WA
Tom,
Use sequantial numbers for date_key instead of keys like 20040630 is better? That means you have to 
always join date dimention table whenever there is a constraint on date. 
Thank you very much for your always eye opening replys.
 


Followup   June 30, 2004 - 2pm Central time zone:

(or just use a date?) 

4 stars Number vs Date   June 30, 2004 - 6pm Central time zone
Reviewer: Wayne 
Tom, 
Using DATE is much better than using number keys if the field is not null and always have good 
values (e.g. trans_date, populated with sysdate at insert time). But if otherwise, dates can have 
bad values (and not desirable to be displayed as is in DW), we can turn it into null, but then in 
many queries, we have to ask the users to apply the outerjoins (to a TIME_DIM dimension table) even 
if the we don't have original values.

I saw some people put a default date there, such as '4-JUL-1776' or '31-DEC-9999', but calculation 
could be a problem if people run a avg(sysdate -birthday), we might get wrong results.

Any suggestions?

Thanks, 


Followup   June 30, 2004 - 8pm Central time zone:

if the field is a date and you put in a number -- what good is that?  you cannot do any analysis?

 

5 stars Please DO NOT post irrelevant question   June 30, 2004 - 8pm Central time zone
Reviewer: A Reader 
The topic is "Char Vs Varchar".
Please DO NOT post questions that are irrelevant to the topic.

Thanks! 


4 stars Date Key   June 30, 2004 - 9pm Central time zone
Reviewer: Wayne 
Tom,
Some "date" source fields are not stored as date, but rather as a varchar2 in 'mm/dd/yyyy' format. 
I know it's bad, and bad data can slip in  like '2/29/2003', or '6/30/2037'.
1. We don't want to store them "as is" in DW, it's humilating (when we find somebody is 900 years 
old) and difficult to use (convert to dates and might fail);
2. So we store them as dates and set the field to null if they are bad or unrealistic dates;
3. Users want to know for those null fields, which are originally null (we didn't collect the info) 
and which are set to null (meaning bad value).
4. So we have to put date_key (number) there and put 0 as unavailable, -1 as bad data.

I know we can let users go to a staging table where these data are stored and before they are used 
to update the real DW table, but sometimes users just don't want to and they set the rules.

Your insights?

Thanks,
Wayne

 


Followup   July 1, 2004 - 10am Central time zone:

i would have stored a date field AND the source data when bad

that is, convert the date and if the conversion fails, store the string.  so a null date with a 
null bad field is truly missing, a null date with a non-null bad field is bad data and they can see 
what the bad data was exactly. 

4 stars import of char values into varchar2 fields   July 5, 2004 - 9am Central time zone
Reviewer: Dorothea from Germany
Dear Tom,

I have the following problem: I have to make a full import from a database with char fields into a 
database with only varchar2 fields. The problem is of course, that all the trailing blanks are as 
well imported. How is it possible to get rid of the trailing blanks (for quite a big database)? 

thanks a lot in advance!
 


Followup   July 5, 2004 - 10am Central time zone:

you could put triggers on the tables prior to the import to trim them, else you have to update 
them.

we put in exactly what we are told to put in, nothing more and in this case, nothing less. 

5 stars thanx   July 6, 2004 - 5am Central time zone
Reviewer: Dorothea from Germany
This was a really exact answer! Adn it helped. Thanx.
Dorothea 


4 stars Varchar vs Varchar2   October 19, 2004 - 1pm Central time zone
Reviewer: John from USA
Tom,

What is the difference between Varchar and Varchar2 datatypes? I read somewhere that, we should 
always use Varchar2 in preference to Varchar.

Pl let me know.
 


Followup   October 19, 2004 - 1pm Central time zone:

they are and have been the same -- but varchar2 will remain the same, varchar is subject to change 
(if ansi changes how it should behave) 

4 stars Help on Datatype!!!!!!!   October 28, 2004 - 10am Central time zone
Reviewer: A reader 
Hi Tom,
The other day one of the developers asked me to help him create a table for reports. This table 
will be used for recording SQL Queries. So I created a table and one of the columns is Repsql 
varchar2(2000 char).
My question is from the performance point of view and space which datatype should I use from the 
following:
varchar2(2000 char);
char(2000 varchar2);
This datatype I was not allowed to create:
varchar2(2000 varchar2);
It generates the error: ORA-00905: missing keyword

Also a point to note is that the queries can be huge. I could not explain to my developer neither 
could I remember the link on your site which explains the difference between
varchar2(2000) and varchar2(2000 char).
Another thing I did a desc v$sql and saw the field SQL_TEXT to be just varchar2(1000). So is this 
size sufficient for the above table.
Oracle 91R2 on Win 2k Advances Server.

Thanks in advance as always.
 


Followup   October 28, 2004 - 1pm Central time zone:

varchar2 is the only type i use for strings

it would be varchar2( 2000 char )  (for 2000 CHARACTERS)

2000 char lets you hold 2000 characters -- in a mutli-byte character set, that might be 4000 bytes 
(2 bytes/character) for example

2000 byte lets you hold 2000 bytes of data, which might be as little as 2000/6 characters. 

5 stars The max size of varchar2 in table and proceudre.   November 12, 2004 - 11am Central time zone
Reviewer: Sean from NJ, USA
Hi Tom,

The max size of varchar2 data type is 4000 bytes, but in the stored procedure I can declare 
varchar2 variable to 32767.  But in chapter 3 of Application developer's guide, it only states that 
max size varchar2 is 4000.  Why are the max sizes different?

Thanks so much for your help.

Oracle 9204

Sean 



 


Followup   November 12, 2004 - 1pm Central time zone:

the max size of a varchar2 column ON DISK is 4000 bytes.

the max size of a varchar2 column IN MEMORY is 32k.

they just are different, not sure there is a "reason", other than in plsql you need more than 4000 
characters for variables sometimes.  On disk you have clobs. 

5 stars clob in plsql   November 12, 2004 - 1pm Central time zone
Reviewer: Sean from NJ, USA
Hi Tom,

We can use clobs in plsql too if it is larger than 32K.

Thanks so much for your help.
 


Followup   November 12, 2004 - 3pm Central time zone:

in 9i yes, you can.   

3 stars Shouldn't this..   November 12, 2004 - 3pm Central time zone
Reviewer: A reader 
"the varchar is physically 1999 bytes but logically can be any size in between 0 and 1999."

be..

"the varchar is logically 1999 bytes but physically can be any size in between 0 and 1999." 


Followup   November 12, 2004 - 3pm Central time zone:

nope, meant it the first way when talking about plsql (only plsql).


a varchar2 less than 2000 characters in plsql is pre-allocated physically to the maximum length.

varchar2(50) -- is 50 bytes

varchar2(50) := 'x' 
is logically 1 byte in my character set, physically however takes 50 bytes of RAM.


 

5 stars veriifcation   November 12, 2004 - 6pm Central time zone
Reviewer: Menon 
"varchar2(50) := 'x' 
is logically 1 byte in my character set, physically however takes 50 bytes of 
RAM.
"

Is there a way to verify this?
Thanx. 


Followup   November 12, 2004 - 8pm Central time zone:

did you read the original answer?  that was the crux of the question in the first place. 

5 stars sorry   November 13, 2004 - 10am Central time zone
Reviewer: A reader 
did not read the original answer..
however, I was looking for a more "direct" proof. I guess
there is none - you can only see the effect indirectly 
by measuring elapsed times in the degenerate case.

Anyway this was more of an "academic" question. 


Followup   November 13, 2004 - 11am Central time zone:

fair enough, here is the "direct" proof:



ops$tkyte@ORA9IR2> create or replace
  2  procedure create_procedure( p_name in varchar2,
  3                              p_num_vars in number,
  4                              p_size_vars in number )
  5  as
  6      l_stmt long := 'create or replace procedure ' || p_name || ' as ';
  7  begin
  8
  9      for i in 1 .. p_num_vars
 10      loop
 11          l_stmt := l_stmt || chr(10) || ' v' || i || ' varchar2('||p_size_vars||') := ''x'';';
 12      end loop;
 13
 14      l_stmt := l_stmt || chr(10) || 'begin null; end;';
 15
 16      execute immediate l_stmt;
 17  end;
 18  /
 
Procedure created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec create_procedure( 'p1', 1000, 1999 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec create_procedure( 'p2', 1000, 2001 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec p1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec p2
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                VALUE
------------------------------ ----------
session pga memory max             286044
 
ops$tkyte@ORA9IR2> exec p1
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V       DIFF
------------------------------ ---------- ----------
session pga memory max            2314240    2028196
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                VALUE
------------------------------ ----------
session pga memory max             286044
 
ops$tkyte@ORA9IR2> exec p2
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V       DIFF
------------------------------ ---------- ----------
session pga memory max             364120      78076
 
ops$tkyte@ORA9IR2>
 

5 stars excellent!   November 13, 2004 - 11am Central time zone
Reviewer: Menon 
completely forgot about the pga memory stats:) Thanx Tom. 


5 stars   January 11, 2005 - 7am Central time zone
Reviewer: A reader 


4 stars Difference   January 12, 2005 - 5am Central time zone
Reviewer: Reader from India
what is difference between varchar,and varchar2 datatype in oracle 


Followup   January 12, 2005 - 9am Central time zone:

nothing right now. 

5 stars PL/SQL Variable Sizes   March 29, 2005 - 6pm Central time zone
Reviewer: Andrew from Wellington, NZ
My team is tired of seeing value errors caused by PL/SQL developers not allocating a big enough 
string for a message (ORA-6502 numeric or value error: character string buffer too small), so 
someone has suggested enforcing all VARCHAR2s in PL/SQL to be 2000+.  This way there is never a 
downside to memory usage, only a small downside to performance due to re-allocating modified 
strings.

Assume for a second that none of these varchar2 variables are used to store/retrieve information 
from the database (assume we use plsql records for that).  

Is there any downside to enforcing all VARCHAR2 variables to be 2000+?  The only one could be: what 
about PL-SQL collections?  We use index-by tables and associative arrays (but not varrays) to pass 
around information.  But because these collections cannot be pre-allocated sizes, I am guessing 
that PL/SQL does the right thing and doesn't waste any space (in memory) when extending a 
collection, i.e. 

  TYPE typet IS TABLE OF VARCHAR2(2000);
  t typet := typet('x','y','z');

Does this take up 3 bytes or 6000 bytes?
 


Followup   March 29, 2005 - 8pm Central time zone:

there is a downside in that you no longer verify the string is the right size either.  That doesn't 
seem like the correct fix to me.

Use of %TYPE, %ROWTYPE maybe.

Looks "safe" enough memory wise, but doesn't sound like a sound programming concept (well, it isn't 
frankly).


ops$tkyte@ORA9IR2> create or replace package demo_pkg
  2  as
  3          type array is table of varchar2(2000);
  4          g_data array := array();
  5  end;
  6  /
 
Package created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> disconnect
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                VALUE
------------------------------ ----------
session pga memory max             286044
 
ops$tkyte@ORA9IR2> begin
  2          dbms_output.put_line( demo_pkg.g_data.count );
  3  end;
  4  /
0
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             360092           74,048
 
ops$tkyte@ORA9IR2> begin
  2          demo_pkg.g_data.extend(500);
  3  end;
  4  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             360092           74,048
 
ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 50
  3          loop
  4                  demo_pkg.g_data(i) := 'x';
  5          end loop;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             360092           74,048
 
ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 500
  3          loop
  4                  demo_pkg.g_data(i) := 'x';
  5          end loop;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             360092           74,048
 
ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 500
  3          loop
  4                  demo_pkg.g_data(i) := rpad('x',200,'x');
  5          end loop;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max             491164          205,120
 
ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 500
  3          loop
  4                  demo_pkg.g_data(i) := rpad('x',2000,'x');
  5          end loop;
  6  end;
  7  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
 
NAME                                    V DIFF
------------------------------ ---------- ----------------
session pga memory max            1867420        1,581,376
 
 

4 stars string literal padded with trailing blanks   April 15, 2005 - 3pm Central time zone
Reviewer: A reader 
Hi Tom,
Our problem is related to one of your comments earlier -

character string literals like that are coerced to the proper datatype automagically
'A' is really 'A        '
it is the ansi defined behaviour

We have a select like this in our jsp (java server page):
select '250' as TYPE_ID, ... from ...

At one client site, this is returning '250    ' in a java String variable. According to the 
developers, it always returned '250' (without the trailing blanks). I noted that this particular 
client uses utf8 as their db characterset, whereas all others use us7ascii. Is this behaviour of 
padded trailing blanks because of utf8 characterset? Is there anything we could do on the jdbc side 
that will globally suppress the trailing blanks from strings rather than we having to change our 
java code all over to do a trim? 


Followup   April 15, 2005 - 3pm Central time zone:

other thought first -- what is cursor_sharing set to on this customer site? 

5 stars cursor_sharing?   April 15, 2005 - 5pm Central time zone
Reviewer: A reader 
Sorry for the delay, Tom but the client is out of U.S and I don't have sysdba access to their db. I 
guess cursor_sharing=similar of force could be causing this? I will ask them to set it to exact and 
see. Thanks much. 


Followup   April 15, 2005 - 6pm Central time zone:

ops$tkyte@ORA9IR2> alter session set cursor_sharing=exact;
 
Session altered.
 
ops$tkyte@ORA9IR2> select '123' from dual d1;
 
'12
---
123
 
ops$tkyte@ORA9IR2> alter session set cursor_sharing=force;
 
Session altered.
 
ops$tkyte@ORA9IR2> select '123' from dual d2;
 
'123'
--------------------------------
123
 
ops$tkyte@ORA9IR2>


see the column widths?  with cursor sharing force/simlar the second query becomes select :sys_bv0 
from dual and you know longer know "3" is the answer


and "show parameter" does not need sysdba

in fact, just run select '123' from dual and see what you see in sqlplus! 

4 stars Difference in performance?   May 9, 2005 - 10am Central time zone
Reviewer: A reader 
I just stepped into a shop where the developers have been strictly instructed to use char rather 
then varchar2 because of "performance benefits".  Any truth to this?

PS - No metrics, no test cases...surprised?

Thanks in advance! 


Followup   May 9, 2005 - 10am Central time zone:

yes, char will make you run slower, consume more ram and make the pockets of your hardware vendor 
better padded.

Other than that, they are great.

recommendation -- never use char() 

5 stars Documentation to this fact...   May 18, 2005 - 11am Central time zone
Reviewer: A reader 
Tom,

Does Oracle have any documentation stating that varchar2 is prefered to char and why?  A colleague 
is convinced that using varchar2 instead of char columns will degrade the performance of the 
database. He says that using char columns will write everything in nice, neat groups and prevent 
fragmentation.  Being from Missouri, I need to show him to convice him otherwise.

Thanks!  


Followup   May 18, 2005 - 11am Central time zone:

<quote>
Does Oracle have any documentation stating that varchar2 is prefered to char and 
why?
</quote>

no, it is your choice and my voice on it is "USE VARCHAR2 and FORGET CHAR even exists"

a char is simply a fixed with varchar2 -- with blanks at the end.

a char(80) will take bytes for the length (like a varchar2) and consume 80 more bytes storage on 
disk.

it will make your indexes as big as they can be.
it will make your tables as large as they can.
I will never use them. 

5 stars Performence between data types   July 11, 2005 - 9am Central time zone
Reviewer: Sanjaya Balasuriya from Colombo, Sri Lanka
Hi Tom,

1.
When used for storing text (characters and numbers) data, is there any performance wise difference 
between number, char and varchar2 ?

2.
How can we compare number, char and varchar2 when it comes to efficiency of storage management ?

Thanks in advance. 


Followup   July 11, 2005 - 11am Central time zone:

1) well, the only thing that makes SENSE To store text is "varchar2"

the only thing that makes SENSE to store a number is, well, a number

same with a date.  The question doesn't even make sense to me -- if you have numbers, use a NUMBER 
type.

2) you don't, you won't, you use the right type for your needs, period, plain and simple. 

5 stars Case for CHAR   July 27, 2005 - 7pm Central time zone
Reviewer: Kevin Kirkpatrick from St. Louis, MO
I have a friend (wow, is that cliche or what) who works in a shop where the primary context in 
which data will be extracted will be into fixed-width flat files processed by legacy COBOL 
programs.  Thus, the majority of SQL written against the database will pull fixed-width fields.  
Given that space is not an issue, that single-order-magnitute-or-less performance degradation is 
not an issue, might this be a case where even Tom Kyte would consider using CHAR's in place of 
VARCHAR2's?

Consider:
SELECT FIELD1, FIELD2
FROM TEMP

vs.
SELECT RPAD(FIELD1,10,' '), RPAD(FIELD2,15,' ')
FROM TEMP

Thanks! 


Followup   July 27, 2005 - 8pm Central time zone:

Nope, because eventually someone will ask "what is the length of 'bob'" and getting 20 back is not 
what they expect.  Or they use a varchar in their program OR they just any one of the thousands of 
ad-hoc tools and it uses varchars.

Nope, still would not. 

4 stars CHAR x VARCHAR2 usage   January 5, 2006 - 3pm Central time zone
Reviewer: Leonardo Pinheiro from Brasil
Hi there,
    thinking about CHAR and VARCHAR2 datatypes I figure out one cenario that might indicate CHAR 
datatype instead of VARCHAR. I would like to confirm it with Tom or any other DBA who have already 
tested it using block dumps.

If we have a column that is intended to get updated a log and those updates will change data 
lenght, we might have a lot more I/O using VARCHAR2 due to the block reallocation possibility 
inside the datafile. For instance:

create table test (A varchar2(2000))

insert into test values ('a')   (lets say 1000 of there inserts)

update test set A='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' we would have to move all the blocks forward in 
the datafile to fit the new data for each row. Is that performance issue considerable or is it just 
rubish?

thanks! 


Followup   January 5, 2006 - 4pm Central time zone:

you would not move all of the blocks "forward"...

When you update a row - and make it grow in size such that is no longer fits on a block - Oracle 
will MIGRATE the row.  

That is, the original block will just have a "forwarding address"- a rowid pointer to where the row 
is right now.  

So, no, it won't cause the entire "end of the table to move" to make room - it will however incurr 
an extra IO on retrieval via an index (the index will point to where the row started on, we goto 
that block to discover the row has "moved" or migrated and go to that block to get it).

This is why you would consider setting pctfree.  To reserve room on the block for future updates 
and to avoid the row migration. 

5 stars   January 6, 2006 - 4am Central time zone
Reviewer: Raj from IN
Tom,
in case of above reply can I have my tablespace segment space management as auto and not worry 
about setting pctfree, pctused etch 


Followup   January 6, 2006 - 1pm Central time zone:

which above reply, there are lots of them.

ASSM (automatic segment space management) does not help in the setting of pctfree, you still need 
to do that. 

5 stars Blank Padds in Char Data Type   January 26, 2006 - 10pm Central time zone
Reviewer: Sanjai kumaar NM from THAILAND
I have two table with following descriptions.
Table TEST1 (X CHAR(10)) AND Table TEST2 (Y CARCHAR2(10))
Both the tables has some similar values. When I do a select statement like
SELECT X,Y FROM TEST1,TEST2 WHERE TEST1.X=TEST2.Y
This does not return any rows. Whereas if I issue a select statement seperately on individual 
tables like
SELECT X FROM TEST1 WHERE X='ABC12' AND
SELECT X FROM TEST2 WHERE Y='ABC12'
These select statements returns rows.
What makes the difference.
Thanks In Advance.
 


Followup   January 27, 2006 - 8am Central time zone:

select dump(x) from test1 where x = 'ABC12';
select dump(y) from test2 where y = 'ABC12'

what does that produce (cut and PASTE)
 

5 stars Documentation on CHAR deprecation   April 21, 2006 - 8am Central time zone
Reviewer: Nice 
Hello Tom,

Will CHAR data type would be deprecated? Is there any official announcement on this?
The first reference to the de-support of CHAR seems to have surfaced on metalink in December 2001 
CHAR is part of the ANSI standard, so it seems rather unlikely to go.  


Followup   April 21, 2006 - 8am Central time zone:

I would doubt it would be.

Where do you see a reference to the desupport of this type? 

4 stars VARCHAR2(<2000) Vs VARCHAR2(>=2000)   June 22, 2006 - 1pm Central time zone
Reviewer: VKOUL from WA USA
Hi Tom,

"In PLSQL, what they are saying is that for optimization purposes -- a varchar2 
less then 2000 bytes will allocate a fixed amount of memory.  It will 
preallocate the maximum size upon instantiation.  This cuts down on the 
reallocations as different sized strings are assigned to it.  On the other hand, 
a varchar2 greater then 2000 characters will truely be varying sized and as you 
assign different sized strings to it it will grow dynamically as needed.

One takes a little more ram (if you never use the entire string) and performs a 
little better, the other uses only what ram it needs but if you cause the string 
to grow frequently will not perform as well due to the reallocs to resize the ..."

Does this behaviour still hold in Oracle 9i and 10g ?

Thanks. 


4 stars Strange thing happening   October 2, 2006 - 6am Central time zone
Reviewer: abz 
Oracle 9.2.0.5

SQL> DESC CI_BILL;
 Name                                      Null?    Type
 ----------------------------------------- -------- --------
 BILL_ID                                   NOT NULL CHAR(12)
 BILL_CYC_CD                               NOT NULL CHAR(4)
 WIN_START_DT                                       DATE
 ACCT_ID                                   NOT NULL CHAR(10)
 BILL_STAT_FLG                             NOT NULL CHAR(2)

SQL> L
  1  select COUNT(bill_stat_flg), SUM(DECODE(bill_stat_flg, 'C', 1, 0))
  2* from ci_bill where bill_stat_flg='C'
SQL> /

COUNT(BILL_STAT_FLG) SUM(DECODE(BILL_STAT_FLG,'C',1,0))
-------------------- ----------------------------------
             4651756                                  0


I think both of them should be equal, whats going wrong
here? 


Followup   October 2, 2006 - 7am Central time zone:

you got that neato char datatype working for you there.  I hate char, I will not use char, I will 
never recommend char, and this - this is a classic reason WHY.


ops$tkyte%ORA10GR2> create table t ( x char(2) );

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t values ( 'C' );

1 row created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select decode( x, 'C', 'I am ''C''!', 'I am certainly NOT ''C''' ),
  2         decode( x, 'C ', 'I am ''C ''!', 'I am certainly NOT ''C ''' )
  3    from t;

DECODE(X,'C','IAM''C'' DECODE(X,'C','IAM''C''!
---------------------- -----------------------
I am certainly NOT 'C' I am 'C '!


You have NO 'C' in your table, you have 'C '.  Decode works with varchar types by default, not char 
types.  When you compare a char to a varchar, the varchar better be padded out OR IT WILL NOT WORK 
as you "expect" (but it will work entirely as ANSI asked for it to work)

ops$tkyte%ORA10GR2> variable x varchar2(2)
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec :x := 'C'

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from t where x='C';

X
--
C

ops$tkyte%ORA10GR2> select * from t where x=:x;

no rows selected



When you compare a literal to it (oh my gosh, please do NOT think I'm saying to use literals in 
your code - I'm really not, bind stuff that varies from execution to execution!!!!) the rules state 
"the literal shall be promoted to the type we compare to".  Hence the char(1) that is 'C' becomes a 
char(2) with 'C ' in it.

However, when you put 'C' into a varchar2, you are now comparing a varchar2 to a char and the 
results are very different.


My suggestion would be to (I am deadly serious) implement an engineering change request on your 
system to "remove that type, it wastes space, it causes confusion, it provides negative value". 

5 stars very strange   October 2, 2006 - 8am Central time zone
Reviewer: abz 
Oh, its really very confusing.
Problem is that we are using a 3rd party product
called CC&B from SPL www.splwg.com.

That product has used 99 percent of CHAR datatypes.

can you please tell me how can I convince them.

OR
Although you have explained it very well, but is there
a document like ALL ABOUT CHAR DATATYPE, which I can
circulate to all the developers in the organization, so 
that they know that what should they expect when working
with CHAR datatypes.

Thanks
 


Followup   October 2, 2006 - 9am Central time zone:

well, the documentation actually covers this.  I cover it too in my book Expert Oracle Database 
Architecture - chapter on datatypes - but I cover it from the perspective of "the char type is just 
a waste, do not use it" 

4 stars CHAR vs VARCHAR   January 15, 2007 - 12pm Central time zone
Reviewer: Deepak from India
Hi Tom,

First of all please excuse me because am going to ask a very trivial question. But it raises few points in my mind.

Have done this simple exercise...

SQL> conn scott/tiger
Connected.
SQL> create table char_vs_varchar(
2 col1 char(5),
3 col2 varchar2(5));

Table created.

SQL> insert into char_vs_varchar values('TEST','TEST');

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM char_vs_varchar
2 WHERE COL1=COL2;

no rows selected

SQL> SELECT * FROM char_vs_varchar
2 WHERE TRIM(COL1)=COL2;

COL1 COL2
----- -----
TEST TEST

SQL> SELECT * FROM char_vs_varchar
2 WHERE COL1='TEST';

COL1 COL2
----- -----
TEST TEST

Have the following queries...

I know that Oracle is doing the right thing as it right pads the string for the CHAR column with ' '.

>> But is it always recommended not to use this type of comparisions in queries?

>> Shall we always insert fixed length strings in CHAR columns? In other words, shall we decide on the CHAR data type based on the possible length of the string that will go in, CHAR would be selected for only FIXED length strings?

>> In the last example why does Oracle chose to trim the CHAR column value?

Please help me in understanding the basics?



4 stars Common defined types   February 20, 2007 - 2pm Central time zone
Reviewer: Andrew Markiewicz from Madison, WI
Tom.
My question/response is similar to a previous post on this thread about requiring all varchar2 to be >2000. Your response was
'Looks "safe" enough memory wise, but doesn't sound like a sound programming concept (well, it isn't
frankly).'

We were considering making a package spec to contain predefined types so that programmers could use them instead of defining their own type (most likely differently) in each package. Similar to the standards package Oracle defines but to include some standard plsql tables definitions.

Ex:
create or replace package typ
as
TYPE hashInt is TABLE of integer
index by varchar2(32000);
 
TYPE hashVc is TABLE of varchar2(32000)
index by varchar2(32000);

TYPE hashDt is TABLE of date
index by varchar2(32000);

TYPE vaInt  is VARRAY(10000) of integer;
TYPE vaVc   is VARRAY(10000) of varchar2(32000);
TYPE vaDt   is VARRAY(10000) of date;



end;


And using the types would merely be,
...
xyz  typ.hashDt;
abc  typ.vaInt;


The main benefits are not having to redefine a similar type everywhere it is needed and homogeneous code.
My main concern was the memory consumption but it seems Oracle handles that dynamically, which is good for memory conservation.
I would not advocate the use of these types if there was a requirement to limit the varchar2 size and trap an appropriate error as part of an application, but rather just as a convenient method of using simple arrays.

Are there any drawbacks to you can think of or is this also not a sound programming technique?

Thanks.


Followup   February 20, 2007 - 3pm Central time zone:


If you treat everything as a 32k string, everything will be a 32k string.

You lose the strong typing - which is generally considered "not a good thing".


2 stars   February 21, 2007 - 12pm Central time zone
Reviewer: Andrew Markiewicz 
But I am not suggesting that everything be a varchar2(32000). There is a plsql table defined for integers and dates as well so the data type is preserved. The main goal is to avoid having to define a type every time you just want a simple array of strings (or integers or dates).

An additional consideration could be to make a UDT VARRAY(32767). This adds the benefit of using the type constructor within function calls or even SQL.

create or replace type vcarr as VARRAY(32767) of varchar2(32768)
/

set echo on
col column_value format a20

select * from table(vcarr('x','y','z'));

COLUMN_VALUE
--------------------
x
y
z

declare
    x   vcarr := vcarr();

    procedure proc1(p_arr in vcarr)
    is
    begin
        for i in p_arr.FIRST .. p_arr.LAST
        loop
            dbms_output.put_line('i='|| i || ' ' || p_arr(i));
        end loop;
    end;

begin
    proc1(vcarr('a','b','c','d'));
    proc1(vcarr('hello', 'world'));

    x.EXTEND(5000);
    for i in 1 ..5000
    loop
        x(i) := to_char(i);
    end loop;
    proc1(x);

end;
/
i=1 a
i=2 b
i=3 c
i=4 d
i=1 hello
i=2 world
i=1 1
i=2 2
i=3 3
i=4 4
i=5 5
i=6 6
i=7 7
i=8 8
i=9 9
i=10 10
i=11 11
i=12 12
i=13 13
i=14 14
i=15 15
i=16 16
i=17 17
...
i=4999 4999
i=5000 5000

PL/SQL procedure successfully completed.


For the case of a type (or plsql table) I just abhor the idea of creating types like:

TYPE vc100 of varchar2(100)
TYPE vc500 of varchar2(500)
TYPE vc4000 of varchar2(4000)
...

when plsql handles the memory allocation dynamically (at a minor performance hit). Why not just one type per native data type (varchar2, integer, date,...)?
Again, for simple arrays that do not require constraints on the size of the data.
Unless of course Oracle has some designs on creating real arrays for plsql....

Andrew

Followup   February 21, 2007 - 12pm Central time zone:

if you have a table of names (that should be 30 characters or less)
and a table of addresses (that should be 120 characters or less)

and you use a table of 32k - you'll have names that are 32k and addresses as well.


it is called "strong typing", think of the datatype as a DATA INTEGRITY check (because, after all, that is exactly what it is)

No, you won't convince me to use a few generic types for everything.

I might not convince you not to - but that is another story.


You asked my opinion, I'm of the opinion that creating the correct type for your code is the right approach.


.... for simple arrays that do not require constraints on the size of the data. ...


not sure where that occurs, or why it is onerous to create a type specific to an API.
3 stars   February 21, 2007 - 1pm Central time zone
Reviewer: Andrew Markiewicz from Madison, WI
Fair enough.
That's why I'm debating the issue internally. I'm not convinced it's a great idea myself, just convenient.

As always, thanks for your input.

4 stars is there anything different varchar with varchar2?   March 14, 2007 - 10am Central time zone
Reviewer: Jerhyn from Seoul, South Korea
this place is so amazing, i always appraciate you tom.

while i read this topic, i was wondering what is different varchar with varchar2.

could you tell me? or just let me know where can i find about them.


Followup   March 14, 2007 - 2pm Central time zone:

nothing right now, they are the same right now, they might differ in the future.
3 stars furture   March 14, 2007 - 3pm Central time zone
Reviewer: marc from usa
How far in the future did they think about? Oracle has been saying "for future use" since version 7. Right?

Followup   March 15, 2007 - 7am Central time zone:

at least.
3 stars Char vs Varchar2, like vs =   March 28, 2007 - 2pm Central time zone
Reviewer: Jason Tsai from Santa Clara, CA USA
Tom,

This is the question regarding the reply you made for the second post in this thread from kiran.

I tried your test case in both 9.2.0.3 and 10.2.0.1, but got different results for predicate containing char datatype column and like operator:

In 9.2.0.3:

invmgmt_dev@SCORDEV3(95,1290)> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
PL/SQL Release 9.2.0.3.0 - Production
CORE    9.2.0.3.0       Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.3.0 - Production
NLSRTL Version 9.2.0.3.0 - Production

Elapsed: 00:00:00.01
invmgmt_dev@SCORDEV3(95,1290)> create table t ( x char(10), y varchar2(10) );

Table created.

Elapsed: 00:00:00.01
invmgmt_dev@SCORDEV3(95,1290)> insert into t values ( 'x', 'x ' );

1 row created.

Elapsed: 00:00:00.00
invmgmt_dev@SCORDEV3(95,1290)> select * from t where x like 'x';

X          Y
---------- ----------
x          x

Elapsed: 00:00:00.00
invmgmt_dev@SCORDEV3(95,1290)> select * from t where y like 'x';

no rows selected

Elapsed: 00:00:00.01


In 10.2.0.1:

jason@ORA10GR2(143,8689)> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

5 rows selected.

Elapsed: 00:00:00.09
jason@ORA10GR2(143,8689)> create table t ( x char(10), y varchar2(10) );

Table created.

Elapsed: 00:00:00.01
jason@ORA10GR2(143,8689)> insert into t values ( 'x', 'x ' );

1 row created.

Elapsed: 00:00:00.00
jason@ORA10GR2(143,8689)> select * from t where x like 'x';

no rows selected

Elapsed: 00:00:00.01
jason@ORA10GR2(143,8689)> select * from t where y like 'x';

no rows selected

Elapsed: 00:00:00.01


After searching the metalink, I found a bug no. 4397534, which indicates the behavior in 9.2.0.3 is a bug.

Which behavior is correct?

Thanks,

Jason

Followup   March 30, 2007 - 11am Central time zone:

why would I choose an answer other than what was stated in the bug report - that 9i had a bug that was corrected in 10g?
5 stars What is this behaviour of Oracle 10gR2   November 3, 2007 - 2am Central time zone
Reviewer: Sagar from Mumbai, India
Why varchar column is varchar2 in desc output?.

SQL> ed
Wrote file afiedt.buf

  1* create table tvar ( col1 varchar(100),col2 varchar2(4000 ))
SQL> /

Table created.

SQL> desc tvar
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 COL1                                               VARCHAR2(100)
 COL2                                               VARCHAR2(4000)


Followup   November 5, 2007 - 11am Central time zone:

as stated above... and as documented:

in response to "what is the difference between varchar and varchar2"


... nothing right now, they are the same right now, they might differ in the future. ...
5 stars date as varchar   February 19, 2008 - 3pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
hello tom,

i know you say that dates should be stored in date columns but there is one case where i am not 
sure about how to achieve that.

imagine you need to store a date value whose precision can be anything between a year and a second. 
so the column may contain values like this:

'2008'
'2008-10'
'2008-10-03'
'2008-10-03 12:20'

etc.

we used to store these dates as a date column and as a number column where the number specified the 
precision of the date, but working with such structure turned out to be quite cumbersome.

now we store these dates as a varchar ('yyyymmdd.....') - everything is much simpler now, but i 
still don't feel well about it - i wonder if there is a way to achieve this functionality with date 
datatype.

thank you


Followup   February 20, 2008 - 7am Central time zone:

when you store 2008 in yyyymmdd - what do you store?

01-jan-2008 00:00:00 -> 2008               
01-oct-2008 00:00:00 -> 2008-10
03-oct-2008 00:00:00 -> 2008-10-30
03-oct-2008 12:20:00 -> 2008-10-30 12:20


and how is it better/different then using a date as shown above.

if you store 2008 as 20080000000000 - then you have just stored the logical equivalent of a date column and a number column - just using one field (eg: how would that be any more or less cumbersome than what you already had)
5 stars dates   February 20, 2008 - 8pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
when you store 2008 in yyyymmdd - what do you store?

only the part of the date that we know is stored - so for year it's '2008', for a month it's '200810', etc.

if i only used a date column i would lose an important piece of information - the precision.

sometimes we do not know the exact date, we only know the year, or the year and month, etc. so storing 2005 as 2005-01-01 would be missleading for the users (that's why we used to have the additional number column specifying the precision).

for example - in the table containing statutory body members there are two date columns which tell us from when to when the person was a member of the statutory body of a company. when we want to select all records valid for a given date, we can write:

select * from statutory_body_members where :date_str between start_date and end_date||'z'

for this to work we would have to have different rules for storing the value of '2008' in start_date (2008-01-01) and in end_date (2008-12-31 23:59:59) (which of course would be possible but after several years of working with both ways of storing these dates, i find the 'varchar' way much simpler to use - especially due to the fact that only one column is needed to store the date)




Followup   February 20, 2008 - 8pm Central time zone:

you said you stored yyyymmddhh24miss, so you don't. You use a variable mask.


The varchar approach might be "easier" for you, but throws off the computed cardinalities. Your choice.


Wonder how many of your dates are invalid.... Data types are integrity checks too.

I'm not a fan of it (using the wrong type)


what if you used two columns again, the start date and the interval (duration). 
5 stars   February 20, 2008 - 9pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
Wonder how many of your dates are invalid.... Data types are integrity checks too.

s simple check constraint can take care of that.


I'm not a fan of it (using the wrong type)

i know - that's why i do not feel well about our use of varchars

what if you used two columns again, the start date and the interval (duration).

all operations (that we perform) on these columns would become more complicated. (eg. the one thing that we do the most - store the date, display the date to the user - now all we do is store the string as it comes and then show it again - with dates you need to keep track of where to store the "start date" format of the date and where to store the "end date", the when reading the date you need to format the date with a function because again there is no way to pass the date containing only the year as a date)

it would be best if oracle was capable of storing this information in a date column, unfortunatelly it only distinguishes between a date and a date with time - maybe a new datatype - just 3 more bits and so much work could be saved (at least at our company :-) )

Followup   February 20, 2008 - 10pm Central time zone:

.... s simple check constraint can take care of that. ...

so you've done that right?

with a start_date and a duration, you would not have to store the "format" anywhere.

the duration would cover that already.

start_date, start_date + interval



It does not distinguish between a date and a date with time, ALL DATES have time, all of them.


5 stars   February 21, 2008 - 7am Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
i am not sure whether i understand how this start_date + interval could help me. could you please 
show me sample data? for example for a person who was a member of the statutory body from '200204' 
to '2007'. if i understand you well, you suggest to use two columns for both these dates?



Followup   February 21, 2008 - 7am Central time zone:

show me how you do that with your current data first.


it'll be very similar.

provide your test data. Let us see examples of people that were there in 2005, another in October 2006 and someone for one day in 2007.
5 stars sample data   February 22, 2008 - 7am Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
here is a sample of real data (not very representative because 95% of strt_dt's and end_dt's in 
this table are yyyymmdd, but it's those that are not that cause "problems"..)


create table SB_SAMPLE
(
  ROW_ID     NUMBER(20) not null,
  ROW_VER    NUMBER(10),
  ROW_DT     DATE not null,
  COMPANY_ID NUMBER(20) not null,
  TYPE_CD    VARCHAR2(90),
  PERSON_ID  NUMBER(20) not null,
  STRT_DT    VARCHAR2(30),
  END_DT     VARCHAR2(30)
)
;

insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (700, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 1700, '04', 887200, 
null, '19960318');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1100, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 2500, '04', 884700, 
null, null);
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1200, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 2800, '04', 167000, 
'19950601', '199807');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1600, 2, to_date('22-06-2007', 'dd-mm-yyyy'), 3600, '04', 371300, '1994', '199708');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1900, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 4000, '04', 1009200, 
'1992', '19971231');
commit;


Followup   February 22, 2008 - 7am Central time zone:

you did not answer my question.

please show how to retrieve any bit of data with your where clause.

whether it be year/yearmonth/yearmonthday/whatever.

I want to find those people I referred to above, how would *you* do it.
5 stars   February 22, 2008 - 8am Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
ok, so basically we access these dates in 3 ways:

1) just display them (or store), so it's just
"select strt_dt from ..." - our nls_date_format is set
to 'yyyymmddhh24miss' so accessing date columns and varchar columns containing dates is transparent 
to the middle tier.

2) select records that were valid on a given date (this is usualy (almost always) a part of a more 
complex filter - eg: all members for a given set of companies and a given date):

select * from sb_sample where company_id in (1,2,...) and :date_str_param between between 
nvl(strt_dt,'0') and end_dt||'z'

3) select records whose dates fall within a given range (used for monitoring changes of tha data). 
eg:

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and 
strt_dt<=:date_str_param_to||'z'

(in 99% of cases the query is run with :date_str_param_to set to current date)



Followup   February 22, 2008 - 12pm Central time zone:

so, it would be IDENTICAL.

You would store a start date, eg:

01-jan-2007

and an interval

1 year

now, start_date is obviously 01-jan-2007
and end_date is that plus 1 year.

So, how much harder is that then the string? (eg: none)


select * from sb_sample where company_id in (1,2,...)
and to_date(:date_str_param,fmt)
between between nvl(strt_dt,some_date_way_in_past)
and nvl(strt_dt+interval,some_date_way_in_future)


5 stars   February 22, 2008 - 12pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
i do not think it's the same.

1) how would you store a records with dates like this:

i) start_date = '2005', end_date = '2006-03-02'
ii) start_date= '2005-03', end_date = '2007-04-02'
iii) start_date = '2005-01-06', end_date = '2009'

2) how woud you rewrite the third select:

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and 
strt_dt<=:date_str_param_to||'z'

to achieve the same functionality.

3) bear in mind that the same selects need
to be run on end dates too

i think what you sugest is functionally
equivalent to just having start_date
and end_date as two date columns,


Followup   February 22, 2008 - 12pm Central time zone:

'2005', end_date = '2006-03-02'

start date of 01-jan-2005 and interval of your end date minus start date


Look: you have START DATES and END DATES
I'm saying store the START DATE and a DURATION

now, we both have start dates and end dates

how to rewrite this?

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'

come one - you can do that given a date that is a start_date and another date that is an end_date equal to start_date plus duration.

instead of strt_dt||'z' you nvl(start_date, to_far_in_future)
instead of strt_dt, well, you just have start_date
instead of :date_str_param_to||'z' you have nvl(to_date(:bind),date_in_future)

actually, you could lose the strt_dt||'z' couldn't you - the null trick doesn't count since you just use strt_dt next.



All you need is

a) start_date
b) end_date

that is all, that is what you have, you are playing games with strings to get them to sort. We need not play games with dates, they just do the right thing in the first place.
5 stars   February 22, 2008 - 12pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
and one thing - when i insert '2008' i need to select '2008', if i was to use what you suggest, how 
would i do that?
i can't display '2008-01-01' to the user, when the value that was inserted (=the most precise value 
we know) was '2008' - this is very important.


Followup   February 22, 2008 - 12pm Central time zone:

that sounds like you want to store the display format when you save the data perhaps.

look - you apparently very much like your tricks with 'z' and everything. So be it. You can do it all with dates, you can do it just as easily with dates, and if you need a specific format for display purposes, by all means - go for it, that is pretty easy.
5 stars   February 22, 2008 - 1pm Central time zone
Reviewer: A reader 
that sounds like you want to store the display format when you save the data perhaps.

exactly - so there we are again. that's what we moved from - using a date + precision (or format whatever) - so instead of one column per date, i need two.

look - you apparently very much like your tricks with 'z' and everything. So be it

it's not about what i like or what i do not like - all i need is a solution that will be as "correct" as possible and at the same time, working with the data should be simple and straight forward.


instead of strt_dt||'z' you nvl(start_date, to_far_in_future)


well those two are not equivallent at all!
see

:date_param = '2006-05-01'
strt_dt = '2006';

while
:date_param<=strt_dt||'z' is TRUE
:date_param<=nvl('2006-01-01','2222-01-01') is FALSE

the thing is, every time i use a date after '<', '<=', or as the second operand of 'between' i need to make it the "latest second" in the given period (be it a year, month, day or whatever - so something<'2008' must be somtehing<'20081231235959', while something>'2008' is the same as somthing>'2008' (or '20080101' - ot's the same in this case) - but the same column can (and does) appear both in "less than" and "greater than" conditions)

if there is a simple way to achieve this with dates, i want to know it! - of course it's simple if you know that all records will hold 'yyyymmdd', or 'yyyymmmddhh24miss', but when every record can contain a different "precision" it's not that easy - so far i have not been able to come up with a way to achieve this with dates while keeping the simplicity of the solution anywhere near the varchar way..

it does not please me, believe me. i would love to have a date datatype remembering the exact value inserted.





Followup   February 22, 2008 - 3pm Central time zone:

.... exactly - so there we are again. that's what we moved from - using a date + precision (or format whatever) - so instead of one column per date, i need two. ...

and so what, you are storing multiple bits of data encoded differently into a single field for your ease, your convenience, this is no different than using a field to store attribute_1 or attribute_2 - depending on how you feel.

You are just encoding data special to save yourself a keystroke. And adding the need for your 'z' tricks and such - pretending dates are strings and having to munge them to make them work.


...
it's not about what i like or what i do not like - all i need is a solution that will be as "correct" as possible and at the same time, working with the data should be simple and straight forward....

Ok, now here you are just plain wrong.

If I have:

a) start date
b) end date

in Oracle dates - which I do if I either store a start and end date (YOU ARE, just in a string, not in a date) or start date and interval - getting all of the data BY DATE is easy, straightforward, not cumbersome.

When you put it in a string, now you have to teach people the zen and art of the letter 'z'.

Your approach is cumbersome and requires training. Using DATES, using simple DATES - anyone should be able to retrieve the data.

And selecting

to_char(start_date, start_date_fmt), to_char(end_date, end_date_fmt)

does not seem cumbersome (heck - think 'view'). And inserting the format is not either (umm, you must have some editing in place right, you don't after all accept 20081 as valid input do you? how do you prevent that - well, just replace all of that code with collect a format and insert it. The DATE will be validated by the datatype, your format is on of YYYY YYYYMM YYYYMMDD YYYYMMDDHH24MISS - easy for a check constraint)


I still am not seeing your perceived complexity in getting data out that has DATES in it.

I don't know why you used start date there, you would use start date and end date to get things that happen in a range.

You had:

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'


(which frankly, the 'z' trick, confusing - unless you invented it)

if you want overlaps, you only need:

(start_date <= :p_ending) AND (end_date >= :p_beginning)

that is more clear that 'z'

5 stars   February 22, 2008 - 3pm Central time zone
Reviewer: A reader 
<i>if you want overlaps, you only need:

(start_date <= :p_ending) AND (end_date >= :p_beginning)
</i>

did i say that overlaps are a problem? not at all - it's the other query that takes avdantage of 
varchar storage:

<i>
You had: 

select * from sb_sample where 
strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z' </i>

so please - show me how to rewrite this simple query using dates instead of varchar. 




Followup   February 22, 2008 - 3pm Central time zone:

tell us what that query does.

put it out in specification form. I'm not following the 'z' logic frankly.

because you know, it looks like "find me things that overlap with my range", apparently - it isn't.

In fact, it is what you said:

3) select records whose dates fall within a given range (used for monitoring changes of tha data). 
eg:

select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and 
strt_dt<=:date_str_param_to||'z'

(in 99% of cases the query is run with :date_str_param_to set to current date)


taking what you said in that, my predicate is what you use with dates.
5 stars   February 22, 2008 - 3pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
ok, an example:

the user enters a range of dates: "from" and "to"
(again - they can enter just a year, year+month, date, date with time - but that's not important 
now)

now they want all records whose start_date (the same goes for end dates) is in that range, so, if 
they enter 

'2006-07-21' - '2007-02-03'

they want all records whose start_date was between those two dates (including records where the 
start_date is '2006', '2006-07',etc.)

or they can enter '2006-10' - '2007'
and the same applies.

so if the date is '2006' it must be returned for all ranges that overlap with this year at least by 
a second.


Followup   February 22, 2008 - 4pm Central time zone:

and that is what my predicate would return.

Your example has a from/to
Your front end gets a from/to

Your front end adjusts those from/to (just like you add your 'z')

So, if they enter from 2006 to 2007, then yes, your front end would be

2006-01-01-00:00:00  2007-12-31-23:59:59

I see that as being no different than 'z', your front end accepts data, and does the right thing - regardless.

so, if you have dates, this is not hard - certainly *not impossible or impracticable'

but like I said, you seem extremely happy with what you have, I'm not going to change that - you have 'z', you like it.

I look at this and say "the front end would easily be able to accept input dates - validate these inputs - and bind the correct value.

Heck, you could even bind STRINGS if you wanted.... A simple CASE statement in the where clause would convert them.

for the start date and then a different one for end date
case when length(:bind) = 4 /* must be yyyy */ then to_date ....


5 stars   February 22, 2008 - 4pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
it has nothing to do with the front end and how the parameters are sent
look:

alter session set nls_date_format = 'yyyymmddhh24miss';

create table dates (
val_txt varchar2(20) not null,
val_dt date not null
);

insert into dates values ('2006','20060101');
insert into dates values ('200610','20061001');
insert into dates values ('20061021','20061021');
commit;



-- user input:
-- from = '2006-02-01', to '2006-10-02'
-- correct result:
select * 
  from dates 
 where val_txt||'z'>='20060201'
   and val_txt<'20061002'

-- wrong result
select *
  from dates
 where val_dt between '20060201' and '20061002235959'



it's all about the need for '2006' to fullfill "between '20060201' and '20061002'"

i can do that with dates - but a simple >= becomes long chain of "or's"

Followup   February 22, 2008 - 4pm Central time zone:

I said - the database table would always have

a) start_date
b) a way to derive END DATE - either by putting in there or putting an interval, you choose.

so, the table would have start_date, end_date

never 2006

it would have the first day of 2006, last day of 2006 - or first day of 2006 and an interval of one year.


AND IT IS ONE AND - no ORS

I'm not getting it.

start_date
end_date/interval

simple where clause

done.
5 stars   February 23, 2008 - 3am Central time zone
Reviewer: dobromil kautsky na stroji husgvarna 
i'm so stupid!

i think we still misunderstand each other - i have no problem with representing the interval - 
start_date, end_date, but to work with the one column (be it start_date, or end_date).

when i need to filter records based on their start_date (it's not related to end_Date or any other 
columns) - however i store tha start_date representing '2006' i need to make sure, that when i 
compare this column to another date value with ">" or ">=" (like in start_date>=:param), this 
start_date must not act as 2006-01-01 but as 2006-12-32 23:59:59, but at the same time when i 
compare it with "<" or "<=" it must act as 2006-01-01. 
(like in the example where the user wanted all records whose start_date is between '2006-02-01' and 
'2006-10-02' - when they search this interval, they want to see the records with start_date = 
'2006' too, which is '2006-01-01' when stored as a date and does not match this condition)

i was so biased agianst having each date (that needs this variable precision) in two columns, that 
not finding a simple soliution in sql i did not even think of using a function, on which i could 
build an index (which is what i do with strt_dt||'z' anyway) and then use

getlastsecof(start_date,start_date_precision)>=:param

just like i used strt_dt||'z'

maybe i needed to hear that storing the date in two columns is ok (more ok than using the string) 
to be able to see the obvious.

thanks :-)


Followup   February 24, 2008 - 11am Central time zone:

... but to work with the one column (be it
start_date, or end_date). ..

and I find that to be a bad practice. You are encoding multiple attributes into a single string, needing to use the "z" trick. It is all tricks and all not a good practice.


When you start encoding data in an attribute - you are doing something not good in general (in fact, I'm hard pressed to come up with a case whereby encoding data magically into an attribute was a good thing - I cannot).

You have to teach people how to query your data. That in itself is a bad thing. This 'z' trick - is just that, a neat trick - but it indicates a deeper issue - the data is munged in there in a tricky way.

That is why I do not like it, I like straightforward, simple, correct...
3 stars 10g and varchar2 optimization   June 3, 2008 - 8pm Central time zone
Reviewer: Andres from Estonia
about the original answer and especially the "more direct proof" above (
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1542606219593#28397955862861
).
It seems, that the line whether the amount of memory is allocated at once or dynamically has moved in 10g from 1999/2000 to 4000/4001 (despite the documentation still claims it to be 1999/2000). I mean, declaring
x varchar2(4000) := 'x'; takes 4000 bytes of memory
and
x varchar2(4001) := 'x'; takes 1 byte.
Now, when in plsql dealing with varchar2(4000) columns that is 50% emtpty, 49% contain less than 100 bytes and 1% several thousand bytes, it would be more wize to declare as variables varchar2(4001) instead of table.column%type?

Followup   June 4, 2008 - 9am Central time zone:

this is why I like to post these snippets - things change and it is nice to be able to verify that the advice of yesterday still does OR does not apply...

as for the last bit, it'll always be a trade off between performance and memory.

4001 - allocate less overall memory, but do so dynamically - might impose a runtime ping

4000 - allocate more overall memory, but run as fast as you can..
4 stars Char vs. varchar storage for Null   June 19, 2008 - 7pm Central time zone
Reviewer: Nitin from Boston, MA USA
Tom, This is extremely informative - but I couldn't definitively derive the answer to something 
that's key to some database size issues we're facing. I get that character data pads up to the 
defined size - so characters are bad. But then there are third party apps that you can't change 
over to varchar :-)
So the question is - if I predominantly only have NULL values in a character column, will it still 
take up the entire padded space storage wise? Or are NULLs a special condition under which char and 
varchar take up the same amount of storage space on disk?


Followup   June 19, 2008 - 8pm Central time zone:

if null, they will consume just the space needed to indicate "null"

ops$tkyte%ORA10GR2> create table t ( x char(1000) );

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t select null from all_objects;

49738 rows created.

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2> exec show_space( 'T' );
Unformatted Blocks .....................               0
FS1 Blocks (0-25)  .....................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               5
Full Blocks        .....................              75
Total Blocks............................              88
Total Bytes.............................         720,896
Total MBytes............................               0
Unused Blocks...........................               0
Unused Bytes............................               0
Last Used Ext FileId....................               4
Last Used Ext BlockId...................          22,753
Last Used Block.........................               8

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> truncate table t;

Table truncated.

ops$tkyte%ORA10GR2> insert into t select 'x' from all_objects;

49738 rows created.

ops$tkyte%ORA10GR2> exec show_space( 'T' );
Unformatted Blocks .....................               0
FS1 Blocks (0-25)  .....................               2
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               1
FS4 Blocks (75-100).....................               4
Full Blocks        .....................           7,103
Total Blocks............................           7,296
Total Bytes.............................      59,768,832
Total MBytes............................              57
Unused Blocks...........................              64
Unused Bytes............................         524,288
Last Used Ext FileId....................               4
Last Used Ext BlockId...................          56,457
Last Used Block.........................              64

PL/SQL procedure successfully completed.


4 stars Char Vs Varchar2   January 8, 2009 - 4am Central time zone
Reviewer: A reader 
Hi Tom,

Could you please let me know, if there is any benefit in having CHAR(1) instead of VARCHAR2(1) in 
the database? i.e., in relation to performance, memory structure, execution plan etc.

Regards,


Followup   January 8, 2009 - 9am Central time zone:

nope, they would be identical.


so my advice is "do not use char(), it would just cause confusion, pretend char() does not exist"
4 stars testcase for timestamp datatype   May 25, 2009 - 6am Central time zone
Reviewer: jvs from India
Dear Tom,
good day to you as always, if you can please comment on the below test case to proof that timestamp datatype should be used to store timestamp values and not varchar2 datatype.

create table testts (ts_str varchar2(24),ts_ts timestamp);

insert into testts select to_char(to_timestamp('31121995135022833459','ddmmyyyyhh24missff') + 
numtodsinterval(rownum,'DAY'),'ddmmyyyyhh24missff6') ts_str,
to_timestamp('31121995135022833459','ddmmyyyyhh24missff') + numtodsinterval(rownum,'DAY') from 
all_objects;

commit;

create index idx_ts_ts on testts(ts_ts);

create index idx_ts_str on testts(ts_str);


execute dbms_stats.gather_table_stats(user,'TESTTS',estimate_percent => null,cascade => true);

select * from testts where ts_str between '01011996135022833459' and '02021996135022833459';
-- Card and consitent gets will be high

select * from testts where to_timestamp(ts_str,'ddmmyyyyhh24missff9') between 
to_timestamp('01011996135022833459','ddmmyyyyhh24missff9') and 
to_timestamp('02021996135022833459','ddmmyyyyhh24missff9');
-- Card may be fine but consitent gets will be high

select * from testts where ts_ts between to_timestamp('01011996135022833459','ddmmyyyyhh24missff') 
and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');
-- Card and consistent gets will be fine.


I have not pasted the output of auto trace, but overall is the test case fine.

Thanking you for your time on this.

Regards,
jvs

Followup   May 26, 2009 - 9am Central time zone:

... good day to you as always, if you can please comment on the below test case to proof that timestamp datatype should be used to store timestamp values and not varchar2 datatype. ...

you need no testcase for that.

it is called common sense, logic, doing it correctly. 


if you use a string/number to store a date you will get corrupt data some day (you will, not "you might", but rather YOU WILL. Someone will insert feb-29, 2009 - some day. IT WILL happen)

if you use a string/number to store a date, you will consume MORE storage. It takes more space.

If you use a string/number to store a date, you will be plagued with conversion errors.

if you use a string/number to store a date, you WILL confuse the optimizer - causing it to incorrectly estimate cardinalities. This will result in incorrect plans and poor performance (funny, you know what type of systems typically do this extremely poor practice? data warehouses - precisely the systems whereby good query plans are VITAL)

ops$tkyte%ORA10GR2> create table testts (ts_str varchar2(24),ts_ts timestamp);

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into testts select
  2  to_char(to_timestamp('31121995135022833459','ddmmyyyyhh24missff') +
  3  numtodsinterval(rownum,'DAY'),'ddmmyyyyhh24missff6') ts_str,
  4  to_timestamp('31121995135022833459','ddmmyyyyhh24missff') +
  5  numtodsinterval(rownum,'DAY')
  6  from dual connect by level <= 65000;

65000 rows created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index idx_ts_ts on testts(ts_ts);

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index idx_ts_str on testts(ts_str);

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> execute dbms_stats.gather_table_stats(user,'TESTTS');

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @at
ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from testts where ts_str between '01011996135022833459' and 
'02021996135022833459';

Execution Plan
----------------------------------------------------------
Plan hash value: 3830092671

------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |  2181 | 69792 |    28   (4)| 00:00:
|*  1 |  TABLE ACCESS FULL| TESTTS |  2181 | 69792 |    28   (4)| 00:00:
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TS_STR"<='02021996135022833459' AND
              "TS_STR">='01011996135022833459')

full scan, prompted by the optimizer guessing 2,181 rows... there are lots of possible strings 
between those two strings...


ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from testts where to_timestamp(ts_str,'ddmmyyyyhh24missff9') between
  2  to_timestamp('01011996135022833459','ddmmyyyyhh24missff9') and
  3  to_timestamp('02021996135022833459','ddmmyyyyhh24missff9');

Execution Plan
----------------------------------------------------------
Plan hash value: 3131276641

------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |   163 |  5216 |    34  (21)| 00:00
|*  1 |  FILTER            |        |       |       |            |
|*  2 |   TABLE ACCESS FULL| TESTTS |   163 |  5216 |    34  (21)| 00:00
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24missff9')
              <=TO_TIMESTAMP('02021996135022833459','ddmmyyyyhh24missff9
   2 - filter(TO_TIMESTAMP("TS_STR",'ddmmyyyyhh24missff9')>=TO_TIMESTAMP
              ('01011996135022833459','ddmmyyyyhh24missff9') AND
              TO_TIMESTAMP("TS_STR",'ddmmyyyyhh24missff9')<=TO_TIMESTAMP
              22833459','ddmmyyyyhh24missff9'))

full scan prompted by lack of index - and if you say "function based index" then I say "so, you 
store a string and then to_timestamp it in an index - why?  why???? why do it wrong??

and the estimated card= value is still off, better, but off


ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from testts where ts_ts between
  2  to_timestamp('01011996135022833459','ddmmyyyyhh24missff')
  3  and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');

Execution Plan
----------------------------------------------------------
Plan hash value: 2730030766

------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |    10 |   320 |     3
|*  1 |  FILTER                      |           |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| TESTTS    |    10 |   320 |     3
|*  3 |    INDEX RANGE SCAN          | IDX_TS_TS |    10 |       |     2
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24missff')<
              AMP('02021996135022833459','ddmmyyyyhh24missff'))
   3 - access("TS_TS">=TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24
              AND "TS_TS"<=TO_TIMESTAMP('02021996135022833459','ddmmyyyy

 use the right type, get the right cardinality, get the right plan


ops$tkyte%ORA10GR2> set autotrace off
ops$tkyte%ORA10GR2> select count(*) from testts where ts_ts between
  2  to_timestamp('01011996135022833459','ddmmyyyyhh24missff')
  3  and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');

  COUNT(*)
----------
        33


5 stars thanks for your inputs   May 26, 2009 - 12pm Central time zone
Reviewer: jvs from India
Dear Tom,
good day to you, thanks a lot for your time and help on this, I can now prove with the test case to 
people who don't have the common sense to understand proper data type usage.

Thanks and Regards,
jvs


3 stars char > varchar2 Referencial integrity   May 26, 2009 - 5pm Central time zone
Reviewer: FMedina from Spain
Hello Tom,

I have the following problem:

-- 1/
CREATE TABLE FIELD_CHAR_MASTER(FIELD_PARENT CHAR(12)); 

-- 2/
CREATE TABLE FIELD_VARCHAR2_DETAIL(FIELD_CHILD VARCHAR(12)); 

-- 3/
ALTER TABLE FIELD_VARCHAR2_DETAIL ADD (
  CONSTRAINT MASTER_DETAIL FOREIGN KEY (FIELD_CHILD) 
    REFERENCES FIELD_CHAR_MASTER (FIELD_PARENT));

-- 4/
ORA-02267: column type incompatible with referenced column type


The FIELD_CHAR_MASTER table is big and it´s used for a lot of objects and querys. It has other 
child tables with CHAR(12) datatype. It not "easy" alter these table. 

Is there any way to implement that referencial integrity? 

Should I use de CHAR datatype in the future if it´s necesary another child table?

[ Database Version: 9.2.0.8.0 ]

Thanks in advanced


Followup   May 27, 2009 - 8am Central time zone:

you should never have used the char() type at all - but since you did and since it is the parent, you'll need to modify your CHILD table to have an appropriate type.

Of course you should use a char type in the future if referential integrity is to be supported, you have no other choice (other than to change everything else to be a varchar2)
4 stars CHAR to VARCHAR in Stored Procedure param?   June 23, 2009 - 2pm Central time zone
Reviewer: Nigel Olding from California
When I changed my stored procedure parameter to use the datatype of the column in the table from 
which the data was being retrieved, my CHAR(1) column was returned as a maximum-sized VARCHAR2 to 
my calling PL/SQL.

Using this table...
 
create table NTEST_CHAR1
(
  COLUMN1 CHAR(1) not null,
  COLUMN2 CHAR(1) not null,
  COLUMN3 CHAR(1) not null
);

...with this data...
 
insert into ntest_char1 n
values ('1','A','a');
 
...this procedure...
 
CREATE OR REPLACE PROCEDURE test_char1(p_col2 OUT ntest_char1.column2%TYPE) IS
BEGIN
    SELECT n.column2 INTO p_col2 FROM ntest_char1 n WHERE rownum = 1;
 
END test_char1;

...returns a max-sized VARCHR2...

begin
    declare l_col2 varchar(32000);
    
    begin
    
    test_char1(l_col2);
    
    dbms_output.put_line(length(l_col2));
    
   end;
end;

Output is:
32000

Is this the expected behaviour? Shouldn't it be 1 character?
If I declare the stored proc param as OUT VARCHAR2, it is returned as 1 character...

Oracle 9.2.0.1.0

Thanks,


Followup   June 26, 2009 - 9am Central time zone:

plsql parameters are always unconstrained types - this is the way is was designed to work, yes.

Reason #3213 to never never never use the CHAR type
4 stars char(1) vs. varchar2(1)   June 25, 2009 - 5pm Central time zone
Reviewer: David from Austin, TX
Tom,
if char(1) and varchar2(1) are really the same thing in terms of storage and performance, what's 
the reason to keep the CHAR data type in Oracle?

Thanks,
David


Followup   June 26, 2009 - 10am Central time zone:

none, except the standards.
4 stars char(1) vs. varchar2(1)   June 26, 2009 - 1pm Central time zone
Reviewer: David from Austin, TX
when responding to my question about the reasons to keep CHAR data type in Oracle, you said 
"none, except the standards.". What Standards are you referring to? Also, do you think it's 
appropriate to eliminate the use of CHAR in table design in my own organization?  Thanks, David


Followup   June 26, 2009 - 2pm Central time zone:

ANSI-ISO SQL Standards


I have written many times "pretend char does not exist, use only varchar2"
5 stars   June 30, 2009 - 2pm Central time zone
Reviewer: Chuck Jolley from OKC, OK USA
When we upgraded from 8i to 10g we converted all of our char columns and pl/sql variables to 
varchar2 because some of our code broke related to chars and it seemed too much trouble to go 
through tens of thousands of lines of code and hundreds of tables testing for obscure data 
conversion and comparison errors just to upgrade.
Our tables and code base started in Oracle 7.
The process was mostly scriptable and relatively painless.
The only place we had to do hand modifications were a few lines of code in our front end that 
expected padded data as a workaround for the behavior of chars in the first place.

Not a high cost to never have to worry about something again.
 


5 stars varchar2 (char) to varchar2 (byte)   July 21, 2009 - 11pm Central time zone
Reviewer: Stuart from Auckland, New Zealand
Hi Tom,

I have a 10.2.0.4 database (character set=UTF8, nls_length_semantics=byte), where the developer 
created a table as follows:

create table xxxx
(  column_a_id        number,
   column_b_text      varchar2(255 char),
   column_c_text      varchar2(255 char),
...
...

Unfortunately, the user querying a view on this base table shows the data like it is of CHAR 
datatype (with lots of trailing spaces).   Researching this, I found the tool has a bug which reads 
as follows:  (http://support.sas.com/kb/20/088.html)

"When you access an Oracle database from SAS® in single-byte encoding and set the database 
characterset to UTF8, the SAS/ACCESS Interface to Oracle might return incorrect byte lengths for 
character columns that have lengths defined in CHAR semantics (lengths are usually much larger than 
needed).

This problem occurs when using SAS/ACCESS Interface to Oracle for CHAR and VARCHAR2 columns and 
reflects byte lengths based on the database characterset and not the client-side (SAS) encoding. 
For example, for a VARCHAR2(3 CHAR) column in an Oracle UTF8 database, the byte length surfaced in 
SAS is tripled (9). So all the CHAR and VARCHAR2 column lengths are tripled in SAS even though the 
client-side encoding is single byte. "

So Oracle is fine.  Doing a length() function on the data shows the correct length, which confirms 
I'm hitting this bug.

Rather than adding the suggested hotfix to all the PC's, or changing the table structure to 
varchar2(byte), I'd prefer to update the existing view on this table.  

Question 1:
Is it possible to do a conversion in the SQL for the view creation where it converts varchar2 char 
to byte?  

Question 2:
Although our developer is adamant we don't store any multibyte data, I'm loathe to go about 
changing the varchar2 datatype semantics across our schema.   Just out of interest, if I did go 
down this track (I'd install the hotfix rather than this option), would there be any ramifications 
apart from the possibility of truncating any data that might be multi-byte?

If I have to install the hotfix, then so be it.


Followup   July 26, 2009 - 6am Central time zone:

Question 1:
Is it possible to do a conversion in the SQL for the view creation where it
converts varchar2 char to byte?


when you tested, what did you discover? In any case - you would need to substr AND cast in the view. varchar2(255 char) in a UTF8 database can be a lot larger than varchar2(255 byte)


Although our developer is adamant we don't store any multibyte data, so, they have run a scan of the entire database right? Else, they do not know.

You would/could truncate data - yes.
5 stars varchar2 (char) to varchar2 (byte)... continued   July 22, 2009 - 12am Central time zone
Reviewer: Stuart from Auckland, New Zealand
Further to the previous post,...

I've seen a few base tables which have a mix of:
    column_name_a    varchar2(99 byte)
    column_name_b    varchar2(99 char)

So it looks like the hotfix won't work, as it specifically states in the caveats that the hotfix 
will fail with tables created with mixed semantics.

So it looks like I either 
a. update the tables to be varchar2 byte semantics
b. somehow update the views to convert varchar2 char to byte semantics

If you can advise on my previous question, it would be much appreciated.


Followup   July 26, 2009 - 6am Central time zone:

it'll be up to you as to the approach you want to take on this one.
4 stars Modifying varchar2 column to increase size   September 11, 2009 - 2pm Central time zone
Reviewer: KK from india
Tom,

We have a requirement to modify a varchar2 column size increased. The table is essentially a 
partitioned table and billions of rows are there. As varchar2 allocates space according to inserted 
value, does this operation will affect only data dictionary. I mean will this change will be quick 
enough to modify data dictionary alone and not modify any tablespaces which holds the data right 
now. 


Followup   September 14, 2009 - 1pm Central time zone:

it won't touch any existing data, it just modifies the data dictionary.
5 stars strange behaviour or varchar and char   October 7, 2009 - 7am Central time zone
Reviewer: MAYANK GUPTA from Delhi,INDIA
Hi Tom, 
I used these following steps:--

drop table a cascade;
create table a (a varchar(5),b char(5));

insert into a values('abc','abc');
insert into a values('xyz ','xyz '); --trailing space after z in both cases.

select * from a where a=b;

POSTGRES:

  a | b
------+-------
 abc | abc
 xyz | xyz
(2 rows)


ORACLE 10g:

No data Found

and if I am using same datatype then oracle give results

Regards:
Mayank Gupta
cont: 09711065568


Followup   October 8, 2009 - 7am Central time zone:

<quote src=Expert Oracle Database Architecture>

It is for this reason – the fact that a CHAR/NCHAR is really nothing more than a VARCHAR2/NVARCHAR2 in disguise – that makes me of the opinion there are really only two character string types to ever consider – VARCHAR2 and NVARCHAR2. I have never found a use for the CHAR type in any application. Since a CHAR type always blank pads the resulting string out to a fixed width, we discover rapidly that it consumes maximum storage both in the table segment as well as any index segments. That would be bad enough, but there is another important reason. It creates confusion in applications that need to retrieve this information (many cannot “find” their data after storing it). The reason has to do with the rules of character string comparison, and the strictness with which they are performed. Let’s use ‘Hello World’ string in a simple table to demonstrate:

ops$tkyte@ORA10G> create table t
2  ( char_column      char(20),
  3    varchar2_column  varchar2(20)
  4  )
  5  /
Table created.
ops$tkyte@ORA10G> insert into t values ( ‘Hello World’, ‘Hello World’ );
1 row created.
ops$tkyte@ORA10G> select * from t;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World
ops$tkyte@ORA10G> select * from t where char_column = ‘Hello World’;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World
ops$tkyte@ORA10G> select * from t where varchar2_column = ‘Hello World’;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World


So far they look identical, they appear to be the same – but in fact, there has been some implicit conversions happening – the CHAR(11) literal was promoted to a CHAR(20) and blank padded when compared to the CHAR column. We see for a fact that these two strings are materially different:
ops$tkyte@ORA10G> select * from t where char_column = varchar2_column;
no rows selected

They are not equal to each other. We would have to either blank pad out the VARCHAR2_COLUMN to be 20 bytes in length or we would have to trim the trailing blanks from the CHAR_COLUMN:
ops$tkyte@ORA10G> select * from t where trim(char_column) = varchar2_column;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World
ops$tkyte@ORA10G> select * from t where char_column = rpad( varchar2_column, 20 );
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World


NOTE: there are many ways to blank pad the varchar2_column, such as using the CAST() function.


The problem comes in with applications. Many of them will in fact utilize varying length strings when they bind inputs – with the resulting “no data found” that is sure to follow. For example:
ops$tkyte@ORA10G> variable varchar2_bv varchar2(20)
ops$tkyte@ORA10G> exec :varchar2_bv := ‘Hello World’;
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> select * from t where char_column = :varchar2_bv;
no rows selected
ops$tkyte@ORA10G> select * from t where varchar2_column = :varchar2_bv;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World

So here, the search for the VARCHAR2 string worked but the CHAR column did not. The varchar2 bind variable will not be promoted to a CHAR(20) as the character string literal is. At this point, many programmer gets of the opinion that “bind variables don’t work, we have to use literals”. That would be a very bad decision indeed. So, the solution is to bind using a CHAR type:
ops$tkyte@ORA10G> variable char_bv char(20)
ops$tkyte@ORA10G> exec :char_bv := ‘Hello World’;
 
PL/SQL procedure successfully completed.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select * from t where char_column = :char_bv;
CHAR_COLUMN          VARCHAR2_COLUMN
-------------------- --------------------
Hello World          Hello World
ops$tkyte@ORA10G> select * from t where varchar2_column = :char_bv;
no rows selected

Only partially – if you mix and match VARCHAR2 and CHAR, you’ll be running into this issue constantly. Not only that, but the developer is now having to consider the field width into their applications. If they opt for the rpad() trick to convert the bind variable into something that will be comparable to the CHAR field (it is preferable of course to pad out the bind variable, rather than TRIM the database column!). Now if the size of the field changes – the application is impacted, it must change it’s field width.

It is for these reasons – the fixed width storage, which tends to make the tables and related indexes much larger than normal, coupled with the bind variable issue that makes me avoid the CHAR type in all circumstances. I cannot even make an argument for it in the case of the 1 character field – for in that case, it is really of no material difference. The VARCHAR2(1) and CHAR(1) are identical in all aspects. There is no compelling reason to use the CHAR type in that case and in order to avoid any confusion – I just say “no” for even the CHAR(1) field.


</quote>
5 stars pga_aggregate_target   October 12, 2009 - 4am Central time zone
Reviewer: Gurudarshan G P from india
Hi Tom,
      I would like to ask you that where the pga will be allocated when the userprocess 
started,either in the serverside  or in the client side. And if we dont set  pga_aggregate_target 
value then how to check that the allocated pga memory.

Awaiting for ealiest relpy.




4 stars What about MD5 sums that are always 32 characters?   November 18, 2009 - 12pm Central time zone
Reviewer: Owen from Denver, USA
I'm one of those ignorant developers who has the misguided notion of char being better for fixed 
lengths.  Please set me straight:

We're storing MD5 checksums which are always, no matter what, exactly 32 characters.  Is it still 
better to use varchar2(x) instead of char(32)?  The column is currently defined as a varchar2(300) 
because the dba says 'it doesn't matter just as long as it is long enough' - is this true?  Even if 
varchar2 wins over char, it seems to me it needs to be varchar2(32) because 32 needs only 5 bits to 
store/handle the length but 300 needs 9.  We use WHERE clauses against this column and there are 
over 100M rows so we need so squeeze the performance as much as possible.  Right now it's slower 
than we need it to be.

PS - There's no index because the values are unique and a lot of other columns need to be returned 
so such an index would have 1:1 with the table and take practically just as much space.  Is this a 
correct analysis?

Thanks for any tips!


Followup   November 23, 2009 - 3pm Central time zone:

a char(n) is simply a varchar2(n) blank padded to be always N characters.

So, if you store something that is always 32 characters - a char(32) and a varchar2(32) are *identical*. Identical on disk, identical in memory.

Which leads me to turn the question around - is it better to use a char(32)? What do you think you would gain by it?

I believe you should not use char(n) ever - for the simple reason that is makes you forget they exist and you are not ever tempted to use them incorrectly.


forget about bits, nothing happens at the bit level here. The leading length indicator will be 1 byte for 255 or less and 2 bytes for anything else.


I believe your DBA is mistaken however in their statement - you want a datatype to be as RESTRICTIVE as possible, the only thing you'll get by using varchar2(300) to hold a field that can only be varchar2(32) is - someday someone will put 300 characters in there - and then what? data integrity - goodbye.


The stupid thing is - we could alter a varchar2(32) to be a varchar2(300) if the need to do so arises. However, to alter a varchar2(300) to the correct length of 32 would take a reorganization of the entire table. 300 was wrong, 32 is correct and a char(32) would provide no benefit over a varchar2(32) and since you never really want to remember char exists - do not use it.

... PS - There's no index because the values are unique and a lot of other columns
need to be returned so such an index would have 1:1 with the table and take
practically just as much space. Is this a correct analysis?
...

if this is an md5 checksum and the values must be unique - I don't really CARE how much space it takes. What will happen to your application when (not if, but when) a duplicate value occurs. You know that the space of 32 character md5 checksums is FINITE right (it is big, but finite) and your inputs - can and will generate duplicate hashes - it will happen someday - and then what?

If you have a unique constraint - unique it. Space be darned.

If you access this data by the checksum - index it for performance reasons.
If you need this to be unique - put the constraint on.

Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement