Char VS Varchar
April 3, 2002 - 4pm Central time zone
Reviewer: kiran from cincinnati, OH USA
In all your replys to Char and varchar2 questions you mentioned that Varchar2 supports all the char
properties.
I noticed a weired problem with Varchar2.
I have a table test1(tname char(20),my_name varchar2(20));
I inserted two records into this table
('kiran','kiran')
('kiran ','kiran ')---Notice there is a space after kiran.
Now I am selecting from the table using the below sql
select * from test1 where tname = 'kiran';
---this returned the above two records
but when I am select the same table using the below sql
select * from test1 where my_name = 'kiran'
---it returned only one record(1st one). I tested this on 7.3 and also on 8I could you explain
why it is happening?
another interesting thing is select * from test1 where tname like 'kiran' is not returning any
recors.When the "=" is working fine, why the like is not working?
Followup April 3, 2002 - 6pm Central time zone:
varchar2's can store trailing blanks -- IF You put them there.
the answer is simple.
'a ' <> 'a'
tried this in 8i (r2 and r3)
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( x char(10), y varchar2(10) );
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( 'x', 'x ' );
1 row created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where x like 'x';
X Y
---------- ----------
x x
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where y like 'x';
no rows selected
works as expected for me.
Is there a Leading Length field in RAW Datatype ?
April 4, 2002 - 2am Central time zone
Reviewer: Pascal from Hamburg , Germany
Hi Tom,
Thanks for a Great answer!
If i could just post a followup on that expecting a short YES/NO answer-
Is there a Leading Length field in RAW Datatype ?
and
Is it True that when i use RAW type in my tables as an automatically generated Primary Key - by
selecting sys_guid() from dual , from within a Trigger - then i would not be able to Create
Materialized Views with columns of RAW type ?
Thanks in advance
Followup April 4, 2002 - 11am Central time zone:
you can either
select length( raw_field )/2 from t
or
select utl_raw.length(raw_field) from t
to get the length.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( pk raw(16) primary key, data int );
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create materialized view t_mv
2 as
3 select * from t;
Materialized view created.
seems to work.
Are PL/SQL LONG variables similar?
April 4, 2002 - 4pm Central time zone
Reviewer: Brent from Hartford, CT, USA
I was searching for documentation on the memory behavior with respect to LONG variables, and found
the same snippet at the top of this question for VARCHARs, but nothing about LONGs.
Do the memory allocation rules operate the same way for LONG variables, with the 2000-byte
threshold? Or if it is different for LONGs, what are the differences?
Thanks.
(I know Tom knows what I mean but just to clarify for the readers -- I am referring to the PL/SQL
LONG datatype which has a 32K maximum size, and not the database LONG column type which has a 2GB
limit).
Followup April 4, 2002 - 4pm Central time zone:
From standard.sql:
...
type VARCHAR2 is NEW CHAR_BASE;
pragma PACK(VARCHAR2);
subtype VARCHAR is VARCHAR2;
subtype STRING is VARCHAR2;
subtype LONG is VARCHAR2(32760);
subtype RAW is VARCHAR2;
subtype "LONG RAW" is RAW(32760);
subtype ROWID is VARCHAR2(256);
....
a plsql long is just a varchar2 in disguise. So, yes, they are just a varchar2 that is bigger then
2000.
Can preallocate varchar2(>2000)?
April 4, 2002 - 5pm Central time zone
Reviewer: benedict from australia
Thanks for the informative answer Tom.
Supposing I was building up a large varchar2 variable (declared >2000) characters by appending text
to itself... e.g.
declare
big varchar(30000);
begin
for i := 1..100 loop
big := big || ' something';
end loop;
end;
I understand from your answer that since it's been declared larger than 2000, pl/sql will
reallocate memory each time it gets longer. My question is: is it possible to force pl/sql to
allocate tons of room to begin with? And does it make a big enough difference for me to care?
Will pl/sql reallocate if the varchar2 gets smaller?
e.g. will
...
begin
big := rpad(null, 30000);
big := null;
...
do the trick?
Cheers
Followup April 4, 2002 - 7pm Central time zone:
Based on this test, I would say 'yes, it'll have an effect'.
But -- only if you do this really degenerate case like I did. I mean -- how many times do you make
that string grow a byte at a time??
Also -- setting to NULL sees to be "free" -- it shrinks it right back.
I ran this a couple of times:
ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_no_init
2 as
3 x varchar2(30000);
4 begin
5 for j in 1..30000
6 loop
7 x := rpad('x',j,'x');
8 end loop;
9 end ;
10 /
Procedure created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_init
2 as
3 x varchar2(30000) := rpad( 'x', 30000, 'x' );
4 begin
5 for j in 1..30000
6 loop
7 x := rpad('x',j,'x');
8 end loop;
9 end ;
10 /
Procedure created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create or replace procedure p_big_init_with_null
2 as
3 x varchar2(30000) := rpad( 'x', 30000, 'x' );
4 begin
5 x := null;
6 for j in 1..30000
7 loop
8 x := rpad('x',j,'x');
9 end loop;
10 end ;
11 /
Procedure created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> declare
2 l_start number default dbms_utility.get_time;
3 begin
4 for i in 1 .. 5 loop p_big_no_init; end loop;
5 dbms_output.put_line
6 ( 'Elapsed: ' ||
7 round( (dbms_utility.get_time-l_start)/100, 2 ) );
8
9 l_start := dbms_utility.get_time;
10 for i in 1 .. 5 loop p_big_init; end loop;
11 dbms_output.put_line
12 ( 'Elapsed: ' ||
13 round( (dbms_utility.get_time-l_start)/100, 2 ) );
14
15 l_start := dbms_utility.get_time;
16 for i in 1 .. 5 loop p_big_init_with_null; end loop;
17 dbms_output.put_line
18 ( 'Elapsed: ' ||
19 round( (dbms_utility.get_time-l_start)/100, 2 ) );
20 end;
21 /
Elapsed: 22.85
Elapsed: 22.06
Elapsed: 24.69
PL/SQL procedure successfully completed.
What about when constants are declared in PL/SQL?
April 10, 2002 - 11am Central time zone
Reviewer: jon from Hartford, CT USA
If you declare a constant in PL/SQL, would it be better to size it as VARCHAR2(2001), so that you
would get the actual size, instead of 2000 bytes allocated? Or is it smart enough to recognize
that constants can't be changed and allocate the correct amount of memory.
In other words, is
usps_code_connecticut constant varchar2(2001) := 'CT';
any different than
usps_code_connecticut constant varchar2(2) := 'CT';
?
Followup April 10, 2002 - 11am Central time zone:
regardless of whether it was constant or not -- you should use varchar2(2).
The varchar2(2001) will allocate storage for 2 characters when it is initialized.
The varchar2(2) will allocate storage for 2 characters when it is initialized.
Now, if you had coded:
usps_code_connecticut containt varchar2(1000) := 'CT';
then 2001 and might be better. But the real fix would be to declare it properly as a varchar2(2).
Constants.
January 21, 2003 - 3pm Central time zone
Reviewer: Kashif from Houston, TX
Hi Tom,
Is using the CONSTANT keyword a performance benefit? Any examples would be helpful. Also, can you
explain whether a declaration such as:
l_filename constant varchar2 (7) default to_char (l_start_date, 'MONYYYY');
is advisable/logical. The constant and default keywords are used in the same declaration. Thanks in
advance.
Kashif
Followup January 21, 2003 - 5pm Central time zone:
it is a compiler directive saying "don't let them modify it".
I've never seen any sort of performance benefit. It is still a plsql variable.
Assiging CHAR type to VARCHAR2
March 4, 2003 - 12pm Central time zone
Reviewer: CSR from India
Hi Tom,
Is there any problem if we assign CHAR type variable directly to a VARCHAR2 type variable.
I have table and SP like below.
chandu@chandra> desc t_clips_audit_info
Name Null? Type
--------------------------- -------- ---------------
AUDIT_SEQUENCE NOT NULL NUMBER(10)
CALL_TYPE CHAR(3)
CALL_RETURN_ERR CHAR(3)
ERROR_DATE_TIME DATE
REQUESTOR_ID VARCHAR2(20)
CALL_STRING VARCHAR2(250)
CALL_RETURN VARCHAR2(2000)
PROCEDURE sp(
ic_or_number IN VARCHAR2,
in_on_number IN NUMBER,
in_amt IN NUMBER,
oc_is_valid OUT CHAR,
oc_is_or_valid OUT CHAR)
As
vc_call_string t_clips_audit_info.call_string%TYPE := NULL;
vc_call_return t_clips_audit_info.call_return%TYPE := NULL;
begin
--------- many lines. here -------
-- #.1
vc_call_return := vc_call_return || ' ' ||'oc_is_obgr_obgt_valid: '||
oc_is_obgr_obgt_valid;
--
end ;
/
When I execute the above SP from SQL*> with a sql client(exec sp) it is giving me proper results.
But java developers are getting the error showing below.
error is :ORA-06502: PL/SQL: numeric or value errorCode:-6502
After debugging I found the exception at occurred the #.1.
If I use SUBSTR(oc_is_obgr_obgt_valid,0,1) in the code, I don't get any errors.
Could you please suggest us.
Followup March 4, 2003 - 6pm Central time zone:
The java developers are binding a string that is too small to hold the result. chars are blank
padded to the maximum length and the java guys bind variable isn't big enough I guess.
one would actually need a small, yet complete testcase to say exactly.
almost forgot to say - i hate char, don't use them, just use varchar2.
CHAR(1) vs VARCHAR2(1)
March 4, 2003 - 8pm Central time zone
Reviewer: Trevor
Tom...
I was told once that if you only need a single char
use CHAR(1) since using VARCHAR2(1) has overhead
due to the Oracle having to maintain how long
the value is in a VARCHAR2 field
Is this a load of rubbish or is there some truth
to it?
Followup March 5, 2003 - 7am Central time zone:
that is rubbish.
A char(n) is a varchar2(n) that is blank padded to it's maximum length. The implementation of a
char physically on disk is identical to a varchar2 -- there is a length byte for a char(1) just as
there is for a varchar2(1).
Binned
March 5, 2003 - 6pm Central time zone
Reviewer: Trevor
Thanks Tom
VARCHAR2(10) vs. VARCHAR2(2000) for a 10 char string
June 3, 2003 - 10am Central time zone
Reviewer: Tajana from Austria, Vienna
Tom,
clear about the difference between VARCHAR2 and CHAR.
But what about defining VARCHAR2(10) and, let's say VARCHAR2(2000) for a column that is expected
not to be "wider" than 10 characters ?
One advantage of defining such a column actually as what it is (VARCHAR2(10)) for me is, that I can
easily list a table containing many of such columns in SQL*Plus (Worksheet) without having to
scroll right, till my fingers get wounded.
Disadvantage: If my text gets wider, I'll have to redefine the column; means: I have to maintain
that.
Defining such a column as - let's say: 2000, would mean: No or less maintenance.
But much more interesting is:
Are there any
space
index
optimizer issues , or
any other important issues with this ?
Any other arguments for defining a 10 char column as VARCHAR2(10) and not VARCHAR2(2000) ?
Or: Any arguments why it is better to generally define VARCHAR2 columns to its maximum ?
Thanks in advance,
Tajana
Followup June 3, 2003 - 11am Central time zone:
well, the real issue is -- what is the business requirement. What the heck is in that field and
what is its length.
If you say "use a varchar2(2000)" then you know what will happen? someone will put 2000 characters
in there someday. Is that OK?
varchar2(10) says "this field accepts 10 bytes of data"
Since you can make a field larger easily with ALTER, maintenance isn't an excuse.
You define a field to be correct and proper. Period. This field is a varchar2(10).
think about the poor client tools that say "oh look - a varchar2(2000) and an array size of 100.
We need to allocate 200,000 bytes of storage (instead of 1000).
VARCHAR2(10) vs. VARCHAR2(2000) for a 10 char string
June 4, 2003 - 7am Central time zone
Reviewer: Tajana from Vienna, Austria
Tom,
in german you would say: You a running against my open doors.
Means: We are of the same opinion.
My intention was to prevent development colleagues to define hundreds of VARCHAR2(255) columns
(which they did, before I "jumped" in the project, and still trying to do).
So I needed arguments.
You gave me one - that with client tools, arraysize and allocation.
But before I discuss that with colelagues, I would like to know this a little bit more in detail:
Do you mean memory allocation by the client tool at the client ?
Can you explain that matter a little bit further, please ?
Any other considerations (optimizer, indexes ?)
Thanks a lot,
Tajana
Followup June 4, 2003 - 8am Central time zone:
there is only one reason necessary:
o it is the only sensible and professional thing to do.
but the technical reasons are:
o it is called data integrity. if you make it 255, you are saying that field legitimately contains
upto 255 bytes of data. If you make it 10, you are saying otherwise. If you make it 255, someone
WILL -- not might -- WILL put 255 bytes in there some day. Then what?
o say you have 20 255 varchar2 columns instead of 20 right sized ones. Pull that up in an ad hoc
tool. Have you done the end user a serious dis-service here? they'll spend the next 5 minutes
formatting data - data that should have been, could have been formatted already.
o say you move onto the next project -- leaving all of these 255 byte columns. Who is going to
tell the next guys "c1 is a 10 character field, c2 is 12, c3 is 44" and so on.
o when the client application, which wants to run fast, decides "I'll array fetch 100 rows at a
time", it'll have to allocate 100 * 255 * 20 bytes of data for the buffering of such data
(remember, no one KNOWS c1 is just 10 characters, it is 255 characters). That is as opposed to
allocating just what you really need.
I've got an idea for these guys -- don't use numbers or dates either. What if later on you want to
store "hello world" in that date column. (only kidding)

June 4, 2003 - 10am Central time zone
Reviewer: Dave from Colorado
In the same kidding spirit as "don't use numbers or dates either", you could use VarChar2(2000)
for all your columns, and then put a check constraint on the column "length(my_column) < 21". Ta
Da! Same as VarChar2(20)!!!
I like that one..
June 4, 2003 - 11am Central time zone
Reviewer: Jose Cleto from Santo Domingo, Dominican Republic
It reads like "American funnies Scripts"
Funnies scripts
June 5, 2003 - 8am Central time zone
Reviewer: Tajana from Vienna, Austria
Thanks, Tom.
============
This was what I needed.
Jose,
It reads like:
What funny ideas developers sometimes have, and what a DBA / data designer has not only to clean
after them, but also to convince them of on the first place ...
(Tenor: "Don't change this, my procedure won't work at all anymore !!! And, by the way, it is yet
implemented at the cust site !")
In this case: European funnies scripts, but I am sure, you have your own ones too ;-)
Dave,
I will evaluate your suggestion and present it to my dev colleagues ;-)
We will see, what they are gonna to say...
Cheers, Tajana
Sorry if I miss the point...
July 30, 2003 - 10am Central time zone
Reviewer: Jose Cleto from Dominican Republic
We all have fallen in this situation:
- "Let's do poor req. and anal. and then go with the code"
- "I know that is better, but this is faster"
- "We don't have time for that now"
An the list goes on...
The true, all the true and nothing but the true is that developers, DBA, and the like all we, are
payed to do our job the better we can.
Why pay like a pro if some one will work like a novice. We most work all the time to get the next
level. Computers is a "science", like math, chemestry, we have discipline CMM, OOP, Database
Design, Structure programming, UML, etc., all this exists is for guide us in doing our work, but
only our self could make our work.
Making the king of VARCHAR2(2000) to hold just 20 is silly. So mr, Tajana, Your job "if You decide
to accept it", is to convince your developers to be as near as possble to the reallity and make use
of all the things they are suppose to know.
BTW.: In my case is "Dominican Funniest scripts"
Char is returing vaules that are not padded?
September 11, 2003 - 10am Central time zone
Reviewer: Vivek
Tom, I created a table with a column of char(10). When I select a row with a where clause of the
"col" not padded with spaces it still returns the row. I would think the row should not be found
if I do not pad. Am I wrong?
SQL> create table x(col char(10));
Table created.
SQL> insert into x values('0');
1 row created.
SQL> insert into x values('1 ');
1 row created.
SQL> insert into x values('A');
1 row created.
SQL> select length(col),'x'||col||'x' from x;
LENGTH(COL) 'X'||COL||'X
----------- ------------
10 x0 x
10 x1 x
10 xA x
SQL>
SQL> DECLARE
2 x CHAR (14);
3 BEGIN
4 DBMS_OUTPUT.put_line ('0');
5
6 SELECT col
7 INTO x
8 FROM x
9 WHERE col = '0';
10
11 DBMS_OUTPUT.put_line ('1');
12
13 SELECT col
14 INTO x
15 FROM x
16 WHERE col = '1';
17
18 DBMS_OUTPUT.put_line ('A');
19
20 SELECT col
21 INTO x
22 FROM x
23 WHERE col = 'A';
24 DBMS_OUTPUT.put_line ('NO Errors');
25 EXCEPTION
26 WHEN NO_DATA_FOUND THEN
27 DBMS_OUTPUT.put_line ('no_data_found');
28 RAISE;
29 WHEN OTHERS THEN
30 DBMS_OUTPUT.put_line ('others');
31 RAISE;
32 END;
33 /
0
1
A
NO Errors
PL/SQL procedure successfully completed.
SQL> spool off
thanks
Followup September 11, 2003 - 6pm Central time zone:
character string literals like that are coerced to the proper datatype automagically
'A' is really 'A '
it is the ansi defined behaviour.
number vs date in datawarehouse
November 3, 2003 - 4am Central time zone
Reviewer: A reader
Hi
I have seen many Datamarts/Datawarehouses using number for date columns. For example instead of
having
2003-AUG-01 23:00:00
they have
20030801230000 in number
Is this good practice? I find it so hassle because when I have to calculate aggreates with time
functions I always have to convert that to date first!
I have never seen this in OLTP but many times in DWH, is there some directive to do so or what?
Followup November 3, 2003 - 7am Central time zone:
In fact, I write about this very topic in my book "Effective Oracle By Design" where I state very
loudly
USE NUMBERS FOR NUMBERS
USE DATES FOR DATES
don't use anything else, use the correct proper datatypes.
Not only do you spend 1/2 your time converting that number to a date -- but I'll assure you that
one day, you'll get a date in there that IS NOT A VALID DATE and then what?
Not only that but tell me, how many values are there between
20031231115959
and
20040101000000
well, the optimizer thinks there could be 8869884041 values -- but we all know there are no values
between those two dates!!! You can totally mess up the optimizer with things like "between" and
"<", ">" doing this silly thing of putting dates into numbers or strings.....
can you tell ? i'm dead set against this "common practice"

November 3, 2003 - 11am Central time zone
Reviewer: Dave from Colorado
"Date folding" is related good reason, as if another one were needed, for using date types for
dates, especially in a DW environment ...
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96520/qr.htm#50228

November 3, 2003 - 12pm Central time zone
Reviewer: Dave from Colorado
By the way, there seems to be some very poor advice in that documentation, in the "Note:" section.
example from your new book
November 14, 2003 - 4am Central time zone
Reviewer: A reader
Hi
you have an example showing using proper datatypes in chapter about effective schema design
I did a small test using my own script and saw something not predicted, opposite from your example
create table test_date
(
varchar_fecha varchar2(9),
number_fecha number,
date_fecha date
)
pctfree 70
pctused 10;
insert /*+ APPEND */ into test_date
select to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ),
to_number( to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ) ),
to_date( '19980101', 'YYYYMMDD' ) + rownum
from dba_source
where rownum <= trunc(sysdate) + 36500 - to_date( '19940101', 'YYYYMMDD' );
create index test_date_idx1 on test_date(varchar_fecha);
create index test_date_idx2 on test_date(number_fecha);
create index test_date_idx3 on test_date(date_fecha);
analyze table test_date compute statistics for all indexes for table for all columns;
set autotrace trace exp stat
select * from test_date
2 where number_fecha between 20011231 and 20050101;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1318 Bytes=26360)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=6 Card=1318 Bytes=26360)
2 1 INDEX (RANGE SCAN) OF 'TEST_DATE_IDX2' (NON-UNIQUE) (Cost=5 Card=1318)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
166 consistent gets
0 physical reads
0 redo size
50919 bytes sent via SQL*Net to client
8528 bytes received via SQL*Net from client
75 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1098 rows processed
select * from test_date
2 where date_fecha between to_date('20011231', 'YYYYMMDD') and to_date('20050101', 'YYYYMMDD');
1098 filas seleccionadas.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=46 Card=1097 Bytes=21940)
1 0 TABLE ACCESS (FULL) OF 'TEST_DATE' (Cost=46 Card=1097 Bytes=21940)
Statistics
----------------------------------------------------------
0 recursive calls
15 db block gets
545 consistent gets
7 physical reads
0 redo size
50919 bytes sent via SQL*Net to client
8528 bytes received via SQL*Net from client
75 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1098 rows processed
as you can see the proper datatype in this case date is working worse than number datatype...
interesting no?
Followup November 14, 2003 - 8am Central time zone:
My point was the information the optimizer sees
look Card=1097 when using dates -- and reality says 1098 rows.
Card=1318 when using numbers -- and reality says 1098 rows.
using the number gave the optimizer bad information here. That said (i was demonstrating something
wholly different there - that the optimizer is getting bad bad data) -- I do not observe the same
thing as you:
ops$tkyte@ORA920PC> create table test_date
2 (
3 varchar_fecha varchar2(9),
4 number_fecha number,
5 date_fecha date
6 )
7 pctfree 70
8 pctused 10;
Table created.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> insert /*+ APPEND */ into test_date
2 select to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ),
3 to_number( to_char( to_date( '19980101', 'YYYYMMDD' ) + rownum, 'YYYYMMDD' ) ),
4 to_date( '19980101', 'YYYYMMDD' ) + rownum
5 from dba_source
6 where rownum <= trunc(sysdate) + 36500 - to_date( '19940101', 'YYYYMMDD' );
40104 rows created.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx1 on test_date(varchar_fecha);
Index created.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx2 on test_date(number_fecha);
Index created.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create index test_date_idx3 on test_date(date_fecha);
Index created.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> analyze table test_date compute statistics for all indexes for table for all
2 columns;
Table analyzed.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> set autotrace traceonly
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> select * from test_date where number_fecha between 20011231 and 20050101;
1098 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=1318 Bytes=26360)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=21 Card=1318 Bytes=26360)
2 1 INDEX (RANGE SCAN) OF 'TEST_DATE_IDX2' (NON-UNIQUE) (Cost=5 Card=1318)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
163 consistent gets
14 physical reads
0 redo size
36016 bytes sent via SQL*Net to client
1302 bytes received via SQL*Net from client
75 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1098 rows processed
ops$tkyte@ORA920PC> select * from test_date where date_fecha between to_date('20011231',
'YYYYMMDD') and to_date('20050101', 'YYYYMMDD');
1098 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=1096 Bytes=21920)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST_DATE' (Cost=17 Card=1096 Bytes=21920)
2 1 INDEX (RANGE SCAN) OF 'TEST_DATE_IDX3' (NON-UNIQUE) (Cost=4 Card=1096)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
163 consistent gets
0 physical reads
0 redo size
36016 bytes sent via SQL*Net to client
1302 bytes received via SQL*Net from client
75 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1098 rows processed
ops$tkyte@ORA920PC> set autotrace off
ops$tkyte@ORA920PC>
It seems unusual to me that the query that selects less rows (is estimated to select less rows)
would use a FULL SCAN (given that both indexes by definition would have the same cluster factors).
You must have lots of non-standard optimizer init.ora settings going on there.
so, use the 10053 trace event (in chapter 6) and show us your optimizer settings. also, a version
is ALWAYS relevant.
For Dave from Colorado...
November 14, 2003 - 8am Central time zone
Reviewer: Jan M. van Mourik from Houston, TX
You thought you were kidding, Dave?
How about this:
CREATE TABLE BFX_EMPGROUP (
EMPLOYEE_GROUPID VARCHAR2(150) NOT NULL,
NAME VARCHAR2(150) NOT NULL,
START_DATE DATE,
END_DATE DATE);
ALTER TABLE BFX_EMPGROUP ADD (
CHECK (LENGTH(EMPLOYEE_GROUPID)<=50),
CHECK (LENGTH(NAME)<=50));
From PeopleSoft HR...

November 25, 2003 - 10am Central time zone
Reviewer: A reader
is there any reason to use CHAR
December 1, 2003 - 11am Central time zone
Reviewer: A reader
Tom, is there any situation in which it is better to use a CHAR datatype instead of VARCHAR2? I am
trying to convince developers to stop using it but they seem to think that if the number of
characters in a string column will always be the same, like in state abbreviations, then it is
better to use CHAR. Is this a valid argument?
Followup December 1, 2003 - 5pm Central time zone:
nope.
ask them for the proof or at least the reason why they think "better"
please disregard previous post
December 1, 2003 - 4pm Central time zone
Reviewer: A reader
I found the answer in the following posting-
http://asktom.oracle.com/pls/ask/f?p=4950:8:856236758633628034:::::
correction to link above
December 1, 2003 - 4pm Central time zone
Reviewer: A reader
sorry for this, the link above points back to this page. The link where I found the answer was
here:
http://asktom.oracle.com/pls/ask/f?p=4950:8:856236758633628034::NO::F4950_P8_DISPLAYID,F4950_P8_CRIT
ERIA:123212348063,
GR8
December 26, 2003 - 4am Central time zone
Reviewer: Bipin Ganar from INDIA
Hi Tom,
Thanx for the Difference.
I got the difference for char and varchar2 But Still not clear about Char and varchar.
So, can you highlight more on difference between Char and varchar
Followup December 26, 2003 - 10am Central time zone:
there are none. varchar2 and varchar are the same right now.
Converting many CHAR columns to VARCHAR columns
February 17, 2004 - 10am Central time zone
Reviewer: Sha from USA
Hi Tom,
I need to convert many char columns to varchar columns.
We have about 150 tables with appox 2 columns of type char and 500,000 rows each. I need to convert
all columns of datatype char to varchar2.
Please suggest the best way to do this.
Regards,
Sha
Followup February 17, 2004 - 10am Central time zone:
you'll have to
a) alter the tables to change the types
b) update each of the tables setting column = rtrim(column) to remove the trailing blanks.
or, rebuild them all.
Table update slow
February 18, 2004 - 11pm Central time zone
Reviewer: Sha from USA
Hi Tom,
In our schema we have constraints,Indexes and Master Details relationships. I need to update the
CHAR columns to VARCHAR datatype and later RTrim the new VARCHAR.
e.g
Lets say TMaster has 3 columns A, B and X, say X is of datatype CHAR(10) and TMASTER has indexs
multiple indexs on columns including index on X.
TMaster( A,B,X char(10) )
TDETAIL has 3 columns P,Q and Y, say Y is on datatype CHAR(10) and TDETAIL has indexs multiple
indexs on columns including index on Y.
X is primary key in TMASTER and Y is foreign key in TDETAIL.
TDETAIL (P,Q, Y char(10))
TMASTER has 50,000 records while TDETAIL has 1 Million records.
I did the follwing
1) Drop constraints on both tables
2) Removed Indexs on X and Y columns only all other Indexes for TMASTER and TDETAIL are inplace.
3)Alter tables TMaster and than TDetail.
4) Update X = Rtrim(X) on TMASTER
5) Update Y = Rtrim(Y) on TDETAIL
To update column Y it took 8 hours which has 1 Millions rows.
1) Is this an efficient way to do this job because I have around 150 tables with similar change.
2) How do I reduce the time to update the column atleast in couple of minutes and not 8 hours.
3) I guess the oracle should be fast ( couple of minutes ) to update 1 million columns with Rtrim,
Is there something holding my table to get updated, how should I figure that out.
4) Did I miss out anything or should I do things in a particular order, how do I know what is
making my table update slow.
I am new to oracle.
Thanks in advance,
Sha
Followup February 19, 2004 - 10am Central time zone:
you did something wrong -- not sure what (i would guess you didn't really get the indexes off).
to update 1,000,000 rows (unindexed) is a couple of minutes.
big_table@ORA920PC> create table t
2 as
3 select cast( object_name as char(30) ) x, big_table.*
4 from big_table
5 where rownum <= 1000000
6 /
Table created.
big_table@ORA920PC> set timing on
big_table@ORA920PC> alter table t modify x varchar2(30);
Table altered.
Elapsed: 00:00:00.32
big_table@ORA920PC> select min(length(x)) from t;
MIN(LENGTH(X))
--------------
30
Elapsed: 00:00:02.00
big_table@ORA920PC> update t set x = rtrim(x);
1000000 rows updated.
Elapsed: 00:01:26.51
big_table@ORA920PC> select min(length(x)) from t;
MIN(LENGTH(X))
--------------
1
Elapsed: 00:00:03.64
big_table@ORA920PC>
So, don't know what you did wrong (logs big enough? any checkpoint not complete messages, are you
sure the indexes were "gone").
Why char as parameter not restricting?
March 19, 2004 - 4pm Central time zone
Reviewer: dharma from CA USA
I have been debugging some procedures from a Oracle 7.3 system which is in 9i rel 2 now where I
found one of your BIG NO NO. it had a "when others then null" and the procedure has been there for
4 years.
What I found to be causing the error looks strange.
The defn of CHAR without any length is defaulted CHAR(1)
But If I have a parameter in the procedure as CHAR how is it able to get the values larger than 1
byte?
DECLARE
PROCEDURE p(p_a IN CHAR) IS
b CHAR;
BEGIN
Dbms_output.put_line('Before assign'||p_a);
b:=p_a; --raises error if assigned;
Dbms_output.put_line('After assign'||b);
END p;
BEGIN
p('12');
END;
if the line b:=a is commented it works fine. should'nt it be throwing an error for assigning
parameter p_a a value of length greater than 1.
Am I missing something.
Thanks,
-dharma
Followup March 20, 2004 - 9am Central time zone:
the parameters size is defined at runtime as specified by the INPUT parameter.
if you called
declare
x char(10);
y char(20);
begin
p( x ) ;
p( y ) ;
end;
the size of "p_a" would be 10 and then 20. sooo, look back in the code to the invokers of this
routine
(and stop using the dreaded CHAR type !!! hate it)
Char(1) vs Varchar(1) Revisited
May 12, 2004 - 5pm Central time zone
Reviewer: Jeff Westman from Walnut Creek, CA, USA
So, Tom ... Am I understanding this correctly?
Situation #1
CHAR(1) = null, VARCHAR(1) = null
The char takes up 3 bytes (length id + space), and the varchar takes up 2 bytes (length id only)
Situation #2
CHAR(1) = 'X', VARCHAR(1) = 'X'
The char takes up 3 bytes still (length id + 'X'), and the varchar takes up 3 bytes (length id +
'X')
So [if I understand this right] it seems best to always use a VARCHAR2 since you will use a minimum
of 2 bytes either way.
When would you want to use a CHAR then???!
Thanks in advance,
Jeff
Followup May 13, 2004 - 9am Central time zone:
situation #1: both take zero or one byte -- the NULL flag byte. (zero bytes if they are a trailing
null column in the row)
situation #2: both take three bytes -- the NULL flag byte, one byte for the length, one byte for
the data.
I would never use char, in fact, I never do.
What if CHAR(1) is a NOT NULL column
June 30, 2004 - 3am Central time zone
Reviewer: Sea from Australia
I have a not null column which is a flag. The value is either 'Y' or 'N'. There will never be any
null values. In this case, does it mean it does not really matter whether to use CHAR or VARCHAR?
Followup June 30, 2004 - 10am Central time zone:
no, they would be 100% the same -- physically on disk, and query wise.
soooo, no good reason to change the rule, i would still use varchar2.
number vs date in data warehouse
June 30, 2004 - 1pm Central time zone
Reviewer: CH from WA
Tom,
Use sequantial numbers for date_key instead of keys like 20040630 is better? That means you have to
always join date dimention table whenever there is a constraint on date.
Thank you very much for your always eye opening replys.
Followup June 30, 2004 - 2pm Central time zone:
(or just use a date?)
Number vs Date
June 30, 2004 - 6pm Central time zone
Reviewer: Wayne
Tom,
Using DATE is much better than using number keys if the field is not null and always have good
values (e.g. trans_date, populated with sysdate at insert time). But if otherwise, dates can have
bad values (and not desirable to be displayed as is in DW), we can turn it into null, but then in
many queries, we have to ask the users to apply the outerjoins (to a TIME_DIM dimension table) even
if the we don't have original values.
I saw some people put a default date there, such as '4-JUL-1776' or '31-DEC-9999', but calculation
could be a problem if people run a avg(sysdate -birthday), we might get wrong results.
Any suggestions?
Thanks,
Followup June 30, 2004 - 8pm Central time zone:
if the field is a date and you put in a number -- what good is that? you cannot do any analysis?
Please DO NOT post irrelevant question
June 30, 2004 - 8pm Central time zone
Reviewer: A Reader
The topic is "Char Vs Varchar".
Please DO NOT post questions that are irrelevant to the topic.
Thanks!
Date Key
June 30, 2004 - 9pm Central time zone
Reviewer: Wayne
Tom,
Some "date" source fields are not stored as date, but rather as a varchar2 in 'mm/dd/yyyy' format.
I know it's bad, and bad data can slip in like '2/29/2003', or '6/30/2037'.
1. We don't want to store them "as is" in DW, it's humilating (when we find somebody is 900 years
old) and difficult to use (convert to dates and might fail);
2. So we store them as dates and set the field to null if they are bad or unrealistic dates;
3. Users want to know for those null fields, which are originally null (we didn't collect the info)
and which are set to null (meaning bad value).
4. So we have to put date_key (number) there and put 0 as unavailable, -1 as bad data.
I know we can let users go to a staging table where these data are stored and before they are used
to update the real DW table, but sometimes users just don't want to and they set the rules.
Your insights?
Thanks,
Wayne
Followup July 1, 2004 - 10am Central time zone:
i would have stored a date field AND the source data when bad
that is, convert the date and if the conversion fails, store the string. so a null date with a
null bad field is truly missing, a null date with a non-null bad field is bad data and they can see
what the bad data was exactly.
import of char values into varchar2 fields
July 5, 2004 - 9am Central time zone
Reviewer: Dorothea from Germany
Dear Tom,
I have the following problem: I have to make a full import from a database with char fields into a
database with only varchar2 fields. The problem is of course, that all the trailing blanks are as
well imported. How is it possible to get rid of the trailing blanks (for quite a big database)?
thanks a lot in advance!
Followup July 5, 2004 - 10am Central time zone:
you could put triggers on the tables prior to the import to trim them, else you have to update
them.
we put in exactly what we are told to put in, nothing more and in this case, nothing less.
thanx
July 6, 2004 - 5am Central time zone
Reviewer: Dorothea from Germany
This was a really exact answer! Adn it helped. Thanx.
Dorothea
Varchar vs Varchar2
October 19, 2004 - 1pm Central time zone
Reviewer: John from USA
Tom,
What is the difference between Varchar and Varchar2 datatypes? I read somewhere that, we should
always use Varchar2 in preference to Varchar.
Pl let me know.
Followup October 19, 2004 - 1pm Central time zone:
they are and have been the same -- but varchar2 will remain the same, varchar is subject to change
(if ansi changes how it should behave)
Help on Datatype!!!!!!!
October 28, 2004 - 10am Central time zone
Reviewer: A reader
Hi Tom,
The other day one of the developers asked me to help him create a table for reports. This table
will be used for recording SQL Queries. So I created a table and one of the columns is Repsql
varchar2(2000 char).
My question is from the performance point of view and space which datatype should I use from the
following:
varchar2(2000 char);
char(2000 varchar2);
This datatype I was not allowed to create:
varchar2(2000 varchar2);
It generates the error: ORA-00905: missing keyword
Also a point to note is that the queries can be huge. I could not explain to my developer neither
could I remember the link on your site which explains the difference between
varchar2(2000) and varchar2(2000 char).
Another thing I did a desc v$sql and saw the field SQL_TEXT to be just varchar2(1000). So is this
size sufficient for the above table.
Oracle 91R2 on Win 2k Advances Server.
Thanks in advance as always.
Followup October 28, 2004 - 1pm Central time zone:
varchar2 is the only type i use for strings
it would be varchar2( 2000 char ) (for 2000 CHARACTERS)
2000 char lets you hold 2000 characters -- in a mutli-byte character set, that might be 4000 bytes
(2 bytes/character) for example
2000 byte lets you hold 2000 bytes of data, which might be as little as 2000/6 characters.
The max size of varchar2 in table and proceudre.
November 12, 2004 - 11am Central time zone
Reviewer: Sean from NJ, USA
Hi Tom,
The max size of varchar2 data type is 4000 bytes, but in the stored procedure I can declare
varchar2 variable to 32767. But in chapter 3 of Application developer's guide, it only states that
max size varchar2 is 4000. Why are the max sizes different?
Thanks so much for your help.
Oracle 9204
Sean
Followup November 12, 2004 - 1pm Central time zone:
the max size of a varchar2 column ON DISK is 4000 bytes.
the max size of a varchar2 column IN MEMORY is 32k.
they just are different, not sure there is a "reason", other than in plsql you need more than 4000
characters for variables sometimes. On disk you have clobs.
clob in plsql
November 12, 2004 - 1pm Central time zone
Reviewer: Sean from NJ, USA
Hi Tom,
We can use clobs in plsql too if it is larger than 32K.
Thanks so much for your help.
Followup November 12, 2004 - 3pm Central time zone:
in 9i yes, you can.
Shouldn't this..
November 12, 2004 - 3pm Central time zone
Reviewer: A reader
"the varchar is physically 1999 bytes but logically can be any size in between 0 and 1999."
be..
"the varchar is logically 1999 bytes but physically can be any size in between 0 and 1999."
Followup November 12, 2004 - 3pm Central time zone:
nope, meant it the first way when talking about plsql (only plsql).
a varchar2 less than 2000 characters in plsql is pre-allocated physically to the maximum length.
varchar2(50) -- is 50 bytes
varchar2(50) := 'x'
is logically 1 byte in my character set, physically however takes 50 bytes of RAM.
veriifcation
November 12, 2004 - 6pm Central time zone
Reviewer: Menon
"varchar2(50) := 'x'
is logically 1 byte in my character set, physically however takes 50 bytes of
RAM.
"
Is there a way to verify this?
Thanx.
Followup November 12, 2004 - 8pm Central time zone:
did you read the original answer? that was the crux of the question in the first place.
sorry
November 13, 2004 - 10am Central time zone
Reviewer: A reader
did not read the original answer..
however, I was looking for a more "direct" proof. I guess
there is none - you can only see the effect indirectly
by measuring elapsed times in the degenerate case.
Anyway this was more of an "academic" question.
Followup November 13, 2004 - 11am Central time zone:
fair enough, here is the "direct" proof:
ops$tkyte@ORA9IR2> create or replace
2 procedure create_procedure( p_name in varchar2,
3 p_num_vars in number,
4 p_size_vars in number )
5 as
6 l_stmt long := 'create or replace procedure ' || p_name || ' as ';
7 begin
8
9 for i in 1 .. p_num_vars
10 loop
11 l_stmt := l_stmt || chr(10) || ' v' || i || ' varchar2('||p_size_vars||') := ''x'';';
12 end loop;
13
14 l_stmt := l_stmt || chr(10) || 'begin null; end;';
15
16 execute immediate l_stmt;
17 end;
18 /
Procedure created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec create_procedure( 'p1', 1000, 1999 );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> exec create_procedure( 'p2', 1000, 2001 );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec p1
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> exec p2
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME VALUE
------------------------------ ----------
session pga memory max 286044
ops$tkyte@ORA9IR2> exec p1
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> @mystat2
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------
session pga memory max 2314240 2028196
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME VALUE
------------------------------ ----------
session pga memory max 286044
ops$tkyte@ORA9IR2> exec p2
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> @mystat2
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------
session pga memory max 364120 78076
ops$tkyte@ORA9IR2>
excellent!
November 13, 2004 - 11am Central time zone
Reviewer: Menon
completely forgot about the pga memory stats:) Thanx Tom.

January 11, 2005 - 7am Central time zone
Reviewer: A reader
Difference
January 12, 2005 - 5am Central time zone
Reviewer: Reader from India
what is difference between varchar,and varchar2 datatype in oracle
Followup January 12, 2005 - 9am Central time zone:
nothing right now.
PL/SQL Variable Sizes
March 29, 2005 - 6pm Central time zone
Reviewer: Andrew from Wellington, NZ
My team is tired of seeing value errors caused by PL/SQL developers not allocating a big enough
string for a message (ORA-6502 numeric or value error: character string buffer too small), so
someone has suggested enforcing all VARCHAR2s in PL/SQL to be 2000+. This way there is never a
downside to memory usage, only a small downside to performance due to re-allocating modified
strings.
Assume for a second that none of these varchar2 variables are used to store/retrieve information
from the database (assume we use plsql records for that).
Is there any downside to enforcing all VARCHAR2 variables to be 2000+? The only one could be: what
about PL-SQL collections? We use index-by tables and associative arrays (but not varrays) to pass
around information. But because these collections cannot be pre-allocated sizes, I am guessing
that PL/SQL does the right thing and doesn't waste any space (in memory) when extending a
collection, i.e.
TYPE typet IS TABLE OF VARCHAR2(2000);
t typet := typet('x','y','z');
Does this take up 3 bytes or 6000 bytes?
Followup March 29, 2005 - 8pm Central time zone:
there is a downside in that you no longer verify the string is the right size either. That doesn't
seem like the correct fix to me.
Use of %TYPE, %ROWTYPE maybe.
Looks "safe" enough memory wise, but doesn't sound like a sound programming concept (well, it isn't
frankly).
ops$tkyte@ORA9IR2> create or replace package demo_pkg
2 as
3 type array is table of varchar2(2000);
4 g_data array := array();
5 end;
6 /
Package created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> disconnect
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
ops$tkyte@ORA9IR2> @connect /
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set termout on
ops$tkyte@ORA9IR2> @mystat "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME VALUE
------------------------------ ----------
session pga memory max 286044
ops$tkyte@ORA9IR2> begin
2 dbms_output.put_line( demo_pkg.g_data.count );
3 end;
4 /
0
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------------
session pga memory max 360092 74,048
ops$tkyte@ORA9IR2> begin
2 demo_pkg.g_data.extend(500);
3 end;
4 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------------
session pga memory max 360092 74,048
ops$tkyte@ORA9IR2> begin
2 for i in 1 .. 50
3 loop
4 demo_pkg.g_data(i) := 'x';
5 end loop;
6 end;
7 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------------
session pga memory max 360092 74,048
ops$tkyte@ORA9IR2> begin
2 for i in 1 .. 500
3 loop
4 demo_pkg.g_data(i) := 'x';
5 end loop;
6 end;
7 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------------
session pga memory max 360092 74,048
ops$tkyte@ORA9IR2> begin
2 for i in 1 .. 500
3 loop
4 demo_pkg.g_data(i) := rpad('x',200,'x');
5 end loop;
6 end;
7 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------------
session pga memory max 491164 205,120
ops$tkyte@ORA9IR2> begin
2 for i in 1 .. 500
3 loop
4 demo_pkg.g_data(i) := rpad('x',2000,'x');
5 end loop;
6 end;
7 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> @mystat2 "session pga memory max"
ops$tkyte@ORA9IR2> set echo off
NAME V DIFF
------------------------------ ---------- ----------------
session pga memory max 1867420 1,581,376
string literal padded with trailing blanks
April 15, 2005 - 3pm Central time zone
Reviewer: A reader
Hi Tom,
Our problem is related to one of your comments earlier -
character string literals like that are coerced to the proper datatype automagically
'A' is really 'A '
it is the ansi defined behaviour
We have a select like this in our jsp (java server page):
select '250' as TYPE_ID, ... from ...
At one client site, this is returning '250 ' in a java String variable. According to the
developers, it always returned '250' (without the trailing blanks). I noted that this particular
client uses utf8 as their db characterset, whereas all others use us7ascii. Is this behaviour of
padded trailing blanks because of utf8 characterset? Is there anything we could do on the jdbc side
that will globally suppress the trailing blanks from strings rather than we having to change our
java code all over to do a trim?
Followup April 15, 2005 - 3pm Central time zone:
other thought first -- what is cursor_sharing set to on this customer site?
cursor_sharing?
April 15, 2005 - 5pm Central time zone
Reviewer: A reader
Sorry for the delay, Tom but the client is out of U.S and I don't have sysdba access to their db. I
guess cursor_sharing=similar of force could be causing this? I will ask them to set it to exact and
see. Thanks much.
Followup April 15, 2005 - 6pm Central time zone:
ops$tkyte@ORA9IR2> alter session set cursor_sharing=exact;
Session altered.
ops$tkyte@ORA9IR2> select '123' from dual d1;
'12
---
123
ops$tkyte@ORA9IR2> alter session set cursor_sharing=force;
Session altered.
ops$tkyte@ORA9IR2> select '123' from dual d2;
'123'
--------------------------------
123
ops$tkyte@ORA9IR2>
see the column widths? with cursor sharing force/simlar the second query becomes select :sys_bv0
from dual and you know longer know "3" is the answer
and "show parameter" does not need sysdba
in fact, just run select '123' from dual and see what you see in sqlplus!
Difference in performance?
May 9, 2005 - 10am Central time zone
Reviewer: A reader
I just stepped into a shop where the developers have been strictly instructed to use char rather
then varchar2 because of "performance benefits". Any truth to this?
PS - No metrics, no test cases...surprised?
Thanks in advance!
Followup May 9, 2005 - 10am Central time zone:
yes, char will make you run slower, consume more ram and make the pockets of your hardware vendor
better padded.
Other than that, they are great.
recommendation -- never use char()
Documentation to this fact...
May 18, 2005 - 11am Central time zone
Reviewer: A reader
Tom,
Does Oracle have any documentation stating that varchar2 is prefered to char and why? A colleague
is convinced that using varchar2 instead of char columns will degrade the performance of the
database. He says that using char columns will write everything in nice, neat groups and prevent
fragmentation. Being from Missouri, I need to show him to convice him otherwise.
Thanks!
Followup May 18, 2005 - 11am Central time zone:
<quote>
Does Oracle have any documentation stating that varchar2 is prefered to char and
why?
</quote>
no, it is your choice and my voice on it is "USE VARCHAR2 and FORGET CHAR even exists"
a char is simply a fixed with varchar2 -- with blanks at the end.
a char(80) will take bytes for the length (like a varchar2) and consume 80 more bytes storage on
disk.
it will make your indexes as big as they can be.
it will make your tables as large as they can.
I will never use them.
Performence between data types
July 11, 2005 - 9am Central time zone
Reviewer: Sanjaya Balasuriya from Colombo, Sri Lanka
Hi Tom,
1.
When used for storing text (characters and numbers) data, is there any performance wise difference
between number, char and varchar2 ?
2.
How can we compare number, char and varchar2 when it comes to efficiency of storage management ?
Thanks in advance.
Followup July 11, 2005 - 11am Central time zone:
1) well, the only thing that makes SENSE To store text is "varchar2"
the only thing that makes SENSE to store a number is, well, a number
same with a date. The question doesn't even make sense to me -- if you have numbers, use a NUMBER
type.
2) you don't, you won't, you use the right type for your needs, period, plain and simple.
Case for CHAR
July 27, 2005 - 7pm Central time zone
Reviewer: Kevin Kirkpatrick from St. Louis, MO
I have a friend (wow, is that cliche or what) who works in a shop where the primary context in
which data will be extracted will be into fixed-width flat files processed by legacy COBOL
programs. Thus, the majority of SQL written against the database will pull fixed-width fields.
Given that space is not an issue, that single-order-magnitute-or-less performance degradation is
not an issue, might this be a case where even Tom Kyte would consider using CHAR's in place of
VARCHAR2's?
Consider:
SELECT FIELD1, FIELD2
FROM TEMP
vs.
SELECT RPAD(FIELD1,10,' '), RPAD(FIELD2,15,' ')
FROM TEMP
Thanks!
Followup July 27, 2005 - 8pm Central time zone:
Nope, because eventually someone will ask "what is the length of 'bob'" and getting 20 back is not
what they expect. Or they use a varchar in their program OR they just any one of the thousands of
ad-hoc tools and it uses varchars.
Nope, still would not.
CHAR x VARCHAR2 usage
January 5, 2006 - 3pm Central time zone
Reviewer: Leonardo Pinheiro from Brasil
Hi there,
thinking about CHAR and VARCHAR2 datatypes I figure out one cenario that might indicate CHAR
datatype instead of VARCHAR. I would like to confirm it with Tom or any other DBA who have already
tested it using block dumps.
If we have a column that is intended to get updated a log and those updates will change data
lenght, we might have a lot more I/O using VARCHAR2 due to the block reallocation possibility
inside the datafile. For instance:
create table test (A varchar2(2000))
insert into test values ('a') (lets say 1000 of there inserts)
update test set A='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' we would have to move all the blocks forward in
the datafile to fit the new data for each row. Is that performance issue considerable or is it just
rubish?
thanks!
Followup January 5, 2006 - 4pm Central time zone:
you would not move all of the blocks "forward"...
When you update a row - and make it grow in size such that is no longer fits on a block - Oracle
will MIGRATE the row.
That is, the original block will just have a "forwarding address"- a rowid pointer to where the row
is right now.
So, no, it won't cause the entire "end of the table to move" to make room - it will however incurr
an extra IO on retrieval via an index (the index will point to where the row started on, we goto
that block to discover the row has "moved" or migrated and go to that block to get it).
This is why you would consider setting pctfree. To reserve room on the block for future updates
and to avoid the row migration.

January 6, 2006 - 4am Central time zone
Reviewer: Raj from IN
Tom,
in case of above reply can I have my tablespace segment space management as auto and not worry
about setting pctfree, pctused etch
Followup January 6, 2006 - 1pm Central time zone:
which above reply, there are lots of them.
ASSM (automatic segment space management) does not help in the setting of pctfree, you still need
to do that.
Blank Padds in Char Data Type
January 26, 2006 - 10pm Central time zone
Reviewer: Sanjai kumaar NM from THAILAND
I have two table with following descriptions.
Table TEST1 (X CHAR(10)) AND Table TEST2 (Y CARCHAR2(10))
Both the tables has some similar values. When I do a select statement like
SELECT X,Y FROM TEST1,TEST2 WHERE TEST1.X=TEST2.Y
This does not return any rows. Whereas if I issue a select statement seperately on individual
tables like
SELECT X FROM TEST1 WHERE X='ABC12' AND
SELECT X FROM TEST2 WHERE Y='ABC12'
These select statements returns rows.
What makes the difference.
Thanks In Advance.
Followup January 27, 2006 - 8am Central time zone:
select dump(x) from test1 where x = 'ABC12';
select dump(y) from test2 where y = 'ABC12'
what does that produce (cut and PASTE)
Documentation on CHAR deprecation
April 21, 2006 - 8am Central time zone
Reviewer: Nice
Hello Tom,
Will CHAR data type would be deprecated? Is there any official announcement on this?
The first reference to the de-support of CHAR seems to have surfaced on metalink in December 2001
CHAR is part of the ANSI standard, so it seems rather unlikely to go.
Followup April 21, 2006 - 8am Central time zone:
I would doubt it would be.
Where do you see a reference to the desupport of this type?
VARCHAR2(<2000) Vs VARCHAR2(>=2000)
June 22, 2006 - 1pm Central time zone
Reviewer: VKOUL from WA USA
Hi Tom,
"In PLSQL, what they are saying is that for optimization purposes -- a varchar2
less then 2000 bytes will allocate a fixed amount of memory. It will
preallocate the maximum size upon instantiation. This cuts down on the
reallocations as different sized strings are assigned to it. On the other hand,
a varchar2 greater then 2000 characters will truely be varying sized and as you
assign different sized strings to it it will grow dynamically as needed.
One takes a little more ram (if you never use the entire string) and performs a
little better, the other uses only what ram it needs but if you cause the string
to grow frequently will not perform as well due to the reallocs to resize the ..."
Does this behaviour still hold in Oracle 9i and 10g ?
Thanks.
Strange thing happening
October 2, 2006 - 6am Central time zone
Reviewer: abz
Oracle 9.2.0.5
SQL> DESC CI_BILL;
Name Null? Type
----------------------------------------- -------- --------
BILL_ID NOT NULL CHAR(12)
BILL_CYC_CD NOT NULL CHAR(4)
WIN_START_DT DATE
ACCT_ID NOT NULL CHAR(10)
BILL_STAT_FLG NOT NULL CHAR(2)
SQL> L
1 select COUNT(bill_stat_flg), SUM(DECODE(bill_stat_flg, 'C', 1, 0))
2* from ci_bill where bill_stat_flg='C'
SQL> /
COUNT(BILL_STAT_FLG) SUM(DECODE(BILL_STAT_FLG,'C',1,0))
-------------------- ----------------------------------
4651756 0
I think both of them should be equal, whats going wrong
here?
Followup October 2, 2006 - 7am Central time zone:
you got that neato char datatype working for you there. I hate char, I will not use char, I will
never recommend char, and this - this is a classic reason WHY.
ops$tkyte%ORA10GR2> create table t ( x char(2) );
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t values ( 'C' );
1 row created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select decode( x, 'C', 'I am ''C''!', 'I am certainly NOT ''C''' ),
2 decode( x, 'C ', 'I am ''C ''!', 'I am certainly NOT ''C ''' )
3 from t;
DECODE(X,'C','IAM''C'' DECODE(X,'C','IAM''C''!
---------------------- -----------------------
I am certainly NOT 'C' I am 'C '!
You have NO 'C' in your table, you have 'C '. Decode works with varchar types by default, not char
types. When you compare a char to a varchar, the varchar better be padded out OR IT WILL NOT WORK
as you "expect" (but it will work entirely as ANSI asked for it to work)
ops$tkyte%ORA10GR2> variable x varchar2(2)
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec :x := 'C'
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from t where x='C';
X
--
C
ops$tkyte%ORA10GR2> select * from t where x=:x;
no rows selected
When you compare a literal to it (oh my gosh, please do NOT think I'm saying to use literals in
your code - I'm really not, bind stuff that varies from execution to execution!!!!) the rules state
"the literal shall be promoted to the type we compare to". Hence the char(1) that is 'C' becomes a
char(2) with 'C ' in it.
However, when you put 'C' into a varchar2, you are now comparing a varchar2 to a char and the
results are very different.
My suggestion would be to (I am deadly serious) implement an engineering change request on your
system to "remove that type, it wastes space, it causes confusion, it provides negative value".
very strange
October 2, 2006 - 8am Central time zone
Reviewer: abz
Oh, its really very confusing.
Problem is that we are using a 3rd party product
called CC&B from SPL www.splwg.com.
That product has used 99 percent of CHAR datatypes.
can you please tell me how can I convince them.
OR
Although you have explained it very well, but is there
a document like ALL ABOUT CHAR DATATYPE, which I can
circulate to all the developers in the organization, so
that they know that what should they expect when working
with CHAR datatypes.
Thanks
Followup October 2, 2006 - 9am Central time zone:
well, the documentation actually covers this. I cover it too in my book Expert Oracle Database
Architecture - chapter on datatypes - but I cover it from the perspective of "the char type is just
a waste, do not use it"
CHAR vs VARCHAR
January 15, 2007 - 12pm Central time zone
Reviewer: Deepak from India
Hi Tom,
First of all please excuse me because am going to ask a very trivial question. But it raises few points in my mind.
Have done this simple exercise...
SQL> conn scott/tiger
Connected.
SQL> create table char_vs_varchar(
2 col1 char(5),
3 col2 varchar2(5));
Table created.
SQL> insert into char_vs_varchar values('TEST','TEST');
1 row created.
SQL> COMMIT;
Commit complete.
SQL> SELECT * FROM char_vs_varchar
2 WHERE COL1=COL2;
no rows selected
SQL> SELECT * FROM char_vs_varchar
2 WHERE TRIM(COL1)=COL2;
COL1 COL2
----- -----
TEST TEST
SQL> SELECT * FROM char_vs_varchar
2 WHERE COL1='TEST';
COL1 COL2
----- -----
TEST TEST
Have the following queries...
I know that Oracle is doing the right thing as it right pads the string for the CHAR column with ' '.
>> But is it always recommended not to use this type of comparisions in queries?
>> Shall we always insert fixed length strings in CHAR columns? In other words, shall we decide on the CHAR data type based on the possible length of the string that will go in, CHAR would be selected for only FIXED length strings?
>> In the last example why does Oracle chose to trim the CHAR column value?
Please help me in understanding the basics?
Common defined types
February 20, 2007 - 2pm Central time zone
Reviewer: Andrew Markiewicz from Madison, WI
Tom.
My question/response is similar to a previous post on this thread about requiring all varchar2 to be >2000. Your response was
'Looks "safe" enough memory wise, but doesn't sound like a sound programming concept (well, it isn't
frankly).'
We were considering making a package spec to contain predefined types so that programmers could use them instead of defining their own type (most likely differently) in each package. Similar to the standards package Oracle defines but to include some standard plsql tables definitions.
Ex:
create or replace package typ
as
TYPE hashInt is TABLE of integer
index by varchar2(32000);
TYPE hashVc is TABLE of varchar2(32000)
index by varchar2(32000);
TYPE hashDt is TABLE of date
index by varchar2(32000);
TYPE vaInt is VARRAY(10000) of integer;
TYPE vaVc is VARRAY(10000) of varchar2(32000);
TYPE vaDt is VARRAY(10000) of date;
end;
And using the types would merely be,
...
xyz typ.hashDt;
abc typ.vaInt;
The main benefits are not having to redefine a similar type everywhere it is needed and homogeneous code.
My main concern was the memory consumption but it seems Oracle handles that dynamically, which is good for memory conservation.
I would not advocate the use of these types if there was a requirement to limit the varchar2 size and trap an appropriate error as part of an application, but rather just as a convenient method of using simple arrays.
Are there any drawbacks to you can think of or is this also not a sound programming technique?
Thanks.
Followup February 20, 2007 - 3pm Central time zone:
If you treat everything as a 32k string, everything will be a 32k string.
You lose the strong typing - which is generally considered "not a good thing".

February 21, 2007 - 12pm Central time zone
Reviewer: Andrew Markiewicz
But I am not suggesting that everything be a varchar2(32000). There is a plsql table defined for integers and dates as well so the data type is preserved. The main goal is to avoid having to define a type every time you just want a simple array of strings (or integers or dates).
An additional consideration could be to make a UDT VARRAY(32767). This adds the benefit of using the type constructor within function calls or even SQL.
create or replace type vcarr as VARRAY(32767) of varchar2(32768)
/
set echo on
col column_value format a20
select * from table(vcarr('x','y','z'));
COLUMN_VALUE
--------------------
x
y
z
declare
x vcarr := vcarr();
procedure proc1(p_arr in vcarr)
is
begin
for i in p_arr.FIRST .. p_arr.LAST
loop
dbms_output.put_line('i='|| i || ' ' || p_arr(i));
end loop;
end;
begin
proc1(vcarr('a','b','c','d'));
proc1(vcarr('hello', 'world'));
x.EXTEND(5000);
for i in 1 ..5000
loop
x(i) := to_char(i);
end loop;
proc1(x);
end;
/
i=1 a
i=2 b
i=3 c
i=4 d
i=1 hello
i=2 world
i=1 1
i=2 2
i=3 3
i=4 4
i=5 5
i=6 6
i=7 7
i=8 8
i=9 9
i=10 10
i=11 11
i=12 12
i=13 13
i=14 14
i=15 15
i=16 16
i=17 17
...
i=4999 4999
i=5000 5000
PL/SQL procedure successfully completed.
For the case of a type (or plsql table) I just abhor the idea of creating types like:
TYPE vc100 of varchar2(100)
TYPE vc500 of varchar2(500)
TYPE vc4000 of varchar2(4000)
...
when plsql handles the memory allocation dynamically (at a minor performance hit). Why not just one type per native data type (varchar2, integer, date,...)?
Again, for simple arrays that do not require constraints on the size of the data.
Unless of course Oracle has some designs on creating real arrays for plsql....
Andrew
Followup February 21, 2007 - 12pm Central time zone:
if you have a table of names (that should be 30 characters or less)
and a table of addresses (that should be 120 characters or less)
and you use a table of 32k - you'll have names that are 32k and addresses as well.
it is called "strong typing", think of the datatype as a DATA INTEGRITY check (because, after all, that is exactly what it is)
No, you won't convince me to use a few generic types for everything.
I might not convince you not to - but that is another story.
You asked my opinion, I'm of the opinion that creating the correct type for your code is the right approach.
.... for simple arrays that do not require constraints on the size of the data. ...
not sure where that occurs, or why it is onerous to create a type specific to an API.

February 21, 2007 - 1pm Central time zone
Reviewer: Andrew Markiewicz from Madison, WI
Fair enough.
That's why I'm debating the issue internally. I'm not convinced it's a great idea myself, just convenient.
As always, thanks for your input.
is there anything different varchar with varchar2?
March 14, 2007 - 10am Central time zone
Reviewer: Jerhyn from Seoul, South Korea
this place is so amazing, i always appraciate you tom.
while i read this topic, i was wondering what is different varchar with varchar2.
could you tell me? or just let me know where can i find about them.
Followup March 14, 2007 - 2pm Central time zone:
nothing right now, they are the same right now, they might differ in the future.
furture
March 14, 2007 - 3pm Central time zone
Reviewer: marc from usa
How far in the future did they think about? Oracle has been saying "for future use" since version 7. Right?
Followup March 15, 2007 - 7am Central time zone:
at least.
Char vs Varchar2, like vs =
March 28, 2007 - 2pm Central time zone
Reviewer: Jason Tsai from Santa Clara, CA USA
Tom,
This is the question regarding the reply you made for the second post in this thread from kiran.
I tried your test case in both 9.2.0.3 and 10.2.0.1, but got different results for predicate containing char datatype column and like operator:
In 9.2.0.3:
invmgmt_dev@SCORDEV3(95,1290)> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production
PL/SQL Release 9.2.0.3.0 - Production
CORE 9.2.0.3.0 Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.3.0 - Production
NLSRTL Version 9.2.0.3.0 - Production
Elapsed: 00:00:00.01
invmgmt_dev@SCORDEV3(95,1290)> create table t ( x char(10), y varchar2(10) );
Table created.
Elapsed: 00:00:00.01
invmgmt_dev@SCORDEV3(95,1290)> insert into t values ( 'x', 'x ' );
1 row created.
Elapsed: 00:00:00.00
invmgmt_dev@SCORDEV3(95,1290)> select * from t where x like 'x';
X Y
---------- ----------
x x
Elapsed: 00:00:00.00
invmgmt_dev@SCORDEV3(95,1290)> select * from t where y like 'x';
no rows selected
Elapsed: 00:00:00.01
In 10.2.0.1:
jason@ORA10GR2(143,8689)> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
5 rows selected.
Elapsed: 00:00:00.09
jason@ORA10GR2(143,8689)> create table t ( x char(10), y varchar2(10) );
Table created.
Elapsed: 00:00:00.01
jason@ORA10GR2(143,8689)> insert into t values ( 'x', 'x ' );
1 row created.
Elapsed: 00:00:00.00
jason@ORA10GR2(143,8689)> select * from t where x like 'x';
no rows selected
Elapsed: 00:00:00.01
jason@ORA10GR2(143,8689)> select * from t where y like 'x';
no rows selected
Elapsed: 00:00:00.01
After searching the metalink, I found a bug no. 4397534, which indicates the behavior in 9.2.0.3 is a bug.
Which behavior is correct?
Thanks,
Jason
Followup March 30, 2007 - 11am Central time zone:
why would I choose an answer other than what was stated in the bug report - that 9i had a bug that was corrected in 10g?
What is this behaviour of Oracle 10gR2
November 3, 2007 - 2am Central time zone
Reviewer: Sagar from Mumbai, India
Why varchar column is varchar2 in desc output?.
SQL> ed
Wrote file afiedt.buf
1* create table tvar ( col1 varchar(100),col2 varchar2(4000 ))
SQL> /
Table created.
SQL> desc tvar
Name Null? Type
----------------------------------------- -------- ----------------------------
COL1 VARCHAR2(100)
COL2 VARCHAR2(4000)
Followup November 5, 2007 - 11am Central time zone:
as stated above... and as documented:
in response to "what is the difference between varchar and varchar2"
... nothing right now, they are the same right now, they might differ in the future. ...
date as varchar
February 19, 2008 - 3pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
hello tom,
i know you say that dates should be stored in date columns but there is one case where i am not
sure about how to achieve that.
imagine you need to store a date value whose precision can be anything between a year and a second.
so the column may contain values like this:
'2008'
'2008-10'
'2008-10-03'
'2008-10-03 12:20'
etc.
we used to store these dates as a date column and as a number column where the number specified the
precision of the date, but working with such structure turned out to be quite cumbersome.
now we store these dates as a varchar ('yyyymmdd.....') - everything is much simpler now, but i
still don't feel well about it - i wonder if there is a way to achieve this functionality with date
datatype.
thank you
Followup February 20, 2008 - 7am Central time zone:
when you store 2008 in yyyymmdd - what do you store?
01-jan-2008 00:00:00 -> 2008
01-oct-2008 00:00:00 -> 2008-10
03-oct-2008 00:00:00 -> 2008-10-30
03-oct-2008 12:20:00 -> 2008-10-30 12:20
and how is it better/different then using a date as shown above.
if you store 2008 as 20080000000000 - then you have just stored the logical equivalent of a date column and a number column - just using one field (eg: how would that be any more or less cumbersome than what you already had)
dates
February 20, 2008 - 8pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
when you store 2008 in yyyymmdd - what do you store?
only the part of the date that we know is stored - so for year it's '2008', for a month it's '200810', etc.
if i only used a date column i would lose an important piece of information - the precision.
sometimes we do not know the exact date, we only know the year, or the year and month, etc. so storing 2005 as 2005-01-01 would be missleading for the users (that's why we used to have the additional number column specifying the precision).
for example - in the table containing statutory body members there are two date columns which tell us from when to when the person was a member of the statutory body of a company. when we want to select all records valid for a given date, we can write:
select * from statutory_body_members where :date_str between start_date and end_date||'z'
for this to work we would have to have different rules for storing the value of '2008' in start_date (2008-01-01) and in end_date (2008-12-31 23:59:59) (which of course would be possible but after several years of working with both ways of storing these dates, i find the 'varchar' way much simpler to use - especially due to the fact that only one column is needed to store the date)
Followup February 20, 2008 - 8pm Central time zone:
you said you stored yyyymmddhh24miss, so you don't. You use a variable mask.
The varchar approach might be "easier" for you, but throws off the computed cardinalities. Your choice.
Wonder how many of your dates are invalid.... Data types are integrity checks too.
I'm not a fan of it (using the wrong type)
what if you used two columns again, the start date and the interval (duration).

February 20, 2008 - 9pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
Wonder how many of your dates are invalid.... Data types are integrity checks too.
s simple check constraint can take care of that.
I'm not a fan of it (using the wrong type)
i know - that's why i do not feel well about our use of varchars
what if you used two columns again, the start date and the interval (duration).
all operations (that we perform) on these columns would become more complicated. (eg. the one thing that we do the most - store the date, display the date to the user - now all we do is store the string as it comes and then show it again - with dates you need to keep track of where to store the "start date" format of the date and where to store the "end date", the when reading the date you need to format the date with a function because again there is no way to pass the date containing only the year as a date)
it would be best if oracle was capable of storing this information in a date column, unfortunatelly it only distinguishes between a date and a date with time - maybe a new datatype - just 3 more bits and so much work could be saved (at least at our company :-) )
Followup February 20, 2008 - 10pm Central time zone:
.... s simple check constraint can take care of that. ...
so you've done that right?
with a start_date and a duration, you would not have to store the "format" anywhere.
the duration would cover that already.
start_date, start_date + interval
It does not distinguish between a date and a date with time, ALL DATES have time, all of them.

February 21, 2008 - 7am Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
i am not sure whether i understand how this start_date + interval could help me. could you please
show me sample data? for example for a person who was a member of the statutory body from '200204'
to '2007'. if i understand you well, you suggest to use two columns for both these dates?
Followup February 21, 2008 - 7am Central time zone:
show me how you do that with your current data first.
it'll be very similar.
provide your test data. Let us see examples of people that were there in 2005, another in October 2006 and someone for one day in 2007.
sample data
February 22, 2008 - 7am Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
here is a sample of real data (not very representative because 95% of strt_dt's and end_dt's in
this table are yyyymmdd, but it's those that are not that cause "problems"..)
create table SB_SAMPLE
(
ROW_ID NUMBER(20) not null,
ROW_VER NUMBER(10),
ROW_DT DATE not null,
COMPANY_ID NUMBER(20) not null,
TYPE_CD VARCHAR2(90),
PERSON_ID NUMBER(20) not null,
STRT_DT VARCHAR2(30),
END_DT VARCHAR2(30)
)
;
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (700, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 1700, '04', 887200,
null, '19960318');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1100, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 2500, '04', 884700,
null, null);
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1200, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 2800, '04', 167000,
'19950601', '199807');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1600, 2, to_date('22-06-2007', 'dd-mm-yyyy'), 3600, '04', 371300, '1994', '199708');
insert into SB_SAMPLE (ROW_ID, ROW_VER, ROW_DT, COMPANY_ID, TYPE_CD, PERSON_ID, STRT_DT, END_DT)
values (1900, null, to_date('22-10-2002 19:05:12', 'dd-mm-yyyy hh24:mi:ss'), 4000, '04', 1009200,
'1992', '19971231');
commit;
Followup February 22, 2008 - 7am Central time zone:
you did not answer my question.
please show how to retrieve any bit of data with your where clause.
whether it be year/yearmonth/yearmonthday/whatever.
I want to find those people I referred to above, how would *you* do it.

February 22, 2008 - 8am Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
ok, so basically we access these dates in 3 ways:
1) just display them (or store), so it's just
"select strt_dt from ..." - our nls_date_format is set
to 'yyyymmddhh24miss' so accessing date columns and varchar columns containing dates is transparent
to the middle tier.
2) select records that were valid on a given date (this is usualy (almost always) a part of a more
complex filter - eg: all members for a given set of companies and a given date):
select * from sb_sample where company_id in (1,2,...) and :date_str_param between between
nvl(strt_dt,'0') and end_dt||'z'
3) select records whose dates fall within a given range (used for monitoring changes of tha data).
eg:
select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'
(in 99% of cases the query is run with :date_str_param_to set to current date)
Followup February 22, 2008 - 12pm Central time zone:
so, it would be IDENTICAL.
You would store a start date, eg:
01-jan-2007
and an interval
1 year
now, start_date is obviously 01-jan-2007
and end_date is that plus 1 year.
So, how much harder is that then the string? (eg: none)
select * from sb_sample where company_id in (1,2,...)
and to_date(:date_str_param,fmt)
between between nvl(strt_dt,some_date_way_in_past)
and nvl(strt_dt+interval,some_date_way_in_future)

February 22, 2008 - 12pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
i do not think it's the same.
1) how would you store a records with dates like this:
i) start_date = '2005', end_date = '2006-03-02'
ii) start_date= '2005-03', end_date = '2007-04-02'
iii) start_date = '2005-01-06', end_date = '2009'
2) how woud you rewrite the third select:
select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'
to achieve the same functionality.
3) bear in mind that the same selects need
to be run on end dates too
i think what you sugest is functionally
equivalent to just having start_date
and end_date as two date columns,
Followup February 22, 2008 - 12pm Central time zone:
'2005', end_date = '2006-03-02'
start date of 01-jan-2005 and interval of your end date minus start date
Look: you have START DATES and END DATES
I'm saying store the START DATE and a DURATION
now, we both have start dates and end dates
how to rewrite this?
select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'
come one - you can do that given a date that is a start_date and another date that is an end_date equal to start_date plus duration.
instead of strt_dt||'z' you nvl(start_date, to_far_in_future)
instead of strt_dt, well, you just have start_date
instead of :date_str_param_to||'z' you have nvl(to_date(:bind),date_in_future)
actually, you could lose the strt_dt||'z' couldn't you - the null trick doesn't count since you just use strt_dt next.
All you need is
a) start_date
b) end_date
that is all, that is what you have, you are playing games with strings to get them to sort. We need not play games with dates, they just do the right thing in the first place.

February 22, 2008 - 12pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
and one thing - when i insert '2008' i need to select '2008', if i was to use what you suggest, how
would i do that?
i can't display '2008-01-01' to the user, when the value that was inserted (=the most precise value
we know) was '2008' - this is very important.
Followup February 22, 2008 - 12pm Central time zone:
that sounds like you want to store the display format when you save the data perhaps.
look - you apparently very much like your tricks with 'z' and everything. So be it. You can do it all with dates, you can do it just as easily with dates, and if you need a specific format for display purposes, by all means - go for it, that is pretty easy.

February 22, 2008 - 1pm Central time zone
Reviewer: A reader
that sounds like you want to store the display format when you save the data perhaps.
exactly - so there we are again. that's what we moved from - using a date + precision (or format whatever) - so instead of one column per date, i need two.
look - you apparently very much like your tricks with 'z' and everything. So be it
it's not about what i like or what i do not like - all i need is a solution that will be as "correct" as possible and at the same time, working with the data should be simple and straight forward.
instead of strt_dt||'z' you nvl(start_date, to_far_in_future)
well those two are not equivallent at all!
see
:date_param = '2006-05-01'
strt_dt = '2006';
while
:date_param<=strt_dt||'z' is TRUE
:date_param<=nvl('2006-01-01','2222-01-01') is FALSE
the thing is, every time i use a date after '<', '<=', or as the second operand of 'between' i need to make it the "latest second" in the given period (be it a year, month, day or whatever - so something<'2008' must be somtehing<'20081231235959', while something>'2008' is the same as somthing>'2008' (or '20080101' - ot's the same in this case) - but the same column can (and does) appear both in "less than" and "greater than" conditions)
if there is a simple way to achieve this with dates, i want to know it! - of course it's simple if you know that all records will hold 'yyyymmdd', or 'yyyymmmddhh24miss', but when every record can contain a different "precision" it's not that easy - so far i have not been able to come up with a way to achieve this with dates while keeping the simplicity of the solution anywhere near the varchar way..
it does not please me, believe me. i would love to have a date datatype remembering the exact value inserted.
Followup February 22, 2008 - 3pm Central time zone:
.... exactly - so there we are again. that's what we moved from - using a date + precision (or format whatever) - so instead of one column per date, i need two. ...
and so what, you are storing multiple bits of data encoded differently into a single field for your ease, your convenience, this is no different than using a field to store attribute_1 or attribute_2 - depending on how you feel.
You are just encoding data special to save yourself a keystroke. And adding the need for your 'z' tricks and such - pretending dates are strings and having to munge them to make them work.
...
it's not about what i like or what i do not like - all i need is a solution that will be as "correct" as possible and at the same time, working with the data should be simple and straight forward....
Ok, now here you are just plain wrong.
If I have:
a) start date
b) end date
in Oracle dates - which I do if I either store a start and end date (YOU ARE, just in a string, not in a date) or start date and interval - getting all of the data BY DATE is easy, straightforward, not cumbersome.
When you put it in a string, now you have to teach people the zen and art of the letter 'z'.
Your approach is cumbersome and requires training. Using DATES, using simple DATES - anyone should be able to retrieve the data.
And selecting
to_char(start_date, start_date_fmt), to_char(end_date, end_date_fmt)
does not seem cumbersome (heck - think 'view'). And inserting the format is not either (umm, you must have some editing in place right, you don't after all accept 20081 as valid input do you? how do you prevent that - well, just replace all of that code with collect a format and insert it. The DATE will be validated by the datatype, your format is on of YYYY YYYYMM YYYYMMDD YYYYMMDDHH24MISS - easy for a check constraint)
I still am not seeing your perceived complexity in getting data out that has DATES in it.
I don't know why you used start date there, you would use start date and end date to get things that happen in a range.
You had:
select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'
(which frankly, the 'z' trick, confusing - unless you invented it)
if you want overlaps, you only need:
(start_date <= :p_ending) AND (end_date >= :p_beginning)
that is more clear that 'z'

February 22, 2008 - 3pm Central time zone
Reviewer: A reader
<i>if you want overlaps, you only need:
(start_date <= :p_ending) AND (end_date >= :p_beginning)
</i>
did i say that overlaps are a problem? not at all - it's the other query that takes avdantage of
varchar storage:
<i>
You had:
select * from sb_sample where
strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z' </i>
so please - show me how to rewrite this simple query using dates instead of varchar.
Followup February 22, 2008 - 3pm Central time zone:
tell us what that query does.
put it out in specification form. I'm not following the 'z' logic frankly.
because you know, it looks like "find me things that overlap with my range", apparently - it isn't.
In fact, it is what you said:
3) select records whose dates fall within a given range (used for monitoring changes of tha data).
eg:
select * from sb_sample where strt_dt||'z'>=:date_str_oparam_from and
strt_dt<=:date_str_param_to||'z'
(in 99% of cases the query is run with :date_str_param_to set to current date)
taking what you said in that, my predicate is what you use with dates.

February 22, 2008 - 3pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
ok, an example:
the user enters a range of dates: "from" and "to"
(again - they can enter just a year, year+month, date, date with time - but that's not important
now)
now they want all records whose start_date (the same goes for end dates) is in that range, so, if
they enter
'2006-07-21' - '2007-02-03'
they want all records whose start_date was between those two dates (including records where the
start_date is '2006', '2006-07',etc.)
or they can enter '2006-10' - '2007'
and the same applies.
so if the date is '2006' it must be returned for all ranges that overlap with this year at least by
a second.
Followup February 22, 2008 - 4pm Central time zone:
and that is what my predicate would return.
Your example has a from/to
Your front end gets a from/to
Your front end adjusts those from/to (just like you add your 'z')
So, if they enter from 2006 to 2007, then yes, your front end would be
2006-01-01-00:00:00 2007-12-31-23:59:59
I see that as being no different than 'z', your front end accepts data, and does the right thing - regardless.
so, if you have dates, this is not hard - certainly *not impossible or impracticable'
but like I said, you seem extremely happy with what you have, I'm not going to change that - you have 'z', you like it.
I look at this and say "the front end would easily be able to accept input dates - validate these inputs - and bind the correct value.
Heck, you could even bind STRINGS if you wanted.... A simple CASE statement in the where clause would convert them.
for the start date and then a different one for end date
case when length(:bind) = 4 /* must be yyyy */ then to_date ....

February 22, 2008 - 4pm Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
it has nothing to do with the front end and how the parameters are sent
look:
alter session set nls_date_format = 'yyyymmddhh24miss';
create table dates (
val_txt varchar2(20) not null,
val_dt date not null
);
insert into dates values ('2006','20060101');
insert into dates values ('200610','20061001');
insert into dates values ('20061021','20061021');
commit;
-- user input:
-- from = '2006-02-01', to '2006-10-02'
-- correct result:
select *
from dates
where val_txt||'z'>='20060201'
and val_txt<'20061002'
-- wrong result
select *
from dates
where val_dt between '20060201' and '20061002235959'
it's all about the need for '2006' to fullfill "between '20060201' and '20061002'"
i can do that with dates - but a simple >= becomes long chain of "or's"
Followup February 22, 2008 - 4pm Central time zone:
I said - the database table would always have
a) start_date
b) a way to derive END DATE - either by putting in there or putting an interval, you choose.
so, the table would have start_date, end_date
never 2006
it would have the first day of 2006, last day of 2006 - or first day of 2006 and an interval of one year.
AND IT IS ONE AND - no ORS
I'm not getting it.
start_date
end_date/interval
simple where clause
done.

February 23, 2008 - 3am Central time zone
Reviewer: dobromil kautsky na stroji husgvarna
i'm so stupid!
i think we still misunderstand each other - i have no problem with representing the interval -
start_date, end_date, but to work with the one column (be it start_date, or end_date).
when i need to filter records based on their start_date (it's not related to end_Date or any other
columns) - however i store tha start_date representing '2006' i need to make sure, that when i
compare this column to another date value with ">" or ">=" (like in start_date>=:param), this
start_date must not act as 2006-01-01 but as 2006-12-32 23:59:59, but at the same time when i
compare it with "<" or "<=" it must act as 2006-01-01.
(like in the example where the user wanted all records whose start_date is between '2006-02-01' and
'2006-10-02' - when they search this interval, they want to see the records with start_date =
'2006' too, which is '2006-01-01' when stored as a date and does not match this condition)
i was so biased agianst having each date (that needs this variable precision) in two columns, that
not finding a simple soliution in sql i did not even think of using a function, on which i could
build an index (which is what i do with strt_dt||'z' anyway) and then use
getlastsecof(start_date,start_date_precision)>=:param
just like i used strt_dt||'z'
maybe i needed to hear that storing the date in two columns is ok (more ok than using the string)
to be able to see the obvious.
thanks :-)
Followup February 24, 2008 - 11am Central time zone:
... but to work with the one column (be it
start_date, or end_date). ..
and I find that to be a bad practice. You are encoding multiple attributes into a single string, needing to use the "z" trick. It is all tricks and all not a good practice.
When you start encoding data in an attribute - you are doing something not good in general (in fact, I'm hard pressed to come up with a case whereby encoding data magically into an attribute was a good thing - I cannot).
You have to teach people how to query your data. That in itself is a bad thing. This 'z' trick - is just that, a neat trick - but it indicates a deeper issue - the data is munged in there in a tricky way.
That is why I do not like it, I like straightforward, simple, correct...
10g and varchar2 optimization
June 3, 2008 - 8pm Central time zone
Reviewer: Andres from Estonia
about the original answer and especially the "more direct proof" above (http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1542606219593#28397955862861 ).
It seems, that the line whether the amount of memory is allocated at once or dynamically has moved in 10g from 1999/2000 to 4000/4001 (despite the documentation still claims it to be 1999/2000). I mean, declaring
x varchar2(4000) := 'x'; takes 4000 bytes of memory
and
x varchar2(4001) := 'x'; takes 1 byte.
Now, when in plsql dealing with varchar2(4000) columns that is 50% emtpty, 49% contain less than 100 bytes and 1% several thousand bytes, it would be more wize to declare as variables varchar2(4001) instead of table.column%type?
Followup June 4, 2008 - 9am Central time zone:
this is why I like to post these snippets - things change and it is nice to be able to verify that the advice of yesterday still does OR does not apply...
as for the last bit, it'll always be a trade off between performance and memory.
4001 - allocate less overall memory, but do so dynamically - might impose a runtime ping
4000 - allocate more overall memory, but run as fast as you can..
Char vs. varchar storage for Null
June 19, 2008 - 7pm Central time zone
Reviewer: Nitin from Boston, MA USA
Tom, This is extremely informative - but I couldn't definitively derive the answer to something
that's key to some database size issues we're facing. I get that character data pads up to the
defined size - so characters are bad. But then there are third party apps that you can't change
over to varchar :-)
So the question is - if I predominantly only have NULL values in a character column, will it still
take up the entire padded space storage wise? Or are NULLs a special condition under which char and
varchar take up the same amount of storage space on disk?
Followup June 19, 2008 - 8pm Central time zone:
if null, they will consume just the space needed to indicate "null"
ops$tkyte%ORA10GR2> create table t ( x char(1000) );
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t select null from all_objects;
49738 rows created.
ops$tkyte%ORA10GR2> commit;
Commit complete.
ops$tkyte%ORA10GR2> exec show_space( 'T' );
Unformatted Blocks ..................... 0
FS1 Blocks (0-25) ..................... 0
FS2 Blocks (25-50) ..................... 0
FS3 Blocks (50-75) ..................... 0
FS4 Blocks (75-100)..................... 5
Full Blocks ..................... 75
Total Blocks............................ 88
Total Bytes............................. 720,896
Total MBytes............................ 0
Unused Blocks........................... 0
Unused Bytes............................ 0
Last Used Ext FileId.................... 4
Last Used Ext BlockId................... 22,753
Last Used Block......................... 8
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> truncate table t;
Table truncated.
ops$tkyte%ORA10GR2> insert into t select 'x' from all_objects;
49738 rows created.
ops$tkyte%ORA10GR2> exec show_space( 'T' );
Unformatted Blocks ..................... 0
FS1 Blocks (0-25) ..................... 2
FS2 Blocks (25-50) ..................... 0
FS3 Blocks (50-75) ..................... 1
FS4 Blocks (75-100)..................... 4
Full Blocks ..................... 7,103
Total Blocks............................ 7,296
Total Bytes............................. 59,768,832
Total MBytes............................ 57
Unused Blocks........................... 64
Unused Bytes............................ 524,288
Last Used Ext FileId.................... 4
Last Used Ext BlockId................... 56,457
Last Used Block......................... 64
PL/SQL procedure successfully completed.
Char Vs Varchar2
January 8, 2009 - 4am Central time zone
Reviewer: A reader
Hi Tom,
Could you please let me know, if there is any benefit in having CHAR(1) instead of VARCHAR2(1) in
the database? i.e., in relation to performance, memory structure, execution plan etc.
Regards,
Followup January 8, 2009 - 9am Central time zone:
nope, they would be identical.
so my advice is "do not use char(), it would just cause confusion, pretend char() does not exist"
testcase for timestamp datatype
May 25, 2009 - 6am Central time zone
Reviewer: jvs from India
Dear Tom,
good day to you as always, if you can please comment on the below test case to proof that timestamp datatype should be used to store timestamp values and not varchar2 datatype.
create table testts (ts_str varchar2(24),ts_ts timestamp);
insert into testts select to_char(to_timestamp('31121995135022833459','ddmmyyyyhh24missff') +
numtodsinterval(rownum,'DAY'),'ddmmyyyyhh24missff6') ts_str,
to_timestamp('31121995135022833459','ddmmyyyyhh24missff') + numtodsinterval(rownum,'DAY') from
all_objects;
commit;
create index idx_ts_ts on testts(ts_ts);
create index idx_ts_str on testts(ts_str);
execute dbms_stats.gather_table_stats(user,'TESTTS',estimate_percent => null,cascade => true);
select * from testts where ts_str between '01011996135022833459' and '02021996135022833459';
-- Card and consitent gets will be high
select * from testts where to_timestamp(ts_str,'ddmmyyyyhh24missff9') between
to_timestamp('01011996135022833459','ddmmyyyyhh24missff9') and
to_timestamp('02021996135022833459','ddmmyyyyhh24missff9');
-- Card may be fine but consitent gets will be high
select * from testts where ts_ts between to_timestamp('01011996135022833459','ddmmyyyyhh24missff')
and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');
-- Card and consistent gets will be fine.
I have not pasted the output of auto trace, but overall is the test case fine.
Thanking you for your time on this.
Regards,
jvs
Followup May 26, 2009 - 9am Central time zone:
... good day to you as always, if you can please comment on the below test case to proof that timestamp datatype should be used to store timestamp values and not varchar2 datatype. ...
you need no testcase for that.
it is called common sense, logic, doing it correctly.
if you use a string/number to store a date you will get corrupt data some day (you will, not "you might", but rather YOU WILL. Someone will insert feb-29, 2009 - some day. IT WILL happen)
if you use a string/number to store a date, you will consume MORE storage. It takes more space.
If you use a string/number to store a date, you will be plagued with conversion errors.
if you use a string/number to store a date, you WILL confuse the optimizer - causing it to incorrectly estimate cardinalities. This will result in incorrect plans and poor performance (funny, you know what type of systems typically do this extremely poor practice? data warehouses - precisely the systems whereby good query plans are VITAL)
ops$tkyte%ORA10GR2> create table testts (ts_str varchar2(24),ts_ts timestamp);
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into testts select
2 to_char(to_timestamp('31121995135022833459','ddmmyyyyhh24missff') +
3 numtodsinterval(rownum,'DAY'),'ddmmyyyyhh24missff6') ts_str,
4 to_timestamp('31121995135022833459','ddmmyyyyhh24missff') +
5 numtodsinterval(rownum,'DAY')
6 from dual connect by level <= 65000;
65000 rows created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> commit;
Commit complete.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index idx_ts_ts on testts(ts_ts);
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index idx_ts_str on testts(ts_str);
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> execute dbms_stats.gather_table_stats(user,'TESTTS');
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @at
ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from testts where ts_str between '01011996135022833459' and
'02021996135022833459';
Execution Plan
----------------------------------------------------------
Plan hash value: 3830092671
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2181 | 69792 | 28 (4)| 00:00:
|* 1 | TABLE ACCESS FULL| TESTTS | 2181 | 69792 | 28 (4)| 00:00:
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TS_STR"<='02021996135022833459' AND
"TS_STR">='01011996135022833459')
full scan, prompted by the optimizer guessing 2,181 rows... there are lots of possible strings
between those two strings...
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from testts where to_timestamp(ts_str,'ddmmyyyyhh24missff9') between
2 to_timestamp('01011996135022833459','ddmmyyyyhh24missff9') and
3 to_timestamp('02021996135022833459','ddmmyyyyhh24missff9');
Execution Plan
----------------------------------------------------------
Plan hash value: 3131276641
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 163 | 5216 | 34 (21)| 00:00
|* 1 | FILTER | | | | |
|* 2 | TABLE ACCESS FULL| TESTTS | 163 | 5216 | 34 (21)| 00:00
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24missff9')
<=TO_TIMESTAMP('02021996135022833459','ddmmyyyyhh24missff9
2 - filter(TO_TIMESTAMP("TS_STR",'ddmmyyyyhh24missff9')>=TO_TIMESTAMP
('01011996135022833459','ddmmyyyyhh24missff9') AND
TO_TIMESTAMP("TS_STR",'ddmmyyyyhh24missff9')<=TO_TIMESTAMP
22833459','ddmmyyyyhh24missff9'))
full scan prompted by lack of index - and if you say "function based index" then I say "so, you
store a string and then to_timestamp it in an index - why? why???? why do it wrong??
and the estimated card= value is still off, better, but off
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from testts where ts_ts between
2 to_timestamp('01011996135022833459','ddmmyyyyhh24missff')
3 and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');
Execution Plan
----------------------------------------------------------
Plan hash value: 2730030766
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 320 | 3
|* 1 | FILTER | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| TESTTS | 10 | 320 | 3
|* 3 | INDEX RANGE SCAN | IDX_TS_TS | 10 | | 2
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24missff')<
AMP('02021996135022833459','ddmmyyyyhh24missff'))
3 - access("TS_TS">=TO_TIMESTAMP('01011996135022833459','ddmmyyyyhh24
AND "TS_TS"<=TO_TIMESTAMP('02021996135022833459','ddmmyyyy
use the right type, get the right cardinality, get the right plan
ops$tkyte%ORA10GR2> set autotrace off
ops$tkyte%ORA10GR2> select count(*) from testts where ts_ts between
2 to_timestamp('01011996135022833459','ddmmyyyyhh24missff')
3 and to_timestamp('02021996135022833459','ddmmyyyyhh24missff');
COUNT(*)
----------
33
thanks for your inputs
May 26, 2009 - 12pm Central time zone
Reviewer: jvs from India
Dear Tom,
good day to you, thanks a lot for your time and help on this, I can now prove with the test case to
people who don't have the common sense to understand proper data type usage.
Thanks and Regards,
jvs
char > varchar2 Referencial integrity
May 26, 2009 - 5pm Central time zone
Reviewer: FMedina from Spain
Hello Tom,
I have the following problem:
-- 1/
CREATE TABLE FIELD_CHAR_MASTER(FIELD_PARENT CHAR(12));
-- 2/
CREATE TABLE FIELD_VARCHAR2_DETAIL(FIELD_CHILD VARCHAR(12));
-- 3/
ALTER TABLE FIELD_VARCHAR2_DETAIL ADD (
CONSTRAINT MASTER_DETAIL FOREIGN KEY (FIELD_CHILD)
REFERENCES FIELD_CHAR_MASTER (FIELD_PARENT));
-- 4/
ORA-02267: column type incompatible with referenced column type
The FIELD_CHAR_MASTER table is big and it´s used for a lot of objects and querys. It has other
child tables with CHAR(12) datatype. It not "easy" alter these table.
Is there any way to implement that referencial integrity?
Should I use de CHAR datatype in the future if it´s necesary another child table?
[ Database Version: 9.2.0.8.0 ]
Thanks in advanced
Followup May 27, 2009 - 8am Central time zone:
you should never have used the char() type at all - but since you did and since it is the parent, you'll need to modify your CHILD table to have an appropriate type.
Of course you should use a char type in the future if referential integrity is to be supported, you have no other choice (other than to change everything else to be a varchar2)
CHAR to VARCHAR in Stored Procedure param?
June 23, 2009 - 2pm Central time zone
Reviewer: Nigel Olding from California
When I changed my stored procedure parameter to use the datatype of the column in the table from
which the data was being retrieved, my CHAR(1) column was returned as a maximum-sized VARCHAR2 to
my calling PL/SQL.
Using this table...
create table NTEST_CHAR1
(
COLUMN1 CHAR(1) not null,
COLUMN2 CHAR(1) not null,
COLUMN3 CHAR(1) not null
);
...with this data...
insert into ntest_char1 n
values ('1','A','a');
...this procedure...
CREATE OR REPLACE PROCEDURE test_char1(p_col2 OUT ntest_char1.column2%TYPE) IS
BEGIN
SELECT n.column2 INTO p_col2 FROM ntest_char1 n WHERE rownum = 1;
END test_char1;
...returns a max-sized VARCHR2...
begin
declare l_col2 varchar(32000);
begin
test_char1(l_col2);
dbms_output.put_line(length(l_col2));
end;
end;
Output is:
32000
Is this the expected behaviour? Shouldn't it be 1 character?
If I declare the stored proc param as OUT VARCHAR2, it is returned as 1 character...
Oracle 9.2.0.1.0
Thanks,
Followup June 26, 2009 - 9am Central time zone:
plsql parameters are always unconstrained types - this is the way is was designed to work, yes.
Reason #3213 to never never never use the CHAR type
char(1) vs. varchar2(1)
June 25, 2009 - 5pm Central time zone
Reviewer: David from Austin, TX
Tom,
if char(1) and varchar2(1) are really the same thing in terms of storage and performance, what's
the reason to keep the CHAR data type in Oracle?
Thanks,
David
Followup June 26, 2009 - 10am Central time zone:
none, except the standards.
char(1) vs. varchar2(1)
June 26, 2009 - 1pm Central time zone
Reviewer: David from Austin, TX
when responding to my question about the reasons to keep CHAR data type in Oracle, you said
"none, except the standards.". What Standards are you referring to? Also, do you think it's
appropriate to eliminate the use of CHAR in table design in my own organization? Thanks, David
Followup June 26, 2009 - 2pm Central time zone:
ANSI-ISO SQL Standards
I have written many times "pretend char does not exist, use only varchar2"

June 30, 2009 - 2pm Central time zone
Reviewer: Chuck Jolley from OKC, OK USA
When we upgraded from 8i to 10g we converted all of our char columns and pl/sql variables to
varchar2 because some of our code broke related to chars and it seemed too much trouble to go
through tens of thousands of lines of code and hundreds of tables testing for obscure data
conversion and comparison errors just to upgrade.
Our tables and code base started in Oracle 7.
The process was mostly scriptable and relatively painless.
The only place we had to do hand modifications were a few lines of code in our front end that
expected padded data as a workaround for the behavior of chars in the first place.
Not a high cost to never have to worry about something again.
varchar2 (char) to varchar2 (byte)
July 21, 2009 - 11pm Central time zone
Reviewer: Stuart from Auckland, New Zealand
Hi Tom,
I have a 10.2.0.4 database (character set=UTF8, nls_length_semantics=byte), where the developer
created a table as follows:
create table xxxx
( column_a_id number,
column_b_text varchar2(255 char),
column_c_text varchar2(255 char),
...
...
Unfortunately, the user querying a view on this base table shows the data like it is of CHAR
datatype (with lots of trailing spaces). Researching this, I found the tool has a bug which reads
as follows: (http://support.sas.com/kb/20/088.html)
"When you access an Oracle database from SAS® in single-byte encoding and set the database
characterset to UTF8, the SAS/ACCESS Interface to Oracle might return incorrect byte lengths for
character columns that have lengths defined in CHAR semantics (lengths are usually much larger than
needed).
This problem occurs when using SAS/ACCESS Interface to Oracle for CHAR and VARCHAR2 columns and
reflects byte lengths based on the database characterset and not the client-side (SAS) encoding.
For example, for a VARCHAR2(3 CHAR) column in an Oracle UTF8 database, the byte length surfaced in
SAS is tripled (9). So all the CHAR and VARCHAR2 column lengths are tripled in SAS even though the
client-side encoding is single byte. "
So Oracle is fine. Doing a length() function on the data shows the correct length, which confirms
I'm hitting this bug.
Rather than adding the suggested hotfix to all the PC's, or changing the table structure to
varchar2(byte), I'd prefer to update the existing view on this table.
Question 1:
Is it possible to do a conversion in the SQL for the view creation where it converts varchar2 char
to byte?
Question 2:
Although our developer is adamant we don't store any multibyte data, I'm loathe to go about
changing the varchar2 datatype semantics across our schema. Just out of interest, if I did go
down this track (I'd install the hotfix rather than this option), would there be any ramifications
apart from the possibility of truncating any data that might be multi-byte?
If I have to install the hotfix, then so be it.
Followup July 26, 2009 - 6am Central time zone:
Question 1:
Is it possible to do a conversion in the SQL for the view creation where it
converts varchar2 char to byte?
when you tested, what did you discover? In any case - you would need to substr AND cast in the view. varchar2(255 char) in a UTF8 database can be a lot larger than varchar2(255 byte)
Although our developer is adamant we don't store any multibyte data, so, they have run a scan of the entire database right? Else, they do not know.
You would/could truncate data - yes.
varchar2 (char) to varchar2 (byte)... continued
July 22, 2009 - 12am Central time zone
Reviewer: Stuart from Auckland, New Zealand
Further to the previous post,...
I've seen a few base tables which have a mix of:
column_name_a varchar2(99 byte)
column_name_b varchar2(99 char)
So it looks like the hotfix won't work, as it specifically states in the caveats that the hotfix
will fail with tables created with mixed semantics.
So it looks like I either
a. update the tables to be varchar2 byte semantics
b. somehow update the views to convert varchar2 char to byte semantics
If you can advise on my previous question, it would be much appreciated.
Followup July 26, 2009 - 6am Central time zone:
it'll be up to you as to the approach you want to take on this one.
Modifying varchar2 column to increase size
September 11, 2009 - 2pm Central time zone
Reviewer: KK from india
Tom,
We have a requirement to modify a varchar2 column size increased. The table is essentially a
partitioned table and billions of rows are there. As varchar2 allocates space according to inserted
value, does this operation will affect only data dictionary. I mean will this change will be quick
enough to modify data dictionary alone and not modify any tablespaces which holds the data right
now.
Followup September 14, 2009 - 1pm Central time zone:
it won't touch any existing data, it just modifies the data dictionary.
strange behaviour or varchar and char
October 7, 2009 - 7am Central time zone
Reviewer: MAYANK GUPTA from Delhi,INDIA
Hi Tom,
I used these following steps:--
drop table a cascade;
create table a (a varchar(5),b char(5));
insert into a values('abc','abc');
insert into a values('xyz ','xyz '); --trailing space after z in both cases.
select * from a where a=b;
POSTGRES:
a | b
------+-------
abc | abc
xyz | xyz
(2 rows)
ORACLE 10g:
No data Found
and if I am using same datatype then oracle give results
Regards:
Mayank Gupta
cont: 09711065568
Followup October 8, 2009 - 7am Central time zone:
<quote src=Expert Oracle Database Architecture>
It is for this reason – the fact that a CHAR/NCHAR is really nothing more than a VARCHAR2/NVARCHAR2 in disguise – that makes me of the opinion there are really only two character string types to ever consider – VARCHAR2 and NVARCHAR2. I have never found a use for the CHAR type in any application. Since a CHAR type always blank pads the resulting string out to a fixed width, we discover rapidly that it consumes maximum storage both in the table segment as well as any index segments. That would be bad enough, but there is another important reason. It creates confusion in applications that need to retrieve this information (many cannot “find” their data after storing it). The reason has to do with the rules of character string comparison, and the strictness with which they are performed. Let’s use ‘Hello World’ string in a simple table to demonstrate:
ops$tkyte@ORA10G> create table t
2 ( char_column char(20),
3 varchar2_column varchar2(20)
4 )
5 /
Table created.
ops$tkyte@ORA10G> insert into t values ( ‘Hello World’, ‘Hello World’ );
1 row created.
ops$tkyte@ORA10G> select * from t;
CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
ops$tkyte@ORA10G> select * from t where char_column = ‘Hello World’;
CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
ops$tkyte@ORA10G> select * from t where varchar2_column = ‘Hello World’;
CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
So far they look identical, they appear to be the same – but in fact, there has been some implicit conversions happening – the CHAR(11) literal was promoted to a CHAR(20) and blank padded when compared to the CHAR column. We see for a fact that these two strings are materially different:
ops$tkyte@ORA10G> select * from t where char_column = varchar2_column;
no rows selected
They are not equal to each other. We would have to either blank pad out the VARCHAR2_COLUMN to be 20 bytes in length or we would have to trim the trailing blanks from the CHAR_COLUMN:
ops$tkyte@ORA10G> select * from t where trim(char_column) = varchar2_column;
CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
ops$tkyte@ORA10G> select * from t where char_column = rpad( varchar2_column, 20 );
CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
NOTE: there are many ways to blank pad the varchar2_column, such as using the CAST() function.
The problem comes in with applications. Many of them will in fact utilize varying length strings when they bind inputs – with the resulting “no data found” that is sure to follow. For example:
ops$tkyte@ORA10G> variable varchar2_bv varchar2(20)
ops$tkyte@ORA10G> exec :varchar2_bv := ‘Hello World’;
PL/SQL procedure successfully completed.
ops$tkyte@ORA10G> select * from t where char_column = :varchar2_bv;
no rows selected
ops$tkyte@ORA10G> select * from t where varchar2_column = :varchar2_bv;
CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
So here, the search for the VARCHAR2 string worked but the CHAR column did not. The varchar2 bind variable will not be promoted to a CHAR(20) as the character string literal is. At this point, many programmer gets of the opinion that “bind variables don’t work, we have to use literals”. That would be a very bad decision indeed. So, the solution is to bind using a CHAR type:
ops$tkyte@ORA10G> variable char_bv char(20)
ops$tkyte@ORA10G> exec :char_bv := ‘Hello World’;
PL/SQL procedure successfully completed.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select * from t where char_column = :char_bv;
CHAR_COLUMN VARCHAR2_COLUMN
-------------------- --------------------
Hello World Hello World
ops$tkyte@ORA10G> select * from t where varchar2_column = :char_bv;
no rows selected
Only partially – if you mix and match VARCHAR2 and CHAR, you’ll be running into this issue constantly. Not only that, but the developer is now having to consider the field width into their applications. If they opt for the rpad() trick to convert the bind variable into something that will be comparable to the CHAR field (it is preferable of course to pad out the bind variable, rather than TRIM the database column!). Now if the size of the field changes – the application is impacted, it must change it’s field width.
It is for these reasons – the fixed width storage, which tends to make the tables and related indexes much larger than normal, coupled with the bind variable issue that makes me avoid the CHAR type in all circumstances. I cannot even make an argument for it in the case of the 1 character field – for in that case, it is really of no material difference. The VARCHAR2(1) and CHAR(1) are identical in all aspects. There is no compelling reason to use the CHAR type in that case and in order to avoid any confusion – I just say “no” for even the CHAR(1) field.
</quote>
pga_aggregate_target
October 12, 2009 - 4am Central time zone
Reviewer: Gurudarshan G P from india
Hi Tom,
I would like to ask you that where the pga will be allocated when the userprocess
started,either in the serverside or in the client side. And if we dont set pga_aggregate_target
value then how to check that the allocated pga memory.
Awaiting for ealiest relpy.
What about MD5 sums that are always 32 characters?
November 18, 2009 - 12pm Central time zone
Reviewer: Owen from Denver, USA
I'm one of those ignorant developers who has the misguided notion of char being better for fixed
lengths. Please set me straight:
We're storing MD5 checksums which are always, no matter what, exactly 32 characters. Is it still
better to use varchar2(x) instead of char(32)? The column is currently defined as a varchar2(300)
because the dba says 'it doesn't matter just as long as it is long enough' - is this true? Even if
varchar2 wins over char, it seems to me it needs to be varchar2(32) because 32 needs only 5 bits to
store/handle the length but 300 needs 9. We use WHERE clauses against this column and there are
over 100M rows so we need so squeeze the performance as much as possible. Right now it's slower
than we need it to be.
PS - There's no index because the values are unique and a lot of other columns need to be returned
so such an index would have 1:1 with the table and take practically just as much space. Is this a
correct analysis?
Thanks for any tips!
Followup November 23, 2009 - 3pm Central time zone:
a char(n) is simply a varchar2(n) blank padded to be always N characters.
So, if you store something that is always 32 characters - a char(32) and a varchar2(32) are *identical*. Identical on disk, identical in memory.
Which leads me to turn the question around - is it better to use a char(32)? What do you think you would gain by it?
I believe you should not use char(n) ever - for the simple reason that is makes you forget they exist and you are not ever tempted to use them incorrectly.
forget about bits, nothing happens at the bit level here. The leading length indicator will be 1 byte for 255 or less and 2 bytes for anything else.
I believe your DBA is mistaken however in their statement - you want a datatype to be as RESTRICTIVE as possible, the only thing you'll get by using varchar2(300) to hold a field that can only be varchar2(32) is - someday someone will put 300 characters in there - and then what? data integrity - goodbye.
The stupid thing is - we could alter a varchar2(32) to be a varchar2(300) if the need to do so arises. However, to alter a varchar2(300) to the correct length of 32 would take a reorganization of the entire table. 300 was wrong, 32 is correct and a char(32) would provide no benefit over a varchar2(32) and since you never really want to remember char exists - do not use it.
... PS - There's no index because the values are unique and a lot of other columns
need to be returned so such an index would have 1:1 with the table and take
practically just as much space. Is this a correct analysis?
...
if this is an md5 checksum and the values must be unique - I don't really CARE how much space it takes. What will happen to your application when (not if, but when) a duplicate value occurs. You know that the space of 32 character md5 checksums is FINITE right (it is big, but finite) and your inputs - can and will generate duplicate hashes - it will happen someday - and then what?
If you have a unique constraint - unique it. Space be darned.
If you access this data by the checksum - index it for performance reasons.
If you need this to be unique - put the constraint on.
|