Space used by RAW
Mohan K, July 22, 2003 - 4:34 am UTC
I have a very latge table of two fields ech field containing positive integers upto 50000. If I conver the number datatype to RAW(2) then each field will ocuupy only 2 bytes. So each block can store more number of records and the number of records fetched into SGA by a query will be more.
I read in Oracle document that RAW datatype is same as varchar2 since it is of variable length. What will be additional overhead of variable length in storage.
Is it really of any use to convert into RAW datatype or shall I convert into hexdecimal( uses 4 bytes) or the number datatype itself is the same as the above methods.
Mohan
July 22, 2003 - 8:29 am UTC
why not use a number?
how will use manipulate this thing otherwise? sum it? avg it? min/max it? ANYTHING to it.
raws in keys
max, January 29, 2005 - 8:09 am UTC
hi tom,
i 've two questions regarding datatypes within keys/indices:
1st: we want to use SYS_GUID's return values (16 bytes raw) as primary keys. would you recommend to use a number instead and introduce an alternate key?
2nd: would you recommend to *NOT* generate such an unique universal identifier by a java routine inside the database (pro: the exact same code could be run outside that certain database in case it is not responsible for generating such key)?
January 29, 2005 - 8:59 am UTC
1) if you want to use sys_guid, you won't be using a number. I'm confused?
do you need a sys_guid() type of identifier?
2) i would recommend doing as little as possible outside the database, since when java becomes "old fashioned" like C++, C, Cobol, RPG and so on before it, you want to still be able to use the data without having to rip logic out of code no one really understands anymore.
max, January 29, 2005 - 10:33 am UTC
as for 1:
yes - we have such requirements. but we usually use surrogate keys of type number. is it - in general - a bad idea to "switch" to the uid value (or some 'large' varchar) instead for RI purposes?
as for 2:
but we'll be fine at least as long as ORACLE supports JAVA ;o) our goal is to have a uniform way to generate those identifiers across all platforms and applications. but could such a java routine for key value generation become a bottleneck (compared to sequences)?
January 29, 2005 - 11:33 am UTC
1) no, it is not.
2) it goes the other way, the language has to support the database....
The java routine would become a bottleneck when someone wants to use something other than java. I would use either a sequence or sys_guid() in the database -- do not reinvent very well designed wheels.
Not "if" but "when". That, along with death and taxes, are assured. The language, it'll change. I don't know how many "this is the last language/programming paradigm/methodology/technique/whatever ever" I've used. :)
max, January 29, 2005 - 1:08 pm UTC
1: our only/main concern is that 16 (or even more bytes) of raw or varchar content would consume much more space than a number type would and index maintenance would be slowed down ...
2: so java calls wouldn't introduce problem with say parallel inserts and such?
btw: is the generation algorithm of SYS_GUID published?
January 30, 2005 - 9:02 am UTC
1) will it be different? of course. The indexes won't be "right hand" indexes anymoer (the guid will tend to be randomly inserted.
So hey, on the bright side, less contention for that right hand side!
16 bytes is so so so very small in the year 2005 when you think about it. Numbers are not stored in 4 bytes, they are stored in the equivalent of a varying length character string (2 digits/byte plus sign, exponent and such). Meaning numbers are not small.
But at the end of the day -- you gotta do what your requirments demand. It goes back to my question "do you need them"? answer: "Yes". After that, it is just talk -- you have a stated requirement to do this.
2) I'm not discussion java, it would be wrong. sys_guid(), written -- there, done. Just use it.
max, January 30, 2005 - 2:03 pm UTC
thank you for your advice.
please let me ask two last questions:
1st: what do you think about storing the 32 byte hex representation of the SYS_GUID instead of the just 16 bytes raw? one difference i 've read is that raws are not subject to character conversions (e.g. with import/export). do you see other pros & cons (apart from space consumption)?
2nd: why isn't 'GUIDHex' not the same as both the others in the following sample?
create table Test as
select GUIDRaw, rawtohex( GUIDRaw ) GUIDHex
from ( select SYS_GUID() GUIDRaw from dual ) ;
alter table Test add GUIDChar varchar( 2000 ) ;
update Test set GUIDChar = GUIDRaw ;
select * from Test ;
GUIDRAW
--------------------------------
960169C170F44947A348CE9A88115D43
GUIDHEX
--------------------------------
77BF30DC57214E9B931D1212E0747C7E
GUIDCHAR
--------------------------------
960169C170F44947A348CE9A88115D43
January 30, 2005 - 2:33 pm UTC
well, the thing is a raw, raw is not "scary", no more than "number" or "date" -- they are not stored as strings either -- we use to_char on that binary raw data to format it into something human beings enjoy seeing on the screen (use dump() on a number or date -- it is just binary stuff that we to_char)
you have hextoraw and rawtohex to do the same.
in answer to your question, what do you think the chances that:
select GUIDRaw, rawtohex( GUIDRaw ) GUIDHex
from ( select SYS_GUID() GUIDRaw from dual ) ;
could be executed using view merging as:
select sys_guid(), rawtohex( sys_guid() ) from dual;
:)
A function can be called as often as sql enjoys calling it.
A sequence on the other hand has very certain, documented attributes ascribed to it.
On my 10g system, all three were the same. In fact, in 9i it was too. but, it is easy to see how they could be different since sys_guid() the function is purposely "non-deterministic".
yes - sure - view merging took place
max, January 31, 2005 - 2:33 am UTC
thank you very much for pointing that out. applying a NO_MERGE hint to my query solved the "issue".
just to be curious: what could have forced my 9iR2 (9.2.0.6.0 on Win XP) to merge whereas your's did not?
January 31, 2005 - 8:13 am UTC
different optimizer settings would be one thing that could.
but expect it, and remember a hint should be considered a hint, not a processing directive.
me again ...
max, January 31, 2005 - 2:24 pm UTC
i guess i have now to convince my teammates that raw is *the* option -- but i can't come up with a sample showing any bad impacts of their favorite varchar (e.g. how character conversion could become an issue).
could you please help?
January 31, 2005 - 2:27 pm UTC
it won't be an issue?
but if you are worried about 16 bytes of raw, why are you not really afraid of 32 bytes of string data?
legacy systems ...
max, February 03, 2005 - 8:55 am UTC
... that (according to reports) can't handle such raw values very well forced a character representation to be chosen there ...
but one could use ORACLE's built-ins hextoraw and rawtohex in order to get RAW's stored in ORACLE but strings exchanged.
wouldn't that be OK?
February 03, 2005 - 2:20 pm UTC
data, once extracted from the database, can be represented however you please.
so for these systems, rawtohex() it on the way out (just like you use to_char on a date, or to_char on a number -- be it implicitly or explicitly - they don't take the 7byte oracle date format either do they...) and hextoraw() it on the way in (just like you use to_date or to_number on ascii data...)
What does these character values ?
Parag Jayant Patankar, October 11, 2005 - 8:50 am UTC
Hi Tom,
We are analyzing a problem for junk chracters in a application and showing following data
a9490 RAWTOHEX(A9490)
--------- ----------------
0009965 3030393936352020
0009965 3030393936350000
I know first record "2020" means 2 spaces. What does "0000" mean ? I have checked data with inserting null values but rawtohex does not show as "0000"
I want to know equivalant character of hex value "0000" how can I find out by using what function ? for eg. I want to know character of hexvalue "6060" how can I find out ?
regards & thanks
pjp
October 11, 2005 - 3:29 pm UTC
chr(0) is 00 - a C null terminator is what it looks like. do you use C?
if you are a single byte, you can try:
ops$tkyte@ORA10GR1> select chr( to_number( '60', 'xx' ) ) from dual;
C
-
`