do NOT use dbms_obfuscation_toolkit - it is obsolete and frankly - broken.
The result of encrypting something is a binary, raw piece of data. It is not suitable for storing in a varchar2 - this is something learned over and over again the hard way.
If the client character set is NOT identical the database character set - then character set conversion takes place - which will scramble up your encrypted data - destroying it.
You MUST use a raw type.
This is an excerpt from my last book - Expert Oracle Database Architecture volume II
<quote>
Implementing Manual Application Encryption In this and the next two major sections we’ll look at the three approaches a developer/DBA can use to encrypt data in the database. We’ve had an overview of them already; now it is time to look at each in more depth. First up is Manual Application Encryption.
Manual Application Encryption is achieved programmatically using one of two builtin packages:
o DBMS_OBFUSCATION_TOOLKIT: A package supplied with the database since Oracle 8i. This package employs the Data Encryption Standard (DES) and Triple DES (3DES) encryption algorithms only. It is also capable of generating MD5 checksums of data. It can only work with RAW and VARCHAR datatypes. It should not be used for two reasons: it is deprecated as of Oracle 10g and above, replaced by the next package in this list and the name of it is really hard to type and say out loud.
o DBMS_CRYPTO: In addition to being much easier to type and say – this is the currently supported encryption API. It supports a broader range of encryption algorithms including DES, 3DES, Rivest Cipher 4 (RC4) and Advanced Encryption Standard (AES). It is also capable of generating a variety of hashes and message authentication codes (MAC) such as MD5, MD4, SHA-1. It is capable of working with RAW, CLOB and BLOB datatypes.
I will not be discussing the DBMS_OBFUSCATION_TOOLKIT package directly, but everything I say about the DBMS_CRYPTO package will apply to it as well.
Reasons to Avoid The Manual ApproachI firmly believe that if you have the documented need for encrypting data at rest – your best approach is to not use the manual approach. If you do manual application encryption, you are responsible for key management and as we discussed earlier in this chapter in the section on “The Oracle Wallet” we know how code intensive that would be. Implementing a secure key store is a non-trivial exercise and in order to be somewhat sure it is impervious to attack, you would need to expose it to a level of testing that is typically not feasible in a single organization. On the other hand, if you use off the shelf implementations (such as the Oracle wallet) you have knowledge that it is extensively tested and probed on a daily basis.
In addition to the key management issue (which is a large one – to me more than enough to not want to consider it), there is the fact that you have to code the encrypt/decrypt routines and make sure they are hooked into the application at the correct places. Whenever data comes into the database – you are responsible for encrypting the data and whenever it leaves, you are responsible for decrypting it. That comes at the expense of yet more code to be developed as well as a definite performance impact, which we’ll measure shortly.
A third reason to give serious thought before employing manual application encryption is that the application has to be involved in the process – every application that touches this data will have to be involved. That is because the column that will be encrypted must (I stress the word must here) use the RAW datatype as its underlying datatype. If you are to encrypt a LAST_NAME column that was originally defined as a VARCHAR2(70), you will be redefining that column to be a RAW(80). You must use a RAW datatype because the result of encrypting data is a binary set of bytes – they are not characters. You cannot store them in a character datatype such as VARCHAR2. I’ll refer you to Chapter 12 “Datatypes” and the discussion on NLS character set conversion for the reason why this is so. The VARCHAR2 type is subject to character set conversion – and the bytes that one client sees can and will be different from the bytes another client sees in a VARCHAR2 string. When dealing with true character strings – this is a good thing, when dealing with encrypted data – it is definitely a bad thing, it is called “corrupting the data”. Character set conversion would tend to change the encrypted data – making it so you cannot decrypt it anymore (that is, you just lost it forever). I’ve seen that happen more than once in real life – the developers used a VARCHAR2 to store encrypted data – and it was not detected for a while (that the conversion had taken place) effectively destroying that bit of data. If it had been detected early, near the point of the corruption being introduced – we could have resurrected the data from backups (before the implicit conversion corrupted it) but it would have been a non-trivial task.
Also – note that I said that a VARCHAR2(70) would require a RAW(80). This is because encrypted data is stored in multiples of 16 bytes. It takes five 16 byte pieces of raw data to store 70 single byte characters (or 71, 72 and so on). So unless your existing data attribute byte length is divisible by 16 – you’ll be computing a new field length. Again, a mistake I’ve seen time and time again is for a developer to just try to store the encrypted data in the existing VARCHAR2(70) field – it won’t work because of the implicit character set conversion that will corrupt the data some day and it won’t work because it simply won’t fit in that field. Sure, some of the data will – any of the data that was 64 bytes or less before encrypting would fit – but a value that was 65 bytes or more would need 80 bytes after encrypting. This gets even worse when you start considering multi-byte data!!
Performance Implications of The Manual Approach
The last reason to avoid the manual approach is one of performance. There is a measurable overhead in application-implemented encryption. Any layer of code we as developers add between the client and the database will add some level of overhead – and encryption can be a very expensive one to implement. So, let’s create a package to easily encrypt/decrypt with and compare the performance of a VARCHAR2(30) string stored non-encrypted and again encrypted. We’ll just compare single row INSERT performance, bulk INSERT performance and data retrieval (SELECT) performance. To begin with, we’ll need a package to perform the encryption operations for us, a sample package specification that would just contain two entry points – one to encrypt and the other to decrypt:
ops$tkyte%ORA11GR2> create or replace package encryption_wrapper
2 as
3 function encrypt( p_string in varchar2,
4 p_key in varchar2 )
5 return raw;
6
7 function decrypt( p_raw in raw,
8 p_key in varchar2 )
9 return varchar2;
10 end;
11 /
Package created.
Now for our package body, an implementation might start off like this:
ops$tkyte%ORA11GR2> create or replace package body encryption_wrapper
2 as
3 g_encrypt_typ constant PLS_INTEGER default
4 DBMS_CRYPTO.ENCRYPT_AES256
5 + DBMS_CRYPTO.CHAIN_CBC
6 + DBMS_CRYPTO.PAD_PKCS5;
7
8 function padkey( p_key in varchar2 ) return raw
9 is
10 begin
11 return utl_raw.cast_to_raw(rpad(p_key,32));
12 end;
Note: You may have to have EXECUTE on DBMS_CRYPTO granted to you before using it.
The global variable G_ENCRYPT_TYP simply specifies the type of encryption we want – this would be implementation dependent (that is, something you would figure out). Here I’m going with a standard AES with a 256 bit (32 byte) encryption key. I’ve added a convenience function PADKEY to convert a character string key into a RAW and ensure it is 32 bytes long.
Next is our implementation of the ENCRYPT function. It will begin by converting our string into a RAW type using a ‘safe’ character set. Then it will invoke DBMS_CRYPTO to actually encrypt the string (which is now a RAW) and return the value:
14 function encrypt( p_string in varchar2,
15 p_key in varchar2 )
16 return raw
17 is
18 begin
19 return DBMS_CRYPTO.ENCRYPT
20 ( src => UTL_I18N.STRING_TO_RAW (p_string, 'AL32UTF8'),
21 typ => g_encrypt_typ,
22 key => padkey( p_key ) );
23 end;
Lastly, we implement the DECRYPT routine. This simply decrypts our RAW encrypted data using the same key and then returns the VARCHAR2 representation of that output:
24
25 function decrypt( p_raw in raw,
26 p_key in varchar2 )
27 return varchar2
28 is
29 begin
30 return utl_i18n.raw_to_char(
31 dbms_crypto.decrypt
32 ( src => p_raw,
33 typ => g_encrypt_typ,
34 key => padkey(p_key) ),
35 'AL32UTF8' );
36 end;
37
38 end;
39 /
Package body created.
Now we’ll create a table to test with – a column LAST_NAME to hold the data when it is not encrypted and a column ENCRYPTED_NAME – implemented as a RAW with 2*16 bytes of storage, therefore capable of holding at least 30 bytes of character data. Also, we generate a set of data we can use to test with into a simple table – this is so when we time things later, we’ll be timing just the operation we are performing, not the query against ALL_OBJECTS:
ops$tkyte%ORA11GR2> create table stage
2 as
3 select object_name from all_objects;
Table created.
ops$tkyte%ORA11GR2> create table t
2 ( last_name varchar2(30),
3 encrypted_name raw(32)
4 )
5 /
Table created.
And now we can start inserting – first we’ll do a row by row insert into this table, once with non-encrypted data and again with encrypted data. The non-encrypted insert would look like this:
ops$tkyte%ORA11GR2> declare
2 l_start number := dbms_utility.get_cpu_time;
3 begin
4 for x in (select object_name from stage)
5 loop
6 insert into t (last_name) values ( x.object_name );
7 end loop;
8 dbms_output.put_line( (dbms_utility.get_cpu_time-l_start) || ' hsecs' );
9 end;
10 /
431 hsecs
PL/SQL procedure successfully completed.
As you can see – our code consumed 4.31 CPU seconds of computer time to generate and insert the data. If we perform the same operation using the encryption package:
ops$tkyte%ORA11GR2> truncate table t;
Table truncated.
ops$tkyte%ORA11GR2> declare
2 l_start number := dbms_utility.get_cpu_time;
3 begin
4 for x in (select object_name from stage)
5 loop
6 insert into t (encrypted_name)
7 values ( encryption_wrapper.encrypt
8 (x.object_name,
9 'Secret Key Secret Key Secret Key'));
10 end loop;
11 dbms_output.put_line( (dbms_utility.get_cpu_time-l_start) || ' hsecs' );
12 end;
13 /
2502 hsecs
PL/SQL procedure successfully completed.
It takes 25.02 CPU seconds to generate and insert the data. Thus, it takes almost six times more CPU to perform the encryption as opposed to leaving the data unencrypted. Even if we “bulk up” the operation, using an INSERT AS SELECT statement, we’ll see a large disparity:
ops$tkyte%ORA11GR2> truncate table t;
Table truncated.
ops$tkyte%ORA11GR2> declare
2 l_start number := dbms_utility.get_cpu_time;
3 begin
4 insert into t (last_name) select object_name from stage;
5 dbms_output.put_line( (dbms_utility.get_cpu_time-l_start) || ' hsecs' );
6 end;
7 /
8 hsecs
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> truncate table t;
Table truncated.
ops$tkyte%ORA11GR2> declare
2 l_start number := dbms_utility.get_cpu_time;
3 begin
4 insert into t (encrypted_name)
5 select encryption_wrapper.encrypt
6 (object_name,
7 'Secret Key Secret Key Secret Key')
8 from stage;
9 dbms_output.put_line( (dbms_utility.get_cpu_time-l_start) || ' hsecs' );
10 end;
11 /
374 hsecs
PL/SQL procedure successfully completed.
374 hseconds versus 8 seconds is a more striking disparity – an almost 47 times increase in CPU utilization because of encryption when we performed the more efficient bulk operation. Clearly, this increased demand on CPU will have an impact on our modifications to the column. INSERTS, UPDATES, and MERGES will surely be affected, (but probably notDELETEs.
What about data retrieval? To test the effects on data retrieval, we’ll populate both columns at the same time:
ops$tkyte%ORA11GR2> truncate table t;
Table truncated.
ops$tkyte%ORA11GR2> insert into t (last_name, encrypted_name)
2 select object_name,
3 encryption_wrapper.encrypt
4 (object_name,
5 'Secret Key Secret Key Secret Key')
6 from stage;
64588 rows created.
ops$tkyte%ORA11GR2> commit;
Commit complete.
And then retrieve them:
ops$tkyte%ORA11GR2> declare
2 l_start number := dbms_utility.get_cpu_time;
3 begin
4 for x in (select last_name from t)
5 loop
6 null;
7 end loop;
8 dbms_output.put_line( (dbms_utility.get_cpu_time-l_start) || ' hsecs' );
9 end;
10 /
7 hsecs
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> declare
2 l_start number := dbms_utility.get_cpu_time;
3 begin
4 for x in ( select encryption_wrapper.decrypt
5 (encrypted_name,
6 'Secret Key Secret Key Secret Key')
7 from t )
8 loop
9 null;
10 end loop;
11 dbms_output.put_line( (dbms_utility.get_cpu_time-l_start) || ' hsecs' );
12 end;
13 /
378 hsecs
PL/SQL procedure successfully completed.
As you can see – the decryption overhead added a sizable amount of increased CPU utilization, around 50 times as much CPU was needed in this case (378 seconds versus 7).
When To Use The Manual Approach
So, when should you consider the manual approach of application encryption? If you ask me – the answer is pretty much “almost never”. The only time it would be necessary is if you are using Oracle’s Standard Edition (SE) or lower – which does not have the ability to have the Advanced Security Option (ASO). Many times people will utilize SE as a cost savings approach – but if you find that you have to implement the functional equivalent of ASO yourself – doing the key management, implementing all of the code – and performing the necessary application modifications (this approach is definitely not transparent to the application), you might find the cost of Enterprise Edition justified. In house developed software is not “free” after all.
</quote>