Encryption in Oracle Databases |
Data Obfuscation and Encryption
Jared
Still
Certified
Oracle DBA and Part Time Perl Evangelist
jkstill@cybcon.com
Data
encryption is a hot topic these days.
This is hardly a new subject, but has received and increasing amount of
attention, largely due to ecommerce. Protecting credit card numbers, medical
data and other sensitive information has become more important than ever
before, and on a larger scale.
Before
launching into a discussion on algorithm choices and methods of implementing
encryption, it is important to consider some related decisions that need to be
made first. It may become clear that
encryption is not necessarily what is required.
Please
keep in mind that encryption in this
article refers to some method of modifying data so that it is meaningless and
unreadable in it's encrypted form. It
also must be reasonably secure, that is it must not be easily decrypted without
the proper key.
Anything
less than that will be referred to as obfuscation. This is data that is rendered unusable by
some means, but is not considered as a serious form of encryption. Examples will follow.
Why
would you want to merely obfuscate data, rather than use a strong encryption
algorithm?
A
good example would be an audit report on a medical system. This report may be generated for an external
auditor, and contain sensitive information. The auditor will be examining the
report for information that indicates possible cases of fraud or abuse.
Assume
that management has required that Names, Social Security Numbers and other
personal information should not be available to the auditor except on an as
needed basis.
The
data needs to be presented to the auditor, but in a way that allows the
examination of all data, so that patterns in the data may be detected.
Encryption
would be a poor choice in this case, as the data would be rendered into ASCII
values outside of the range of normal ASCII characters. This would be impossible to read.
A
better choice might be to obfuscate the data with a simple substitution
cipher. While this is not considered
encryption, it may be suitable for this situation.
When
the auditor finds a possible case of abuse, he will need the real name and SSN
of the party involved. He could obtain
this by calling a customer service representative at the insurance company that
supplied the report, and ask for the real information.
The
obfuscated data is read to the customer service rep, who then inputs it into an
application that supplies the real data.
The
importance of using pronounceable characters becomes very clear. Strong encryption would render this
impossible.
Here’s
some simple example code to do the obfuscation:
create or replace package obfs
is
function obfs( varchar2 in ) return
varchar2;
pragma restrict_references( obfs, WNPS, WNDS
);
function unobfs( varchar2 in ) return
varchar2;
pragma
restrict_references( unobfs, WNPS, WNDS );
end;
/
create or
replace package body obfs
is
xlate_from varchar2(62) :=
'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
xlate_to
varchar2(62) :=
'nopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklm';
function obfs ( clear_text_in
varchar2 ) return varchar2
is
begin
return translate( clear_text_in,
xlate_from, xlate_to );
end;
function unobfs ( obfs_text_in
varchar2 ) return varchar2
is
begin
return translate( obfs_text_in,
xlate_to, xlate_from );
end;
end;
/
Here
is some sample output:
SSN OBFS SSN
---------- ----------
540407786 srnrnuuvt
542800170 srpvnnoun
542802063 srpvnpntq
541466830 srorttvqn
As
you can see, it wouldn't be very difficult to decipher this scheme given enough
data. A somewhat more effective method
involves chopping the text into segments and rearranging it as well as
obfuscating it. Below is some sample
output from this algorithm.
OBFS OBFS
---------- ----------
540407786
&24B23B&Z
542800170
-4Bၿ&&&
542802063
-4Z&23-&_
541466830
Ʀ𗒿ZZ&
While
this is still not encryption, this data would be more difficult to decipher
without the key. Source code for this in
PL/SQL is available at the URL provided at the end of this article.
Another way to hide sensitive data is through masking.
This is different from the previous example in that the clear text
cannot be reconstructed from the displayed data.
This
is useful in situations where it is only necessary to display a portion of the
data. A good case for this method is the
receipts printed at gas stations and convenience stores. When a purchase is made with a credit card,
the last 4 digits of the credit are often displayed as clear text, while the
rest of the credit card number has been masked with a series of X's.
|
This
method can also be used for reports where the person reading the report
requires only a portion of the sensitive data.
This method is also commonly used for the account numbers on printed
transactions from ATM's.
Ok,
you've decided that your data is too sensitive to trust to anything other than
a real encryption algorithm. You must
use some form of encryption to give reasonable assurance that sensitive
information will be hidden from anyone other than those entitled to it.
The
choices of algorithms and implementation methods you can make use of are
dependent on the version of Oracle that your database uses. All solutions discussed here will be either
home grown software, or solutions provided with the Oracle RDBMS, and operate
inside of the database.
If
you need to encrypt data in a version 7 database, your choices are somewhat
limited. The only solution outside of
third party software is to use an external program to encrypt and decrypt data.
While
a program written in C could do this very quickly, it precludes the ability to
encrypt and decrypt data in a SQL statement, limiting the usefulness of
encryption.
The
only solution within the database is to write an encryption routine in PL/SQL.
This
works well for limited amounts of data, but may be unwieldy for large amounts
of data. This is because PL/SQL is quite a bit slower than a language such as C
for this type of operation.
Timings
of various encryption methods are shown in Table 1.
TABLE 1
|
Cipher
Method |
UPDATE |
SELECT |
SELECT
COUNT(*) |
|
Baseline
Queries |
0.28 |
1.41 |
0.03 |
|
RC4 in
PL/SQL |
15.83 |
17.87 |
15.59 |
|
RC4 in
C |
0.74 |
1.96 |
0.56 |
|
DBMS_OBFUSCATION_TOOLKIT |
0.82 |
1.88 |
0.53 |
|
OBFS in
PL/SQL |
0.65 |
1.88 |
0.48 |
Timings for various methods of obscuring and encrypting data within a database.
Test platform is a Sun E4500 with 2 Gigabytes of RAM and 4 CPU's.
Operating System is Solaris 2.6
Database is Oracle 8.1.6
Testing is done on a 1000 row
table as shown.
Table CRYPT_TEST
Column Name Type
------------
----------------------------
SOC_SEC_NBR VARCHAR2(9)
CIPHER_SSN VARCHAR2(16)
All timings are in seconds.
The
encryption method used in this case is RC4[1], a very
fast stream cipher. It is fast in
relation to other encryption methods, but as you will see in table 1, it is
still rather slow in PL/SQL.
The
complete PL/SQL source code is too lengthy to include in this article, but I
will show the package header. The
complete source code with examples can be found at the URL at the end of the
article.
create or replace package rc4
is
-- returns text
encrypted with 'return_key'
-- client must
decrypt
function
crypt_secure (
plain_text_in
varchar2,
key_in
varchar2,
return_key_in
varchar2
) return
varchar2;
pragma
restrict_references( crypt_secure, WNDS);
-- does not
encrypt returned text
function
crypt_clear (
plain_text_in
varchar2,
key_in
varchar2
) return
varchar2;
pragma
restrict_references( crypt_clear, WNDS);
-- returns text
encrypted with 'return_key'
-- client must
decrypt
procedure
crypt_secure (
cipher_text_inout in out varchar2,
key_in
varchar2,
return_key_in
varchar2
);
pragma
restrict_references( crypt_secure, WNDS);
-- does not
encrypt returned text
procedure
crypt_clear (
cipher_text_inout in out varchar2,
key_in
varchar2);
pragma
restrict_references( crypt_clear, WNDS);
end rc4;
As
of Oracle 8, it is possible to write external library routines in C, and call
them from within a SQL statement or PL/SQL block. A share library native to the Os must be
created. With NT this would be a
DLL. In many versions of Unix this is a
.so file. This method will provide much
faster results than a pure PL/SQL implementation.
External
libraries are somewhat difficult to setup however. When an external callout is made, Oracle
starts a dedicated process for each session to interface to the shared library.
The listener.ora, sqlnet.ora and tnsnames.ora files must all be configured
properly to launch this process.
The
SQL*Net modifications that worked for this on our Solaris 2.6 with an Oracle 8.1.6 database are included in
a README file that is available at the URL at the end of this article.
For
my tests, I again used the streaming cipher RC4, this time coded in C. The performance of this test was 24 times
faster than RC4 as implemented in PL/SQL.
1000 rows of data were update in .75 seconds, as compared to about 18
seconds for the PL/SQL version of RC4.
While
this is quite a bit faster, I was actually expecting better performance than
that. More on that later.
There
is though one caveat on this RC4 implementation. While writing this article, I discovered that
my C implementation of RC4 as modified for use with Oracle is less than
perfect.
It
has the nasty habit of occasionally truncating the encrypted data at some
point. So I have to make the disclaimer
that this code is FOR DEMONSTRATION PURPOSES ONLY.
Its
purpose is to demonstrate the concept of using an external library to encrypt
data in the database, and for comparing the performance of this method to other
methods.
This
source code is also posted on my web site.
It will include a notification stating if the bug still exists.
Here
is the PL/SQL portion of this RC4 implementation.
create or replace library librc4p as
'/home/jkstill/articles/encryption/rc4/c/rc4/rc4p.so
create or replace package crypt
is
function rc4(
key_in varchar2, data_in varchar2 ) return varchar2;
-- pragma not
required in 8.1 databases
-- pragma
restrict_references( rc4, wnds, rnds, wnps );
end;
create or replace package body crypt
is
function rc4p(
key_in varchar2, data_in in out varchar2 )
return
pls_integer is external
library librc4p
name
"rc4p"
language C;
function rc4(
key_in varchar2, data_in varchar2 ) return varchar2
is
dummy
pls_integer;
v_data
varchar2(500);
clear_len
integer;
crypt_len
integer;
begin
v_data :=
data_in;
clear_len :=
length(v_data);
dummy := rc4p(
key_in, v_data );
return v_data;
end;
end;
Another
encryption choice is available as of version 8.1.6 of the Oracle RDBMS.
That is the DBMS_OBFUSCATION_TOOLKIT. This package is an interface to a 40 bit implementation of the DES[2]
algorithm. While 40 bit encryption is
not exactly state of the art, it is probably sufficient for most
applications. It is subject to other
restrictions as well due to U.S. export laws.
This package cannot be used to encrypt already encrypted data, as that
is one of the export restrictions imposed at the time this package was
released. The Oracle8i Application
Developers Guide contains more information.
Here
is the package header:
CREATE OR REPLACE PACKAGE dbms_obfuscation_toolkit AS
---------------------- FUNCTIONS AND PROCEDURES -------------
---------------------------- DATA ENCRYPTION ----------------
-- The following
routines encrypt and decrypt data.
-- There are two
versions of each procedure: one for raw data
-- and the other
for strings.
-------------------------------------------------------------
PROCEDURE
DESEncrypt(input IN RAW,
key IN RAW,
encrypted_data OUT RAW);
PROCEDURE
DESEncrypt(input_string IN VARCHAR2,
key_string IN VARCHAR2,
encrypted_string OUT VARCHAR2);
PROCEDURE
DESDecrypt(input IN RAW,
key IN RAW,
decrypted_data OUT RAW);
PROCEDURE
DESDecrypt(input_string IN VARCHAR2,
key_string IN VARCHAR2,
decrypted_string OUT VARCHAR2);
END dbms_obfuscation_toolkit;
One
of the chief advantages of using this package is that it does not require any
of the setup that is required for using external libraries. You must be using at least version 8.1.6 of
the Oracle RDBMS to make use of this package.
Creating
it is as simple as logging into the database as SYS and running the package
$ORACLE_HOME/rdbms/admin/catobtk.sql.
It is also a good performer. This implementation of DES was able to update 1000 rows in about 0.8 seconds in testing. Even though it is fast, the performance was still a concern. This can be seen by the timing results in Table 1.
This
performance of DBMS_OBFUSCATION_TOOLKIT may be fine for many applications, but
would be a cause for concern where large amounts of data are processed in a
single transaction, such as in a Data Warehouse.
Doing
an UPDATE of 1000 rows, and encrypting a single column in each row, the DES
encryption took about three times longer to do the update than if no form of
encryption was used.
During
SELECT’s though, the performance difference was much less noticeable, as
decrypting a column added about 40% to the time required for a query without
any encryption.
My
motivation for using RC4 to encrypt data was to increase the performance of
updating encrypted data. RC4 is about 10
times faster than DES, yet the results show that RC4 has a very small
performance advantage over DES, at least as implemented in an Oracle database.
This
difference leads me to believe that most of the extra overhead in encrypting
data is not in the actual encryption routine, but is in the interface between
the Oracle engine and the encryption routines.
I haven't attempted to prove this yet, but the observed performance
during testing suggests it.
One
small advantage that RC4 has over DES is that data can be encrypted in place in
a table. In the text below Table 1 you
will see the encrypted column is larger than the column for the clear text.
DES
is a block cipher, and Oracle's implementation of it requires that the length
of the data be a multiple of 8. The
functions I used to interface to the DES routines will pad the incoming data so
that the length is a multiple of 8.
Therefore, the size of any encrypted column will need to be rounded up
to the next multiple of 8.
[i]Armed with
these few tools, this article can serve as springboard for further exploration
of encryption in the database.
This
article is not exhaustive by any means.
There is a lot of background material available for cryptography if you
are interested in learning more about this.
Web
sites:
http://www.mach5.com/crypto
http://www.counterpane.com/sites.html
The
definitive source for making application of cryptography is the book Applied Cryptography by Bruce Schneier.
[1] RC4 is a very fast stream cipher created by Ronald Rivest, one of the founders of RSA
[2] Data Encryption Standard
[i] The encryption keys used here have been short. For real security, a longer key should be used. A minimum key length should be 56 bit. The DBMS_OBFUSCATION_TOOLKIT currently works with 56 bit keys only.