Database, SQL and PL/SQL

Hide from Prying Eyes

Use Data Redaction in Oracle Database 12c to hide sensitive data automatically.

By Arup Nanda Oracle ACE Director

January/February 2014

John, the lead DBA at Acme Bank, is listening patiently to the visitors in his office today. To comply with several mandates and regulations, the bank must make sure that some types of data—such as Social Security numbers and the date of the last withdrawal—in Acme’s database tables are altered to hide their true values from all visitors. This masking must be done at the database level—not by the user interface tools that are pulling the data. (Masking data to protect it from prying eyes is called redaction.) Jill, the lead developer at Acme Bank, emphasizes an important requirement: the data must remain intact in the database tables; it is only the displayed information that must be redacted. Further, she adds, the application user accounts must show the real account information without redaction when account holders pull it.

Traditionally, meeting this redaction requirement has meant creating views on tables and assigning privileges to users on the views, not on the actual database tables. Although the view approach works for redaction, it is complex, error-prone, and subject to performance issues. Jill asks if there is a simple and fast mechanism for setting up redaction.

Yes, there is, John informs her, with Oracle Advanced Security’s Data Redaction feature in Oracle Database 12c.

The Requirements

Jill explains that all of the bank’s data is stored in the TSBS schema, named after the application (Total Standard Banking System). A table named SAVINGS stores the details of the savings accounts. She shows everyone the structure of the SAVINGS table:

Name          Null?  Type
——————————— ———— —————————————————
ACCNO NUMBER
ACCNAME VARCHAR2(20)
ID_NO VARCHAR2(9)
LAST_DEP_DT DATE
FOLIOID NUMBER
EMAIL VARCHAR2(200)

When the TSBS user selects data from the SAVINGS table, the results display with the values intact—but when any other user selects data from the table, Jill continues, the data must be masked, as shown in Table 1.

Column Description and Redaction Requirement
ID_NO National ID number, such as the Social Security number in the US. Replace each of the first five characters of the column with a * and show only the last four characters. (Example: for 123456789, show *****6789.)
LAST_DEP_DT Date of last deposit. Show only day and month, and replace the year with 1900. (Example: for Jan 14, 2013, show Jan 14, 1900.
FOLIOID ID of the folio where the account is located. Replace the number with any random number.
EMAIL E-mail address of the account holder. Replace the e-mail address before the @ sign with four x’s, and keep the domain name. (Example: for john.smith@proligence.com, show xxxx@proligence.com.)
Table 1: Redaction requirements for Acme Bank’s SAVINGS table

Jill explains that these redaction requirements have traditionally been met in two ways:

  1. The developers put the data masking code in the application. This makes applications complex to develop and difficult to manage, and—even worse—the logic of redactions, being in the application code, may be inconsistent across different applications, making this approach highly undesirable to Jill.

  2. The DBAs create a view—called VW_SAVINGS, for example—on the SAVINGS table. Inside the view, the DBAs change the column values, grant necessary privileges to the appropriate users for this view, and create a public synonym called SAVINGS that points to the VW_SAVINGS view instead of the SAVINGS table. This way, when users select from the SAVINGS object, they are actually selecting from the VW_SAVINGS view instead of the SAVINGS table. Because the view performs the appropriate masking of the data, the users see the redacted values only, regardless of the application they connect from, solving the inconsistent-logic problem of the application coding redaction approach.

    However, developers want to insert, update, and delete table data as well. A statement such as insert into savings actually references the VW_SAVINGS view, so the statement fails, because data can’t be inserted into the derived columns of the view. The only option for changing data while using the view is to use INSTEAD OF triggers, which actually update the table behind the scenes. This makes application development complex and prone to mistakes, and Jill does not want to use that technique either.


Redaction

Have no worries, John assures her. Acme can use Data Redaction in Oracle Database 12c to accomplish her objectives easily. The feature enables Acme to define a set of rules on what data can be redacted in a specific table, plus how and when. This set of rules is called a redaction policy, or just a policy for short. The supplied PL/SQL package, DBMS_REDACT, includes all the functionality for creating and maintaining policies on a table.

To set up a demonstration, John uses the setup.sql script. John then uses the code shown in Listing 1 to set up a redaction policy on the SAVINGS table.

setup.sql script

create user tsbs identified by tsbs
/
grant create session, create table, unlimited tablespace to tsbs
/
create user app identified by app
/
conn tsbs/tsbs
create table savings (
accno number,
accname varchar2(20),
id_no varchar2(9),
last_dep_dt date,
folioid number,
email varchar2(200)
)
/
insert into savings values (101,'John Smith', '123456789', sysdate-1,
1234567, 'john.smith@proligence.com')
/
insert into savings values (102, 'Jane Smith', '234567890', sysdate-2,
2345678, 'jane.smith@proligence.com')
/
insert into savings values (103, 'Jane Doe', '345678901', sysdate-3,
3456789, 'jane.doe@proligence.com')
/
commit
/
grant insert, select, delete, update on savings to app
/


Code Listing 1:
Redaction setup

  1  begin
2 dbms_redact.add_policy (
3 object_schema => 'TSBS',
4 object_name => 'SAVINGS',
5 policy_name => 'Savings_Redaction',
6 expression => 'USER!=''TSBS''',
7 column_name => 'ID_NO',
8 function_type => dbms_redact.partial,
9 function_parameters => 'VVVVVVVVV,VVVVVVVVV,*,1,5'
10 );
11 -- subsequent columns will need to be added
12 dbms_redact.alter_policy (
13 object_schema => 'TSBS',
14 object_name => 'SAVINGS',
15 policy_name => 'Savings_Redaction',
16 action => dbms_redact.add_COLUMN,
17 column_name => 'FOLIOID',
18 function_type => dbms_redact.random
19 );
20 dbms_redact.alter_policy (
21 object_schema => 'TSBS',
22 object_name => 'SAVINGS',
23 policy_name => 'Savings_Redaction',
24 action => dbms_redact.add_COLUMN,
25 column_name => 'LAST_DEP_DT',
26 function_type => dbms_redact.partial,
27 function_parameters => 'MDy1900'
28 );
29 dbms_redact.alter_policy (
30 object_schema => 'TSBS',
31 object_name => 'SAVINGS',
32 policy_name => 'Savings_Redaction',
33 action => dbms_redact.add_COLUMN,
34 column_name => 'EMAIL',
35 function_type => dbms_redact.regexp,
36 regexp_pattern => dbms_redact.re_pattern_email_address,
37 regexp_replace_string => dbms_redact.re_redact_email_name,
38 regexp_position => dbms_redact.re_beginning,
39 regexp_occurrence => dbms_redact.re_all
40 );
41 end;

The DBMS_REDACT.add_policy procedure enables John to add the policy on a table with some basic parameters, including

  • object_schema: the owner of the table—TSBS in this case.

  • object_name: the table name— SAVINGS in this case—on which the policy is defined.

  • policy_name: the name of the policy—Savings_Redaction in this case—used to refer to the policy later.

  • column_name: the name of the column—ID_NO in this case—that is to be redacted.

  • expression: a parameter that enables John to specify a condition that returns either true or false. The redaction is performed only if the condition is true. The expression SYS_CONTEXT(‘USERENV’, ‘CURRENT_USER’!= ‘TSBS’) will return true for all users except TSBS, so the column will be redacted for all users except TSBS—exactly what Jill wants. John can place any expression here, as long as it returns true or false.

John explains the redaction mechanism, by referring to the line numbers in Listing 1.

In line 8, he sets the extent of the redaction on the ID_NO column, which is partial (DBMS_REDACT.PARTIAL)—only a part of the value is to be redacted, not the entire value. (If he had wanted to redact the entire value, he would have used DBMS_REDACT.FULL here.)

Line 9 shows the redaction formats. There are five masking parameters, separated by commas, John explains. The first—VVVVVVVVV—is the input value; it is the actual value stored in the ID_NO column. In this case, each V represents a number. The second value—VVVVVVVVV—shows what to display. The third value, which in this case is *, shows the value to use in place of the redacted value. Because Jill wants to show * for the redacted values, John uses that character here. The fourth value indicates the position in the input value where the redaction should start, which, in this case is 1—meaning that the redaction should start in the first position. The fifth and final value shows the number of characters to be redacted from the starting position. Because Jill wants to redact the first five numbers, John uses 5 here. With this setting, if the input value is 123456789, the redaction will be between the first and fifth positions, inclusive, and * will be used to mask actual values, so the final redacted value will be *****6789.

But ID_NO is not the only column Acme needs to redact. To add the other columns to the redaction policy, John alters the policy by executing another procedure in the DBMS_REDACT package—alter_policy—shown in line 12 of Listing 1. The action parameter in line 16 specifies the type of alteration—add_column—and line 17 specifies the column that needs to be added to the policy—FOLIOID. Because the FOLIOID column should be completely redacted with random values, the next parameter—function_type—in line 18, specifies dbms_redact.random.

John adds the other columns to the policy in the same way. For the LAST_DEP_DT column, where only the year needs to be redacted, he sets the function_type parameter to dbms_redact.partial in line 26 of Listing 1. In line 27, he sets the function_parameters value to MDy1900. The use of uppercase M and D means that the month/day component of the date value should remain unaltered during redaction. The use of the lowercase y means that the year should be redacted and that the number immediately following that y—1900—is the value substituted for the actual value. The net effect of the parameter is that the date and month in the LAST_DEP_DT column value remain the same and the year is replaced everywhere with 1900.

The redaction requirement for the EMAIL column is a bit more complex. The redaction policy must detect a pattern in the value and redact only part of it. In this case, the policy must redact only the portion before the @ sign and leave the rest intact. Jill worries that it might require complex coding. On the contrary, John explains, it is quite easy with the built-in functions for regular expressions, which are designed for partial matching. In Listing 1, line 35, he sets the function_type parameter to a constant named dbms_redact.regexp, which instructs the redaction package to use regular expressions. The redaction package includes some predefined templates for different types of values, including e-mail addresses. In line 36, he specifies that EMAIL column values are in e-mail address format. In line 37, he specifies replacement of the name portion—the part before the @ sign. It’s that simple to include the appropriate parameters, he explains, and the dbms_redact package does the rest.

Testing

After John executes the code in Listing 1, the Savings_Redaction redaction policy on the SAVINGS table is now active on the ID_NO, LAST_DEP_DT, FOLIOID, and EMAIL columns. To test the redaction, Jill logs in to an application that connects as the APP user, and she selects from the SAVINGS table, as shown in Listing 2. Jill compares the results with her original requirements in Table 1 and pronounces them completely compliant. To complete the test, she logs in as the schema owner—TSBS—and selects from the SAVINGS table in the application. The results, shown in Listing 3, display the original values without redaction.

Code Listing 2: Selection by the APP user

SQL> conn app/app
SQL> select * from tsbs.savings;
ACCNO ACCNAME ID_NO LAST_DEP_DT FOLIOID EMAIL
————— —————————— ————————— ——————————— ——————— ————————————————————
101 John Smith *****6789 21-SEP-00 3434562 xxxx@proligence.com
102 Jane Smith *****7890 20-SEP-00 3452092 xxxx@proligence.com
103 Jane Doe *****8901 19-SEP-00 4529012 xxxx@proligence.com


Code Listing 3:
Selection by the TSBS user

SQL> conn tsbs/tsbs
SQL> select * from tsbs.savings;
ACCNO ACCNAME ID_NO LAST_DEP_DT FOLIOID EMAIL
————— —————————— ————————— ——————————— ——————— ————————————————————
101 John Smith 123456789 21-SEP-13 1234567 john.smith@proligence.com
102 Jane Smith 234567890 20-SEP-13 2345678 jane.smith@proligence.com
103 Jane Doe 345678901 19-SEP-13 3456789 jane.doe@proligence.com


Protection Mechanisms
There is a small issue with this redaction policy, Jill opines. The application is using the APP user to connect to the database. It needs to update, insert into, and delete from the table as well as select from it. If the data is masked, the APP user will not be able match exact values. For example, she continues, after a folio shuffling, suppose the APP user wants to set the FOLIOID of the account of the customer with ID_NO 123456789 to 9876 but, due to redaction, the APP user can’t see the exact value of either the ID_NO or FOLIOID columns. The applications would have to be modified to change the predicate or use different user IDs—neither of which is a palatable option for Jill, who wants a better, more workable solution from John.

That’s simple, John answers. Redaction protects data from being visible for examination, but it does not prevent application logic. Data Redaction, he explains, does not change the underlying data in the database; it redacts the data only when it is displayed. To demonstrate, he performs a simple update as the APP user, as shown in Listing 4. He also selects the data before and after the update. The update works perfectly without any issues, but the displayed data is redacted in all cases. What this means, John clarifies, is that the application needs no change whatsoever to enable redaction. Jill is greatly relieved to hear that.

Code Listing 4: Effect of redaction on update

SQL> select id_no, FOLIOID from tsbs.savings where accno=101;
ID_NO FOLIOID
—————————— ——————————————
*****6789 7420987
SQL> update tsbs.savings set FOLIOID = 500 where id_no='123456789';
1 row updated.
SQL> select id_no, FOLIOID from tsbs.savings where accno=101;
ID_NO FOLIOID
—————————— ——————————————
*****6789 7590112

Jack, another developer, is concerned that when a user creates another table by selecting data from the SAVINGS table, whose data is redacted—and, in some cases, replaced by completely random values—the new table may contain wrong data and create confusion among users, which is quite a serious issue. John assures him that that will not be the case and demonstrates:

SQL> create table my_savings as 
select * from tsbs.savings;
create table my_savings as select *
from tsbs.savings
*
ERROR at line 1:
ORA-28081: Insufficient privileges -
the command references a redacted object.

The realm of protection is not just for the table, John explains. If the APP user creates a view on the SAVINGS table, the view will also contain the redacted data.

Conclusion

Using the Data Redaction feature in Oracle Database 12c, Acme can add policies to tables to mask data in any column to a desired format quickly and easily. Acme does not need to change any existing application code to enable the redaction, and UPDATE, INSERT, and DELETE operations continue to work as before for authorized users. And, most importantly, Acme does not need to create, use, and maintain views.

Everyone is satisfied with the solution, and the meeting is adjourned.

Next Steps

 READ more about DBMS_REDACT

 LEARN more about Oracle Database security

 DOWNLOAD Oracle Database 12c




DISCLAIMER: We've captured these popular historical magazine articles for your reference. Links etc contained within these article possibly won't work. These articles are provided as-is with no guarantee that the information within them is the best advice for current versions of the Oracle Database.