How to mask (remap) data during import over database link with Data Pump?


Connor McDonald

Thanks for the question, Kim.

Asked: August 08, 2017 - 11:17 am UTC

Last updated: August 12, 2017 - 4:53 am UTC

Version: 11.2.0.4

Viewed 1000+ times

You Asked

Hi Tom

I have an issue with Data Masking (remap_data).

It seems that the package used to mask the data has to be present on the importing database when I use dbms_datapump to import tables from the source database over a database link.
This contradicts everything I have read about data masking, which says the masking occurs on the source database, so that sensitive data never leaves the source database.

Short example:

DB1 (the source database, contains the table SCOTT.EMP)
-----------------------------------------
conn system/manager1@db1

create or replace package system.pck
is
function dummy ( val in varchar2 ) return varchar2;
end pck;
/

create or replace package body system.pck
is
function dummy ( val in varchar2 ) return varchar2
is
begin
return 'X';
end dummy;
end pck;
/

This package must be used for scrambling the data.

DB2 (the importing database)
-----------------------------------------
conn system/manager1@db2

create database link db1 connect to system identified by manager1 using 'db1';

declare
h1 number;
l_status Varchar2(2000);
begin
h1 := dbms_datapump.open
( operation => 'IMPORT'
, job_mode => 'SCHEMA'
, remote_link => 'DB1'
, job_name => 'REMAP_DATA_TEST'
, version => 'COMPATIBLE'
);
dbms_datapump.add_file
( handle => h1
, filename => 'REMAP_DATA_TEST.log'
, directory => 'DATA_PUMP_DIR'
, filetype => DBMS_DATAPUMP.ku$_file_type_log_file
);
dbms_datapump.metadata_filter
( handle => h1
, name => 'SCHEMA_EXPR'
, value => q'| = 'SCOTT'|'
);
dbms_datapump.data_remap
( handle => h1
, name => 'COLUMN_FUNCTION'
, table_name => 'EMP'
, column => 'ENAME'
, function => 'SYSTEM.PCK.DUMMY'
, schema => 'SCOTT'
);
dbms_datapump.start_job (h1,cluster_ok => 0);
--
--
dbms_datapump.wait_for_job
( handle => h1
, job_state => l_status);
--
dbms_output.put_line ( 'Finished with status = '||l_status );
end;
/

ORA-39001: invalid argument value
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DATAPUMP", line 3507
ORA-06512: at "SYS.DBMS_DATAPUMP", line 4309
ORA-06512: at line 23

However, if I create the SYSTEM.PCK package on DB2, the tables are transferred from DB1 to DB2, and the column ENAME in EMP contains only the value "X".

This is a big issue, as we cannot have the sensitive data leave our production database "DB1".

Can you help? Have I missed anything?

Br,
Kim

and Connor said...

Is there any reason you cannot do the export locally and remap the data as part of that? ie

SQL> create or replace
  2  package scott.pkg is
  3    function randomizer(c number) return number;
  4  end;
  5  /

Package created.

SQL>
SQL> create or replace
  2  package body scott.pkg is
  3  function randomizer(c number) return number is
  4  begin
  5    return trunc(dbms_random.value(1,1000));
  6  end;
  7  end;
  8  /

Package body created.


C:\>expdp scott/tiger directory=TEMP dumpfile=scott_mask logfile=scott_mask.log remap_data=emp.empno:pkg.randomizer

Export: Release 12.2.0.1.0 - Production on Thu Aug 10 12:31:59 2017

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
Starting "SCOTT"."SYS_EXPORT_SCHEMA_01":  scott/******** directory=TEMP dumpfile=scott_mask logfile=scott_mask.log remap_data=emp.empno:pkg.randomizer
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_BODY
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SYNONYM/SYNONYM
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Processing object type SCHEMA_EXPORT/SEQUENCE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/FUNCTION
Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
Processing object type SCHEMA_EXPORT/PACKAGE/COMPILE_PACKAGE/PACKAGE_SPEC/ALTER_PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/ALTER_FUNCTION
Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
. . exported "SCOTT"."EMP$"                              8.796 KB      15 rows
. . exported "SCOTT"."EMP"                               8.773 KB      14 rows
. . exported "SCOTT"."DEPT"                              6.023 KB       4 rows
. . exported "SCOTT"."SALGRADE"                          5.953 KB       5 rows
. . exported "SCOTT"."BONUS"                                 0 KB       0 rows
Master table "SCOTT"."SYS_EXPORT_SCHEMA_01" successfully loaded/unloaded
******************************************************************************



Comments

Yes, but....

Kim, August 10, 2017 - 8:16 am UTC

The reason we do not use dump files is that we have a lot of Exadata machines but no common filesystem.

For this reason, we use Data Pump over the network.
Connor McDonald
August 10, 2017 - 8:52 am UTC

Have you tried network mode *export*, ie

expdp network_link=my_test_db dumpfile=... directory=...

to dump the data to the target filesystem, and then import from the local file.
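
For illustration only, with hypothetical link and file names (prod_db being a database link created on the test database, pointing back at production), that might look like:

expdp system/manager1 network_link=prod_db schemas=scott directory=DATA_PUMP_DIR dumpfile=scott.dmp logfile=scott.log remap_data=scott.emp.ename:system.pck.dummy

impdp system/manager1 directory=DATA_PUMP_DIR dumpfile=scott.dmp logfile=scott_imp.log

Note that in this mode the Data Pump job, and hence the remap function, runs on the database you connect to, ie the test side.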

Not a bad idea, but not allowed here

Kim, August 10, 2017 - 8:57 am UTC

Hi Connor,

I have already considered that, and while it is not a bad idea, it will not help here.

Exporting from the remote database (the test system) means that the data masking will take place on the test system.
This is not allowed; the sensitive data must not leave the production database without being masked.
Connor McDonald
August 12, 2017 - 4:53 am UTC

My proposal was not that - it was to run the export on the *production* node and write the dumpfile across the network to the test node.
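
For illustration, a minimal sketch of that, assuming the test host exposes a writable share that is mounted on the production host at /mnt/test_dp (all paths and names hypothetical):

On production:

conn system/manager1@db1

create directory DP_TO_TEST as '/mnt/test_dp';

expdp system/manager1@db1 schemas=scott directory=DP_TO_TEST dumpfile=scott_mask.dmp logfile=scott_mask.log remap_data=scott.emp.ename:system.pck.dummy

The remap function executes inside the production database, so only masked data is ever written to the test side. The import on DB2 is then a plain file-based impdp, with no remap_data clause and no need for SYSTEM.PCK there.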

to requester

A reader, August 11, 2017 - 7:00 am UTC

Your logic is sound, but it comes down to the fact that the remap SHOULD be done while exporting, not importing.

followup!

AMA, December 06, 2018 - 11:27 am UTC

Well, what about creating the Data Pump job, as well as system.pck, on the source database DB1, and then pushing the masked data to DB2 instead of pulling the data from DB1 to DB2?

What do you think?
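
For what it's worth, a minimal sketch of that push approach, assuming a database link DB2 created on DB1 pointing at the target, and a DATA_PUMP_DIR directory object on both databases (all names hypothetical):

-- on DB1: run the export locally, so the remap happens on production
declare
h1 number;
l_status varchar2(2000);
begin
h1 := dbms_datapump.open
( operation => 'EXPORT'
, job_mode => 'SCHEMA'
, job_name => 'REMAP_DATA_PUSH'
);
dbms_datapump.add_file
( handle => h1
, filename => 'scott_masked.dmp'
, directory => 'DATA_PUMP_DIR'
, filetype => dbms_datapump.ku$_file_type_dump_file
);
dbms_datapump.metadata_filter
( handle => h1
, name => 'SCHEMA_EXPR'
, value => q'| = 'SCOTT'|'
);
dbms_datapump.data_remap
( handle => h1
, name => 'COLUMN_FUNCTION'
, table_name => 'EMP'
, column => 'ENAME'
, function => 'SYSTEM.PCK.DUMMY'
, schema => 'SCOTT'
);
dbms_datapump.start_job (h1);
dbms_datapump.wait_for_job ( handle => h1, job_state => l_status );
dbms_output.put_line ( 'Export finished with status = '||l_status );
end;
/

-- still on DB1: the dump file now contains only masked data,
-- so it is safe to push it across the link
begin
dbms_file_transfer.put_file
( source_directory_object => 'DATA_PUMP_DIR'
, source_file_name => 'scott_masked.dmp'
, destination_directory_object => 'DATA_PUMP_DIR'
, destination_file_name => 'scott_masked.dmp'
, destination_database => 'DB2'
);
end;
/

A normal file-based import on DB2 then loads the dump, and the sensitive data never leaves DB1 unmasked.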




