Custom substitution masking

Use custom substitution masking to replace production data with realistic test data from a flat file or relational dictionary that you create.
Substitution masking replaces a column of data with similar but unrelated data. For example, you can create a dictionary that contains male and female first names. Use the dictionary to perform substitution masking on a column that contains both male and female first names.
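The idea of substitution can be illustrated with a minimal Python sketch. This is not how the Data Masking transformation is implemented. The `FIRST_NAMES` list and the `substitute` function are hypothetical names introduced only for this example, and the random selection stands in for nonrepeatable substitution.

```python
import random

# Hypothetical dictionary of first names. A real dictionary would come from
# a flat file or a relational table that you create.
FIRST_NAMES = ["Alice", "Brian", "Carla", "David", "Elena", "Frank"]


def substitute(values, dictionary, rng=None):
    """Replace every source value with a randomly chosen dictionary entry."""
    rng = rng or random.Random()
    return [rng.choice(dictionary) for _ in values]


# The masked values are similar in kind (first names) but unrelated to the
# source values. The same source value can receive different replacements.
masked = substitute(["John", "Mary", "John"], FIRST_NAMES)
print(masked)
```

Because the choice is random, the output differs from run to run, which is the behavior you would expect from nonrepeatable substitution.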
You can configure custom substitution masking to replace the target column with unique masked values for every unique source column value.
When you configure custom substitution masking, select the dictionary type and the connection, and then select the required dictionary file or table. You can then select the dictionary column that you want to use. To support non-English characters, you can use different code pages from a flat file connection.
Create a connection to the flat file dictionary from the Configure | Connections view. Add a relational dictionary connection and storage connection to the transformation on the Masking Rules tab.
The flat file connection code page and the Secure Agent system code page must be compatible for the masking task to work.
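As a rough sanity check before you run a task, you can test whether the bytes of a dictionary file decode under a given code page. The following Python sketch is an assumption-laden illustration, not the compatibility check that the product performs. The `can_decode` function and the sample data are hypothetical.

```python
def can_decode(data: bytes, code_page: str) -> bool:
    """Return True if the bytes decode cleanly under the named code page."""
    try:
        data.decode(code_page)
        return True
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: the bytes are not valid in this code page.
        # LookupError: Python does not recognize the code page name.
        return False


# Hypothetical dictionary content with non-English characters, saved as UTF-8.
sample = "José,Zoë,Søren\n".encode("utf-8")

print(can_decode(sample, "utf-8"))  # True
print(can_decode(sample, "ascii"))  # False
```

A successful decode is necessary but not sufficient. Single-byte code pages such as Latin-1 accept any byte sequence, so also inspect the decoded text to confirm that the characters are the ones you expect.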
You can substitute data with repeatable or nonrepeatable values. When you choose repeatable values, the Data Masking transformation produces deterministic results for the same source data and seed value. You must configure a seed value to substitute data with deterministic results. You can substitute more than one column of data with masked values from the same dictionary row.
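The following Python sketch shows one way that a seed value can produce repeatable results, and how several masked columns can come from the same dictionary row. The hashing scheme, the `pick_row` function, and the dictionary columns are assumptions made for illustration. The algorithm that the Data Masking transformation uses is not described here.

```python
import hashlib

# Hypothetical dictionary in which each row carries two related columns.
DICTIONARY = [
    {"first_name": "Alice", "gender": "F"},
    {"first_name": "Brian", "gender": "M"},
    {"first_name": "Carla", "gender": "F"},
    {"first_name": "David", "gender": "M"},
]


def pick_row(source_value, rows, seed=190):
    """Choose a dictionary row that depends only on the source value and seed."""
    key = f"{seed}:{source_value}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(rows)
    return rows[index]


# Both masked columns are taken from the same dictionary row, so the masked
# first name and the masked gender stay consistent with each other.
row = pick_row("John", DICTIONARY, seed=190)
masked_name, masked_gender = row["first_name"], row["gender"]

# The same source value and seed always select the same row.
assert pick_row("John", DICTIONARY, seed=190) == row
print(masked_name, masked_gender)
```

In this sketch, a different seed can map the same source value to a different row, which is why the seed must stay fixed wherever you need consistent masked values.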
Note: Before you run the mapping, verify that the dictionary file is present in the following location: <Secure Agent installation directory>\apps\Data_Integration_Server\data

Custom substitution parameters

The following table describes the parameters that you configure for custom substitution masking:
Flat File or Relational
Choose the type of custom dictionary to use.
If you choose flat file, you must create a flat file connection with the directory that points to the dictionary files.
If you choose relational, the transformation must include the relational dictionary.
Dictionary Connection
The name of the flat file or relational connection that stores the dictionary. To make a flat file dictionary available to all Secure Agents in a runtime environment, verify that the file is in the following location:
<Secure Agent installation directory>\apps\Data_Integration_Server\data
Dictionary Name
The custom dictionary that you want to select.
Dictionary Column
The output column from the custom dictionary. For flat file dictionaries, you can select a dictionary column if the flat file contains column headers.
Order By
Applicable for relational dictionaries. The dictionary column on which you want to sort entries. Specify a sort column to generate deterministic results even if the order of entries in the dictionary changes. For example, if you move a relational dictionary and the order of entries changes, sort on the serial number column to consistently mask the data.
Note: The column that you choose must contain unique values. Do not use columns that can contain duplicate values to sort the data.
Lookup Input Column
Optional. The source input column on which you perform a lookup operation with the dictionary.
Lookup Dictionary Column
Required if you enter a Lookup Input Column value. The dictionary column to compare with the input port. The source value is replaced with values from the dictionary rows where the Lookup Input Column and Lookup Dictionary Column values match.
Lookup Error Constant
Optional. A constant value that you can configure when there are no matching values for the lookup condition from the dictionary. Default is an empty string.
Repeatable Output
Returns the same masked value when you run a task multiple times or when you generate masked values for a field that is in multiple tables.
Seed Value
A starting number to create repeatable output. Enter a number from 1 through 999. Default seed value is 190. You can enter the seed value as a parameter.
Optimize Dictionary Usage
Increases the usage of masked values from the dictionary. Available if you choose the Repeatable option. The property is not applicable if you enable unique substitution.
Is Unique
Applicable for repeatable substitution. Replaces the target column with unique dictionary values for every unique source column value. If there are more unique values in the source than in the dictionary file, the data masking operation fails. Default is nonunique substitution.
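The interaction of the lookup, Order By, and Is Unique parameters can be sketched in Python. The code below is an illustration under stated assumptions and does not reproduce the product's algorithm. The column names `sno`, `first_name`, and `gender`, and the functions `stable_index`, `lookup_substitute`, and `unique_substitute`, are hypothetical.

```python
import hashlib

# Hypothetical relational dictionary with a unique serial number column.
DICTIONARY = [
    {"sno": 1, "first_name": "Alice", "gender": "F"},
    {"sno": 2, "first_name": "Brian", "gender": "M"},
    {"sno": 3, "first_name": "Carla", "gender": "F"},
    {"sno": 4, "first_name": "David", "gender": "M"},
]


def stable_index(value, seed, size):
    """Map a (seed, value) pair to a stable position in range(size)."""
    digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size


def lookup_substitute(source_value, lookup_value, dictionary,
                      seed=190, error_constant=""):
    """Pick a replacement only from rows whose lookup column matches."""
    # Sorting on a unique column makes the result independent of row order.
    matches = sorted(
        (row for row in dictionary if row["gender"] == lookup_value),
        key=lambda row: row["sno"],
    )
    if not matches:
        # No dictionary row satisfies the lookup condition.
        return error_constant
    return matches[stable_index(source_value, seed, len(matches))]["first_name"]


def unique_substitute(values, dictionary_values):
    """Give every distinct source value its own distinct dictionary value."""
    distinct = list(dict.fromkeys(values))  # keeps first-seen order
    if len(distinct) > len(dictionary_values):
        raise ValueError("more unique source values than dictionary values")
    mapping = dict(zip(distinct, dictionary_values))
    return [mapping[value] for value in values]


print(lookup_substitute("John", "M", DICTIONARY))
print(unique_substitute(["John", "Mary", "John"], ["Alice", "Brian", "Carla"]))
# second line prints ['Alice', 'Brian', 'Alice']
```

In the sketch, reordering the rows of `DICTIONARY` does not change the result of `lookup_substitute` because the candidates are sorted on the unique `sno` column first. `unique_substitute` raises an error when the source has more distinct values than the dictionary, which mirrors the failure condition described for Is Unique.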