Five Stars

How to Join data masked

Hi, 

I want to mask sensitive data in my DB with Talend. Some of the data to be masked are key fields that I need for joins. I used tDataMasking, but when I apply the same function to the same key in two different tables, the output is different. How can I fix this? Is there a particular tDataMasking function I should choose so that joins still work on the masked data?

  • Data Quality
4 REPLIES
Employee

Re: How to Join data masked

Hi Mark,

That's a very good question. Originally, most of the tDataMasking functions were purely random (i.e. they ignored the input value). In 6.3 we added some functions for SSNs (called "Generate unique xxx SSN number", where xxx can be Chinese, French, German, Indian, Japanese, UK or US) that do exactly what you want, provided your input is an SSN. We may do the same for other types (such as credit cards). Which functions are you interested in?

Damien

Employee

Re: How to Join data masked

If your keys are not SSNs, there is still an approximate way to do it:

first, store all your unique IDs in a file
then use the "Replace by consistent items from input list (or file)" function to read from this file.
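To illustrate the idea behind a "consistent" replacement, here is a minimal Python sketch. It is not the tDataMasking implementation: the function name `consistent_replace`, the salt, and the hash-based selection are assumptions made for illustration only. The point is that the replacement is derived from the input value, so the same key always gets the same masked value in whichever table it appears.

```python
import hashlib

def consistent_replace(key, replacements, salt="demo-salt"):
    """Return the same item from `replacements` for the same `key` on every call.

    Hypothetical helper: the input is hashed and the hash selects the item.
    """
    digest = hashlib.sha256((salt + key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replacements)
    return replacements[index]

# Stand-in for the file of unique IDs (made-up sample values).
replacements = ["AB123", "CD456", "EF789", "GH012", "IJ345"]

# The same key masked while processing two different tables.
masked_in_customers = consistent_replace("AB123", replacements)
masked_in_orders = consistent_replace("AB123", replacements)
print(masked_in_customers == masked_in_orders)
```

With this kind of selection, two different keys can end up with the same replacement, which is one reason to call the approach approximate.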

Five Stars

Re: How to Join data masked

Thank you for your answer,

I want to join tables on their keys. These keys can be strings of digits, letters, or both. I tried masking them with "replace all", "replace all digits", "replace all letters" and other functions, but the join does not work correctly: when I apply the same function to the same key in two different tables, it is masked differently.

 

Employee

Re: How to Join data masked

Hi Mark,

 

This cannot work. The replacements done by these functions are purely random.
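A small Python sketch of why random replacement breaks the join. The function `replace_all_digits` below is a hypothetical stand-in written for this example, not the Talend code, and the two seeded generators simply represent two independent masking runs.

```python
import random
import string

def replace_all_digits(value, rng):
    """Replace every digit with a random digit; the result does not depend on the input digits."""
    return "".join(rng.choice(string.digits) if ch.isdigit() else ch for ch in value)

key = "CUST-204815937660"  # made-up sample key

# Two independent runs, e.g. one per table.
masked_in_table_a = replace_all_digits(key, random.Random(1))
masked_in_table_b = replace_all_digits(key, random.Random(2))

print(masked_in_table_a)
print(masked_in_table_b)
```

The format of the key is preserved, but the two masked values no longer match, so a join on them finds nothing.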

 

As I said in my previous answer:

first, store all your unique keys in a file (ideally, add more keys to this file but avoid duplicates).
then use the "Replace by consistent items from input list (or file)" function to read from this file.
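For readers who want to see the effect on a join, here is a rough Python sketch. It uses a stricter variant than the Talend function necessarily implements: an explicit one-to-one lookup built by shuffling the key list. All names (`build_mapping`, `join`, the sample tables and keys) are made up for illustration.

```python
import random

def build_mapping(unique_keys, extra_keys=(), seed=42):
    """Map each original key to a distinct value drawn from the key pool (one-to-one).

    Hypothetical helper. A key may occasionally be mapped to itself.
    """
    keys = sorted(set(unique_keys))
    pool = sorted(set(keys) | set(extra_keys))  # the "file" of unique keys plus extras
    random.Random(seed).shuffle(pool)
    return dict(zip(keys, pool))

def join(customers, orders):
    """Inner join on the customer key, returning (name, amount) pairs."""
    return sorted((c["name"], o["amount"])
                  for c in customers for o in orders
                  if c["id"] == o["customer_id"])

customers = [{"id": "A17", "name": "Alice"}, {"id": "B42", "name": "Bob"}]
orders = [{"customer_id": "A17", "amount": 30},
          {"customer_id": "B42", "amount": 12},
          {"customer_id": "A17", "amount": 5}]

all_keys = {c["id"] for c in customers} | {o["customer_id"] for o in orders}
mapping = build_mapping(all_keys, extra_keys={"X01", "X02", "X03"})

# Apply the same mapping to the key column of both tables.
masked_customers = [{**c, "id": mapping[c["id"]]} for c in customers]
masked_orders = [{**o, "customer_id": mapping[o["customer_id"]]} for o in orders]

print(join(customers, orders) == join(masked_customers, masked_orders))
```

Because both tables go through the same lookup, the join on the masked keys returns the same pairs as the join on the original keys.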

 

See the attached example.