One Star

data masking / encryption

Hey guys,
I have a requirement to mask data before I push it into a cloud database. In TOS, I want to encrypt or hash IP address data and then, in the destination database, compare the encrypted or hashed values to identify whether two people have used the same IP address, without decrypting anything. How can I do this? Is it possible in a tMap?
6 REPLIES
Six Stars

Re: data masking / encryption

Hi 2ndofafew,
Yes, you can do that in TOS. The easiest way to encrypt is to use PasswordEncryptUtil.encryptPassword, which ships with Talend. You can modify the code in <main install folder>/plugins/org.talend.librariesmanager/resources/java/routines/system/PasswordEncryptUtil.java to use your own unique key.
If you want something even simpler, you can apply row1.originatedIp.hashCode(), which already returns an int.
Then, from your other process, re-apply the same encryption or hashing and compare the two values with a lookup in your tMap. Since your database can be huge, I would probably set Lookup Model = Reload at each row and apply the hash to the globalMap Key.
Also be aware that, in rare cases, two different IP addresses can produce the same hash (a collision), which gives a false positive. In addition, people in large companies can sit behind a proxy and therefore appear to come from the same IP address.
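
To make the comparison step concrete, here is a minimal standalone Java sketch of the idea above: mask each IP with String.hashCode() and spot shared addresses by comparing only the masked values. The class name, the mask helper, and the sample rows are made up for illustration; in a real Job the lookup would be done by the tMap, not by a HashMap. Note that hashCode() is a 32-bit, non-cryptographic hash, so it hides the raw value only superficially.

```java
import java.util.HashMap;
import java.util.Map;

class IpHashCompareSketch {
    // Mask an IP address with String.hashCode(): deterministic, 32-bit, not cryptographic.
    static int mask(String ip) {
        return ip.hashCode();
    }

    public static void main(String[] args) {
        // Hypothetical sample rows: {user, ip address}
        String[][] rows = {
            {"alice", "192.168.0.10"},
            {"bob", "10.1.2.3"},
            {"carol", "192.168.0.10"}
        };

        // Remember the first user seen for each masked value.
        Map<Integer, String> firstUserByMaskedIp = new HashMap<>();
        for (String[] row : rows) {
            int masked = mask(row[1]);
            String earlier = firstUserByMaskedIp.putIfAbsent(masked, row[0]);
            if (earlier != null) {
                // Same masked value: same IP, or (rarely) a collision.
                System.out.println(row[0] + " shares a masked IP with " + earlier);
            }
        }

        // Collisions are possible: the distinct strings "Aa" and "BB" both hash to 2112.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints "true"
    }
}
```

Equal inputs always give equal masked values, which is what makes the comparison work without decryption. The last line shows why a match is not absolute proof that the original values were identical.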

 
One Star

Re: data masking / encryption

Hey AdrienServian,
Thanks for the help. I've tried to use row1.IP_ADDRESS.hashcode() directly in a tMap, with row1.IP_ADDRESS as a String and the output column as an int. However, this fails with "The method hashcode() is undefined for the type String". If it's not too much trouble, could you go into a bit more detail on usage, or is there some documentation explaining this?
Moderator

Re: data masking / encryption

Hi,
Could you please take a look at this KB article, TalendHelpCenter: How to setup encryption of the passwords in Talend Studio, to see if it is what you are looking for?
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: data masking / encryption

Hey Xdshi, how could I apply this in a tMap on a data flow?
Moderator

Re: data masking / encryption

Hi,
You can call your custom routine in the tMap component.
Here is a KB article, TalendHelpCenter: Creating a user routine and call it in a Job.
Best regards
Sabrina
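
As a rough illustration of what such a custom routine could look like, here is a sketch that masks an IP with a keyed hash (HMAC-SHA256 from the standard javax.crypto API) instead of PasswordEncryptUtil or hashCode(). This is a swapped-in technique, not something from the KB article. The class name IpMaskingSketch, the method maskIp, and the idea of passing the key in from a context variable are all assumptions for the example. In Talend Studio a routine would normally live in the routines package; the package line is left out here so the snippet compiles on its own.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class IpMaskingSketch {
    // Returns a 64-character hex HMAC-SHA256 of the IP, or null for a null input.
    // The same ip + secretKey always yields the same output, so masked values
    // can be compared for equality without ever decrypting them.
    static String maskIp(String ip, String secretKey) {
        if (ip == null) {
            return null;
        }
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(ip.trim().getBytes(StandardCharsets.UTF_8));

            // Hex-encode the 32-byte digest.
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key invalid", e);
        }
    }
}
```

In a tMap output column of type String, the expression would then be something like IpMaskingSketch.maskIp(row1.IP_ADDRESS, context.maskKey), where context.maskKey is a hypothetical context variable holding the secret. Every process that needs to compare values must use the same key.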
Ten Stars

Re: data masking / encryption

In reply to the earlier post:
"I've tried to use row1.IP_ADDRESS.hashcode() directly in a tMap, with row1.IP_ADDRESS as a String and the output column as an int. However, this fails with 'The method hashcode() is undefined for the type String'. Could you go into a bit more detail on usage, or is there some documentation explaining this?"

Java is case sensitive. The method name is hashCode(), not hashcode(). I put together a simple job that reads a single-column file and uses a tMap to add a column with the expression "row1.IP_ADDRESS.hashCode()". The job runs to completion and, for my sample file, produces a different integer for each unique input value.
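
For anyone hitting the same error, here is a small plain-Java sketch of the spelling issue, plus a null-safe variant in case the column is nullable. The class and method names are invented for the example; only String.hashCode() itself is standard Java.

```java
class HashCodeSpellingSketch {
    // Null-safe version of the tMap expression: a null IP stays null
    // instead of throwing a NullPointerException.
    static Integer maskOrNull(String ip) {
        return ip == null ? null : Integer.valueOf(ip.hashCode());
    }

    public static void main(String[] args) {
        String ip = "10.0.0.1"; // hypothetical sample value

        // int bad = ip.hashcode();  // does not compile: hashcode() is undefined for String
        int good = ip.hashCode();    // correct spelling, capital C

        System.out.println(good == maskOrNull(ip)); // prints "true"
        System.out.println(maskOrNull(null));       // prints "null"
    }
}
```

In the tMap itself, the equivalent inline expression would be row1.IP_ADDRESS == null ? null : row1.IP_ADDRESS.hashCode(), with the output column typed as a nullable Integer.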