One Star

Check Sha-1

Hi all,
I am receiving 2 files over FTP:
  - 1 csv for data
  - 1 csv for a sha-1 hash
I need to check that the data file was not altered, so I want to calculate the SHA-1 of the csv containing the data and compare it with the hash received.
How can I perform this task in Talend?
Thank you in advance
Br
7 REPLIES
Four Stars

Re: Check Sha-1

Hi,
Check the solution given in this thread:
https://talendforge.org/forum/viewtopic.php?id=20015
I would take the following approach:
- Generate the SHA-1 for the input file and write it to another file.
- Read both SHA-1 values and compare them with an inner join in tMap. A rejected row means the file was altered; otherwise it was not.
I hope that gives you the idea.
Vaibhav
Five Stars

Re: Check Sha-1

Here's a "routine" that generates a hash for a String. It might point you in the right direction if you want to adapt it to hash an entire file. Note that it computes SHA-256; for SHA-1, pass "SHA-1" to MessageDigest.getInstance instead.
public static String getSHA256(String plainText) {
    if (plainText == null) return null;
    try {
        java.security.MessageDigest md = java.security.MessageDigest.getInstance("SHA-256");
        md.update(plainText.getBytes("UTF-8"));
        byte[] bytes = md.digest();
        // Zero-pad to 64 hex characters; BigInteger.toString(16) alone drops leading zeros.
        return String.format("%064x", new java.math.BigInteger(1, bytes));
    } catch (Exception e) {
        // Hashing failed (unsupported algorithm or encoding): signal it with null.
        return null;
    }
}
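To adapt the routine to an entire file, something along these lines might work. It is only a sketch: the class name FileHash and the method name sha1Hex are made up for illustration, and it assumes the hash you received was computed over the raw bytes of the data file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Talend.
final class FileHash {
    /** Returns the SHA-1 of the file's raw bytes as 40 lowercase hex characters. */
    static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        // Stream the file in chunks so large files are not loaded into memory at once.
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        // Format each byte as two hex digits so leading zeros are kept.
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Hashing the bytes rather than a decoded String avoids any dependence on the file's character encoding or line endings.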
One Star

Re: Check Sha-1

Thank you,
it works with a single pair of data and checksum files, but my issue is that my folder contains many files (let's say 10 data files and 10 checksum files).
How can I structure my job in order to:
- read a data file
- calculate the checksum of this file
- compare the value to the checksum file
file by file?
Thank you
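The three steps listed above, for one pair of files, could look roughly like this in Java. This is a sketch under assumptions: ChecksumCheck and matches are hypothetical names, and the checksum file is assumed to hold the hex SHA-1 as its first whitespace-separated token (the thread does not say what the file looks like inside).

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Talend.
final class ChecksumCheck {
    /** SHA-1 of the given bytes as 40 lowercase hex characters. */
    static String sha1Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
        return String.format("%040x", new BigInteger(1, digest));
    }

    /** True if the SHA-1 of dataFile equals the hash stored in checksumFile. */
    static boolean matches(Path dataFile, Path checksumFile)
            throws IOException, NoSuchAlgorithmException {
        String content = new String(Files.readAllBytes(checksumFile), StandardCharsets.UTF_8).trim();
        if (content.isEmpty()) {
            return false; // nothing to compare against
        }
        // Assumption: the first token of the checksum file is the hex digest.
        String expected = content.split("\\s+")[0];
        String actual = sha1Hex(Files.readAllBytes(dataFile));
        // Hex digests may be upper or lower case, so ignore case.
        return actual.equalsIgnoreCase(expected);
    }
}
```

Calling matches once per pair gives the file-by-file check; it reads each data file fully into memory, which is fine for moderate file sizes.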
Four Stars

Re: Check Sha-1

In that scenario, you can:
- Create a single file with the 10 SHA-1 values computed from the data files
- Create a single file with the 10 SHA-1 values read from the checksum files
- Use tMap to compare the two files with each other.
This way the hashes are compared against each other.
The reason for not doing this on the existing files directly is that the tFileList component reads files one by one, and there is no option to reload all the files once it has finished.
Thanks
Vaibhav
One Star

Re: Check Sha-1

Thank you Vaibhav,
I will try this.
So if I understand correctly, it is not possible to compare pair by pair, for example (datafile 1 & checksumfile 1), then (datafile 2 & checksumfile 2) ... (datafile n & checksumfile n)?
Best Regards
Four Stars

Re: Check Sha-1

You can do this if your files follow a specific sort order by name or date; otherwise it is not possible.
Vaibhav
One Star

Re: Check Sha-1

Hi Vaibhav,
my files are named as follows:
data files: "XXXXXX-YYYYMMDDHHMMSS.csv" (Y=year, M=month, D=day, H=hour, M=minute, S=second)
checksum files: "XXXXX_YYYYMMDDHHMMSS.syn" (Y=year, M=month, D=day, H=hour, M=minute, S=second)
Each pair of files (data file, associated checksum file) has the same (date, hour) in the filename.
Regards
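With names like these, one possible way to pair the files is to key them on the 14-digit timestamp. The sketch below is only an illustration: FilePairing and pair are hypothetical names, and it assumes the timestamp alone identifies a pair (two data files with the same timestamp would end up with the same checksum file).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend.
final class FilePairing {
    // 14 digits (YYYYMMDDHHMMSS) immediately before the .csv or .syn extension.
    private static final Pattern STAMPED = Pattern.compile("(\\d{14})\\.(csv|syn)$");

    /** Maps each data file name to its checksum file name, or to null if none shares its timestamp. */
    static Map<String, String> pair(List<String> fileNames) {
        Map<String, String> checksumByStamp = new LinkedHashMap<>();
        List<String[]> dataFiles = new ArrayList<>(); // {timestamp, file name}
        for (String name : fileNames) {
            Matcher m = STAMPED.matcher(name);
            if (!m.find()) {
                continue; // not a data or checksum file, skip it
            }
            if (m.group(2).equals("syn")) {
                checksumByStamp.put(m.group(1), name);
            } else {
                dataFiles.add(new String[] {m.group(1), name});
            }
        }
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String[] d : dataFiles) {
            pairs.put(d[1], checksumByStamp.get(d[0]));
        }
        return pairs;
    }
}
```

Each entry of the returned map is one (data file, checksum file) pair to verify; a null value flags a data file that arrived without its checksum file.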