row hashing and column concatenation

One Star

Hello - 
I would like to concatenate all the columns in a schema for the purpose of creating an MD5 hash.
Currently I am doing this in a tJavaRow:
output_row.src_md5diff = utils.md5Hash(
      utils.replaceNull(input_row.firstName, context.nullReplaceWith) + context.hashDelim
    + utils.replaceNull(input_row.lastName, context.nullReplaceWith) + context.hashDelim
    + utils.replaceNull(input_row.defaultLocale, context.nullReplaceWith) + context.hashDelim
    + utils.replaceNull(input_row.jobTitle, context.nullReplaceWith) + context.hashDelim
    + utils.replaceNull(input_row.phoneNumber, context.nullReplaceWith) + context.hashDelim
);
This works fine, but I hate all that typing! I have a schema with 50 columns, and now I want to clean this up.
I would like to loop over each column in the schema.
Pseudocode:
RowString = ""
for each column in schema
    RowString = RowString + column.value + delimiter
next column
RowHash = md5Hash(RowString)
In Talend TOS, is there an exposed collection of columns that can be looped over as described above? I suspect tJavaFlex would be the right component?
Thanks
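
The loop in the pseudocode above could be sketched with plain Java reflection. This is only an illustration, assuming the row object exposes its columns as public instance fields; RowConcat and DemoRow are hypothetical names, not Talend APIs, and the thread does not confirm that Talend offers a built-in column collection.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Talend: joins the public instance fields of a row object.
final class RowConcat {
    private RowConcat() {}

    static String concat(Object row, String delimiter, String nullReplacement) {
        Field[] fields = row.getClass().getFields();
        // getFields() makes no ordering guarantee, so sort by name to keep the result stable.
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        StringBuilder sb = new StringBuilder();
        for (Field field : fields) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue; // skip constants and other static members
            }
            Object value;
            try {
                value = field.get(row);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
            // Same shape as the hand-written version: value (or null marker) plus delimiter.
            sb.append(value == null ? nullReplacement : String.valueOf(value)).append(delimiter);
        }
        return sb.toString();
    }
}

// Stand-in for a generated row structure; the field names are borrowed from the question.
class DemoRow {
    public String firstName;
    public String lastName;
    public String jobTitle;
}
```

In a tJavaRow this might be called as utils.md5Hash(RowConcat.concat(input_row, context.hashDelim, context.nullReplaceWith)). Note that the columns are joined in field-name order rather than schema order, so the resulting hash would differ from one built by the hand-written concatenation.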
Seven Stars

Re: row hashing and column concatenation

Hello, 
Each of the row structures has a toString() method generated for it that creates a comma-delimited string of the row's values when invoked. If this is insufficient for you, you can in theory override it at some point (e.g. in a tJavaRow) with your own implementation. Or you can take the generated string and call replaceAll(",",<your delimiter here>) to change the delimiter to the one you want, and replaceAll("null",<null replacement here>) to substitute the null values.
Something like utils.md5Hash(row4.toString().replaceAll(",","\t").replaceAll("null","THIS IS NULL"))
Hope that answers your question.
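
As a rough sketch of the reformatting step described above, the helper below rewrites an already comma-delimited string. RowStringFormat is a hypothetical name, and the input is assumed to look like "a,b,null"; it may be worth printing row4.toString() once to check what your Talend version actually generates.

```java
// Hypothetical helper: reformats an already comma-delimited row string.
final class RowStringFormat {
    private RowStringFormat() {}

    static String reformat(String commaDelimited, String delimiter, String nullReplacement) {
        // String.replace treats both arguments as literal text, unlike replaceAll,
        // which parses the first as a regex and the second as a replacement pattern.
        return commaDelimited
                .replace(",", delimiter)
                .replace("null", nullReplacement);
    }
}
```

One caveat of this text-based approach: it cannot tell a real null from the letters "null" inside a value (for example "Annulled"), and a value that itself contains a comma will be split by the delimiter replacement.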
Five Stars

Re: row hashing and column concatenation

A great tip - row4.toString()
I would say that you do not need to worry about null conversion and the like: you only want a hash of the data, and the content itself does not matter as long as identical data hashes to the same value.
One Star

Re: row hashing and column concatenation

Is utils.md5Hash a built-in function, or did you create it as a Routine?
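
The thread does not say where utils.md5Hash comes from. If it is a user routine, a minimal sketch built on the standard java.security.MessageDigest class might look like the following; the class name HashUtils and the choice of UTF-8 and lowercase hex output are assumptions, not the original poster's code.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical routine; the original poster's utils.md5Hash may differ.
final class HashUtils {
    private HashUtils() {}

    static String md5Hash(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Fix the charset so the hash does not depend on the platform default.
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            // Render the 16 digest bytes as 32 lowercase hex characters, zero-padded.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

For example, HashUtils.md5Hash("abc") returns 900150983cd24fb0d6963f7d28e17f72, the test vector given in RFC 1321.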