I'm following this post, where a Talend user was able to compare two files using tMap, but I can't replicate his success. Any direction is much appreciated.
http://www.talendforge.org/forum/viewtopic.php?id=12358

File to Compare contents (Main):
col1, col2, col3, col4
1,2,3,9
5,6,7,8

Reference File contents (lookup):
col1, col2, col3, col4
1,2,3,4
5,6,7,8

In tMap, I dragged each reference column to the like-named column in the main:
row1.col1 -- row2.col1
row1.col2 -- row2.col2
row1.col3 -- row2.col3
row1.col4 -- row2.col4

This drew purple lines and "key" graphics, as shown in the screenshot. I then clicked the tMap Settings button of row2 and set Join Model to "INNER JOIN". (I did not set any columns to KEY in the schema editor for either row, because it seems like INNER JOIN should take care of that.)

I then added two tFileOutputDelimited components (MATCH and DIFFS). I dragged all columns of row1 (lookup) to MATCH and all columns of row2 (main) to DIFFS. Using the tMap Settings of the DIFFS output, I set Catch lookup inner join reject to "TRUE".

The MATCH file has the data I expect:
5,6,7,8

But the DIFFS file contains only separators:
,,,

Whereas I expected this row of data:
1,2,3,9

I've worked through the tMap example in TalendOpenStudio_Components_RG_41b_EN.pdf and the tutorial here: http://www.talendforge.org/tutorials/tutorial.php?language=english&idTuto=8. Of course, both of these trials worked flawlessly, but I've not been able to extract the necessary portions to build my own file compare.

Thanks in advance for any direction/suggestions.
Mark
That is expected: when you catch inner join rejects, they are rejects. It means the rows that came from row1 couldn't find a match in row2, so they are rejected. With no matching row2 record, the row2 columns are empty, so how do you expect to get data from row2? Map the row1 columns to DIFFS instead.
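To make the point above concrete, here is a minimal sketch in plain Python (not Talend code) of an inner join on all four columns with reject capture. The function names `join_all_columns` and `to_line` are hypothetical, and the sketch only assumes that a rejected main row has no lookup row attached, which is why the lookup-side columns come out empty.

```python
# Illustrative only: mimics the logic of an inner join with reject capture.
MAIN_ROWS = [(1, 2, 3, 9), (5, 6, 7, 8)]     # file to compare (row1)
LOOKUP_ROWS = [(1, 2, 3, 4), (5, 6, 7, 8)]   # reference file (row2)


def join_all_columns(main, lookup):
    """Join on every column; return (matched, rejected) lists of (main, lookup) pairs."""
    index = {row: row for row in lookup}
    matched, rejected = [], []
    for row in main:
        hit = index.get(row)
        if hit is not None:
            matched.append((row, hit))
        else:
            # No lookup row exists for a reject, so the lookup side is None.
            rejected.append((row, None))
    return matched, rejected


def to_line(row, width=4):
    """Format a row as a delimited line; a missing row yields only separators."""
    if row is None:
        return ",".join([""] * width)
    return ",".join(str(value) for value in row)


matched, rejected = join_all_columns(MAIN_ROWS, LOOKUP_ROWS)
match_lines = [to_line(main_row) for main_row, _ in matched]
diffs_from_lookup = [to_line(lookup_row) for _, lookup_row in rejected]
diffs_from_main = [to_line(main_row) for main_row, _ in rejected]
```

Writing the rejects from the lookup side reproduces the `,,,` line from the question, while writing them from the main side gives `1,2,3,9`.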
Thanks for the detailed example on file comparison. I am new to Talend DI and want to do file comparison for my testing requirements, so it was very helpful.

The example shows the entire row in which there is a difference. In my case the files are huge, so the requirement is to report only the specific row and column that do not match, not the entire row.

As per the example here:

File to Compare contents (Main):
col1, col2, col3, col4
1,2,3,9
5,6,7,8

Reference File contents (lookup):
col1, col2, col3, col4
1,2,3,4
5,6,7,8

The output shows the entire row 1,2,3,9, whereas I want to see only the 9. Is that possible? Please advise.
Hi Pulak,
In tMap, if you join on col4 only, you can still catch the reject. The only condition is that you map the single column col4 to the reject output rather than all of the input columns.
What is your scenario? Can you please post a screenshot?
Vaibhav
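As a rough illustration of that suggestion, the following plain-Python sketch (not Talend code) joins on col4 only and keeps just col4 of each rejected main row. The function name `col4_rejects` is hypothetical, and the data is the sample from earlier in the thread.

```python
# Illustrative only: join on the fourth column and output just that column for rejects.
MAIN_ROWS = [(1, 2, 3, 9), (5, 6, 7, 8)]     # file to compare
LOOKUP_ROWS = [(1, 2, 3, 4), (5, 6, 7, 8)]   # reference file


def col4_rejects(main, lookup):
    """Return col4 of every main row whose col4 value has no match in the lookup."""
    lookup_col4 = {row[3] for row in lookup}
    return [row[3] for row in main if row[3] not in lookup_col4]


rejected_values = col4_rejects(MAIN_ROWS, LOOKUP_ROWS)
```

Note that this only detects differences in the column used as the join key, so it does not by itself cover differences that may appear in any column.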
Hi Vaibhav,
Thanks for your answer. The requirement here is that I don't want to see the entire row; I want to see only the specific column where the difference is. In large files the data may vary in any of the columns. So my question is: can tMap point to the exact cell where the difference is, rather than showing the entire row?

To give an example:

File1:
1234
5678
1111
4444

File2:
1234
5978
1221
3444

Can I get output only for the differences, like this?
row1: okay
row2,col2: 9
row3,col2: 2, row3,col3: 2
row4,col1: 3

In large files you never know where the difference is. My file contains millions of records, so if I get the entire row it is difficult to spot what changed.
Thanks,
Pulak
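The output being asked for can be sketched outside Talend as a position-by-position comparison. This is a minimal plain-Python sketch, not a tMap configuration, and it rests on two assumptions taken from the example: rows are paired by line number, and each character of a row is treated as one column. The function names `cell_differences` and `format_report` are hypothetical.

```python
# Illustrative only: report the (row, column, value) of each cell that differs.
FILE1 = ["1234", "5678", "1111", "4444"]
FILE2 = ["1234", "5978", "1221", "3444"]


def cell_differences(rows_a, rows_b):
    """Return (row_no, col_no, value_in_b) for every differing cell, 1-based.

    Rows are paired by position. zip() stops at the shorter input, so extra
    rows or columns on one side are not reported by this sketch.
    """
    differences = []
    for row_no, (row_a, row_b) in enumerate(zip(rows_a, rows_b), start=1):
        for col_no, (cell_a, cell_b) in enumerate(zip(row_a, row_b), start=1):
            if cell_a != cell_b:
                differences.append((row_no, col_no, cell_b))
    return differences


def format_report(differences):
    """Render one 'rowN,colM: value' line per differing cell."""
    return [f"row{row_no},col{col_no}: {value}" for row_no, col_no, value in differences]


differences = cell_differences(FILE1, FILE2)
report_lines = format_report(differences)
```

For delimited data, each row could be split into a list of fields before being passed in, since the comparison works on any pair of sequences. Because only differing cells are kept, the result stays small even when the inputs have millions of rows.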