I am giving a .csv or XML file with 100 details as input and collecting the output in .csv format. I then modify the input file by adding 20 more details, and when I run the job again it should generate only details 101 to 120, without regenerating the first 100. For example, my CSV file has 100 details; I run the job and the .csv output has 100 details. Now I have added 20 more details to the input .csv file, and the same job should process details 101 to 120 rather than starting again from the first detail. How can I do this in Talend? My sample job: I used tFileInputDelimited to read the input, then sorted it with tSortRow in descending order to find the last row. I am unable to pass this row as a parameter so that the next run uses it as a start index and outputs only the 20 new details.
In any case, the logic for CSV and XML will be different. In general, Talend does not support a "tail" function (and I do not know of any software that can do this for remote files).

For CSV, if you are not worried about concurrent writes, you can save the number of the last processed row and, on the next iteration, use the saved number to skip the first XXX lines.

For XML this may not work, depending on the XML format (it can work only if each row is a separate XML document). Again, if you know the new ids, you can run XQuery over the XML and iterate over the file for the ids in your list.

For this reason, the most commonly used approach with files is: the file is processed, then moved or deleted, and the next iteration starts from a new file.

Another way is to combine different technologies: do not store the data in a file at all, but send it to a message queue or topic. You can do this directly from your source software or, if that is not possible, with external software such as https://github.com/mguindin/tail-kafka, and then parse this queue/topic with Talend.
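The CSV approach above (save the last processed row number, skip that many lines next time) can be sketched in plain Java. This is only an illustration of the logic, not Talend-generated code: the class name, the method names and the idea of keeping the count in a small ".offset" side file are all assumptions, and it presumes a single header line and rows that are only ever appended.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: remembers how many data rows earlier runs handled.
class IncrementalCsvReader {

    // Number of data rows processed so far; 0 when no state file exists yet.
    static int readOffset(Path stateFile) throws IOException {
        if (!Files.exists(stateFile)) {
            return 0;
        }
        String text = new String(Files.readAllBytes(stateFile), StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? 0 : Integer.parseInt(text);
    }

    // Persist the new count so the next run can pick it up.
    static void writeOffset(Path stateFile, int offset) throws IOException {
        Files.write(stateFile, String.valueOf(offset).getBytes(StandardCharsets.UTF_8));
    }

    // Return only the rows added since the previous run and update the saved count.
    static List<String> processNewRows(Path csvFile, Path stateFile) throws IOException {
        int alreadyProcessed = readOffset(stateFile);
        List<String> newRows;
        try (Stream<String> lines = Files.lines(csvFile, StandardCharsets.UTF_8)) {
            // Skip the header line plus every row handled by earlier runs.
            newRows = lines.skip(1L + alreadyProcessed).collect(Collectors.toList());
        }
        writeOffset(stateFile, alreadyProcessed + newRows.size());
        return newRows;
    }

    public static void main(String[] args) throws IOException {
        Path csv = Files.createTempFile("details", ".csv");
        Path state = csv.resolveSibling(csv.getFileName() + ".offset");

        Files.write(csv, Arrays.asList("id,name", "1,a", "2,b"));
        System.out.println(processNewRows(csv, state)); // first run: all data rows

        Files.write(csv, Arrays.asList("3,c"), StandardOpenOption.APPEND);
        System.out.println(processNewRows(csv, state)); // second run: only the appended row
    }
}
```

Counting lines this way breaks if rows are edited or removed in the middle of the file, or if another process writes while the job reads, which is the concurrent-write caveat mentioned above.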
@vapukov Thanks for the response. Could you please elaborate on how to process the CSV file after saving the last row? My job: tFileInputDelimited --> tSortRow --> tSampleRow (here I collect the last row). After this, could you please explain how to use the last row as a start index and generate only the remaining data?
The tFileInputDelimited component has a Header setting, which is really the number of rows to skip (including the "header" row). Talend does not parse the header; it just skips that number of rows and maps columns by delimiter, in the order you defined them in the component's schema. So write the value to a variable and use that variable as the value of the Header setting.
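A minimal sketch of the arithmetic behind that Header value, written as standalone Java so it can be run outside Talend. In a real job the `globalMap` object is provided by the generated code (for example inside a tJava component); here a plain HashMap stands in for it. The key name "processedRows" and the class and method names are hypothetical, and the sketch assumes the file has exactly one real header line.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for logic that would normally sit in tJava components of the job.
class HeaderOffsetSketch {

    // Hypothetical key holding the number of data rows already processed.
    static final String PROCESSED_KEY = "processedRows";

    // Value for the Header setting: the real header line plus all rows handled before.
    static int headerValue(Map<String, Object> globalMap) {
        Object saved = globalMap.get(PROCESSED_KEY);
        int processed = (saved == null) ? 0 : (Integer) saved;
        return 1 + processed;
    }

    // After a run, add the number of rows that run has just read.
    static void recordRun(Map<String, Object> globalMap, int rowsReadThisRun) {
        int processed = headerValue(globalMap) - 1;
        globalMap.put(PROCESSED_KEY, processed + rowsReadThisRun);
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        System.out.println(headerValue(globalMap)); // first run: skip only the header line
        recordRun(globalMap, 100);                  // 100 details were read
        System.out.println(headerValue(globalMap)); // next run: skip header + 100 rows
    }
}
```

Note that `globalMap` only lives for a single job execution, so between runs the count has to be loaded from and saved to something persistent, such as a small file or a context variable loaded at job start. With that in place, the tSortRow and tSampleRow steps are not needed just to find where to resume.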