Four Stars

How to setup a startup index to a xml file?

I have a file with 100 detail records. Once all 100 have been processed and I add 20 more, the next run should process only records 101 to 120. How can I do this in Talend?
5 REPLIES
Moderator

Re: How to setup a startup index to a xml file?

Hi,
Could you please elaborate on your case with an example, including the input and the expected output values?
Best regards
Sabrina
Four Stars

Re: How to setup a startup index to a xml file?

I pass a .csv or .xml file with 100 records as input and collect the output in .csv format. When I later add 20 more records to the input file and run the job again, it should generate output only for records 101 to 120, without regenerating the first 100. For example, my CSV file has 100 records, and after the first run the .csv output has 100 records. After I add 20 more records to the input .csv file, the same job should process records 101 to 120 rather than starting from the first record again. How can I do this in Talend?
My sample job: I use tFileInputDelimited to read the input, then sort it in descending order with tSortRow to find the last row. However, I am unable to pass this row as a parameter so that the next run uses it as the start index and outputs only the 20 new records.
Twelve Stars

Re: How to setup a startup index to a xml file?

In any case, the logic for CSV and XML will be different.
Generally, Talend does not support a "tail" function (and I do not know of any software that can do this for remote files).
For CSV, if you are not worried about concurrent writes, you can save the number of the last processed row and, on the next run, use the saved number to skip that many lines.
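
The CSV idea above (save the number of processed rows, then skip that many lines on the next run) can be sketched in plain Java. This is only an illustration under assumptions, not a Talend component: the class `IncrementalCsvReader`, its method names, and the separate state file are all hypothetical, and the code needs Java 11 or later for `Files.readString` and `Files.writeString`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not a Talend API): remembers how many data rows were already processed.
final class IncrementalCsvReader {

    // Reads the saved row count; a missing or empty state file means nothing was processed yet.
    static int readLastProcessed(Path stateFile) throws IOException {
        if (!Files.exists(stateFile)) {
            return 0;
        }
        String text = Files.readString(stateFile, StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? 0 : Integer.parseInt(text);
    }

    // Returns only the rows that were not processed before, then saves the new total.
    static List<String> readNewRows(Path csvFile, Path stateFile, int headerLines) throws IOException {
        int alreadyProcessed = readLastProcessed(stateFile);
        List<String> newRows;
        try (Stream<String> lines = Files.lines(csvFile, StandardCharsets.UTF_8)) {
            // Skip the header line(s) plus every data row handled by earlier runs.
            newRows = lines.skip((long) headerLines + alreadyProcessed).collect(Collectors.toList());
        }
        Files.writeString(stateFile, Integer.toString(alreadyProcessed + newRows.size()), StandardCharsets.UTF_8);
        return newRows;
    }

    public static void main(String[] args) throws IOException {
        Path csv = Files.createTempFile("details", ".csv");
        Path state = csv.resolveSibling(csv.getFileName() + ".state");

        // First run: two data rows, both are new.
        Files.write(csv, Arrays.asList("id,name", "1,a", "2,b"));
        System.out.println(readNewRows(csv, state, 1));

        // One row is appended; the second run returns only that row.
        Files.write(csv, Arrays.asList("id,name", "1,a", "2,b", "3,c"));
        System.out.println(readNewRows(csv, state, 1));
    }
}
```

As noted above, this only holds if rows are appended at the end of the file and nothing writes to the file while it is being read.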
With XML this may not work, depending on the XML format (it can work only if each row is a separate XML document).
Then again, if you know the new IDs, you can run an XQuery over the XML file that iterates over the IDs from the list.
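
For the XML case, selecting records by a list of known IDs could look roughly like the sketch below. It uses XPath from the standard `javax.xml.xpath` package rather than XQuery, and it assumes a hypothetical layout of `<details>` containing `<detail id="...">` elements; the class and method names are also made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: picks out only the <detail> elements whose id is in a given list.
final class XmlNewIdSelector {

    static List<String> selectDetails(String xml, List<Integer> newIds) throws XPathExpressionException {
        List<String> values = new ArrayList<>();
        if (newIds.isEmpty()) {
            return values; // nothing new to select
        }
        // Builds a predicate such as "@id=101 or @id=102" from the list of new IDs.
        String predicate = newIds.stream()
                .map(id -> "@id=" + id)
                .collect(Collectors.joining(" or "));
        String expression = "/details/detail[" + predicate + "]";

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        for (int i = 0; i < hits.getLength(); i++) {
            values.add(hits.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws XPathExpressionException {
        String xml = "<details>"
                + "<detail id=\"100\">a</detail>"
                + "<detail id=\"101\">b</detail>"
                + "<detail id=\"102\">c</detail>"
                + "</details>";
        System.out.println(selectDetails(xml, Arrays.asList(101, 102)));
    }
}
```

Note that the whole document is still parsed on every run; only the selection is restricted to the new IDs.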
For this reason, the most common approach with files is to process a file, then move or delete it, so that the next iteration starts from a new file.
Another way is to use a combination of different technologies:
- do not store the data in a file at all, but send it to a message queue or topic
You can do this directly from your source software or, if that is not possible, with external software such as https://github.com/mguindin/tail-kafka
and then parse this queue/topic with Talend
-----------
Four Stars

Re: How to setup a startup index to a xml file?

@vapukov
Thanks for the response. Could you please elaborate on how to process the CSV file after saving the last row?
My Job:
tFileInputDelimited-->tSortRow-->tSampleRow (here I collect the last row). After this, could you please explain how to use the last row as a start index and generate only the remaining data?
Twelve Stars

Re: How to setup a startup index to a xml file?

The tFileInputDelimited component has a Header setting, which is really the number of rows to skip (including the "header" row).
Talend does not parse the header; it just skips that number of rows and maps columns by delimiter, in the order you use in the component's schema.
So, write the value to a variable and use this variable as the Header value for the component.
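
The arithmetic behind the Header value can be sketched as follows. This is a standalone Java illustration, assuming the saved row count is kept in a map under a hypothetical key `lastProcessedRow`; the map here only stands in for wherever the job keeps the variable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: computes the number of lines to skip from a saved row count.
final class HeaderSkipSketch {

    // Lines to skip = real header lines + data rows already processed by earlier runs.
    static int headerValue(Map<String, Object> variables, int headerLines) {
        Object saved = variables.get("lastProcessedRow"); // hypothetical key
        int alreadyProcessed = (saved instanceof Integer) ? (Integer) saved : 0;
        return headerLines + alreadyProcessed;
    }

    public static void main(String[] args) {
        Map<String, Object> variables = new HashMap<>();

        // First run: nothing saved yet, so only the header line is skipped.
        System.out.println(headerValue(variables, 1)); // prints 1

        // After 100 rows were processed, 101 lines are skipped and reading starts at record 101.
        variables.put("lastProcessedRow", 100);
        System.out.println(headerValue(variables, 1)); // prints 101
    }
}
```

In a Talend job, the matching expression in the Header field might look like `1 + ((Integer) globalMap.get("lastProcessedRow"))`, assuming an earlier step (for example a tJava component) has stored the count under that key.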
-----------