One Star

How to pick only latest files from the FTP server?

We receive files at irregular intervals on a remote FTP server. Yesterday we got files for Mar 2, 3 and 4, and today for Mar 5 and 6. Tomorrow we might receive files for Mar 7, 8, 9, 10, 11 and 12. All the CSV files have the date appended to the filename, e.g. ABC_20150307. We have to create a job that picks the latest uploaded files from the FTP server, iterates through all the new files and loads them into a MySQL table. This job will execute daily.
My question is: how do I pick only the most recently arrived files from the FTP server? How should I set the filemask property in the tFileList component for this? A screenshot or detailed description would be helpful.
Thanks
Shipra
4 REPLIES
Moderator

Re: How to pick only latest files from the FTP server?

Hi,
Could you please take a look at this related forum thread and see whether it works for your case? https://www.talendforge.org/forum/viewtopic.php?id=39197
Best regards
Sabrina
One Star

Re: How to pick only latest files from the FTP server?

Hi Shipra,
I faced a similar situation some time back and handled it with a slightly different approach.

1. I configured the tFileList component to list the files in descending order of 'latestModifiedTime', so that the most recently modified file is listed first.
2. Then I used a FLAG approach to process only the first file listed by the tFileList component. For this I used two tJava components: one to initialise the flag (and store it in the globalMap), and one to update it once the first file has been processed successfully.
3. The link between the tFileList component and the tFileInputDelimited component is an 'IF' condition on the FLAG. This condition lets the flow proceed only if the FLAG is true.
4. The FLAG is initialised to 'True' and updated to 'False' after the first successful execution. Thus only the first (latest) file gets processed.
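
The four steps above can be sketched in plain Java, outside Talend. This is only an illustration of the logic, not the actual job: the class `LatestFileFlagSketch`, the `ListedFile` type and the key `processFlag` are hypothetical names, and a `HashMap` stands in for Talend's globalMap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LatestFileFlagSketch {
    // Hypothetical stand-in for one listed file: name plus last-modified time (epoch millis).
    static class ListedFile {
        final String name;
        final long lastModified;
        ListedFile(String name, long lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    static List<String> processLatestOnly(List<ListedFile> files) {
        // Stand-in for the globalMap; the first tJava would initialise the flag here.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("processFlag", Boolean.TRUE);

        // Step 1: list files in descending order of modification time (newest first).
        List<ListedFile> ordered = new ArrayList<>(files);
        ordered.sort((a, b) -> Long.compare(b.lastModified, a.lastModified));

        List<String> processed = new ArrayList<>();
        for (ListedFile f : ordered) {
            // Step 3: the 'IF' link lets a file through only while the flag is true.
            if ((Boolean) globalMap.get("processFlag")) {
                processed.add(f.name); // placeholder for reading and loading the file
                // Step 4: the second tJava would flip the flag after the first success.
                globalMap.put("processFlag", Boolean.FALSE);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<ListedFile> files = Arrays.asList(
            new ListedFile("ABC_20150304.csv", 1000L),
            new ListedFile("ABC_20150306.csv", 3000L),
            new ListedFile("ABC_20150305.csv", 2000L));
        System.out.println(processLatestOnly(files));
    }
}
```

Note that the loop still visits every listed file; the flag only decides whether a file is processed, which mirrors how the 'IF' link behaves in the description above.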

I agree that your approach of selectively picking the latest file using the filemask would also work, but it would still depend on generating the filemask for the latest file. By contrast, the above approach is independent of any filemask condition and can be used as-is in other scenarios as well.
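
To illustrate what "generating the filemask" involves, here is a minimal Java sketch that builds a mask from a date, using the ABC_20150307 naming from the question. The class and method names are hypothetical, and the trailing `*` wildcard is an assumption about how the file extension would be matched.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class FilemaskSketch {
    // Builds a glob-style mask such as "ABC_20150307*" for the given date.
    // The "ABC_" prefix comes from the question; the "*" suffix is an assumption.
    static String maskFor(LocalDate dataDate) {
        DateTimeFormatter yyyymmdd = DateTimeFormatter.ofPattern("yyyyMMdd");
        return "ABC_" + dataDate.format(yyyymmdd) + "*";
    }

    public static void main(String[] args) {
        System.out.println(maskFor(LocalDate.of(2015, 3, 7)));
    }
}
```

A mask built this way matches on the date in the filename, not on the day the file was uploaded, which is why this approach depends on knowing which dates to expect.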
Let me know if your concerns persist. :)
MathurM
One Star

Re: How to pick only latest files from the FTP server?

Thanks Sabrina and MathurM.
Your approach seems helpful, but I don't want only the latest file; I want ALL the files that arrived today. How do I pick them all? :(
I need to check the upload date on the FTP server and pick all the files that arrived today. For example, if the files for Mar 4, 5 and 6 arrive today, on 13 Mar, I need all three. The flag approach will pick only the latest one, that is, the Mar 6 file.
One Star

Re: How to pick only latest files from the FTP server?

Oh, in that scenario your approach of working with the filemask would be better.
Here, using the default built-in functions in the filemask section of the tFileList component, the condition needs to be like:
Assumption: the job is scheduled to run at least once a day (i.e. one or more executions per day).
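
The selection rule discussed in this thread (take every file that arrived today, whatever date is in its name) can be sketched in plain Java. This is not a tFileList filemask expression, only an illustration of the comparison involved. The `RemoteFile` type and `arrivedOn` method are hypothetical, and how the modification timestamps are obtained from the FTP server is left out.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ArrivedTodaySketch {
    // Hypothetical stand-in for a remote file entry: name plus upload/modification time.
    static class RemoteFile {
        final String name;
        final Instant modified;
        RemoteFile(String name, Instant modified) {
            this.name = name;
            this.modified = modified;
        }
    }

    // Returns the names of all files whose modification time falls on 'day' in 'zone'.
    static List<String> arrivedOn(List<RemoteFile> files, LocalDate day, ZoneId zone) {
        List<String> result = new ArrayList<>();
        for (RemoteFile f : files) {
            LocalDate modifiedDay = f.modified.atZone(zone).toLocalDate();
            if (modifiedDay.equals(day)) {
                result.add(f.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Files for Mar 4, 5 and 6 uploaded on 13 Mar; one older file uploaded on 12 Mar.
        List<RemoteFile> files = Arrays.asList(
            new RemoteFile("ABC_20150303.csv", Instant.parse("2015-03-12T09:00:00Z")),
            new RemoteFile("ABC_20150304.csv", Instant.parse("2015-03-13T08:00:00Z")),
            new RemoteFile("ABC_20150305.csv", Instant.parse("2015-03-13T08:05:00Z")),
            new RemoteFile("ABC_20150306.csv", Instant.parse("2015-03-13T08:10:00Z")));
        System.out.println(arrivedOn(files, LocalDate.of(2015, 3, 13), ZoneOffset.UTC));
    }
}
```

In a daily job, `day` would be the current date, e.g. `LocalDate.now(zone)`. The time zone matters: a file uploaded shortly before midnight on the server may belong to a different calendar day in the job's own zone.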
This should definitely put an end to your woes. :)
MathurM