I have downloaded the files for the last 3 months from an FTP server. Those are the past files; now I need to download the new files from the FTP server every day. The previous files, file_2018_07_01 to file_2018_09_09, are already downloaded, but how do I download file_2018_09_10 without looking up the file name? There will be multiple files like file_2018_09_10 to download every day. How can I compare the list of files on my PC with the list on the FTP server and fetch only the files that are not on the PC?
Can anyone suggest the right components for it?
@tychobrahe, do you want to download the file for the corresponding date every day? If so, you can do it with the expression below
in the filemask of tFTPGet.
"*" + TalendDate.getDate("yyyy_MM_dd")+"*"
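For readers new to Talend, here is a minimal plain-Java sketch of the string such a filemask is meant to produce. It uses `java.time` as a stand-in for the Talend routine, and the class and method names (`DateMask`, `maskFor`) are made up for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: builds a wildcard mask containing a date stamp.
final class DateMask {
    // Same pattern as in the filemask expression above.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy_MM_dd");

    // Returns a mask such as "*2018_09_10*" for the given date.
    static String maskFor(LocalDate date) {
        return "*" + date.format(STAMP) + "*";
    }

    public static void main(String[] args) {
        // Mask for today's files.
        System.out.println(maskFor(LocalDate.now()));
    }
}
```

A mask built this way only matches files whose names contain the date in exactly this format, so it stops working if the naming convention changes.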
@manodwhb thank you! The issue is that many data files get uploaded to the FTP server every day, and a lot of them don't need to be downloaded. So is there another kind of filter that we can use? Something other than the filename format, as the format could change every year.
@Dijke, thank you very much. The 2nd alternative is ruled out, as we can't do that on the FTP server.
Regarding the main approach you gave, using tFTPFileList, tFileList and then tFTPGet: would it be possible to download only the files that don't match between tFileList and tFTPFileList? Would it still work if the FTP admin changes the file names at some point? There is a possibility that they may change.
As for the 1st alternative with syncing dates, could you be more specific? I'm pretty new to Talend and could use some more details.
@tychobrahe, the pattern that you provided does not match the file. Please check the pattern.
If you are not sure about the filename or the date format, then the only way would be to compare the file contents between the local files and the files on the FTP server. You can use tFileCompare, which tells you whether a local file and an FTP server file differ, and then download only the files that differ.
1. You can use the 'exclude file mask' option in the advanced settings of tFileList, which leaves out any files matching the mask. Iterate over the filenames that are already present locally, put each full filename into the exclude file mask, and then download only the files that get through all the exclude masks.
2. Use tFileExist after tFTPFileList to check whether each FTP file is already present locally, and pass it to tFTPGet only if it is not, using a Trigger > Run if link with that condition.
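Both options come down to the same comparison: take the names listed on the FTP server and keep only those that are not already present locally. A minimal plain-Java sketch of that logic follows. The class and method names (`MissingFiles`, `missingLocally`) are hypothetical, and in a real job the two name lists would come from components such as tFTPFileList and tFileList rather than from hard-coded values.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: finds remote file names that are missing locally.
final class MissingFiles {
    // Returns the remote names not found among the local names,
    // in the order they appear in the remote listing.
    static List<String> missingLocally(Collection<String> remoteNames,
                                       Collection<String> localNames) {
        Set<String> local = new HashSet<>(localNames);
        List<String> missing = new ArrayList<>();
        for (String name : remoteNames) {
            if (!local.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Example names taken from the question; the lists are illustrative.
        List<String> remote = List.of("file_2018_09_09", "file_2018_09_10");
        List<String> local = List.of("file_2018_09_09");
        System.out.println(missingLocally(remote, local));
    }
}
```

The match is by exact name, so a file that is renamed on the server would look new and be downloaded again.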