Read files from FTP and get date created/modified




I am new to Talend and have been trying to find a solution to this problem. Currently, I can list all of the files on my FTP server. I would like to list only the files on the server that were created today. I have tried looping over all of the files on the server and passing the filename (I have tried both current_filepath and current_file) to tFTPFileProperties, which should then output the properties. As it stands, the output is null for every field.


Is there a step that I am missing somewhere?


Also, which component from the palette would I need to use in order to only output the files that were created on today's date?












Re: Read files from FTP and get date created/modified

Hello, is this possible to do in Talend, or am I going about it the wrong way? Also, where does everyone find tutorials and documentation on Talend? On Talend's website?



Re: Read files from FTP and get date created/modified



Unless your files follow a naming convention that embeds the date, such as myfile_25_09_2019, I think you won't be able to retrieve the "creation date" of a file, because the tFTPFileProperties component does not return it. However, you can build a Job design that processes files based on their modification date.

If you're interested, read on. If you absolutely need to process files based on creation date, try your own logic with tJava, Routines, etc. and your favorite FTP library.

Even though the topic is old, the following could help someone else someday. This is my solution to your problem:

The Job design:

1. Get the list of files with the tFTPFileList component.

Then, in the iterate loop:
1. Get the file properties with the tFTPFileProperties component.

[Screenshot 1_get_file_properties.PNG: Get file properties]


2. Filter out the rows that don't have any value for the "mtime" property.

"mtime" is the Unix timestamp of the file's last modification. Rows without an mtime correspond to folders, and we are looking for files. To do this, use a tFilterRow component with the condition: AND | "mtime" column | Empty function | "Not equal to" operator | null value. This keeps only the rows that have an mtime value.

[Screenshot 2_remove_rows_with_null_mtime.PNG: Filter out rows that don't have an mtime value]


3. With a tJavaRow component (the row-level variant of tJava), add another field, "fileDay". This new field will be used in the next component.
I recommend using the standard java.util.Calendar API, which lets you initialize a Calendar from a Unix timestamp stored as a long (the mtime field).
We will compare the file's date with today's date by using their respective Calendar.DAY_OF_YEAR values.

Copy every input_row value into output_row.

[Screenshot 3_add_dayOfYear_property_to_rows.PNG: Add a new field to each row: fileDay]
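As a rough illustration of what the tJavaRow code in step 3 computes, here is a minimal standalone sketch. The class name `FileDay` and the method `dayOfYear` are hypothetical, and the sketch assumes that mtime is expressed in milliseconds since the Unix epoch; if your server reports seconds, multiply by 1000L first.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper, not part of Talend: derives the "fileDay" value from mtime.
final class FileDay {

    // Assumption: mtimeMillis is milliseconds since the Unix epoch.
    static int dayOfYear(long mtimeMillis, TimeZone zone) {
        Calendar cal = new GregorianCalendar(zone);
        cal.setTimeInMillis(mtimeMillis);
        // DAY_OF_YEAR is 1-based: 1 January is day 1.
        return cal.get(Calendar.DAY_OF_YEAR);
    }

    public static void main(String[] args) {
        // Stand-in for input_row.mtime; here we simply use the current time.
        long mtime = System.currentTimeMillis();
        System.out.println("fileDay = " + dayOfYear(mtime, TimeZone.getDefault()));
    }
}
```

Inside a tJavaRow, the equivalent would be assigning the computed value to output_row.fileDay after copying the other columns across.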


4. Compare each row's "fileDay" value with today's DAY_OF_YEAR value in a tMap component.
Again, I recommend the Calendar API. Enable the "Catch output reject" option on the second output of the tMap.
You will then have two outputs: one with the files updated today, and one with all the others.

If fileDay == today, the row goes to the filesToday output; otherwise it goes to the reject output, filesNotToday.
[Screenshot 4_compare_rows_date_to_todays_date.PNG: Compare fileDay to todayDay in a tMap]
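The comparison in step 4 can be sketched in plain Java as follows. One caveat: DAY_OF_YEAR alone is the same for, say, 25 September 2019 and 24 September 2020, so this sketch also compares Calendar.YEAR, which is an addition to the design described above. The class `TodayFilter` and the method `sameDay` are hypothetical names, and mtime is again assumed to be in milliseconds.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper, not part of Talend: decides which tMap output a row belongs to.
final class TodayFilter {

    // Returns true when both timestamps fall on the same calendar day in the given zone.
    static boolean sameDay(long mtimeMillis, long nowMillis, TimeZone zone) {
        Calendar file = new GregorianCalendar(zone);
        file.setTimeInMillis(mtimeMillis);
        Calendar now = new GregorianCalendar(zone);
        now.setTimeInMillis(nowMillis);
        // Compare the year as well, so that old files sharing today's day number are rejected.
        return file.get(Calendar.YEAR) == now.get(Calendar.YEAR)
                && file.get(Calendar.DAY_OF_YEAR) == now.get(Calendar.DAY_OF_YEAR);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneYearAgo = now - 365L * 24 * 60 * 60 * 1000;
        TimeZone zone = TimeZone.getDefault();
        System.out.println("modified now      -> filesToday? " + sameDay(now, now, zone));
        System.out.println("modified last year -> filesToday? " + sameDay(oneYearAgo, now, zone));
    }
}
```

In the tMap itself, the same boolean would serve as the filter expression on the filesToday output, with the rejected rows flowing to filesNotToday.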


And that's it. You can now process the files that were updated today separately from those that were not.

Have a nice day.

