One Star

tFileList and HDFS connection

Hi,
Is there a way to use tFileList with an existing connection to a Hadoop cluster? I would like to return a list of files and then decide whether to open each file depending on its name.
Thanks,
Bob
3 REPLIES
Moderator

Re: tFileList and HDFS connection

Hi,
The tFileList component cannot retrieve files or folders from a Hadoop cluster directly.
Use the tHDFSList component instead (see the Talend Help Center page for tHDFSList): it retrieves a list of files or folders based on a filemask pattern and iterates over each item.
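
To illustrate the "decide by name" part of the question, here is a minimal Java sketch of a name check that could run on each iteration. It makes several assumptions that are not from this thread: the component is labeled tHDFSList_1, the current file name is read from globalMap under the key "tHDFSList_1_CURRENT_FILE", and the helper shouldOpen and its rule (prefix "sales_", extension ".csv") are hypothetical. The globalMap here is a plain HashMap standing in for the one Talend provides at runtime, so the sketch runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

public class HdfsFileNameCheck {

    // Hypothetical rule: only open CSV files whose name starts with "sales_".
    public static boolean shouldOpen(String fileName) {
        return fileName != null
                && fileName.startsWith("sales_")
                && fileName.endsWith(".csv");
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap; in a real job Talend supplies it.
        Map<String, Object> globalMap = new HashMap<>();

        // Sample names standing in for what the list component would iterate over.
        String[] listed = {"sales_2020.csv", "sales_2020.tmp", "audit.csv"};
        for (String name : listed) {
            // Assumed key: "tHDFSList_1" is the component's label in the job.
            globalMap.put("tHDFSList_1_CURRENT_FILE", name);
            String current = (String) globalMap.get("tHDFSList_1_CURRENT_FILE");
            System.out.println(current + " -> " + (shouldOpen(current) ? "open" : "skip"));
        }
    }
}
```

In a job, a check like this would typically sit in the condition of a Run if trigger between the list component and the component that reads the file, so files that fail the check are never opened.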
Best regards
Sabrina
Employee

Re: tFileList and HDFS connection

Hi,
Is there a tHDFSList equivalent for Spark?
Thank you
Seven Stars

Re: tFileList and HDFS connection

You would have to handle that in a Standard DI job, then call the Spark Big Data job with tRunJob, passing any context to the Spark job as needed.