Talend Big Data Batch - Reading from S3

Guys, I've started testing Talend Big Data.

Specifically, I'm trying to read a CSV file from S3 into a dataframe. I want to merge this data with an existing Parquet file on S3.
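
To make the goal concrete, here is a rough sketch of the same operation in plain PySpark, outside Talend. It is not what the Talend job generates, only an illustration under some assumptions: the bucket and key names are hypothetical, "merge" is taken to mean appending the CSV rows to the Parquet data with matching column names, and the Hadoop S3A connector (the hadoop-aws library) is assumed to be on the Spark classpath. The helper names `s3a_uri`, `s3a_spark_conf`, and `merge_csv_into_parquet` are made up for this example.

```python
def s3a_uri(bucket: str, key: str) -> str:
    """Build an s3a:// URI, the scheme handled by the Hadoop S3A connector."""
    return f"s3a://{bucket}/{key.lstrip('/')}"


def s3a_spark_conf(access_key: str, secret_key: str) -> dict:
    """Spark properties that hand static credentials to the S3A connector."""
    return {
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
    }


def merge_csv_into_parquet(csv_uri: str, parquet_uri: str, out_uri: str, conf: dict) -> None:
    """Read a CSV and a Parquet dataset from S3, append one to the other, write the result."""
    # pyspark is imported here so the helpers above can be used without it installed
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("s3-csv-parquet-merge")
    for name, value in conf.items():
        builder = builder.config(name, value)
    spark = builder.getOrCreate()

    # Assumption: the CSV has a header row and the same column names as the Parquet data
    new_rows = spark.read.csv(csv_uri, header=True, inferSchema=True)
    existing = spark.read.parquet(parquet_uri)

    # Assumption: "merge" means appending rows; a key-based merge would need a join instead
    merged = existing.unionByName(new_rows)

    # Written to a separate location, since Spark cannot overwrite a path it is reading from
    merged.write.mode("overwrite").parquet(out_uri)
```

A call would look something like `merge_csv_into_parquet(s3a_uri("my-bucket", "raw/data.csv"), s3a_uri("my-bucket", "curated/data.parquet"), s3a_uri("my-bucket", "curated/data_merged.parquet"), s3a_spark_conf(access_key, secret_key))`, where every bucket and key is a placeholder.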

 

1.PNG

2.PNG

3.PNG

Talend returns the following error:

4.PNG

 

I once read or heard that in Talend Big Data you first have to download the file from S3 to HDFS, and only then can a Big Data Batch job process the data. Is that correct? Is the approach I'm trying possible, or should I use the second approach?

I've found it really difficult to find answers to this on the internet.

Thanks,

André Santos
