Trying to connect Talend Open Studio for Big Data with Hadoop on AWS

Hi everybody,
I have Talend Open Studio on my desktop and a Hadoop cluster on Amazon Web Services. I am trying to connect the two so that I can run my Hadoop jobs from Talend. As I learned from tutorials, to integrate Talend with Hadoop I need to set up a tHDFSConnection component. When I double-click the tHDFSConnection icon in the Job Design workspace, the Component view opens at the bottom of the screen. There I choose the Hadoop distribution (Amazon EMR) and set my user name and group. I am confused by the "NameNode URI" field, though: what should it be? Sorry if my question is trivial, but none of the information I've found online has been much help.

Re: Trying to connect Talend Open Studio for Big Data with Hadoop on AWS

Hi kpopov - take a look at this AWS guide on connecting to HDFS on AWS remotely: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-web-interfaces.html
Scroll to the bottom of the page.
As the doc indicates, if you still can't access your URI directly, you'll have to use one of the 3 options listed on that page to connect remotely.
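
As for what goes in the "NameNode URI" field itself: it normally takes the form hdfs://<master-node-host>:<port>. Below is a minimal sketch, not an official Talend or AWS tool, that builds such a URI and checks whether the port is reachable from your desktop. The host name is a made-up placeholder, and port 8020 is an assumption (some older EMR versions are reported to use 9000), so confirm the real value from fs.defaultFS (or fs.default.name) in core-site.xml on your master node.

```python
import socket


def namenode_uri(master_host: str, port: int = 8020) -> str:
    """Build the hdfs:// URI string for the NameNode URI field.

    The default port 8020 is an assumption; confirm it against
    fs.defaultFS in core-site.xml on the cluster's master node.
    """
    return f"hdfs://{master_host}:{port}"


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds from this machine."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical placeholder; replace with your master node's public DNS name.
    master = "ec2-xx-xx-xx-xx.compute-1.amazonaws.com"
    print("NameNode URI:", namenode_uri(master))
    print("Reachable from here:", is_reachable(master, 8020))
```

If the reachability check fails, the port is most likely blocked between your desktop and the cluster, which is the situation the remote connection options in the AWS guide are meant to address.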
