One Star

Move data from AWS S3 to EMR

Hi experts,
Can anyone please help me understand how to move data from AWS S3 to an EMR cluster using Talend?
Also, if I have some zip files in S3 buckets, how can I unzip them using Talend before moving them to the EMR cluster? (I assume EMR is the Hadoop cluster provided by Amazon.)
I am using Talend Open Studio for Big Data and running it on my local PC.
Regards
Mukesh
3 REPLIES
Moderator

Re: Move data from AWS S3 to EMR

Hi,
If we understand your requirement correctly, you can use the tS3Get component to retrieve a file from Amazon S3.
The workflow should be: tS3Connection --> tS3Get (retrieve files from S3 to local) --> tFileUnarchive (unzip your file) --> EMR cluster (Amazon EMR).
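For illustration only, the unzip step in the middle of this workflow does roughly what the short Python sketch below does, using only the standard library. This is not how Talend implements it; the function name unarchive and the paths in the usage note are hypothetical, and the S3 download and the load into EMR are assumed to be handled by the other components.

```python
import zipfile
from pathlib import Path


def unarchive(zip_path, dest_dir):
    """Extract every entry of zip_path into dest_dir.

    Returns the paths of the extracted files (directory entries are skipped).
    Hypothetical helper that mirrors the unzip step of the workflow.
    """
    dest = Path(dest_dir)
    # Create the target folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Zip directory entries end with "/", keep only real files.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    return [dest / n for n in names]
```

For example, unarchive("downloads/data.zip", "downloads/extracted") would unpack a file already fetched from S3 into a local folder (both paths are made up), and the extracted files could then be loaded into the cluster.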
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: Move data from AWS S3 to EMR

Thanks for your reply. 
Which Talend component can I use for the "EMR cluster" you mentioned at the end?
regards
Mukesh
Moderator

Re: Move data from AWS S3 to EMR

Hi,
You can select Amazon EMR as the distribution in the Hadoop components (for example, tHDFSConnection and tHDFSPut).
Best regards
Sabrina