Error while running a spark job on AWS EMR.

Four Stars

I have a job that runs fine in local mode. But when I launch the same job on AWS EMR, I get the following error, as seen in the Spark log files:

Diagnostics Info:
AM Container for appattempt_1525124598990_0007_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: File file:/Users/Desktop/TALEND/Talend-Studio-20180116_1512-V6.5.1/workspace/.Java/lib/metrics-core-3.1.2.jar does not exist
java.io.FileNotFoundException: File file:/Users/Desktop/TALEND/Talend-Studio-20180116_1512-V6.5.1/workspace/.Java/lib/metrics-core-3.1.2.jar does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:640)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:866)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:630)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:452)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://ip-10-0-0-12.ec2.internal:8088/cluster/app/application_1525124598990_0007 Then click on links to logs of each attempt.

 

Note:
It looks like the Talend job is trying to load the file "metrics-core-3.1.2.jar" from my local (developer) machine, but it cannot find it. However, I verified that the file exists at that path on my machine and is accessible.

 

Can anyone help?

Moderator

Re: Error while running a spark job on AWS EMR.

Hello,

The metrics-core-3.1.2.jar file should be shipped with Talend Studio. Could you please create a case on the Talend support portal so that we can give you priority remote assistance (WebEx)?

 

Best regards

Sabrina

One Star

Re: Error while running a spark job on AWS EMR.

Did you resolve your issue? If so, can you please share your solution? Currently, I am experiencing the same problem.
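
One possible reading of the stack trace above, offered as a guess rather than a confirmed diagnosis: the failing frames are in org.apache.hadoop.yarn.util.FSDownload, which runs on the cluster node while it prepares the container, and it is calling RawLocalFileSystem on a file: URI. If so, the path file:/Users/... is being looked up on the EMR node's own disk, not on the developer machine, which would explain why the jar exists locally and is still reported missing. The Python sketch below only illustrates that idea by flagging jar URIs whose scheme is not a shared store. The function name, the scheme list, and the S3 bucket are hypothetical and are not part of Talend or Spark.

```python
from urllib.parse import urlparse

# Assumption: storage schemes that every node of a typical EMR cluster can read.
CLUSTER_VISIBLE_SCHEMES = {"hdfs", "s3", "s3a", "s3n"}


def local_only_jars(jar_uris):
    """Return the URIs that are only meaningful on the machine that wrote them."""
    flagged = []
    for uri in jar_uris:
        scheme = urlparse(uri).scheme
        # "file" or an empty scheme (a bare path) refers to a local disk,
        # so a remote node resolves it against its own filesystem.
        if scheme not in CLUSTER_VISIBLE_SCHEMES:
            flagged.append(uri)
    return flagged


jars = [
    # Path copied from the error message in this thread.
    "file:/Users/Desktop/TALEND/Talend-Studio-20180116_1512-V6.5.1/workspace/.Java/lib/metrics-core-3.1.2.jar",
    # Hypothetical bucket, for illustration only.
    "s3://my-example-bucket/libs/metrics-core-3.1.2.jar",
]

for uri in local_only_jars(jars):
    print("not visible to cluster nodes:", uri)
```

If the flagged path is what the job submits, it may be worth checking whether the job's dependencies are uploaded to the cluster (or referenced from shared storage) before the application master starts.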
