Custom Hadoop Distribution support to Spark components in Talend


I am working with a cluster that runs a custom Hadoop 2.4 build, and I am trying to use Talend with the Spark components. For the Spark connection component, I have set the relevant SparkHost and SparkHome.

For the distribution, the two available options are Cloudera and Custom (unsupported). When Custom (unsupported) is selected, there is an option to choose the custom Hadoop version so that the relevant libraries are included. The options available here are: Cloudera, HortonWorks, MapR, Apache, Amazon EMR, PivotalHD. However, when I choose Cloudera it comes with Hadoop 2.3, so I am assuming that essential libraries are missing. As a result I get a "NoClassDefFoundError", and I cannot load a file in Spark through this Spark connection. The Spark version I have is 1.0.0.

I would like to know how to fix this, and whether there is a way to get this version of Spark running with Hadoop Certification.

The error is copied and pasted below:


[statistics] connecting to socket on port 3637
[statistics] connected
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/api/java/JavaSparkContext
    at sparktest.sparktest_0_1.sparktest.tSparkConnection_2Process(
    at sparktest.sparktest_0_1.sparktest.runJobInTOS(
    at sparktest.sparktest_0_1.sparktest.main(
Caused by: java.lang.ClassNotFoundException:
    at Method)
    at java.lang.ClassLoader.loadClass(
    at sun.misc.Launcher$AppClassLoader.loadClass(
    at java.lang.ClassLoader.loadClass(
    ... 3 more
[statistics] disconnected
Job sparktest ended at 13:19 21/10/2014. [exit code=1]
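
A NoClassDefFoundError like the one above usually means the jar that contains the class is not on the job's runtime classpath. As a rough way to confirm that, a small check along the following lines could be compiled and run with the same classpath the generated job uses. This is only a sketch: the class name ClasspathCheck and the method isOnClasspath are made up for illustration and are not part of Talend or Spark.

```java
// Hypothetical diagnostic helper, not part of Talend or Spark.
final class ClasspathCheck {

    // Returns true if the named class can be located by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised by missing dependencies
            return false;
        }
    }

    public static void main(String[] args) {
        // The class reported as missing in the stack trace above
        String cls = "org.apache.spark.api.java.JavaSparkContext";
        System.out.println(cls + (isOnClasspath(cls) ? " found" : " NOT found"));
        // Print the effective classpath so the listed jars can be inspected
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the class is reported as not found, the printed classpath shows which jars were actually loaded, which should help tell whether the Spark libraries for the selected distribution were included at all.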



Re: Custom Hadoop Distribution support to Spark components in Talend


Could you please tell us which Talend build version you hit this issue on? There is an existing Jira issue about "spark job can't work with HDP2.3".

This issue has been fixed in 6.1.2 and 6.2.1.

Best regards

