
Facing Issues with Spark Context Initialization Using a Spark Big Data Batch Job

Good afternoon,

My cluster appears to be correctly configured, yet I am getting the error log below. Can anyone help me?


Starting job testAvro at 15:16 02/11/2017.

[statistics] connecting to socket on port 3464
[statistics] connected
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/0martinjr/.Java/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/0martinjr/.Java/lib/talend-spark-assembly-1.6.0-cdh5.8.1-hadoop2.6.0-cdh5.8.1-with-hive.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[WARN ]: org.apache.spark.SparkConf - In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
[ERROR]: org.apache.spark.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
 at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124)
 at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
 at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
 at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
 at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
 at poc_sebastien.testavro_0_1.testAvro.runJobInTOS(testAvro.java:1291)
 at poc_sebastien.testavro_0_1.testAvro.main(testAvro.java:1172)
[WARN ]: org.apache.spark.metrics.MetricsSystem - Stopping a MetricsSystem that is not running
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
 at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124)
 at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
 at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
 at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
 at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
 at poc_sebastien.testavro_0_1.testAvro.runJobInTOS(testAvro.java:1291)
 at poc_sebastien.testavro_0_1.testAvro.main(testAvro.java:1172)
Exception in thread "main" java.lang.RuntimeException: TalendJob: 'testAvro' - Failed with exit code: 1.
 at poc_sebastien.testavro_0_1.testAvro.main(testAvro.java:1182)
[ERROR]: poc_sebastien.testavro_0_1.testAvro - TalendJob: 'testAvro' - Failed with exit code: 1.
Job testAvro ended at 15:38 02/11/2017. [exit code=1]
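
For context, the failure happens while the generated job is constructing its JavaSparkContext in yarn-client mode, roughly as in the minimal sketch below (the class name and structure are illustrative, not the exact Talend-generated code):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkContextInitSketch {
    public static void main(String[] args) {
        // "yarn-client" is the Spark 1.6 master URL for YARN client mode.
        SparkConf conf = new SparkConf()
                .setAppName("testAvro")
                .setMaster("yarn-client");

        // This constructor corresponds to JavaSparkContext.<init> in the stack
        // trace above; it blocks while YARN launches the application master,
        // and fails with "Yarn application has already ended!" if that launch
        // is killed or never succeeds.
        JavaSparkContext sc = new JavaSparkContext(conf);

        // ... the actual job logic would run here ...

        sc.stop();
    }
}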

Thanks in advance,

sebastien1981


Re: Facing Issues with Spark Context Initialization Using a Spark Big Data Batch Job

We are using Talend 6.3, CDH 5.9, and Spark 1.6.


Cluster configuration:
Property Type: Repository: HDFS:testSeb
Distribution: Cloudera; Version: Cloudera CDH5.8 (YARN mode)
Spark Mode: YARN client

Configuration:
Resource manager: "xxxxxxx:8032"
Set resourcemanager scheduler address: "xxx:8030"
Set jobhistory address: "xxxx:10020"
Set staging directory: "/user"


Authentication:
Use Kerberos authentication
Resource manager principal: "xx"
Job history principal: "xxx"

Use a keytab to authenticate
Principal: "xxxxx"; Keytab: "hard-coded path"
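
For comparison, the settings above map roughly onto standard Hadoop/YARN properties plus a keytab login, sketched here with plain Hadoop API calls (the property keys are standard Hadoop ones, but whether Talend sets exactly these internally is an assumption; hosts, principal, and keytab path are placeholders, as in the post):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class YarnKerberosConfigSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // YARN endpoints from the component configuration (hosts masked).
        conf.set("yarn.resourcemanager.address", "xxxxxxx:8032");
        conf.set("yarn.resourcemanager.scheduler.address", "xxx:8030");
        conf.set("mapreduce.jobhistory.address", "xxxx:10020");
        conf.set("yarn.app.mapreduce.am.staging-dir", "/user");

        // Kerberos: log in from the keytab before any YARN client or
        // SparkContext is created; if this step is missing or wrong, the
        // application master cannot launch and the job fails as in the log.
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("xxxxx", "/path/to/user.keytab");
    }
}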