When using a MapR 5.1 distribution secured with a MapR ticket, the Spark Jobs of some Windows users get stuck in a loop, repeatedly writing the following message to the execution log of the Job (managed by log4j):
[WARN ]: org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
The root cause of the issue appears in the YARN log:
Exception in thread "ContainerLauncher #0" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/MRJobConfig
This error means that YARN, while trying to run the Spark Job, cannot resolve dependencies required by the MapR ticket mechanism that are available only in the MapReduce directory. You must therefore add that path to the YARN classpath.
However, changing the YARN classpath cluster-wide could adversely impact other applications running on the cluster. To avoid this, add the classpath for your current Job only:
Note: These steps cover only how to add the MapReduce classpath for a Spark Job; they do not describe the rest of the configuration the Spark Job requires.
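As an illustration, a per-Job classpath can be supplied at submission time through the `spark.hadoop.yarn.application.classpath` property, which Spark passes to its YARN client without touching the cluster-wide configuration. This is a hedged sketch, not the exact procedure for your environment: the paths assume a default MapR 5.1 layout under /opt/mapr/hadoop/hadoop-2.7.0, and my_spark_job.jar is a placeholder name; adjust both to match your installation.

```shell
#!/bin/sh
# Sketch: add the MapReduce directory to the YARN classpath for this Job only.
# Assumption: default MapR 5.1 Hadoop layout; adjust HADOOP_HOME to your install.
HADOOP_HOME=/opt/mapr/hadoop/hadoop-2.7.0

# Per-Job YARN application classpath. The .../share/hadoop/mapreduce/* entry is
# the one that provides org/apache/hadoop/mapreduce/MRJobConfig to the containers.
YARN_CP="$HADOOP_HOME/etc/hadoop,\
$HADOOP_HOME/share/hadoop/common/*,\
$HADOOP_HOME/share/hadoop/common/lib/*,\
$HADOOP_HOME/share/hadoop/hdfs/*,\
$HADOOP_HOME/share/hadoop/yarn/*,\
$HADOOP_HOME/share/hadoop/mapreduce/*"

# Submit the Job with the per-Job classpath (my_spark_job.jar is a placeholder).
# Guarded so the script is a no-op on machines without spark-submit installed.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit \
    --master yarn \
    --conf "spark.hadoop.yarn.application.classpath=$YARN_CP" \
    my_spark_job.jar
fi
```

If the Job is designed in a graphical tool rather than submitted from the command line, the same `yarn.application.classpath` key can usually be set as a Hadoop advanced property in the Job's Spark configuration, with the identical comma-separated value.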
TBD-3612 Job Spark on MAPR 5.1 Ticket cluster blocked on "Initial job has not accepted any resources"