A Talend Spark job fails while processing a large XML message (about 1.2 GB in size).
We have two core nodes with 64 GB RAM each and one master node with 64 GB.
We are running the job in YARN client mode, with the attached Spark configuration.
[ERROR]: org.apache.spark.internal.io.SparkHadoopWriter - Aborting job job_id.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 6 in stage 0.0 failed 4 times, most recent failure: Lost task 6.3 in stage 0.0 (TID 17, ip-XX-XXX-XXX-XXX.XXXXX.com, executor 6): ExecutorLostFailure (executor 6 exited caused by one of the running tasks) Reason: Container marked as failed: container_id on host: ip-XX-XXX-XXX-XXX.XXXXX.com. Exit status: 50. Diagnostics: Exception from container-launch.
Container id: container_id
Exit code: 50
Stack trace: ExitCodeException exitCode=50:
Please let me know your inputs.
Can you please clarify which Talend Big Data version/edition you are using?
Here is a related Jira issue: https://jira.talendforge.org/browse/TBD-4872 — it is fixed in v6.4.1.
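If upgrading is not immediately possible: ExecutorLostFailure with container exit status 50 on YARN is frequently a symptom of executor memory pressure, which is plausible when parsing a 1.2 GB XML message. As a hedged sketch only (these property values are illustrative guesses for 64 GB worker nodes, not taken from your attached configuration, and the overhead property name varies by Spark version — older Spark releases use `spark.yarn.executor.memoryOverhead` instead of `spark.executor.memoryOverhead`), the kind of settings worth experimenting with in the job's Spark configuration looks like:

```
# Illustrative values only - tune to your cluster and Spark version
spark.executor.memory            16g
spark.executor.memoryOverhead    4g
spark.executor.cores             4
spark.driver.memory              8g
```

Raising executor memory (and overhead) reduces the chance of YARN killing containers mid-task, at the cost of fewer concurrent executors per node.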