Context variable value is not passed to Spark Job (Yarn Cluster Mode) invoked by tRunJob component

Problem Description

 

A standard Job uses a tRunJob component to invoke a Big Data Spark Job that is configured with the Yarn Cluster Mode. The standard Job has two contexts, Context_A and Context_B, each with a context variable, testvar. The Big Data Spark Job has the same two contexts, Context_A and Context_B, and the same context variable, testvar. The tRunJob component in the standard Job is configured with the following options selected:

  • Transmit whole context
  • Use an independent process to run subJob

When the standard Job runs with the context Context_A, the expected behavior is that the context variable testvar in the Big Data Spark Job (invoked by the standard Job) takes the value from Context_A of the standard Job. The actual behavior is that testvar is set to the value from Context_A of the Big Data Spark Job. In other words, at runtime the context of the standard Job is not passed to the Big Data Spark Job.
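The difference between the expected and the actual behavior can be illustrated with a small toy model. This is only a sketch and not Talend's implementation: the class name, the effectiveContext helper, and the sample values are hypothetical, and the model simply assumes that the child Job's own context supplies defaults that transmitted parent values override.

```java
import java.util.HashMap;
import java.util.Map;

public class ContextTransmitDemo {

    // Hypothetical model: start from the child Job's own context values and
    // let any values transmitted by the parent Job override them.
    static Map<String, String> effectiveContext(Map<String, String> childDefaults,
                                                Map<String, String> transmittedByParent) {
        Map<String, String> effective = new HashMap<>(childDefaults);
        effective.putAll(transmittedByParent);
        return effective;
    }

    public static void main(String[] args) {
        // Context_A of the standard (parent) Job; the value is an assumed sample.
        Map<String, String> parentContextA = new HashMap<>();
        parentContextA.put("testvar", "value_from_standard_job");

        // Context_A of the Big Data Spark (child) Job; the value is an assumed sample.
        Map<String, String> childContextA = new HashMap<>();
        childContextA.put("testvar", "value_from_spark_job");

        // Expected behavior: the parent's context reaches the child and wins.
        String expected = effectiveContext(childContextA, parentContextA).get("testvar");

        // Behavior described above: nothing is transmitted, so the child's own value is used.
        String observed = effectiveContext(childContextA, new HashMap<>()).get("testvar");

        System.out.println("expected testvar = " + expected);
        System.out.println("observed testvar = " + observed);
    }
}
```

Under these assumptions, the first line printed shows the standard Job's value and the second shows the Spark Job's own value, which mirrors the symptom reported in Yarn Cluster Mode.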

 

Note: This problem does not occur if the Big Data Spark Job is configured with the Yarn Client Mode.

 

Root Cause

This is a known bug.

 

Solution

The issue is fixed in Talend 7.1.1 and 7.0.2.

For Talend 7.0.1, apply the patch Patch_20180710_TPS-2583_v1 as follows:

  1. Contact Talend Support to request patch Patch_20180710_TPS-2583_v1.

  2. Apply the patch by following the steps in the Readme file included in the patch zip file.

Version history
Revision #: 8 of 8
Last update: 06-13-2019 02:35 AM