tRunJob Dynamic Jobs Context Passing Not Working For Complex Objects


Hi all.  I've been working on a problem all day and I think I finally figured out what's going on.  I am using the paid version of Studio 7.1 big data.

 

I have a main job that will call a series of child jobs.  This works fine.  I am trying to use contexts to pass data from the parent to the child.  This always seems to work fine for simple objects (Strings, Integers, etc.).

 

I am trying to pass a ConcurrentHashMap (basically I am trying to do something almost exactly like this:  https://www.talendbyexample.com/talend-returning-values-from-subjobs.html). 

 

What I notice is that if I do NOT check "Use dynamic job", the Object context variable (aka the ConcurrentHashMap) is sent properly.  I can also manipulate it in the child job and the changes are available in the parent.  Exactly what I want!

 

The problem is that I do need to use dynamic jobs.  When I check this box, all of a sudden the Map is a String!!

I ended up putting printlns in tJava components like this:

System.out.println(context.sharedMap.getClass().toString());

 

I figured out that just by changing the dynamic job flag (and the job name of course), this will vary between:

class java.util.concurrent.ConcurrentHashMap [without dynamic flag]

class java.lang.String [with dynamic flag]

 

Please help me understand what is going on here and more importantly how to fix it.

 

If anyone wants to replicate it, simply do the example in the URL above and switch back and forth between dynamic and static child jobs.  I'd be curious if it's just me or if others have the same problem.

 

Thanks,

Tom

 


Accepted Solutions
Community Manager

Re: tRunJob Dynamic Jobs Context Passing Not Working For Complex Objects

Hi Tom,

 

I feel your pain with this. I came across this "feature" a while ago. The cause is that dynamic jobs actually run as an independent process. Essentially you are running them in a different virtual machine and starting them using command-line arguments, so every context value reaches the child as text. That is why your Map turns up as a String. However, there is a workaround. It's not ideal, but you can build routines to make it as reusable as possible.
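To make the cause concrete, here is a minimal sketch of what happens to any object once it has to travel as a command-line style argument. This is not Talend's generated code. The class `ArgumentHandOffDemo` and its two helper methods are hypothetical names used only to illustrate the effect described above.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ArgumentHandOffDemo {

    // Flattens a context value into a "name=value" argument.
    // Only the text form of the value survives this step.
    static String toArgument(String name, Object value) {
        return name + "=" + String.valueOf(value);
    }

    // All the receiving side can recover from the argument is a String.
    static Object fromArgument(String argument) {
        return argument.substring(argument.indexOf('=') + 1);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, String> sharedMap = new ConcurrentHashMap<>();
        sharedMap.put("status", "ok");

        Object received = fromArgument(toArgument("sharedMap", sharedMap));

        System.out.println(sharedMap.getClass()); // class java.util.concurrent.ConcurrentHashMap
        System.out.println(received.getClass());  // class java.lang.String
    }
}
```

The two printed class names match what Tom saw with and without the dynamic flag: the object itself never crosses the boundary, only its text form does.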

 

Essentially what you need to do is serialise your ConcurrentHashMap to a String (or create a new object which will serialise to a String). I have built a quick and (very) dirty example routine for serialising and deserialising a ConcurrentHashMap that will always hold String keys and values. Obviously this can be altered for different types.

 

package routines;


import java.util.Enumeration;
import java.util.concurrent.ConcurrentHashMap;

/*
 * user specification: the function's comment should contain keys as follows: 1. write about the function's comment.but
 * it must be before the "{talendTypes}" key.
 * 
 * 2. {talendTypes} 's value must be talend Type, it is required . its value should be one of: String, char | Character,
 * long | Long, int | Integer, boolean | Boolean, byte | Byte, Date, double | Double, float | Float, Object, short |
 * Short
 * 
 * 3. {Category} define a category for the Function. it is required. its value is user-defined .
 * 
 * 4. {param} 's format is: {param} <type>[(<default value or closed list values>)] <name>[ : <comment>]
 * 
 * <type> 's value should be one of: string, int, list, double, object, boolean, long, char, date. <name>'s value is the
 * Function's parameter name. the {param} is optional. so if you the Function without the parameters. the {param} don't
 * added. you can have many parameters for the Function.
 * 
 * 5. {example} gives a example for the Function. it is optional.
 */
public class ConcurrentHashMapWrapper {

    
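	// Serialises the map as key|value|key|value|
	// Assumes that neither keys nor values contain the '|' character.
	public static String serialiseConcurrentHashMap(ConcurrentHashMap<String,String> chm){
		String returnVal = null;
		
		if(chm!=null){
			StringBuilder sb = new StringBuilder();
			Enumeration<String> e = chm.keys();
			
			while(e.hasMoreElements()){
				String key = e.nextElement();
				// Append each pair so that every entry is kept, not only the last one
				sb.append(key).append("|").append(chm.get(key)).append("|");
			}
			
			returnVal = sb.toString();
		}
		
		return returnVal;
	}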
	
	
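	// Rebuilds the map from a String in the key|value|key|value| format.
	// Returns null for null or blank input.
	public static ConcurrentHashMap<String,String> deserialiseConcurrentHashMap(String data){
		ConcurrentHashMap<String,String> returnVal = null;
		
		if(data!=null&&!data.trim().isEmpty()){
			returnVal = new ConcurrentHashMap<String,String>();
			
			// The -1 limit keeps trailing empty strings, so an empty final value is not lost
			String[] pairs = data.split("\\|", -1);
			
			// pairs holds key, value, key, value, ... so step through two at a time
			for(int i=1; i<pairs.length;i=i+2){
				returnVal.put(pairs[i-1], pairs[i]);
			}
			
		}
		
		return returnVal;
	}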
    
}

This will allow the values to be passed to your child job when using Dynamic jobs.
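As a rough sketch of the hand-off, the parent turns the map into a String, the String travels in a String context variable, and the child rebuilds its own map from it. The names here are assumptions for illustration: `DynamicJobHandOff`, `serialise`, `deserialise` and the variable `sharedMapSerialised` are hypothetical, and the two helper methods are compact stand-ins for the routine, using the same pipe-delimited format so the example runs on its own.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DynamicJobHandOff {

    // Stand-in for the routine's serialise method: key|value|key|value|
    // Assumes keys and values never contain '|'.
    static String serialise(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            sb.append(e.getKey()).append('|').append(e.getValue()).append('|');
        }
        return sb.toString();
    }

    // Stand-in for the routine's deserialise method.
    static ConcurrentHashMap<String, String> deserialise(String data) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        String[] parts = data.split("\\|", -1);
        for (int i = 1; i < parts.length; i += 2) {
            map.put(parts[i - 1], parts[i]);
        }
        return map;
    }

    public static void main(String[] args) {
        // Parent side: build the map and flatten it before calling the child.
        ConcurrentHashMap<String, String> sharedMap = new ConcurrentHashMap<>();
        sharedMap.put("runId", "42");
        sharedMap.put("source", "orders");
        String sharedMapSerialised = serialise(sharedMap);

        // Child side: rebuild a map from the String it received.
        ConcurrentHashMap<String, String> rebuilt = deserialise(sharedMapSerialised);
        System.out.println(rebuilt.equals(sharedMap)); // true
    }
}
```

Note that the child works on a copy. Anything it puts into the rebuilt map stays in the child and is not visible to the parent.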

 

If you want to send the data back to your parent job, then you will have to go to extra lengths to achieve this. I would recommend using a database (if possible) or a flat file. The child job will need to write the serialised String to that shared location, and the parent job will need to read it back in once the child job has finished processing.
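For the flat file option, a minimal sketch of the write and read steps might look like this. It uses only standard `java.nio.file` calls; the class name `SerialisedMapFile`, its methods and the temporary file are assumptions for illustration. In a real job the parent and child would have to agree on the file location, for example by passing the path in a String context variable, which does survive the dynamic call.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SerialisedMapFile {

    // Child side: write the serialised map to the agreed file.
    static void write(Path file, String serialised) throws IOException {
        Files.write(file, serialised.getBytes(StandardCharsets.UTF_8));
    }

    // Parent side: read the serialised map back after the child has finished.
    static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the location both jobs would share.
        Path file = Files.createTempFile("sharedMap", ".txt");
        try {
            write(file, "runId|42|status|done|");
            String returned = read(file);
            System.out.println(returned); // runId|42|status|done|
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The String read by the parent can then be passed to the deserialise routine to get a map again.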

 

Granted, this is not ideal. But this will get you around the issue you are experiencing. 



All Replies

Five Stars

Re: tRunJob Dynamic Jobs Context Passing Not Working For Complex Objects

Wow.  I wasn't expecting such a detailed, well-thought-out response.  Thanks.  I get it now.  OK, I'll have to rethink the design and incorporate what you have suggested.

 

Thanks again.

Community Manager

Re: tRunJob Dynamic Jobs Context Passing Not Working For Complex Objects

No problem. I've been in exactly the same position, so knew what you'd need to point you in the right direction :-)
