I have a mainjob that runs one or more subjobs:
1 - The mainjob triggers multiple subjobs; it also calls NodeJS, receives (Document) XML, and places it in (Object) context.configXML.
2 - context.configXML is passed to the subjob.
3 - The subjob reads (Document)context.configXML -> tXMLMap -> correct output.
But if I run the subjob with "Use an independent process to run subjob" enabled, I get an error:
java.lang.String cannot be cast to routines.system.Document
Could somebody please explain what is happening here?
It looks to me as if the context variables are dumped to a file as strings rather than being serialised as bytes, so casting back to Document doesn't work?
How do I solve this properly?
Talend "Documents" are a bit of a pain. Talend has its own Document class, and that can cause issues when converting from or to an org.w3c.dom.Document, for example. So you need to be aware of this. An easy way of sorting this (although not ideal) is to convert your Documents from another source into a String, remove the XML header, and use a tConvertType to convert it into an instance of the Talend Document class. It's a pain, but it works.
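For the "convert to a String and remove the XML header" step, something along these lines could go in a routine or tJava. This is only a rough sketch using the standard JDK XML APIs, assuming your source document is an org.w3c.dom.Document; the class and method names (XmlContextString, toHeaderlessString, parse) are made up for the example and are not part of Talend.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not a Talend class.
class XmlContextString {

    // Serialises a W3C DOM document to a String without the <?xml ...?> header.
    static String toHeaderlessString(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Parses XML text into a W3C DOM document (only used here to build test input).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<config><item id=\"1\">a</item></config>");
        // This String is what you would feed to tConvertType or a String context variable.
        String headerless = toHeaderlessString(doc);
        System.out.println(headerless);
    }
}
```

The resulting String is plain text, so it survives being handed around as a String before you turn it back into a Talend Document.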
However, in this situation I believe that Talend is forcing contexts to a String. The reason I believe this is that contexts passed into a job using the implicit context load or tContextLoad need to be in String format. This has been done for a number of reasons, an obvious one being that the value has to be loaded into an object that can handle all values. String meets that requirement. When you pass contexts between jobs in the same process, they can make use of the object information known about them. However, the independent process clearly switches to the presumption that the data is of unknown type (possibly stored in a flat file), and therefore the implicit String conversion takes place. To get round this, try passing your XML as a String without the XML header and see what that does. I believe it should work for you.
You can pass real objects from one job to another. I do it all the time. What is stopping you from doing this is setting your job to run in a different process. In this case the assumption is that the job is receiving context variables from a database, a file or a command line. As such, the default format for receiving those values is a String. In the majority of cases, this is sufficient. However, for passing objects it is not. In this situation you can still use serialisation, but Talend doesn't do the work for you; you will need to control it manually.
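As a rough idea of what "manually controlling" the serialisation could look like: the parent job encodes the object into a Base64 String, passes that String as the context value, and the child job decodes it again. This is a sketch using plain Java serialisation, assuming the object you pass implements java.io.Serializable (check this for whatever class you are passing). ContextObjectCodec and its methods are names invented for the example, not Talend APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.HashMap;

// Hypothetical helper, not a Talend class.
class ContextObjectCodec {

    // Parent job side: turn a Serializable object into a String-safe value.
    static String encode(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Child job side: rebuild the object from the String context value.
    static Object decode(String text) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(text);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        HashMap<String, String> settings = new HashMap<>();
        settings.put("endpoint", "example");

        String contextValue = encode(settings);   // what the parent would pass
        Object restored = decode(contextValue);   // what the child would rebuild

        System.out.println(restored.equals(settings));
    }
}
```

Bear in mind that the encoded String can get long for large objects, which may matter if the value ends up on a command line. For XML specifically, passing the text itself is usually simpler.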
Most of the things people think are impossible with Talend are in fact possible. But they build the product to make it easy to do as much as possible for people without computer science degrees, while leaving the door open to do much more complicated stuff for those who want to go a little beyond that. I kind of like the fact that my computer science degree is still useful in a world where sales guys want to sell development software with the tagline "even my granny can do it" :-)