I have a job with 4 different sources and store each input table in a tHashOutput component, then read it later with tHashInput.
Now I have split the work into subjobs and want to do the same thing, but the hash components don't seem to work across jobs.
I know I can pass data from the parent to a child job via the tBuffer components, but I have 4 datasets I want to use in my subjob. That doesn't work with tBuffer, right?
Is there a way besides creating temp files?
Thanks for any answers :-)
If the datasets have the same schema, you can write them all to a single buffer output with a code value that identifies each dataset. If they are totally different datasets, I would consider reading directly from the DB in the subjob rather than passing the data in from the parent job. Or I would keep the data in temp files, as that will not overload the memory in this case.
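For illustration, the code value can be set with a tJavaRow (or a tMap expression) in front of the tBufferOutput; a minimal sketch, assuming a shared schema with a hypothetical extra String column datasetCode:

    // tJavaRow before tBufferOutput; use a different constant in each input flow
    output_row.datasetCode = "CUSTOMERS";   // e.g. "ORDERS", "ITEMS", ... in the other flows
    output_row.id = input_row.id;           // copy the payload columns unchanged
    output_row.name = input_row.name;

Downstream you can then split the rows back apart by filtering on datasetCode, e.g. with a tFilterRow.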
Thanks for your answer.
I will try the temp file approach.
The reason I want to load the input in the parent job is that I use the same input again in every subjob.
There are 6 subjobs (stage 1 to 6). Via the MetaServlet in TAC we set a context variable and start the parent job, which runs the stage named in the context.
At the end we trigger the ESB, which does its work and then calls the DI TAC again for the next stage.
I will report back with my solution.
You can use tHash components, but they keep everything in memory. With 4 different sources you would need 4 tHashOutput and 4 tHashInput components, 8 hash components in total, so memory usage gets huge as the data grows. It is fine to use them if the amount of data is small.
Could you please describe the 4 sources in a little more detail? Depending on the incoming sources, there are some options to reduce the memory issues.
One good option is to create a folder on the server in all environments and then use a tMap before the insert into the target; in the tMap basic settings you will find a temp data path option, so lookup data is buffered on disk instead of in memory. But this only works with the same schema.
That is the reason I mentioned both options: disk in the form of files, or memory in the form of hash. Based on the processing requirements and the available memory, he can choose the preferred method.
Here is my solution: I create temp files from the input data before the subjobs run.
For each input source the job creates a temp file and stores its path in the globalMap (tJava):
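In the tJava I put the paths under my own keys; a sketch assuming the files come from components named tCreateTemporaryFile_1 to _4, each of which publishes its path in the globalMap as <component>_FILEPATH:

    // tJava: collect the temp file paths under stable keys
    // (the component names are assumptions; adjust them to your job)
    globalMap.put("tempFileSource1", (String) globalMap.get("tCreateTemporaryFile_1_FILEPATH"));
    globalMap.put("tempFileSource2", (String) globalMap.get("tCreateTemporaryFile_2_FILEPATH"));
    globalMap.put("tempFileSource3", (String) globalMap.get("tCreateTemporaryFile_3_FILEPATH"));
    globalMap.put("tempFileSource4", (String) globalMap.get("tCreateTemporaryFile_4_FILEPATH"));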
To run the different subjobs I use a tRunJob component with "Use dynamic job" enabled. To select a specific stage I use the context.stage variable in the "Context job" field.
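The "Context job" field takes a Java expression for the job name; a sketch assuming the subjobs are named stage1 to stage6 and context.stage holds the number:

    // tRunJob basic settings, "Context job" field (an expression, not a full script;
    // the job names stage1..stage6 are assumptions)
    "stage" + context.stage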
Because the subjobs have to know where the temp data is stored, I pass the globalMap variables with the temp file paths to the child jobs as context parameters.
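The values in the tRunJob "Context Param" table are Java expressions as well, and the child job then reads the path from its own context. A sketch with hypothetical parameter names:

    // tRunJob "Context Param" table, one row per parameter (value expressions):
    //   tempFileSource1 -> ((String) globalMap.get("tempFileSource1"))
    //   tempFileSource2 -> ((String) globalMap.get("tempFileSource2"))
    // tJava in the child job, assuming a String context parameter
    // named tempFileSource1 is defined there:
    String path = context.tempFileSource1;
    System.out.println("stage input: " + path);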
If the subjob succeeds, the stage file is uploaded to an FTP server and the ESB gets a JMS message telling it where to find the file.
Thanks for your help :-)