Reuse loaded data in multiple subjobs



My question is whether there is a way (or what the best way is) to reuse data in subjobs without loading it again.

I have multiple jobs, each performing a specific task. For example importX, importY, importZ, exportA, exportB, ... (~30-40 of them).
They should be able to run on their own, but I also have a parent job that calls all of them in a row.

Most of the subjobs need data from a pretty big MySQL table, so there is a tMysqlInput to load this table in almost every job. The query takes about 5-10 seconds each time.
The data in this table does not change during the process, so it would be fine to load it just once.

Is there a way to load this data into memory once and then reuse it in any subjob, while still being able to run each subjob separately?

I'm looking for something like: "Was the data already loaded in another job before? Use it. If not, load it now".

So if I start each of these jobs separately, they load the data from the DB. But if I start the parent job, the data is loaded only once and all the other subjobs reuse it.

Thanks for your help :)

dgm

Re: Reuse loaded data in multiple subjobs

I don't know of such a component.


I think it should be possible with big data. Spark provides a way to keep data in memory and manipulate it.




Re: Reuse loaded data in multiple subjobs

Try using the tBufferOutput and tBufferInput components, so that you read the table once and store it in the buffer with tBufferOutput. Wherever it is needed in the subjobs, read it back with tBufferInput.
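
The "was it already loaded? use it, otherwise load it now" idea from the question can also be sketched in plain Java, for example as a user routine. This is only a minimal sketch under assumptions: `SharedDataCache` and `getOrLoad` are hypothetical names, not Talend classes, and it can only work when parent and subjobs run in the same JVM (that is, tRunJob is not set to run the subjob in an independent process).

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper (e.g. a user routine); not part of Talend itself.
final class SharedDataCache {

    // One static map per JVM: every job running in this JVM sees the same entries.
    private static final Map<String, List<Object[]>> CACHE = new ConcurrentHashMap<>();

    private SharedDataCache() {
        // utility class, no instances
    }

    // Returns the cached rows for 'key'; calls 'loader' only if nothing is cached yet.
    static List<Object[]> getOrLoad(String key, Supplier<List<Object[]>> loader) {
        return CACHE.computeIfAbsent(key, k -> loader.get());
    }

    // Drops all cached data, e.g. at the end of the parent job.
    static void clear() {
        CACHE.clear();
    }
}
```

Each subjob would call `SharedDataCache.getOrLoad("bigTable", loader)`, where the loader runs the MySQL query. Started on its own, a subjob finds the cache empty and loads from the DB. Started from the parent job, only the first call runs the query and the later subjobs get the same list back. Keep in mind that the whole table then stays in memory until `clear()` is called.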

