Four Stars

Reuse loaded data in multiple subjobs

Hello,

My question is whether there is a way (or what the best way is) to reuse data in subjobs without loading it again.

I have multiple jobs, each performing a specific task, for example importX, importY, importZ, exportA, exportB, ... (~30-40 of them).
They should be able to run on their own, but I also have a parent job that calls all of them in a row.

Most of the subjobs need data from a pretty big MySQL table, so there is a tMysqlInput to load this table in almost every job. The query takes about 5-10 seconds each time.
The data in this table does not change during the process, so it would be fine to load it just once.

Is there a way to load this data into memory once and then reuse it in any subjob, while still being able to run each subjob separately?

I'm searching for something like: "Was the data already loaded in another job before? Use it. If not, load it now."

So if I start each of these jobs separately, they load the data from the DB. But if I start the parent job, the data is loaded only once and all the other subjobs reuse it.

Thanks for your help :)

2 REPLIES
Six Stars dgm

Re: Reuse loaded data in multiple subjobs

I don't know of such a component.

I think it should be possible with big data. Spark provides a way to keep data in memory and manipulate it.

dgm

Five Stars

Re: Reuse loaded data in multiple subjobs

Hi,
Try using the tBufferOutput and tBufferInput components: read the table once and store the rows with tBufferOutput, then use tBufferInput in the subjobs wherever the data is needed.
Thanks,
Ram
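
For reference, the "was it already loaded? use it, otherwise load it now" behaviour asked for in the question can be sketched in plain Java as a lazily filled static cache. This is only a sketch and not a Talend component: the class name SharedTableCache, its methods, and the fake query are all hypothetical, and it assumes the subjobs run in the same JVM as the parent job so that they see the same static field. A job started on its own gets an empty cache and loads from the DB; under the parent job, only the first caller loads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of Talend: keeps the table rows in a static
// field so every caller in the same JVM shares one copy.
final class SharedTableCache {
    private static List<String[]> rows;   // null until the first load
    private static int loadCount = 0;     // only here to show how often we loaded

    // Returns the cached rows, running the loader only if nothing is cached yet.
    public static synchronized List<String[]> getOrLoad(Supplier<List<String[]>> loader) {
        if (rows == null) {
            rows = loader.get();
            loadCount++;
        }
        return rows;
    }

    public static synchronized int getLoadCount() {
        return loadCount;
    }

    // Drops the cached rows, e.g. at the end of the parent job.
    public static synchronized void clear() {
        rows = null;
    }

    public static void main(String[] args) {
        // Stand-in for the real MySQL query (assumption: rows as String arrays).
        Supplier<List<String[]>> fakeQuery = () -> {
            List<String[]> result = new ArrayList<>();
            result.add(new String[] {"1", "alpha"});
            result.add(new String[] {"2", "beta"});
            return result;
        };

        List<String[]> first = getOrLoad(fakeQuery);   // loads
        List<String[]> second = getOrLoad(fakeQuery);  // reuses the cached rows
        System.out.println("rows: " + first.size()
                + ", same instance: " + (first == second)
                + ", loads: " + getLoadCount());
    }
}
```

The cache holds the whole table in memory for as long as the JVM lives, so it only makes sense if the table fits comfortably in the heap and, as stated in the question, does not change during the run.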