One Star

[resolved] multithreading in single job

Hi,
1. I created 4 jobs and enabled multi-threading, so the jobs run in parallel. But I have no idea how many threads can execute in parallel. Can I specify this? Where can I find these details?
2. As you can see in the image, I am using tFileList on folders that contain 5000 files each.
If I create a single job for one particular folder, can I use multi-threading in it (say, 4 files being read and inserted in parallel in one job)? Does Talend process these files in parallel?
Is it possible? If yes, how?
1 ACCEPTED SOLUTION

One Star

Re: [resolved] multithreading in single job

Hi rhall,
Thanks for your reply.
I found everything I need at http://community.talend.com:80/t5/Installing-and-Upgrading/uDIG-Viewer/m-p/1609#M7432 .

Thanks,
Pankaj
4 REPLIES
Employee

Re: [resolved] multithreading in single job

With multi-threading, a good rule of thumb for the number of threads you can run concurrently is n - 1, where n is the number of processor cores.
However, it very much depends on what else is going on on your system at the time.
When it comes to reading and writing files, multi-threading can be dangerous, as files don't like being written to by more than one process at a time. You may want to test this out.
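A minimal sketch of that rule of thumb in plain Java (the language Talend jobs are generated in). The class and method names here are made up for illustration; the only real API used is Runtime.availableProcessors(), and the result is a starting point to test, not a guaranteed optimum.

```java
public final class ThreadCountSketch {

    // Rule of thumb from the reply above: n - 1 threads for n cores,
    // but never fewer than 1 (a single-core machine still needs one worker).
    static int suggestedThreads(int cores) {
        return Math.max(1, cores - 1);
    }

    public static void main(String[] args) {
        // Number of cores the JVM reports as available on this machine.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores = " + cores
                + ", suggested threads = " + suggestedThreads(cores));
    }
}
```

If other heavy processes share the machine, a lower number may work better, so it is worth measuring with a few different values.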
One Star

Re: [resolved] multithreading in single job

Hi,
Multi-threading is working for multiple subjobs.
(I understand that the number of threads that can run depends on my system.)
But I was wondering: if I have 5000 files in a folder and create a job to read them and insert the data into a DB, how does Talend do it?
1. Does Talend read the files one by one, or can it read multiple files in parallel in a single job? There is no connection between the files; each one is just read and inserted.
2. Is there any difference between the community and enterprise versions?

Thanks,
Pankaj 
I was just trying to get maximum efficiency.
Employee

Re: [resolved] multithreading in single job

I am assuming you are using the tParallelize component to synchronise the jobs. This works by running the jobs that are hooked up to it in parallel. So if you have 3 jobs connected, you will have 3 jobs running concurrently. You mentioned that you will be loading the data into a DB. In that case, you may be able to make use of parallel inserts, depending on the DB (the Oracle components support this). This works slightly differently from the tParallelize component: you simply set the number of parallel executions you want, and the job will sort the rest out for you when it runs.
Talend does not support parallel processing in the Open Source product in the same way as it does in the Enterprise Edition. You may be able to find some multi-threading components on the Exchange, but apart from that you will not get that functionality.
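For reference, the general idea of "4 files being read and inserted at a time" can be sketched in plain Java with a fixed-size thread pool. This is only an illustration under assumptions, not how Talend implements parallel execution: the class name, loadAll and loadFile are hypothetical, and loadFile merely counts lines as a stand-in for the real read-and-insert step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class ParallelFileLoadSketch {

    // Hypothetical placeholder for "read the file and insert its rows into the DB".
    // It only counts lines so that the sketch stays self-contained.
    static int loadFile(Path file) throws IOException {
        return Files.readAllLines(file).size();
    }

    // Processes all files with at most `threads` files in flight at once
    // and returns the total number of lines handled.
    static long loadAll(List<Path> files, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (Path file : files) {
                tasks.add(() -> loadFile(file));
            }
            long total = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get(); // rethrows any failure from a task
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a few small sample files in a temporary directory.
        Path dir = Files.createTempDirectory("sketch");
        List<Path> files = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            files.add(Files.write(dir.resolve("f" + i + ".txt"), List.of("a", "b", "c")));
        }
        System.out.println("lines processed: " + loadAll(files, 4));
    }
}
```

Because each file is independent and only read, this pattern avoids the concurrent-write problem mentioned earlier; the inserts would still need a DB connection per worker or a connection pool.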