Enable parallel execution disabled

Seven Stars

Hello,

I am using the 6.2 Enterprise version with parallel execution, but in tFilterRow's advanced settings the parallel execution option is disabled. How can I enable parallel execution for tFilterRow?

Six Stars

Re: Enable parallel execution disabled

Hi mailforsaggy,

Could you attach a screenshot of your job so we can try to replicate the problem?

Is your stream already parallelised?

Cheers,

Six Stars

Re: Enable parallel execution disabled

Hi mailforsaggy,

We don't have an enable-parallel option for components such as tFilterRow, tSortRow and tAggregateRow because, as I understand it, these operations should be executed on the whole set of records. Please let me know if you have already found a way to enable the option.

Thanks

Jilani Syed

Six Stars

Re: Enable parallel execution disabled

You can sort and parallelise, but you have to enable it on the link instead of on the component (screenshots below).

@jilanisyed, for aggregation or sorting you can still parallelise, but it makes sense to use a hash-key partition (on the same key that you want to aggregate or sort on) to make sure that the operations are done correctly. This exact topic, with an example, is well covered in the Talend DI Advanced training (ODT).
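To illustrate why the partition key matters, here is a minimal plain-Java sketch of the idea. It is not Talend's generated code; the class `HashPartitionSketch`, the `Row` type and the method names are all made up for the example. Rows are routed to a partition by the hash of the aggregation key, so every row with the same key lands in the same partition, and each partition can then be aggregated independently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustration only: hypothetical names, not Talend's generated code.
class HashPartitionSketch {

    // Hypothetical row type: a grouping key plus a value to sum.
    static final class Row {
        final String key;
        final int amount;

        Row(String key, int amount) {
            this.key = key;
            this.amount = amount;
        }
    }

    // The partition depends only on the key, so equal keys always share a partition.
    static int partitionOf(String key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Sum the amounts of one partition per key.
    static Map<String, Integer> aggregate(List<Row> bucket) {
        Map<String, Integer> sums = new HashMap<>();
        for (Row row : bucket) {
            sums.merge(row.key, row.amount, Integer::sum);
        }
        return sums;
    }

    // Partition by key hash, aggregate the partitions in parallel, then merge.
    static Map<String, Integer> sumByKey(List<Row> rows, int partitions) {
        List<List<Row>> buckets = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Row row : rows) {
            buckets.get(partitionOf(row.key, partitions)).add(row);
        }
        List<Map<String, Integer>> partials = buckets.parallelStream()
                .map(HashPartitionSketch::aggregate)
                .collect(Collectors.toList());
        // No key appears in two partitions, so a plain putAll cannot overwrite a partial sum.
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            merged.putAll(partial);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("a", 1), new Row("b", 2), new Row("a", 3), new Row("c", 5));
        System.out.println(sumByKey(rows, 3)); // prints {a=4, b=2, c=5}
    }
}
```

If the rows were spread over the partitions without regard to the key (round robin, for instance), the two "a" rows could end up in different partitions and each partition would emit its own partial sum for "a", which is the kind of wrong result you get when the partition key does not match the aggregation key.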

 

[Screenshots: Overview.PNG, Link.PNG]

Alternatively, you could use tPartitioner, tCollector, etc. to achieve the same thing, but I personally prefer to enable it at the link level as it makes the job less clumsy.

[Screenshot: Partition.PNG (Partitioner)]

Cheers,

Seven Stars

Re: Enable parallel execution disabled

Hello @AdrienAussie @jilanisyed, I am trying parallelism on a small section of my job. Without parallel execution it gives a different output, and with parallel execution it does not proceed past a certain number of records.

I don't understand what is wrong with the way I set up the parallelism.

Six Stars

Re: Enable parallel execution disabled

Hi Mailforsaggy,

Have you tried to at least validate the tMap (just export to a file, as you do with your tReplicate) and the parallelism?

The general idea of parallelism when you have an aggregation/sorting process is to group the records by the processing key (the aggregate-by key); otherwise it will give you unexpected results. Regarding the crash, what are you doing in the tJavaRow? Can you paste the code, or part of it?

Parallelism creates multiple threads, so maybe it is crashing simply because they are trying to write to the same file at the same time.

If that is the case (a file pointer issue), you could write everything to a tHashOutput and then read from a tHashInput into your file.
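In plain Java terms, the workaround looks roughly like the sketch below. This is only an analogy for the tHashOutput/tHashInput pattern, not what Talend generates, and the class `BufferThenWriteSketch`, its methods and the upper-casing "transformation" are invented for the example. The worker threads only append to a thread-safe in-memory buffer, and the file is written once by a single thread after all workers have finished.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustration only: hypothetical names, an analogy for tHashOutput followed by tHashInput.
class BufferThenWriteSketch {

    // Workers never touch the file; they only add to a thread-safe in-memory buffer.
    static List<String> processInParallel(List<String> rows, int threads)
            throws InterruptedException {
        Queue<String> buffer = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String row : rows) {
            // Stand-in for the per-row logic of a tJavaRow.
            pool.execute(() -> buffer.add(row.toUpperCase()));
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        // Arrival order depends on thread scheduling, so sort for a stable result.
        List<String> result = new ArrayList<>(buffer);
        Collections.sort(result);
        return result;
    }

    // A single thread writes the whole buffer once the parallel part is over.
    static void writeOnce(List<String> lines, Path target) throws IOException {
        Files.write(target, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        List<String> result = processInParallel(Arrays.asList("b", "a", "c"), 3);
        Path target = Files.createTempFile("rows", ".txt");
        writeOnce(result, target);
        System.out.println(Files.readAllLines(target)); // prints [A, B, C]
    }
}
```

The point of the design is that only one thread ever holds the file open, so there is nothing for the parallel threads to fight over.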