Process all rows in custom component

Hi,

It seems that components process one row at a time.

In some cases I need to micro-batch the rows, and in others I need to be sure I am looking at the whole data set.

How do I enforce a batch size and/or force all rows to be processed in one pass?

For example, if I have 2 million rows to push to a web service, I would rather send 10-50K rows per call than make 2 million separate web calls.

In this case I need to be able to provide a batch size.

Or I may need to calculate some aggregate value (for simplicity's sake). In this case I need to know all rows before I can do the math.

Employee

Re: Process all rows in custom component

Hi @bhupendra_patil,

 

This page gives some pointers about that kind of implementation.

 

At a high level, the studio will define "groups" of a particular size (assumed "big" from the component developer's point of view). To implement chunking/bulking, you need to define a "maxSize" option in your configuration and define the following callbacks (methods in your processor):

 

  1. BeforeGroup: reset a record buffer (list)
  2. ElementListener: if the buffer size is >= maxSize, flush it; then add the current record to the buffer
  3. AfterGroup: if the buffer is not empty then flush
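
The three callbacks above can be sketched roughly as follows. This is a minimal, framework-free illustration, not the actual Component Kit API: the class name `BufferingProcessor`, the `flusher` callback and the use of `String` as the record type are all hypothetical. In a real processor, these methods would be the ones declared as the BeforeGroup, ElementListener and AfterGroup callbacks, and "maxSize" would come from the component configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the buffering pattern; names are illustrative only.
class BufferingProcessor {
    private final int maxSize;                     // assumed to come from the component configuration
    private final Consumer<List<String>> flusher;  // hypothetical: e.g. one web service call per batch
    private List<String> buffer = new ArrayList<>();

    BufferingProcessor(int maxSize, Consumer<List<String>> flusher) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
        this.flusher = flusher;
    }

    // Step 1 (BeforeGroup): reset the record buffer.
    void beforeGroup() {
        buffer = new ArrayList<>();
    }

    // Step 2 (ElementListener): flush a full buffer, then keep the current record.
    void onElement(String record) {
        if (buffer.size() >= maxSize) {
            flush();
        }
        buffer.add(record);
    }

    // Step 3 (AfterGroup): flush whatever is left.
    void afterGroup() {
        if (!buffer.isEmpty()) {
            flush();
        }
    }

    private void flush() {
        // Hand over a copy so the consumer is not affected by the clear below.
        flusher.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}

class BufferingDemo {
    public static void main(String[] args) {
        BufferingProcessor processor =
                new BufferingProcessor(2, batch -> System.out.println("flushing " + batch.size() + " record(s)"));
        processor.beforeGroup();
        for (int i = 1; i <= 5; i++) {
            processor.onElement("row-" + i);
        }
        processor.afterGroup();
    }
}
```

With a maxSize of 2 and a group of five records, the demo flushes batches of 2, 2 and 1 records. Note that the batch boundaries also depend on the group size chosen by the studio, since the buffer is always flushed at the end of a group.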

 

This works for output components, but for transform components (that is, components with an output, like the use case you mentioned) you would need a patched version of the studio, since we added this support after the last available release.

 

Which version of the studio do you rely on?

 

Thanks,

Romain
Talend Component Kit Documentation: https://talend.github.io/component-runtime/
Employee

Re: Process all rows in custom component

Hello,

We have prepared a patch that adds the bulk processing feature to the latest Talend Open Studio milestone release, 7.1.1M1.

 

You will find a readme file with instructions to install the attached patch.
Download the latest milestone release: Talend Open Studio 7.1.1M1
