result-set splitter

Seven Stars

Hi,

 

I have a large number of rows (read from a file) that my job processes.

The processing time is not proportional to the number of rows; it seems to grow roughly exponentially. The first 10,000 rows are processed in about 2 minutes, the second 10,000 in 5-6 minutes, and so on.

My job is designed to write the result to the DB on a SubJobOk link, which is followed once all rows have been parsed (from the file) and processed.

 

How could I split a result set (let's say 40k rows) into blocks of 10k rows, like a loop?

Ideally, I need a result-set splitter: a component that takes a row link as input and outputs 10k rows four times, consecutively.

Any idea how to achieve this?

 

Thank you,

Lorenzo

 


Accepted Solutions
Ten Stars

Re: result-set splitter

You can use the Header field of the file input components to skip a given number of rows. A tLoop component can execute a subjob a number of times, and it exposes a CURRENT_VALUE variable that you can reference in other components.
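
To make the arithmetic behind this suggestion concrete, here is a minimal Java sketch of how the Header value could be derived from the loop counter. The class and method names are hypothetical and exist only for illustration; the sketch assumes the loop counter starts at 0, and that the file input component also has a Limit field set to the block size so that each pass reads only one block.

```java
public class HeaderOffsets {

    // Rows to skip before reading block number `loopValue` (0-based),
    // plus any real header lines at the top of the file.
    static long headerFor(int loopValue, int blockSize, int realHeaderLines) {
        return realHeaderLines + (long) loopValue * blockSize;
    }

    // Number of loop iterations needed to cover all rows (rounds up).
    static int blockCount(long totalRows, int blockSize) {
        return (int) ((totalRows + blockSize - 1) / blockSize);
    }

    public static void main(String[] args) {
        int blockSize = 10_000;
        long totalRows = 40_000;
        for (int i = 0; i < blockCount(totalRows, blockSize); i++) {
            System.out.println("iteration " + i
                    + ": Header=" + headerFor(i, blockSize, 0)
                    + ", Limit=" + blockSize);
        }
    }
}
```

In the job itself, the Header field would probably hold an expression along the lines of `((Integer)globalMap.get("tLoop_1_CURRENT_VALUE")) * 10000`, assuming the loop component is named tLoop_1 and runs from 0 to 3 in steps of 1. Check the exact variable name in the Outline view of your own job. Note that each pass re-reads the file from the top in order to skip the earlier rows.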

All Replies
Fifteen Stars TRF

Re: result-set splitter

Hi,
Another option is to read the whole input file once and write its content to a tFileOutputDelimited with the split option enabled and the number of records per file set to 10k, for example.
Then, using a tFileList, iterate over the generated files to process the content 10k rows at a time.

TRF
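
For readers who want to see what the split-then-iterate approach amounts to outside of the Studio, here is a minimal plain-Java sketch. It is not what the Talend components generate; the class name, the `part_N.csv` file naming, and the `demoPartSizes` helper are all hypothetical, and the sketch assumes one row per line with no header line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class FileChunker {

    // Copies `input` into files of at most rowsPerFile lines each
    // and returns the generated files in order.
    static List<Path> split(Path input, Path outDir, int rowsPerFile) throws IOException {
        if (rowsPerFile <= 0) {
            throw new IllegalArgumentException("rowsPerFile must be positive");
        }
        Files.createDirectories(outDir);
        List<Path> parts = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input)) {
            BufferedWriter writer = null;
            try {
                String line;
                int rowsInPart = 0;
                while ((line = reader.readLine()) != null) {
                    // Start a new part file when none is open or the current one is full.
                    if (writer == null || rowsInPart == rowsPerFile) {
                        if (writer != null) {
                            writer.close();
                        }
                        Path part = outDir.resolve("part_" + parts.size() + ".csv");
                        parts.add(part);
                        writer = Files.newBufferedWriter(part);
                        rowsInPart = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    rowsInPart++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return parts;
    }

    // Demo helper: writes totalRows dummy rows to a temp file, splits it,
    // then iterates over the parts and returns the row count of each one.
    static List<Integer> demoPartSizes(int totalRows, int rowsPerFile) {
        try {
            Path dir = Files.createTempDirectory("chunks");
            Path input = dir.resolve("input.csv");
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < totalRows; i++) {
                rows.add("row" + i);
            }
            Files.write(input, rows);
            List<Integer> sizes = new ArrayList<>();
            for (Path part : split(input, dir.resolve("out"), rowsPerFile)) {
                // This is where each block would be processed on its own.
                sizes.add(Files.readAllLines(part).size());
            }
            return sizes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoPartSizes(25, 10));
    }
}
```

The loop over the returned paths plays the role of the tFileList iteration: each generated file is handled independently, so the per-block work starts fresh every time.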
