Process based on source count data

Four Stars

Process based on source count data

Hi,

 

We are using Talend Data Integration 7.x. I have an issue with one of my requirements.

 

My pipeline consists of the components tDBInput -> tMap -> tDBOutput. A huge volume of data comes in from the source tDBInput (around 10 million rows), so I have to filter the data into ranges, for example 1 to 1,000,000, then 1,000,001 to 2,000,000.
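To make that concrete, what I have in mind is one query per range, something like this (just a sketch; "id" and "source_table" are hypothetical names for an indexed numeric key and my source table):

SELECT * FROM source_table WHERE id BETWEEN 1 AND 1000000
SELECT * FROM source_table WHERE id BETWEEN 1000001 AND 2000000
-- ... and so on until all ~10 million rows are covered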

Can I process each filter sequentially so that the job will not fail?

I tried to use Row > Iterate, but an iterate connection cannot be used from tDBInput to tMap.

 

Please suggest a better approach. I can't run the chunks in parallel, as the data is huge and it might affect the server capacity.

Four Stars

Re: Process based on source count data

@xdshi, any suggestions?

Sixteen Stars

Re: Process based on source count data

You may have tDBInput -> tFileOutputDelimited, with the number of lines for each file defined in the tFileOutputDelimited Advanced settings.
This will create 1 to n files; you can then iterate over these files and do what you want.
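A rough layout for that approach (just a sketch with default component names; in the Advanced settings the option is usually "Split output in several files" with "Rows in each output file"):

// Subjob 1: dump the source once into fixed-size chunk files
tDBInput_1 --main--> tFileOutputDelimited_1
    (Advanced settings: enable "Split output in several files",
     "Rows in each output file" = 1000000)

// Subjob 2: process the chunk files one at a time, sequentially
tFileList_1 --iterate--> tFileInputDelimited_1 --main--> tMap_1 --main--> tDBOutput_1
    (tFileInputDelimited file name:
     ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")))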

TRF
Sixteen Stars

Re: Process based on source count data

@srkalakonda, does this help?

If so, please mark your case as solved.


TRF
Four Stars

Re: Process based on source count data

@TRF, we cannot use files as an intermediate area. Can we iterate based on parameter values?
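For example, is something like this feasible (just a sketch of what I have in mind, assuming tLoop driving a query built from the loop counter; the LIMIT/OFFSET syntax depends on the database, and "source_table"/"id" are hypothetical names)?

tLoop_1 --iterate--> tDBInput_1 --main--> tMap_1 --main--> tDBOutput_1

tLoop_1 (For loop): From = 0, To = 9, Step = 1   // 10 chunks x 1,000,000 rows
tDBInput_1 query (Java expression; ORDER BY keeps the paging deterministic):
"SELECT * FROM source_table ORDER BY id LIMIT 1000000 OFFSET "
    + ((Integer)globalMap.get("tLoop_1_CURRENT_VALUE")) * 1000000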
