Record Count at ID level

Six Stars

Hi

I'm trying to achieve the equivalent of a SQL COUNT(*) with a GROUP BY on two fields within a Talend flow. At the moment we read a table and write a sorted, de-duplicated (sort/uniq) output to a MySQL table; a second subjob then reads the output of that first step and counts the number of rows per ID.

Any idea whether this could run in a single subjob flow, or should we keep the jobs separate and run the count in raw SQL?

Thanks

Dave

Nine Stars

Re: Record Count at ID level

Hi,

Use a tReplicate to split out the data you want to GROUP BY, so that the aggregation gets its own copy of the flow, and then use either tSortRow followed by tAggregateSortedRow, or tAggregateRow, depending on data volumes.
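To make the two aggregation options more concrete, here is a minimal plain-Java sketch of the underlying idea. It is not Talend's generated code: the class `GroupCount`, its method names, and the sample rows are all hypothetical, and each row is assumed to carry the two GROUP BY fields as its first two strings. `countHashed` keeps one counter per group and accepts rows in any order, roughly the role tAggregateRow would play; `countSorted` assumes the input is already sorted on both fields and only tracks the current group, roughly the role of tSortRow plus tAggregateSortedRow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Talend.
class GroupCount {

    // Hash-based count: rows may arrive in any order,
    // and one counter is held in memory per distinct group.
    static Map<List<String>, Long> countHashed(List<String[]> rows) {
        Map<List<String>, Long> counts = new LinkedHashMap<>();
        for (String[] row : rows) {
            List<String> key = Arrays.asList(row[0], row[1]); // the two GROUP BY fields
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    // Streaming count: the input must already be sorted on both fields,
    // so a group is complete as soon as the key changes.
    static Map<List<String>, Long> countSorted(List<String[]> sortedRows) {
        Map<List<String>, Long> out = new LinkedHashMap<>();
        List<String> current = null;
        long n = 0;
        for (String[] row : sortedRows) {
            List<String> key = Arrays.asList(row[0], row[1]);
            if (!key.equals(current)) {
                if (current != null) {
                    out.put(current, n); // emit the finished group
                }
                current = key;
                n = 0;
            }
            n++;
        }
        if (current != null) {
            out.put(current, n); // emit the last group
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows: {id, second grouping field}
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "A"});
        rows.add(new String[] {"2", "B"});
        rows.add(new String[] {"1", "A"});
        rows.add(new String[] {"1", "B"});

        System.out.println(countHashed(rows));

        // Sort a copy on both fields before using the streaming variant.
        List<String[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((String[] r) -> r[0]).thenComparing(r -> r[1]));
        System.out.println(countSorted(sorted));
    }
}
```

Both methods return the same counts per group; the difference in this sketch is that the hashed version needs memory proportional to the number of distinct groups, while the sorted version needs the sort up front but then holds only one group at a time.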

Regards David
Don't forget to give Kudos when an answer is helpful, or mark the answer as the solution.
