I want to count the number of rows, but in the case that there are no rows, I want to get a result saying zero, not nothing at all.
I tried tFixedFlowInput -> tUnite -> tAggregateRow, and selecting "Ignore null values", but my row structure just contains a string (that I am grouping by) and an int, and you can't set an int to null.
I could add a dummy column, but that's a bit messy, and I don't want to add any more components than I need.
I could just subtract 1 from the count, but that's an extra tJavaRow or tMap, and will lead to confusion.
If I want to do something more complex as well, like calculate a standard deviation, then the dummy row with a zero integer value will throw the statistics out.
I could tUnite a dummy row after the tAggregateRow and aggregate that in, but that's extra processing and an extra component and confusion.
Is there a "good" way to do this? There are plenty of messy ways.
Many components have built-in global variables. One that is common to a lot of them is the row count, or number of lines processed. You would typically retrieve it after that subjob has completed.
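For example, tFixedFlowInput's is ((Integer)globalMap.get("tFixedFlowInput_1_NB_LINE")), assuming the component still has its default name tFixedFlowInput_1.
For something like tMSSqlInput you would use ((Integer)globalMap.get("tMSSqlInput_1_NB_LINE")), again assuming the default name.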
This will also report if there are 0 rows.
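As a rough sketch of how that lookup might be wrapped, here is a small standalone Java example. It is not taken from a real job: the HashMap only stands in for the globalMap that Talend provides, the class and the rowCount helper are hypothetical names, and the component names tFixedFlowInput_1 and tMSSqlInput_1 are assumed defaults. The helper also falls back to 0 if the variable has not been set at all, for example when the component never ran.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class RowCountSketch {

    // Looks up "<componentName>_NB_LINE" and returns 0 when the entry
    // is missing or is not an Integer.
    public static int rowCount(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return (value instanceof Integer) ? (Integer) value : 0;
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap; in a real job the components
        // fill this in themselves.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFixedFlowInput_1_NB_LINE", 3); // assumed: 3 rows were read
        globalMap.put("tMSSqlInput_1_NB_LINE", 0);     // assumed: query returned no rows

        System.out.println(rowCount(globalMap, "tFixedFlowInput_1"));
        System.out.println(rowCount(globalMap, "tMSSqlInput_1"));
        // A component that never ran has no entry, so the fallback applies.
        System.out.println(rowCount(globalMap, "tFileInputDelimited_1"));
    }
}
```

In a job, the same expression would usually sit in a tJava (or similar) that runs after the subjob has finished, so the count is final by the time it is read.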