I want to count the number of rows, but in the case that there are no rows, I want to get a result saying zero, not nothing at all.
I tried tFixedFlowInput -> tUnite -> tAggregateRow, and selecting "Ignore null values", but my row structure just contains a string (that I am grouping by) and an int, and you can't set an int to null.
I could add a dummy column, but that's a bit messy, and I don't want to add any more components than I need.
I could just subtract 1 from the count, but that's an extra tJavaRow or tMap, and will lead to confusion.
If I want to do something more complex as well, like calculate a standard deviation, then the dummy row with a zero integer value will throw the statistics out.
I could tUnite a dummy row after the tAggregateRow and aggregate that in, but that means extra processing, an extra component, and more confusion.
Is there a "good" way to do this? There are plenty of messy ways.
Many components have built-in global variables. One that is common to a lot of them is the row count (number of lines processed). You would typically retrieve this after that subjob has completed.
For example, tFixedFlowInput's is
for something like tMSSqlInput you would use
This will also report if there are 0 rows.
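As a rough sketch of the idea, assuming the row count is stored in globalMap under a key of the form `<componentName>_NB_LINE` (the instance names `tFixedFlowInput_1` and `tMSSqlInput_1` are just illustrative), the lookup might look like the code below. A plain `HashMap` stands in for Talend's `globalMap` so it runs on its own, and `rowCount` is a hypothetical helper, not part of Talend.

```java
import java.util.HashMap;
import java.util.Map;

public class RowCountSketch {

    // Hypothetical helper: reads "<componentName>_NB_LINE" from a
    // globalMap-style map and falls back to 0 when no count is stored.
    static int rowCount(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return (value instanceof Integer) ? (Integer) value : 0;
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap; the values are made up for the demo.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFixedFlowInput_1_NB_LINE", 3);
        globalMap.put("tMSSqlInput_1_NB_LINE", 0);

        System.out.println(rowCount(globalMap, "tFixedFlowInput_1"));
        System.out.println(rowCount(globalMap, "tMSSqlInput_1"));
        // A component that never stored a count is treated as zero rows.
        System.out.println(rowCount(globalMap, "tMSSqlInput_2"));
    }
}
```

Inside a real job you would skip the helper and cast the value directly in a tJava or tMap expression in a later subjob, something like `((Integer)globalMap.get("tMSSqlInput_1_NB_LINE"))`, with the component name adjusted to match your own job.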