[resolved] Filter by max value

One Star

Hello,
I'm new to Talend and I need to perform a specific operation, but I don't know how to do it or which component to use.
I have some rows grouped by one column (Column1). For each value of Column1, I want to get the row whose value in another column (Column2) is the highest: Max(Column2).
For example, given the following rows:
Column1 Column2 Column3 Column4 Column5
AAAA 14 EEEEEE FFFFFFF GGGGGG
AAAA 52 HHHHH IIIIIII JJJJJ
AAAA 8 KKKKK LLLLLL MMMMM
BBBB 34 NNNNNN OOOO PPPPP
BBBB 12 QQQQQ RRRRR SSSSS
CCCC 43 TTTTTT UUUUUU VVVVVV
CCCC 16 WWWWW XXXXXX YYYYYY
CCCC 23 ZZZZZZ ZZZZZZ ZZZZZZ
I want to get the following rows:
AAAA 52 HHHHH IIIIIII JJJJJ
BBBB 34 NNNNNN OOOO PPPPP
CCCC 43 TTTTTT UUUUUU VVVVVV
How can I do this?
I tried to use tAggregateRow, but I don't understand how to use it; the documentation is very sparse (or I'm not smart enough to understand it).
Can someone help me, please? Please be precise, because Talend is completely new to me.
Thanks
Lionel

Accepted Solutions
Seven Stars

Re: [resolved] Filter by max value

You should sort with tSortRow on Column1 and then on Column2 descending (as a numeric sort, so that 52 ranks above 8), and then use tUniqRow with Column1 as the key to pass through only the first row of each group.
(tAggregateRow can only give you max(Column2) grouped by Column1. You would then have to join the result back to the whole data flow to get the rest of the column values for the chosen row.)
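
To make the logic of both approaches concrete, here is a minimal sketch in plain Python. It is not Talend code and does not use any Talend API; the function names `max_row_per_group` and `max_row_via_aggregate_and_join` are hypothetical, and the rows are the sample data from the question with Column2 assumed to be an integer.

```python
# Sample rows from the question: (Column1, Column2, Column3, Column4, Column5).
# Column2 is assumed to be numeric.
rows = [
    ("AAAA", 14, "EEEEEE", "FFFFFFF", "GGGGGG"),
    ("AAAA", 52, "HHHHH", "IIIIIII", "JJJJJ"),
    ("AAAA", 8, "KKKKK", "LLLLLL", "MMMMM"),
    ("BBBB", 34, "NNNNNN", "OOOO", "PPPPP"),
    ("BBBB", 12, "QQQQQ", "RRRRR", "SSSSS"),
    ("CCCC", 43, "TTTTTT", "UUUUUU", "VVVVVV"),
    ("CCCC", 16, "WWWWW", "XXXXXX", "YYYYYY"),
    ("CCCC", 23, "ZZZZZZ", "ZZZZZZ", "ZZZZZZ"),
]


def max_row_per_group(rows):
    """Sort, then keep the first row per key (the sort + unique approach)."""
    # Sort step: Column1 ascending, then Column2 descending (numeric).
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))
    # Unique step: keep only the first row encountered for each Column1 value.
    seen = set()
    result = []
    for row in ordered:
        if row[0] not in seen:
            seen.add(row[0])
            result.append(row)
    return result


def max_row_via_aggregate_and_join(rows):
    """Aggregate max(Column2) per Column1, then join back to the full rows."""
    # Aggregate step: only the key and the maximum survive.
    maxima = {}
    for row in rows:
        key, value = row[0], row[1]
        if key not in maxima or value > maxima[key]:
            maxima[key] = value
    # Join step: recover the remaining columns of the rows that hold the maximum.
    return [row for row in rows if row[1] == maxima[row[0]]]


for row in max_row_per_group(rows):
    print(*row)
```

One difference worth noting: if two rows in the same group share the maximum Column2 value, the sort + unique version keeps exactly one of them, while the aggregate + join version in this sketch returns both.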

All Replies
One Star

Re: [resolved] Filter by max value

Thank you very much, it works!