Seven Stars

different results for tUniqueRow and tAggregateRow

Hi guys,

 

I have a dataset with only one column and a number of rows. Why am I getting different results from tUniqueRow and tAggregateRow?

 

[Screenshots: Screen Shot 2017-06-08 at 4.26.43 PM.png, Screen Shot 2017-06-08 at 4.29.24 PM.png]

 

Thanks !

12 REPLIES
Nine Stars

Re: different results for tUniqueRow and tAggregateRow

Maybe because one counts unique rows and the other counts all rows? :-)

In other words, you have duplicates in this filtered column.

-----------
Seven Stars

Re: different results for tUniqueRow and tAggregateRow

But the input to both components is the same and has only one column, and yes, this one column contains duplicate values. How can the results be different?

I think tUniqueRow is using a fuzzy match for uniqueness, so its row count is 33 lower than that of tAggregateRow.

 

Thanks for the reply @vapukov.

Nine Stars

Re: different results for tUniqueRow and tAggregateRow

Unique means unique :-)

For 1,1,2,3,3,3 the unique values are 1,2,3, with no other variants.

 

[Screenshot: Screen Shot 2017-06-08 at 11.31.05 PM.png]

-----------
Seven Stars

Re: different results for tUniqueRow and tAggregateRow

Okay 

You mean to say for 1,2,3,3,4,4,5,5,5,6,6,6,6

 

Unique = 1,2

Aggregate = 1,2,3,4,5,6 

 

Right ? @vapukov

 

Nine Stars

Re: different results for tUniqueRow and tAggregateRow

No, but I think I see what you mean.

If you have only a single column and group by that column, the result must be the same.
Since it is not, you need to redirect both flows to files and compare them with diff.

To make the comparison easier, you can sort the column before writing it to the file.

What settings do you use in both components?
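If you would rather not go through files, the same sort-and-diff idea can be sketched in plain Java. This is only an illustration under assumptions: the class `FlowDiff`, the method `onlyInFirst`, and the sample vendor values are all made up for the example and are not part of Talend.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not a Talend class.
class FlowDiff {
    // Returns, in sorted order, the values present in `first` but missing from `second`.
    static SortedSet<String> onlyInFirst(Collection<String> first, Collection<String> second) {
        SortedSet<String> diff = new TreeSet<>(first); // sorts and removes exact duplicates
        diff.removeAll(second);
        return diff;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the two component outputs.
        List<String> aggregateOutput = Arrays.asList("Acme", "ACME", "Globex");
        List<String> uniqueOutput = Arrays.asList("Acme", "Globex");

        // Values that one flow kept and the other dropped.
        System.out.println(onlyInFirst(aggregateOutput, uniqueOutput));
    }
}
```

Whatever shows up in the difference is the set of values that the two components treat differently, which usually points straight at the cause.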

-----------
Ten Stars

Re: different results for tUniqueRow and tAggregateRow

What is the data type of the single column in your schema?  How do you have your tAggregateRow component configured?

Seven Stars

Re: different results for tUniqueRow and tAggregateRow

@vapukov, below are screenshots of the basic settings of both components.

 

tUniqueRow setting:

 

[Screenshot: Screen Shot 2017-06-08 at 5.00.56 PM.png]

 

tAggregateRow setting:

 

[Screenshot: Screen Shot 2017-06-08 at 5.00.32 PM.png]

 

@cterenzi, the data type of the single column 'VENDOR_NAME' is String.

Ten Stars

Re: different results for tUniqueRow and tAggregateRow

I was hoping you'd say it was BigDecimal or something and this was a precision issue.

tUniqRow by default does a case insensitive comparison. If your data has different casing for the same vendor name, this would result in fewer values output by tUniqRow. If you want tUniqRow to be case sensitive, you can check the box in its settings (next to Key attribute).
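To see why the counts differ, here is a rough plain-Java sketch of the two behaviours. It is not the code Talend generates; the class `DedupCaseDemo`, its methods, and the vendor names are assumptions for illustration, and it assumes the column has no null values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical demo class, not a Talend component.
class DedupCaseDemo {
    // Distinct values using exact String equality, so "Acme" and "ACME" stay separate.
    static Set<String> distinctCaseSensitive(List<String> values) {
        return new LinkedHashSet<>(values);
    }

    // Distinct values compared on a lower-cased key; the first spelling seen is kept.
    static List<String> distinctCaseInsensitive(List<String> values) {
        Set<String> seenKeys = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String value : values) {
            if (seenKeys.add(value.toLowerCase(Locale.ROOT))) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up vendor names with mixed casing.
        List<String> vendors = Arrays.asList(
                "Acme Corp", "ACME CORP", "acme corp", "Globex", "Initech", "initech");

        System.out.println(distinctCaseSensitive(vendors).size());   // 6
        System.out.println(distinctCaseInsensitive(vendors).size()); // 3
    }
}
```

The same input gives six rows under exact matching and three under case-insensitive matching, which is the kind of gap you are seeing between the two components.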
Seven Stars

Re: different results for tUniqueRow and tAggregateRow

Oh, I hadn't noticed that. Thanks for pointing it out, @cterenzi.

Now I'm getting the same row count. Which one is better to use: tUniqueRow or tAggregateRow?

Ten Stars

Re: different results for tUniqueRow and tAggregateRow

tUniqRow is a bit more explicit about its purpose, and offers a Duplicates output which can be handy for noting what values were stripped from the flow.
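As a rough picture of what a separate duplicates flow gives you, the sketch below splits rows into first occurrences and repeats. The class `UniqueSplit` and its fields are hypothetical names for this example only, it uses exact (case sensitive) matching, and it is not how Talend implements the component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a Talend class.
class UniqueSplit {
    final List<String> uniques = new ArrayList<>();    // first occurrence of each value
    final List<String> duplicates = new ArrayList<>(); // every repeated occurrence

    static UniqueSplit of(List<String> rows) {
        UniqueSplit split = new UniqueSplit();
        Set<String> seen = new HashSet<>();
        for (String row : rows) {
            if (seen.add(row)) {
                split.uniques.add(row);
            } else {
                split.duplicates.add(row);
            }
        }
        return split;
    }

    public static void main(String[] args) {
        UniqueSplit split = UniqueSplit.of(Arrays.asList("1", "1", "2", "3", "3", "3"));
        System.out.println(split.uniques);    // [1, 2, 3]
        System.out.println(split.duplicates); // [1, 3, 3]
    }
}
```

Keeping the rejected rows around makes it easy to check afterwards exactly which values were dropped.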
Nine Stars

Re: different results for tUniqueRow and tAggregateRow

Sorry, I was asleep :-)

So, good to know the problem is resolved!

-----------
Seven Stars

Re: different results for tUniqueRow and tAggregateRow

It's okay, @vapukov. Thanks for your help :-)