tMD5Sum component

Five Stars

Hi,

 

I have a scenario in which I need to compare the lines within a single file.

For example, file1 contains:

This is book.

This is car.

This is book.

 

Here I need to know how many lines are duplicates.

Thanks in advance!

Nine Stars

Re: tMD5Sum component

Please specify the expected output.

 

Regards,

Veeru Boppudi
Seven Stars

Re: tMD5Sum component

Hi 

 

You can do this with the tAggregateRow component, using the flow below:

 

tFileInput* --> tAggregateRow --> tFileOutput*

 

In tAggregateRow:

* Group by: data (or whatever the column is named)

* Operation: count on data (or whatever the column is named)
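For anyone who wants to see the logic behind "group by the column, then count", here is a minimal plain-Java sketch. It is not Talend's generated code; the class name DuplicateLineCount and the method countLines are hypothetical, and the sample lines are the ones from the original question.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Talend.
class DuplicateLineCount {

    // Group identical lines and count how often each one occurs,
    // keeping the order in which each line was first seen.
    static Map<String, Integer> countLines(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            // Start at 1 for a new line, otherwise add 1 to the existing count.
            counts.merge(line, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("This is book.", "This is car.", "This is book.");
        // Print each distinct line with its count, separated by a semicolon.
        countLines(lines).forEach((line, count) -> System.out.println(line + ";" + count));
    }
}
```

Each distinct line becomes one group and its count is the number of rows in that group, which is the same idea as putting the column in Group by and applying count in Operations.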

Regards
Aashish
Five Stars

Re: tMD5Sum component

Hi,

 

The expected output would be:

 

"This is book." occurs 2 times in the file

 

Regards,

Akash Rastogi

 

Eight Stars

Re: tMD5Sum component

I think the component you're after is tUniqRow.

Regards

David

Five Stars

Re: tMD5Sum component

Hi David,

 

I think the tUniqRow component is used for separating unique and duplicate records.

But in my scenario I want the exact count of how many times a particular sentence occurs.

 

-Akash 

Five Stars

Re: tMD5Sum component

Hi Aashish,

Could you please elaborate? I am unable to understand the solution.

My data is in a file; how can I use tAggregateRow on that?

 

-Akash

Seven Stars

Re: tMD5Sum component

@rastogi_akash To read the file, use tFileInputDelimited. Say your file has a schema with a single column, sentence, of type String. Then use tMap to generate an output that contains the same sentence column twice.

Now we have two output columns: sentence, and a second column (named cnt). Next, add tAggregateRow and edit the schema accordingly: change the data type of cnt from String to Integer, add sentence to Group by, and add cnt to Operations with the function count.

 

 

Your output will be:

sentence;cnt

a;1

b;2

c;1

 

You can use this result as your requirement dictates.
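As one possible way to use that result for the original question, the sketch below keeps only the sentences whose count is greater than 1 and returns them in the same sentence;cnt form. This is plain Java for illustration, not Talend's generated code; the class DuplicateReport and method duplicates are hypothetical names. In a job, the equivalent step would presumably be a filter placed after tAggregateRow (for example tFilterRow with a condition on cnt), but that wiring is an assumption and was not described in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration; not part of Talend.
class DuplicateReport {

    // Count every sentence, then report only those seen more than once
    // as "sentence;cnt" strings, in first-seen order.
    static List<String> duplicates(List<String> lines) {
        Map<String, Long> counts = lines.stream()
                .filter(line -> !line.trim().isEmpty())   // skip blank lines
                .collect(Collectors.groupingBy(Function.identity(),
                        LinkedHashMap::new, Collectors.counting()));

        List<String> report = new ArrayList<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {                   // keep duplicates only
                report.add(entry.getKey() + ";" + entry.getValue());
            }
        }
        return report;
    }
}
```

The input list could be loaded from a file with java.nio.file.Files.readAllLines. Sentences that occur only once are dropped, so the report lists just the duplicated sentences with their counts.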

 

 

Regards
Aashish