Five Stars

How does the unique matching work in tFuzzymatch component?

Hi all,

I am using the Double Metaphone method in the tFuzzyMatch component to compare a column from the main table with a reference column from the lookup table. If the "unique matching" option is checked, only the best match is given in the final output. But may I know how "unique matching" actually works? Why does it pick that one as the best match? What algorithm or criteria does it use for the comparison?

 

For example:

(unique matching is unchecked)

Words_Main | VALUE | MATCHING

Oakland County Sheriff | AKLN | Oakland Academy,Oakland International Academy

 

(unique matching is checked)

Words_Main | VALUE | MATCHING

Oakland County Sheriff | AKLN | Oakland Academy

 

Why is "Oakland Academy" considered a better match than "Oakland International Academy"?

 

Thank you very much!!

 

Accepted Solutions
Ten Stars

Re: How does the unique matching work in tFuzzymatch component?

It looks like the generated code iterates over the list of lookup values until it finds a match, and then stops. So with unique matching checked it returns the first match it finds; it does not compare all possible matches to pick a "best" one.
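To illustrate the difference, here is a minimal Python sketch of "collect every match" versus "stop at the first match". It is not Talend's generated code (that is Java), and `toy_key`, `match_all` and `match_first` are hypothetical names made up for this example. `toy_key` is only a crude stand-in for a phonetic encoder, not Double Metaphone. It truncates keys to four characters on the assumption, suggested by the four-character VALUE in the question, that the real codes are truncated too, which would explain why different names starting with "Oakland" end up with the same key.

```python
def toy_key(text, max_len=4):
    """Hypothetical stand-in for a phonetic encoder (NOT Double Metaphone):
    keep the first letter, drop later vowels, truncate to max_len."""
    letters = [c for c in text.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0] + "".join(c for c in letters[1:] if c not in "AEIOU")
    return key[:max_len]


def match_all(main_value, lookup_values, encode=toy_key):
    """Unique matching unchecked: return every lookup value with the same key."""
    key = encode(main_value)
    return [value for value in lookup_values if encode(value) == key]


def match_first(main_value, lookup_values, encode=toy_key):
    """Unique matching checked (as described above): stop at the first hit."""
    key = encode(main_value)
    for value in lookup_values:
        if encode(value) == key:
            return value  # no comparison against later candidates
    return None


lookup = ["Oakland Academy", "Oakland International Academy"]
main = "Oakland County Sheriff"

print(match_all(main, lookup))          # both lookup rows share the key
print(match_first(main, lookup))        # first row in lookup order
print(match_first(main, lookup[::-1]))  # reversed order gives the other row
```

If the component really behaves this way, the row that gets returned depends only on the order of the lookup rows, so sorting the lookup flow differently would change the "best" match.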

Five Stars

Re: How does the unique matching work in tFuzzymatch component?

Oh, so Talend defines the first match as the "best" one in tFuzzyMatch? Is there any way I can see the source code for this component? Thanks.
Ten Stars

Re: How does the unique matching work in tFuzzymatch component?

Drop the component in a job, configure it, then click on the code tab under the job area.
Moderator

Re: How does the unique matching work in tFuzzymatch component?

Hello,

Next to the Designer tab there is a Code tab that lets you view the source code for the entire Job. Alternatively, the Code view next to the Outline shows the generated code for the selected component.

Best regards

Sabrina

 
