Four Stars

How to convert single width (hankaku) to double width (zenkaku) in talend mdm

Hi,

My data is in UTF-8 and mixes half-width (hankaku, often called single-byte) and full-width (zenkaku, double-byte) characters, so the same word can appear in either form.

For example, half-width "ｻﾝﾌﾟﾙ" and full-width "サンプル" both mean "Sample".

 

I want to use tFuzzyMatch in Talend, but first I want to make the data uniform by converting everything to full-width.

Does anybody know how to convert half-width (hankaku) to full-width (zenkaku) in Talend MDM?

 

 

  • Data Integration
3 REPLIES
Four Stars

Re: how to search same japanese data which is in double byte and single byte ? in Talend MDM

ICU4j
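As a rough sketch of the idea: ICU4J ships a "Halfwidth-Fullwidth" transliterator, but if adding the library is not an option, the JDK's own `java.text.Normalizer` with NFKC also folds half-width katakana into the full-width form. Note this is a swapped-in technique, not ICU4J itself, and NFKC additionally folds full-width Latin letters and digits down to plain ASCII, so it makes the data uniform rather than strictly "everything zenkaku". The class and method names below are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical helper; in Talend this could live in a custom routine.
final class WidthNormalizer {
    private WidthNormalizer() {}

    // Folds width variants to one canonical form using Unicode NFKC.
    // Half-width katakana become full-width katakana; full-width
    // Latin letters and digits become ASCII.
    static String toUniformWidth(String input) {
        if (input == null) {
            return null;
        }
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Half-width "sample": U+FF7B U+FF9D U+FF8C U+FF9F U+FF99
        String halfWidth = "\uFF7B\uFF9D\uFF8C\uFF9F\uFF99";
        // Full-width "sample": U+30B5 U+30F3 U+30D7 U+30EB
        String fullWidth = "\u30B5\u30F3\u30D7\u30EB";
        // Both forms compare equal after normalization.
        System.out.println(toUniformWidth(halfWidth).equals(fullWidth)); // prints true
    }
}
```

In a job, something like `WidthNormalizer.toUniformWidth(row1.name)` in a tMap expression ahead of tFuzzyMatch would apply it per row (`row1.name` is just a placeholder column).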
mbhushan wrote:

Hi,

I am facing this issue: I have data in Japanese, and I don't know whether it is stored in double-byte or single-byte form.

For example, double-byte "サンプル" and single-byte "ｻﾝﾌﾟﾙ" both mean "Sample". How can a search return both results?


 

Six Stars

Re: how to search same japanese data which is in double byte and single byte ? in Talend MDM

What data type are you using for this field?

Have you set the encoding of both your input and output components to "UTF-8"? If so, you should get the same data on both sides.

Hope this answers your question.


Thanks,
Sid
Please like the post if it is useful
Please put to resolved if it resolves your issue.