Reading Special Characters like Trade Marks and Register Marks

Four Stars

Hi,

 

My source has text data containing special characters such as trademark (™) and registered (®) marks. When I move it to the target as text, I get extra special characters before the registered marks and ? marks before single quotes. I have tried UTF-8 encoding and so on, but it makes no difference. Instead of the trademark sign (™), I see ¢.

 Cash+â„¢ Signature®

Is there any alternative solution to this?

Any thoughts?

Six Stars

Re: Reading Special Characters like Trade Marks and Register Marks

What is the encoding of the source file?

It looks like the source is not actually UTF-8, so reading it as UTF-8 does not interpret the source encoding correctly.
Regards
Abhishek KUMAR
Four Stars

Re: Reading Special Characters like Trade Marks and Register Marks

Yes, it is very tricky. I am actually reading from Hive and writing into Hive. Within the Talend server (the same Linux box) everything looks good, but when writing to HDFS on a different box I see all kinds of junk. Even tLogRow writes junk characters. Both boxes have the same character set.
Six Stars

Re: Reading Special Characters like Trade Marks and Register Marks

Please check the file's encoding in HDFS with the file command. Is it UTF-8? If not, convert it to UTF-8 with the iconv command and check whether everything looks right.
Regards
Abhishek KUMAR
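
The check-and-convert step suggested above can also be sketched in Python. This is only an illustration: the helper names is_valid_utf8 and to_utf8 are hypothetical, and ISO-8859-1 is an assumed source encoding, not something confirmed in this thread.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from an assumed source encoding to UTF-8 (similar in spirit to iconv)."""
    return data.decode(source_encoding).encode("utf-8")


# Assumption: the source was written as ISO-8859-1, where the registered sign is the single byte 0xAE.
latin1_bytes = "Signature\u00ae".encode("iso-8859-1")
print(is_valid_utf8(latin1_bytes))   # False: a lone 0xAE byte is not valid UTF-8

fixed = to_utf8(latin1_bytes, "iso-8859-1")
print(is_valid_utf8(fixed))          # True: the registered sign is now the two bytes 0xC2 0xAE
```

A strict decode like this only tells you the bytes are well-formed UTF-8. It cannot detect text that was already double-encoded, because that is still valid UTF-8.
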
Six Stars

Re: Reading Special Characters like Trade Marks and Register Marks

Strange, "Cash+â„¢ Signature®" does look like UTF-8. However, it might be going through a double conversion: UTF-8 bytes being read as a single-byte encoding and then re-encoded into UTF-8 as if they were not UTF-8 already. Do you have any intermediate step where the data is written in one encoding (e.g. UTF-8) but read in another (e.g. ISO-8859-1)?
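
To illustrate the double conversion, here is a minimal Python sketch that reproduces the garbled trademark sign and then reverses it. It assumes the intermediate step decoded the bytes as Windows-1252 (cp1252), which is a guess based on the „ character in the sample. The variable names are made up for the example, and nothing here is Talend-specific.

```python
# The intended text: Cash+™ Signature®
original = "Cash+\u2122 Signature\u00ae"

# Step 1: correct UTF-8 bytes are wrongly decoded as Windows-1252 (the assumed mistake).
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)   # Cash+â„¢ SignatureÂ®

# Step 2: reverse the mistake by re-encoding with the wrong codec and decoding as UTF-8.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)   # True
```

The trademark sign is three bytes in UTF-8 (0xE2 0x84 0xA2), which Windows-1252 displays as â„¢, matching the sample in the question. If the real pipeline used a different single-byte encoding, the codec name would need to change accordingly.
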
