Japanese Characters: ensuring proper data handling

JPG (Three Stars)

All, 

 

Talend newbie here; I was told to use Talend Open Studio for Data Integration as part of a current project. 

 

I've been searching the documentation and the forums for authoritative, all-encompassing guidance on this topic, but I've not been able to find it, so I'm hoping someone here can point me in the right direction. 

 

Situation

There are hundreds of Excel spreadsheets with data in Japanese (e.g., "レター対応"). I need to transform this data and output it to CSV files for loading into other systems. What must I do to ensure the Japanese data is preserved and not garbled into something unintelligible, such as a string of question marks?

 

 

I'm sure that Talend can do this; the mere existence of the Japanese discussion forums within the community is highly encouraging. But the suggestions I've seen are scattered across many forum posts, so it would be easy for me to miss something. Any ideas?

 

Thanks,

JPG.

 

PS. Here are the most promising posts that I've already read: 

 

 

Community Manager

Re: Japanese Characters: ensuring proper data handling

If you simply want to read the characters and write them to CSV (without translation), the main thing is to make sure the character encoding you use is UTF-8. You can set this in a number of components (in some, it is under the advanced settings).
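To illustrate why the encoding matters, here is a minimal plain-Java sketch (Talend jobs compile to Java). It is not Talend-generated code; the class `Utf8CsvSketch` and its methods `roundTrip` and `writeCsv` are hypothetical names made up for this example. It shows the sample text turning into question marks under an encoding that cannot represent Japanese, and surviving when UTF-8 is used explicitly.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class for illustration only; not part of Talend.
final class Utf8CsvSketch {

    // Encode the text with the given charset and decode it again,
    // mimicking a write followed by a read that use the same encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // Write one comma-separated line per row, always as UTF-8 instead of
    // the platform default. No quoting or escaping is done in this sketch.
    static void writeCsv(Path file, List<String[]> rows) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String[] row : rows) {
                out.write(String.join(",", row));
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String sample = "レター対応";

        // ISO-8859-1 cannot represent Japanese, so each character becomes '?'.
        System.out.println(roundTrip(sample, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent it, so the text comes back unchanged.
        System.out.println(roundTrip(sample, StandardCharsets.UTF_8));

        // Write a small CSV file and read it back with the same encoding.
        Path file = Files.createTempFile("japanese", ".csv");
        writeCsv(file, Collections.singletonList(new String[] {"1", sample}));
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is that the encoding is named explicitly at every write and read, rather than left to the platform default. In a Talend job, the equivalent choice would be the encoding setting on the file components mentioned above.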

If you want to translate to another language, I'd recommend using Google's translation API (https://cloud.google.com/translate/docs/).
