[resolved] Encoding to UTF-8 miss all latin special characters

Six Stars

Hello.
I am trying to get data from a Firebird DB and then store it in CSV files.
If I create the files as UTF-8, I lose all Latin special characters such as "é" or "ã"; they are replaced with "?".
Is there any way to write UTF-8 without losing those Latin special characters?
Thank you
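
For readers hitting the same symptom: a "?" in place of "é" or "ã" usually means the text was, at some step, encoded with a charset that has no mapping for that character. The following is only a minimal Java sketch of that effect, not code taken from the generated job; the class and method names are hypothetical, and US-ASCII is used purely as an example of a charset that cannot represent these characters.

```java
import java.nio.charset.StandardCharsets;

final class QuestionMarkDemo {
    // Encodes the text as US-ASCII and decodes it again.
    // US-ASCII cannot represent accented letters, so String.getBytes
    // substitutes the charset's replacement byte, which is '?'.
    static String roundTripAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Same round trip through UTF-8, which can represent every character.
    static String roundTripUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the sample independent of the source file encoding.
        String sample = "caf\u00e9 ma\u00e7\u00e3";
        System.out.println(roundTripAscii(sample));
        System.out.println(roundTripUtf8(sample));
    }
}
```

If the accented letters survive the UTF-8 round trip but not the other one, the loss is happening wherever the narrower charset is applied, not in the final UTF-8 write.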

All Replies
Moderator

Re: [resolved] Encoding to UTF-8 miss all latin special characters

Hi,
Could you please try adding "-Dfile.encoding=utf-8" to the JVM parameters of the job to see if it works?
Best regards
Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Six Stars

Re: [resolved] Encoding to UTF-8 miss all latin special characters

Hi Sabrina.
That's what I already have.
I also tried the tencoding component and the result is the same.
The file is converted/saved with UTF-8 encoding, but all Latin characters are replaced with "?".
I am using the latest TOS, but I have had this problem for as long as I have used TOS, since version 4.2.
Any more ideas?
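
One way to check whether the JVM argument is actually taking effect is to print the defaults from inside the JVM and to write a test file with an explicit charset. This is a hedged sketch in plain Java, assuming it is run with the same JVM arguments as the job; the class name, method names, and temp-file prefix are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

final class CharsetCheck {
    // Reports the file.encoding system property and the JVM default charset.
    static String describeDefaults() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset();
    }

    // Writes one line to a temporary CSV file with an explicit UTF-8 charset
    // (so the JVM default is not involved) and returns the raw bytes written.
    static byte[] writeUtf8AndReadBack(String line) {
        try {
            Path tmp = Files.createTempFile("charset-check", ".csv");
            try {
                Files.write(tmp, Collections.singletonList(line), StandardCharsets.UTF_8);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
        byte[] bytes = writeUtf8AndReadBack("\u00e9");
        // In UTF-8, U+00E9 is the two bytes C3 A9.
        System.out.printf("%02X %02X%n", bytes[0] & 0xFF, bytes[1] & 0xFF);
    }
}
```

If the explicit UTF-8 write produces the right bytes here, the "?" is probably introduced earlier, for example when the data is read or converted, rather than when the file is written.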
Six Stars

Re: [resolved] Encoding to UTF-8 miss all latin special characters

Hello.
I think it's related to the OS. I am using it in English, but the locale, keyboard, etc. are Portuguese.
The default Windows encoding is cp1252. I changed it to 65001 (UTF-8), but the result is the same.
I am running out of ideas... Any suggestions?
Thank you
Six Stars

Re: [resolved] Encoding to UTF-8 miss all latin special characters

Hello.
I worked around this with the following steps:
Changed the Windows code page to 65001 for both the current user's command session and the administrator session, using the command: chcp 65001
Then set all the files generated in the job to "UTF-8" and added the JVM argument: -Dfile.encoding="cp1252".
All the files are now in UTF-8 with all Latin special characters intact.
I hope this helps someone in the future.
Thank you!
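
One possible reading of why this combination helps (an assumption; nothing in this thread confirms it) is that the text reaching the job is cp1252-encoded, so it has to be decoded as cp1252 before it can be written out as UTF-8. A minimal Java sketch of that idea, with a hypothetical class and hypothetical method names:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Cp1252ToUtf8 {
    // "windows-1252" is the Java charset name for the cp1252 code page.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Decodes the bytes as cp1252, then re-encodes the text as UTF-8.
    static byte[] transcode(byte[] cp1252Bytes) {
        String text = new String(cp1252Bytes, CP1252);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Shows the failure mode: the same bytes decoded with the wrong charset.
    // A lone 0xE9 is not valid UTF-8, so the accented letter is lost here.
    static String misdecodeAsUtf8(byte[] cp1252Bytes) {
        return new String(cp1252Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] source = {(byte) 0xE9}; // "é" in cp1252
        byte[] utf8 = transcode(source);
        System.out.printf("%02X %02X%n", utf8[0] & 0xFF, utf8[1] & 0xFF);
        System.out.println("\u00e9".equals(misdecodeAsUtf8(source)));
    }
}
```

The point of the sketch is only that the decode charset and the output charset are separate choices; which of the two the JVM argument controls in a given job would need to be checked against the generated code.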

Moderator

Re: [resolved] Encoding to UTF-8 miss all latin special characters

Hi,
Great that the solution works. Thanks for your feedback and for sharing your solution with us.
Best regards
Sabrina