Encoding issue with tFileOutputXML


Hello,
I'm using a tFileOutputXML to write a simple XML file. I must use ISO-8859-1 as the encoding, and this works well if I set it as a Custom encoding in Advanced Options. But if there is a character outside ISO-8859-1 (for instance "€"), Talend just outputs "?".
I expect Talend to encode it as "& #8364;" (without the space): this is correctly decoded back to "€" when I use a tFileInputXML. Why is this behavior not consistent?
A workaround is to set UTF-8 encoding on the tFileOutputXML and then use a transformation to get the XML in the mandatory encoding.
Has anyone had the same issue? Do you think a bug report or enhancement request for this has any chance of getting some attention?
Regards,
Eric
Edit: I'm using Talend 5.0.1
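The workaround described in this post (write UTF-8 first, then transform to the mandatory encoding) could look roughly like the sketch below. It is only an illustration, not a Talend feature: it runs a plain JAXP identity transform with the output encoding set to ISO-8859-1, and it relies on the JDK's built-in serializer, which is expected to write characters the target encoding cannot represent as numeric character references. The class and method names are hypothetical, and the input is passed as a string to keep the example short; a real job would read the UTF-8 file instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: re-serializes an XML document as ISO-8859-1.
class XmlReencoder {

    // Parses the given XML text and returns the same document as
    // ISO-8859-1 bytes, produced by an identity transform.
    static byte[] toLatin1(String xml) {
        try {
            // A Transformer created without a stylesheet copies input to output.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not re-encode XML", e);
        }
    }

    public static void main(String[] args) {
        // \u20AC is the euro sign, which ISO-8859-1 cannot represent.
        byte[] latin1 = toLatin1("<price>10 \u20AC</price>");
        System.out.println(new String(latin1, java.nio.charset.StandardCharsets.ISO_8859_1));
    }
}
```

Because the serializer decides how to write each character, the ampersand of the character reference is produced at output time and is not escaped a second time.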

Re: Encoding issue with tFileOutputXML

Hi
You have to use UTF-8 to read or write the special character "€". I don't think you can read it correctly from a file without UTF-8 encoding.

Re: Encoding issue with tFileOutputXML

€ is not part of the 8859-1 character set and so can never be written to a file encoded in 8859-1.
I expect talend to encode it to "€"

What? How exactly do you want Talend to change that? As said, it is not part of the character set.
If you want €, either use 8859-15 or change to UTF-8.

Re: Encoding issue with tFileOutputXML

Sorry, the forum broke everything. I'll edit my post: of course a € in a file encoded in 8859-1 wouldn't be possible. What I meant is "& #8364;" without the space.
I can't control the encoding, because I'm writing this file for a legacy app. Even ISO-8859-15 would be enough, but I simply can't use it.

Re: Encoding issue with tFileOutputXML

OK, yes, that makes more sense. Talend has no built-in way to do this. The best thing would be to set up a new routine and use this (not tested by me):
http://stackoverflow.com/questions/1273986/converting-utf-8-to-iso-8859-1-in-java
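A routine along those lines might look like the following minimal sketch. It is not taken from the linked answer: it simply replaces every character that ISO-8859-1 cannot encode with a decimal numeric character reference. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical routine: makes a string safe to write as ISO-8859-1 XML text.
class Latin1XmlEscaper {

    // Returns the text with every character outside ISO-8859-1 replaced
    // by a decimal numeric character reference such as &#8364;.
    static String escapeUnmappable(String text) {
        CharsetEncoder encoder = Charset.forName("ISO-8859-1").newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // Work on code points so surrogate pairs become one reference.
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            if (length == 1 && encoder.canEncode((char) codePoint)) {
                out.append((char) codePoint);
            } else {
                out.append("&#").append(codePoint).append(';');
            }
            i += length;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u20AC is the euro sign.
        System.out.println(escapeUnmappable("10 \u20AC")); // prints "10 &#8364;"
    }
}
```

The result is only useful if whatever writes the file afterwards leaves the ampersand alone. If the writer escapes XML special characters itself, the reference would be escaped a second time.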

Re: Encoding issue with tFileOutputXML

OK, yes, that makes more sense. Talend has no built-in way to do this. The best thing would be to set up a new routine and use this (not tested by me):
http://stackoverflow.com/questions/1273986/converting-utf-8-to-iso-8859-1-in-java

Won't the & be encoded when I use the tFileOutputXML, ruining this improvised encoding?
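The concern can be illustrated with the JDK's own XMLStreamWriter as a stand-in. Whether tFileOutputXML escapes text the same way is an assumption and is not confirmed anywhere in this thread. The class, method, and element names below are made up for the example.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo: shows what a standard XML writer does with text
// that already contains a numeric character reference.
class DoubleEscapeDemo {

    // Writes <price>text</price> and returns the serialized XML.
    static String writeElement(String text) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("price");
            // writeCharacters treats its argument as plain text and escapes
            // XML special characters, including '&'.
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
    }

    public static void main(String[] args) {
        // The pre-escaped reference comes out with its ampersand escaped again.
        System.out.println(writeElement("10 &#8364;"));
    }
}
```

With a writer that behaves like this, a reader would decode the value back to the literal text "&#8364;" rather than to the euro sign, which is exactly the problem raised in the question.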
