Encoding issue with tFileOutputXML

One Star

Encoding issue with tFileOutputXML

Hello,
I'm using a tFileOutputXML to write a simple XML file. I must use ISO-8859-1 as the encoding, which works well if I set it as a Custom encoding in Advanced Options. But if there is a character outside ISO-8859-1 (for instance "€"), Talend just outputs "?".
I expect Talend to encode it as "& #8364;" (without the space): this is correctly decoded back to "€" when I use a tFileInputXML, so why is this behavior not consistent?
A workaround is to set UTF-8 encoding on the tFileOutputXML and then use a transformation to get the XML in the mandatory encoding.
Has anyone had the same issue? Do you think a bug report/request for enhancement for this has any chance of getting some attention?
Regards,
Eric
edit : I'm using talend 5.0.1
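
The workaround mentioned above (write UTF-8 first, then transform to the required encoding) could look roughly like this in Java. It is only a sketch under assumptions: the class name `Latin1Transcoder` and the method `toLatin1` are made up for illustration, and it relies on the JDK's built-in XSLT serializer, which as far as I know writes characters that the target encoding cannot represent as numeric character references. Other serializers may behave differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Talend.
class Latin1Transcoder {

    // Re-serializes an XML document as ISO-8859-1 bytes using an
    // identity transform (a Transformer created without a stylesheet).
    static byte[] toLatin1(String xml) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not re-encode XML", e);
        }
    }
}
```

In a real job the input would be the UTF-8 file written by tFileOutputXML rather than a string, but the output property is the part that matters.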
Community Manager

Re: Encoding issue with tFileOutputXML

Hi
You have to use UTF-8 to read or write the special character "€". I don't think you can read it correctly from a file without UTF-8 encoding.
One Star

Re: Encoding issue with tFileOutputXML

€ is not part of the 8859-1 character set and so can never be written to a file encoded in 8859-1.
I expect talend to encode it to "€"

What? How exactly do you want Talend to change that? As said, it is not part of the character set.
If you want €, either use 8859-15 or change to UTF-8.
One Star

Re: Encoding issue with tFileOutputXML

Sorry, the forum broke everything. I'll edit my post: of course a € in a file encoded in 8859-1 wouldn't be possible. What I meant is "& #8364;" without the space.
I can't control the encoding, I'm writing this file for a legacy app. Even ISO-8859-15 would be enough but I simply can't.
One Star

Re: Encoding issue with tFileOutputXML

Ok, yes, that makes more sense. Talend has no built-in way to do this. The best thing would be to set up a new routine and use this (not tested by me):
http://stackoverflow.com/questions/1273986/converting-utf-8-to-iso-8859-1-in-java
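
A routine along these lines might be a starting point. It is a minimal sketch, not taken from the linked answer: the class name `Latin1Refs` and the method `escapeNonLatin1` are hypothetical, and it only replaces characters that ISO-8859-1 cannot encode with decimal character references. It does not escape `&`, `<` or `>`.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical routine class, not part of Talend.
class Latin1Refs {

    // Replaces every code point that ISO-8859-1 cannot encode with a
    // decimal numeric character reference such as &#8364; for the euro sign.
    static String escapeNonLatin1(String input) {
        CharsetEncoder encoder = Charset.forName("ISO-8859-1").newEncoder();
        StringBuilder sb = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            int codePoint = input.codePointAt(i);
            int width = Character.charCount(codePoint);
            if (width == 1 && encoder.canEncode((char) codePoint)) {
                sb.append((char) codePoint);
            } else {
                // Supplementary characters (width 2) are never in ISO-8859-1.
                sb.append("&#").append(codePoint).append(';');
            }
            i += width;
        }
        return sb.toString();
    }
}
```

For example, `Latin1Refs.escapeNonLatin1("10 \u20AC")` gives `10 &#8364;`, while characters such as é that exist in ISO-8859-1 are left alone.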
One Star

Re: Encoding issue with tFileOutputXML

Ok, yes, that makes more sense. Talend has no built-in way to do this. The best thing would be to set up a new routine and use this (not tested by me):
http://stackoverflow.com/questions/1273986/converting-utf-8-to-iso-8859-1-in-java

Won't the & be encoded when I use the tFileOutputXML, ruining this improvised encoding?
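
The concern can be illustrated with the JDK's own StAX writer. This is only an illustration of how a standard XML writer treats text content. Whether tFileOutputXML behaves the same way is an assumption, not something confirmed in this thread, and the class name `EscapeDemo` and the element name `price` are made up.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, not part of Talend.
class EscapeDemo {

    // Writes <price>text</price> with a standard XML writer, which
    // escapes '&' in text content as required for well-formed XML.
    static String writeElementText(String text) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("price");
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
    }
}
```

Passing an already escaped string such as `&#8364;` through `writeCharacters` yields `&amp;#8364;`, so a reader would decode it to the literal text `&#8364;` rather than to the euro sign.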
