One Star

[resolved] Encoding Problem with Special Characters

Hi all,
we're using TOS DI 5.2.0M3 and facing a problem mapping data from mssql to oracle (and even within the same mssql database) when the target field has a smaller size than the source field (varchar 132 to varchar 40) AND it contains special chars like Ü,Ä etc..
The String is cut to a length of 40 chars in a tmap component but a data truncation error occurs when saving this String using a tXXXOutput Component.
Does anyone have a solution where an encoding can be set in the tXXXOutput Components or any information how this problem can be solved?
Thanks in advance,
M.Meier
1 ACCEPTED SOLUTION

Accepted Solutions
One Star

Re: [resolved] Encoding Problem with Special Characters

Hi xdshi,
i did some further investigation, it pointed out i made a mistake on the mssql -> mssql mapping, the string had too much characters so the error message was correct.
Concerning mssql -> oracle it was a bit trickier. The destination oracle database (i had no influence on the schema) has varchar fields with exactly the same size as the source (i thought Smiley Wink) but the was a small difference is saw when i inspected the oracle schema:
the definition of the fields was varchar (254 BYTE) which means the field has space for 254 BYTE. Since we are working with multibyte characters (ÄÖÜ etc.) we need 4 bytes space for saving it in the database.
So my solution is to count the occurences of these multibyte chars in the source field and cut the string (len - 4*occurences). I'm sure it's not the bust way, but it works for now Smiley Wink
Regards,
M
7 REPLIES
Moderator

Re: [resolved] Encoding Problem with Special Characters

Hi,
Perhaps this KB article about Exception: Data truncation: Data too long for column will be helpful for you.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: [resolved] Encoding Problem with Special Characters

Hi xdshi,
thanks for your hint. I tried this, but got the same error. What i did to solve the problem was just to remove the selection of "die on error" on my output component. I now get the message "data truncation" in the log, but it commits the data completely anyway....
Regards,
M.Meier
Moderator

Re: [resolved] Encoding Problem with Special Characters

Hi,
The special characters shows well in target DB? There is no error in console but the error info "data truncation" in log?
Could you give us your job screenshot in forum which may shed light to us.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: [resolved] Encoding Problem with Special Characters

Hi xdshi,
i did some further investigation, it pointed out i made a mistake on the mssql -> mssql mapping, the string had too much characters so the error message was correct.
Concerning mssql -> oracle it was a bit trickier. The destination oracle database (i had no influence on the schema) has varchar fields with exactly the same size as the source (i thought Smiley Wink) but the was a small difference is saw when i inspected the oracle schema:
the definition of the fields was varchar (254 BYTE) which means the field has space for 254 BYTE. Since we are working with multibyte characters (ÄÖÜ etc.) we need 4 bytes space for saving it in the database.
So my solution is to count the occurences of these multibyte chars in the source field and cut the string (len - 4*occurences). I'm sure it's not the bust way, but it works for now Smiley Wink
Regards,
M
Moderator

Re: [resolved] Encoding Problem with Special Characters

Hi,
Thanks for your feedback and sharing your experience with us, I will make some investigation to find a better way for your issue.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: [resolved] Encoding Problem with Special Characters

Hi xdshi,
i think the problem in my case is a bad schema design, which i sadly can't change. But it might be a cool feature if the encoding can be set in the output elements, although i know encoding handling using java Strings is a bit tricky Smiley Wink
Regards,
M
Moderator

Re: [resolved] Encoding Problem with Special Characters

Hi,
Thanks your response.
In addition, For talend enterprise subscription version, there is "Dynamic schema" feature which allows you to design schema with an unknown column structure (unknown name and number of columns).I think it is a good option for "Schema" Issue.
You can refer to this article How to process changing data structure.
Best regards
Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.