One Star

Delimited file wizard uses incorrect tab character for Perl

I tried to submit this as a bug, but after I carefully filled in the form, the bug tracker itself reported an error that I could not get past after several retries.
So here it is, though I can't believe I'm the first to report it.
The delimited-file metadata wizard, when I select "tabulation" as the field delimiter, uses the character expression '\t'. In Perl, a single-quoted '\t' is a literal two-character string (a backslash followed by a t), not a tab. The correct expression is "\t" in double quotes. Unless I change it manually each time, tab-delimited fields are not separated.
This is true for older versions as well as TOS 4.2.0M2r53829. I'm running under various versions of Windows and Linux.
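For anyone who wants to check the quoting difference outside Talend, here is a minimal Perl sketch. The variable names are my own for illustration and are not taken from any generated job code.

```perl
use strict;
use warnings;

my $single = '\t';   # single quotes: no escape processing, so backslash + t
my $double = "\t";   # double quotes: \t is interpolated to one tab character

printf "single-quoted length: %d\n", length $single;   # 2 characters
printf "double-quoted length: %d\n", length $double;   # 1 character
```

Only the double-quoted form yields an actual tab character.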
Community Manager

Re: Delimited file wizard uses incorrect tab character for Perl

Hello
I use the corresponding character expression '\t' and it works for tab-delimited fields; I am working on v420M1. What error message do you get when you use the character expression '\t'?
Best regards
Shong
One Star

Re: Delimited file wizard uses incorrect tab character for Perl

Hi Shong, thanks for your reply.
This bug is actually a little more interesting than I had realized.
First, make sure you are testing in a Perl project.
Use the metadata wizard to create a tab-delimited file item, accepting its '\t' expression for tabulation.
If you are using an actual tab-delimited file for the wizard to interpret, it will correctly separate columns in the wizard.
What is more, if you drag your new metadata item to a job and create a tFileInputDelimited component, it will work correctly!
If you create a tFileOutputDelimited component, it will not do what you probably had in mind.
Instead of separating values with a tab character, it will separate values with two characters, a literal backslash character and a lower case t.
In tFileInputDelimited, the '\t' expression appears as an argument to the Perl split function, which interprets it as a regular expression. In a regular expression, \t does indeed stand for a tab character.
In tFileOutputDelimited, the '\t' expression appears as an argument to the Perl join function, which interprets it as a string. In this case \t within single quotes is a two-character literal, not a tab.
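The difference between the two components can be reproduced in plain Perl. This is only a sketch of the behaviour described above, not the code Talend actually generates; the variable names and sample data are assumptions for illustration.

```perl
use strict;
use warnings;

my $sep = '\t';   # the wizard's single-quoted expression: backslash + t

# split treats its first argument as a regex pattern, where \t matches a tab,
# so a real tab-delimited line is separated correctly.
my @fields = split $sep, "a\tb\tc";

# join uses its first argument as a plain string, so the two literal
# characters end up between the values.
my $line = join $sep, @fields;

print scalar(@fields), " fields read\n";
print "written line: $line\n";
```

Changing the separator to "\t" in double quotes makes both calls use a real tab, since a literal tab character in a regex matches itself.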