Delimited file wizard uses incorrect tab character for Perl


I tried to submit this as a bug, but after I had carefully filled in the form, the bug tracker itself reported an error, which persisted through several retries.
So here it is, though I can't believe I'm the first to report it.
When I select "tabulation" as the field delimiter in the delimited file metadata wizard, it uses the character expression '\t'. In Perl, a single-quoted '\t' is a literal two-character string (a backslash followed by a t), not a tab. The correct expression is "\t". If I don't manually change it each time, it will not separate tab-delimited fields.
This is true for older versions as well as TOS 4.2.0M2r53829. I'm running under various versions of Windows and Linux.
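
To make the quoting difference concrete, here is a minimal Perl sketch. The variable names are hypothetical and chosen only for illustration; this is not code generated by the wizard.

```perl
use strict;
use warnings;

# Hypothetical variables, for illustration only.
my $single = '\t';   # single quotes: no escape processing, so backslash + t
my $double = "\t";   # double quotes: \t is converted to a tab character

printf "single-quoted length: %d\n", length $single;
printf "double-quoted length: %d\n", length $double;
printf "double-quoted is a tab: %s\n", ($double eq chr(9) ? "yes" : "no");
```

Running it should show that only the double-quoted form yields a one-character string equal to a tab.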

Re: Delimited file wizard uses incorrect tab character for Perl

Hello
I use the corresponding character expression '\t' and it works for tab-delimited fields. I am working on v420M1. What error message do you get when you use the character expression '\t'?
Best regards
Shong

Re: Delimited file wizard uses incorrect tab character for Perl

Hi Shong, thanks for your reply.
This bug is actually a little more interesting than I had realized.
First, make sure you are testing in a Perl project.
Use the metadata wizard to create a tab-delimited file item, accepting its '\t' expression for tabulation.
If you are using an actual tab-delimited file for the wizard to interpret, it will correctly separate columns in the wizard.
What is more, if you drag your new metadata item to a job and create a tFileInputDelimited component, it will work correctly!
If you create a tFileOutputDelimited component, however, it will not do what you probably had in mind.
Instead of separating values with a tab character, it will separate them with two characters: a literal backslash and a lowercase t.
In tFileInputDelimited, the '\t' expression appears as an argument to the Perl split function, which interprets it as a regular expression. In a regular expression, \t does indeed stand for a tab character.
In tFileOutputDelimited, the '\t' expression appears as an argument to the Perl join function, which interprets it as a plain string. In that context a single-quoted '\t' is a two-character literal, not a tab.
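
The difference between the two components can be reproduced outside Talend with a short Perl sketch. The variable names and sample data are assumptions made for illustration, and this is not the code Talend actually generates; it only shows how split and join treat the same single-quoted expression.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the separator expression produced by the wizard.
my $sep  = '\t';
my $line = "a\tb\tc";   # sample tab-delimited record

# split treats its first argument as a pattern, and in a pattern
# the sequence \t matches a tab, so the record is split correctly.
my @fields = split($sep, $line);

# join uses its first argument as a plain string, so the values are
# glued together with a literal backslash followed by t.
my $broken = join($sep, @fields);

# With a double-quoted separator, join inserts a real tab character.
my $fixed = join("\t", @fields);

print "fields read: ", scalar(@fields), "\n";
print "joined with single-quoted separator: $broken\n";
print "joined with double-quoted separator: $fixed\n";
```

This is why reading appears to work while writing does not: the same two-character string is a valid pattern for a tab but not a tab itself.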
