Bulk exporting - field separator takes only one character when I use the csv options for text enclosure in advanced settings

Four Stars

I am seeing some strange behavior while trying to create a bulk export to CSV.

 

I had previously set it up as follows:

- row separator: "\n"

- field separator: "#\t#"

- no csv options (Escape Char and Text Enclosure)

 

The csv creation worked perfectly with that setup.

 

Afterwards I had to add a free text field to that export, and unfortunately this field contains carriage returns. To handle that, I enabled the csv options and set the text enclosure to "\"". But when I run the job with that setup, the field separator in the resulting csv file is # (it used to be #\t#). It seems that the field separator can only consist of one character when the csv options are used. Has anyone experienced this before?

 

I am using:  

Talend version 7.0.1

java version "1.8.0_101"

Java(TM) SE Runtime Environment (build 1.8.0_101-b13)

Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)

 

BR Kristian


Accepted Solutions
Seven Stars JGM

Re: Bulk exporting - field separator takes only one character when I use the csv options for text enclosure in advanced settings

I've been able to reproduce this behavior using TIS 6.4.1. 

 

Checking into the component code, this part is the culprit:

 

    // support passing value (property: Field Separator) by 'context.fs' or 'globalMap.get("fs")'.
    if (fieldSep.length() > 0) {
        field_Delim_tFileOutputDelimited_1 = fieldSep.toCharArray();
    } else {
        throw new IllegalArgumentException("Field Separator must be assigned a char.");
    }
    // Only the first character of the configured separator is kept here.
    this.field_Delim = field_Delim_tFileOutputDelimited_1[0];
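
To see the effect of that last assignment in isolation, here is a minimal stand-alone sketch. It is not Talend code: the class name FieldSeparatorDemo and the method effectiveSeparator are hypothetical, and the method only mirrors the logic of the excerpt (convert the configured string to a char array, keep element 0).

```java
// Hypothetical demo class; names are illustrative and not part of Talend.
class FieldSeparatorDemo {

    // Mirrors the excerpt: the whole string is converted to a char array,
    // but only the first element is returned as the separator.
    static char effectiveSeparator(String fieldSep) {
        if (fieldSep.length() > 0) {
            char[] delim = fieldSep.toCharArray();
            return delim[0];
        }
        throw new IllegalArgumentException("Field Separator must be assigned a char.");
    }

    public static void main(String[] args) {
        String configured = "#\t#"; // the three-character separator from the question
        char effective = effectiveSeparator(configured);
        System.out.println("configured length: " + configured.length());
        System.out.println("effective separator: " + effective);
    }
}
```

With the separator from the question, the second and third characters are silently dropped, which matches the single # seen in the output file.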

I suspect this is not considered a big problem because, once you write properly enclosed and escaped csv files (csv options), the field separator is no longer sensitive to the content of the data, meaning you can safely use a single character.

 

If you wanted to fix it, you would need to edit the component. From a quick look at the component code, this will not be a quick fix: the current code expects a single character, and that is tied into how data is escaped when it is written.

 

What I would suggest is updating any jobs that read these files to use csv options as well. In general, it is better to enable csv options unless you are writing very large files and need the extra performance.
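
As a rough illustration of why a single-character separator is safe once text enclosure is on, here is a small sketch of writing and reading one enclosed record. It is not how tFileOutputDelimited is implemented: the class QuotedCsvSketch and its methods are hypothetical, and it assumes the enclosure character is escaped by doubling it, which may differ from the Escape Char you configure in Talend.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the Talend implementation.
class QuotedCsvSketch {

    // Encloses every field and doubles any enclosure character inside it (assumed escaping rule).
    static String writeRow(List<String> fields, char sep, char quote) {
        String q = String.valueOf(quote);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(sep);
            }
            sb.append(quote).append(fields.get(i).replace(q, q + q)).append(quote);
        }
        return sb.toString();
    }

    // Parses one complete record; separators and line breaks inside an enclosure are treated as data.
    static List<String> readRow(String row, char sep, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < row.length(); i++) {
            char c = row.charAt(i);
            if (inQuotes) {
                if (c == quote && i + 1 < row.length() && row.charAt(i + 1) == quote) {
                    current.append(quote); // doubled enclosure character is a literal one
                    i++;
                } else if (c == quote) {
                    inQuotes = false;      // closing enclosure
                } else {
                    current.append(c);
                }
            } else if (c == quote) {
                inQuotes = true;           // opening enclosure
            } else if (c == sep) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

A field such as "line one\r\nline two" or "id#1" survives a round trip through writeRow and readRow with # as the separator. Note that the reading side must not split the file on "\n" before parsing, which is why the jobs reading these files need csv options enabled too.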


All Replies

Four Stars

Re: Bulk exporting - field separator takes only one character when I use the csv options for text enclosure in advanced settings

Thanks for looking into it, and thank you for your advice on how to construct the job. Any idea whether this error is going to be fixed in one of the next versions?
