Replace tabs in txt file

Hi all,
I would like your help with a specific part of my design.
I receive tab-delimited txt files with rows of data, which I process in later steps of my design.
However, some files have two tabs (\t\t) between columns instead of one (\t), so I would like to replace all double tabs with single tabs and automate this step before feeding the file to the process (something I currently do manually with a text editor such as Notepad++).
Are there any suggestions for achieving this?
Note: I've tried to do this with tReplace, but the component treats the schema as already separated by tabs, so I can only replace something within a specific column of data.

Re: Replace tabs in txt file

Hi harris,
You should first read your file with tFileInputRaw --> tMap (transform to a string) --> tReplace ("\t\t" to "\t") --> tFileOutputRaw,
then use your new txt file.
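
For reference, the replacement step in this chain comes down to a single string operation in Java. Below is a minimal sketch, assuming the whole file content is already held in a String; the class and method names are made up for illustration and are not part of any Talend component.

```java
public class DoubleTabReplace {

    // Hypothetical helper: replaces each pair of adjacent tabs with one tab.
    // String.replace works on literal text, so no regex escaping is involved.
    static String replaceDoubleTabs(String content) {
        return content.replace("\t\t", "\t");
    }

    public static void main(String[] args) {
        String row = "id\t\tname\t\tcity";
        // Make the tabs visible when printing the result.
        System.out.println(replaceDoubleTabs(row).replace("\t", "<TAB>"));
    }
}
```

Note that a single pass only collapses pairs: a run of three tabs still comes out as two. Also, if a double tab can mean an empty column in your data, collapsing it shifts the remaining columns, so this only fits files where the extra tab is known to be noise.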

Re: Replace tabs in txt file

Hi Jcs19,
I tried your suggestion, but could not get the job to run.
I used a tFileInputRaw component with the option 'Read the file as a string', then used the file content in the following tMap component with the expression row1.content.replaceAll("\t\t", "\t") and wrote the output to a tFileOutputRaw, but I get an out-of-memory exception.
It seems it can't handle the amount of data held in memory.
Is there an alternative suggestion?

Re: Replace tabs in txt file

How big is your file?
You may just need more memory.

Re: Replace tabs in txt file

These files average about 700k rows of data.
As for memory allocation, below are the values in the .ini file:
-vmargs
-Xms512m
-Xmx6056m
-XX:MaxPermSize=1024m
-Dfile.encoding=UTF-8
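
A possible alternative to raising memory, not tested in Talend here: process the file line by line in plain Java, so only one row is held in memory at a time instead of the whole file. Below is a minimal sketch, assuming UTF-8 input (as the -Dfile.encoding setting above suggests); the class name, method name, and file paths are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamingTabFix {

    // Hypothetical helper: copies `in` to `out`, replacing each pair of
    // adjacent tabs with a single tab, one line at a time.
    static void fixFile(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replace("\t\t", "\t"));
                // Writes the platform line separator, which may differ
                // from the line endings used in the input file.
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java StreamingTabFix <input file> <output file>
        fixFile(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The output file can then be read by the existing tab-delimited input step. As with any double-tab replacement, this assumes a double tab never stands for an empty column in the data.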
