Replace tabs in txt file

Hi all,
I would like your help with a specific part of my design.
I receive a tab-delimited txt file with rows of data, which I process in later steps of my design.
However, some files have two tabs (\t\t) between columns instead of one (\t), so I would like to replace all double tabs with single tabs in the file and automate this step before feeding it to the process (something I currently do manually with a text editor like Notepad++).
Is there any suggestion for achieving this?
Note: I've tried to do this with tReplace, but the component treats the schema as already separated by tabs, so I can only replace something within a specific column of data.

Re: Replace tabs in txt file

Hi harris,
You should first read your file with tFileInputRaw --> tMap (transform to a string) --> tReplace ("\t\t" to "\t") --> tFileOutputRaw,
then use your new txt file.

Re: Replace tabs in txt file

Hi Jcs19,
I tried your suggestion, but did not manage to run the task.
I used a tFileInputRaw component with the option 'Read the file as a string', then passed the file content to a tMap component with the expression row1.content.replaceAll("\t\t", "\t") and sent the output to a tFileOutputRaw, but I get an out-of-memory exception.
It seems it can't handle the amount of data held in memory.
Is there any alternative suggestion?
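
One possible alternative, sketched here under assumptions rather than taken from the thread, is to avoid loading the whole file as a single string and instead stream it line by line, so that only one row is in memory at a time. The class TabCollapser and its method names are hypothetical (this is plain Java, not a Talend component), the file is assumed to be UTF-8, and the pattern "\t{2,}" is a deliberate variation on the "\t\t" replacement above: it collapses any run of two or more tabs, whereas replacing "\t\t" once would turn three tabs into two. Note that collapsing tabs also removes genuinely empty columns, which is what the question asks for.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of any Talend API.
final class TabCollapser {

    // Collapses every run of two or more tabs in a single line into one tab.
    static String collapseTabs(String line) {
        return line.replaceAll("\t{2,}", "\t");
    }

    // Copies the input to the output one line at a time.
    // Line endings are rewritten with the platform line separator,
    // and the last line always ends with a separator.
    static void collapse(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            writer.write(collapseTabs(line));
            writer.newLine();
        }
        writer.flush();
    }

    // Convenience wrapper for files; assumes UTF-8 content.
    static void collapseFile(Path source, Path target) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            collapse(in, out);
        }
    }
}
```

A call such as TabCollapser.collapseFile(Path.of("in.txt"), Path.of("out.txt")) (both file names are placeholders) could run from a custom routine or a tJava step before the cleaned file is read by the rest of the job.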

Re: Replace tabs in txt file

How big is your file?
You may just need more memory.

Re: Replace tabs in txt file

These files average about 700k rows of data.
As for memory allocation, below are the values from the .ini file:
-vmargs
-Xms512m
-Xmx6056m
-XX:MaxPermSize=1024m
-Dfile.encoding=UTF-8
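
A caveat worth checking, offered as an assumption rather than something confirmed in this thread: the values above are the Studio's own startup settings, and a job started from the Studio may run in a separate JVM with its own arguments (typically configured in the Run view's advanced settings). A small check like the following, placed in a tJava component or run standalone, reports the heap limit the running JVM actually has. The class name HeapReport is hypothetical.

```java
// Hypothetical helper for illustration; not part of any Talend API.
final class HeapReport {

    // Maximum heap the current JVM will attempt to use, in mebibytes.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

If the reported value is far below the 6056m configured above, the job is not picking up the .ini setting and its own -Xmx would need raising.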
