Replace tabs in txt file

Hi all,
I would like your help on a specific part of my design.
I receive tab-delimited txt files with rows of data, which I process in later steps of my design.
However, some files have two tabs (\t\t) between columns instead of one (\t), so I would like to replace all double tabs with single tabs and automate this step before feeding the file to the process (something I currently do manually in a text editor such as Notepad++).
Is there any way to achieve this?
Note: I tried doing this with tReplace, but the component treats the schema as already separated by tabs, so I can only replace something within a specific column of data.

Re: Replace tabs in txt file

Hi harris,
You should first read your file with tFileInputRaw --> tMap (transform to a string) --> tReplace ("\t\t" to "\t") --> tFileOutputRaw,
then use your new txt file.

Re: Replace tabs in txt file

Hi Jcs19,
I tried your suggestion, but did not manage to run the task.
I used a tFileInputRaw component with the option 'Read the file as a string', then passed the file content to a tMap component with the expression row1.content.replaceAll("\t\t", "\t") and sent the output to a tFileOutputRaw, but I get an out-of-memory exception.
It seems it can't handle the amount of data held in memory.
Is there any alternative suggestion?

Re: Replace tabs in txt file

How big is your file?
You may just need more memory.
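
If adding memory is not an option, another approach might be to avoid holding the whole file in memory and rewrite it one line at a time instead. Below is a minimal sketch in plain Java, not a Talend component: the class name TabCollapser and the method collapseDoubleTabs are hypothetical, and UTF-8 encoding is an assumption. It applies the same "\t\t" to "\t" replacement as above, so a run of three or four tabs would still leave two behind.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: streams a text file line by line so that only one
// line is held in memory at a time.
final class TabCollapser {

    // Copies source to target, replacing each pair of tabs with a single tab.
    // Returns the number of lines written. Assumes both files are UTF-8.
    static long collapseDoubleTabs(Path source, Path target) throws IOException {
        long lines = 0;
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Literal (non-regex) replacement of each "\t\t" pair.
                writer.write(line.replace("\t\t", "\t"));
                // Writes the platform line separator, so the original
                // line endings are not necessarily preserved.
                writer.newLine();
                lines++;
            }
        }
        return lines;
    }
}
```

The cleaned file written to target could then be read as the usual tab-delimited input. Source and target should be different paths, since the output is written while the input is still being read.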

Re: Replace tabs in txt file

On average, these files contain about 700k rows of data.
As for memory allocation, these are the values in the .ini file:
-vmargs
-Xms512m
-Xmx6056m
-XX:MaxPermSize=1024m
-Dfile.encoding=UTF-8
