Replace tabs in txt file


Hi all,
I would like your help on a specific part of my design.
I receive a tab-delimited txt file with rows of data, which I process in later steps of my job design.
However, some files have two tabs (\t\t) between columns instead of one (\t), so I would like to replace all double tabs with single tabs and automate this step before feeding the file to the process. (At the moment I do this manually in a text editor such as Notepad++.)
Any suggestions on how to achieve this?
Note: I've tried to do this with tReplace, but the component treats the schema as already separated by tabs, so I can only replace something within a specific column of data.

Re: Replace tabs in txt file

Hi harris
You should first read your file with tFileInputRaw --> tMap (transform to a string) --> tReplace ("\t\t" to "\t") --> tFileOutputRaw,
then use the new txt file as the input to your process.
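
A minimal sketch of the replacement itself, in case it helps. The class and method names (TabCollapse, collapseTabs) are made up for illustration and are not part of any Talend component. Replacing only the exact pair "\t\t" leaves a run of three tabs as two, so the sketch collapses any run of two or more tabs instead. It assumes the file never uses consecutive tabs to mark an empty column; if it does, collapsing them would shift the columns.

```java
public class TabCollapse {
    // Hypothetical helper: collapses every run of two or more tabs into one tab.
    static String collapseTabs(String text) {
        return text.replaceAll("\t{2,}", "\t");
    }

    public static void main(String[] args) {
        String sample = "a\t\tb\t\t\tc\td";
        // A plain pair replacement turns the triple tab into a double tab.
        String pairOnly = sample.replace("\t\t", "\t");
        System.out.println(pairOnly.equals("a\tb\t\tc\td"));
        // The run-based pattern leaves exactly one tab between all fields.
        System.out.println(collapseTabs(sample).equals("a\tb\tc\td"));
    }
}
```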

Re: Replace tabs in txt file

Hi Jcs19,
I tried your suggestion, but could not get the job to run.
I used a tFileInputRaw component with the option 'Read the file as a string', passed the file content to a tMap component with the expression row1.content.replaceAll("\t\t", "\t"), and sent the output to a tFileOutputRaw, but I get an out-of-memory exception.
It seems it can't handle the amount of data held in memory.
Is there an alternative approach?

Re: Replace tabs in txt file

How big is your file?
You may just need more memory.
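
If adding memory is not an option, one possible alternative is to avoid loading the whole file as a single string and rewrite it line by line instead, so that only one row is held in memory at a time. The following is only a sketch in plain Java, not a Talend component. The class name TabFileCleaner, the method clean and the UTF-8 encoding are assumptions, and it assumes consecutive tabs never stand for an empty column.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TabFileCleaner {
    // Hypothetical helper: copies input to output one line at a time,
    // collapsing each run of two or more tabs into a single tab.
    static void clean(Path input, Path output) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replaceAll("\t{2,}", "\t"));
                // Writes the platform line separator after every line,
                // including the last one.
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java TabFileCleaner <input> <output>");
            return;
        }
        clean(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The cleaned output file could then be used as the input to the rest of the job in place of the original.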

Re: Replace tabs in txt file

These files average about 700k rows of data.
As for memory allocation, below are the values from the .ini file:
-vmargs
-Xms512m
-Xmx6056m
-XX:MaxPermSize=1024m
-Dfile.encoding=UTF-8
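
It may be worth confirming how much heap the JVM that runs the job actually gets, since values in an .ini file do not necessarily apply to the job's own JVM (this is an assumption to verify for your setup). A minimal sketch using the standard Runtime API is shown below. The class name HeapCheck is made up for illustration, and the same call could be placed wherever custom Java code runs inside the job.

```java
public class HeapCheck {
    // Returns the maximum heap the current JVM will try to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

If the reported value is far below the -Xmx value listed above, the job is probably running with different JVM arguments than the ones in that file.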