tSystem - sed command - I am close, just need a little help

Nine Stars


@jlolling

 

On Windows, this command works from the command line with GNU sed:

sed -i ":a;N;$!ba;s/\"\n\"«/\"\"«/g" myFile.csv

 

I have a 115GB file in which many rows are broken across multiple lines.

I need to find and replace every occurrence of a double quote, a newline character, a double quote, and the special character « (which is my delimiter).

 

For example:

line_one_here"\n
the_rest_of_line_one_here

line_two_here

line_three_here

 

I am trying to get this to work in tSystem:

"sed -i \":a;N;$!ba;s/\"\n\"«/\"\"«/g\" "+ context.myFile

 

Error is:

sed: -e expression #1, char 12: unterminated `s' command

 

Similar to this post:

https://community.talend.com/t5/Design-and-Development/Not-able-to-run-Linux-command-using-tSystem-C...

Seventeen Stars

Re: tSystem - sed command - I am close, just need a little help

What's wrong with the line breaks?

They are not a problem for parsing CSV!

If the values are wrapped in an enclosure character such as ", a line break inside a value does no harm and is simply treated as content.

Nine Stars

Re: tSystem - sed command - I am close, just need a little help

The problem is that when the 115GB file was created, the .csv option wasn't turned on :(

Seventeen Stars

Re: tSystem - sed command - I am close, just need a little help

I did a quick check of what your sed Java expression actually produces:

In a tJava:

System.out.println("sed -i \":a;N;$!ba;s/\"\n\"«/\"\"«/g\" ");

The result is:

sed -i ":a;N;$!ba;s/"
"«/""«/g" 

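That does not look like the working command line from your original post.

The problem is the line break: in a Java string, \n becomes a real newline and \" becomes a bare quote, so sed never sees the backslashes. Try this as the Java expression in the tSystem component: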

"sed -i \":a;N;$!ba;s/\\\"\\n\\\"«/\\\"\\\"«/g\" "

Compare the working command from your post:

sed -i ":a;N;$!ba;s/\"\n\"«/\"\"«/g"

with the output of my re-escaped Java expression:

sed -i ":a;N;$!ba;s/\"\n\"«/\"\"«/g"

I suggest you print your expression with System.out.println and tweak it until the output matches the command line you expect.
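
To make that check concrete, here is a minimal sketch you could adapt in a tJava or run as a plain Java class. The class name SedCommandCheck, the method buildCommand, and its fileName parameter are made up for illustration (fileName stands in for context.myFile); only the two sed strings come from this thread.

```java
public class SedCommandCheck {

    // Hypothetical helper: builds the command string that would go into tSystem.
    // In Java source, \\\" yields the two characters \" and \\n yields the two
    // characters \n, so sed receives the backslashes it needs.
    static String buildCommand(String fileName) {
        return "sed -i \":a;N;$!ba;s/\\\"\\n\\\"«/\\\"\\\"«/g\" " + fileName;
    }

    public static void main(String[] args) {
        // The original expression: Java turns \n into a real line break
        // and \" into a bare quote before sed ever sees the string.
        String broken = "sed -i \":a;N;$!ba;s/\"\n\"«/\"\"«/g\" " + "myFile.csv";
        String fixed = buildCommand("myFile.csv");

        System.out.println("broken: " + broken);
        System.out.println("fixed:  " + fixed);

        // A real newline inside the command is the warning sign to look for.
        System.out.println("broken contains a real newline: " + broken.contains("\n"));
        System.out.println("fixed contains a real newline:  " + fixed.contains("\n"));
    }
}
```

If the printed "fixed" line matches the command that works in your terminal character for character, the Java escaping is right. This only checks the string itself; how tSystem then splits it into arguments is a separate matter.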

Nine Stars

Re: tSystem - sed command - I am close, just need a little help

I think this works:

context.myCommand="sed -i \":a;N;$!ba;s/\\\"\\n\\\"«/\\\"\\\"«/g\" myFile.csv";
System.out.println(context.myCommand);
Seventeen Stars

Re: tSystem - sed command - I am close, just need a little help

Where the hell is my response? About an hour ago I sent you exactly this answer!


@talendtester wrote:
I think this works:

context.myCommand="sed -i \":a;N;$!ba;s/\\\"\\n\\\"«/\\\"\\\"«/g\" myFile.csv";
System.out.println(context.myCommand);

 

Nine Stars

Re: tSystem - sed command - I am close, just need a little help

@jlolling thank you for your fast help! 

 

I'm not sure what happened. I got the email notification and tried to reply to that post (instead of the earlier one), but the forum told me I didn't have permission, and when I refreshed the page your newer post was gone.

 

 
