[resolved] how to replace in a field when the row separator is also

One Star

Hello,
I have a delimited file (field separator "\t", row separator "\n") looking like :
1\tblue\tapple\n
2\tred\n
\tpeer\n
tFileInputDelimited will fail on line 2 ...
So my idea was to change the row separator, so that I could then replace \n with "" inside the field. But I haven't found a component for that.

regards

Accepted Solutions
Six Stars

Re: [resolved] how to replace in a field when the row separator is also

Well, this format is a real mess, but there is still a solution, of course :-) If you know the schema, you can go through the whole file, detect the \t characters that are used as column delimiters, and repair the file before it is read by Talend.
Send the file to archenroot@gmail.com and I can write a small program that will standardize it.
Ladislav

All Replies
One Star

Re: [resolved] how to replace in a field when the row separator is also

You want to replace the \n of the second line, the one in the middle?
One Star

Re: [resolved] how to replace in a field when the row separator is also

You want to replace the \n of the second line, the one in the middle?

I have modified line 2 in my post to reflect what's real
One Star

Re: [resolved] how to replace in a field when the row separator is also

So actually it's not really a tFileDelimited. Do you have only 2 columns?
One Star

Re: [resolved] how to replace in a field when the row separator is also

No, the file has 15 columns.
One Star

Re: [resolved] how to replace in a field when the row separator is also

But I don't really get it. Could you post a part of what you have and what you want?
One Star

Re: [resolved] how to replace in a field when the row separator is also

in my example the expected result will be :
1\tblue\tapple\n
2\tred\tpeer\n
One Star

Re: [resolved] how to replace in a field when the row separator is also

But why does the first line have the right format?
One Star

Re: [resolved] how to replace in a field when the row separator is also

Because it has been entered correctly in the frontend application I guess
In line 2 the operator hit the return key by accident
One Star

Re: [resolved] how to replace in a field when the row separator is also

So sometimes you want to replace the \n, sometimes not?
One Star

Re: [resolved] how to replace in a field when the row separator is also

You need to replace \n\t with \t everywhere in the file.
One Star

Re: [resolved] how to replace in a field when the row separator is also

There's no \n\t in the file, only "string\t..." and "...string\n".
And BTW there can be more than one \n in a row (...string\n\n\n), because the field was designed as multi-line in the front-end app ...
Regards
Didier
One Star

Re: [resolved] how to replace in a field when the row separator is also

There's a \n\t in your example.
One Star

Re: [resolved] how to replace in a field when the row separator is also

janhess,
I agree with you.
I think I haven't given enough explanation for you to understand my problem.
So:
The file I'm trying to load is normally handled by the Microsoft bcp utility. bcp doesn't care about misplaced CRLFs. As far as I understand, bcp seems to ignore the row separator, so a row can span several lines in the file; bcp loads the data using the field delimiter until the end of the file.

What I need is to reproduce this behaviour with Talend.
Six Stars

Re: [resolved] how to replace in a field when the row separator is also

Upload the file somewhere and post an HTTP link to it. The best option would be to look at the real data.
Ladislav
One Star

Re: [resolved] how to replace in a field when the row separator is also

See screenshot.
At line 34543, field 10 (3 - Tchaïkovski: Lac des cygnes...) contains multiple CRLFs. The real row separator is at line 34549. The schema contains 16 columns.

Hope that helps
One Star

Re: [resolved] how to replace in a field when the row separator is also

Can you use NUL CR LF as your row separator?
Six Stars

Re: [resolved] how to replace in a field when the row separator is also

I see... there is also the NUL character. So inside the field content there is only ever a plain CRLF, while the real end of a row is always the NUL CR LF sequence. You should first read the whole file into a String variable and then do something like this in a tJavaRow:
// Make custom end of lines with char sequence "~$"
context.sWholeFile = context.sWholeFile.replace("\0\r\n","~$" );
// Remove CRLF from content of file
context.sWholeFile = context.sWholeFile.replace("\r\n","" );
// Create Windows standard end of line
context.sWholeFile = context.sWholeFile.replace("~$","\r\n" );

Ladislav
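The three replacements above can be tried outside Talend with a small standalone sketch. It assumes, as this post does, that every real row ends in NUL CR LF and that the placeholder "~$" never occurs in the data; the class and method names are hypothetical and only serve the illustration.

```java
// Hypothetical helper for illustration; not a Talend component.
class RowSeparatorSwap {

    // Assumption: this placeholder never occurs in the real data.
    static final String PLACEHOLDER = "~$";

    static String normalize(String wholeFile) {
        if (wholeFile.contains(PLACEHOLDER)) {
            throw new IllegalArgumentException("placeholder already present in data");
        }
        return wholeFile
                .replace("\0\r\n", PLACEHOLDER) // mark the real ends of rows
                .replace("\r\n", "")            // drop line breaks inside fields
                .replace(PLACEHOLDER, "\r\n");  // restore standard Windows row ends
    }

    public static void main(String[] args) {
        String raw = "1\tTchaikovsky\r\nSwan Lake\t\0\r\n2\tx\t\0\r\n";
        // Print with visible escapes so the result is easy to inspect.
        System.out.println(normalize(raw).replace("\r\n", "\\r\\n\n"));
    }
}
```

Checking for the placeholder first avoids silently creating extra rows if "~$" happens to appear in a field.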
One Star

Re: [resolved] how to replace in a field when the row separator is also

Or you could make the separator \0\r\n and replace the \r\n in a tMap on the field.
One Star

Re: [resolved] how to replace in a field when the row separator is also

Sorry, but the end of a row is not always NUL CR LF.
Six Stars

Re: [resolved] how to replace in a field when the row separator is also

Well, this format is a real mess, but there is still a solution, of course :-) If you know the schema, you can go through the whole file, detect the \t characters that are used as column delimiters, and repair the file before it is read by Talend.
Send the file to archenroot@gmail.com and I can write a small program that will standardize it.
Ladislav
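The program that was actually written is not shown in this thread. As a rough idea of what such a repair could look like, here is a minimal sketch that counts \t delimiters and treats a \n as the end of a row only once the row already holds (number of columns - 1) tabs. It assumes the column count is known, that tabs never occur inside a field, and that NUL and CR carry no data; all names are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper for illustration; not the code written for this thread.
class DelimitedFileRepair {

    // Copies the input to the output, keeping a \n only when the current
    // row already contains (columnCount - 1) tab delimiters.
    static void repair(Reader in, Writer out, int columnCount) throws IOException {
        int tabsNeeded = columnCount - 1;
        int tabsSeen = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\0' || c == '\r') {
                continue; // assumption: NUL and CR never carry data
            }
            if (c == '\n') {
                if (tabsSeen >= tabsNeeded) {
                    out.write('\n'); // real end of row
                    tabsSeen = 0;
                }
                continue; // otherwise a stray line break inside a field: drop it
            }
            if (c == '\t') {
                tabsSeen++;
            }
            out.write(c);
        }
    }

    // Convenience wrapper for small inputs held in memory.
    static String repair(String raw, int columnCount) {
        StringWriter out = new StringWriter();
        try {
            repair(new StringReader(raw), out, columnCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The three-column example from the first post.
        System.out.print(repair("1\tblue\tapple\n2\tred\n\tpeer\n", 3));
    }
}
```

The repaired output uses a plain \n row separator, so it could then be read by tFileInputDelimited with "\t" and "\n". A line break inside the last column cannot be detected this way, because the row already has all its tabs at that point.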
One Star

Re: [resolved] how to replace in a field when the row separator is also

Your example shows line terminators as \t\n or \t\0\n. First replace every \t\0\n with \t\n, then pass the file to tFileInputDelimited with \t\n as the row separator.
Six Stars

Re: [resolved] how to replace in a field when the row separator is also

I see... janhess is right, try his approach...
One Star

Re: [resolved] how to replace in a field when the row separator is also

No, sorry, the screenshot doesn't show all the cases. Some lines do have a value in the last column.
So the end of line can be :
string\n
or
string\t\0\n
or
string\t\n
One Star

Re: [resolved] how to replace in a field when the row separator is also

If you don't give us the correct info we can't give you a correct solution.
Can you get the data fields surrounded by quotes? Then you'd be able to treat it as a CSV file.
One Star

Re: [resolved] how to replace in a field when the row separator is also

archenroot solved the problem.
Thanks a lot for your help.