regular expression as Row Separator?

Four Stars

Hello,

 

I am attempting to parse a log file and wish to use a regular expression as the “Row Separator”.

 

I started out by trying the tFileInputMSDelimited component, but the “Row Separator” does not seem to accept regular expressions.

 

I would like to have my job read the file sequentially and then "break" each time it encounters <four characters><space><dash><space><four characters>. I don't need help with the regular expression, just with selecting the correct Talend component (or does this require Java code?).

 

A screenshot of the text is below.  

 

Capture.PNG

 

Any advice will be greatly appreciated.


Accepted Solutions
Sixteen Stars

Re: regular expression as Row Separator?

You can do it with a job looking like this....

Screen Shot 2018-04-06 at 00.28.08.png

 

The tFileInputRaw is used to read the whole file in one big chunk. It is read as an Object. You convert to a String with the tConvertType component. The important code comes in the tMap. In this component we join the input data (one row) to a tJavaFlex input using "Reload at each row". We also set up a variable to pass to the tJavaFlex. The config can be seen here....

Screen Shot 2018-04-06 at 00.32.06.png

 

Now we set up the tJavaFlex. Here we need to create a single column (I've called it "row") and add the following code......

Start Code

String[] rows = ((String)globalMap.get("myKey1")).split(".{4} - .{4}");

for(String row : rows){

Main Code

row4.row = row;

End Code

}

What we are doing is splitting the String chunk (passed in via the globalMap variable) using the regex (a quick and dirty one I knocked up to test this). The split generates an array, which I then iterate over with a for loop. The Start Code opens the loop, the Main Code acts on each iteration of the loop and the End Code closes the loop.
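
If it helps to see the same split-and-loop idea outside of Talend, here is a rough standalone Java sketch. It is only an illustration under assumptions: the class name RegexRowSplitter, the method splitRows and the sample content are made up for this example, and in the job itself the String comes from the globalMap rather than from a literal.

```java
public class RegexRowSplitter {

    // Splits the whole content into rows wherever separatorRegex matches.
    // The text matched by the regex is discarded, which is how String.split works.
    static String[] splitRows(String content, String separatorRegex) {
        return content.split(separatorRegex);
    }

    public static void main(String[] args) {
        // Made-up sample content: "ABCD - EFGH" and "WXYZ - QRST" play the
        // role of the <four characters><space><dash><space><four characters> separator.
        String content = "first recordABCD - EFGHsecond recordWXYZ - QRSTthird record";

        // Equivalent of the tJavaFlex Start Code (open loop), Main Code (use row)
        // and End Code (close loop), written as one ordinary loop.
        for (String row : splitRows(content, ".{4} - .{4}")) {
            System.out.println(row);
        }
    }
}
```

Note that "." does not match line breaks by default in Java, so this regex only fires when the whole separator sits on one line.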

 

With your file it generated 12 rows. The only thing to take into consideration is that it will remove the text that is assessed as the row separator. The output looked like this....

 

}{vfwwwwgggwg}{vovgwwwwwggg}{vofwwwwwwwww}{vofgwgwwwwwg}{vfwgwggwwggu{vofgwwwwww
wg}{vfwgwggwggg}{vofwgwwwwwww}{vovgwggwwwww}{vofwwwwwggwg}{vofgggwwwwwwu{vofwwww
wggwg}{vofgggwgggwu{vfggwwwwwgg}{vofwwwwgg
SERT ID: 2356
CPU ID:
DATE:
TIME:
NAME:

.TR

ZP: 0000
VPL: 00
POS: 3FFF
Tw2: 01
L*D: 10
A%P: 00

.DD


00 00 00 00 03 01 02 0A 00 01 00 1E 00 00 00 00
00 00 01 00 AE 42 00 00 00 02 00 01 00 00 00 00
00 00 F0 00 A7 00 00 00 00 00 00 00 01 11 00 02
70 92 03 20 00 00 60 D5 62 FF 01 FC 04 00 52 C0
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 06 0A 1C 00 00 02 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 08 80 8B C7 C2 40 EF DF
00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00
10 10 10 00 00 00 00 00 00 00 00 00 FF FF FF 00
00 00 1E 1E 00 01 00 00 00 20 00 00 00 00 00 00
08 08 40 40 08 00 FF 3F 00 64 08 00 00 00 00 00
00 2B 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 01 02 00 00 10 00 0E 00 03 00 00
00 02 01 01 02 02 02 03 FF FF 00 00 02 00 00 00
00 00 00 00 00 00 2D 08 00 00 00 00 00 02 06 00



00 00 00 00 00 00 00 01 00 00 00 00 17 00 00 00
19 19 C0 17 00 19 00 FF 3F 00 00 00 00 81 12 FF
3F FF 3F 6E 12 00 00 00 00 00 00 00 00 00 00 00
00 00 00 FF 00 17 00 00 00 00 00 00 00 00 00 FB




B2 D1 00 00 00 00 00 00 00 00 00 00 00 00 B2 C0
D5 97 1C 06 B2 D1 00 00 00 00 00 00 00 00 00 00
00 00 FF C3 D1 00 FF C3 D1 00 00 00 FF FF 00 00
00 00 B7 00 00 00 B7 D0 02 D0 48 2F FD 2F 00 00



00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00



17 C8 00 85 CC FF 02 D5 2E 33 01 00 FE C3 01 D5
90 0A 00 FF 00 00 00 07 D5 01 07 00 10 C4 00 00
01 00 00 00 00 11 00 00 00 00 A3 00 00 04 01 D5
11 B9 27 17 C8 00 85 00 00 00 00 00 00 00 14 00
00 D5 00 17 07 00 01 02 00 00 01 03 01 00 31 62
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00



00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
D5 00 17 07 00 28 80 00 00 00 01 06 00 96 01 00
00 00 01 00 00 01 D5 00 00 00 00 00 00 00 01 01
C0 00 67 12 00 00 00 08 01 00 00 00 F9 FF 07 00
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
04 00 F0 FF FE FE 7E 12 FF 3F 2B 11 E0 01 40 FE
FF 00 00 00 03 12 00 00 00 00 04 00 00 00 00 00
2E 30 31 2E 30 32 4C 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 1A 10 B2 C0 97 1C 06 B2 D1 00



00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 2E B7 00 00 00 2D 02 02 00 01 0F 00
00 00 00 01 5E 1A CE 1A 80 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00



AA 55 FA C1 FB C1 FD C1 FE C1 FF C1 00 C2 01 C2
40 C0 41 C0 42 C0 44 C0 46 C0 48 C0 66 C0 06 C0
01 C0 31 36 31 30 33 31 30 C7 90 D0 90 D0 40 CE
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00



B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00



B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00



B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
B2 C0 97 1C 06 B2 D1 00 00 00 00 00 00 00 02 00
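
As mentioned above, the split removes the text that is assessed as the row separator. If you need to keep that text at the start of each row, one possible approach (not something I tested against your file) is to wrap the regex in a zero-width lookahead so that the split happens just before the separator. The class name SeparatorKeepingSplitter, the method splitKeepingSeparator and the sample content below are made up for illustration.

```java
public class SeparatorKeepingSplitter {

    // Splits the content at the position just before each separator match.
    // (?=...) is a zero-width lookahead: it matches a position without
    // consuming any characters, so the separator stays at the start of the row.
    static String[] splitKeepingSeparator(String content, String separatorRegex) {
        return content.split("(?=" + separatorRegex + ")");
    }

    public static void main(String[] args) {
        // Made-up sample content with two separators after a leading header.
        String content = "headerABCD - EFGHoneWXYZ - QRSTtwo";

        for (String row : splitKeepingSeparator(content, ".{4} - .{4}")) {
            System.out.println(row);
        }
    }
}
```

In the tJavaFlex this would only mean changing the regex passed to split in the Start Code; the loop itself stays the same.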

All Replies
Four Stars

Re: regular expression as Row Separator?

Thank you very much for putting me on the right path.

 

Much appreciated!!!

Four Stars

Re: regular expression as Row Separator?

very good