Which regex to use with tExtractRegexFields to split a column?


I want to split one column into multiple columns (not rows) from the following string of a resultset (type: list):
]
The field separator should be the comma, except for the comma between the brackets. The end result should be:
column1="160915010"
column2="20:00"
column3="22:59"
column4="toneel,conventioneel"
This isn't possible with tExtractDelimitedFields: column4 would also be split on the comma separator.
Any help appreciated!
Remco

Re: Which regex to use with tExtractRegexFields to split a column?

I would remove the [] and still use tExtractDelimitedFields, but with the CSV option and " as the enclosure character. Commas inside the quotes are then treated as content rather than as delimiters.
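To illustrate what an enclosure character does, here is a minimal Java sketch of quote-aware splitting. It is not the component's implementation: the class name, the method name, and the sample input are all assumptions, because the original string is not shown in the thread. It also does not handle doubled quotes ("") inside a value.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteAwareSplit {
    // Splits on commas that sit outside double quotes; the quotes themselves are dropped.
    public static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;              // entering or leaving an enclosed value
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());    // a real delimiter ends the field
                current.setLength(0);
            } else {
                current.append(c);                 // content, including enclosed commas
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical input: quoted values wrapped in [].
        String raw = "[\"160915010\",\"20:00\",\"22:59\",\"toneel,conventioneel\"]";
        // Remove the surrounding brackets first, as suggested above.
        String inner = raw.substring(1, raw.length() - 1);
        System.out.println(split(inner));
    }
}
```

With the assumed input, the fourth field keeps its comma because it is inside the quotes.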

Re: Which regex to use with tExtractRegexFields to split a column?

Thanks for your reply. I wanted to test this, but I ran into an issue: the CSV option does not exist for tExtractDelimitedFields. I'm using Talend 6.1.1, and according to the documentation there should be a CSV option. I also tested with Talend 6.2.0, and there is no CSV option there either. Do I have to activate something?

Remco

Re: Which regex to use with tExtractRegexFields to split a column?

I am sorry, my fault. You are right: unfortunately this component does not have the CSV option. I would say this is a missing feature!
In that case your only choice is the original approach with a regex. I suggest you work out the regex in an online regex tester and then use the final, tested regex here.

Re: Which regex to use with tExtractRegexFields to split a column?

I've got a working regex, tested online (http://www.regextester.com/3269): ("(|"")*")
This returns all values between quotes. However, when I use it in tExtractRegexFields it does not work: a syntax error is generated.

Re: Which regex to use with tExtractRegexFields to split a column?

You have to convert this into a Java string literal: "(\"(|\"\")*\")"
Escape each double quote with a backslash (and double any backslash the regex itself contains), then wrap the whole expression in double quotes.
tExtractRegexFields puts the result of each regex group into a new field in the output.
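
Since each group becomes one output field, one possible approach is a pattern with one capture group per target column. The plain-Java sketch below shows the idea outside Talend. The input string, the class name, and the pattern are assumptions for illustration (the original string is not shown in the thread), and this is not the regex posted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFieldsSketch {
    // Plain regex: ^\["([^"]*)","([^"]*)","([^"]*)","([^"]*)"\]$
    // In the Java literal every " is written \" and every \ is written \\.
    static final Pattern ROW = Pattern.compile(
        "^\\[\"([^\"]*)\",\"([^\"]*)\",\"([^\"]*)\",\"([^\"]*)\"\\]$");

    // Returns one entry per capture group, or an empty list if the row does not match.
    public static List<String> extract(String raw) {
        List<String> columns = new ArrayList<>();
        Matcher m = ROW.matcher(raw);
        if (m.matches()) {
            for (int g = 1; g <= m.groupCount(); g++) {
                columns.add(m.group(g));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Hypothetical input: four quoted values wrapped in [].
        String raw = "[\"160915010\",\"20:00\",\"22:59\",\"toneel,conventioneel\"]";
        List<String> columns = extract(raw);
        for (int i = 0; i < columns.size(); i++) {
            System.out.println("column" + (i + 1) + "=\"" + columns.get(i) + "\"");
        }
    }
}
```

Because `[^"]*` cannot cross a quote, the comma inside the fourth value stays part of that value. The pattern assumes exactly four quoted values with no embedded quotes.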
