Which regex to use with tExtractRegexFields to split a column?

Six Stars

I want to split one column into multiple columns (not rows) from the following string of a resultset (type: list):
]
The field separator should be the comma, except for the comma between the brackets. The end result should be:
column1="160915010"
column2="20:00"
column3="22:59"
column4="toneel,conventioneel"
This isn't possible with tExtractDelimitedFields: column4 would also be split at the comma separator.
Any help appreciated!
Remco
Seventeen Stars

Re: Which regex to use with tExtractRegexFields to split a column?

I would remove the [] and indeed use tExtractDelimitedFields, but with the CSV option and " as the enclosure character. Commas inside the quotes are then treated as content rather than as delimiters.
Six Stars

Re: Which regex to use with tExtractRegexFields to split a column?

Thanks for your reply. I wanted to test this, but I've run into an issue: there is no CSV option for tExtractDelimitedFields. I'm using Talend 6.1.1, and according to the documentation there should be a CSV option. I also tested with Talend 6.2.0, and there is no CSV option there either. Do I have to activate something?

Remco
Seventeen Stars

Re: Which regex to use with tExtractRegexFields to split a column?

I'm sorry, my mistake. You are right: unfortunately this component does not have the CSV option. I would call this a missing feature!
In that case your only choice is the original approach using a regex. I suggest you experiment with an online regex tester and then use the final, tested regex here.
Six Stars

Re: Which regex to use with tExtractRegexFields to split a column?

I've got a working regex, tested online (http://www.regextester.com/3269): ("(|"")*")
It returns all values between quotes, but when I use it in tExtractRegexFields it does not work: a syntax error is generated.
Seventeen Stars

Re: Which regex to use with tExtractRegexFields to split a column?

You have to convert this into a Java String: "(\"(|\"\")*\")"
Escape the double quotes (and any backslashes) with a backslash, and wrap the whole expression in double quotes so it becomes a Java string literal.
tExtractRegexFields puts the result of each regex group into a new field in the output.
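
To illustrate both points (escaping for a Java string literal, and one output field per regex group), here is a minimal plain-Java sketch. The sample string in the original question did not survive, so the input format below is an assumption based on the description: three comma-separated values followed by a bracketed value. The class name, the method name and the regex itself are hypothetical, and this is not the regex from the post above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSplitSketch {
    // Raw regex (as you would type it in an online tester):
    //   ^([^,]*),([^,]*),([^,]*),\[(.*)\]$
    // In a Java string literal every backslash has to be doubled.
    static final Pattern ROW =
            Pattern.compile("^([^,]*),([^,]*),([^,]*),\\[(.*)\\]$");

    // Returns one array element per capture group, or an empty array
    // when the row does not match the assumed format.
    static String[] split(String row) {
        Matcher m = ROW.matcher(row);
        if (!m.matches()) {
            return new String[0];
        }
        String[] columns = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            columns[i - 1] = m.group(i);
        }
        return columns;
    }

    public static void main(String[] args) {
        // Assumed input format; the real string from the question is unknown.
        String row = "160915010,20:00,22:59,[toneel,conventioneel]";
        String[] columns = split(row);
        for (int i = 0; i < columns.length; i++) {
            System.out.println("column" + (i + 1) + "=\"" + columns[i] + "\"");
        }
    }
}
```

If your real data looks like the assumed input, the quoted string passed to Pattern.compile is what you would paste into the regex field of tExtractRegexFields, with four columns defined in the output schema to receive the four groups.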