Which regex to use with tExtractRegexFields to split a column?

Six Stars

I want to split one column into multiple columns (not rows) from the following string of a resultset (type: list):
]
The field separator should be the comma, but not the comma between the brackets. The end result should be:
column1="160915010"
column2="20:00"
column3="22:59"
column4="toneel,conventioneel"
This is not possible with tExtractDelimitedFields, because column4 is also split on the comma separator.
Any help appreciated!
Remco
Seventeen Stars

Re: Which regex to use with tExtractRegexFields to split a column?

I would remove the [] and indeed use tExtractDelimitedFields, but with the CSV option and " as the enclosure character. In that case a comma inside an enclosed value is not used as a delimiter (the " marks it as content and not as a delimiter).
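The idea of dropping the brackets and treating " as an enclosure character can be sketched in plain Java, for example in a custom routine or a tJavaRow. This is only an illustration under assumptions: the class name QuotedSplit and the sample input are hypothetical (the original string is not shown in the question), and doubled quotes inside a value are not handled.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedSplit {

    // Splits on commas, but ignores commas that appear between double quotes.
    // Surrounding [ ] and the enclosing quotes themselves are dropped.
    static List<String> split(String input) {
        String s = input.trim();
        if (s.startsWith("[")) {
            s = s.substring(1);
        }
        if (s.endsWith("]")) {
            s = s.substring(0, s.length() - 1);
        }

        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (char c : s.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;          // toggle on every enclosure char
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // delimiter outside quotes
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Assumed sample; the real input was not preserved in the thread.
        String sample = "[\"160915010\",\"20:00\",\"22:59\",\"toneel,conventioneel\"]";
        List<String> cols = split(sample);
        for (int i = 0; i < cols.size(); i++) {
            System.out.println("column" + (i + 1) + "=\"" + cols.get(i) + "\"");
        }
    }
}
```

With the assumed sample, the comma inside the last quoted value stays part of that value, so four fields come back instead of five.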
Six Stars

Re: Which regex to use with tExtractRegexFields to split a column?

Thanks for your reply. I want to test this, but I'm running into an issue: the CSV option does not exist for tExtractDelimitedFields. I'm using Talend 6.1.1, and according to the documentation there should be a CSV option. I've also tested this with Talend 6.2.0, and there is no CSV option there either. Do I have to activate something?

Remco
Seventeen Stars

Re: Which regex to use with tExtractRegexFields to split a column?

I am sorry, my fault. You are right: unfortunately this component does not have the CSV options. I would say this is a missing feature!
In this case your only choice is the original approach using a regex. I suggest you work out the regex with an online regex tester and then use the final, tested regex here.
Six Stars

Re: Which regex to use with tExtractRegexFields to split a column?

I've got a working regex, tested online (http://www.regextester.com/3269): ("(|"")*")
This returns all values between quotes. However, when I use it in tExtractRegexFields it does not work: a syntax error is generated.
Seventeen Stars

Re: Which regex to use with tExtractRegexFields to split a column?

You have to convert this into a Java String: "(\"(|\"\")*\")"
Escape each double quote with a backslash (and double any backslashes the regex itself contains), then put the whole expression in double quotes to form the Java String literal.
tExtractRegexFields puts the result of every regex group into a new field at the output.
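Because each regex group becomes one output field, a pattern with four capture groups would yield the four columns asked for. The following is a minimal sketch in plain Java, not the exact component configuration: the class name RegexFieldSplit, the pattern, and the sample input are assumptions (the original string is not shown in the question, so a bracketed list of four quoted values is assumed here).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFieldSplit {

    // Raw regex:  ^\[?"([^"]*)","([^"]*)","([^"]*)","([^"]*)"\]?$
    // As a Java String literal every " becomes \" and every \ becomes \\.
    // One capture group per expected column.
    static final Pattern ROW = Pattern.compile(
        "^\\[?\"([^\"]*)\",\"([^\"]*)\",\"([^\"]*)\",\"([^\"]*)\"\\]?$");

    // Returns the captured groups in order, or an empty array if the
    // input does not have the assumed shape.
    static String[] extract(String input) {
        Matcher m = ROW.matcher(input.trim());
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int i = 0; i < out.length; i++) {
            out[i] = m.group(i + 1); // group 0 is the whole match
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample; the real input was not preserved in the thread.
        String sample = "[\"160915010\",\"20:00\",\"22:59\",\"toneel,conventioneel\"]";
        String[] cols = extract(sample);
        for (int i = 0; i < cols.length; i++) {
            System.out.println("column" + (i + 1) + "=\"" + cols[i] + "\"");
        }
    }
}
```

The comma inside the fourth quoted value is captured as part of group 4 because [^"]* stops only at the closing quote, not at a comma.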
