tExtractRegexField to split a string in two

Two Stars

Hello,

 

I'm currently working on a job but I'm facing a problem. I have a field containing strings that I would like to split into two columns, but I can't find the right regex to split it the way I want. Here is an example.

All my fields are structured this way: " hello, my name is steven, I live in Paris (215484845)"

I want to separate it into: 1/ hello, my name is steven, I live in Paris and 2/ 215484845 (this is the ID)

 

Other parentheses may appear before the ones that hold the ID, but the ID is always at the end of the string.

So if a string like this is found: "Hello, my name is jeff, i'm (27) and i like ice cream (58646846)", I want to split it this way:

1/ Hello, my name is jeff, i'm (27) and i like ice cream and 2/ 58646846

Thanks in advance for your help!

Eight Stars

Re: tExtractRegexField to split a string in two

I think you're going to need to exploit the structure of the ID to do this: split when the parser finds a parenthesized string of 9 digits, ignoring all other parenthesized tokens. Is that something you can do, or will that lead to data errors?
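A minimal sketch of a regex along these lines, in plain Java. Instead of counting digits (the second sample ID has 8 digits, not 9), it assumes the ID is the last parenthesized run of digits at the very end of the string, as described in the question. The class and method names are hypothetical. The same pattern may work in tExtractRegexField with one output column per capture group, but that is an assumption to check against the component documentation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexIdSplit {
    // Greedy (.*) backtracks only as far as the final "(digits)" that ends the string,
    // so earlier parentheses such as "(27)" stay in the first group.
    static final Pattern TRAILING_ID = Pattern.compile("^(.*)\\((\\d+)\\)\\s*$");

    /** Returns {text, id}, or null when the input has no trailing "(digits)" group. */
    static String[] split(String input) {
        Matcher m = TRAILING_ID.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1).trim(), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("Hello, my name is jeff, i'm (27) and i like ice cream (58646846)");
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```

Rows that do not end with a parenthesized number return null here, which makes them easy to route to a reject flow rather than silently mis-split.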

 

David

Ten Stars

Re: tExtractRegexField to split a string in two

There is a Java string method lastIndexOf() which may be helpful to you. It returns the index of the last appearance of a character or string in another string.

rowX.columnName.lastIndexOf("(") and rowX.columnName.lastIndexOf(")") should get you the indexes of the start and end of your ID. Using those in a substring can extract the ID without having to apply a regex.
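The lastIndexOf() approach can be sketched in standalone Java as follows. The class and method names are hypothetical, and the fallback for strings without a trailing parenthesis pair (empty ID) is an assumption, not something specified in this thread.

```java
class LastIndexSplit {
    /** Returns {text, id}; id is empty when no closing "(...)" pair is found. */
    static String[] split(String value) {
        int open = value.lastIndexOf("(");
        int close = value.lastIndexOf(")");
        if (open < 0 || close < open) {
            // No usable parenthesis pair: keep the whole string as text.
            return new String[] { value.trim(), "" };
        }
        String text = value.substring(0, open).trim(); // everything before the last "("
        String id = value.substring(open + 1, close);  // between the last "(" and last ")"
        return new String[] { text, id };
    }

    public static void main(String[] args) {
        String[] parts = split(" hello, my name is steven, I live in Paris (215484845)");
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```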
Nine Stars

Re: tExtractRegexField to split a string in two

Please check the attached job. 

 

in tMap:

two output ports:

Column1: row1.Desc.substring(0,row1.Desc.lastIndexOf("(")-1) 

Column2: row1.Desc.substring(row1.Desc.lastIndexOf("(")+1).replace(")","") 

 

row1.Desc is your input string. The -1 in Column1 drops the space before the opening parenthesis.
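If some rows might not contain a parenthesis at all, lastIndexOf("(") returns -1 and the Column1 expression above throws a StringIndexOutOfBoundsException. One way to guard against that is a ternary, since it is still a single Java expression. The sketch below wraps the two guarded expressions in hypothetical static methods so they can be tried outside Talend; returning the whole string and null for rows without an ID is an assumption.

```java
class TMapExpressionSketch {
    // Column1: text before the last "(", or the whole string when there is none.
    static String column1(String desc) {
        return desc.lastIndexOf("(") >= 0
                ? desc.substring(0, desc.lastIndexOf("(")).trim()
                : desc;
    }

    // Column2: text after the last "(" with the closing ")" removed, or null when there is none.
    static String column2(String desc) {
        return desc.lastIndexOf("(") >= 0
                ? desc.substring(desc.lastIndexOf("(") + 1).replace(")", "").trim()
                : null;
    }

    public static void main(String[] args) {
        String desc = "Hello, my name is jeff, i'm (27) and i like ice cream (58646846)";
        System.out.println(column1(desc));
        System.out.println(column2(desc));
    }
}
```

In tMap, the body of each return statement would go in the expression field, with desc replaced by row1.Desc.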

 

Regards,

 

Veeru Boppudi
