tExtractRegexFields not working

Six Stars

tExtractRegexFields not working

Hello!

I want to split one column whose values look like 21.A-01-BTA or 21.A03-01-BTA. The split I want is:

BDel: 21

Group: A

UnderGroup: any digits that follow Group. For the 1st example it will be null, and for the 2nd it will be 03.

Remaining string1: 01

Remaining string2: BTA

 

I tried to use tExtractRegexFields with the following expression, but I get no values:

"([0-9][0-9]).([A-Z])([0-9][0-9])?-([0-9])-([A-Z])" 

-- I used '?' since UnderGroup may or may not be present for the group.

What is the correct syntax for this?

 

Regards

Priyadarshini

Accepted Solutions
Nine Stars

Re: tExtractRegexFields not working

Another alternative is to use this regex (the dot is escaped so that it matches only a literal "."):

"^"+
"([0-9]{2})\\.([A-Z])([0-9][0-9])?-([0-9][0-9])-([A-Z]{3})" +
".*"

Make sure you create at least 5 columns in your schema.
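To check the pattern outside Talend, here is a minimal sketch using plain java.util.regex on the two sample values from the question. The class name RegexFieldsCheck and the extract helper are made up for illustration, the literal dot is written as `\\.` here (an assumption about the intent), and whether tExtractRegexFields applies the pattern in exactly the same way is also an assumption.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Talend.
class RegexFieldsCheck {
    // The pattern from this reply joined into one string, with the dot escaped.
    static final Pattern FIELDS = Pattern.compile(
        "^([0-9]{2})\\.([A-Z])([0-9][0-9])?-([0-9][0-9])-([A-Z]{3}).*");

    // Returns the five captured groups, or null when the input does not match.
    static String[] extract(String input) {
        Matcher m = FIELDS.matcher(input);
        if (!m.matches()) {
            return null;
        }
        String[] out = new String[m.groupCount()];
        for (int i = 0; i < out.length; i++) {
            // group() returns null for an optional group that did not take part in the match
            out[i] = m.group(i + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"21.A-01-BTA", "21.A03-01-BTA"}) {
            System.out.println(s + " -> " + Arrays.toString(extract(s)));
        }
    }
}
```

For 21.A-01-BTA the third group (UnderGroup) comes back as null, and for 21.A03-01-BTA it is 03, which is the split asked for in the question.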

 

 

Regards
DGM
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.


All Replies
Employee

Re: tExtractRegexFields not working

@priyadarshiniv 

 

Please refer to the details below for parsing the data. Note that I have considered only the happy path, so you will need to test with various inputs and add null checks and string-length checks as needed. Solutions for both are already available on Stack Overflow, so I am not covering them here and leave them as a hands-on exercise for you.


 

As for the Java expressions, please refer to the list below.

 

var1       -> row1.input.substring(row1.input.indexOf(".")+1,row1.input.indexOf("-")).replaceAll("\\D+","")

BDel       -> row1.input.substring(0 ,row1.input.indexOf("."))
Group      -> row1.input.substring(row1.input.indexOf(".")+1,row1.input.indexOf("-")).replaceAll("[^A-Za-z]+", "")
UnderGroup -> Var.var1.equals("")?null:Var.var1
R_string1  -> row1.input.substring(row1.input.indexOf("-") +1,row1.input.indexOf("-", row1.input.indexOf("-") +1))
R_string2  -> row1.input.substring(row1.input.indexOf("-", row1.input.indexOf("-") +1)+1)
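The expressions above can be tried outside the tMap as ordinary Java. The sketch below is only an illustration: the class SubstringSplit and the parameter name input (standing in for row1.input) are assumptions, and like the original it covers only the happy path.

```java
import java.util.Arrays;

// Hypothetical stand-alone version of the tMap expressions; "input" plays the role of row1.input.
class SubstringSplit {
    static String[] split(String input) {
        int dot = input.indexOf(".");
        int dash1 = input.indexOf("-");
        int dash2 = input.indexOf("-", dash1 + 1);

        String middle = input.substring(dot + 1, dash1);     // text between "." and the first "-"
        String var1 = middle.replaceAll("\\D+", "");          // keep digits only
        String bDel = input.substring(0, dot);
        String group = middle.replaceAll("[^A-Za-z]+", "");   // keep letters only
        String underGroup = var1.equals("") ? null : var1;
        String rString1 = input.substring(dash1 + 1, dash2);
        String rString2 = input.substring(dash2 + 1);
        return new String[] {bDel, group, underGroup, rString1, rString2};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("21.A-01-BTA")));
        System.out.println(Arrays.toString(split("21.A03-01-BTA")));
    }
}
```

Because there are no guards, a value that lacks the "." or either "-" will make substring throw, which is why the null and length checks mentioned above are still needed.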

Hope you are happy with the resolution. Please spare a minute to give kudos and mark the topic as resolved :-)

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved :-)

Six Stars

Re: tExtractRegexFields not working

Thank you so much @nikhilthampi !!! I will check it today! 

 

Regards

Priya

Six Stars

Re: tExtractRegexFields not working

Hello @nikhilthampi 

When I use this, I get this error:

String index out of range: -4
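One input shape that would explain this message is a value that has the "." at index 2 but no "-" at all, for example 21.A03. In that case indexOf("-") returns -1, so the var1/Group expression calls substring(3, -1), and on Java 8 the reported index is the negative length -4 (newer Java versions word the message differently). This is only a guess, since the offending row is not shown. The sketch below reproduces the failure and shows one possible guard; the class GuardedSplit and both method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of Talend.
class GuardedSplit {
    // True when the unguarded var1/Group expression from the reply throws for this input.
    static boolean unguardedThrows(String input) {
        try {
            input.substring(input.indexOf(".") + 1, input.indexOf("-"));
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Returns {BDel, Group, UnderGroup, R_string1, R_string2}, or null when the input
    // is null or does not contain ".", a first "-" and a second "-" in that order.
    static String[] splitOrNull(String input) {
        if (input == null) {
            return null;
        }
        int dot = input.indexOf('.');
        int dash1 = input.indexOf('-', dot + 1);
        int dash2 = dash1 < 0 ? -1 : input.indexOf('-', dash1 + 1);
        if (dot < 0 || dash1 < 0 || dash2 < 0) {
            return null;
        }
        String middle = input.substring(dot + 1, dash1);
        String digits = middle.replaceAll("\\D+", "");
        return new String[] {
            input.substring(0, dot),
            middle.replaceAll("[^A-Za-z]+", ""),
            digits.isEmpty() ? null : digits,
            input.substring(dash1 + 1, dash2),
            input.substring(dash2 + 1)
        };
    }

    public static void main(String[] args) {
        System.out.println(unguardedThrows("21.A03"));
        System.out.println(Arrays.toString(splitOrNull("21.A03")));
        System.out.println(Arrays.toString(splitOrNull("21.A03-01-BTA")));
    }
}
```

Returning null for malformed rows is just one choice; rejecting such rows or routing them to a separate output would work as well.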

 

 

Six Stars

Re: tExtractRegexFields not working

Thank you @dgm01 for your reply! I need help with one more thing. Instead of having Remaining_1 and Remaining_2, I want everything after UnderGroup to go into one part. I tried this, but it gives the wrong result:

"^"+
"([0-9]{2})?(\\.[A-Z])?([0-9][0-9])?(-[0-9][0-9])?(-[.]*)?" +
".*"

For 21.A03-01-BTA it gives me:

BDel: 21

Group: A

UnderGroup: 03

Rest1: 01

It doesn't pick up the last part, -BTA. Maybe "-" is not treated as a character? What should the expression be?

 

Regards

Priya

 

Nine Stars

Re: tExtractRegexFields not working

Hello @priyadarshiniv

Please try this:

 

"^"+
"([0-9]{2})?(\\.[A-Z])?([0-9][0-9])?(-[0-9][0-9])?" +
"(.*)" 

Don't forget to create at least 5 columns in the schema.
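As a rough illustration of why the earlier attempt lost -BTA: inside square brackets a dot is a literal ".", so (-[.]*) can only capture "-" followed by dots, while (.*) captures whatever is left. The sketch below compares the two patterns with plain java.util.regex; the class RestOfStringCheck and the group helper are hypothetical names, and the behavior of tExtractRegexFields itself is not reproduced here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Talend.
class RestOfStringCheck {
    // Earlier attempt: the last group is (-[.]*), i.e. "-" followed by literal dots only.
    static final Pattern ATTEMPT = Pattern.compile(
        "^([0-9]{2})?(\\.[A-Z])?([0-9][0-9])?(-[0-9][0-9])?(-[.]*)?.*");

    // Pattern from this reply: the last group is (.*), i.e. the rest of the string.
    static final Pattern FIXED = Pattern.compile(
        "^([0-9]{2})?(\\.[A-Z])?([0-9][0-9])?(-[0-9][0-9])?(.*)");

    // Returns capture group n, or null when the input does not match
    // or the group did not take part in the match.
    static String group(Pattern p, String input, int n) {
        Matcher m = p.matcher(input);
        return m.matches() ? m.group(n) : null;
    }

    public static void main(String[] args) {
        String s = "21.A03-01-BTA";
        System.out.println(group(ATTEMPT, s, 5)); // only the "-"
        System.out.println(group(FIXED, s, 5));   // the rest, including "-BTA"
    }
}
```

Note that, as the groups are written, groups 2, 4 and 5 keep their leading "." or "-" in plain Java, so those characters may need to be stripped afterwards.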



If that does not work, please send examples in this form:

inputString:
expected Result:

Then I will help you write the regex.

Regards
DGM
