Grabbing the First Value From an Input String based off a space or "and" value to use in multiple output columns in tMap

Seven Stars

I have an input column that holds the Primary Salutation of a mailing address; for example, "Steve and Rachel Smith" is one value. If I need to pull just "Steve" for a First Name output column, what String method could I use to output only the first word of that value? This will be used across the board. I would also need to pull the last name for a Last Name field; "Smith" would be the value I grab.

 

The input values in this file come in several different formats. For example, values in the formats listed below also occur.

"Steve"

"Steve and Rachel"

"Steve Smith"

 

So the String method isn't cut and dried: the different formats would cause issues with how the other strings are manipulated.

I know the first name is the first word, before the first space.

The second first name is the first word after the word "and".

The last name is the second word if no "and" exists, or the fourth word if "and" exists.

 

Can anyone help with this? My Java String method coding is not there yet.

 

-Andrew

 

Seven Stars

Re: Grabbing the First Value From an Input String based off a space or "and" value to use in multiple output columns in tMap

Try this

 

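String test = "Steve and Rachel";
int firstSpace = test.indexOf(" ");
// A single word such as "Steve" has no space, so use the whole string as the first name.
System.out.println("Firstname: " + (firstSpace < 0 ? test : test.substring(0, firstSpace)));
// Prints the last word, so "Rachel" here; it is only a surname for inputs like "Steve and Rachel Smith".
System.out.println("LastName: " + test.substring(test.lastIndexOf(" ") + 1));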
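
For use in tMap, the same substring logic could be wrapped in small null-safe helper methods. This is only a sketch: the class name SalutationWords and both method names are hypothetical, and it assumes the helpers would be placed in a custom Talend routine and called from the output column expressions.

```java
// Hypothetical helper class; the names are illustrative, not part of Talend.
class SalutationWords {

    // Returns the text before the first space, or the whole value if there is no space.
    static String firstWord(String value) {
        if (value == null) {
            return "";
        }
        String s = value.trim();
        int space = s.indexOf(' ');
        return space < 0 ? s : s.substring(0, space);
    }

    // Returns the text after the last space, or "" when the value is a single word.
    static String lastWord(String value) {
        if (value == null) {
            return "";
        }
        String s = value.trim();
        int space = s.lastIndexOf(' ');
        return space < 0 ? "" : s.substring(space + 1);
    }

    public static void main(String[] args) {
        System.out.println("Firstname: " + firstWord("Steve Smith"));
        System.out.println("LastName: " + lastWord("Steve Smith"));
    }
}
```

Note that lastWord still returns "Rachel" for "Steve and Rachel", so the "and" formats need extra handling before the last word can be treated as a surname.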

DataTeam.pl

Eleven Stars

Re: Grabbing the First Value From an Input String based off a space or "and" value to use in multiple output columns in tMap

Please find attached a sample job.

 

Count the number of "and" separators (And/AND/and/&) in the string.

If CountAnd is zero:

     Count the number of words in the string.

     If CountNumberOfWords = 1:

     FirstName is the 1st word, 2nd Name is ""

     If CountNumberOfWords > 1:

     FirstName is the 1st word, 2nd Name is the last word

 

If CountAnd > 0:

     Count the number of words in the last part of the string (the part after the "and").

     If CountNumberOfWords = 1:

     FirstName is the 1st word, 2nd Name is ""

     If CountNumberOfWords > 1:

     FirstName is the 1st word, 2nd Name is the last word
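
The steps above can be sketched in plain Java roughly as follows. This is not the attached job, only an illustration under some assumptions: the class SalutationParser and its parse method are hypothetical names, the separator is taken to be a whole word "and" (any case) or "&" surrounded by whitespace, and the sketch additionally returns the second first name (the word after "and") that the original question asks for.

```java
import java.util.regex.Pattern;

// Hypothetical parser; names and separator rules are assumptions for illustration.
class SalutationParser {

    // Matches "and" (any case) or "&" only when surrounded by whitespace,
    // so names such as "Sandy" or "Anderson" are not split.
    private static final Pattern AND =
            Pattern.compile("\\s+(?:and|&)\\s+", Pattern.CASE_INSENSITIVE);

    // Returns {firstName, secondFirstName, lastName}; missing parts are "".
    static String[] parse(String input) {
        if (input == null || input.trim().isEmpty()) {
            return new String[] {"", "", ""};
        }
        // Split on the first separator only: head is before it, tail is after it.
        String[] parts = AND.split(input.trim(), 2);
        String[] head = parts[0].trim().split("\\s+");
        String first = head[0];
        String second = "";
        String last = "";
        if (parts.length == 1) {
            // No "and": the last name is the last word when there is more than one word.
            if (head.length > 1) {
                last = head[head.length - 1];
            }
        } else {
            // "and" present: the tail holds the second first name and possibly a last name.
            String[] tail = parts[1].trim().split("\\s+");
            second = tail[0];
            if (tail.length > 1) {
                last = tail[tail.length - 1];
            }
        }
        return new String[] {first, second, last};
    }

    public static void main(String[] args) {
        String[] samples = {"Steve", "Steve and Rachel", "Steve Smith", "Steve and Rachel Smith"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + String.join(" | ", parse(sample)));
        }
    }
}
```

Taking the last word rather than a fixed "second" or "fourth" word keeps the sketch working when a middle name is present, at the cost of misreading multi-word surnames.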

 

 

Regards
Abhishek KUMAR
Seven Stars

Re: Grabbing the First Value From an Input String based off a space or "and" value to use in multiple output columns in tMap

Hey @akumar2301 

Thanks for the solution! There is one more scenario that I am dealing with in this column.

 

The name line is sometimes in the format "Smith, Steve": the last name is listed first, followed by a comma, then the first name.

This can be a separate String method; you don't have to include it in your last job.

 

Last Name = "Smith"

First Name = "Steve"

 

Thanks,

 

Andrew

Eleven Stars

Re: Grabbing the First Value From an Input String based off a space or "and" value to use in multiple output columns in tMap

@AndrewSmith1182 

 

The best solution is to ask your provider to clean the file themselves. These inputs cannot be system generated.

 

String input = "Smith, Steve";

 

String firstName = (input == null) ? "" : (input.split(",").length > 1 ? input.split(",")[1].trim() : "");

String lastName = (input == null) ? "" : (input.split(",").length > 1 ? input.split(",")[0].trim() : "");
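
The same comma handling can also be written as a single helper that splits the value once and trims the space after the comma. This is a sketch only; the class name CommaName and its parse method are hypothetical, and it assumes that a value without a comma should produce empty strings, as in the expressions above.

```java
// Hypothetical helper for the "Last, First" format; names are illustrative.
class CommaName {

    // Returns {firstName, lastName}; both are "" when the input is null or has no comma.
    static String[] parse(String input) {
        if (input == null) {
            return new String[] {"", ""};
        }
        // Limit of 2 splits on the first comma only and keeps an empty second part.
        String[] parts = input.split(",", 2);
        if (parts.length < 2) {
            return new String[] {"", ""};
        }
        return new String[] {parts[1].trim(), parts[0].trim()};
    }

    public static void main(String[] args) {
        String[] name = parse("Smith, Steve");
        System.out.println("First Name: " + name[0]);
        System.out.println("Last Name: " + name[1]);
    }
}
```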

 

 

 

Regards
Abhishek KUMAR
Seven Stars

Re: Grabbing the First Value From an Input String based off a space or "and" value to use in multiple output columns in tMap

@akumar2301 

 

We're in the process of asking our clients to standardize their data to one specific format. It's been a headache, especially coming from a SQL background and not knowing much Java. Do you have any good Java resources for learning what each character means within the code?

 

Thanks,

 

Andrew

Employee

Re: Grabbing the First Value From an Input String based off a space or "and" value to use in multiple output columns in tMap

@AndrewSmith1182 

 

Hi Andrew,

 

   I am also from a SQL background and not a Java developer, but Talend has helped me solve a lot of issues with a basic Java skill set. So you can easily become a Talend professional without deep Java knowledge :-)

 

   Coming to the current issue, I agree with @akumar2301 200%. This is an issue created by the source, so they should first cleanse the data at the source and only then transport it to other systems.

 

"Steve"
"Steve and Rachel"
"Steve Smith"
"Smith, Steve"

It seems they have just exposed a free-text box in the front-end system, or a direct load to an Excel sheet, and are asking us to clean up all the junk data. If customers are entering all these combinations, I am sure they can also enter data in the format below in that system :-)

 

Let me test Andrew again and I am sure our source system team can beat Mr.Smith this time

And they might be expecting to pick Andrew Smith from this data :-)

 

There are Data Quality elements in tMap, as shown below, but even those will be beaten by all the different combinations in your input data.

image.png

 

So we need to push back and ask the source data owner either to cleanse the data before handing it over or to sign off on it as an agreed data risk from the source.

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved :-)
