Search and Replace a List of Substrings

Five Stars

Search and Replace a List of Substrings

I am doing data clean up and would like to search for and replace with using a list of values.

 

 

For example, given the following entry:

"J Q Goudin Et Al Usa Trl"

 

And values from a lookup table:

Find         |       Replace

"Et Al"              ""

"Usa Trl"          "USA Trail"

 

I would expect this output:

J Q Goudin USA Trail

 

I have tried using tReplace, tReplaceList but have been unsuccessful.

 

Thanks in advance!

 

E


Accepted Solutions
Nine Stars

Re: Search and Replace a List of Substrings

@EMeany - not sure why your tReplaceList was not successful, check below solution which does work out for my use case.

Job Layout - 

tReplaceList-1.JPG

I have setup the main data in one tFixedFlowInput and the data to be searched/replaced with in the other one. In your tReplaceList you need to select the MAIN as your main string and LOOKUP would be your lookup-table.

The tMap is used just to remove all unwanted spaces from the string, used regex -

row35.newColumn.replaceAll("\\s+"," ")

tReplaceList Configuration - 

tReplaceList-2.JPG


All Replies
Ten Stars

Re: Search and Replace a List of Substrings

dig into the regex replace stuff and use groups which you use in your replace by 

myStrValueColumn1.replaceAll("\\s+)", "_");

Replaces all repeating white spaces with a single underscore

"Hi     wow     what ya want?" -> "Hi_wow_what_ya_want?"

 

Regex is like magic! go for it.

Six Stars

Re: Search and Replace a List of Substrings

Hi EMeany,

You can pass the input from tfixedflowinput to tmap and do as per the screenshot in tmap editor

Like

Var.var1.replaceAll("Et Al","") 

(Var.var2.replaceAll("Usa Trl","USA Trail")).replaceAll("  "," ") 

and display in tlogrow.

 

Regards

Aaryan

Nine Stars

Re: Search and Replace a List of Substrings

@EMeany - not sure why your tReplaceList was not successful, check below solution which does work out for my use case.

Job Layout - 

tReplaceList-1.JPG

I have setup the main data in one tFixedFlowInput and the data to be searched/replaced with in the other one. In your tReplaceList you need to select the MAIN as your main string and LOOKUP would be your lookup-table.

The tMap is used just to remove all unwanted spaces from the string, used regex -

row35.newColumn.replaceAll("\\s+"," ")

tReplaceList Configuration - 

tReplaceList-2.JPG

Five Stars

Re: Search and Replace a List of Substrings

Thank you iamabhishek.

 

After looking closer at the job requirements, I noticed that the replacement list was being sent to tReplaceList without being sorted first.  In my particular use case this was required as some of the replacements needed to be made sequentially.  

Thanks again for your help.