How to Join a String with substring values if substring scan matches?

I have a data source (DS1, CSV schema) and another data source (DS2, reference values).
I need to iterate through the reference data rows (DS2) to check whether any of them matches a scan of DS1.
The scan is done by a custom code routine built on the Java Scanner class. The routine getSubstringInString(WholeString, Mysubstring) returns Mysubstring if the scan finds it, or null otherwise.
I thought I could use a tMap component and do an inner join with specific conditions on both strings.
<Util.getSubstringInString(row5.WholeString, row6.Mysubstring).matches(row6.Mysubstring)>
and set a default value <UNKNOWN> in case of a reject (no match in any row).
But either the tMap doesn't allow this kind of setting or I get a failure at execution.
Any help would be appreciated.

Re: How to Join a String with substring values if substring scan matches?

Hi,
can you give us a screenshot of the mapping in tMap and more information about your routine and the error you get?
Without this information I can only guess:
If your routine getSubstringInString returns null when it does not match, you would get a NullPointerException from ".matches()". You should check the result first and call matches only if it is not null.
By the way: if this function returns your substring, the matches call makes no sense, does it?
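A minimal sketch of the null check described above. The routine body here is only a stand-in that follows the description in the question (it returns null when nothing is found), and the class name NullSafeMatch and the helper isMatch are hypothetical names used for illustration.

```java
import java.util.Scanner;

class NullSafeMatch {
    // Stand-in for the routine from the question (assumption):
    // returns the matched text, or null when nothing is found.
    static String getSubstringInString(String strToLookIn, String strToScan) {
        Scanner sc = new Scanner(strToLookIn);
        String found = sc.findInLine(strToScan);
        sc.close();
        return found;
    }

    // Hypothetical helper: test for null first, then compare with equals().
    // Calling a method on the result without the null test is what would
    // raise the NullPointerException.
    static boolean isMatch(String whole, String sub) {
        String found = getSubstringInString(whole, sub);
        return found != null && found.equals(sub);
    }

    public static void main(String[] args) {
        System.out.println(isMatch("ABCD", "BC"));
        System.out.println(isMatch("ABCD", "XY"));
    }
}
```

Since the routine already returns the substring itself when it is found, the null test alone (getSubstringInString(...) != null) should carry the same information as the extra comparison.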
Bye
Volker

Re: How to Join a String with substring values if substring scan matches?

Hi,
the routine code is
// Requires: import java.util.Scanner;
public static boolean findInString(String strToLookIn, String strToScan) {
    // True if strToScan is found in strToLookIn.
    Scanner sc = new Scanner(strToLookIn);
    String strScan = sc.findInLine(strToScan);
    sc.close();
    return strScan != null;
}

public static String getSubstringInString(String strToLookIn, String strToScan) {
    // Returns the matched text, or null if nothing is found.
    Scanner sc = new Scanner(strToLookIn);
    String strScan = sc.findInLine(strToScan);
    sc.close();
    return strScan;
}
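One thing to keep in mind with the routines above: Scanner.findInLine(String) compiles its argument as a regular expression and only searches up to the next line separator. If the reference values are meant as literal text, a sketch like the following may be closer to the intent. The class and method names are hypothetical, and whether literal matching is what is wanted here is an assumption.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

class LiteralScan {
    // Scanner-based variant: Pattern.quote makes the search string literal,
    // so characters such as '.' or '(' lose their regex meaning.
    static boolean findLiteralWithScanner(String strToLookIn, String strToScan) {
        try (Scanner sc = new Scanner(strToLookIn)) {
            return sc.findInLine(Pattern.quote(strToScan)) != null;
        }
    }

    // Simpler variant: plain substring test over the whole string,
    // not just its first line.
    static boolean findLiteral(String strToLookIn, String strToScan) {
        return strToLookIn.contains(strToScan);
    }

    public static void main(String[] args) {
        System.out.println(findLiteralWithScanner("A.C (x)", "(x)"));
        System.out.println(findLiteral("ABC", "A.C"));
    }
}
```

The contains variant also avoids creating a Scanner for every row, which may matter on large inputs.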
The enclosed images show the purpose of the job and a tMap view of what I wish to do.
If I set a filter on the output, it works with
Util.findInString("ABC", "B") || Util.findInString("ABCD", "BC") || ... which could be a very long string loaded in the context,
but this doesn't give as clear a view as a tMap schema would. And I would have to set the filter twice to get the rejected data (NotInFirstFilterList) to look for in the next filterList, ...
When I used this kind of filter, it took 30 minutes to parse 1 000 000 000 rows (tFileList).
As far as the error is concerned, it was an access error on "row6.Libelle" in the join expression Util.getSubstringInString(row5.Libelle, row6.Libelle)==row6.Libelle.
I would rather do a join between both data sources. I'm looking for ideas.
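To make the goal concrete, here is a rough sketch in plain Java of the lookup I am after: for each DS1 string, return the first DS2 reference value it contains, or UNKNOWN if none matches. The class ReferenceLookup and the method firstMatch are hypothetical names, and how the DS2 rows end up in a list inside the job is not shown and is an assumption.

```java
import java.util.Arrays;
import java.util.List;

class ReferenceLookup {
    static final String DEFAULT_VALUE = "UNKNOWN";

    // Returns the first reference value contained in wholeString,
    // or DEFAULT_VALUE when no reference value matches.
    static String firstMatch(String wholeString, List<String> references) {
        if (wholeString == null) {
            return DEFAULT_VALUE;
        }
        for (String ref : references) {
            // Skip null or empty reference values; an empty string
            // would otherwise match every input.
            if (ref != null && !ref.isEmpty() && wholeString.contains(ref)) {
                return ref;
            }
        }
        return DEFAULT_VALUE;
    }

    public static void main(String[] args) {
        List<String> refs = Arrays.asList("BC", "XY");
        System.out.println(firstMatch("ABCD", refs));
        System.out.println(firstMatch("QRS", refs));
    }
}
```

This compares every DS1 row against every reference value, so the cost grows with the size of both data sources.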
Thanks for your help.
Jnb

Re: How to Join a String with substring values if substring scan matches?

I had trouble posting the images.
After resizing them, I could only post one image at a time.
So here's the second image.

Re: How to Join a String with substring values if substring scan matches?

Hi,
the text in the pictures is not readable. Can you post your filter and join code as text?
Which error do you get?
I'm not sure, but do you really use the findInString function with two literals in the upper filter? That would make no sense. But, as I said, it is hard to read, so it could be something different.
Bye
Volker