One Star

Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Hello @all:
I´m using TOS 5.1.0 and i want to filter data after some if statements in tJavaRow to MySqlOutput.
The Input Data consists of near to 200 data sets with near to 200 names.
For example: The InputData attribute "Name" contains among other characters also "?" or "/" or brackets "( )" in it.
For example: Name1/Name2? or i.e. Name1/Name2(Name3)? or i. e. Name1?/Name2?
My problem ist, that there are so many cases. I don´t know exactly the position of the slash or the brackets in the sttring, so i have to work with a for-loop! and the reason why i couldn´t specify the cases "contains" or "equals" in tFilterRow.
So i wrote a short program in tJavaRow with some if statements with "indexOf". One of these statements looks like:
"for (int i = 0; i < input_row.Name.length(); i++)
{
if ((input_row.Name.length - 1).indexOf("?")
{
System.out.println(input_row.Name)
}
}
output_row.Name = input_row.Name"

In the component there are shown i. e. 17 datasets and the tJavaRow runs really well. But i´ve to store (if it´s possible!?) these 17 outputs to another output database and not the 200 datasets (this is the case at the moment ;()
pls help!! thx in advance!
7 REPLIES
Community Manager

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Hi
Can you give an example with some real data to explain your request? What are your expected result?
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

At the moment i´ve to work with a coin-database, where one table is about the coin-issuer and you sometimes don´t know exactly the issuer of these coins. it´s a very big mix up in it and i want to give a new structure, so it could be some tables look like:
id Name
1 Name1/Name2
2 Name3
3 Name4
4 Name5/Name1?
5 Name6?
8 Name3?/Name7?
so.. i seperated all those cases in exactly 4 cases:
id1 means that it could be name1 OR name2. i call this case "alternative".
id2 and id 3 mean, that it is exactly name3 for id 2 and name4 for id3. no one else! this case i call "certain".
id4 is another case that means it was name5 OR name1 but i also could be someone else. it´s the "uncertain&alternative case".
id5 is name6 or someone else. it´s "uncertain".
so... everyone can see that there´s a big chaos in it, i´ve 4 different cases and if you´ve look to id1 and id4 you see two times name1.. so the database isn´t really normalized and it´s values are not atomic.
first of all i defined all cases "uncertain, certain, alternative und uncertain&alternative.
in the end i would like to have a database like this:
id Name
1 Name1
2 Name2
3 Name3
...
so i´ve to split the names and would give them id´s for uncertainty. certain = 1, uncertain = 2, and so on to make sure, that those, who use the database, can see, if someone is certain or not.
behind this table there´s a main table, called "coinfind" where you can find all the informations about that coins.
i. e. coin id 126 has now issuer 1 (Name1/Name2) but coin 177 has the issuer id 4 (Name5/Name1?). finally i would like to make a lookup and replace the id´s, where you can see the Names two or more times. Name1 i. e. has two different id´s and should only have one in the end.
to cut a long story short:
now i´m doing the filter of those 4 cases. in one table all the certain id´s. in the second all the uncertain ones. in the third all the alternatives (which i have to split afterwards). then i will give them all their uncertain id´s and then i would like make the loopup and replace the id´s.
now i did 4 jobs: from mysql input to tjavarow to mysql output. i use tjavarow in 4 different jobs. one job for certain, one for uncertain and so on. in the certain job i only would like to have names without "?" and without "/"..
so i did some if statements. now the issuer table has near to 200 data sets. i. e. i´ve after tjavarow 22 certain id´s.. i can see them well in the component, but it doesn´t write them in the ouput database. in the output database there will be the 200 datasets again from the beginning.
because that i´ve more cases and because that in the cases i have to define exactly the index of a "?" or a "/", i can´t work with "contains" or the the operator in the tFilterRow.
i´ve no more ideas to resolve the problem ;( and i´m very thankful for your response.
Community Manager

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Hi
You don't need to create 4 jobs, only one job is OK to filter thoese 4 cases, use tReplicate to replicate the data flow and then link to 4 tFilterRow components, each tFilterRow filter one case. Note that, tJavaRow don't filter rows, if the incoming data flow has N rows, it also output N rows. I created an example to explain to get all the certain rows. see my screeshots
If you really want to use tJavaRow in your case, the code should be like:
"for (int i = 0; i < input_row.Name.length(); i++)
{
if ((input_row.Name.length - 1).indexOf("?")
{
System.out.println(input_row.Name)
output_row.name=input_row.name;
}else{
output_row.name="";
}
}

and then, add a tFilterRow after tJavaRow to remove all the rows which name is "".
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

thx for your quick reply.
i solved the problem with using tjavarow and the filter, but i had a problem with the for-loop (because of tjava"row", solution was "output_row.Uncertainty = "false"; " in the beginning of the programm) and another one within the filter. in the filter now i´m using >> Uncertainty (input row), Operator == and uncertain or certain to filter. so.. thx for helping me!!
String fragezeichen = "?";
String slash = "/";
String klammerAuf = "(";
String klammerZu = ")";
String fragezeichenMitte = "? (";
output_row.Uncertainty = "uncertain";

int k = input_row.Name.indexOf(fragezeichen); //k ist Position von "?"
int l = input_row.Name.indexOf(slash); //l ist Position von "/"
int m = input_row.Name.indexOf(klammerAuf); //m ist Position von "("
int n = input_row.Name.indexOf(klammerZu); //n ist Position von ")"
int o = input_row.Name.indexOf(fragezeichenMitte); //o ist Position von "? ("

if ((k < 0 && l < 0) || (m < l && l < n))
{
System.out.println("k" + k);
System.out.println("l" + l);
System.out.println("m" + m);
System.out.println("n" + n);
System.out.println(input_row.id + input_row.Name);
output_row.Uncertainty = "certain";
}

output_row.Name = input_row.Name;
output_row.id = input_row.id;
output_row.Description = input_row.Description;
output_row.Uncertainty = output_row.Uncertainty;

nadine
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Hi,
Didn't mean to break the flow of the discussion but believe the functionality can be achieved with tExtractRegexFields also you can try with following expression "(^*$)"
And schema would be like
Name -- FilteredName
Location -- Location
Where FilteredName represents all the name after filtering ()*&^%$#@! and other special character, it follows rule that string should start with a character(either in lower case or upper case), contains characters(either in lower case or upper case) or spaces in between and should end with either a character(either in lower case or upper case) or with a space.
Hope that helps.
--
Regards,
Vinod
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Dear, again in the www.mmolive.com valentine's day. I don't know how many of you have me to your chocolate, but I see clearly... www.mmohome.com
!This year I can send--to reduce weight tea?
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

ok, thx, next time i will try to use it.
now i´ve another problem with the .split method, but i will open another thread.