Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

One Star

Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Hello @all:
I´m using TOS 5.1.0 and i want to filter data after some if statements in tJavaRow to MySqlOutput.
The Input Data consists of near to 200 data sets with near to 200 names.
For example: The InputData attribute "Name" contains among other characters also "?" or "/" or brackets "( )" in it.
For example: Name1/Name2? or i.e. Name1/Name2(Name3)? or i. e. Name1?/Name2?
My problem ist, that there are so many cases. I don´t know exactly the position of the slash or the brackets in the sttring, so i have to work with a for-loop! and the reason why i couldn´t specify the cases "contains" or "equals" in tFilterRow.
So i wrote a short program in tJavaRow with some if statements with "indexOf". One of these statements looks like:
"for (int i = 0; i < input_row.Name.length(); i++)
{
if ((input_row.Name.length - 1).indexOf("?")
{
System.out.println(input_row.Name)
}
}
output_row.Name = input_row.Name"

In the component there are shown i. e. 17 datasets and the tJavaRow runs really well. But i´ve to store (if it´s possible!?) these 17 outputs to another output database and not the 200 datasets (this is the case at the moment ;()
pls help!! thx in advance!
Community Manager

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Hi
Can you give an example with some real data to explain your request? What are your expected result?
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

At the moment i´ve to work with a coin-database, where one table is about the coin-issuer and you sometimes don´t know exactly the issuer of these coins. it´s a very big mix up in it and i want to give a new structure, so it could be some tables look like:
id Name
1 Name1/Name2
2 Name3
3 Name4
4 Name5/Name1?
5 Name6?
8 Name3?/Name7?
so.. i seperated all those cases in exactly 4 cases:
id1 means that it could be name1 OR name2. i call this case "alternative".
id2 and id 3 mean, that it is exactly name3 for id 2 and name4 for id3. no one else! this case i call "certain".
id4 is another case that means it was name5 OR name1 but i also could be someone else. it´s the "uncertain&alternative case".
id5 is name6 or someone else. it´s "uncertain".
so... everyone can see that there´s a big chaos in it, i´ve 4 different cases and if you´ve look to id1 and id4 you see two times name1.. so the database isn´t really normalized and it´s values are not atomic.
first of all i defined all cases "uncertain, certain, alternative und uncertain&alternative.
in the end i would like to have a database like this:
id Name
1 Name1
2 Name2
3 Name3
...
so i´ve to split the names and would give them id´s for uncertainty. certain = 1, uncertain = 2, and so on to make sure, that those, who use the database, can see, if someone is certain or not.
behind this table there´s a main table, called "coinfind" where you can find all the informations about that coins.
i. e. coin id 126 has now issuer 1 (Name1/Name2) but coin 177 has the issuer id 4 (Name5/Name1?). finally i would like to make a lookup and replace the id´s, where you can see the Names two or more times. Name1 i. e. has two different id´s and should only have one in the end.
to cut a long story short:
now i´m doing the filter of those 4 cases. in one table all the certain id´s. in the second all the uncertain ones. in the third all the alternatives (which i have to split afterwards). then i will give them all their uncertain id´s and then i would like make the loopup and replace the id´s.
now i did 4 jobs: from mysql input to tjavarow to mysql output. i use tjavarow in 4 different jobs. one job for certain, one for uncertain and so on. in the certain job i only would like to have names without "?" and without "/"..
so i did some if statements. now the issuer table has near to 200 data sets. i. e. i´ve after tjavarow 22 certain id´s.. i can see them well in the component, but it doesn´t write them in the ouput database. in the output database there will be the 200 datasets again from the beginning.
because that i´ve more cases and because that in the cases i have to define exactly the index of a "?" or a "/", i can´t work with "contains" or the the operator in the tFilterRow.
i´ve no more ideas to resolve the problem ;( and i´m very thankful for your response.
Community Manager

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Hi
You don't need to create 4 jobs, only one job is OK to filter thoese 4 cases, use tReplicate to replicate the data flow and then link to 4 tFilterRow components, each tFilterRow filter one case. Note that, tJavaRow don't filter rows, if the incoming data flow has N rows, it also output N rows. I created an example to explain to get all the certain rows. see my screeshots
If you really want to use tJavaRow in your case, the code should be like:
"for (int i = 0; i < input_row.Name.length(); i++)
{
if ((input_row.Name.length - 1).indexOf("?")
{
System.out.println(input_row.Name)
output_row.name=input_row.name;
}else{
output_row.name="";
}
}

and then, add a tFilterRow after tJavaRow to remove all the rows which name is "".
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

thx for your quick reply.
i solved the problem with using tjavarow and the filter, but i had a problem with the for-loop (because of tjava"row", solution was "output_row.Uncertainty = "false"; " in the beginning of the programm) and another one within the filter. in the filter now i´m using >> Uncertainty (input row), Operator == and uncertain or certain to filter. so.. thx for helping me!!
String fragezeichen = "?";
String slash = "/";
String klammerAuf = "(";
String klammerZu = ")";
String fragezeichenMitte = "? (";
output_row.Uncertainty = "uncertain";

int k = input_row.Name.indexOf(fragezeichen); //k ist Position von "?"
int l = input_row.Name.indexOf(slash); //l ist Position von "/"
int m = input_row.Name.indexOf(klammerAuf); //m ist Position von "("
int n = input_row.Name.indexOf(klammerZu); //n ist Position von ")"
int o = input_row.Name.indexOf(fragezeichenMitte); //o ist Position von "? ("

if ((k < 0 && l < 0) || (m < l && l < n))
{
System.out.println("k" + k);
System.out.println("l" + l);
System.out.println("m" + m);
System.out.println("n" + n);
System.out.println(input_row.id + input_row.Name);
output_row.Uncertainty = "certain";
}

output_row.Name = input_row.Name;
output_row.id = input_row.id;
output_row.Description = input_row.Description;
output_row.Uncertainty = output_row.Uncertainty;

nadine
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Hi,
Didn't mean to break the flow of the discussion but believe the functionality can be achieved with tExtractRegexFields also you can try with following expression "(^*$)"
And schema would be like
Name -- FilteredName
Location -- Location
Where FilteredName represents all the name after filtering ()*&^%$#@! and other special character, it follows rule that string should start with a character(either in lower case or upper case), contains characters(either in lower case or upper case) or spaces in between and should end with either a character(either in lower case or upper case) or with a space.
Hope that helps.
--
Regards,
Vinod
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

Dear, again in the www.mmolive.com valentine's day. I don't know how many of you have me to your chocolate, but I see clearly... www.mmohome.com
!This year I can send--to reduce weight tea?
One Star

Re: Filtering Data from MySqlInput > tJavaRow > tFilterRow to MySqlOutput

ok, thx, next time i will try to use it.
now i´ve another problem with the .split method, but i will open another thread.

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Best Practices for Using Context Variables with Talend – Part 4

Pick up some tips and tricks with Context Variables

Blog

How Media Organizations Achieved Success with Data Integration

Learn how media organizations have achieved success with Data Integration

Read

Downloads and Trials

Test drive Talend's enterprise products.

Downloads