Hi, I have an input file like this: ID;First_Name;Last_Name;Team 835;Lionel|Cristano;Messi|Ronaldo;Barca|Real 836;Andres|Lionel;Iniesta|Messi;Barca|Barca And I'd like to have something like this: ID;First_Name;Last_Name;Team 835;Lionel;Messi;Barca 835;Cristiano;Ronaldo,Real 836;Andres;Iniesta;Barca 836;Lionel;Messi;Barca (Just for information, the final goal is to have two tables: -one with the referential of person: ID_Person;Name;Last_Name;Team 1;Lionel;Messi;Barca 2;Cristiano;Ronaldo,Real 3;Andres;Iniesta;Barca -and one with the link ID-ID_Person: ID;ID_Person 835;1 835;2 836;3 836;1) I use Talend all the time for my job but I don't manage to do this, I try with tExtractDelimitedFields, tDenormalize and tUnpivot... Anyone could help me? Thank you, Romain
For information, with 3 tExtractDelimited and 1 tUnpivotRow, I manage to have something like this, but it's not really what I want... : ID;Column1;Column2 835;First_Name;Lionel 835;First_Name;Cristiano 835;Last_Name;Messi 835;Last_Name;Ronaldo 835;Team;Barca 835;Team;Real 836;First_Name;Andres 836;First_Name;Lionel 836;Last_Name;Iniesta 836;Last_Name;Messi 836;Team;Barca 836;Team;Barca
I would use simple tFileInputDelimited using ";" as a delimiter. so the schema returns 4 columns you can now connect the schema to a tMap with a lookup of fixed values (0 and 1) The lookup must have Load Once and All rows settings with no real join to the main flow.. I am using hardcoded values tFixedFlowInput with schema: Idx, integer csv.First_Name.split("\\|") - gives me first persons fname for the (0) value of Idx csv.First_Name.split("\\|") - gives me second persons fname for the (1) value of Idx apply the same function for Last_Name and Team columns The lookup will act as a multiplier so if the tMap has 2 main rows from csv and 2 rows from your lookup the output will be 2x2 = 4 if 3 rows from csv the output will be 3x2 = 6 and so on ... Now it should be easy if you have output with the normalised values: ID;First_Name;Last_Name;Team 835;Lionel;Messi;Barca 835;Cristiano;Ronaldo,Real 836;Andres;Iniesta;Barca 836;Lionel;Messi;Barca you can do an insert update on a Person table and than your link table... Hope you got the idea.
Hello Lubod, thank you very much for your answer, it works! Very good trick. In fact, sometime I have one person per ID, sometime 2, sometime 3 (or more). The input could be like that: ID;First_Name;Last_Name;Team 835;Lionel|Cristano;Messi|Ronaldo;Barca|Real 836;Andres|Lionel|Franck;Iniesta|Messi|Ribery;Barca|Barca|Bayern 837;Lionel;Messi;Barca So I use one tFixedFlowInput (name=two_id) with (0,1), on another (name=three_id) with (0,1,2) And I choose which one to use with a StringHandling.COUNT. In the tmap, the function is: StringHandling.COUNT(First_Name,"|")==0?First_Name: StringHandling.COUNT(First_Name,"|")==1?First_Name.split("\\|") : StringHandling.COUNT(row10.First_Name,"|")==2?First_Name.split("\\|") :"more than 3 persons" I use a tUniqRow after that because I have two many rows (due to the two lookup inputs), but it works. The job doesn't look so bad, but if I have 10 persons per ID, I have to put 10 lookup outputs, and it will generate a lot of rows before the tUniqRow component! If you have a better idea, let me know! thank you again, Romain