I have a number of patient IDs in an XML file. Since the file contains patient notes, the same ID appears multiple times when I extract all IDs. For each unique patient I add a new entry to a lookup table that maps the patient's ID to a new, meaningless ID. There are probably a few approaches to doing this, and here's mine:
tFileInputXML -> tFlowToIterate -?> tJoin -> (got it from here)
Now I want to do a tJoin for each ID from the XML with the lookup table. Each rejected inner join means the patient ID is not yet in the lookup table so I'll add it downstream. There is probably something I am not yet grasping (I'm new to Talend) but I can't connect the tFlowToIterate to the tJoin.
Your suggestion works but I'm not sure whether I really need to iterate since the result after tFileInputXML seems no different...
My purpose for using tJoin is to filter out duplicate patient IDs which already have an entry in the lookup table. Only for the inner join rejects I add a new entry that maps the ID to a new (meaningless) ID.
Would you agree it is better to remove duplicates using tUniq before doing the tJoins then? Having each tJoin done in a subjob for the sake of incorporating the latest additions to the lookup table would unnecessarily complicate things, no?
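To make the intended logic concrete outside of Talend, here is a minimal plain-Java sketch of what I'm after: deduplicate first (the tUniq step), then look each unique ID up and add an entry only for the ones that are missing (the inner join rejects). This is not Talend-generated code. The class name `PatientIdLookup`, the method `addMissing`, and the sequential `P<n>` scheme for the new IDs are all assumptions made up for illustration, and the counter assumes existing entries follow that same scheme without gaps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PatientIdLookup {

    // Hypothetical helper: adds a lookup entry for every extracted ID that is
    // not yet in the lookup table and returns the IDs that were newly added
    // (the equivalent of the inner join rejects).
    public static List<String> addMissing(List<String> extractedIds,
                                          Map<String, String> lookup) {
        // Step 1 (what tUniq would do): drop duplicates, keeping first-seen order.
        Set<String> uniqueIds = new LinkedHashSet<>(extractedIds);

        List<String> added = new ArrayList<>();
        // Assumed ID scheme: "P" + running number, continuing after existing entries.
        int next = lookup.size() + 1;

        // Step 2 (what the tJoin reject flow would do): keep only unknown IDs.
        for (String id : uniqueIds) {
            if (!lookup.containsKey(id)) {
                lookup.put(id, "P" + next++);
                added.add(id);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = new LinkedHashMap<>();
        lookup.put("A17", "P1"); // already mapped by an earlier run

        List<String> fromXml = Arrays.asList("A17", "B22", "B22", "C03");
        List<String> added = addMissing(fromXml, lookup);

        System.out.println(added);  // [B22, C03]
        System.out.println(lookup); // {A17=P1, B22=P2, C03=P3}
    }
}
```

Because the duplicates are removed before the lookup, each ID is checked against the table exactly once, so the lookup never has to be reloaded mid-run to see entries added a moment earlier.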