tUnite work within the Talend job

One Star

Hi,
As a result of another post (see below), I have been told that the tUnite component cannot unite data that has been split inside the same job. The recommendation was to write the split streams out to real files on disk and then read them back into my Talend job through the tUnite component. This seems a bit long-winded.
I'd like the tUnite component to be able to re-combine data from multiple streams even if those streams come from a single stream in the same Talend job.
Below is my original post which led to the above comment and describes what I'm trying to achieve.
************************** start of original post *****************************************
Hi folks,
I'm fairly new to TOS and am putting together my first 'real' job. I know what I'm trying to achieve but I'm not sure if I'm going about it the right way.
I have an input file which contains multiple record types. Different record types have different numbers of columns.
What I'm trying to do is to write out a single file with the columns of each record type moved such that each column in the output file has a single data type.
For example:
- input file may contain the following 3 records
RT1,20100101,fred
RT2,12345,david
RT3,abc,1234
My output file needs to be in the following format:
- column#1 should be string (is my record type)
- column#2 should be a date
- column#3 should be a string
- column#4 should be numeric.
Using my sample data from above my output file will be as follows:
RT1,20100101,fred
RT2,,david,12345
RT3,,abc,1234
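For illustration only, the column re-alignment described above could be sketched in plain Java, outside of any Talend component. The class name `RecordAligner` and its methods are hypothetical, and the per-record-type layouts are inferred from the three sample rows, so treat this as a sketch under those assumptions rather than how tMap does it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Talend.
class RecordAligner {
    // Target layout from the post: record type, date, string, number.
    static String[] align(String line) {
        String[] in = line.split(",", -1);
        String[] out = {in[0], "", "", ""};
        switch (in[0]) {
            case "RT1": // assumed input layout: RT1,date,string
                out[1] = in[1];
                out[2] = in[2];
                break;
            case "RT2": // assumed input layout: RT2,number,string
                out[3] = in[1];
                out[2] = in[2];
                break;
            case "RT3": // assumed input layout: RT3,string,number
                out[2] = in[1];
                out[3] = in[2];
                break;
            default:
                throw new IllegalArgumentException("Unknown record type: " + in[0]);
        }
        return out;
    }

    // Aligns every row and keeps them in one list, i.e. a single output flow.
    static List<String> alignAll(List<String> lines) {
        List<String> united = new ArrayList<>();
        for (String line : lines) {
            united.add(String.join(",", align(line)));
        }
        return united;
    }
}
```

This sketch always emits four columns, so the RT1 row ends with an empty fourth field. Because every row is re-aligned in a single pass, nothing needs to be split and re-united here.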
I am using tMap to split the input file based on the record type (RT1, RT2 etc) and for testing have written those to tLogRow so I know I'm doing the correct processing.
How do I merge them back into one file? I realise that I could write them all out to different files and then iterate through a file list, but that seems like overkill to me. I'm typically processing fewer than 10 rows on each execution of the job.
I tried to use tUnite directly from the output of tMap but couldn't get it to accept multiple inputs. The first output from tMap dropped onto the tUnite component OK, but subsequent ones would not connect to it. The icon shown was the "not allowed" one.
I then considered using a tBufferOutput for each row type coming out of tMap. This works, but when I try to use tUnite to merge them back together I hit the same problem.
What's the best way to achieve this? All ideas and suggestions gratefully received.
************************** end of original post *****************************************
How about allowing tUnite to read from multiple tBufferOutput/tBufferInput components within the same Talend job?
Cheers,
Dave
One Star

Re: tUnite work within the Talend job

Hello,
I am running into exactly the same problem. It is not possible to use tUnite when the data to combine has been split within the same job.
Is there a reason for this, or is it a bug?
Does it mean that the only way is to first write all the data out to files and then read it back?
One Star

Re: tUnite work within the Talend job

I am a new TOS user and now in the same boat as Dave. I have input data which has multiple contacts per account all in one row (not a great format, I know, but I inherited it). My job is to split all the contact columns into separate streams with the same schema (easy enough with a tMap) and then re-merge the split data flows in the same way as a UNION operator would in SQL, so I have one nice data flow of contacts.
I do NOT think it's practical or desirable to have to write these streams out to file/database, then make new inputs and pump them all into the tUnite component.
WHY can't the tUnite component accept multiple inputs which happen to have been split off from a single input earlier upstream? This kind of operation would be really great to be able to accomplish simply. Are we all missing something, or has this still not been addressed?
- pat
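
As a rough illustration of the split-then-union idea described in this post, here is a minimal Java sketch. The row layout (account in the first column, one contact per following column) and the class name `ContactUnion` are assumptions made for the example; the post does not give the real schema.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Talend.
class ContactUnion {
    // Assumed layout: column 0 is the account, every later column is one contact.
    static List<String[]> unite(List<String[]> accountRows) {
        int maxCols = 0;
        for (String[] row : accountRows) {
            maxCols = Math.max(maxCols, row.length);
        }
        List<String[]> united = new ArrayList<>();
        // Each pass over one contact column plays the role of one split stream;
        // appending the passes one after another acts like a UNION ALL.
        for (int col = 1; col < maxCols; col++) {
            for (String[] row : accountRows) {
                if (col < row.length && !row[col].isEmpty()) {
                    united.add(new String[] {row[0], row[col]});
                }
            }
        }
        return united;
    }
}
```

Every output row has the same two-column schema (account, contact), which is the property the united flow needs.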
Community Manager

Re: tUnite work within the Talend job

Can you upload a screenshot of your job, so that we can see what you are trying to do?
Best regards
Shong
Seven Stars

Re: tUnite work within the Talend job

Inside tMap, click the green plus to add an output table but instead of using the default new output create a join table from your main/first output table. You can do this as many times as you want and the output from each will be united for you by tMap and sent through the first output flow.
Or you can use tSplitRow in v4.2.x.
One Star

Re: tUnite work within the Talend job

I think it's a bug in Talend...
Please try this...
Split the input based on 3 conditions and create 3 different outputs. Try to connect all 3 outputs to a tUnite component. It won't allow us to connect the 2nd and 3rd outputs.
Moderator

Re: tUnite work within the Talend job

Hi bkar81,
Actually, it is not a bug.
Your job design creates a closed loop (a "circle") in the workflow, which Talend does not allow.
Please have a look at the KB article TalendHelpCenter:Can I create a Job with multiple paths from a single source to the same target?
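
One way to picture the "circle" is to ignore link direction and ask whether a new link joins two components that are already connected. The Java sketch below does that with a small union-find structure. It is only a possible reading of the restriction, not Talend's actual validation code, and the class name `FlowLoopCheck` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a job design; not Talend code.
class FlowLoopCheck {
    private final Map<String, String> parent = new HashMap<>();

    // Finds the representative of the group a component belongs to.
    private String find(String node) {
        parent.putIfAbsent(node, node);
        String p = parent.get(node);
        if (!p.equals(node)) {
            p = find(p);
            parent.put(node, p); // path compression
        }
        return p;
    }

    // Returns false when the link would close a loop, ignoring direction.
    boolean connect(String from, String to) {
        String a = find(from);
        String b = find(to);
        if (a.equals(b)) {
            return false; // both ends are already linked through another path
        }
        parent.put(a, b);
        return true;
    }
}
```

Under this model the first tMap output connects to tUnite, while a second link from the same tMap to the same tUnite is rejected, which matches the behaviour reported earlier in the thread.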
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: tUnite work within the Talend job

Yep, after posting, I went through the forums, understood the feature restriction, and found out that this can be achieved using the Hash components.
Moderator

Re: tUnite work within the Talend job

Hi,
That's great. Feel free to post any issues or difficulties on the forum.
Best regards
Sabrina
One Star

Re: tUnite work within the Talend job

Could the tMap and tUnite components have a configurable option to allow or disallow this cyclic flow, disallowed by default? It would then be up to the developers to handle the situation.
It's just my opinion, though I don't know how feasible or necessary it would be.
One Star

Re: tUnite work within the Talend job

I have a question as well.
Will tHashInput wait for tHashOutput, or do we need to explicitly connect an OnSubjobOk trigger to one of the tHashInputs?
Moderator

Re: tUnite work within the Talend job

Hi,
There are related scenarios in the component reference TalendHelpCenter:tHashInput. I hope they will be helpful for you.
Best regards
Sabrina
One Star

Re: tUnite work within the Talend job

Thanks. But the example there also shows only OnSubjobOk, so I think that is the way I need to go.
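
To make the ordering concern concrete, here is a minimal Java sketch of a write subjob followed by a read subjob sharing an in-memory cache. It assumes the read side does not block or wait for the write side, which is an assumption for illustration and not confirmed Talend behaviour; the class name `HashCacheSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the hash-component pattern; not Talend code.
class HashCacheSketch {
    // Stands in for the in-memory store shared by the hash components.
    private final List<String> cache = new ArrayList<>();

    // Subjob 1: each split flow appends its rows to the same cache.
    void writeSubjob(List<String> flowA, List<String> flowB) {
        cache.addAll(flowA);
        cache.addAll(flowB);
    }

    // Subjob 2: returns a copy of whatever the cache holds when it runs.
    List<String> readSubjob() {
        return new ArrayList<>(cache);
    }
}
```

In this model, calling `readSubjob()` before `writeSubjob(...)` yields an empty list, which is why an explicit trigger such as OnSubjobOk between the two subjobs is the safe choice.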