One Star

[resolved] Delay output from tUnite or tUniqRow

I have a job that has two data flows into a tUnite component; the output from tUnite then loads into a tUniqRow component, and only the unique records are sent on. The problem is that the job reads all of the records from Data Flow 1 into the tUnite, sends those records onto the tUniqRow, and then outputs the unique records for Data Flow 1. After that is complete, it does the same for Data Flow 2.
The problem is, the whole point of the tUnite and tUniqRow components is that Data Flow 1 and Data Flow 2 may have the same records, and I want to make sure I do not process the same record twice. But it seems like the job is checking for unique rows only within the two data flows, not within the united data flows.
So is there a way to delay the output from the tUnite or the tUniqRow components until all records from both data flows have been read in?
Or am I misunderstanding how the tUniqRow component works? Does it maintain all records in memory to check so it doesn't matter if the Data Flow 2 records are read in after Data Flow 1 has been output?
5 REPLIES
Moderator

Re: [resolved] Delay output from tUnite or tUniqRow

Hi,
Could you please elaborate on your case with an example showing the input and expected output values? That way we can see whether the components you used fit your use case.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: [resolved] Delay output from tUnite or tUniqRow

Is this what you're asking for?
INPUT DATA:
Data Flow 1
========
ColA
------
100
200
300
400
500
Data Flow 2
========
ColA
------
100
999

EXPECTED OUTPUT DATA:
ColA
------
100
200
300
400
500
999
Moderator

Re: [resolved] Delay output from tUnite or tUniqRow

Hi,
I ran a test and could not reproduce the scenario you described:
"But it seems like the job is checking for unique rows only within the two data flows, not within the united data flows."

Feel free to let me know if I missed something.
Best regards
Sabrina
One Star

Re: [resolved] Delay output from tUnite or tUniqRow

In your example, the two data flows are read in simultaneously. The issue I was dealing with is that one entire data set would be read into and output from the tUniqRow, and then the other would be. However, I did some testing, and it appears that the tUniqRow does indeed keep all the records that pass through it in memory, so the second data flow is still compared against the records from the first.
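For anyone who finds this later, here is a minimal sketch of the behaviour I observed, using the sample data from my earlier post. It is written in Python purely for illustration and is not tUniqRow's actual implementation; the names `unique_rows`, `data_flow_1` and `data_flow_2` are my own, and I am assuming the uniqueness key is the whole row (ColA).

```python
from itertools import chain
from typing import Hashable, Iterable, Iterator


def unique_rows(rows: Iterable[Hashable]) -> Iterator[Hashable]:
    """Yield each row the first time it is seen, in arrival order."""
    seen = set()  # rows already emitted; grows with the number of distinct rows
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row  # emitted right away, without waiting for later flows


# Sample data from the earlier post (ColA values).
data_flow_1 = [100, 200, 300, 400, 500]
data_flow_2 = [100, 999]

# chain() delivers all of flow 1 and then all of flow 2, which mimics the
# sequential read order I was worried about.
united = chain(data_flow_1, data_flow_2)

result = list(unique_rows(united))
print(result)  # [100, 200, 300, 400, 500, 999]
```

Because the set of seen rows persists across both flows, the duplicate 100 from Data Flow 2 is dropped even though Data Flow 1 has already been output in full. The trade-off is memory: the set holds one entry per distinct row for the life of the job.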
Moderator

Re: [resolved] Delay output from tUnite or tUniqRow

Hi,
tUniqRow is a caching component, so it consumes memory. Thanks for sharing your experience with us.
Best regards
Sabrina