I have a job with two data flows feeding a tUnite component; the output from tUnite goes into a tUniqRow component, and only the unique records are sent on.

What I see at run time is that the job reads all of the records from Data Flow 1 into the tUnite, passes them to the tUniqRow, and outputs the unique records for Data Flow 1. Only after that is complete does it do the same for Data Flow 2. The whole point of using tUnite and tUniqRow is that Data Flow 1 and Data Flow 2 may contain the same records, and I want to make sure I do not process the same record twice. But it looks as though the job is checking for unique rows only within each data flow separately, not across the united data flows.

So is there a way to delay the output from the tUnite or the tUniqRow component until all records from both data flows have been read in? Or am I misunderstanding how the tUniqRow component works? Does it keep every record it has seen in memory for comparison, so that it doesn't matter if the Data Flow 2 records are read in after the Data Flow 1 records have already been output?
Re: [resolved] Delay output from tUnite or tUniqRow
In your example, the two data flows are read in simultaneously; the issue I was dealing with is that one entire data set would be read into and output from the tUniqRow, and then the other would be. However, I did some testing, and it appears that the tUniqRow does indeed keep all the records that pass through it in memory for comparison, so when the second data flow comes in, any record already seen in the first flow is still caught as a duplicate.
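
For anyone who finds this later, here is a rough Java sketch of the behavior I observed. It is not Talend's generated code, just an illustration under the assumption that tUniqRow works like a set of already-seen keys that lives for the whole run. The class and method names (StreamingUniqSketch, isFirstOccurrence, uniteThenUniq) are made up for this example, and each record is reduced to a single String key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: mimics the observed tUnite -> tUniqRow behavior,
// it is NOT the code Talend generates.
public class StreamingUniqSketch {
    // Keys already emitted; kept for the whole run, across both flows.
    private final Set<String> seen = new HashSet<>();

    // Returns true the first time a key is offered, false for every repeat.
    public boolean isFirstOccurrence(String key) {
        return seen.add(key);
    }

    // Hypothetical helper: feeds all of flow 1, then all of flow 2,
    // through one shared "uniq" instance, emitting rows as they arrive.
    public static List<String> uniteThenUniq(List<String> flow1, List<String> flow2) {
        StreamingUniqSketch uniq = new StreamingUniqSketch();
        List<String> uniques = new ArrayList<>();
        for (List<String> flow : Arrays.asList(flow1, flow2)) {
            for (String key : flow) {
                if (uniq.isFirstOccurrence(key)) {
                    uniques.add(key); // output immediately, no need to wait for flow 2
                }
            }
        }
        return uniques;
    }

    public static void main(String[] args) {
        List<String> flow1 = Arrays.asList("A", "B", "A");
        List<String> flow2 = Arrays.asList("B", "C");
        System.out.println(uniteThenUniq(flow1, flow2)); // prints [A, B, C]
    }
}
```

Because the set of seen keys outlives the first flow, the "B" from Data Flow 2 is dropped even though Data Flow 1 had already been fully output, which is why delaying the output turned out not to be necessary.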