Checking row numbers of two files then combining them


Hi, thanks for looking at my thread.
My use case is that I have two files that need to be combined into a single file with some field mapping. Both input files have a header line that contains the number of rows that should be in the file. If both files contain the correct number of rows, they should be combined in the tMap.
*I currently have a tFileInput that grabs the header and saves the expected number of rows as a global variable.
*Next, the tFileRowCount saves the actual number of rows as another global variable.
*A tJava component has a RUN IF connection. If the row count from the header and the row count from the tFileRowCount are equal, the input file loads into the tMap. Ideally I want a component to send an email if the row counts don't match, but I'll finish that when I have the basic use case working.
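
To make the comparison concrete, here is a minimal plain-Java sketch of the check behind the RUN IF link. It is not taken from my job: the map only stands in for Talend's globalMap, "headerRowCount" is a hypothetical key for the value saved from the header, and "tFileRowCount_1_COUNT" is assumed to be the key under which the tFileRowCount publishes its count. The values are compared as ints, because == on two Integer objects compares references rather than numbers.

```java
import java.util.HashMap;
import java.util.Map;

final class RunIfCondition {

    // Null-safe comparison of two counts stored in a globalMap-style map.
    // Returns false if either value is missing or is not an Integer.
    static boolean countsMatch(Map<String, Object> globalMap, String headerKey, String countedKey) {
        Object header = globalMap.get(headerKey);
        Object counted = globalMap.get(countedKey);
        if (!(header instanceof Integer) || !(counted instanceof Integer)) {
            return false;
        }
        // Unbox before comparing so the numeric values are compared, not the references.
        return ((Integer) header).intValue() == ((Integer) counted).intValue();
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap; both key names are assumptions.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("headerRowCount", 1500);          // hypothetical: value read from the header line
        globalMap.put("tFileRowCount_1_COUNT", 1500);   // assumed: count published by tFileRowCount
        System.out.println(countsMatch(globalMap, "headerRowCount", "tFileRowCount_1_COUNT"));
    }
}
```

Inside an actual RUN IF condition the same idea would be a single boolean expression over globalMap rather than a separate class.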

The current problem is that I can't attach the second tJava component to my second input file with a RUN IF connection.
Any suggestions or advice would be greatly appreciated. 
Thanks in advance.

Re: Checking row numbers of two files then combining them

The reason you're not able to add the second IF is that a subjob in Talend has exactly one starting point. By connecting the top IF to the tFileInput, you implicitly defined that tFileInput as the start of the input data flow (row). You'll notice that the row from that tFileInput is Main and the one at the bottom is Lookup for the tMap.
To achieve your design, you'd have to do the check on both files first - place the files into respective directories, then use them in the subjob with the tMap. The question I'd have - if you have a lot of main and lookup files, how do you coordinate the lookups between pairs of files? Or do you have only 1 lookup file? In which case you can reference the file in the directory that holds the file that passed your test above...
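
Outside of Talend, the "check both files first, then stage them for the combining subjob" idea can be sketched roughly as below. This is only an illustration under assumptions: the first line of each file is taken to hold nothing but the expected number of data rows, the header itself is not counted, and the class, method, file, and directory names are all made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

final class ValidateThenStage {

    // Assumption: line 1 is a header holding only the expected number of data rows.
    static boolean isValid(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        if (lines.isEmpty()) {
            return false;
        }
        try {
            int expected = Integer.parseInt(lines.get(0).trim());
            return expected == lines.size() - 1; // data rows = all lines minus the header
        } catch (NumberFormatException e) {
            return false; // header is not a plain number
        }
    }

    // Copies both files into stagingDir only when BOTH pass the row-count check.
    static boolean stageIfBothValid(Path mainFile, Path lookupFile, Path stagingDir) throws IOException {
        if (!isValid(mainFile) || !isValid(lookupFile)) {
            return false;
        }
        Files.createDirectories(stagingDir);
        Files.copy(mainFile, stagingDir.resolve(mainFile.getFileName()), StandardCopyOption.REPLACE_EXISTING);
        Files.copy(lookupFile, stagingDir.resolve(lookupFile.getFileName()), StandardCopyOption.REPLACE_EXISTING);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample files written to a temp directory for the demo.
        Path dir = Files.createTempDirectory("rowcheck");
        Path main = Files.write(dir.resolve("main.csv"), List.of("2", "a;1", "b;2"));
        Path lookup = Files.write(dir.resolve("lookup.csv"), List.of("1", "a;x"));
        System.out.println(stageIfBothValid(main, lookup, dir.resolve("validated")));
    }
}
```

The point of the two-step shape is that the combining subjob only ever reads from the staging directory, so it never sees a file that failed its check.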
Hope this helps...

Re: Checking row numbers of two files then combining them

Taking another stab at your issue: if you have multiple main files that you want to reconcile against a number of lookup files that share the same schemas, how about appending the contents to new files as you do the row counts, one for the main data and one for the lookup data? Then you connect to the subjob and point its inputs at the respective new files.

Re: Checking row numbers of two files then combining them

I'm having a hard time understanding your solution without a visualization. I only have two files at a time, and they come in from different directories. These two files must be combined into a single output file via the tMap.
Before I combine them, I must ensure that each input file has the number of rows stated in its header.
I don't see how I would check them both first, unless you mean I do something like this:

Then create a subjob that combines those two files in a tMap?

Re: Checking row numbers of two files then combining them

Yes, that's what I was describing... Would that work for you?

Re: Checking row numbers of two files then combining them

Absolutely. Thanks for the help!

Re: Checking row numbers of two files then combining them

I'm actually having a bit of trouble setting up the subjob correctly. I'm not sure how to use the two files from the parent job as the input files in the child job. Here is how I have my subjob set up currently. 


I have not given the input components in this subjob any file path. How do I pass the files from the parent job to the child?