Iterate through List of files and lookup corresponding file in another directory

Five Stars

I have two directories, each containing a set of files for different countries.

dir1 - files for countries

dir2 - lookup files, one per country, for each of the files in dir1

I have to loop through all the files in dir1 and do a lookup on the corresponding file in dir2.

E.g. if I am processing "*_US.csv" in dir1, then I should do a lookup on "*_US.csv" in dir2.

Please help. Thanks in advance.



All Replies
Community Manager

Are the file names the same, or fixed in such a way that picking up the first file would identify the name of the lookup file? If so, you only need to iterate over the first folder. A tFileList works using an iterator. This means that you can use it to select the first file and read it into a tMap. The lookup on the tMap can be selected using what you know about the first file. The lookup will read a new file for each iteration. This should remove the complication you describe in your question.

Five Stars

Hi @rhall_2_0,

Part of each file name is constant, and part of it changes between the source files and the lookup files.

For instance, if I am reading Customers_US.csv, then the corresponding lookup file in the lookup directory is Accounts_US.csv. For Customers_UK.csv, the lookup is on Accounts_UK.csv in the lookup directory.

Please correct my understanding of the solution you proposed, as sketched below:

tFileList --iteration-- tMap
                         |
                         |
             (lookup source metadata)

Please correct me if this is wrong. If my understanding is correct, how can the tMap select the corresponding lookup file for each file being iterated?

Thanks in advance.

Community Manager

I think you have understood me. But you need to go from the tFileList to a tFileInputDelimited to a tMap (for your main source). The tMap will also have a tFileInputDelimited connected as its lookup. Every time a new main input is started (that is, on every iteration), the tMap lookup file is read afresh as well.
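To make the per-iteration behaviour concrete, here is a rough plain-Java sketch of what such a job does conceptually. It is not Talend-generated code. The class and method names are made up for illustration, the first CSV column is assumed to be the join key, and unmatched rows are simply dropped (an inner join) to keep it short.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PerFileLookupSketch {

    // Load one lookup file into a map keyed on its first column (assumed join key).
    static Map<String, String> loadLookup(Path lookupFile) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        for (String line : Files.readAllLines(lookupFile)) {
            String[] cols = line.split(",", 2);
            if (cols.length == 2) {
                lookup.put(cols[0], cols[1]);
            }
        }
        return lookup;
    }

    // For every Customers_*.csv in mainDir, reload the matching Accounts_*.csv
    // from lookupDir and join each main row to it on the first column.
    static List<String> joinAll(Path mainDir, Path lookupDir) throws IOException {
        List<String> joined = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(mainDir, "Customers_*.csv")) {
            for (Path mainFile : files) {
                String name = mainFile.getFileName().toString();
                String lookupName = "Accounts_" + name.substring("Customers_".length());
                // The lookup is read again for each main file, like the tMap lookup per iteration.
                Map<String, String> lookup = loadLookup(lookupDir.resolve(lookupName));
                for (String line : Files.readAllLines(mainFile)) {
                    String match = lookup.get(line.split(",", 2)[0]);
                    if (match != null) {
                        joined.add(line + "," + match);
                    }
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) throws IOException {
        // Temporary directories stand in for dir1 and dir2; the rows are made-up sample data.
        Path dir1 = Files.createTempDirectory("dir1");
        Path dir2 = Files.createTempDirectory("dir2");
        Files.write(dir1.resolve("Customers_US.csv"), Arrays.asList("1,Alice", "2,Bob"));
        Files.write(dir2.resolve("Accounts_US.csv"), Arrays.asList("1,ACC-100"));
        for (String row : joinAll(dir1, dir2)) {
            System.out.println(row);
        }
    }
}
```

The point of the sketch is only that the lookup file is chosen and loaded inside the loop over the main files, so it changes with every iteration.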

Five Stars

@rhall_2_0,

Do you mean the tFileInputDelimited for the lookup will switch file names by itself according to the incoming source file?

The file masks for the lookup files and the input files are different here. Input files have "Customers_*" and lookup files have "Accounts_*". How will the file used by the lookup switch?

In the meantime, I will try what you suggested. But please tell me what I should set as the file path in the tFileInputDelimited connected as the lookup.

Five Stars

@rhall_2_0,

I was trying to mock up the scenario, but that didn't help. Please refer to the actual job snapshot.

I got "java.lang.Exception: The data source should be specified as Inputstream or File Path!" with the approach shown in the snapshot.

For the lookup source I am using the same file path variable from tfilelist2 that is used for the source files, which is obviously going to fail. What should the "Filename/Stream" property be for the lookup source?

Please excuse me if I am not making sense, as I am new to Talend.

talend_job.png

Community Manager

Sorry, I presumed you understood that you would need to calculate the lookup file name from what you know about the main file name. This could be done in a tJava just after the tFileList component and before the Main tFileInputDelimited. Since you know the name of the main file, you can figure out the name of the lookup file you will need. The example files you gave made this look really quite simple. You will need to use some basic Java String manipulation (which will take place in the tJava) and then save that value to the globalMap. Then in the lookup tFileInputDelimited you would just use the globalMap value you have set in the tJava.
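A minimal sketch of that tJava step, wrapped in a small class so it can be run outside Talend. It assumes the list component is named tFileList_1 (so the current file name sits under the key "tFileList_1_CURRENT_FILE"), that the lookup directory is /data/dir2/, and that "lookupFilePath" is the key chosen for the result. The directory and the key name are hypothetical; only the Customers_/Accounts_ naming comes from the examples in this thread.

```java
import java.util.HashMap;
import java.util.Map;

class LookupNameSketch {

    // Swap the constant prefix and keep the country suffix,
    // e.g. Customers_US.csv becomes Accounts_US.csv.
    static String lookupFileFor(String mainFileName) {
        String mainPrefix = "Customers_";
        String lookupPrefix = "Accounts_";
        if (!mainFileName.startsWith(mainPrefix)) {
            throw new IllegalArgumentException("Unexpected main file name: " + mainFileName);
        }
        return lookupPrefix + mainFileName.substring(mainPrefix.length());
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap, pre-filled the way tFileList would fill it.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileList_1_CURRENT_FILE", "Customers_US.csv");

        // Begin: roughly what would go in the tJava component.
        String mainFile = (String) globalMap.get("tFileList_1_CURRENT_FILE");
        String lookupDir = "/data/dir2/"; // hypothetical lookup directory
        globalMap.put("lookupFilePath", lookupDir + lookupFileFor(mainFile));
        // End of the tJava part.

        System.out.println(globalMap.get("lookupFilePath")); // prints /data/dir2/Accounts_US.csv
    }
}
```

In the job itself, the "File name/Stream" field of the lookup tFileInputDelimited would then hold an expression such as ((String)globalMap.get("lookupFilePath")), again assuming that key name.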

Five Stars

Thanks @rhall_2_0,

I was able to get it working with the approach you mentioned. The only thing I struggled with was the link from the tJava to the tFileInputDelimited for the source file. I was connecting them with an "iterate" link. After hours of searching, I found that it should not be an iterate link but a 'trigger' link.

Thanks a ton.

Community Manager

You should be able to use the iterate link from the tJava. In this situation I don't think there will be much difference between the RunIf and the iterate. I assume you have set the RunIf link to have an expression of just true.
