Five Stars

Filename validation in talend

Hi,

 

I need to validate the names of the files being processed. For example, the requirement is that file names should be:

Main file - YYYYMMDD_HHMMSS_Field1_Field2.txt
Meta file - YYYYMMDD_HHMMSS_Field1_Field2_META.txt

Based on the file category I want two flows: one that processes the main file and one that processes the meta file, each loading into a separate table.

I am new to Talend, so a detailed explanation would be much appreciated.

 

Thanks,

Pravin Sanadi

  • Data Integration
3 REPLIES
Seven Stars

Re: Filename validation in talend

Do you need to process only files that match an already known pattern? Or do you need to check that the file name contains nothing other than the pattern?

 

Example of a file name pattern:

 

 

TalendDate.getDate("yyyyMMdd_HHmmss")+"_"+((String)globalMap.get("value1"))+"_"+((String)globalMap.get("value2"))+".txt"
TalendDate.getDate("yyyyMMdd_HHmmss")+"_"+((String)globalMap.get("value1"))+"_"+((String)globalMap.get("value2"))+"_META.txt"

 

 

But this pattern means you must know the file name down to the second. A wildcard for the time part avoids that:

TalendDate.getDate("yyyyMMdd_")+"*_"+((String)globalMap.get("value1"))+"_"+((String)globalMap.get("value2"))+".txt"
TalendDate.getDate("yyyyMMdd_")+"*_"+((String)globalMap.get("value1"))+"_"+((String)globalMap.get("value2"))+"_META.txt"

Used as the tFileList file mask in this chain:

tFileList ->(Iterate) -> (Your Steps)

this pattern processes all files from today,

and so on.
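To see what such a wildcard mask evaluates to, here is a minimal standalone Java sketch. It uses java.time instead of TalendDate so it can run outside a job; the class name FileMaskSketch and the method mask are only illustrative, and the field values stand in for whatever is stored in globalMap. Note that Java date patterns want lowercase yyyy and dd (uppercase YYYY is the week-based year and DD is the day of the year).

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Talend: builds the file mask for one day.
final class FileMaskSketch {
    // Lowercase yyyy/dd: calendar year and day of month.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    // The "*" stands in for the HHMMSS part, so any time of that day matches.
    static String mask(LocalDate day, String field1, String field2, boolean meta) {
        return day.format(DAY) + "_*_" + field1 + "_" + field2
                + (meta ? "_META" : "") + ".txt";
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        System.out.println(mask(today, "Field1", "Field2", false));
        System.out.println(mask(today, "Field1", "Field2", true));
    }
}
```

The resulting string is what would go into the Filemask of the tFileList.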

-----------
Five Stars

Re: Filename validation in talend

Hi,

 

Thanks for the quick response. I need to validate all the files that have the specified format. How do I route two flows, one for the main file and one for the meta file, to load them into separate tables?

 

Thanks,

Pravin 

Seven Stars TRF

Re: Filename validation in talend

Hi,

I suggest 2 options.

The 1st one: place a tFileList to get all the txt files, and connect the tFileList to a tJava with an Iterate link.

This tJava does nothing but open 2 branches depending on the current filename.

Here I connect a tJava just to print the filename, preceded by "META > " if the filename contains the string "_META.txt" (your use case).

Here is what the job looks like:

Capture.PNG

Look at the 2 "if" after tJava_4. The 1st one (on top) is for filenames with "_META.txt":

((String)globalMap.get("tFileList_2_CURRENT_FILE")).contains("_META.txt")

and the 2nd for other cases:

!((String)globalMap.get("tFileList_2_CURRENT_FILE")).contains("_META.txt")

(note the exclamation point at the beginning for negation).
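The two conditions above only separate META from non-META names. If the full name format also has to be validated, a stricter check could look like the sketch below. It is only a sketch under assumptions: the class FileNameCheck and its classify method are made-up names (in a job this logic could live in a custom routine), and Field1 and Field2 are assumed to be non-empty with no underscore or dot in them. The date and time part is parsed strictly, so something like 20240230 is rejected.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.regex.Pattern;

// Hypothetical helper: classifies a file name as MAIN, META or INVALID.
final class FileNameCheck {
    enum Kind { MAIN, META, INVALID }

    // Assumption: Field1 and Field2 contain no "_" and no ".".
    private static final String PREFIX = "\\d{8}_\\d{6}_[^_.]+_[^_.]+";
    private static final Pattern MAIN = Pattern.compile(PREFIX + "\\.txt");
    private static final Pattern META = Pattern.compile(PREFIX + "_META\\.txt");

    // STRICT rejects impossible dates and times instead of adjusting them.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("uuuuMMdd_HHmmss")
                    .withResolverStyle(ResolverStyle.STRICT);

    static Kind classify(String fileName) {
        if (fileName == null) return Kind.INVALID;
        Kind kind;
        if (META.matcher(fileName).matches()) kind = Kind.META;
        else if (MAIN.matcher(fileName).matches()) kind = Kind.MAIN;
        else return Kind.INVALID;
        try {
            // First 15 characters are YYYYMMDD_HHMMSS (guaranteed by the regex).
            LocalDateTime.parse(fileName.substring(0, 15), STAMP);
        } catch (DateTimeParseException e) {
            return Kind.INVALID;
        }
        return kind;
    }

    public static void main(String[] args) {
        System.out.println(classify("20240131_235959_A_B.txt"));
        System.out.println(classify("20240131_235959_A_B_META.txt"));
        System.out.println(classify("report.txt"));
    }
}
```

With something like this, the two "if" conditions could compare the result for the current file with MAIN and META, and a third branch could handle INVALID names.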

 

You just have to play with the Filemask and Order by options of the tFileList to control how the files are processed.

 

The 2nd approach is to process one group of files first ("_META.txt" for example), then move on to the second group (non _META). In this case you need 2 separate tFileList components with the corresponding filemasks.

The 1st one includes all txt files with "_META.txt" in the name and the 2nd excludes these files (see Advanced settings for Exclude Filemask).

Here is what the job looks like:

Capture.PNG

As you can see, I have 2 separate subjobs linked with an OnSubjobOk trigger for orchestration.

1st tFileList is here (target, "*_META.txt" files only) :

Capture.PNG

2nd tFileList is here (target, "*.txt" files except "*_META.txt"):

Capture.PNG

And the Advanced settings for this component (for "*_META.txt" exclusion):

Capture.PNG
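To make the include/exclude idea concrete outside of Talend, here is a small plain Java sketch that mimics the two masks with the standard glob matcher. It is not how tFileList is implemented; MaskSplitSketch and select are made-up names, and the file names are invented examples.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: keeps names matching includeGlob but not excludeGlob.
final class MaskSplitSketch {
    static List<String> select(List<String> names, String includeGlob, String excludeGlob) {
        PathMatcher include = FileSystems.getDefault().getPathMatcher("glob:" + includeGlob);
        // A null excludeGlob means "exclude nothing".
        PathMatcher exclude = excludeGlob == null ? null
                : FileSystems.getDefault().getPathMatcher("glob:" + excludeGlob);
        List<String> selected = new ArrayList<>();
        for (String name : names) {
            Path p = Paths.get(name);
            if (include.matches(p) && (exclude == null || !exclude.matches(p))) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "20240131_235959_A_B.txt",
                "20240131_235959_A_B_META.txt",
                "notes.csv");
        // 1st group: META files only.
        System.out.println(select(names, "*_META.txt", null));
        // 2nd group: all txt files except the META ones.
        System.out.println(select(names, "*.txt", "*_META.txt"));
    }
}
```

The first call plays the role of the 1st tFileList, the second call the role of the 2nd one with its Exclude Filemask.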

You got it?


TRF