Five Stars

Copy bad input files aside and move on to next file

I have a Talend workflow that takes in multiple CSV files and ingests them into a database. Each file must conform to a given format. On occasion, some input files will have an incorrect or non-conformant format, which we could call bad or corrupt input. I have therefore designed my workflow so that if a file fails the format test at the tFileInputDelimited component, it should be copied to an error_directory. I am using the "On Component Error" trigger to branch out to the part of the workflow that handles bad input (i.e. downwards along the red rectangular block). The trouble with this design is that although the input file fails parsing on the tFileInputDelimited component, this doesn't actually cause the component to error, so the branch highlighted in the red rectangle below is not executed for bad input. Instead, execution carries straight on to the components to the right of the tFileInputDelimited, along the green rectangle path in the image below.

As an example of the desired outcome, say I have 3 files in my input folder, ordered as 1 good, 1 bad and 1 good. I would want the workflow to execute straight through along the green rectangle to the right for file 1, then branch out downwards along the red rectangle for file 2, and lastly execute straight through along the green rectangle for file 3. Can anyone suggest how I might achieve this? Thanks.

(Attached image: job.png)

 

5 REPLIES
Ten Stars

Re: Copy bad input files aside and move on to next file

Do you have "Die on error" checked on your tFileInputDelimited?
Five Stars

Re: Copy bad input files aside and move on to next file

Thanks cterenzi. Yes, I have previously tried checking "Die on error" on tFileInputDelimited. The problem with checking "Die on error" is that, in the given scenario of files ordered as 1 good, followed by 1 bad, followed by 1 good, the process does everything I want except that once it dies on file 2 (the bad one), the whole workflow dies and stops, so file 3 is never processed. The only workaround is to restart the workflow manually. This isn't feasible when there are thousands of files to process with, say, a few dozen bad inputs scattered throughout.

Ten Stars

Re: Copy bad input files aside and move on to next file

I would recommend two loops through the file list: a first pass to determine which files are bad and move them out, then a second pass to loop through the remaining files and process them normally.
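
As a rough illustration of the first pass, here is a minimal plain-Java sketch (not Talend-generated code). It assumes the format test can be reduced to "every non-empty line has the expected number of delimited fields", which is only a stand-in for whatever your real format rule is. The class name, method names, and the "*.csv" filter are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

class TwoPassSketch {

    // Assumed format test: every non-empty line must split into exactly
    // expectedColumns fields. Naive split; quoted fields are not handled.
    static boolean looksValid(Path file, String delimiter, int expectedColumns) {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue;
                }
                int columns = line.split(Pattern.quote(delimiter), -1).length;
                if (columns != expectedColumns) {
                    return false;
                }
            }
            return true;
        } catch (IOException e) {
            // An unreadable or badly encoded file is treated as bad input.
            return false;
        }
    }

    // Pass 1: move every bad file into errorDir and return the good ones,
    // sorted by path, so that pass 2 can process them normally.
    static List<Path> quarantineBadFiles(Path inputDir, Path errorDir,
                                         String delimiter, int expectedColumns)
            throws IOException {
        Files.createDirectories(errorDir);

        // Collect the file list first so that moving files does not
        // interfere with the directory iteration.
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir, "*.csv")) {
            for (Path file : stream) {
                candidates.add(file);
            }
        }
        Collections.sort(candidates);

        List<Path> good = new ArrayList<>();
        for (Path file : candidates) {
            if (looksValid(file, delimiter, expectedColumns)) {
                good.add(file);
            } else {
                // One bad file never stops the loop; it is just set aside.
                Files.move(file, errorDir.resolve(file.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return good;
    }
}
```

Because the bad files are gone before the second pass starts, the main ingestion loop only ever sees files that passed the check, so it should not need "Die on error" to detect them.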

Five Stars

Re: Copy bad input files aside and move on to next file

If I understood you correctly, I would still fall into the same trap whereby after moving the first bad file, the whole process dies before moving on to check if the next file is good or bad - unless I misread your suggestion...?

Ten Stars

Re: Copy bad input files aside and move on to next file

Your check would need to be more robust than simply pulling in data from an input component and letting it fail. Without knowing the specifics, I would try to read in a full row of data and test it for validity with a tJavaRow component. You can set variables to control further execution (i.e. whether to move the file to a "bad" directory).
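
To make that a bit more concrete, here is a minimal plain-Java sketch of the kind of row test you might put behind a tJavaRow. The rules shown (a fixed column count and an integer in the first column) are assumptions for illustration only, and the names RowCheckSketch, isValidRow, flagFile and "isBadFile" are hypothetical. The Map parameter stands in for a variable store such as Talend's globalMap.

```java
import java.util.Map;
import java.util.regex.Pattern;

class RowCheckSketch {

    // Returns true only if the raw line satisfies the assumed rules.
    // Naive split; delimiters inside quoted fields are not handled.
    static boolean isValidRow(String line, String delimiter, int expectedColumns) {
        if (line == null) {
            return false;
        }
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        if (fields.length != expectedColumns) {
            return false;
        }
        // Assumed rule: the first column is a numeric id.
        try {
            Integer.parseInt(fields[0].trim());
        } catch (NumberFormatException e) {
            return false;
        }
        return true;
    }

    // Records the verdict under a hypothetical key so that later steps
    // can decide whether to ingest the file or move it to a "bad" directory.
    static void flagFile(Map<String, Object> vars, String sampleRow,
                         String delimiter, int expectedColumns) {
        vars.put("isBadFile", !isValidRow(sampleRow, delimiter, expectedColumns));
    }
}
```

A later step could then read the flag, for example in a "Run if" condition, and route the file either to the normal ingestion path or to the error directory without the job dying.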