TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

One Star

Hi All.

I have the following job.

The tFileList iterates recursively over a directory for *.xml files.
My problem is that the files found can be invalid XML. Currently, I'm using a tJava component to do a very basic check of whether the file being handled is well-formed XML. The problem is that this check is too basic. I would therefore like to rely on the SAX parser integrated in the tFileInputXML component, but that component kills the job when it hits a malformed file.

This is not usable in my situation, as I have about 10000 (yes, 10k) files to parse each day. With the current behavior, if the first file is invalid the job won't touch any files and hence won't create any records in the database.

What this post comes down to is the following question:

Is there any way to stop a job from dying when a file passed to tFileInputXML is not well-formed XML?

Kind Regards.
Ken

Edit: Fixed some typos
One Star

Re: TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

Addition: I tried tLogCatcher, but I only managed to generate logs, not to suppress the die command.
One Star

Re: TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

Hi Ken,
just a spontaneous idea: could tDTDValidator help you?
One Star

Re: TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

Hi Volker,

Thank you for your response. I just checked the tDTDValidator documentation. Its purpose is to validate an XML file against a given DTD. This is a validity check, not a well-formedness check: the syntax itself is not validated. The tDTDValidator component and the tXSDValidator component both expect the XML document being validated to be well-formed already.

Any other suggestions?

Kind Regards,
Ken
One Star

Re: TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

Hi again,

After further research and looking into different components, I would suggest adding a "Die on Error" property to tFileList. I will add a request to the tracker soon if that is OK.

Kind regards,
Ken
Employee

Re: TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

In my mind, adding a die on error option to tFileList is treacherous.
Adding such a parameter would catch any exception in the sub job (the job in the iterate).
If you are writing to a db and if an error occurs, it would override the die on error option of the db component.

In your special case, the solution is to write a routine validating your XML file rather than using a parameter already defined in another component.
In your routine you can manage your errors like you want without any problem.
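
A minimal sketch of such a routine, assuming the standard JAXP SAX parser that ships with the JDK. The class name `XmlCheck` and the method name `isWellFormed` are made up for illustration and are not part of Talend. The method parses the file without any DTD or schema validation and reports a malformed file as `false` instead of throwing, so the caller decides what to do with it.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical routine class; the name is an assumption, not a Talend API.
public class XmlCheck {

    // Returns true if the file parses as well-formed XML, false otherwise.
    public static boolean isWellFormed(String path) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);    // well-formedness only, no DTD validation
            factory.setNamespaceAware(true);
            SAXParser parser = factory.newSAXParser();
            // DefaultHandler ignores all content events; a syntax error
            // surfaces as a SAXException (SAXParseException) from parse().
            parser.parse(new File(path), new DefaultHandler());
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Malformed or unreadable file: report it instead of killing the caller.
            return false;
        }
    }
}
```

Called from the existing tJava step with the path of the file currently being iterated, the result could be used to skip the tFileInputXML branch for bad files. Note that a non-validating parser may still try to load an external DTD named in a DOCTYPE declaration, so a file with an unreachable DTD could be reported as `false` by this sketch.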

Regards,
One Star

Re: TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

hi,

Thank you for your insight. In the meantime, since my last post, I managed to dive a little into the source code. From what I found there, I think I will not propose the above-mentioned property.

So, I will stick to my existing solution.

Kind regards,
Ken
One Star

Re: TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

hi again.

Would a "die on error" property on tFileInputXML create the same problem as adding the property to tFileList?

Kind Regards
Ken
One Star

Re: TOS 2.1 / 2.2 RC1: Is it possible not to die if a xml file is invalid?

And another one: is there a way not to kill a job that starts with a tFileInputCSV when that component encounters an error? I'm getting a java.lang.NumberFormatException because one row contains a double where the column normally holds an integer.
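
For reference, the exception itself is easy to reproduce and to contain in plain Java. The sketch below is only an illustration of the failure and of one possible workaround, not something confirmed in this thread: it assumes the column could be read as a string and converted afterwards. `SafeParse` and `parseIntOrNull` are hypothetical names.

```java
// Hypothetical helper; the names are assumptions, not a Talend API.
public class SafeParse {

    // Converts a raw field to an Integer, or returns null when the text
    // is missing or not a valid integer (for example "3.14").
    public static Integer parseIntOrNull(String raw) {
        if (raw == null) {
            return null;
        }
        try {
            return Integer.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            // The same exception that kills the job, caught per field instead.
            return null;
        }
    }
}
```

A row whose field comes back as `null` could then be routed to a reject output instead of stopping the whole job.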

Kind regards