Get nodes (not just strings) in tFileInputMSXML

Is there a way in the tFileInputMSXML component to retrieve nodes, i.e. whole subtrees, in a column of type Document?
I have a large XML file that is not only multi-schema but also nested 4-5 levels deep. I got tFileInputMSXML to loop at every required nesting level and parse all the schemata I want. However, I need to parse 100-200 schemata in total, so handling them all in a single tFileInputMSXML component does not seem like good job design.
Therefore, I would like to parse the upper nodes into their corresponding schemata while preserving their child nodes as Documents. This way, I can defer processing of the child nodes to subsequent components such as tExtractXMLField, and keep complexity manageable by not parsing all 100-200 schemata at once inside a single tFileInputMSXML component.
What can I do?
What I have tried so far:
1. The tutorial selects a node with multiple children, but only outputs them as a string (see my comment at the bottom of the page).
2. The (non-multischema) tFileInputXML component has a "Get nodes" option for this case, but the tFileInputMSXML component does not.
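
To make the desired behavior concrete, here is a minimal sketch in plain Java of what "keep the child nodes as a subtree" means. It uses only standard JDK XML APIs, not anything from tFileInputMSXML itself. The class name `SubtreeExtractor`, the method `extractSubtrees`, and the sample XML are made up for illustration. Each node matched by the loop XPath is serialized together with all of its descendants, so a later step (for example tExtractXMLField) could parse the nested schemata separately.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Talend: illustrates subtree extraction only.
public class SubtreeExtractor {

    /** Returns every node matched by loopXPath, serialized with its whole subtree. */
    public static List<String> extractSubtrees(String xml, String loopXPath) throws Exception {
        // Parse the input into a DOM tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Select the nodes to loop over, as a loop XPath would.
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(loopXPath, doc, XPathConstants.NODESET);

        // Identity transformer: writes a node and its descendants back out as XML text.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> subtrees = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            subtrees.add(out.toString());
        }
        return subtrees;
    }

    public static void main(String[] args) throws Exception {
        // Sample data invented for this sketch.
        String xml = "<root><order id=\"1\"><item><sku>A</sku></item></order>"
                   + "<order id=\"2\"><item><sku>B</sku></item></order></root>";
        for (String subtree : extractSubtrees(xml, "/root/order")) {
            System.out.println(subtree);
        }
    }
}
```

Something along these lines might be usable from a custom routine or a Java component as a workaround, but that is an assumption on my part; what I am really looking for is a built-in equivalent of "Get nodes" in tFileInputMSXML.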

Re: Get nodes (not just strings) in tFileInputMSXML

What is the structure of your document? Is it possible to post it on the forum?
Best regards
