
Get nodes (not just strings) in tFileInputMSXML

Is there a way in the tFileInputMSXML component to retrieve nodes, i.e. whole subtrees, in a column of type Document?
I have a large XML file that is not only multi-schema but also nested 4-5 levels deep. I got tFileInputMSXML to loop at every required nesting level and parse all the schemata I want. However, I need to parse 100-200 schemata in total, so handling them all in a single tFileInputMSXML component does not seem like good job design.
Therefore, I would like to parse the upper nodes into their corresponding schemata while preserving their child nodes as Documents. That way, I can defer the processing of the child nodes to subsequent components such as tExtractXMLField, and keep the complexity manageable by not having to parse all 100-200 schemata at once inside a single tFileInputMSXML component.
What can I do?
What I have tried so far:
1. The tutorial (https://help.talend.com/search/all?query=tFileInputMSXML&content-lang=en) selects a node with multiple children, but only outputs them as a string (see my comment at the bottom of the page).
2. The (non-multischema) tFileInputXML component has a "Get nodes" option for this case, but the tFileInputMSXML component does not.
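To illustrate what I mean by keeping child nodes as whole subtrees, here is a minimal sketch in plain Java using only the JDK's XML classes. It is not the tFileInputMSXML API and not Talend-generated code; the class name SubtreeExtractor, the method extractSubtrees and the sample XML are made up for illustration. It selects nodes with a loop XPath and serializes each one, including all of its children, back to an XML string, which is roughly the output I would like to get into a Document column.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Talend: returns each node matched by
// loopXPath as a serialized XML subtree instead of a flattened string value.
class SubtreeExtractor {

    static List<String> extractSubtrees(String xml, String loopXPath) throws Exception {
        // Parse the whole input into a DOM tree.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Select the nodes to loop over.
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(loopXPath, doc, XPathConstants.NODESET);

        // An identity transformer serializes a DOM node together with its children.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> subtrees = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            subtrees.add(out.toString());
        }
        return subtrees;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: two sibling records with different child schemata.
        String xml = "<root><a id=\"1\"><b>x</b></a><a id=\"2\"><c>y</c></a></root>";
        for (String subtree : extractSubtrees(xml, "/root/a")) {
            System.out.println(subtree);
        }
    }
}
```

Each returned string still contains the complete child structure, so it could in principle be handed to a later step for schema-specific parsing. Note that this sketch loads the entire file into memory, which may not be acceptable for a very large input.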
1 REPLY
Moderator

Re: Get nodes (not just strings) in tFileInputMSXML

Hi,
What is the structure of your document? Would it be possible to post it on the forum?
Best regards
Sabrina