Merge nodes from two source XML documents into one XML document

I am currently trying to merge data from two source XML documents with the same schema into a single XML document.  The output XML document must adhere to the same schema as the two source files.
My example case for this job is simple so as to ease debugging.  Each source file contains a flatorders element, which contains one or more order elements.  See the following as an example:

<?xml version="1.0" encoding="ISO-8859-15"?>
<flatorders>
  <order orderid="889923">
    <orderperson/>
    <shipToName>Ola Nordmann</shipToName>
    <shipToAddress>Langgt 23</shipToAddress>
    <shipToCity>4000 Stavanger</shipToCity>
    <shipToCountry>Norway</shipToCountry>
  </order>
</flatorders>

The idea is to take the order nodes from each of the source files and append them into a single output file.  The output file should have one flatorders node, which would contain all of the order nodes from both files.
Because I'd eventually like to use this approach to merge much more complex documents, I don't want the job to need to know about the contents of an order, nor do I want to map individual order fields.  I simply want it to copy each order element in its entirety to the output file.  For that reason, I decided to use the Document datatype to hold orders en route.
When I built and ran the job, it almost did what I wanted.  However, Talend appears to have added the name of the Document object ("order") as an enclosing element.  See the following example, and note the extra order element wrapping each order.

<?xml version="1.0" encoding="UTF-8"?>
<flatorders>
  <order>
    <order orderid="889923">
      <shipToName>Ola Nordmann</shipToName>
      <shipToAddress>Langgt 23</shipToAddress>
      <shipToCity>4000 Stavanger</shipToCity>
      <shipToCountry>Norway</shipToCountry>
    </order>
  </order>
  <order>
    <order orderid="889924">
      <shipToName>Snoid Florgsbottom</shipToName>
      <shipToAddress>Thatone St</shipToAddress>
      <shipToCity>City of Dreams</shipToCity>
      <shipToCountry>The Wilderness</shipToCountry>
    </order>
  </order>
</flatorders>

I suspect that when I create a Document object, Talend treats the name of the object itself as an element.  Therefore, when it appends the order to the final document, it wraps the contents of the Document in another order element.
Is there a way to prevent this without needing to use tXMLMap to manually parse out each field?  If I were to use more complex datasets, manually mapping each field to the output field using tXMLMap would be tedious. 
As a reference, here are a few screenshots of my job.
Here are the input file settings and components for my job.  I've set the XPath query to "/flatorders/order".


Here is the input file schema.  It contains a single Document type called "order."


And finally, here are the settings for the tAdvancedFileOutputXML component.

1 REPLY
Community Manager

Re: Merge nodes from two source XML documents into one XML document

Hi  
You have to define a loop element on tAdvancedFileOutputXML, so at the moment I don't see a direct way to merge the nodes without one. You would need to extract the fields from the source XML files and write them out to a new XML file.
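For comparison, copying whole order elements without mapping their fields is straightforward with the standard Java DOM API, outside of tAdvancedFileOutputXML. The following is only a minimal sketch under stated assumptions: the class name OrderMerger and its merge method are hypothetical and not part of Talend, the sources are passed in as XML strings, and the order elements are assumed to be direct children of flatorders as in the example above. Whether it fits into a given job (for example through a custom routine) is something you would need to check yourself.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not a Talend class.
final class OrderMerger {

    // Copies every <order> child of each source <flatorders> document
    // into one new <flatorders> document and returns it as a string.
    static String merge(String... sources) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document out = builder.newDocument();
            Element root = out.createElement("flatorders");
            out.appendChild(root);

            for (String xml : sources) {
                Document in = builder.parse(new InputSource(new StringReader(xml)));
                NodeList children = in.getDocumentElement().getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    Node child = children.item(i);
                    boolean isOrder = child.getNodeType() == Node.ELEMENT_NODE
                            && "order".equals(child.getNodeName());
                    if (isOrder) {
                        // Deep copy: attributes and all descendants come along,
                        // so the code never looks inside an order.
                        root.appendChild(out.importNode(child, true));
                    }
                }
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(out), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            // Wrapped so callers do not have to handle checked exceptions.
            throw new RuntimeException("Could not merge order documents", e);
        }
    }

    public static void main(String[] args) {
        String first = "<flatorders><order orderid=\"889923\">"
                + "<shipToName>Ola Nordmann</shipToName></order></flatorders>";
        String second = "<flatorders><order orderid=\"889924\">"
                + "<shipToName>Snoid Florgsbottom</shipToName></order></flatorders>";
        System.out.println(merge(first, second));
    }
}
```

Because importNode is called with deep set to true, each order is copied as a whole subtree and attached directly under flatorders, so no extra wrapping element is introduced and nothing has to change when the order structure becomes more complex.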
Regards
Shong