Prepare XML with multiple loops on the same level


The idea behind this article has been described several times (for example by @rhall_2_0), but the question still comes up, so I decided to prepare an example.

 

With standard Talend components such as tXMLMap and/or tMSXMLOutput, it is not possible to create XML with several loops on the same level, for example:

<?xml version="1.0" encoding="UTF-8"?>
<customer>
    <first_name></first_name>
    <last_name></last_name>
    <emails>
        <email> <!-- This is a loop: multiple emails per customer -->
            <email></email>
            <type></type>
        </email>
    </emails>
    <phones>
        <phone> <!-- This is a loop: multiple phones per customer -->
            <phone></phone>
            <type></type>
        </phone>
    </phones>
    <addresses>
        <address> <!-- This is a loop: multiple addresses per customer -->
            <address_line_1></address_line_1>
            <address_line_2></address_line_2>
            <address_line_3></address_line_3>
            <postcode></postcode>
            <country></country>
        </address>
    </addresses>
</customer>

A real-life case: a customer with 2 addresses, 2 phones and 2 emails, where the requirements oblige us to produce exactly this structure.

The standard tXMLMap returns a wrong result, and tMSXMLOutput cannot be used in this case either.

DataMapper:

  • its documentation is almost empty, possibly the shortest and least useful in its class
  • it is included only with the most expensive Talend license

A solution that works in any version of Talend Studio (including Open Source):

We split the job into separate flows: one flow for the main XML and one for each loop:

1.png

 

In tXMLMap_1, assign PLACEHOLDER_XXXX as the element value:

2.png 

The output XML will contain:

<emails>PLACEHOLDER_EMAIL</emails>
<phones>PLACEHOLDER_PHONES</phones>
<addresses>PLACEHOLDER_ADDRESS</addresses>

We will use these placeholders in later steps, where they are replaced with the real content.

 

Each of the separate flows contains a tXMLMap that generates the target structure for its loop:

3.png

 

Each output XML part must be converted from Document type to String type and cleaned of unnecessary information (tReplace_1/2/3):

4.png
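
As a rough illustration of the cleaning step, here is a minimal standalone Java sketch. It assumes that the "unnecessary information" is the XML declaration carried by each serialized fragment; the class and method names (FragmentCleaner, stripXmlDeclaration) are hypothetical and not part of the attached jobs, where the same effect is configured in tReplace.

```java
class FragmentCleaner {

    // Hypothetical helper: removes a leading XML declaration (<?xml ... ?>)
    // and surrounding whitespace from a fragment that is already a String.
    public static String stripXmlDeclaration(String xml) {
        if (xml == null) {
            return null; // no rows for this customer, nothing to clean
        }
        return xml.replaceFirst("^\\s*<\\?xml[^>]*\\?>\\s*", "").trim();
    }

    public static void main(String[] args) {
        String fragment = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<emails><email><email>email3@test.com</email>"
                + "<type>home</type></email></emails>";
        // Prints the fragment without the declaration line
        System.out.println(stripXmlDeclaration(fragment));
    }
}
```

Only the declaration is removed; the root element of the fragment (here <emails>) stays, because it later takes the place of the whole placeholder element.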

 

Finally, all flows are joined as strings in a tMap (tMap_1); in our case we join by customer_id:

 

5.png

 

In the middle part of the tMap, we replace each placeholder with the relevant value:

row9.email == null
    ? row8.customer.replace("<emails>PLACEHOLDER_EMAIL</emails>", "")
    : row8.customer.replace("<emails>PLACEHOLDER_EMAIL</emails>", row9.email)
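
The same replacement logic can be sketched outside Talend as plain Java, which makes it easy to try on sample strings. This is only an illustration: PlaceholderFiller and fill are hypothetical names, and the map stands in for the joined lookup rows. It uses the literal String.replace, so characters such as $ or \ in the data are not interpreted as regex syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PlaceholderFiller {

    // Replaces every placeholder element with its fragment.
    // A null fragment (no lookup row found) removes the placeholder element entirely.
    public static String fill(String mainXml, Map<String, String> fragments) {
        String result = mainXml;
        for (Map.Entry<String, String> entry : fragments.entrySet()) {
            String fragment = entry.getValue() == null ? "" : entry.getValue();
            result = result.replace(entry.getKey(), fragment);
        }
        return result;
    }

    public static void main(String[] args) {
        String mainXml = "<customer><first_name>Joe</first_name>"
                + "<emails>PLACEHOLDER_EMAIL</emails>"
                + "<phones>PLACEHOLDER_PHONES</phones></customer>";

        Map<String, String> fragments = new LinkedHashMap<>();
        fragments.put("<emails>PLACEHOLDER_EMAIL</emails>",
                "<emails><email><email>email3@test.com</email><type>home</type></email></emails>");
        fragments.put("<phones>PLACEHOLDER_PHONES</phones>", null); // customer without phones

        System.out.println(fill(mainXml, fragments));
    }
}
```

In the job itself, each entry of the map corresponds to one lookup flow joined in tMap_1, and the chained expression plays the role of the loop.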

The final XML string can be inserted into a database, sent to an API or stored in a file. It is important not to convert it back to a Document, because in that case Talend will fill the document with escaped entities (&lt;, &amp;, etc.).

If we need to use this document as XML (Document type) in later Talend steps, the best choice is to store it in a text file and then open the same file as XML.
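
Because the final XML is assembled by string replacement, nothing in the job itself guarantees that the result is well-formed. A possible sanity check, sketched here with the standard JDK parser, is to parse the string once and discard the result; the string itself stays untouched. The class and method names (WellFormedCheck, isWellFormed) are hypothetical, and where such a check would sit in a job (for example in a routine) is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    // Returns true when the string parses as XML; the parsed tree is thrown away.
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String ok = "<customer><emails><email><type>home</type></email></emails></customer>";
        String broken = "<customer><emails><email></emails></customer>";
        System.out.println(isWellFormed(ok));
        System.out.println(isWellFormed(broken));
    }
}
```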

 

The final, properly structured XML:

<?xml version="1.0" encoding="UTF-8"?>
<customer>
    <first_name>Joe</first_name>
    <last_name>Dow</last_name>
    <emails>
        <email>
            <email>email3@test.com</email>
            <type>home</type>
        </email>
        <email>
            <email>email4@test.com</email>
            <type>work</type>
        </email>
    </emails>
    <phones>
        <phone>
            <phone>0212579971</phone>
            <type>home</type>
        </phone>
        <phone>
            <phone>0212579972</phone>
            <type>work</type>
        </phone>
    </phones>
    <addresses>
        <address>
            <address_line_1>22, Queen Street</address_line_1>
            <address_line_2>Auckland</address_line_2>
            <address_line_3>CBD</address_line_3>
            <postcode>0610</postcode>
            <country>NZ</country>
        </address>
        <address>
            <address_line_1>23, Queen Street</address_line_1>
            <address_line_2>Auckland</address_line_2>
            <postcode>0611</postcode>
            <country>NZ</country>
        </address>
    </addresses>
</customer>

Conclusion:

This solution does not cover all possible cases, but it adds some functionality to standard Talend Studio and gives ideas on how similar cases can be resolved.

 

Files attached:

  • Talend demo jobs
  • MySQL DDL for test data

 

 

 

-----------

Re: prepare XML with multiple loop in same level

Nicely explained @vapukov