Fetching an XML file without knowing the node names


Hi all,

 

I have to read an XML file like the one below, where I don't know the node names in advance. How can I do that?

I've searched all around, but without success.

 

<?xml version="1.0" encoding="UTF-8"?>
<root>
<Capitals>
   <France>Paris</France>
   <UK>London</UK>
   <Russia>Moscow</Russia>
   <Portugal>Lisboa</Portugal>
   <China>Beijing</China>
</Capitals>
</root>

 

The final result should be two columns:

France - Paris

Russia - Moscow

etc...

 

Any ideas?

I'd appreciate any suggestions.


Accepted Solutions
Re: Fetching an XML file without knowing the node names

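This can be done with XPath, using the local-name() function and the dot (.) expression. local-name() returns the name of the current element (the country here), and . returns its text value (the city).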

 

Screenshot of the test job, also showing the tExtractXMLField settings with the XPath:

xml-job-extract-details.png
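
For readers who want to try the idea outside Talend, here is a minimal Python sketch of the same technique. It is not the job from the screenshot: it assumes the loop runs over every child of /root/Capitals, and it uses the element's tag and text as stand-ins for XPath's local-name() and dot, because the standard-library ElementTree module does not evaluate XPath functions. The names extract_pairs and SAMPLE_XML are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<root>
<Capitals>
   <France>Paris</France>
   <UK>London</UK>
   <Russia>Moscow</Russia>
   <Portugal>Lisboa</Portugal>
   <China>Beijing</China>
</Capitals>
</root>
"""


def extract_pairs(xml_text, parent_tag="Capitals"):
    """Return (element name, element text) pairs for each child of parent_tag.

    Hypothetical helper: the child names are not known in advance, so we
    simply iterate over whatever children exist.
    """
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    parent = root.find(parent_tag)
    if parent is None:
        return []
    pairs = []
    for child in parent:
        # Equivalent of local-name(): drop a "{namespace}" prefix if present.
        name = child.tag.rsplit("}", 1)[-1]
        # Equivalent of ".": the text content of the current element.
        value = (child.text or "").strip()
        pairs.append((name, value))
    return pairs


if __name__ == "__main__":
    for country, city in extract_pairs(SAMPLE_XML):
        print(f"{country} - {city}")
```

In the Talend component, the corresponding setup would presumably be a loop XPath that selects all children of Capitals, with one output column mapped to local-name() and the other to the dot; check the attached job for the exact values.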

Output:

Starting job XmlDynamicElementNames at 23:25 03/05/2019.

[statistics] connecting to socket on port 3875
[statistics] connected
.---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------.
|                                                                                        #1. tLogRow_1                                                                                        |
+-----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| key | value                                                                                                                                                                                 |
+-----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| xml | <?xml version="1.0" encoding="UTF-8"?><root><Capitals><France>Paris</France><UK>London</UK><Russia>Moscow</Russia><Portugal>Lisboa</Portugal><China>Beijing</China></Capitals></root> |
+-----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

.------------------.
|  #1. tLogRow_2   |
+---------+--------+
| key     | value  |
+---------+--------+
| country | France |
| city    | Paris  |
+---------+--------+

.------------------.
|  #2. tLogRow_2   |
+---------+--------+
| key     | value  |
+---------+--------+
| country | UK     |
| city    | London |
+---------+--------+

.------------------.
|  #3. tLogRow_2   |
+---------+--------+
| key     | value  |
+---------+--------+
| country | Russia |
| city    | Moscow |
+---------+--------+

.--------------------.
|   #4. tLogRow_2    |
+---------+----------+
| key     | value    |
+---------+----------+
| country | Portugal |
| city    | Lisboa   |
+---------+----------+

.-------------------.
|   #5. tLogRow_2   |
+---------+---------+
| key     | value   |
+---------+---------+
| country | China   |
| city    | Beijing |
+---------+---------+

[statistics] disconnected

Job XmlDynamicElementNames ended at 23:25 03/05/2019. [exit code=0]

I'm also attaching the exported job as a zip file.

