[resolved] How to Parse a nested XML Document

One Star

Hi, I have a problem parsing a deeply nested XML document that contains a weather forecast. I'd like to parse the XML input and convert it into a relational format, meaning that some values from the parent XML elements must be extracted for each child element so that my relational result has no missing values.
I tried creating a File XML metadata entry first, but I am not getting the results I want in Step 4 of 5, where I map the source schema to the target schema. I do not quite understand what the parameters "Xpath loop expression" and "Loop limit" denote. (I am comfortable with XPath, but not sure how to put it to use within Talend Open Data Studio.)
Typical XML Input:
<?xml version="1.0" encoding="UTF-8"?>
<SiteRep>
 <Location i="14" lat="54.9375" lon="-2.8092" name="CARLISLE AIRPORT" country="ENGLAND" continent="EUROPE">
  <Period type="Day" value="2017-01-03Z">
   <Rep D="WSW" Gn="22" Hn="93" PPd="12" S="13" V="VG" Dm="7" FDm="4" W="7" U="1">Day</Rep>
   <Rep D="W" Gm="22" Hm="91" PPn="16" S="11" V="VG" Nm="1" FNm="-1" W="7">Night</Rep>
  </Period>
  <Period type="Day" value="2017-01-04Z">
   <Rep D="NW" Gn="11" Hn="74" PPd="0" S="7" V="VG" Dm="4" FDm="1" W="1" U="1">Day</Rep>
   <Rep D="SSE" Gm="2" Hm="90" PPn="1" S="2" V="GO" Nm="-3" FNm="-5" W="0">Night</Rep>
  </Period>
 </Location>
 <Location i="22" lat="53.5797" lon="-0.3472" name="HUMBERSIDE AIRPORT" country="ENGLAND" continent="EUROPE" elevation="24.0">
  <Period type="Day" value="2017-01-03Z">
   <Rep D="WSW" Gn="27" Hn="87" PPd="6" S="13" V="GO" Dm="8" FDm="3" W="7" U="1">Day</Rep>
   <Rep D="W" Gm="31" Hm="88" PPn="34" S="18" V="VG" Nm="4" FNm="0" W="7">Night</Rep>
  </Period>
  <Period type="Day" value="2017-01-04Z">
   <Rep D="NW" Gn="29" Hn="71" PPd="1" S="16" V="VG" Dm="6" FDm="1" W="3" U="1">Day</Rep>
   <Rep D="NW" Gm="20" Hm="87" PPn="1" S="11" V="VG" Nm="-1" FNm="-3" W="0">Night</Rep>
  </Period>
 </Location>
</SiteRep>
The output I am attempting to create is that for each grandparent <Location>, parent <Period> and its child <Rep>, I will get :

Accepted Solutions
Community Manager

Re: [resolved] How to Parse a nested XML Document

Hi 
Set the Xpath loop expression to /SiteRep/Location/Period/Rep. You will then be able to extract all the attribute values of each Rep item. After that, use tJavaRow + tDenormalize to denormalize the rows into one row per group (Location and ForecastDate).
Regards
Shong
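
For readers who want to see the idea outside Talend, here is a minimal Python sketch of the two steps described above. It is not Talend code and not what tJavaRow or tDenormalize generate. It only illustrates looping at the Rep level while carrying values down from the Period and Location ancestors (in XPath terms, relative paths such as ../@value and ../../@name), and then collapsing the Day and Night rows into one row per Location and forecast date. The function names, the output column names, and the trimmed-down sample (only the D and S attributes) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed-down copy of the thread's XML (only a few attributes kept).
SAMPLE = """<SiteRep>
 <Location i="14" name="CARLISLE AIRPORT">
  <Period type="Day" value="2017-01-03Z">
   <Rep D="WSW" S="13">Day</Rep>
   <Rep D="W" S="11">Night</Rep>
  </Period>
  <Period type="Day" value="2017-01-04Z">
   <Rep D="NW" S="7">Day</Rep>
   <Rep D="SSE" S="2">Night</Rep>
  </Period>
 </Location>
</SiteRep>"""


def flatten(xml_text):
    """Return one row per Rep, repeating the Location and Period values.

    This mirrors looping on /SiteRep/Location/Period/Rep and reading
    the ancestor attributes for every Rep element.
    """
    root = ET.fromstring(xml_text)  # root is the SiteRep element
    rows = []
    for location in root.findall("Location"):
        for period in location.findall("Period"):
            for rep in period.findall("Rep"):
                rows.append({
                    "location": location.get("name"),     # ../../@name
                    "forecast_date": period.get("value"),  # ../@value
                    "rep_type": rep.text,                  # "Day" or "Night"
                    "D": rep.get("D"),
                    "S": rep.get("S"),
                })
    return rows


def denormalize(rows):
    """Collapse the Day and Night rows into one row per (location, date)."""
    grouped = defaultdict(dict)
    for row in rows:
        key = (row["location"], row["forecast_date"])
        prefix = row["rep_type"].lower()  # "day" or "night"
        grouped[key][prefix + "_D"] = row["D"]
        grouped[key][prefix + "_S"] = row["S"]
    return [
        {"location": loc, "forecast_date": date, **cols}
        for (loc, date), cols in grouped.items()
    ]


if __name__ == "__main__":
    for wide_row in denormalize(flatten(SAMPLE)):
        print(wide_row)
```

Looping on the deepest element (Rep) is what guarantees one output row per forecast. Looping on Location or Period instead would give one row per parent, and the repeated child values would have to be handled separately.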

All Replies
One Star

Re: [resolved] How to Parse a nested XML Document

Thanks Shong, worked a treat!