Four Stars

How to parse XML in loop and find Index value

I am trying to parse the XML below and need advice on two points: how to loop over rowdata (which is repeated twice) and how to get the RepeatIndex value.

 

<main>
  ..
  <ReasonLst RepeatingType="PageLst">
    <rowdata RepeatIndex="1">
      <Desc>Description</Desc>
      ..
      <rowdata RepeatIndex="Index_list">4</rowdata>
    </rowdata>
    <rowdata RepeatIndex="2">
      <Desc>Description</Desc>
      ..
      <rowdata RepeatIndex="Index_list">5</rowdata>
    </rowdata>
  </ReasonLst>
  ..
</main>

7 REPLIES
Community Manager

Re: How to parse XML in loop and find Index value

Hi
Since you are new to this, I suggest going to Repository -> Metadata and creating an XML metadata entry by following the wizard; the Loop XPath query element should be set to rowdata.
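
To make the idea of a loop element concrete outside Talend, here is a minimal Python sketch. It assumes a cleaned-up version of the sample from the question; the XML text and the function name parse_rows are hypothetical and not part of the original job. The loop path yields one row per direct rowdata child, and the RepeatIndex attribute is read from each of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed version of the sample in the question.
SAMPLE = """
<main>
  <ReasonLst RepeatingType="PageLst">
    <rowdata RepeatIndex="1">
      <Desc>Description 1</Desc>
    </rowdata>
    <rowdata RepeatIndex="2">
      <Desc>Description 2</Desc>
    </rowdata>
  </ReasonLst>
</main>
"""


def parse_rows(xml_text):
    """Return one dict per rowdata element found under ReasonLst."""
    root = ET.fromstring(xml_text)
    rows = []
    # "ReasonLst/rowdata" plays the role of the loop XPath query:
    # every matching element becomes one output row.
    for row in root.findall("ReasonLst/rowdata"):
        rows.append({
            "repeat_index": row.get("RepeatIndex"),  # attribute value
            "desc": row.findtext("Desc"),            # child element text
        })
    return rows


rows = parse_rows(SAMPLE)
for r in rows:
    print(r)
```

If only the first occurrence comes back, the loop path is usually pointing at a parent element rather than at the repeating element itself.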
Let me know if you have any questions.

Regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
Four Stars

Re: How to parse XML in loop and find Index value

Thanks for the quick response, Shong. Actually, I followed Metadata -> File XML and selected all the fields.

1) I get only the first occurrence of rowdata, not the second.

2) Even for the first occurrence, I get the index name that appears inside the quotes, but not the value.

I am looking for an option to get these additional fields while parsing.

 

 

Four Stars

Re: How to parse XML in loop and find Index value

@shong - I have attached a sample XML. I created metadata for it and used it in tFileInputXML. tLogRow shows only the values below; the rest are not parsed.

 

bk001|Writer|The First Book|Fiction|44.95|23-03-0007|An amazing story of nothing.|PageList|1|Bundle1|20170525T040000.000 GMT|Reasons1|User1|Name1|20170525T195225.384 GMT|New1|New Bundle1|Bundle1|Property|Index_Reason

 

Could you please help with this?

 

Community Manager

Re: How to parse XML in loop and find Index value

Hi

Set the XPath loop query to /x:books/book/ReasonList/rowdata and you should be able to extract all records; see the attached screenshot (1.png).
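
As a rough illustration of what that loop query does, the Python sketch below walks the same path by hand. The namespace URI, the element names under book, and the Bundle field are assumptions made for the example; only the path /x:books/book/ReasonList/rowdata comes from the thread. Each rowdata produces one row, and the book-level fields are repeated on every row.

```python
import xml.etree.ElementTree as ET

NS_URI = "urn:example:books"  # hypothetical namespace bound to prefix x

# Hypothetical sample shaped to match /x:books/book/ReasonList/rowdata.
SAMPLE = """
<x:books xmlns:x="urn:example:books">
  <book id="bk001">
    <author>Writer</author>
    <title>The First Book</title>
    <ReasonList RepeatingType="PageList">
      <rowdata RepeatIndex="1"><Bundle>Bundle1</Bundle></rowdata>
      <rowdata RepeatIndex="2"><Bundle>Bundle2</Bundle></rowdata>
    </ReasonList>
  </book>
</x:books>
"""


def loop_rows(xml_text):
    """Emit one tuple per rowdata, carrying the parent book fields along."""
    root = ET.fromstring(xml_text)
    # Only the root element is namespace-qualified in this sample.
    if root.tag != "{%s}books" % NS_URI:
        raise ValueError("unexpected root element: %s" % root.tag)
    rows = []
    for book in root.findall("book"):
        for row in book.findall("ReasonList/rowdata"):
            rows.append((
                book.get("id"),
                book.findtext("author"),
                book.findtext("title"),
                row.get("RepeatIndex"),
                row.findtext("Bundle"),
            ))
    return rows


book_rows = loop_rows(SAMPLE)
for r in book_rows:
    print("|".join(r))
```

The nested loops are used because ElementTree's XPath subset has no way to fetch parent fields from the rowdata element; in Talend the same effect comes from relative XPath expressions in the field mapping.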

 

Regards

Shong

Four Stars

Re: How to parse XML in loop and find Index value

@shong - Thanks for the response.

Now I get a couple of rows, but I don't see the value 7 / 8 after the Index_Reason field. How can I get it?

 

After parsing...

bk001|Writer|The First Book|Fiction|44.95|2000-10-01|An amazing story of nothing.|PageList|1|Bundle1|20170525T040000.000 GMT|Reasons1|User1|Name1|20170525T195225.384 GMT|New1|New Bundle1|Bundle1|Property|Index_Reason|PageList|2|Bundle2|20170525T040000.000 GMT|Reasons2|User2|Name2|20170525T195225.384 GMT|New2|New Bundle2|Bundle2|Property|Index_Reason


bk001|Writer|The First Book|Fiction|44.95|2000-10-01|An amazing story of nothing.|PageList|2|Bundle2|20170525T040000.000 GMT|Reasons2|User2|Name2|20170525T195225.384 GMT|New2|New Bundle2|Bundle2|Property|Index_Reason|PageList|2|Bundle2|20170525T040000.000 GMT|Reasons2|User2|Name2|20170525T195225.384 GMT|New2|New Bundle2|Bundle2|Property|Index_Reason

 

 

Some of the fields are common and only a few are repeated twice, so my objective is to get a single record with the repeated fields as additional columns. I tried tDenormalize keyed on id, author, title, genre, price, pub_date and review, but it didn't help. I then used tMap to copy the data into two different output files and a second tMap to join them on the same key fields; this created duplicate records, which I removed with tUniqueRow.

 

The result now looks like the row below, except for the index value, which I still need your input on. Is there a better way to do this?

 

bk001|Writer|The First Book|Fiction|44.95|2000-10-01|An amazing story of nothing.|PageList|1|Bundle1|20170525T040000.000 GMT|Reasons1|User1|Name1|20170525T195225.384 GMT|New1|New Bundle1|Bundle1|Property|Index_Reason|2|Bundle2|20170525T040000.000 GMT|Reasons2|User2|Name2|20170525T195225.384 GMT|New2|New Bundle2|Bundle2|Property|Index_Reason
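
The target shape described above (one record per key, with the repeated fields appended as extra columns) can be sketched in plain Python as follows. This is not what tDenormalize or tMap do internally; the function name widen and the field names are hypothetical and only illustrate the pivot.

```python
def widen(rows, key_fields, repeated_fields):
    """Collapse rows sharing the same key into one wide record.

    Repeated fields get a numeric suffix (_1, _2, ...) in arrival order.
    """
    wide_rows = {}  # key tuple -> wide record (dicts keep insertion order)
    counts = {}     # key tuple -> number of rows merged so far
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in wide_rows:
            wide_rows[key] = {f: row[f] for f in key_fields}
            counts[key] = 0
        counts[key] += 1
        for f in repeated_fields:
            wide_rows[key]["%s_%d" % (f, counts[key])] = row[f]
    return list(wide_rows.values())


# Hypothetical input: two parsed rows for the same book.
parsed = [
    {"id": "bk001", "title": "The First Book", "repeat_index": "1", "bundle": "Bundle1"},
    {"id": "bk001", "title": "The First Book", "repeat_index": "2", "bundle": "Bundle2"},
]
wide = widen(parsed, ["id", "title"], ["repeat_index", "bundle"])
print(wide)
```

Because the merge is keyed, no duplicate records are produced, so a separate de-duplication step is not needed in this sketch.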

 

 

 

 

Community Manager

Re: How to parse XML in loop and find Index value


New_User wrote:

@shong - Thanks for the response.

Now I get a couple of rows, but I don't see the value 7 / 8 after the Index_Reason field. How can I get it?

 


See the screenshot (1.png): you should extract the field Indexes/rowdata to get the value 7 / 8.
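
The difference between the index name and the index value can be shown with a short Python sketch. The XML layout below is an assumption (the attached file is not visible in the thread); it supposes that Index_Reason is an attribute of a nested rowdata under Indexes and that 7 / 8 is that element's text. The relative path Indexes/rowdata is evaluated from each loop element.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the nested rowdata carries the name as an
# attribute and the value as element text.
SAMPLE = """
<ReasonList RepeatingType="PageList">
  <rowdata RepeatIndex="1">
    <Indexes><rowdata RepeatIndex="Index_Reason">7</rowdata></Indexes>
  </rowdata>
  <rowdata RepeatIndex="2">
    <Indexes><rowdata RepeatIndex="Index_Reason">8</rowdata></Indexes>
  </rowdata>
</ReasonList>
"""


def index_fields(xml_text):
    """Return (outer index, nested index name, nested index value) per row."""
    root = ET.fromstring(xml_text)
    out = []
    for row in root.findall("rowdata"):
        nested = row.find("Indexes/rowdata")  # relative to the loop element
        name = nested.get("RepeatIndex") if nested is not None else None
        value = nested.text if nested is not None else None
        out.append((row.get("RepeatIndex"), name, value))
    return out


index_rows = index_fields(SAMPLE)
for r in index_rows:
    print(r)
```

Mapping only the attribute returns Index_Reason; the element itself has to be mapped as a separate field to get its text.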

Four Stars

Re: How to parse XML in loop and find Index value

@shong - Thanks for the prompt response.

 

At the moment I specify the XPath for each loop manually. Is there an option to parse automatically from an XSD, so that I don't need to worry when another file has a different loop?

 

I tried tFileInputRaw (read the XML file) -> tHMap (XML input, flat output) -> tFileOutputDelimited. In this flow I cannot get a header, and there is no option to specify the delimiter to produce a CSV file. Is there a way to get the schema from the XSD and specify the delimiter in this scenario?

 

I am looking for an option that, given an XML input, loops through the fields as additional rows and writes comma-delimited output with a header.

 

Please share your thoughts.
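
For comparison, the behaviour asked for here (loop over a repeating element, derive the columns from whatever child fields appear, write a header and a chosen delimiter) looks roughly like this outside Talend. It is a plain Python sketch, not a tHMap or XSD-driven solution; the function name xml_to_csv and the sample elements are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_text, loop_path, delimiter=","):
    """Flatten each element matched by loop_path into one delimited row."""
    root = ET.fromstring(xml_text)
    rows = [
        {child.tag: (child.text or "").strip() for child in elem}
        for elem in root.findall(loop_path)
    ]
    # Build the header from the fields seen, in first-seen order.
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty strings
    return buf.getvalue()


# Hypothetical input with a field that appears only in the second row.
SAMPLE = """
<ReasonList>
  <rowdata><Bundle>Bundle1</Bundle><Reason>Reasons1</Reason></rowdata>
  <rowdata><Bundle>Bundle2</Bundle><Reason>Reasons2</Reason><User>User2</User></rowdata>
</ReasonList>
"""

csv_text = xml_to_csv(SAMPLE, "rowdata")
print(csv_text)
```

The loop path is still supplied by hand in this sketch; deriving it from an XSD would need a schema-aware step that is not shown.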