Is it possible to read XML node content as a string with Talend?


I need to read the content of the <claims> tag as is, but when I read it using Talend's tFileInputXML (with the SAX parser), the inner text that is not wrapped in its own tag is lost.

 

INPUT XML

<ipa><us-patent-application>
<claims id="claims">
<claim id="CLM-00002" num="00002">
<claim-text><b>2</b>. The controller according to
<claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein the sampling module is arranged to sample the signal indicative of the output current of the SMPS at a frequency that is an integer multiple of the switching frequency of the SMPS.</claim-text>
</claim>
<claim id="CLM-00003" num="00003">
<claim-text><b>3</b>. The controller according to
<claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein the ripple component estimation module is arranged to estimate the ripple component by use of an interpolated low-pass filter.</claim-text>
</claim>
</claims>
</us-patent-application></ipa>

 

TALEND JOB

 

[Screenshot: Screen Shot 2018-05-23 at 3.50.06 PM.png]

 

OUTPUT XML AFTER PARSING

 

<claims id="claims">
 <claim id="CLM-00002" num="00002">
<claim-text><b>2</b>
<claim-ref idref="CLM-00001">claim 1</claim-ref>
</claim-text>
</claim>
<claim id="CLM-00003" num="00003">
<claim-text><b>3</b>
<claim-ref idref="CLM-00001">claim 1</claim-ref>
</claim-text>
</claim>
</claims>

 

How can I read the content of the <claims> XML tag as is?


Re: Is it possible to read XML node content as a string with Talend?

Hi,

 

The incoming XML doesn't seem to be formed correctly. The claim-text tag contains the claim-ref tag within it, mixed in with plain text:

 

<claim-text><b>2</b>. The controller according to
<claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein the sampling module is arranged to sample the signal indicative of the output current of the SMPS at a frequency that is an integer multiple of the switching frequency of the SMPS.</claim-text>

 

That's possibly the cause of the problem.

 

Thanks David

Regards David

Re: Is it possible to read XML node content as a string with Talend?

Hello linto_cheeran,

 

I see two records wrapped in that XML, so you probably need to apply a loop on claim-ref, as below:

/ipa/us-patent-application/claims/claim/claim-text/claim-ref

Also consider including the required fields. Removing the "Get Nodes" option will give you the text inside the XML tags.
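
For comparison, here is a minimal sketch of reading a node "as is" with the JDK's standard DOM and Transformer classes, outside of any Talend component configuration. Text that sits directly inside claim-text next to child tags is ordinary mixed content, which is well-formed XML, so a DOM parser keeps it. The class name ClaimsExtractor and the method nodeAsString are hypothetical names chosen for illustration; they are not part of Talend. The same logic could probably be adapted inside a custom Java component such as tJavaRow, but that is an assumption that has not been tested in a job.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not a Talend API: serializes one element back to XML text.
public class ClaimsExtractor {

    /**
     * Returns the first element named tagName, serialized as an XML string
     * (tags, attributes and mixed text content included), or null if absent.
     */
    public static String nodeAsString(String xml, String tagName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        Node node = doc.getElementsByTagName(tagName).item(0);
        if (node == null) {
            return null;
        }

        // An identity transform copies the DOM subtree to text unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Shortened version of the input XML from the question.
        String xml = "<ipa><us-patent-application><claims id=\"claims\">"
                + "<claim id=\"CLM-00002\" num=\"00002\">"
                + "<claim-text><b>2</b>. The controller according to "
                + "<claim-ref idref=\"CLM-00001\">claim 1</claim-ref>, wherein the sampling "
                + "module is arranged to sample the signal.</claim-text>"
                + "</claim></claims></us-patent-application></ipa>";

        System.out.println(nodeAsString(xml, "claims"));
    }
}
```

This loads the whole document into memory, so for very large patent files a streaming approach may be more appropriate. Real files that declare a DOCTYPE may also need the parser configured so that it can resolve or skip the DTD.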

 

Regards,

Praveen Kumar Bandi
