Is it possible to read XML node content as a string with Talend?

I need to read the content of the <claims> tag exactly as it is, but when I read it with Talend's tFileInputXML (with the SAX parser), the inner text that is not wrapped in its own tag is lost.

INPUT XML

<ipa><us-patent-application>
<claims id="claims">
<claim id="CLM-00002" num="00002">
<claim-text><b>2</b>. The controller according to
<claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein the sampling module is arranged to sample the signal indicative of the output current of the SMPS at a frequency that is an integer multiple of the switching frequency of the SMPS.</claim-text>
</claim>
<claim id="CLM-00003" num="00003">
<claim-text><b>3</b>. The controller according to
<claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein the ripple component estimation module is arranged to estimate the ripple component by use of an interpolated low-pass filter.</claim-text>
</claim>
</claims>
</us-patent-application></ipa>

 

TALEND JOB

[Screenshot of the Talend job: Screen Shot 2018-05-23 at 3.50.06 PM.png]

OUTPUT XML after parsing

 

<claims id="claims">
 <claim id="CLM-00002" num="00002">
<claim-text><b>2</b>
<claim-ref idref="CLM-00001">claim 1</claim-ref>
</claim-text>
</claim>
<claim id="CLM-00003" num="00003">
<claim-text><b>3</b>
<claim-ref idref="CLM-00001">claim 1</claim-ref>
</claim-text>
</claim>
</claims>

 

How can I read the content of the <claims> XML tag exactly as it is?

2 REPLIES

Re: Is it possible to read XML node content as a string with Talend?

Hi,

 

The incoming XML may not be formed correctly: the claim-text tag contains the claim-ref tag within it.

 

<claim-text><b>2</b>. The controller according to
<claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein the sampling module is arranged to sample the signal indicative of the output current of the SMPS at a frequency that is an integer multiple of the switching frequency of the SMPS.</claim-text>

 

That's possibly the cause of the problem.
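
To check how a parser actually sees that element, here is a minimal sketch using only the JDK's DOM parser (not Talend's own code). It lists the direct children of claim-text, so any text sitting between the child tags shows up as its own text node. The class name MixedContentCheck, the helper describeChildren, and the shortened sample string are illustrative assumptions, not part of the original job.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MixedContentCheck {

    // Hypothetical helper: labels each direct child of the root element
    // as "text" or "<tagname>". Throws if the XML is not well-formed.
    public static List<String> describeChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> kinds = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.TEXT_NODE) {
                    kinds.add("text");
                } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                    kinds.add("<" + child.getNodeName() + ">");
                }
            }
            return kinds;
        } catch (Exception e) {
            throw new RuntimeException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the claim-text element from the question.
        String xml = "<claim-text><b>2</b>. The controller according to\n"
                + "<claim-ref idref=\"CLM-00001\">claim 1</claim-ref>"
                + ", wherein the sampling module is arranged to sample the signal.</claim-text>";
        System.out.println(describeChildren(xml));
    }
}
```

If the parse succeeds, the nesting itself is well-formed, and the "text" entries are the pieces that a reader keeping only element content could drop.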

 

Thanks David

Regards

David


Re: Is it possible to read XML node content as a string with Talend?

Hello linto_cheeran,

 

I see two records wrapped in that XML, so you probably need to loop on the claim-ref element, like below:

/ipa/us-patent-application/claims/claim/claim-text/claim-ref

Also consider including the required fields and removing the "Get Nodes" option; that will give you the text inside the XML tags.
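
If the goal is the whole <claims> element as one string, a fallback outside tFileInputXML is to serialize the node with the JDK's own XML APIs. The sketch below is only that: the class name ClaimsAsString and the helper nodeAsString are hypothetical, and wiring it into the job (for example through a custom Java routine) is an assumption I have not tested in Talend.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ClaimsAsString {

    // Hypothetical helper: returns the first node matched by the XPath,
    // serialized with its tags and mixed text, or null if nothing matches.
    public static String nodeAsString(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODE);
            if (node == null) {
                return null;
            }
            // Identity transform: copies the subtree to text unchanged.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException("Could not serialize node for " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the input XML from the question.
        String xml = "<ipa><us-patent-application><claims id=\"claims\">"
                + "<claim id=\"CLM-00002\" num=\"00002\">"
                + "<claim-text><b>2</b>. The controller according to "
                + "<claim-ref idref=\"CLM-00001\">claim 1</claim-ref>"
                + ", wherein the sampling module is arranged to sample the signal.</claim-text>"
                + "</claim></claims></us-patent-application></ipa>";
        System.out.println(nodeAsString(xml, "/ipa/us-patent-application/claims"));
    }
}
```

Because the subtree is copied node by node, the text between <b> and <claim-ref> stays in the resulting string instead of being dropped.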

 

Regards,

Praveen Kumar Bandi