Error processing resource while parsing XML with '&' symbol

One Star

Hi,
I want to parse XML in a Talend job.
My XML contains special characters such as "&" (written as "&amp;"), "<" (as "&lt;"), ">" (as "&gt;"), "'" (as "&apos;"), etc.
How can I replace these special characters while parsing the XML file in Talend?
Sample XML:
<?xml version="1.0"?>
<Extract>
<Record>
<ID>1</ID>
<NAME>Product 1</NAME>
<ATTS>
<ATT>Me & my attribute</ATT>
<ATT>Another attribute</ATT>
</ATTS>
</Record>
<Record>
<ID>2</ID>
<NAME>Product 2</NAME>
<ATTS>
<ATT>Foo attribute</ATT>
<ATT>Bar <br />attribute</ATT>
</ATTS>
</Record>
<Record>
<ID>3</ID>
<NAME>Product 3</NAME>
<ATTS>
<ATT>John Doe attribute</ATT>
<ATT>Foo & bar</ATT>
</ATTS>
</Record>
</Extract>
Please help me.
Thanks
Chin
Community Manager

Re: Error processing resource while parsing XML with '&' symbol

Hi Chin
What error do you get? I am able to read this file, print the result on the console, and write it to a file.
Starting job test2 at 11:47 20/03/2013.

connecting to socket on port 3899
connected
1|Product 1|Me & my attribute
1|Product 1|Another attribute
2|Product 2|Foo attribute
2|Product 2|Bar <br />attribute
3|Product 3|John Doe attribute
3|Product 3|Foo & bar
disconnected
Job test2 ended at 11:47 20/03/2013.

Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Error processing resource while parsing XML with '&' symbol

Hi Shong,
I understand your advice.
But my files arrive in a zip archive named Test199512345.zip.
Unzipping it produces a folder named TEST-199512345(123456), which contains an XML file named TEST-199512345(123456)-TT.xml.
Here is what I did:
Step 1: Unzip the archive into a temporary folder.
Step 2: Rename the folder, replacing the brackets () with a hyphen, e.g. TEST-199512345-123456
Step 3: Rename the file the same way, e.g. TEST-199512345-123456-TT.xml
Steps 1 to 3 work. I used a shell script to remove the special characters from the folder and file names.
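For readers who would rather keep Steps 2 and 3 inside the job than in a shell script, here is a minimal Java sketch of the same renaming rule. The class and method names (NameSanitizer, sanitize, renameSanitized) are hypothetical, and the rule itself (an opening bracket becomes a hyphen, a closing bracket is dropped) is only inferred from the example names in the steps above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; in Talend this could live in a user routine.
final class NameSanitizer {

    // "(" becomes "-" and ")" is dropped, as in the Step 2 and Step 3 examples.
    static String sanitize(String name) {
        return name.replace("(", "-").replace(")", "");
    }

    // Renames a file or folder next to its original location and returns the new path.
    static Path renameSanitized(Path source) throws IOException {
        String cleaned = sanitize(source.getFileName().toString());
        return Files.move(source, source.resolveSibling(cleaned));
    }
}
```

Calling sanitize on the folder name first and on the file name afterwards mirrors the order of Steps 2 and 3.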
The job fails while loading the XML files into the table. I need to replace the special characters in the
tag below in the XML.
<FEEDBACK>
The Definitive Guide we offer a step by step guide on
how to install MongoDB and get it up and running smoothly.
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary. "there is ink" in Fig. 3 The MongoDB server requires a directory it can write
database files to and a port it can listen for connections on.
The following section covers the entire install on the two variants of system:
Windows and everything else (Linux, Max, Solaris). 200 is A2&apos;, A1 > A2 > A3 - A7 is
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary.
</FEEDBACK>
My Talend job throws the error below:
Exception in component tFileInputXML_1
org.dom4j.DocumentException: Error on line 15 of document : The entity name must immediately follow the '&' in the entity reference. Nested exception: The entity name must immediately follow the '&' in the entity reference.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.io.SAXReader.read(SAXReader.java:365)
at test_loading.test_3_0_load_data_0_1.TEST_3_0_LOAD_DATA.tMysqlInput_1Process(TEST_3_0_LOAD_DATA.java:2635)
at test_loading.test_3_0_load_data_0_1.TEST_3_0_LOAD_DATA$1.run(TEST_3_0_LOAD_DATA.java:17001)
Please help me out
Thanks
Chin
Community Manager

Re: Error processing resource while parsing XML with '&' symbol

Hi Chin
I tried to read the example you gave above and it works. I think the error occurs when reading another value that contains characters like "&&", for example:
<FEEDBACK>
This is a test example for && characters. hope it will help you!
</FEEDBACK>
One solution is to read the source XML file line by line with tFileInputFullRow and then replace "&&" with "<!]>" in tMap. For example,
<FEEDBACK>
This is a test example for && characters. hope it will help you!
</FEEDBACK>
will be:
<FEEDBACK>
This is a test example for <!]> characters. hope it will help you!
</FEEDBACK>
and write the rows to a new XML file. You can then read the new XML file without error. The job looks like:
tFileInputFullRow--main--tMap--tFileOutputOutput
|
onsubjobok
|
tFileInputXML--main--tLogRow
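
The read, replace, and write flow above can also be sketched in plain Java. Note that this sketch swaps in a different replacement rule from the one in the post: instead of targeting "&&" only, it escapes every "&" that does not already start one of the five predefined XML entities or a numeric character reference. The class and method names are hypothetical. It assumes the files declare no custom entities and contain no CDATA sections or comments with ampersands, because a line-based rewrite cannot tell those apart from ordinary text.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not part of Talend.
final class BareAmpersandFixer {

    // Matches an '&' that is NOT followed by amp; lt; gt; apos; quot;
    // or a decimal / hexadecimal character reference such as #38; or #x26;
    private static final Pattern BARE_AMP =
        Pattern.compile("&(?!(?:amp|lt|gt|apos|quot|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Escapes bare ampersands in one line; valid references are left untouched.
    static String fixLine(String line) {
        return BARE_AMP.matcher(line).replaceAll("&amp;");
    }

    // Copies 'in' to 'out' line by line, fixing each line on the way.
    // Assumes UTF-8, as declared in the sample files in this thread.
    static void fixFile(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(fixLine(line));
                writer.newLine();
            }
        }
    }
}
```

Because the rule is applied per line and skips references that are already valid, running it twice over the same file should not double-escape anything. A stray "<" in text content is a different problem and is not handled here, since it cannot be told apart from markup by pattern matching alone.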

Shong
One Star

Re: Error processing resource while parsing XML with '&' symbol

Hi Shong,
I have many XML files to process. It is not possible to add <!]> to each one.
I did the 3 steps and then parse the XML and load it into a table.
Step 4 failed.
I want to resolve this at parsing time itself.
Please suggest an alternative solution.
Thanks
Chin
One Star

Re: Error processing resource while parsing XML with '&' symbol

Hi Shong,
I tried running sed -i "s/&/&amp;/g" *.xml from Talend, passing the values in as parameters, and that worked.
But I cannot handle all the characters in one command, for example
"sed -i "s/&/&><&apos;/g" *.xml"
"<" as "&lt;", ">" as "&gt;", "'" as "&apos;", etc.
Please suggest a solution.
Community Manager

Re: Error processing resource while parsing XML with '&' symbol

The problem is that your XML file contains a double '&' symbol, like "&&". It is "&&" that causes the problem, not "&amp;", "&lt;" or "&gt;", so you have to change && to <!]>, as I did, before you read the XML file with the tFileInputXML component. Otherwise it will always throw this exception.
Shong
One Star

Re: Error processing resource while parsing XML with '&' symbol

Hi Shong,
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc >
<EMPS>
<EMP>
<STAFF>
<EMPCODE>111</EMPCODE>
<EMPDESIG>BA</EMPDESIG>
<DEPT>FIN</DEPT>
</STAFF>
<PERMANENT>
<ADDRESS>
<ADDRCODE>XX</ADDRCODE>
<ADDRCODE>ABCDE</ADDRCODE>
</ADDRESS>
</PERMANENT>
<FEEDBACK>
The Definitive Guide we offer a step by step guide on
how to install MongoDB and get it up and running smoothly.
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary. "there is ink" in Fig. 3 The MongoDB server requires a directory it can write
database files to and a port it can listen for connections on.
The following section covers the entire install on the two variants of system:
Windows and everything else (Linux, Max, Solaris). 200 is A2&apos;, A1 > A2 > A3 - A7 is
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary.
</FEEDBACK>
</EMP>
</EMPS>
Please explain how to change && to <!]>. I have around 300 XML files.
Thanks
Chin
One Star

Re: Error processing resource while parsing XML with '&' symbol

Where's the &&?
In response to your email, there's no && in the above example and Shong said he could process it ok.
One Star

Re: Error processing resource while parsing XML with '&' symbol

Please look at my message.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc >
<EMPS>
<EMP>
<STAFF>
<EMPCODE>111</EMPCODE>
<EMPDESIG>BA</EMPDESIG>
<DEPT>FIN</DEPT>
</STAFF>
<PERMANENT>
<ADDRESS>
<ADDRCODE>XX</ADDRCODE>
<ADDRCODE>ABCDE</ADDRCODE>
</ADDRESS>
</PERMANENT>
<FEEDBACK>
The Definitive Guide we offer a step by step guide on
how to install MongoDB and get it up and running smoothly.
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary. "there is ink" in Fig. 3 The MongoDB server requires a directory it can write
database files to and a port it can listen for connections on.
The following section covers the entire install on the two variants of system:
Windows and everything else (Linux, Max, Solaris). 200 is A2&apos;, A1 > A2 > A3 - A7 is
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary.
</FEEDBACK>
</EMP>
</EMPS>

Mr. Shong replied that it
will be:
<FEEDBACK>
This is a test example for <!]> characters. hope it will help you!
</FEEDBACK>
One Star

Re: Error processing resource while parsing XML with '&' symbol

See above!
One Star

Re: Error processing resource while parsing XML with '&' symbol

Please explain.
One Star

Re: Error processing resource while parsing XML with '&' symbol

Your xml example doesn't contain &&
You need to identify the file that is producing the error.
Which version are you using?
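
One way to find the file that produces the error, sketched in Java under stated assumptions: run every *.xml file in a folder through the JDK's own SAX parser and collect the first parse error for each file that fails. The class and method names are hypothetical, and the JDK parser is not necessarily the same parser build that Talend's dom4j reader uses, so line numbers and messages may differ slightly from what the job reports.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; names are illustrative and not part of Talend.
final class XmlWellFormedScan {

    // Returns null when the file parses, otherwise "line N: parser message".
    static String firstError(Path xml)
            throws IOException, ParserConfigurationException, SAXException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try (InputStream in = Files.newInputStream(xml)) {
            // DefaultHandler ignores content and rethrows fatal parse errors.
            parser.parse(in, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        }
    }

    // Maps each *.xml file directly inside 'dir' that fails to parse to its first error.
    static Map<Path, String> scan(Path dir)
            throws IOException, ParserConfigurationException, SAXException {
        Map<Path, String> failures = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : files) {
                String error = firstError(file);
                if (error != null) {
                    failures.put(file, error);
                }
            }
        }
        return failures;
    }
}
```

Pointing scan at the temporary unzip folder and printing the resulting map would show which files fail and on which line, which is the data the thread keeps asking for.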
One Star

Re: Error processing resource while parsing XML with '&' symbol

I am still not able to make this work with my XML files.
See below:
My requirement: I have around 30 zip archives.
I did the following:
Step 1: Unzip each archive into a temporary folder using
tFileList_1 --> tSystem
"unzip "+((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")) +" -d " + context.tempdirectory
When I open my XML file, it has the format below, and the job throws this error:
Exception in component tFileInputXML_1
org.dom4j.DocumentException: Error on line 15 of document : The entity name must immediately follow the '&' in the entity reference. Nested exception: The entity name must immediately follow the '&' in the entity reference.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc >
<EMPS>
<EMP>
<STAFF>
<EMPCODE>111</EMPCODE>
<EMPDESIG>BA</EMPDESIG>
<DEPT>FIN</DEPT>
</STAFF>
<PERMANENT>
<ADDRESS>
<ADDRCODE>XX</ADDRCODE>
<ADDRCODE>ABCDE</ADDRCODE>
</ADDRESS>
</PERMANENT>
<FEEDBACK>
The Definitive Guide we offer a step by step guide on
how to install MongoDB and get it up and running smoothly.
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary. "there is ink" in Fig. 3 The MongoDB server requires a directory it can write
database files to and a port it can listen for connections on.
The following section covers the entire install on the two variants of system:
Windows and everything else (Linux, Max, Solaris). 200 is A2&apos;, A1 > A2 > A3 - A7 is
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary.
</FEEDBACK>
</EMP>
</EMPS>
One Star

Re: Error processing resource while parsing XML with '&' symbol

Your example works fine in 5.0.2. Saying the same thing over and over again is not going to help.
Post an image of your job and your TOS version.
One Star

Re: Error processing resource while parsing XML with '&' symbol

I am using Talend 4.0.3 r47759. As the attached screenshot 4.png shows, tRunJob_1 throws this error:
Exception in component tFileInputXML_1
org.dom4j.DocumentException: Error on line 15 of document : The entity name must immediately follow the '&' in the entity reference. Nested exception: The entity name must immediately follow the '&' in the entity reference.
Please find attached the screenshots of my jobs.
My XML:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc >
<EMPS>
<EMP>
<STAFF>
<EMPCODE>111</EMPCODE>
<EMPDESIG>BA</EMPDESIG>
<DEPT>FIN</DEPT>
</STAFF>
<PERMANENT>
<ADDRESS>
<ADDRCODE>XX</ADDRCODE>
<ADDRCODE>ABCDE</ADDRCODE>
</ADDRESS>
</PERMANENT>
<FEEDBACK>
The Definitive Guide we offer a step by step guide on
how to install MongoDB and get it up and running smoothly.
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary. "there is ink" in Fig. 3 The MongoDB server requires a directory it can write
database files to and a port it can listen for connections on.
The following section covers the entire install on the two variants of system:
Windows and everything else (Linux, Max, Solaris). 200 is A2&apos;, A1 > A2 > A3 - A7 is
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary.
</FEEDBACK>
</EMP>
</EMPS>
One Star

Re: Error processing resource while parsing XML with '&' symbol

What's the runjob doing? Presumably that's where the tFileInputXML is?
Works in 4.0.2 as well.
One Star

Re: Error processing resource while parsing XML with '&' symbol

Hi Janhess,
tRunJob calls a child job that inserts the XML data into each table.
Here is the screenshot.
One Star

Re: Error processing resource while parsing XML with '&' symbol

That tFileInputXML doesn't match the example xml you posted. It's looping on /simple-patent-document/bibliographic-data which doesn't appear in your example. We can't help if you don't post the correct data.
One Star

Re: Error processing resource while parsing XML with '&' symbol

Sorry, that was the wrong screenshot.
Please see this screenshot.
One Star

Re: Error processing resource while parsing XML with '&' symbol

Your parameters are wrong.
Your loop path should be "/EMPS/EMP"
Your xpath query should be
EMPCODE "STAFF/EMPCODE"
EMPDESIG "STAFF/EMPDESIG"
DEPT "STAFF/DEPT"
ADDRESSCODE "ADDRESS/ADDRCODE"
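
As a rough illustration of how a loop path and relative XPath queries fit together, here is a small Java sketch using the JDK's javax.xml.xpath API rather than Talend itself. The class and method names are hypothetical. One assumption is worth flagging: in the sample XML as posted in this thread, ADDRESS sits inside PERMANENT, so the sketch uses "PERMANENT/ADDRESS/ADDRCODE"; evaluated relative to EMP, "ADDRESS/ADDRCODE" would yield an empty string for that sample.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; it mimics a loop path plus relative queries.
final class EmpXPathDemo {

    // Returns one "EMPCODE|EMPDESIG|DEPT|ADDRCODE" row per /EMPS/EMP element.
    static List<String> rows(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // The loop path selects the repeating element.
        NodeList emps = (NodeList) xpath.evaluate(
            "/EMPS/EMP", new InputSource(new StringReader(xml)), XPathConstants.NODESET);

        List<String> rows = new ArrayList<>();
        for (int i = 0; i < emps.getLength(); i++) {
            Node emp = emps.item(i);
            // Each query is relative to the current EMP node. When several
            // nodes match, the string value of the first one is returned.
            rows.add(xpath.evaluate("STAFF/EMPCODE", emp) + "|"
                + xpath.evaluate("STAFF/EMPDESIG", emp) + "|"
                + xpath.evaluate("STAFF/DEPT", emp) + "|"
                + xpath.evaluate("PERMANENT/ADDRESS/ADDRCODE", emp));
        }
        return rows;
    }
}
```

With two ADDRCODE elements under one ADDRESS, as in the sample, only the first value appears in the row. Getting one row per ADDRCODE would need a different loop path.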
One Star

Re: Error processing resource while parsing XML with '&' symbol

Yes, I know. Please see the problem with AddressCode:
<EMP>
<STAFF>
<EMPCODE>111</EMPCODE>
<EMPDESIG>BA</EMPDESIG>
<DEPT>FIN</DEPT>
</STAFF>
<PERMANENT>
<ADDRESS>
<ADDRCODE>XX</ADDRCODE>
<ADDRCODE>ABCDE</ADDRCODE>
</ADDRESS>
</PERMANENT>
<FEEDBACK>
The same loop path worked for me. The remaining problem is the <FEEDBACK> tag data, which contains characters such as A2&apos;, A1 > A2 > and <. This problem is yet to be resolved.
One Star

Re: Error processing resource while parsing XML with '&' symbol

I don't have any problem with it.
xpath is "FEEDBACK"
One Star

Re: Error processing resource while parsing XML with '&' symbol

Hi,

How can I get the values "homeDefault" and "true" from the XML below?
<name>
<chChSelectionMode><homeDefault/></chChSelectionMode>
<dynamicAddressFlag><true/></dynamicAddressFlag>
</name>

The tool identifies <homeDefault/> as a tag, but it is actually a value. How should I specify the XPath for it in the XML metadata?
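
In plain XPath 1.0, the name of an empty child element can be read with the name() function. The Java sketch below shows this with the JDK's javax.xml.xpath API; the class and method names are hypothetical. Whether the XPath query column of tFileInputXML accepts a function call like this is not something the sketch establishes, so a tJava or routine-based step is assumed.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for reading a "value" that is encoded as an empty element.
final class EmptyElementValue {

    // Returns the element name of the first child of the node at parentPath,
    // or an empty string when that node has no element children.
    static String childName(String xml, String parentPath) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(
            "name(" + parentPath + "/*[1])",
            new InputSource(new StringReader(xml)));
    }
}
```

For the XML in the question, childName(xml, "/name/chChSelectionMode") would give "homeDefault" and childName(xml, "/name/dynamicAddressFlag") would give "true".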
Four Stars

Re: Error processing resource while parsing XML with '&' symbol

Hi Banu
I believe we cannot configure this directly, but I found a custom solution. Please see the link below:
http://anilkumarburri.wordpress.com/2013/06/18/how-to-process-the-below-xml-file/
Hope it helps you.


thanks
Anil Kumar Burri
http://anilkumarburri.wordpress.com/
One Star

Re: Error processing resource while parsing XML with '&' symbol

Hi Anil,

The link you pasted is not opening.
Four Stars

Re: Error processing resource while parsing XML with '&' symbol

Hi Banu
It's working for me. Otherwise, try this:
http://anilkumarburri.wordpress.com/

thanks
Anil Kumar Burri
One Star

Re: Error processing resource while parsing XML with '&' symbol

Hi,
I get the error "<faultstring>The entity name must immediately follow the '&amp;' in the entity reference.</faultstring>"
when I load XML through tSOAP.
My XML contains:
<Name> See &amp; Go</Name>
One Star

Re: Error processing resource while parsing XML with '&' symbol

it helps!