How to replace special characters in xml

Hi,
My XML file contains text such as P2', P1 > P2' + P5, and P2' - P5 = P2 > P.
My Talend job throws the error below:
Exception in component tFileInputXML_1
org.dom4j.DocumentException: Error on line 15 of document : The entity name must immediately follow the '&' in the entity reference. Nested exception: The entity name must immediately follow the '&' in the entity reference.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.io.SAXReader.read(SAXReader.java:365)
at test_loading.test_3_0_load_data_0_1.TEST_3_0_LOAD_DATA.tMysqlInput_1Process(TEST_3_0_LOAD_DATA.java:2635)
at test_loading.test_3_0_load_data_0_1.TEST_3_0_LOAD_DATA$1.run(TEST_3_0_LOAD_DATA.java:17001)

Please help me load this data into a table successfully.
Chin
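
Note on the error above: the parser message "The entity name must immediately follow the '&'" generally means the file contains an `&` that does not start a complete reference such as `&amp;` or `&#39;`. A minimal sketch for locating such characters before the job runs is shown here. It is written in Python rather than as a Talend component, the function name `find_bare_ampersands` is hypothetical, and the sample text contains a bare `&` added purely for illustration.

```python
import re

# Matches an '&' that does NOT start a syntactically complete reference
# such as &amp; or &#39; or &#x27;. It only checks syntax: it does not
# verify that a named entity is actually declared.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def find_bare_ampersands(text):
    """Return 1-based (line, column) pairs for every bare '&' in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in BARE_AMP.finditer(line):
            hits.append((line_no, match.start() + 1))
    return hits


# Illustrative input only: the '&' before 'A3' is an assumption.
sample = "<FEEDBACK>\nA2&apos;, A1 > A2 & A3\n</FEEDBACK>"
print(find_bare_ampersands(sample))  # [(2, 19)]
```

The line number reported by the parser ("line 15" in the trace) is a good first place to look when comparing against this output.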

Re: How to replace special characters in xml

Hmm, same here. Any help in resolving this would be much appreciated.

Re: How to replace special characters in xml

I'm not sure, but this looks like HTML content inside the XML, and for that you have to use a tag to parse this XML. Could you please show us your input XML? That will help us give more suggestions.

Re: How to replace special characters in xml

Hi Umeshrakhe,
Please find the sample XML below.
<?xml version="1.0"?>
<EMPS>
<EMP>
<STAFF>
<EMPCODE>111</EMPCODE>
<EMPDESIG>BA</EMPDESIG>
<DEPT>FIN</DEPT>
</STAFF>
<PERMANENT>
<ADDRESS>
<ADDRCODE>XX</ADDRCODE>
<ADDRCODE>ABCDE</ADDRCODE>
</ADDRESS>
</PERMANENT>
<FEEDBACK>
The Definitive Guide we offer a step by step guide on
how to install MongoDB and get it up and running smoothly.
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary. "there is ink" in Fig. 3 The MongoDB server requires a directory it can write
database files to and a port it can listen for connections on.
The following section covers the entire install on the two variants of system:
Windows and everything else (Linux, Max, Solaris). 200 is A2&apos;, A1 > A2 > A3 - A7 is
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary.
</FEEDBACK>
</EMP>
</EMPS>

Re: How to replace special characters in xml

Hi Chin,
Thank you for the sample file. If you are not able to parse the special characters, please see the screenshot below: I am able to parse the file correctly.

I created a sample job for it and it works.
I recommend setting the tFileInputXML advanced property:
for Encoding, select CUSTOM and enter "UTF-8".
I am using Talend version 5.2.

Hope this helps.

Re: How to replace special characters in xml

Hi Umeshrakhe,
I understand your advice, but here is my situation:
My files arrive in a zip archive named Test199512345.zip.
Unzipping it produces a folder named TEST-199512345(123456), which contains an XML file named TEST-199512345(123456)-TT.xml.
Here is what I did:
Step 1: Unzip the archive into a temporary folder.
Step 2: Rename the folder, replacing the brackets () with a hyphen, e.g. TEST-199512345-123456
Step 3: Rename the file the same way, e.g. TEST-199512345-123456-TT.xml
I completed Steps 1 to 3 using a shell script to remove the special characters from the folder and file names.
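
For readers who want to reproduce Steps 1 to 3, here is a rough sketch. The shell script mentioned above was not posted, so this is a Python stand-in and not the script actually used. The function names `clean_name` and `unzip_and_rename` are hypothetical, and the renaming rule (an opening bracket becomes a hyphen, a closing bracket is dropped) is inferred from the example names in the steps.

```python
import zipfile
from pathlib import Path


def clean_name(name):
    """'TEST-199512345(123456)-TT.xml' -> 'TEST-199512345-123456-TT.xml'."""
    # Assumed rule: '(' becomes a hyphen, ')' is removed.
    return name.replace("(", "-").replace(")", "")


def unzip_and_rename(zip_path, temp_dir):
    """Extract zip_path into temp_dir, then strip brackets from all names."""
    temp_dir = Path(temp_dir)
    temp_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: unzip into the temporary folder.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(temp_dir)

    # Steps 2 and 3: rename the deepest entries first, so that renaming
    # a folder never invalidates the path of a file still to be renamed.
    entries = sorted(temp_dir.rglob("*"), key=lambda p: len(p.parts), reverse=True)
    for path in entries:
        new_name = clean_name(path.name)
        if new_name != path.name:
            path.rename(path.with_name(new_name))
    return temp_dir
```

A call such as `unzip_and_rename("Test199512345.zip", "tmp")` would leave `tmp/TEST-199512345-123456/TEST-199512345-123456-TT.xml` on disk, assuming the archive is laid out as described in this post.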
Where I am stuck is loading the XML files into the table: I need to replace the special characters inside the tag below in the XML.
<FEEDBACK>
The Definitive Guide we offer a step by step guide on
how to install MongoDB and get it up and running smoothly.
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary. "there is ink" in Fig. 3 The MongoDB server requires a directory it can write
database files to and a port it can listen for connections on.
The following section covers the entire install on the two variants of system:
Windows and everything else (Linux, Max, Solaris). 200 is A2&apos;, A1 > A2 > A3 - A7 is
Precompiled binaries are available for Linux, Mac OS X, Windows,
and Solaris. On most platforms you can download the archive from mongodb.org,
inflate it, and run the binary.
</FEEDBACK>
My Talend job throws the error below:
Exception in component tFileInputXML_1
org.dom4j.DocumentException: Error on line 15 of document : The entity name must immediately follow the '&' in the entity reference. Nested exception: The entity name must immediately follow the '&' in the entity reference.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.io.SAXReader.read(SAXReader.java:365)
at test_loading.test_3_0_load_data_0_1.TEST_3_0_LOAD_DATA.tMysqlInput_1Process(TEST_3_0_LOAD_DATA.java:2635)
at test_loading.test_3_0_load_data_0_1.TEST_3_0_LOAD_DATA$1.run(TEST_3_0_LOAD_DATA.java:17001)
Please help me out.
Thanks,
Chin
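
One possible way to handle the replacement asked about above is to escape stray `&` characters in the file before tFileInputXML reads it. The sketch below is in Python and is only an illustration under assumptions: the function name `escape_bare_ampersands` is hypothetical, the sample string is a shortened version of the FEEDBACK text with a bare `&` added for demonstration, and only the five predefined XML entities plus numeric references are treated as already valid.

```python
import re
import xml.etree.ElementTree as ET

# An '&' is left alone only when it starts one of the five predefined
# XML entities or a numeric character reference; anything else is escaped.
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|apos|quot|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(xml_text):
    """Replace every stray '&' with '&amp;' so the text can be parsed."""
    return BARE_AMP.sub("&amp;", xml_text)


# Illustrative input: the '&' before 'A3' is an assumption, not taken
# from the posted sample.
raw = "<EMPS><EMP><FEEDBACK>200 is A2&apos;, A1 > A2 & A3 - A7</FEEDBACK></EMP></EMPS>"

fixed = escape_bare_ampersands(raw)
root = ET.fromstring(fixed)  # parsing 'raw' directly would fail here
print(root.findtext("EMP/FEEDBACK"))  # 200 is A2', A1 > A2 & A3 - A7
```

Note that a plain `>` in element text is already legal XML, so only the ampersands need attention. This text-level approach would also rewrite ampersands inside CDATA sections, so it is best suited to files that do not use them.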