[resolved] SAXParseException when uploading DSC Tasks


Hi,
When I try to upload DSC tasks, I get the following error:
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at java.net.HttpURLConnection.getResponseCode(Unknown Source)
at org.talend.datastewardship.server.task.creation.TaskLoadClient.doLoad(TaskLoadClient.java:100)
at helios.j004_dedup_create_person_tasks_dsc_0_1.j004_DEDUP_create_person_tasks_DSC.tFileInputDelimited_1Process(j004_DEDUP_create_person_tasks_DSC.java:3087)
at helios.j004_dedup_create_person_tasks_dsc_0_1.j004_DEDUP_create_person_tasks_DSC.runJobInTOS(j004_DEDUP_create_person_tasks_DSC.java:6303)
at helios.j004_dedup_create_person_tasks_dsc_0_1.j004_DEDUP_create_person_tasks_DSC.main(j004_DEDUP_create_person_tasks_DSC.java:6130)
An error occured while uploading tasks.

MDM server log:
2014-12-11 11:30:51,781 ERROR  java.lang.NullPointerException
2014-12-11 11:30:51,781 INFO   login() User 'anonymous' successfully logged in Universe ''
2014-12-11 11:30:52,047 ERROR Failed to validate XML file through 'D:\app\talend\Talend-MDMServer-r118616-V5.5.1\jboss-4.2.2.GA\server\default\deploy\org.talend.datastewardship.war\WEB-INF\classes\org\talend\datastewardship\server\task\creation\inputTask.xsd
org.dom4j.DocumentException: Error on line 1 of document  : Content is not allowed in prolog. Nested exception: Content is not allowed in prolog.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
at org.talend.datastewardship.server.util.xml.XmlUtil.validateXMLByXSD(XmlUtil.java:304)
at org.talend.datastewardship.server.task.creation.TaskValidationFilter.doFilter(TaskValidationFilter.java:38)
at org.talend.datastewardship.server.task.creation.TaskFilterChain.doFilter(TaskFilterChain.java:48)
at org.talend.datastewardship.server.task.creation.TaskCreator.buildTasks(TaskCreator.java:82)
at org.talend.datastewardship.server.task.creation.TaskLoadServlet.doGet(TaskLoadServlet.java:77)
at org.talend.datastewardship.server.task.creation.TaskLoadServlet.doPost(TaskLoadServlet.java:56)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.talend.datastewardship.server.security.SecurityFilter.doFilter(SecurityFilter.java:139)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:96)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:179)
at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157)
at org.apache.catalina.authenticator.SingleSignOn.invoke(SingleSignOn.java:393)
at org.apache.catalina.authenticator.MDMSingleSignOn.invoke(MDMSingleSignOn.java:66)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
at java.lang.Thread.run(Thread.java:745)
Nested exception:
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.dom4j.io.SAXReader.read(SAXReader.java:465)
at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
at org.talend.datastewardship.server.util.xml.XmlUtil.validateXMLByXSD(XmlUtil.java:304)
at org.talend.datastewardship.server.task.creation.TaskValidationFilter.doFilter(TaskValidationFilter.java:38)
at org.talend.datastewardship.server.task.creation.TaskFilterChain.doFilter(TaskFilterChain.java:48)
at org.talend.datastewardship.server.task.creation.TaskCreator.buildTasks(TaskCreator.java:82)
at org.talend.datastewardship.server.task.creation.TaskLoadServlet.doGet(TaskLoadServlet.java:77)
at org.talend.datastewardship.server.task.creation.TaskLoadServlet.doPost(TaskLoadServlet.java:56)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.talend.datastewardship.server.security.SecurityFilter.doFilter(SecurityFilter.java:139)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:96)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:179)
at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157)
at org.apache.catalina.authenticator.SingleSignOn.invoke(SingleSignOn.java:393)
at org.apache.catalina.authenticator.MDMSingleSignOn.invoke(MDMSingleSignOn.java:66)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
at java.lang.Thread.run(Thread.java:745)

When I set the commit size to 1 row, everything goes well; when I set it to 50, I get this error.
We recently switched from an H2 database to an Oracle 11g database. Could this be the cause of the error?
Kind regards
Dries
Talend Platform for MDM 5.5.1 r118616
Windows 7 Enterprise
Java jdk: 1.7.0_60

Re: [resolved] SAXParseException when uploading DSC Tasks

Hi,
The lines:
2014-12-11 11:30:52,047 ERROR  Failed to validate XML file through 'D:\app\talend\Talend-MDMServer-r118616-V5.5.1\jboss-4.2.2.GA\server\default\deploy\org.talend.datastewardship.war\WEB-INF\classes\org\talend\datastewardship\server\task\creation\inputTask.xsd
org.dom4j.DocumentException: Error on line 1 of document  : Content is not allowed in prolog. Nested exception: Content is not allowed in prolog.

... are rather strange. It looks like you are sending malformed DSC tasks to the DSC. Changing the database vendor should not be the cause of the error (the issue occurs before anything is stored to the database). Could you check the contents of the DSC tasks you're sending?
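As a general note, a parser reports "Content is not allowed in prolog" at line 1, column 1 when something other than markup (stray text, or a byte-order mark read as a character) comes before the first '<'. Whether that is what happens here is not confirmed. A minimal sketch for checking a captured payload, assuming you can get the request body as a string; the class name, method names and the `<tasks/>` sample are hypothetical and not part of the DSC API:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class PrologCheck {

    /**
     * Returns the code points (at most 16) found before the first '<',
     * or null if the payload starts directly with '<'.
     */
    static String leadingContent(String payload) {
        int lt = payload.indexOf('<');
        String head = lt < 0 ? payload : payload.substring(0, lt);
        if (head.isEmpty()) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < head.length() && i < 16; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) head.charAt(i)));
        }
        return sb.toString();
    }

    /** Returns null if the payload is well-formed XML, else the parser's complaint. */
    static String wellFormednessError(String payload) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(payload)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (Exception e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Hypothetical payload: a byte-order mark in front of otherwise valid XML.
        String payload = "\uFEFF<tasks/>";
        System.out.println("Before first '<': " + leadingContent(payload));
        System.out.println("Parser says: " + wellFormednessError(payload));
    }
}
```

If `leadingContent` returns something other than null for the real request body, that would point at the sender rather than at the database.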

Re: [resolved] SAXParseException when uploading DSC Tasks

Hi
Here's an example of a DSC task which should be uploaded:
.--------------------------------------.
|            #1. tLogRow_1             |
+------------------+-------------------+
| key              | value             |
+------------------+-------------------+
| Id               | 775231            |
| FIRST_NAME       | Erik              |
| LAST_NAME        | Verlaet           |
| FULL_NAME        | Erik Verlaet      |
| GENDER_CD        | Male              |
| LANGUAGE_CD      | nl                |
| WORK_STREET_NAME | Koekoekstraat     |
| WORK_HOUSE_NR    | 121               |
| WORK_STREET_CD   | 6601              |
| WORK_POSTAL_CD   | 2627              |
| WORK_CITY        | Schelle           |
| WORK_COUNTRY     | BE                |
| FAX              |                   |
| WORK_PHONE       |                   |
| EMAIL            | admin@bel-east.be |
| SOURCE_CD        | CP3               |
| matchingGroup    | Ud37              |
| score            | 0.0               |
| groupQuality     | 0.0               |
| staticTag        | DuplicatePerson   |
| weights          | 0                 |
| isTarget         | false             |
+------------------+-------------------+
.-------------------------------------------.
|               #2. tLogRow_1               |
+------------------+------------------------+
| key              | value                  |
+------------------+------------------------+
| Id               | 203131                 |
| FIRST_NAME       | Erik                   |
| LAST_NAME        | Verlaet                |
| FULL_NAME        | Erik Verlaet           |
| GENDER_CD        | Male                   |
| LANGUAGE_CD      | nl                     |
| WORK_STREET_NAME | Koekoekstraat          |
| WORK_HOUSE_NR    | 121A                   |
| WORK_STREET_CD   | 6601                   |
| WORK_POSTAL_CD   | 2627                   |
| WORK_CITY        | Schelle                |
| WORK_COUNTRY     | BE                     |
| FAX              |                        |
| WORK_PHONE       |                        |
| EMAIL            | erik.verlaet@skynet.be |
| SOURCE_CD        | CP3                    |
| matchingGroup    | Ud37                   |
| score            | 0.0                    |
| groupQuality     | 0.0                    |
| staticTag        | DuplicatePerson        |
| weights          | 0                      |
| isTarget         | false                  |
+------------------+------------------------+

Kind regards
Dries

Re: [resolved] SAXParseException when uploading DSC Tasks

It seems the tStewardshipTaskOutput component is sending invalid XML to the DSC. You could check whether the XML sent over HTTP is correct by sniffing the traffic. I also suggest contacting the Support team to help you with this issue.
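If a packet sniffer is not at hand, one alternative is a throwaway HTTP endpoint that records whatever is posted to it. The sketch below uses the JDK's built-in `com.sun.net.httpserver` package. It assumes the job's target host and port can be temporarily pointed at this endpoint, which is not confirmed here for tStewardshipTaskOutput. The class name and port 8099 are arbitrary choices. The endpoint answers 200 with an empty body, which is not what the DSC would return, so the job itself may still report a failure.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;

public class CaptureServer {

    /** Body of the most recent request, kept for inspection. */
    static volatile byte[] lastBody = new byte[0];

    /** Formats the first bytes of the data as hex, e.g. to spot a byte-order mark. */
    static String hexPrefix(byte[] data, int max) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length && i < max; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    /** Starts a server on the given port (0 picks a free one) that records request bodies. */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", new HttpHandler() {
            @Override
            public void handle(HttpExchange exchange) throws IOException {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                InputStream in = exchange.getRequestBody();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                lastBody = buf.toByteArray();
                System.out.println(exchange.getRequestMethod() + " " + exchange.getRequestURI()
                        + " (" + lastBody.length + " bytes)");
                System.out.println("First bytes: " + hexPrefix(lastBody, 16));
                System.out.println(new String(lastBody, "UTF-8"));
                exchange.sendResponseHeaders(200, -1); // 200 with no response body
                exchange.close();
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8099); // arbitrary port; runs until the process is stopped
    }
}
```

A body whose first bytes are `EF BB BF`, or anything else other than `3C` ('<'), would be consistent with the "Content is not allowed in prolog" message in the server log.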
