One Star

parse flux xml

I have a file that I transform into an XML file with tFileInputFullRow + tJavaRow.
That part works.
Now I would like to extract an attribute of an element.
I want to use tParseXMLRow, but I get an exception in component tParseXMLRow_1:
org.dom4j.DocumentException: Error on line 1 of document : XML document structures must start and end within the same entity. Nested exception: XML document structures must start and end within the same entity.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.io.SAXReader.read(SAXReader.java:365)
at talenddemosjava.xmltovtg_0_1.xmltovtg.tMysqlInput_1Process(xmltovtg.java:575)
What should I do?
Thanks a lot for your help!
The XML structure is:
<A>
<B>
</B>
<C>
<data ct="hhh" ></data>
<data ct="hhh" ></data>
</C>
</A>
15 REPLIES
One Star

Re: parse flux xml

Hi,
If you take a look at the error message you can find the solution: your XML must start and end with the same tag. So you should add an opening tag at the start, "<data>" for example, and the matching closing tag at the end, "</data>" in this case.
Additionally, you should add a header on the first line like the following:
<?xml version="1.0" encoding="iso-8859-1"?>
You can find more information on Wikipedia, for example.
Bye
Volker
One Star

Re: parse flux xml

Hi,
I verified the structure of my XML message and it is fine,
but I still get an exception in component tParseXMLRow_1:
org.dom4j.DocumentException: Error on line 1 of document : XML document structures must start and end within the same entity. Nested exception: XML document structures must start and end within the same entity.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.io.SAXReader.read(SAXReader.java:365)
at talenddemosjava.v1toxml_0_1.v1toxml.tFileInputFullRow_1Process(v1toxml.java:1356)
I think the XML flow is not in one row but spread over several rows.
When tParseXMLRow_1 parses the first row, it sees the start tag but not the end tag, because the end tag is on another row.
Should I concatenate all the rows into one row?
Or am I wrong to think that?
Thanks for your help!
Seventeen Stars

Re: parse flux xml

hi,
XML document structures must start and end within the same entity

This looks like an error in the XML structure... Is your root element closed at the end?
<myroot>
<other>....</other>
</myroot>

You can check your file by opening it in Firefox to locate the error!
++
One Star

Re: parse flux xml

Hi phil,
You must have the whole XML document in one row. Depending on your data flow, you could use tDenormalize, for example.
Bye
Volker
One Star

Re: parse flux xml

hi,
When I read my XML structure with tFileInputXML and log the output, I can extract the attribute of the element.
That works.
But when I parse the stream with component tParseXMLRow_1,
I still get the exception:
org.dom4j.DocumentException: Error on line 1 of document : XML document structures must start and end within the same entity. Nested exception: XML document structures must start and end within the same entity.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.io.SAXReader.read(SAXReader.java:365)
at talenddemosjava.v1toxml_0_1.v1toxml.tFileInputFullRow_1Process(v1toxml.java:1356)
Why can't I use component tParseXMLRow_1?
+
One Star

Re: parse flux xml

I think you will find the solution in my answer. tParseXmlRow is row-based: it parses each row as a complete XML document. If your document is spread over multiple rows, this won't work.
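To illustrate why a row-based parser fails here, the following is a minimal sketch using the JDK's built-in XML parser rather than the dom4j reader from the stack trace. The class and method names are made up for the illustration, and the rows are the lines of the XML structure from the first post. Parsing a single row such as "<A>" fails because the closing tag is on another row, while the rows joined into one string parse fine.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Talend.
class RowByRowParseDemo {

    // Returns true when the text parses as one complete, well-formed XML document.
    static boolean isWellFormed(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException e) {
            // Raised for incomplete or malformed XML, e.g. a row holding only "<A>".
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // One array entry per row, as a line-based input component would deliver them.
        String[] rows = {
            "<A>", "<B>", "</B>", "<C>",
            "<data ct=\"hhh\" ></data>", "<data ct=\"hhh\" ></data>",
            "</C>", "</A>"
        };
        System.out.println("first row alone: " + isWellFormed(rows[0]));
        System.out.println("all rows joined: " + isWellFormed(String.join("\n", rows)));
    }
}
```

The same reasoning applies to tParseXMLRow: each row it receives has to contain the whole document.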
One Star

Re: parse flux xml

hi,
How do I parse a stream that contains XML?
I only have tFileInputXML, and it is for files.
What do I use for a stream?
++
One Star

Re: parse flux xml

You could use tFileInputXml for an xml file and tParseXmlRow for XML data in an attribute of your flow. What is your input?
One Star

Re: parse flux xml

My input is initially a file.
I transform this file into an XML file with tFileInputFullRow + tJavaRow.
After that I use tParseXMLRow to extract XML data from an attribute of the flow.
It doesn't work because the XML message is spread over multiple rows.
How can I extract the XML data from an attribute?
++
One Star

Re: parse flux xml

You could use tDenormalize to concatenate multiple rows together. Use "\n" as the delimiter. You need a key that is unique to each XML document and shared by all of its rows. If you do not have one (or the file contains only one XML document), you could set a fixed value in an additional attribute inside your tJavaRow.
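As a rough picture of what this concatenation does, here is a minimal plain-Java sketch. It is not Talend's implementation of tDenormalize; the class name, the method name, and the fixed key "doc1" are assumptions made for the illustration. Each row is a pair of key and one line of XML, and all lines sharing a key are joined with the delimiter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch of grouping rows by key and joining them, not Talend code.
class DenormalizeSketch {

    // Each row is {key, line}. Lines with the same key are joined in input order.
    static Map<String, String> denormalize(List<String[]> rows, String delimiter) {
        Map<String, StringJoiner> grouped = new LinkedHashMap<>();
        for (String[] row : rows) {
            grouped.computeIfAbsent(row[0], k -> new StringJoiner(delimiter)).add(row[1]);
        }
        Map<String, String> result = new LinkedHashMap<>();
        grouped.forEach((key, joined) -> result.put(key, joined.toString()));
        return result;
    }

    public static void main(String[] args) {
        String[] lines = {"<A>", "<B>", "</B>", "</A>"};
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            // Fixed key, assumed here because the file holds a single XML document.
            rows.add(new String[] {"doc1", line});
        }
        // One entry per key, holding the whole document in a single string.
        System.out.println(denormalize(rows, "\n").get("doc1"));
    }
}
```

With a single fixed key, the result is one row that holds the complete document, which is what tParseXMLRow needs.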
One Star

Re: parse flux xml

I don't understand how to use tDenormalize.
Can you give me an example, please?
++
One Star

Re: parse flux xml

The file contains only one XML document, and the XML message is spread over multiple rows.
I have only one column, which holds the lines of the XML message.
How do I use tDenormalize?
++
One Star

Re: parse flux xml

I made an example. Take a look at the pictures. There are two jobs. The difference is in the input row delimiter. The upper one uses "---" (which never occurs in the XML) and loads the whole file into one row. The second one uses "\n" as the delimiter and needs to merge the rows together with a tDenormalize.
I modified the input:
<?xml version="1.0" encoding="iso-8859-1"?>
<document>
<A>
<B>
</B>
<C>
<data ct="hhh" >First</data>
<data ct="iii" >Second</data>
</C>
</A>
</document>

And this is the output (the same for both jobs):
ct;value
hhh;First
iii;Second
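
For readers who want to check the extraction outside of Talend, here is a minimal plain-Java sketch that reads the modified input above from a single string and produces the same ct;value pairs. It uses the JDK's DOM parser, not tParseXMLRow or dom4j, and the class and method names are made up for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the code that Talend generates.
class ExtractCtSketch {

    // The whole document in one string, i.e. what a single row must contain.
    static final String SAMPLE = String.join("\n",
        "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>",
        "<document>",
        "<A>",
        "<B>",
        "</B>",
        "<C>",
        "<data ct=\"hhh\" >First</data>",
        "<data ct=\"iii\" >Second</data>",
        "</C>",
        "</A>",
        "</document>");

    // Returns one "ct;value" string per <data> element, in document order.
    static List<String> extract(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("data");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element data = (Element) nodes.item(i);
                result.add(data.getAttribute("ct") + ";" + data.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ct;value");
        for (String line : extract(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```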

Bye
Volker
One Star

Re: parse flux xml

Thanks a lot for your help.
++
One Star

Re: parse flux xml

You are welcome!