tFileInputXML and XML parsing settings

Hello,
I am having problems with tFileInputXML and its XML parsing options.
I need to extract a few specific fields from several large PDML files (XML exports of captured TCP packets), write them to a CSV file, and process them afterwards.
The extraction works on a small sample file (19 KB) with the DOM parsing option (see DOM_execution.jpg), but I get a "Java heap space" error on a bigger file (several MB).
I tried switching to the SAX parsing option, but I cannot get the job to work even on the small sample file: the output is empty (see SAX_execution.jpg).
Is this a known bug in the tFileInputXML component with the SAX parsing option and the corresponding code generation, or am I missing something in my project configuration?
(I'm using TOS 3.1.2.)
Thanks for your help,
Damien
2 REPLIES

Re: tFileInputXML and XML parsing settings

For information, I tried upgrading to TOS 4.1.1, but I got exactly the same behavior.
The information needed to reproduce the problem is below:
- Loop XPath query : "/pdml/packet"
- mapping information :
frame : "proto/field/@value"
time : "proto/field/@show"
id : "proto/field/@value"
The input XML sample file follows (it is not a valid PDML file, but it is sufficient to reproduce the parsing problem I encounter with SAX):
<?xml version="1.0"?>
<pdml version="0" creator="wireshark/1.2.6">
<packet>
<proto name="geninfo" pos="0" showname="General information" size="1308">
<field name="num" pos="0" show="3" showname="Number" value="3" size="1308"/>
</proto>
<proto name="frame" showname="Frame 3 (1308 bytes on wire, 1308 bytes captured)" size="1308" pos="0">
<field name="frame.time_relative" showname="Time since reference or first frame: 0.007766000 seconds" size="0" pos="0" show="0.007766000"/>
</proto>
<proto name="fake-field-wrapper">
<field name="data.data" showname="xxx" size="1254" pos="54" show="xxx" value="xxx"/>
</proto>
</packet>
</pdml>

Re: tFileInputXML and XML parsing settings

It seems I'm running into the same problem as in this topic:
http://www.talendforge.org/forum/viewtopic.php?id=9440
I think a warning should be raised to say that the XPath expression used will not be taken into account with SAX. It was not obvious to me.
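For anyone hitting the same memory limit, here is a minimal sketch of the streaming idea outside Talend, using Python's standard `xml.sax` module. It is not the code Talend generates. A SAX handler only sees one element at a time, so instead of evaluating XPath it tracks the current `<packet>` itself and picks fields by their `name` attribute. The column-to-field mapping (`WANTED`) and the helper name `pdml_to_csv` are assumptions made for illustration, guessed from the column names in the mapping posted above.

```python
import csv
import io
import xml.sax

# Hypothetical mapping: CSV column -> (value of the field's `name` attribute,
# attribute to read). Guessed from the sample file, not taken from Talend.
WANTED = {
    "frame": ("num", "value"),
    "time": ("frame.time_relative", "show"),
    "id": ("data.data", "value"),
}


class PacketHandler(xml.sax.ContentHandler):
    """Writes one CSV row per <packet> without keeping the document in memory."""

    def __init__(self, writer):
        super().__init__()
        self._writer = writer
        self._row = None  # dict for the packet being read, None outside a packet

    def startElement(self, name, attrs):
        if name == "packet":
            self._row = {}
        elif name == "field" and self._row is not None:
            for column, (field_name, attribute) in WANTED.items():
                if attrs.get("name") == field_name:
                    self._row[column] = attrs.get(attribute)

    def endElement(self, name):
        if name == "packet":
            # Columns that were not found are left empty by DictWriter.
            self._writer.writerow(self._row)
            self._row = None


def pdml_to_csv(source, out):
    """Stream `source` (a file name or binary stream) into CSV text on `out`."""
    writer = csv.DictWriter(out, fieldnames=list(WANTED), lineterminator="\n")
    writer.writeheader()
    xml.sax.parse(source, PacketHandler(writer))
```

Called as `pdml_to_csv("capture.pdml", open("out.csv", "w", newline=""))` (both file names are placeholders), memory use stays roughly constant regardless of file size, because each row is written as soon as its closing `</packet>` tag is seen.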