Problem with processing huge XML file with tFileInputDelimited

One Star

Hi, I have built a very simple job to remove the header (4 lines) and the last line of a very large XML document (more than 100 GB) encoded in ISO-8859-1.
The job is simple: I use tFileInputDelimited to read the document line by line and skip the 4-line header.
A tReplace then removes the last tag (<\IproClassDatabse>); I could not find any other solution for such a big file.
But when the job is done, the new file (without header and last line) has only about half as many lines as the original, when it should have just 5 fewer!
Using the tail command, I can see that the new XML document does not end the way the original does. The job seems to have stopped processing the document partway through.
I have tried this job with smaller XML documents and there is no error...
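
As a cross-check outside Talend, the same transformation (skip 4 header lines, drop the final line) can be sketched as a streaming copy in Python. This is only a minimal sketch and not what the Talend components do internally; the function name `strip_header_and_last_line` and its parameters are made up for illustration. It assumes lines end with `\n` and reads raw bytes, so the ISO-8859-1 content is copied unchanged and memory use stays constant regardless of file size.

```python
def strip_header_and_last_line(src_path, dst_path, header_lines=4):
    """Copy src to dst without the first `header_lines` lines and the last line.

    Hypothetical helper: works on raw bytes, so no decoding is involved and
    only b"\n" is treated as a line terminator.
    Returns the number of lines written.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Skip the header; stop early if the file is shorter than the header.
        for _ in range(header_lines):
            if not src.readline():
                return written
        # Hold each line back by one step, so the final line is never written.
        previous = None
        for line in src:
            if previous is not None:
                dst.write(previous)
                written += 1
            previous = line
    return written
```

If the output of such a script has the expected line count while the Talend output does not, that would suggest the problem lies in how the job reads the file rather than in the file's size alone.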

This is a really simple job, so I don't understand where the problem is. Even though the XML document is very large (120 GB), that shouldn't be a problem; it should just take some time to finish.
Has anyone met a similar problem, or does anyone have an idea where it comes from?
Screenshots of the job:
http://i.imgur.com/T8tOlkUh.png
http://i.imgur.com/FGc4vu4h.png
http://i.imgur.com/diYxizxh.png
Community Manager

Re: Problem with processing huge XML file with tFileInputDelimited

Hi
To read a file line by line, I would suggest using tFileInputFullRow.
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Problem with processing huge XML file with tFileInputDelimited

Hi, thanks for the answer. I have just tried it, but the problem is the same: it stops at the same place.
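
Since the output stops at the same place with both components, one way to narrow things down is to look at the raw bytes of the original file around that position. The following is a rough diagnostic sketch, not a Talend feature; `count_newlines` and `bytes_around_line` are hypothetical helper names, and the sketch assumes `\n` line endings.

```python
def count_newlines(path, chunk_size=1 << 20):
    """Count b"\n" bytes in a file, reading it in fixed-size chunks."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += chunk.count(b"\n")
    return total


def bytes_around_line(path, line_number, context=40):
    """Return the raw bytes surrounding the start of a 1-based line number."""
    with open(path, "rb") as f:
        # Advance to the start of the requested line (or to end of file).
        for _ in range(line_number - 1):
            if not f.readline():
                break
        start = f.tell()
        # Step back a little so the end of the previous line is visible too.
        f.seek(max(0, start - context))
        return f.read(2 * context)
```

For example, `count_newlines` on the output file plus the 4 skipped header lines gives an estimate of the last source line that was processed, and `bytes_around_line` on the original file at the following line shows whether anything unusual (such as an unexpected control byte) sits at that point.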
Community Manager

Re: Problem with processing huge XML file with tFileInputDelimited

Hi
The job is really simple, and I don't see anything wrong in the job settings. Which version are you using? Does the job end normally, without error?
Shong
One Star

Re: Problem with processing huge XML file with tFileInputDelimited

Hi,
I am using Talend Open Studio for ESB (5.3.0.r101800).
And the job ends normally, without error.
One Star

Re: Problem with processing huge XML file with tFileInputDelimited

Does anyone know whether the tBoostedFileInputXML component can handle this kind of file too?