tFileUnarchive error with tar.gz, worked fine before

One Star

I am encountering a strange problem with one job in our Talend project.
It involves the tFileUnarchive component, which has only a single .tar.gz file to extract.
It worked fine for ages, then suddenly started failing last week and has failed ever since.
The error we get is:
2015-08-14 06:20:36 : Start of subjob Unzip_S
Exception in component tFileUnarchive_1 Unexpected end of ZLIB input stream

Nothing has changed on the server or in this specific job, and we cannot see why we get this error.
The job consists only of tFileList --> iterate --> tFileUnarchive.
We use contexts to set the folders, filenames, and file masks for the tFileList, and this expression for the unarchive: ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))
On the server side, permissions, users, and the Java version are the same as before:

java version "1.6.0_34"
OpenJDK Runtime Environment (IcedTea6 1.13.6) (rhel-
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)
Java 6

Talend is TOS for DI
The .tar.gz file is correct and complete, and it extracts fine with the tar command.
There is free disk space and RAM, and the file is not being accessed by another user or program.
The job also works fine in the local context, but that is a Windows system, whereas the server is Linux.
We can work around the problem by switching to a tSystem component and calling 'tar zxvf <filename>', so it's not a blocking situation.
Does anyone have an idea why this fails?
Thank you.
Community Manager

Re: tFileUnarchive error with tar.gz, worked fine before

It sounds like a Java issue which affected 1.6. An upgrade to the latest Java 1.7 should resolve this.
One Star

Re: tFileUnarchive error with tar.gz, worked fine before

I also came across this kind of Java issue.
We can't upgrade to 1.7 for now (and this was not an issue with 1.6 before).
Thanks anyway.
Community Manager

Re: tFileUnarchive error with tar.gz, worked fine before

It is an intermittent issue with v1.6. It is entirely possible that you have never seen it before and now you get it a lot. If you have not changed anything at all, Occam's Razor suggests that this is the problem. 
One Star

Re: tFileUnarchive error with tar.gz, worked fine before

Another idea: the .tar.gz file is first copied with a tSCPGet; perhaps the transfer is not complete when my unarchive subjob starts?
Over time the data has grown, and the transfer might take a little longer than before.
How can I check that the tSCPGet is complete? I do have a custom 'file moved' log entry, linked with "component ok", right after the SCP part.
The workaround with tSystem produces much the same error, whereas a manual untar on the command line works:
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
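
Both gzip and tar stopping at an unexpected end of file suggests that the archive on disk may be truncated. As a rough sketch (not something from the thread), the file could be tested before the unarchive step by reading the whole gzip stream to its end, for example from a tJava component. The class and method names below are hypothetical, and the code is written in a Java 6 compatible style:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipIntegrityCheck {

    /**
     * Returns true if the gzip stream can be decompressed all the way to its
     * trailer. A truncated stream makes GZIPInputStream throw an IOException
     * (typically an EOFException), which is reported here as false.
     */
    public static boolean isCompleteGzip(InputStream raw) {
        byte[] buffer = new byte[8192];
        try {
            GZIPInputStream in = new GZIPInputStream(raw);
            try {
                while (in.read(buffer) != -1) {
                    // Discard the data: only reaching the end matters.
                }
            } finally {
                in.close();
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Demo helper (hypothetical): gzip a byte array in memory. */
    public static byte[] gzip(byte[] data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            GZIPOutputStream out = new GZIPOutputStream(bytes);
            out.write(data);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] complete = gzip("some payload, repeated: some payload".getBytes());
        // Simulate an interrupted transfer by keeping only the first half.
        byte[] truncated = Arrays.copyOf(complete, complete.length / 2);

        System.out.println("complete:  " + isCompleteGzip(new ByteArrayInputStream(complete)));
        System.out.println("truncated: " + isCompleteGzip(new ByteArrayInputStream(truncated)));
    }
}
```

In a real job the stream would be a FileInputStream on the path from tFileList. This only validates the gzip layer, not the tar entries inside it.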
Community Manager

Re: tFileUnarchive error with tar.gz, worked fine before

Is the tSCPGet in its own subjob? If it is then it should have completely finished by the time the "OnSubJobOK" is activated (if indeed you are using that). If not, try moving it to its own subjob and connect it to the next subjob using an "OnSubJobOK".
A basic way of testing this is to separate the tSCPGet component and the tFileUnarchive component into separate jobs. Ensure that the whole file is actually downloaded and then kick off a job using the tFileUnarchive component. If this works it hints towards the file not being completely downloaded. If that is the issue, then you will need to get hold of the file size of your archive in its remote location and then do a test on the file size before you try to unarchive it.
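
The file size test described above could look roughly like the following sketch. It assumes the remote size in bytes has already been obtained by some other means (for example from a remote listing) and is passed in as a number. The class and method names are hypothetical and not part of Talend:

```java
import java.io.File;

public class TransferSizeCheck {

    /**
     * Returns true only if the local archive exists as a regular file and its
     * size matches the size reported for the remote file.
     * expectedBytes is assumed to come from the remote side.
     */
    public static boolean isFullyTransferred(File localArchive, long expectedBytes) {
        return localArchive.isFile() && localArchive.length() == expectedBytes;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: TransferSizeCheck <local-archive> <expected-bytes>");
            return;
        }
        File archive = new File(args[0]);
        long expected = Long.parseLong(args[1]);
        if (isFullyTransferred(archive, expected)) {
            System.out.println("Sizes match, safe to unarchive: " + archive);
        } else {
            System.out.println("Size mismatch, skipping: " + archive
                    + " (local " + archive.length() + ", expected " + expected + ")");
        }
    }
}
```

A matching size does not prove the content is intact, but a mismatch is a cheap and reliable sign that the download did not finish.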
One Star

Re: tFileUnarchive error with tar.gz, worked fine before

Thanks for your help.
In fact, I only got access to the server myself today.
I tried adding a delay between the subjobs, and got proof that the SCP went fine all along (almost).
--> It turns out the transfer went wrong once last week, producing a faulty .tar.gz file (extracting it gives the error I originally reported).
This was combined with a design error (the unarchive job processes all .tar.gz files, because the client originally wanted to handle multiple files per day) and the fact that the first crash left the folder uncleaned, so every run ended up trying to extract the same bad file.
As far as I knew, there was only one "good" file in the folder.
Anyway, after a few corrections and deleting one file, everything is OK now!
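
One possible way to keep a single bad archive from being retried on every run is to move it out of the way when extraction fails. This is only a sketch of that idea, not the correction actually made in the job. The class name, method name, and the ".rejected" suffix are all hypothetical:

```java
import java.io.File;

public class ArchiveQuarantine {

    /**
     * Moves a failed archive into rejectedDir and renames it so that a
     * "*.tar.gz" file mask no longer matches it on the next run.
     * Returns the new location, or null if the move did not succeed.
     * Note: File.renameTo may fail across different filesystems.
     */
    public static File quarantine(File archive, File rejectedDir) {
        if (!rejectedDir.isDirectory() && !rejectedDir.mkdirs()) {
            return null;
        }
        String newName = archive.getName() + "." + System.currentTimeMillis() + ".rejected";
        File target = new File(rejectedDir, newName);
        return archive.renameTo(target) ? target : null;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: ArchiveQuarantine <archive> <rejected-dir>");
            return;
        }
        File moved = quarantine(new File(args[0]), new File(args[1]));
        System.out.println(moved == null ? "Move failed" : "Moved to " + moved);
    }
}
```

In a Talend job this kind of step would sit on the error path of the unarchive subjob, so that the incoming folder only ever holds files that have not yet been tried.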