Six Stars

How to process a file stored on remote server, without copying it?

Hi,

Is it possible to process a file stored on a remote server without copying it?

I'm working on very large files and would prefer not to store the data locally.

Thanks

12 REPLIES
TRF
Twelve Stars

Re: How to process a file stored on remote server, without copying it?

If you can reach the file through a remote folder, it's possible; otherwise you need to transfer it (by FTP or some other method).

TRF
Twelve Stars

Re: How to process a file stored on remote server, without copying it?

Why do you not want to copy the file? If it is for performance reasons, trying to process a file remotely will not necessarily improve the situation. It would be far safer and more efficient to copy and process locally.

Rilhia Solutions
Moderator

Re: How to process a file stored on remote server, without copying it?

Hi,

Can you describe your use case in a bit more detail? What further processing do you need to do on the large file on the remote server? Is it an FTP server?

Best regards

Sabrina

Six Stars

Re: How to process a file stored on remote server, without copying it?


TRF wrote:
If you can reach the file through a remote folder, it's possible; otherwise you need to transfer it (by FTP or some other method).

Hi @TRF

Sorry, I've been away. I haven't tried adding it as a remote folder yet. I'll try and let you know.

Six Stars

Re: How to process a file stored on remote server, without copying it?


rhall_2_0 wrote:

Why do you not want to copy the file? If it is for performance reasons, trying to process a file remotely will not necessarily improve the situation. It would be far safer and more efficient to copy and process locally.


Hi @rhall_2_0.

Sorry for the delayed response. Yes, it's the performance that concerns me, as well as the local storage. I'd like to run jobs concurrently, which would fill up the local machine quite quickly even if I delete each file at the end of its run.

Six Stars

Re: How to process a file stored on remote server, without copying it?


xdshi wrote:

Hi,

Can you describe your use case in a bit more detail? What further processing do you need to do on the large file on the remote server? Is it an FTP server?

Best regards

Sabrina


Hi Sabrina

One case is where I'm comparing a file to its equivalent database table after a load. I just want to read the file's values, compare them, and then forget about the file.
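A compare-and-forget check like this can be done in a single streaming pass, so nothing has to be stored locally. The sketch below is only an illustration in plain Python, not a Talend component: `summarize_rows` and `summarize_csv_stream` are hypothetical helper names, the file is assumed to be CSV, and the database rows are assumed to arrive as strings in the same order as the file. The `stream` argument can be any file-like text object, for example one opened by an SFTP client library (not shown here).

```python
import csv
import hashlib
import io
from typing import Iterable, TextIO, Tuple


def summarize_rows(rows: Iterable[Iterable[str]]) -> Tuple[int, str]:
    """Return (row count, order-sensitive SHA-256 digest) for an iterable of rows."""
    digest = hashlib.sha256()
    count = 0
    for row in rows:
        # Assumption: the unit-separator character never occurs inside a field.
        digest.update("\x1f".join(row).encode("utf-8"))
        digest.update(b"\n")
        count += 1
    return count, digest.hexdigest()


def summarize_csv_stream(stream: TextIO) -> Tuple[int, str]:
    """Summarize a CSV stream one row at a time, keeping memory use bounded."""
    return summarize_rows(csv.reader(stream))


if __name__ == "__main__":
    # Stand-ins for the remote file and the rows fetched from the table.
    remote_file = io.StringIO("1,alice\n2,bob\n")
    table_rows = [("1", "alice"), ("2", "bob")]
    print(summarize_csv_stream(remote_file) == summarize_rows(table_rows))
```

Only the running count and digest are kept, so the file contents are discarded as soon as they are read. The digest is order-sensitive, so the table query would need an `ORDER BY` that matches the file's row order.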

Six Stars

Re: How to process a file stored on remote server, without copying it?


accuracie wrote:

TRF wrote:
If you can reach the file through a remote folder, it's possible; otherwise you need to transfer it (by FTP or some other method).

Hi @TRF

Sorry, I've been away. I haven't tried adding it as a remote folder yet. I'll try and let you know.


I just realized the server I'm accessing is an SFTP server, not FTP. To my knowledge, mapping it as a remote folder isn't possible for SFTP?

Please advise on how to map an SFTP server on Windows, or point me in the right direction.

Twelve Stars

Re: How to process a file stored on remote server, without copying it?

You will have more hard drive space than RAM. The RAM of the computer running the job will be used even if you leave the file where it is and do not download it. Processing the data on the local machine will be quicker and (as you said) the file can just be deleted after the job is done.

There is another way you could approach this, though. Would it be an option to send the job to the machine with the files and then start it on that machine using SSH? This would potentially meet your requirements and is certainly possible with Talend.

Rilhia Solutions
Six Stars

Re: How to process a file stored on remote server, without copying it?


rhall_2_0 wrote:

You will have more hard drive space than RAM. The RAM of the computer running the job will be used even if you leave the file where it is and do not download it. Processing the data on the local machine will be quicker and (as you said) the file can just be deleted after the job is done.

There is another way you could approach this, though. Would it be an option to send the job to the machine with the files and then start it on that machine using SSH? This would potentially meet your requirements and is certainly possible with Talend.


I've seen a few examples where people "send" the job. I'll explore this as well, but do you mind explaining exactly how that works? I haven't gotten a grip on the concept just yet.

Also, I really don't feel comfortable with copying the data, but it's about logic more than my feelings, so would you choose copying the file over sending the job to the server, if that is possible?

Thanks

Twelve Stars

Re: How to process a file stored on remote server, without copying it?

It's a trade-off (I guess) between real security implications (or legal implications) and ease of use. You can always simply leave a job you have built on that server and schedule it to run using a scheduling tool (I'm assuming you are using the Open Source tool and not the Enterprise Edition of Talend). There are other options, but to give you a solution we will need to know what servers you are using (is SSH available, for example?), what version of Talend you are using, network accessibility, etc.

Rilhia Solutions
Six Stars

Re: How to process a file stored on remote server, without copying it?

They're Linux servers. Yes, SSH is available, and I'm using Talend Open Studio 6.4.1.

Twelve Stars

Re: How to process a file stored on remote server, without copying it?

Well, Linux makes it a bit easier. You can easily make use of SSH manually or as part of a Talend job. In its simplest form, the job could be built in your Studio, moved to the machine with the files, and extracted from the zip; then you would just need to start the script file over SSH. Alternatively, if you want a job on your local machine to trigger the job on the remote machine (maybe even send it there and extract it), you can use the tFTP components and the tSSH component.
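For the manual route, the copy, extract and start steps could be scripted roughly as below. This is a sketch under assumptions, not an official Talend procedure: `build_remote_job_commands` is a hypothetical helper, the host, paths and job name are made up, and it assumes the exported zip unpacks to `<JobName>/<JobName>_run.sh` and that `scp`, `ssh` and `unzip` are available on the machines involved.

```python
import shlex
from typing import List


def build_remote_job_commands(
    archive: str, host: str, remote_dir: str, job_name: str
) -> List[List[str]]:
    """Build the commands that copy, unpack and start an exported job over SSH."""
    # Keep the archive's file name and place it under the remote directory.
    remote_archive = f"{remote_dir}/{archive.rsplit('/', 1)[-1]}"
    # Assumed layout of the extracted job: <remote_dir>/<job_name>/<job_name>_run.sh
    launcher = f"{remote_dir}/{job_name}/{job_name}_run.sh"
    return [
        ["scp", archive, f"{host}:{remote_archive}"],
        ["ssh", host,
         f"unzip -o {shlex.quote(remote_archive)} -d {shlex.quote(remote_dir)}"],
        ["ssh", host, f"sh {shlex.quote(launcher)}"],
    ]


if __name__ == "__main__":
    # Hypothetical host, paths and job name, for illustration only.
    for command in build_remote_job_commands(
        "/tmp/CompareJob_0.1.zip", "etl@files.example.com", "/opt/jobs", "CompareJob"
    ):
        print(" ".join(command))
```

Each command list can be passed to `subprocess.run(command, check=True)` to execute it. Because the job then runs on the machine that holds the files, the data never has to leave that server.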

Rilhia Solutions