One Star

Using a remote file as input

Dear all,
First of all, I'm sorry in advance, as this might be a very basic question... But I am just starting to dig into Talend and still feel quite lost :-)
What I would like to do is (I guess) pretty simple: download an XML or CSV file through HTTP, map it against an existing PostgreSQL table, and populate that table with the records of the input file.
My problem is that the tFileInputDelimited component seems to accept only local files and doesn't recognize my http string:
http://earthquake.usgs.gov/earthquakes/catalogs/eqs1day-M1.txt
Is there a particular syntax? Or did I forget to do something? Is there a component that will download the file to feed the tFileInputDelimited component?
Thanks a lot in advance for your help!
Stéphane
11 REPLIES
Community Manager

Re: Using a remote file as input

Hello
You can use tFileFetch to download the file to local disk first, and then use tFileInputDelimited to read it.
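For readers who want to see what this two-step pattern amounts to outside the Studio, here is a minimal plain-Java sketch: copy the remote resource to a local file, then parse the local copy. It is not the code Talend generates. The names `FetchThenRead`, `fetch` and `readRows` are made up for illustration, and the delimiter is whatever the feed happens to use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: mimics "tFileFetch, then tFileInputDelimited" in plain Java.
class FetchThenRead {

    // Step 1: download the resource at 'uri' to 'target', overwriting any old copy.
    static Path fetch(String uri, Path target) throws IOException {
        try (InputStream in = URI.create(uri).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    // Step 2: read the local copy and split each non-empty line on the delimiter.
    static List<String[]> readRows(Path file, String delimiter) throws IOException {
        List<String[]> rows = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (!line.isEmpty()) {
                // Pattern.quote treats the delimiter literally; -1 keeps trailing empty fields.
                rows.add(line.split(Pattern.quote(delimiter), -1));
            }
        }
        return rows;
    }
}
```

With the URL from the first post, a call would look like `FetchThenRead.fetch("http://earthquake.usgs.gov/earthquakes/catalogs/eqs1day-M1.txt", Paths.get("eqs1day-M1.txt"))`, followed by `readRows` on the returned path.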
Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Using a remote file as input

Amazing, thanks for that! I will try it and, for sure, come back with loads of other questions ;-)
Best,
Stéphane
One Star

Re: Using a remote file as input

Well... I have one already!
How do I link the tFileFetch component to tFileInputDelimited?
Should I hard-code the output path and file name in the parameters of tFileFetch and then use the same value as input for tFileInputDelimited? Or is there a way to link the two components directly (so that, e.g., if I change the output path of tFileFetch, the input path of tFileInputDelimited is automatically updated)?
And, in any case, will the output file of tFileFetch replace the previous output file every time I run the job? Or is there a specific setting to get this behavior?
Thanks again!
Stéphane
Community Manager

Re: Using a remote file as input

Hello
The job looks like:
tFileFetch
|
onsubjobok
|
tFileInputDelimited--row1-->tLogRow
"... so that, e.g., if I change the output path of tFileFetch, the input path of tFileInputDelimited is automatically updated."

Define a context variable and use it for both the output path of tFileFetch and the input path of tFileInputDelimited, so that the two components always use the same path.
"And, in any case, will the output file of tFileFetch replace the previous output file every time I run the job?"

If you set a fixed file name, it will replace the previous output file every time, so you need to set a dynamic file name, e.g.:
"test"+TalendDate.formatDate("yyyy/MM/dd_HH:mm",TalendDate.getCurrentDate())+".txt"
When you run the job, it generates a file named:
test2011/01/20_13:27.txt
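One caveat about that pattern: it contains `/` and `:`. A `/` is read as a directory separator, and `:` is not allowed in Windows file names, so the generated name may not behave like a single file name. The sketch below uses plain `java.time` instead of `TalendDate`, so that it runs on its own. It shows what the expression produces as written and one possible filesystem-friendly alternative. The class `TimestampedName` and the pattern `yyyyMMdd_HHmm` are suggestions, not something from the original reply.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative helper: prefix + formatted timestamp + extension.
class TimestampedName {

    static String build(String prefix, String pattern, LocalDateTime when, String extension) {
        return prefix + when.format(DateTimeFormatter.ofPattern(pattern)) + extension;
    }

    public static void main(String[] args) {
        LocalDateTime when = LocalDateTime.of(2011, 1, 20, 13, 27);
        // Pattern from the post: the slashes and the colon end up in the name.
        System.out.println(build("test", "yyyy/MM/dd_HH:mm", when, ".txt"));
        // prints test2011/01/20_13:27.txt
        // Assumed alternative: digits and underscores only.
        System.out.println(build("test_", "yyyyMMdd_HHmm", when, ".txt"));
        // prints test_20110120_1327.txt
    }
}
```

In a Talend expression, the equivalent change would be to pass the safer pattern string to `TalendDate.formatDate`.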
Best regards
Shong
One Star

Re: Using a remote file as input

That was a quick answer!
Will try so then, thanks a lot!
Community Manager

Re: Using a remote file as input

Hi
"What do you mean by 'tFileInputDelimited--row1-->tLogRow'? If I add a tLogRow component, I see all the records from the fetched file, which is correct, ..."

It was only an example: it extracts the records from the file and prints them to the console after the file has been fetched successfully.
"... but still this only works if I enter a fixed file path as input for tFileInputDelimited."

You can set a dynamic path with a context variable (see my screenshot),
so that you can set the value each time you execute the job, see
http://www.talendforge.org/forum/viewtopic.php?id=1615
http://www.talendforge.org/forum/viewtopic.php?id=5681
or load the value from a file or database with the tContextLoad component at runtime (see the demo job at http://www.talendforge.org/forum/viewtopic.php?id=9130).
"As far as I understand, the OnSubJobOk will only check that the fetch was successful"

Yes, you are right. The next subjob starts only if the previous subjob (tFileFetch here) finishes successfully; otherwise it does not run.
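The control flow described here can be pictured with a small plain-Java sketch: steps run in order, and a step that fails prevents the following ones from starting. This is only an analogy for the OnSubjobOk link, not Talend's generated code, and `OnSubjobOkSketch` and `runChain` are hypothetical names.

```java
// Illustrative analogy for chaining subjobs with "OnSubjobOk".
class OnSubjobOkSketch {

    // Runs the steps in order and returns how many completed.
    // A step that throws stops the chain, so later steps never start.
    static int runChain(Runnable... steps) {
        int completed = 0;
        for (Runnable step : steps) {
            try {
                step.run();
            } catch (RuntimeException e) {
                break; // the "ok" link is not followed after a failure
            }
            completed++;
        }
        return completed;
    }
}
```

If the first step stands for the fetch and it throws, `runChain` returns 0 and the read step is never invoked.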
Best regards
Shong
One Star

Re: Using a remote file as input

It works perfectly!
Now, just to improve things a little, I would like to use what you wrote in your earlier post in order to keep track of all fetched files:
"test"+TalendDate.formatDate("yyyy/MM/dd_HH:mm",TalendDate.getCurrentDate())+".txt"
This works if I hard-code the path and file name, but not if I enter it as the default value of the context variable. Any advice on this?
Thanks!
Community Manager

Re: Using a remote file as input

Hello
Add a tJava at the beginning of the job, for example:
tJava
|
onsubjobok
|
tFileFetch
|
onsubjobok
|
tFileInputDelimited--main-->tLogRow
and type the following code into tJava:
context.filename="test"+TalendDate.formatDate("yyyy/MM/dd_HH:mm",TalendDate.getCurrentDate())+".txt";
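A likely reason this works where the default value did not is that a context default is stored as a literal string rather than evaluated as a Java expression, whereas tJava runs real code at job start. That explanation is an assumption, not something confirmed in this thread. The stand-alone sketch below imitates the idea: the name is computed once, stored in a shared context object, and every later step builds its path from that same object. The `Context` class, the `outputDir` field and the safer date pattern are hypothetical stand-ins.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative stand-in for a job whose steps share one context object.
class ContextFilenameSketch {

    // Simplified, hypothetical replacement for Talend's generated context.
    static class Context {
        String outputDir = "/tmp"; // assumed directory, for illustration only
        String filename;           // no usable default; filled in at runtime
    }

    // What the tJava step does: compute the name once, when the job starts.
    static void initFilename(Context context, LocalDateTime now) {
        context.filename = "test_" + now.format(DateTimeFormatter.ofPattern("yyyyMMdd_HHmm")) + ".txt";
    }

    // Both the fetch step and the read step would build their path this way,
    // so they cannot drift apart.
    static String path(Context context) {
        return context.outputDir + "/" + context.filename;
    }
}
```

Because the timestamp is taken once, the fetch and the read agree on the file name even if the job runs across a minute boundary.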
Best regards
Shong
One Star

Re: Using a remote file as input

And this works perfectly too!
Thanks and have a great week-end!
One Star

Re: Using a remote file as input

Hi Shong,
I followed this example as you explained it, but I am getting the exception below:
Exception in component tFileFetch_1
java.lang.Exception: Method failed: HTTP/1.1 405 Method Not Allowed
at talenddemosjava.filefetchfromhttp_0_1.FileFetchFromHTTP.tFileFetch_1Process(FileFetchFromHTTP.java:427)
at talenddemosjava.filefetchfromhttp_0_1.FileFetchFromHTTP.runJobInTOS(FileFetchFromHTTP.java:1263)
at talenddemosjava.filefetchfromhttp_0_1.FileFetchFromHTTP.main(FileFetchFromHTTP.java:1125)
The settings are exactly the same as given here; the only difference is that my URL is https, so I selected https as the protocol (I tried with http too!).
Screenshots are attached.
Appreciate your help!!
Community Manager

Re: Using a remote file as input

Hi
Method failed: HTTP/1.1 405 Method Not Allowed

This status means that the method specified in the Request-Line is not allowed for the resource identified by the Request-URI. The response MUST include an Allow header containing a list of valid methods for the requested resource. Try unchecking the 'post method' box.
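To make the 405 behaviour concrete, here is a self-contained sketch using only the JDK: a throwaway local server whose single resource accepts GET only, and a client that tries both methods. It assumes the failing server in this thread behaves similarly (rejecting POST for a static file), which the thread does not confirm. `MethodNotAllowedDemo` and the `/file.txt` path are made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Illustrative only: shows a resource answering 405 to POST and 200 to GET.
class MethodNotAllowedDemo {

    // Starts a local server on a free port; its only resource accepts GET.
    static HttpServer startGetOnlyServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/file.txt", exchange -> {
            if (!"GET".equals(exchange.getRequestMethod())) {
                // A 405 response lists the permitted methods in the Allow header.
                exchange.getResponseHeaders().set("Allow", "GET");
                exchange.sendResponseHeaders(405, -1); // -1 means no response body
            } else {
                byte[] body = "a,b\n".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    // Sends a request without a body using the given method; returns the status code.
    static int status(String url, String method) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod(method);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

Against this server, `status(url, "POST")` returns 405 and `status(url, "GET")` returns 200, which matches the advice to stop sending the request as a POST.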
Best regards
Shong