Six Stars

Scrape data from an ASPX website

Hi,
I need to scrape data from an ASPX website. Is this possible with Talend? If so, which components would be useful for this?
Thanks in advance,
3 REPLIES
Fifteen Stars

Re: Scrape data from an ASPX website

You can do this, but there isn't a "one size fits all" solution with Talend. I have written a tutorial on how I achieved this with a Formula 1 site. The tutorial is here: https://www.rilhia.com/tutorials/using-third-party-java-library-scrape-content-table-web-page
I included the job so you can take that and have a play. But remember it was written specifically for the site I was working with.
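For an ASPX site in particular, here is a minimal sketch using only the JDK (no third-party library, Java 10 or later). ASP.NET Web Forms pages normally carry hidden inputs such as __VIEWSTATE and __EVENTVALIDATION, and these usually have to be sent back with any form post. The class name AspxFormHelper and the assumption that each hidden input lists type, then name, then value are mine for illustration, so check the real page source before relying on it.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Talend or of the tutorial job.
public class AspxFormHelper {

    // Assumes attributes appear in the order type, name, value,
    // e.g. <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="..." />
    private static final Pattern HIDDEN_INPUT = Pattern.compile(
        "<input[^>]*type=\"hidden\"[^>]*name=\"([^\"]+)\"[^>]*value=\"([^\"]*)\"",
        Pattern.CASE_INSENSITIVE);

    // Collects every hidden input on the page as name -> value, in page order.
    public static Map<String, String> extractHiddenFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = HIDDEN_INPUT.matcher(html);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    // Encodes the fields as an application/x-www-form-urlencoded request body.
    public static String toFormBody(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }
}
```

The idea is to fetch the page once, call extractHiddenFields on the HTML, add your own search field to the map, and post the result of toFormBody back to the same URL. A regular expression is fragile for HTML, so a proper parser is the better choice once you are comfortable adding a library.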
Rilhia Solutions
Six Stars

Re: Scrape data from an ASPX website

Thanks r_hall,
My requirement is this: I have a CSV file that holds addresses, and I want Talend to look up those addresses on the site and either copy the relevant data into a new CSV file or load it into a database.
CSV -> Talend -> fetch the website -> look up each address from the CSV on the website -> return the relevant data and save it
Do you think this is possible with existing components, or do I have to write a convoluted Java routine for it (I am not fluent in Java)?
Thanks,
Fifteen Stars

Re: Scrape data from an ASPX website

I doubt there is a component that will handle this, but I am not aware of every component available in Talend Exchange, so it may be worth checking there. However, writing a Java routine that makes use of a third-party API really isn't that hard. If you are new to Talend it may be a bit more of a challenge, but any gain in Java knowledge can only be a benefit when using Talend. It opens so many extra doors.
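To give an idea of the size of such a routine, here is a rough sketch of the address lookup described above, using only the JDK (it assumes Java 11 or later for java.net.http). The class name AddressLookup, the method names, and the url and fieldName parameters are all hypothetical. A real ASPX form will usually need more fields in the request body than the single one shown, and the returned HTML still has to be parsed for the values you want.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical routine: Talend routines are plain classes with public static methods.
public class AddressLookup {

    // Posts one address to the search page and returns the raw HTML response.
    // url and fieldName are placeholders for the real site's values.
    public static String fetchPage(String url, String fieldName, String address)
            throws IOException, InterruptedException {
        String body = URLEncoder.encode(fieldName, StandardCharsets.UTF_8) + "="
            + URLEncoder.encode(address, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
        return HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString())
            .body();
    }

    // Applies the lookup to every non-blank address and returns
    // quoted CSV lines of the form "address","result".
    public static List<String> lookupAll(List<String> addresses,
                                         Function<String, String> lookup) {
        List<String> rows = new ArrayList<>();
        for (String address : addresses) {
            String trimmed = address.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines in the input file
            }
            rows.add(quote(trimmed) + "," + quote(lookup.apply(trimmed)));
        }
        return rows;
    }

    // Wraps a value in double quotes, doubling any quotes inside it.
    private static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }
}
```

In a job you would more likely read the file with tFileInputDelimited, call a method like fetchPage once per row from a tJavaRow or tMap, and write the result with tFileOutputDelimited or a database output component, so the loop in lookupAll is only there to show the flow outside Talend.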
Rilhia Solutions