One Star

How do we retrieve data from an HTML page?

Hello Team,
We have an HTML page that is basically a form with a submit button. I would like to know which Talend component we can use to capture the data entered in the form when the "Submit" button is clicked.
Could you please explain in detail how to configure it?
Awaiting your response.
Thank you
Regards,
Pratik
2 REPLIES

Re: How do we retrieve data from an HTML page?

Hi Pratik
There are two topics related to parsing HTML:
http://www.talendforge.org/forum/viewtopic.php?id=17700
http://www.talendforge.org/forum/viewtopic.php?id=17771
Here is my workaround.
You can download the custom component tTikaExtractor from Talend Exchange.
Then create a job as follows.
Alternatively, use tExtractRegexFields to extract the data.
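To illustrate the regex idea, here is a rough sketch in plain Java (Talend jobs are generated as Java code). It is not the component's actual implementation; as far as I know, tExtractRegexFields applies a regular expression to an input column and maps the capture groups to output columns, and the sketch only mimics that in spirit. The class name FormFieldRegexSketch, the method extractFields, and the sample HTML are all made up for illustration. Note that a saved HTML file only holds the default values of the fields; what a user types in the browser is sent to the server when the form is submitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of Talend.
class FormFieldRegexSketch {

    // Matches one <input ...> tag; group 1 holds the raw attribute text.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // Match name="..." and value="..." inside the attribute text.
    // Assumption: attribute values are double-quoted.
    private static final Pattern NAME_ATTR =
            Pattern.compile("\\bname\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern VALUE_ATTR =
            Pattern.compile("\\bvalue\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns field name -> default value, in document order.
    static Map<String, String> extractFields(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher tag = INPUT_TAG.matcher(html);
        while (tag.find()) {
            String attrs = tag.group(1);
            Matcher name = NAME_ATTR.matcher(attrs);
            if (!name.find()) {
                continue; // inputs without a name (e.g. a bare submit button) are skipped
            }
            Matcher value = VALUE_ATTR.matcher(attrs);
            fields.put(name.group(1), value.find() ? value.group(1) : "");
        }
        return fields;
    }

    public static void main(String[] args) {
        String html = "<form>"
                + "<input type=\"text\" name=\"city\" value=\"Paris\">"
                + "<input type=\"submit\" value=\"Submit\">"
                + "</form>";
        // Prints: city=Paris
        extractFields(html).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Regexes like these are fragile on real-world HTML (single-quoted attributes, attributes such as data-name, tags split across lines), so for anything beyond a simple, well-formed form a proper HTML parser is the safer choice.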
Regards,
Pedro

Re: How do we retrieve data from an HTML page?

Hi Pedro,
I used your model with tTikaExtractor, and I really thank you for it. However, I have a problem.
In the output file, the useful lines are not consecutive; they stay in the same positions they had in the HTML file.
Do you know why?
Thanks in advance.