Generally in Talend, to create a job document we right-click on the job and select Generate Doc as HTML. Is there any alternative process or component we can use to generate the job document?
Not out of the box, but the Talend jobs are stored as XML in the back end. So you *could* create something to parse those and get whatever info you want from your job designs. I do this a lot BUT I always make sure NOT to edit the files.
The configuration files for your project are stored as XML. If you open a couple and take a look (back them up first), you will see that the structure is quite logical. Take a copy and change the extension from .properties or .item to .xml, and you will be able to view them in a web browser quite easily. Once you have got that far, you should be able to see how you can build a job to parse these and extract useful information. I have used this method for getting component configs and spotting logic errors (like not setting every tRunJob to get its schema from the parent, etc). Good luck.
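If it helps, the copy-and-rename step can be scripted so the originals are never touched. This is only a rough sketch in Python rather than a Talend job. The function name `copy_as_xml` and both folder arguments are made up for illustration, and it appends `.xml` instead of replacing the extension so that a job's .item and .properties copies do not overwrite each other.

```python
import shutil
from pathlib import Path


def copy_as_xml(project_dir, out_dir):
    """Copy every .item and .properties file under project_dir into out_dir,
    adding a .xml extension. The originals are only read, never modified."""
    src_root = Path(project_dir)
    dst_root = Path(out_dir)
    copied = []
    for src in sorted(src_root.rglob("*")):
        if src.is_file() and src.suffix in {".item", ".properties"}:
            # Keep the relative folder layout; "job_0.1.item" becomes
            # "job_0.1.item.xml" so it cannot clash with the .properties copy.
            rel = src.relative_to(src_root)
            dst = dst_root / rel.parent / (src.name + ".xml")
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(dst)
    return copied
```

Point `project_dir` at a backup of your workspace project folder and `out_dir` at a scratch folder outside the workspace, then open the copies in a browser.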
I just want to fetch some info (not all) from the HTML doc of a job through a Talend job and place that info in a Word document.
There is no short and easy answer to this. You will need to process the XML and use XPath to find the data you want.
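To give an idea of what the XPath side might look like, here is a minimal sketch in Python using the standard library's limited XPath support. The sample XML is invented for illustration: it assumes each component is a `node` element with a `componentName` attribute and `elementParameter` children carrying `name` and `value`, and `SOME_SETTING` is a placeholder parameter name. Check the element and attribute names against your own copied files before relying on any of this.

```python
import xml.etree.ElementTree as ET

# Invented sample standing in for a copied job file; the real layout may differ.
SAMPLE_ITEM = """
<talendfile:ProcessType xmlns:talendfile="platform:/resource/org.talend.model/model/TalendFile.xsd">
  <node componentName="tRunJob">
    <elementParameter name="UNIQUE_NAME" value="tRunJob_1"/>
    <elementParameter name="SOME_SETTING" value="false"/>
  </node>
  <node componentName="tRunJob">
    <elementParameter name="UNIQUE_NAME" value="tRunJob_2"/>
    <elementParameter name="SOME_SETTING" value="true"/>
  </node>
  <node componentName="tMap">
    <elementParameter name="UNIQUE_NAME" value="tMap_1"/>
  </node>
</talendfile:ProcessType>
"""


def component_params(root, component_name):
    """Return {unique name: {parameter name: value}} for every node whose
    componentName attribute equals component_name."""
    result = {}
    # XPath: all <node> descendants with the requested componentName
    for node in root.findall(f".//node[@componentName='{component_name}']"):
        params = {p.get("name"): p.get("value")
                  for p in node.findall("elementParameter")}
        unique = params.pop("UNIQUE_NAME", None)
        result[unique] = params
    return result


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_ITEM)
    for name, params in component_params(root, "tRunJob").items():
        print(name, params)
```

For a real file you would swap `ET.fromstring(SAMPLE_ITEM)` for `ET.parse(path).getroot()`. Writing the extracted values into a Word document is a separate step that this sketch does not cover.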