Four Stars

How to load the HTML data into tables

Hi Team, 

Can you help me load HTML data into tables using Talend?

I have attached a sample HTML file.

Regards

Jay

 

1 ACCEPTED SOLUTION

Eight Stars

Re: How to load the HTML data into tables

Hello Jayrapolu,

 

Generally, well-formed HTML can be treated as XML (strictly speaking that is XHTML; HTML itself is not a subset of XML), so you can use the XML components.

First, the file you attached is not a valid HTML file. Some tags are missing (e.g. <HTML></HTML>) and some tags are not closed (e.g. <BODY>). You have not specified the output format or some other important conditions, so it is hard to give you an exact answer, but here is the general idea.
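
To see why the missing and unclosed tags matter, here is a minimal sketch outside Talend, assuming Python and its standard xml.etree.ElementTree module. The function name is_well_formed and both sample strings are made up for illustration; they are not taken from the attached file.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the second one never closes <body>.
closed = "<html><body><p>ok</p></body></html>"
unclosed = "<html><body><p>ok</p></html>"

print(is_well_formed(closed))    # True
print(is_well_formed(unclosed))  # False
```

An XML parser rejects the second string outright, so the file has to be repaired (or reduced to a well-formed fragment) before XML components can read it.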

 

I think the most important component is tXMLMap. See the attached screenshot (sorry about the naming convention). Take only the relevant part of the HTML you provided:

<body>
<table cellpadding="0" cellspacing="0" border="0" width="100%">
				<tr>
					<td width="186" class="headlabel">CONSUMER:</td>
					<td width="320" class="headvalue">Jay</td>
					<td width="73"><img src="images/spacer.gif" /></td>
					<td width="118" class="headlabel">DATE:</td>
					<td width="128" class="headvalue">17-10-2017</td>
				</tr>
				<tr>
					<td class="headlabel">MEMBER ID:</td>
					<td class="headvalue">AA40238899_C2C1               </td>
					<td><img src="images/spacer.gif" /></td>
					<td class="headlabel">TIME:</td>
					<td class="headvalue">12:32:54</td>
				</tr>
</table>
</body>

You can use the following job to extract the headlabel and headvalue cells.
snip.PNG

I also attached an export of the job. 
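
For readers who cannot open the attachments, the same extraction can be sketched outside Talend. This is not the attached job, only an illustration of the logic under the assumption that the fragment above is well-formed XML; it uses Python's standard xml.etree.ElementTree, and the names SAMPLE and extract_pairs are made up for this example.

```python
import xml.etree.ElementTree as ET

# The fragment from the post, reindented for readability.
SAMPLE = """<body>
<table cellpadding="0" cellspacing="0" border="0" width="100%">
  <tr>
    <td width="186" class="headlabel">CONSUMER:</td>
    <td width="320" class="headvalue">Jay</td>
    <td width="73"><img src="images/spacer.gif" /></td>
    <td width="118" class="headlabel">DATE:</td>
    <td width="128" class="headvalue">17-10-2017</td>
  </tr>
  <tr>
    <td class="headlabel">MEMBER ID:</td>
    <td class="headvalue">AA40238899_C2C1               </td>
    <td><img src="images/spacer.gif" /></td>
    <td class="headlabel">TIME:</td>
    <td class="headvalue">12:32:54</td>
  </tr>
</table>
</body>"""


def extract_pairs(markup: str) -> list[tuple[str, str]]:
    """Pair each headlabel cell with the headvalue cell that follows it."""
    root = ET.fromstring(markup)
    pairs = []
    for row in root.iter("tr"):
        label = None
        for cell in row.findall("td"):
            css_class = cell.get("class")
            text = (cell.text or "").strip()
            if css_class == "headlabel":
                # Drop the trailing colon so the label can serve as a column name.
                label = text.rstrip(":")
            elif css_class == "headvalue" and label is not None:
                pairs.append((label, text))
                label = None
            # Cells without a class (the spacer images) are skipped.
    return pairs


pairs = extract_pairs(SAMPLE)
for name, value in pairs:
    print(f"{name} = {value}")
```

Each (label, value) pair can then be written to whatever table layout you need, for example one row with the columns CONSUMER, DATE, MEMBER ID and TIME.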

 

I hope this helps you solve the task.

 

Best regards

lojdr

 

2 REPLIES
Four Stars

Re: How to load the HTML data into tables

Thanks for the solution. Very much appreciated. 

 

Regards
Jay