Parsing XML files and loading into HDFS

Hi,
I need to parse XML files and load them into HDFS. Some are simple enough, and others contain data delimited by "|".
For example:
<ebo:Record>CREATE|59|0|59|0|2015-10-28 00:00:00|||EA|S|2955|303176760||2015-10-28 00:00:00|R|0003|8|1</ebo:Record>

 
<ebo:Record>CREATE|179|0|179|0|2015-10-28 00:00:00|||EA|S|2955|303151906||2015-10-28 00:00:00|R|0003|8|1</ebo:Record>

I have to pick these files up from a specific directory using Talend and load them into HDFS. There will be no transformation involved. Also, as there will be more than 50 XML formats, I don't want to create the metadata schema individually for each file format. Is there any way to automate this task?
1 REPLY
Community Manager

Re: Parsing XML files and loading into HDFS

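If you want to load the whole XML files into HDFS, you can use the tHDFSPut component. Set the filemask to "*.xml" and it will put all matching files from the local folder you specify onto the HDFS server. Because the files are copied as-is, no metadata schema is needed for any of the formats.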
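
For anyone who wants to see the same idea outside of Talend, here is a rough sketch of what a filemask-based put amounts to. It is not what tHDFSPut runs internally; it just selects files by mask and builds an `hdfs dfs -put` command for them. The function names and the paths in the usage note are made up for illustration, and it assumes the `hdfs` command-line client is installed and configured on the machine.

```python
import fnmatch
import os
import subprocess


def build_put_command(local_dir, hdfs_dir, filemask="*.xml"):
    """Build the `hdfs dfs -put` argument list for files matching filemask.

    Returns None when no file in local_dir matches. Hypothetical helper,
    not part of Talend or Hadoop.
    """
    names = sorted(
        name
        for name in os.listdir(local_dir)
        if fnmatch.fnmatch(name, filemask)
        and os.path.isfile(os.path.join(local_dir, name))
    )
    if not names:
        return None
    sources = [os.path.join(local_dir, name) for name in names]
    # -f overwrites a file that already exists at the destination.
    return ["hdfs", "dfs", "-put", "-f", *sources, hdfs_dir]


def upload(local_dir, hdfs_dir, filemask="*.xml"):
    """Copy the matching files to HDFS unchanged; return how many were sent."""
    cmd = build_put_command(local_dir, hdfs_dir, filemask)
    if cmd is None:
        return 0
    # Assumes the `hdfs` client is on PATH; raises CalledProcessError on failure.
    subprocess.run(cmd, check=True)
    # cmd holds four fixed arguments, the sources, and one destination.
    return len(cmd) - 5
```

Calling something like `upload("/data/incoming", "/user/etl/raw_xml")` (both paths are placeholders) would copy every `.xml` file in the folder in one go. The files are never opened or parsed, so the number of distinct XML formats does not matter.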