I'm using TOS Big Data and I'm trying to read in an XML file and split some of its data into different XML files based on values in the data. As an example, if I have an XML file with books in it, I want some books to go to one file and others to go to a different file based on the ISBN defined in a child element of the book. I've been looking for info on how to do this easily, but haven't come across anything yet. Any suggestions?
This isn't necessarily hard, but it isn't necessarily easy either. It all depends on the input and output schemas. We will need a lot more info before we can help.
I can't share a specific example because it's customer data, but the general idea of what I want to do is to evaluate a tag within an XML file (like Book) and have it and all elements under it go to different output XML files based on the data in an element. I've attached a sample XML for this example of doing this with books.
Logic might be something like:
If genre="Computer" then send the book (and child elements) to output XML file 1
If genre="Fantasy" then send the book (and child elements) to output XML file 2
OK, it is quite hard to give a detailed answer based on this. However, if your input structure does not contain more than one looping section, it may be quite easy to achieve this using a tXMLMap component. If you are dealing with multiple looping sections, you will have to use the tExtractXMLField component. This requires a bit of knowledge of XPath queries, but it is far more powerful than a tXMLMap. With the tExtractXMLField component, you would need to use a tMap to send the data in different directions.
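To make the extract-then-route idea concrete, here is a rough sketch in plain Python rather than Talend. It is not how the components are implemented; it only illustrates a loop XPath over each book followed by a genre test that picks an output. The sample XML, the element names (catalog, book, genre), and the route_books function are all assumptions made up for illustration, since the real file is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; the real structure may differ.
SAMPLE = """<catalog>
  <book id="bk101"><title>XML Developer's Guide</title><genre>Computer</genre></book>
  <book id="bk102"><title>Midnight Rain</title><genre>Fantasy</genre></book>
  <book id="bk103"><title>Visual Studio 7</title><genre>Computer</genre></book>
  <book id="bk104"><title>Lover Birds</title><genre>Romance</genre></book>
</catalog>"""


def route_books(xml_text):
    """Group <book> elements by genre; unmatched books go to a reject list."""
    root = ET.fromstring(xml_text)
    # One bucket per output file described in the question.
    routes = {"Computer": [], "Fantasy": []}
    rejected = []
    # "./book" plays the role of the loop XPath: one iteration per book.
    for book in root.findall("./book"):
        # "genre" is a relative path evaluated against the current book.
        genre = book.findtext("genre", default="").strip()
        # Pick the matching bucket, or the reject list if no rule matches.
        routes.get(genre, rejected).append(book)
    return routes, rejected


if __name__ == "__main__":
    routes, rejected = route_books(SAMPLE)
    for genre, books in routes.items():
        print(genre, [b.get("id") for b in books])
    print("rejected", [b.get("id") for b in rejected])
```

In a Talend job, the genre tests would be output filter expressions on the tMap, with one output per target file.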
To build the XML (after extracting and sending the data to the relevant path), you will probably end up using a tXMLMap component. However, if your output XML is complicated (has more than one looping section), you may need to do something a little more complicated when building this.
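As a plain Python illustration of the simple case (a single looping element in the output), the sketch below wraps a set of already-selected book elements in a new root and serializes them with their child elements intact. The build_output function, the catalog root name, and the sample data are hypothetical, and this is not a description of how tXMLMap works internally.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical input; only the Fantasy books are kept in this example.
SAMPLE = """<catalog>
  <book id="bk101"><title>XML Developer's Guide</title><genre>Computer</genre></book>
  <book id="bk102"><title>Midnight Rain</title><genre>Fantasy</genre></book>
</catalog>"""


def build_output(books, root_tag="catalog"):
    """Return an XML string with the given book elements under a new root."""
    out_root = ET.Element(root_tag)
    for book in books:
        # Deep copy so the whole subtree is carried over and the
        # source tree is left untouched.
        out_root.append(copy.deepcopy(book))
    return ET.tostring(out_root, encoding="unicode")


if __name__ == "__main__":
    source = ET.fromstring(SAMPLE)
    fantasy = [b for b in source.findall("./book")
               if b.findtext("genre") == "Fantasy"]
    # Write the string to a file of your choice instead of printing it.
    print(build_output(fantasy))
```

An output with more than one looping section would need extra grouping logic before the elements are appended, which is the harder case mentioned above.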
By the way, I am assuming you are using the Open Source Edition of Talend. If so, the above stands. If you are using the Enterprise Edition, then you will have access to the Talend Data Mapper. This is much more powerful at working with XML, but it will require a training course. If you purchased the Enterprise Edition, chances are you will have purchased some training. If so, make sure you take the TDM training.