
TOS Data Integration version control

I hope someone can help me with source/version control for Talend Open Studio (Data Integration and/or Big Data). I am aware that the enterprise version of Talend integrates with SVN, but that is not an option for me, since I work for a startup with no money to spend.
My requirement: I need to put my jobs under version control, preferably using Mercurial (hg).
Background: My company is a "Java shop", and the other developers use Java, Spring, Maven, Jenkins, Nexus and so on. The "Java way" of doing things. My background is in data warehousing, and I have worked a lot with ETL tools such as SAP BODS, Informatica and SSIS. I have used TOS "on the side" for test projects. I am the only one with this skill set, and I need to convince my coworkers that using an ETL tool is a good choice for data integration and processing tasks. :-) For now I am building "Standalone Jobs" from Talend and using Jenkins to schedule them.
I have spent a lot of time searching the forum for information on this, and my "Google skills" are also quite good. I found a few topics that mention Git, and I have read some blogs, but most of them involve putting the whole Talend project folder, or some subset of it, under version control.
To the point:

Would it be enough to only "export items" and put these under source control, avoiding all the generated Java code? It would of course require a disciplined workflow and depend on the developer exporting the correct items every time. But would it suffice if the goal is only to version the code?
Are there any other success stories with TOS and version control? How have you done it?


Re: TOS Data Integration version control

Hi Thomas,
At the moment, Talend offers community users a free trial of Talend Enterprise for Data Integration.
You can download it for free from the official Talend website.
Best regards

Re: TOS Data Integration version control

Version control for Talend jobs only really provides the benefit of keeping your work safe. You can use it to go back to a particular day, but unless you do that for the whole project, it is not bulletproof. By the way, I am talking about the Enterprise Edition here as well. Forget the concept of merging changes and having several developers work on a job concurrently; that will only end in tears. What you can do is (as you mentioned) export your jobs at different points in development. If you export single jobs, you can keep track of your changes to a job and move back and forth between versions. Remember that jobs often depend on other jobs or on metadata, so you need to put effort into managing those dependencies.
Branching can be useful, but you will never merge a branch back in the way you would with normal Java code. It is important to remember that while source control is essential for any and all development projects, with Talend (and, I am sure, many other code generation tools) you are massively restricted in what functionality you can safely make use of.
With regard to Nexus and other artefact repositories, I think it is key to have one of these. If you have to compile an executable every time you want one, that creates opportunities for issues to creep in. Talend have been using Nexus with ESB for a while and have just brought it into the Data Integration toolset for the Enterprise Edition. That is a massive improvement in my eyes.

Re: TOS Data Integration version control

Thanks for your feedback, xdshi and rhall!
The Enterprise Edition would no doubt cover all my needs for version control, but it is not an option at this stage since I do not have any money to spend. :-(
I will be the only developer of Talend jobs (at least for a while), so for now I do not need to worry about multiple developers, branching and so on. I really only want to achieve the following:

Keeping my work safe.
"Rollback" to previous versions of my jobs if necessary, i.e. I should be able to pull a revision from a repository I created myself and get a working Talend job when I import it into TOS.
Avoid tracking changes to generated Java sources, if possible.
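
As a rough illustration of the "export items, then commit" workflow, here is a minimal Python sketch. It assumes the export is saved as a zip archive and that generated output can be recognised by file suffix; the function name `unpack_export` and the suffix list are hypothetical, not anything Talend provides. It unpacks the archive into a working copy and leaves out the suffixes listed as generated.

```python
import zipfile
from pathlib import Path

# Assumption: suffixes treated as generated output and left out of the
# working copy. Adjust to what your own exports actually contain.
SKIP_SUFFIXES = {".java", ".class"}


def unpack_export(archive, workdir):
    """Extract an export archive into workdir, skipping generated files.

    Returns the relative paths that were written, sorted.
    """
    workdir = Path(workdir)
    written = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            rel = Path(info.filename)
            # Ignore members that would land outside the working copy.
            if rel.is_absolute() or ".." in rel.parts:
                continue
            if rel.suffix in SKIP_SUFFIXES:
                continue
            target = workdir / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(info))
            written.append(rel.as_posix())
    return sorted(written)
```

After unpacking, running `hg addremove` followed by `hg commit -m "..."` in the working copy would record the new state. Rolling back then means updating to an older revision and importing the items from that folder into TOS.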


Re: TOS Data Integration version control

It sounds like you can achieve everything using standard version control outside of Talend. I use the community edition with Git (which is a little more modern than SVN and is free and open-source...) and am able to track changes, tag releases, roll back, etc. 
If you go down the Git route, you have the option of using the command line or GUI tools such as TortoiseGit. It's not integrated with Talend, but I haven't found that to be too great a problem. You could also look at the Eclipse Git plugin, which should play nicely with Talend.
To prevent generated output from being version-controlled, you just need to add the paths to a file named .gitignore.
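
For illustration, a `.gitignore` along these lines might be a starting point. The folder names are assumptions about what a TOS workspace typically contains, not a verified list, so check them against your own workspace before relying on it.

```
# Assumed names only: verify against the folders in your own workspace.
# Eclipse workspace state
.metadata/
# Generated Java code and code-generation scratch projects
.Java/
.JETEmitters/
# Temporary files inside the project folder
temp/
# Compiled output
*.class
```

With Mercurial, the equivalent file is `.hgignore`; putting `syntax: glob` on its first line lets it use glob patterns like the ones above.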