
Source Control: Project, Job and Artifact versions - best practise

Hello 


We are just starting out on our Talend 6.4 journey and are interested in using Git and following a good workflow, one that we can move to CI as soon as possible.

My questions (apologies, there are a few, although they are all closely related) revolve around source control and versions, looking from the source control repository level all the way down to the job and artifact level.

Dale Anderson (https://www.talend.com/blog/2016/03/30/talend-job-design-patterns-best-practices-part-2/) seems to say that job versioning should not be done using the major and minor buttons in Talend Studio, and that we should "use the native SCC branching and tagging mechanisms" instead.

 

Q: I presume he means we leave the job version in the Studio at 0.1 forever?

I'm confused how this might work:

1) Talend puts ALL projects in a single Git repository.

So if you branch/tag your Git repository, then ALL projects are branched or tagged. - Q: Can you not configure a repository per project?

2) Even though the projects are versioned, it is the jobs THEMSELVES that are published as artifacts to Nexus at a particular version number. - Q: How do you tie an artifact version to a branch/tag? (Part of me feels each job should have its own Git repo...)

3) I did find this https://help.talend.com/reader/ElruncSKqadnY2jVuXhG8g/orkrMC2OzmiRk151LjYeVw which seems to "... allow you to release your whole project and publish all job artifacts with the same fixed version"

So this could tie the artifact version to the project version, which could in turn be tied to a branch/tag (even though the branch/tag would also exist in other projects!). - Q: Is anyone doing this?

4) In this scenario what happens if:

* You have a project at branch/tag 1.1 with 2 job artifacts published to Nexus and deployed to your runtime at version 1.1.
* You work on jobA but not jobB, and branch/tag the project (actually the whole repo!) at 1.2.
* You change the version of all job artifacts to 1.2, then publish and deploy? (This means jobB has incremented in version although it has not changed?)
* You work on jobB but not jobA, and branch/tag the project (actually the whole repo!) at 1.3.
* You change the version of all job artifacts to 1.3, then publish and deploy? (This means jobA has now incremented in version although it has not changed?)

Q: What if you need to roll back jobA to 1.2? If you roll back all jobs to 1.2, will you have lost the changes to jobB? Or do you never roll back but only move forward, so that you would need to publish all artifacts at 1.4 with a fix to jobA (which would somehow mean overwriting the jobA 1.3 artifact with the jobA 1.2 artifact)?
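
The scenario in point 4 can be made concrete with a small model. This is only a sketch and not how Talend or Nexus store anything: the `Release` type, the helper names and the revision labels (`a1`, `b2`, ...) are hypothetical, and each release is treated as a snapshot of every job's source at one project-wide version. In the demo, jobA is restored to its 1.1 state purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    version: str  # project-wide version shared by every job artifact
    jobs: dict    # job name -> source revision label (hypothetical labels)


def changed_jobs(previous, current):
    """Return the jobs whose source differs between two releases."""
    return sorted(
        name for name, rev in current.jobs.items()
        if previous.jobs.get(name) != rev
    )


def roll_forward(latest, good, job, new_version):
    """Cut a new release that restores one job to its state in `good`
    while keeping every other job as it is in `latest`."""
    jobs = dict(latest.jobs)
    jobs[job] = good.jobs[job]
    return Release(new_version, jobs)


# The three releases from the scenario: only one job changes each time,
# but both artifacts are published under the new version number.
r11 = Release("1.1", {"jobA": "a1", "jobB": "b1"})
r12 = Release("1.2", {"jobA": "a2", "jobB": "b1"})
r13 = Release("1.3", {"jobA": "a2", "jobB": "b2"})

# "Only move forward": release 1.4 with jobA restored and jobB's work kept.
r14 = roll_forward(r13, r11, "jobA", "1.4")
```

Seen this way, a version number identifies a state of the whole project rather than a change to any one job, and a single-job rollback becomes a new release whose content for that job happens to match an older one.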

Any advice about how people manage this workflow with or without CI would be gratefully received.


Best wishes


Re: Source Control: Project, Job and Artifact versions - best practise

Hello,

Have you already checked the online guide "Talend Software Development Life Cycle Best Practices" in the Talend Help Center?

Best regards

Sabrina


Re: Source Control: Project, Job and Artifact versions - best practise

Hello
Yes, I've read that.
My questions are raised based on reading that doc.
It would be useful to have an opportunity to chat to someone who has put CI into practice.
Cheers
Nomit

Re: Source Control: Project, Job and Artifact versions - best practise

Hello,

 

This is to continue the conversation we started at the end of the post linked below; your last response is quoted below. My next post will try to answer it as best I can.

https://community.talend.com/t5/Deployment/Merge-git-Development-to-Master-Branch-via-Talend-Studio/...

 


n999 wrote:

1) Yes - since posting I've discovered that you can have one project per repo... I think I was confused by the root config in the TAC where you can add Git config.

 

Yes, one job per project per repo sounds like something that won't work with the way Talend works. If only you could include ref projects, joblets, metadata or connection details from outside your own project as injected dependencies, and version these injected dependencies in your project. This is how you would build a regular Maven-based Java project: by referencing injected dependencies in your project pom.

 

2) and 3) - sorry, some quick-fire questions

 

So you change the project version number (I think this might be what Talend calls 'the published version number...') manually in the pom in Jenkins?

 

Have you tried changing the project version number as per https://help.talend.com/reader/ElruncSKqadnY2jVuXhG8g/orkrMC2OzmiRk151LjYeVw?_ga=2.139841203.1208745...?

 

I presume this does the same thing?

 

What version number do you use? Is it just incremental numerical versioning? 

 

You don't use the git tag as the pom version number?

 

So this deploys all jobs in that project to nexus at the SAME version?

 

So you have some jobs in that project incrementing in version number in nexus even though they have no code change?

 

If so, does that mean you redeploy all these new unchanged jobs to a job server?

 

---

I did find this https://www.teschglobal.com/resources-all/talend-devops-continuous-integration :

 

The whole video is interesting (and the blog post mentions continuous deployment using metaservlet). Around 29:40 it shows the use of a script to deploy to Nexus and to change what he calls the publish version for all jobs. This is as opposed to editing the pom in Jenkins or using the method in the help.talend.com link above...

 

4) " So if one of the jobs needs to roll back, we would roll back just that one job but still continue to move forward ..." Just to clarify: you would roll back job A on the TAC/job server but not in the code base? So you would make an incremental fix to job A in master to represent the rollback, and then release the whole project at a new incremented version? And again deploy all the newly versioned jobs in that project to the TAC/job server?

 

Sorry for all the questions  and thanks again for your time



 


Re: Source Control: Project, Job and Artifact versions - best practise

So you change the project version number (I think this might be what Talend calls 'the published version number...') manually in the pom in Jenkins?

Correct

 

Have you tried changing the project version number as per https://help.talend.com/reader/ElruncSKqadnY2jVuXhG8g/orkrMC2OzmiRk151LjYeVw?_ga=2.139841203.1208745...? I presume this does the same thing?

 We have not tried changing versions in that manner.  We are on version 6.3 and I am not seeing that as an option so I guess it is new to 6.4.  Thanks for the heads up though!

 

What version number do you use? Is it just incremental numerical versioning? 

 

The version number we started with was 1.0.0 and we increase the different major/minor numbers with some loose concept on when we should increase them.  It's just our first run at it and we are very inexperienced with best practices on versioning numbers.
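
One way to pin down a "loose concept" like this is the common semantic-versioning convention: bump the major number for breaking changes, the minor number for new functionality, and the patch number for fixes. A minimal sketch, assuming three-part versions like the 1.0.0 mentioned above; the `bump` helper is hypothetical and not part of Talend, Maven or Jenkins.

```python
def bump(version, part):
    """Return the next version after bumping 'major', 'minor' or 'patch'.

    Lower-order numbers are reset to zero, as in semantic versioning.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


next_fix = bump("1.0.0", "patch")      # a bug fix to one job
next_feature = bump("1.0.1", "minor")  # a new job added to the project
```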

 

You don't use the git tag as the pom version number?

We incorporate the version number used in Jenkins into the Git tag along with some wacky name.

 

So this deploys all jobs in that project to nexus at the SAME version?

Correct

 

So you have some jobs in that project incrementing in version number in nexus even though they have no code change?

Correct, because I think it is treated as a suite of jobs that is deployed together, just as in software development a version implies a snapshot of all the code, regardless of whether only parts of the code have changed.

 

If so, does that mean you redeploy all these new unchanged jobs to a job server?

Not necessarily.  We only re-deploy jobs we need to.  This is a good question.  Right now we don't have a ton of jobs and the process isn't completely ironed out on how we would manage a ton of jobs.
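
Deciding which jobs "need" redeploying could in principle be automated by comparing a fingerprint of each job between what is currently deployed and what the new release contains. A rough sketch under stated assumptions: the function names and sample job content are hypothetical, and the fingerprint is assumed to be computed over something stable such as the job's exported source, since a rebuilt archive may differ byte for byte even when nothing meaningful changed.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Stable fingerprint of a job's content (SHA-256 hex digest)."""
    return hashlib.sha256(content).hexdigest()


def jobs_to_redeploy(deployed, candidate):
    """Return jobs in `candidate` that are new or differ from `deployed`.

    Both arguments map job name -> fingerprint.
    """
    return sorted(
        name for name, digest in candidate.items()
        if deployed.get(name) != digest
    )


deployed = {
    "jobA": fingerprint(b"jobA source, first revision"),
    "jobB": fingerprint(b"jobB source, first revision"),
}
candidate = {
    "jobA": fingerprint(b"jobA source, second revision"),  # changed
    "jobB": fingerprint(b"jobB source, first revision"),   # unchanged
    "jobC": fingerprint(b"jobC source, first revision"),   # new job
}
selected = jobs_to_redeploy(deployed, candidate)
```

With this approach every job can still be published at the same project version while only the selected jobs are pushed to the job server.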


Re: Source Control: Project, Job and Artifact versions - best practise

4) " So if one of the jobs needs to roll back, we would roll back just that one job but still continue to move forward ..." Just to clarify: you would roll back job A on the TAC/job server but not in the code base? So you would make an incremental fix to job A in master to represent the rollback, and then release the whole project at a new incremented version? And again deploy all the newly versioned jobs in that project to the TAC/job server?

 

We would possibly roll back the code and the job in TAC, but it also depends on the situation and what the issue is.  We may roll back the code and then do a release and update TAC to the new release.  Just to that job, maybe.  All depends what the situation is and how your jobs are set up.  However again we've just scratched the surface with this and we're running 6.3.  I think 6.4 has some enhancements and fixes as it relates to continuous integration.

 

Hopefully this limited experience helps.  


Re: Source Control: Project, Job and Artifact versions - best practise

Hi

 

Many thanks for that.

All extremely useful stuff.

 

Yes, seeing all jobs in a project as a single piece of code does make sense, although under the manual approach each job is actually deployed to Nexus and then to the job server at its own version, and it's not 100% clear what the Talend docs recommend in a CI environment: whether or not you should deploy all jobs at a new project version to Nexus or the job server.

 

We are yet to create or deploy our first real job (!). I'm going to try to follow the model below (maybe too complicated, especially relating to Git) and see where it takes me. I'll post my findings here...

 

  • Developers branch off develop to create feature branches, and do their work and create unit tests in those branches.
  • Developers do not change the job version in the Studio using the minor and major buttons, so each job stays at version 0.1.
  • When ready, the feature branch is merged into develop.
  • Jenkins listens for this push and runs "GenerateSources" from the develop branch and then "RunTests".
  • If the tests pass, "DeployToNexus" is run, deploying jobs to the Snapshots Nexus repo at the same version each time: 0.1. (You could choose to add a timestamp so as not to overwrite each snapshot...)
  • Another 2 Jenkins jobs, "DeployToDevServer" and "DeployToQaServer", use metaservlet to deploy all snapshot jobs in that project to our DEV and QA runtime job servers.
  • Some form of QA testing takes place...

 

  • When the code is production ready, develop is merged into master and a meaningful tag is created.
  • Another set of "LIVE" Jenkins jobs exists, from "GenerateSources" to "DeployToLiveServer", that require a manual build and require a Git master tag to be added to the "GenerateSources" Jenkins config manually.
  • This 'pipeline' is run manually and will deploy all jobs to the Nexus Releases repo at the version of the tag. It will also deploy all these new releases to the Live runtime...
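
The versioning rule in this model (a fixed or timestamped snapshot from develop, a tag-based release from master) might look roughly like the sketch below. Everything here is an assumption for illustration: the `artifact_version` helper, the plain `1.2.0`-style tag convention, and the exact version strings, which may not match what Talend or Maven actually generate. The `-SNAPSHOT` suffix follows the usual Maven convention for artifacts sent to a snapshots repository.

```python
import re
from datetime import datetime, timezone

# Hypothetical tag convention: a bare three-part version such as "1.2.0".
TAG_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")


def artifact_version(branch, tag=None, base="0.1", timestamped=False, now=None):
    """Return (version, nexus_repo) for a build of the given branch."""
    if branch == "master":
        if tag is None or not TAG_PATTERN.match(tag):
            raise ValueError("a release build needs a tag like 1.2.0")
        return tag, "releases"
    if branch == "develop":
        if timestamped:
            moment = now or datetime.now(timezone.utc)
            stamp = moment.strftime("%Y%m%d.%H%M%S")
            return f"{base}-{stamp}-SNAPSHOT", "snapshots"
        return f"{base}-SNAPSHOT", "snapshots"
    raise ValueError(f"no publishing rule for branch {branch!r}")


snapshot = artifact_version("develop")
release = artifact_version("master", tag="1.2.0")
```

Keeping the rule in one place means the Studio job version can stay at 0.1 while the published version is decided entirely by the branch or tag being built.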

This is all pie in the sky and may not be possible/safe. Using metaservlet is a big unknown and raises questions such as whether jobs deployed by metaservlet are visible in the TAC.
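
As a starting point for experimenting with metaservlet from Jenkins, here is a small sketch of how a request URL could be put together. It assumes the commonly described pattern in which the TAC metaServlet takes a base64-encoded JSON request appended to the URL after `?`. The host, credentials, task id and the action and field names below are placeholders and assumptions, so check them against the MetaServlet help for your TAC version before relying on any of this.

```python
import base64
import json


def metaservlet_url(tac_base_url, request):
    """Build a metaServlet URL carrying the request as base64-encoded JSON."""
    payload = json.dumps(request, separators=(",", ":")).encode("utf-8")
    encoded = base64.b64encode(payload).decode("ascii")
    return f"{tac_base_url.rstrip('/')}/metaServlet?{encoded}"


# Placeholder request: every value here is an assumption for illustration.
request = {
    "actionName": "runTask",
    "authUser": "ci@example.com",
    "authPass": "change-me",
    "taskId": 1,
    "mode": "synchronous",
}
url = metaservlet_url(
    "http://tac.example.com:8080/org.talend.administrator", request
)
```

The URL would then be fetched with an ordinary HTTP GET from a Jenkins step. Since it carries credentials, it is best kept out of build logs.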

 

Anyway - enough research - now time to see what can be done.....

 

Thanks again, will keep you posted, and I hope Talend do a webinar on CI at some point!

 

Cheers