Continuous Integration with CodePipeline, Jenkins, and Talend

Overview

AWS CodePipeline is a continuous integration and continuous delivery service that builds, tests, and deploys code every time there is an update. Continuous integration is a major topic in Talend projects, and the product has gained more capabilities in this area, especially better integration with Maven, Jenkins, and Nexus.

 

This tutorial shows an example of integration between AWS CodePipeline and the Talend Continuous Integration process based on Talend CI-builder, Talend CommandLine, and Jenkins.

 

By the end, you will know:

  • how to install and configure the Talend Continuous Integration tools
  • how to configure AWS CodePipeline to orchestrate the Talend Continuous Integration process

 

Architecture

The diagram below shows the overall architecture of this tutorial.

architecture.png

 

  1. Developer pushes changes to the GIT repository.
  2. AWS CodePipeline polls and fetches updates from GitHub.
  3. AWS CodePipeline calls Jenkins to trigger a new build.
  4. Jenkins downloads source code from GitHub, then builds artifacts using Talend CI-Builder and Talend CommandLine.
  5. Jenkins publishes artifacts to Nexus.

 

Assumptions

  1. Amazon Web Services (AWS):

    • You should be familiar with the AWS platform, since this article does not explain the administration and management of AWS services in detail. Refer to Amazon Web Services (AWS) - Getting Started to read about the AWS functionality that Talend provides.
    • You should have full access to the main AWS services described in the Prerequisites section below.
  2. Talend:

  3. Maven: You should have a basic knowledge of Maven.

 

Environment

  • AWS Platform will be used to host the tutorial
  • Talend Data Integration 6.3

 

Prerequisites

  1. A valid AWS account with full access to the following services:

  2. Valid AWS Access Keys to programmatically access AWS services

  3. A GitHub account. A free account is enough for this tutorial. You could also use any other GIT distribution you are familiar with.
  4. Talend Product (any commercial edition)—https://www.talend.com/products. The following modules will be used:

    • Talend Studio
    • Talend CI-Builder
    • Talend CommandLine
    • Talend Artifact Repository (Nexus)

 

Set up the tools

 

GitHub

  1. Create a GitHub account or use your existing one.
  2. Create a repository to host the Talend project sources. For this tutorial, call it DEMOS. Using a public and free GIT repository is fine for this tutorial.

  3. Click Create Repository.

    create_repo.png

     

 

Choose an AWS Region

Connect to the AWS console and choose a region where AWS CodePipeline is available, because CodePipeline is not offered in every AWS region.

Note: Other considerations when choosing an AWS region include latency and compliance requirements.

 

For this demo, choose the Frankfurt region.

 

Create an IAM Role for Jenkins Integration

You must create the Identity and Access Management (IAM) role that will be attached to the EC2 instance running Jenkins. This will authorize Jenkins to interact with AWS CodePipeline.

  1. Sign in to the IAM console.
  2. Choose Roles, then choose Create New Role.
  3. Type the name of the role to create, such as JenkinsAccessRole.

    rolename.png

  4. Click Next Step.
  5. Select Role Type: Amazon EC2.
  6. Attach Policy: choose the managed policy AWSCodePipelineCustomActionAccess.
  7. Click Next Step.
  8. Click Create Role.

The role JenkinsAccessRole was successfully created.
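
If you prefer the command line to the console, the same role can be sketched with the AWS CLI. This is a minimal sketch, not the procedure used in this tutorial: it assumes the AWS CLI is installed and configured with your access keys, the trust policy file path is a hypothetical scratch location, and the commands are only printed unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: create the Jenkins role from the command line instead of the console.
ROLE_NAME="JenkinsAccessRole"
POLICY_ARN="arn:aws:iam::aws:policy/AWSCodePipelineCustomActionAccess"
TRUST_FILE="${TMPDIR:-/tmp}/ec2-trust-policy.json"   # hypothetical scratch path

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Trust policy that lets EC2 instances assume the role.
cat > "$TRUST_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

run aws iam create-role --role-name "$ROLE_NAME" \
    --assume-role-policy-document "file://$TRUST_FILE"
run aws iam attach-role-policy --role-name "$ROLE_NAME" --policy-arn "$POLICY_ARN"
# The console creates an instance profile for EC2 roles; with the CLI you create it yourself.
run aws iam create-instance-profile --instance-profile-name "$ROLE_NAME"
run aws iam add-role-to-instance-profile --instance-profile-name "$ROLE_NAME" \
    --role-name "$ROLE_NAME"
```

Review the printed commands first, then rerun with DRY_RUN=0 once they match your account setup.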

 

Launch an EC2 instance

For this demonstration, you will use an EC2 instance to host Jenkins and the Talend software (CommandLine, CI-Builder, and Nexus). For simplicity, you will install Nexus on the same server, although you could install it on a separate host in your environment.

  1. Connect to the EC2 console and choose the Frankfurt region.
  2. Click Launch Instance.
  3. Choose an Amazon Machine Image: select Amazon Linux AMI, usually the first choice in the list.

    amazon_lx_ami.png

  4. Choose an Instance type: select m4.xlarge, which is a good fit for this demo.
  5. Next: Configure Instance Details.

    • Number of instances = 1
    • Network: choose your default VPC
    • Subnet: select your default or preferred subnet
    • Auto-assign Public IP: select Enable
    • IAM Role: select JenkinsAccessRole
    • Keep all other settings with default values

      configure_instance.png

  6. Next: Add storage.

    • Size = 100GB
  7. Next: Add Tags.

    • Name = Talend 6.3 Continuous Integration
  8. Next: Configure Security Group.

    • Add an SSH rule with values.

      • Type = ssh
      • Protocol = TCP
      • Port range = 22
      • Source = Custom 0.0.0.0/0, ::/0
    • Add a custom TCP rule for accessing the Jenkins web UI.

      • Type = Custom TCP Rule
      • Protocol = TCP
      • Port range = 9080
      • Source = Custom 0.0.0.0/0, ::/0
    • Add a custom TCP rule for accessing Nexus.

      • Type = Custom TCP Rule
      • Protocol = TCP
      • Port range = 8081
      • Source = Custom 0.0.0.0/0, ::/0
  9. Review and Launch

    • You can ignore the security group warning for this tutorial. In real environments, avoid opening ports to 0.0.0.0/0.
  10. Launch the instance.
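
Outside of a demo, you would typically restrict these rules to a known address range. The sketch below shows one way to do that with the AWS CLI. The security group ID and the CIDR are hypothetical placeholders (203.0.113.0/24 is a documentation-only range), and the commands are only printed unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: open the three tutorial ports to one trusted range instead of 0.0.0.0/0.
SG_ID="sg-0123456789abcdef0"      # hypothetical security group ID
TRUSTED_CIDR="203.0.113.10/32"    # placeholder; replace with your own public IP

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

for port in 22 9080 8081; do
  run aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
      --protocol tcp --port "$port" --cidr "$TRUSTED_CIDR"
done
```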

 

Install Talend Components

Once the EC2 instance has started, create a directory called /opt/talend. Use it as the base directory for your Talend installation.

 

Follow the official Talend installation documentation (available from the Talend Help Center) to install the following Talend components under the /opt/talend folder:

  • Talend CommandLine
  • Talend CI-Builder
  • Talend Administration Center
  • Talend Artifact Repository
  • Talend Studio, or use the local Studio on your laptop if you have one

 

Create a Talend project hosted on GIT

This tutorial requires a Talend project that uses GIT as storage.

 

Use Talend Administration Center and Talend Studio to create and configure a simple remote project:

  1. Connect to the Talend Administration Center console.
  2. Navigate to the Projects view:

    projects_view.png

     

  3. Click Add to create a new project.
  4. Use the following values to create the project:

    1. Label = Demos_CI
    2. Project type = Data Integration/ESB
    3. Storage = GIT
    4. Click the Advanced Settings checkbox

      1. Url: URL of the DEMOS GIT repository created earlier
      2. Login: your GIT login
      3. Password: your GIT password

        project_create.png

  5. Click Check connection to check the connection with GitHub.
  6. Click Save.

Congratulations! You have successfully created your Talend project hosted on GitHub. You can now launch your Talend Studio, create a remote connection to this project, and import the sample job hello_job.zip provided as an Attachment to this article.

 

Install and configure Apache Maven

 

Install Maven

  1. Connect to the EC2 instance.
  2. Create the folder /opt/maven.
  3. From /opt/maven, download and extract Apache Maven with the commands below:

    cd /opt/maven
    wget http://apache.mirrors.ovh.net/ftp.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
    tar xvfz apache-maven-3.3.9-bin.tar.gz
  4. Edit your .bash_profile file to add the Maven home and path.

    # .bash_profile
    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
            . ~/.bashrc
    fi
    # User specific environment and startup programs
    JAVA_HOME=/opt/java/jdk1.8.0_121
    export JAVA_HOME
    MAVEN_HOME=/opt/maven/apache-maven-3.3.9
    export MAVEN_HOME
     
    PATH=$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH:$HOME/.local/bin:$HOME/bin
    export PATH
  5. Check that Maven is configured properly by typing the command mvn -version:

    [ec2-user@ip-xxx-xx-xx-xx ~]$ mvn -version
    Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00)
    Maven home: /opt/maven/apache-maven-3.3.9
    Java version: 1.8.0_121, vendor: Oracle Corporation
    Java home: /opt/java/jdk1.8.0_121/jre
    Default locale: en_US, platform encoding: UTF-8
    OS name: "linux", version: "4.4.41-36.55.amzn1.x86_64", arch: "amd64", family: "unix"
    [ec2-user@ip-xxx-xx-xx-xx ~]$

 

Configure the Maven settings file

  1. Navigate to the Maven conf directory under the Maven base directory:

    cd /opt/maven/apache-maven-3.3.9/conf
  2. Locate and edit the settings.xml file and make the following changes:

    1. Set the localRepository path.

      1. Uncomment this line:

        <localRepository>/path/to/local/repo</localRepository>
      2. Set the local repository to your CommandLine repository path.

        <localRepository>/home/ec2-user/talend/commandline/configuration/.m2/repository</localRepository>

        Note: The .m2 folder will be created automatically when CommandLine starts.

    2. Set the Nexus server credentials.

      1. Add the following lines to the settings file.

        Note: Update it with the correct values for Nexus username and password.

        <servers>
             <server>
                <id>tac</id>
                <username>your nexus username</username>
                <password>your nexus password</password>
             </server>
        </servers>
  3. Save your changes, then close the file.
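
For comparison, the sketch below writes a minimal settings.xml containing only the two changes described above to a scratch location. The output path is a hypothetical example, and the credentials are the same placeholders as in the steps. Note that the stock settings.xml shipped with Maven already contains a servers element, so in the real file the server entry would normally go inside that existing element rather than in a second one.

```shell
#!/bin/sh
# Sketch: write a minimal example settings.xml to a scratch file for comparison.
OUT="${TMPDIR:-/tmp}/settings-example.xml"   # hypothetical scratch path

cat > "$OUT" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <!-- Local repository shared with Talend CommandLine -->
  <localRepository>/home/ec2-user/talend/commandline/configuration/.m2/repository</localRepository>
  <servers>
    <server>
      <!-- Credentials Maven uses for the repository with this id -->
      <id>tac</id>
      <username>your nexus username</username>
      <password>your nexus password</password>
    </server>
  </servers>
</settings>
EOF

echo "Sample written to $OUT"
```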

 

Create the BuildJob_pom.xml file

As described in the Talend Software Development Life Cycle Best Practices Guide, Talend relies on Apache Maven and Talend CommandLine to generate sources through Talend CI-builder.

 

For this tutorial you will create a POM file (in XML format), called BuildJob_pom.xml, which holds information on the Maven project, its configuration, and the instructions to generate sources. You will use the local-generation mode, which means that you do not need to launch the CommandLine, as it will be launched and shut down automatically.

  1. Log in to your Talend EC2 instance with PuTTY or another SSH client.
  2. Navigate to the Talend CI-builder directory, for example /opt/talend/Talend-CI-Builder-20161216_1026-V6.3.1.
  3. Create a file called BuildJob_pom.xml.
  4. Edit the BuildJob_pom.xml file, then copy and paste the content below.

    Note: Adapt the parameters to your environment-specific values.

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>org.talend</groupId>
      <artifactId>buildsources</artifactId>
      <version>A.B.C</version>
      <properties>
        <!-- Required. CommandLine application workspace and Studio path, only for local(script) mode -->
        <commandline.workspace>commandline_workspace_path</commandline.workspace>
        <product.path>studio_path</product.path>
        <!-- Optional. Specify target directory where generated sources will be stored -->
        <projectsTargetDirectory>${basedir}/projectSources</projectsTargetDirectory>
        <!-- Optional. Specify version for the artifact to be built. Can be set for each Job independently -->
        <deploy.version>A.B.C-SNAPSHOT</deploy.version>
      </properties>
      <build>
        <plugins>
          <plugin>
            <groupId>org.talend</groupId>
            <artifactId>ci.builder</artifactId>
            <version>A.B.C</version>
            <executions>
              <execution>
                <phase>generate-sources</phase>
                <goals>
                  <!-- local(script) mode -->
                  <goal>local-generate</goal>
                </goals>
                <configuration>
                  <!-- Optional. Specify CommandLine user -->
                  <commandlineUser>jobbuilder@talend.com</commandlineUser>
                  <!-- Optional. JVM parameters for local(script) mode -->
                  <!-- <jvmArguments>-ea -Xms512m -Xmx1300m -XX:+CMSClassUnloadingEnabled -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:+HeapDumpOnOutOfMemoryError</jvmArguments> -->
                  <!-- Optional. Parameter used to filter on specific Job status (TEST, DEV, PROD, etc) -->
                  <!-- <itemFilter>(status=TEST)or(status=PROD)or(status="")</itemFilter> -->
                </configuration>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
      <pluginRepositories>
        <!-- everything through Maven Central Proxy -->
        <!-- Replace <host> and <port> with your Nexus host and port -->
        <pluginRepository>
          <id>Central</id>
          <name>Central</name>
          <url>http://<host>:<port>/nexus/content/repositories/central</url>
        </pluginRepository>
        <pluginRepository>
          <id>thirdparty</id>
          <name>thirdparty</name>
          <url>http://<host>:<port>/nexus/content/repositories/thirdparty/</url>
        </pluginRepository>
      </pluginRepositories>
    </project>
  5. Save and close the file.
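
The <host> and <port> placeholders in the repository URLs must be replaced before Maven can parse the file. A minimal sketch of doing this with sed follows. The host and port values are assumptions for a Nexus running on the same instance, and fill_nexus_placeholders is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: substitute the Nexus placeholders in text read from standard input.
NEXUS_HOST="localhost"   # assumption: Nexus runs on the same EC2 instance
NEXUS_PORT="8081"        # assumption: Nexus listens on port 8081

fill_nexus_placeholders() {
  sed -e "s/<host>/${NEXUS_HOST}/g" -e "s/<port>/${NEXUS_PORT}/g"
}

# Demonstrate on a single line from the POM.
echo '<url>http://<host>:<port>/nexus/content/repositories/central</url>' \
  | fill_nexus_placeholders
```

To apply it to the file itself, you could run fill_nexus_placeholders < BuildJob_pom.xml > BuildJob_pom.filled.xml and review the result before replacing the original. The explanatory comment that mentions the placeholders would be rewritten too, which is harmless.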

 

Upload the CI-Builder plugin in Nexus

The following steps come from the Talend Software Development Life Cycle Best Practices Guide. They have been adapted to this demonstration.

  1. Navigate to the Talend-CI-Builder directory:

    cd /opt/talend/Talend-CI-Builder-20161216_1026-V6.3.1
  2. Execute the following command:
    mvn install:install-file -Dfile=ci.builder-6.3.1.jar -DpomFile=ci.builder-6.3.1.pom
  3. Launch a web browser and connect to your Nexus web application.
  4. Click Repositories, then select the thirdparty repository.
  5. Click the Artifact Upload tab, and upload the CI Builder POM and JAR files you just installed on your machine.

The Maven plugin is now available to anyone, and can be incorporated into your builds. For more information on the Nexus thirdparty repository, see the Nexus documentation.
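
As a possible alternative to the web upload, the Maven deploy plugin can push the same two files from the command line. This is a sketch under assumptions rather than the procedure from the Talend guide: the Nexus URL assumes Nexus on localhost:8081, and the repository id assumes that the tac server entry in settings.xml holds credentials allowed to deploy to thirdparty. The command is only printed unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: upload the CI-Builder plugin to the thirdparty repository with Maven.
CI_BUILDER_VERSION="6.3.1"
NEXUS_URL="http://localhost:8081/nexus/content/repositories/thirdparty/"  # assumption
REPOSITORY_ID="tac"   # assumption: must match a <server><id> in settings.xml

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Run from the Talend CI-Builder directory, where the JAR and POM files live.
run mvn deploy:deploy-file \
    -Dfile="ci.builder-${CI_BUILDER_VERSION}.jar" \
    -DpomFile="ci.builder-${CI_BUILDER_VERSION}.pom" \
    -DrepositoryId="$REPOSITORY_ID" \
    -Durl="$NEXUS_URL"
```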

 

Install and Configure Jenkins

 

Install Jenkins

Since installing Jenkins is easy and straightforward, this tutorial will not explain this process in detail. The overall procedure is as follows:

  1. Log in to your EC2 instance with PuTTY or another SSH client.
  2. Create the directory /opt/jenkins, which will be the base directory of your Jenkins installation.
  3. Follow the Installing Jenkins documentation, and choose the installation that deploys the Jenkins WAR file on Tomcat.
  4. Configure Jenkins to listen on port 9080. You can use any other port, as long as you set up the security group accordingly.

After installing and configuring Jenkins, you should now be able to access the Jenkins dashboard with your favorite web browser as shown below.

jenkins_dashboard.png
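
If the dashboard does not load, a quick check of whether the ports are reachable can help separate a security group problem from a service problem. The sketch below assumes bash and the coreutils timeout command are available on the machine you run it from. It probes port 9080 (Jenkins) and port 8081 (the Nexus port used later in this tutorial), and check_port is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: test whether the tutorial ports accept TCP connections.
HOST="${HOST:-127.0.0.1}"   # replace with the public IP of the EC2 instance

# Return 0 when a TCP connection to host:port succeeds within 3 seconds.
check_port() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for port in 9080 8081; do
  if check_port "$HOST" "$port"; then
    echo "port $port on $HOST: open"
  else
    echo "port $port on $HOST: closed or filtered"
  fi
done
```

Run it as HOST=xx.xxx.xx.xx sh check_ports.sh from your workstation. A port that is open from the instance itself but closed from outside usually points at the security group.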

 

Install the Jenkins AWS CodePipeline plugin

  1. Click Manage Jenkins.
  2. Select Manage Plugins.
  3. Choose the Available tab, then search for the AWS CodePipeline plugin in the Filter search box.

    plugin.png

     

  4. Select the plugin, then click Download now and install after restart.
  5. Restart Jenkins when installation is complete.

 

Create the Jenkins projects

The steps in this section are based on those in the Talend Software Development Life Cycle Best Practices Guide. They have been adapted to this demonstration.

 

Create the BuildSources Jenkins project

  1. Connect to the Jenkins web UI.
  2. Click New Item.

    new_item.png

  3. Enter an Item Name: BuildSources
  4. Choose Freestyle Project, then click OK.
  5. The Project configuration page appears:

    project_config.png

     

  6. On the Source Code Management tab, select AWS CodePipeline, then configure the values as shown below:

    1. AWS Region = EU_CENTRAL_1
    2. AWS Access Key: use your access key
    3. AWS Secret Key: use your secret key
    4. Category = Build
    5. Provider = JenkinsBuildSources

      src_code_mgmt.png

       

  7. On the Build Triggers tab, select Poll SCM and set the schedule to poll as often as you want, for example every minute (* * * * *).

    build_triggers.png

     

  8. On the Build tab, invoke a Maven target associated with Talend CI-Builder to generate sources.

    1. Add a build step with type Invoke top-level Maven targets.

      build_step.png

    2. In the Goals field, enter the following command:

      org.talend:ci.builder:6.3.1:local-generate
    3. Click Advanced, then configure the following values:

      1. POM: enter the path to the global BuildJob_pom.xml on the server
      2. Settings file:

        1. Select Settings in file system.
        2. File path: configure the path to the Maven settings file, for example /opt/maven/apache-maven-3.3.9/conf/settings.xml.

          path_to_maven.png

           

  9. On the Post-build Actions tab, add a post-build action.

    1. Choose AWS CodePipeline Publisher.

      post-build.png

    2. Do not add any output locations.
    3. Click Save.

 

Create the RunTests Jenkins project

  1. Connect to the Jenkins web UI.
  2. Click New Item.

    new_item.png

  3. Enter an Item Name: RunTests
  4. Choose Freestyle Project, then click OK.
  5. The Project configuration page appears.

  6. On the Source Code Management tab, select AWS CodePipeline, then configure the values as shown below:

    1. AWS Region = EU_CENTRAL_1
    2. AWS Access Key: use your access key
    3. AWS Secret Key: use your secret key
    4. Category = Test (this must match the action category used for this project in the pipeline)
    5. Provider = JenkinsRunTests

      src_code_mgmt2.png

       

  7. On the Build Triggers tab, select Poll SCM and set the schedule to poll as often as you want, for example every minute (* * * * *).

    build_triggers.png

     

  8. On the Build tab, invoke a Maven target associated with Talend CI-Builder to run the test.

    1. Add a build step with type Invoke top-level Maven targets.

      build_step.png

    2. In the Goals field, enter the following command:

      test -fn -e
    3. Click Advanced, then configure the following values:

      1. POM: enter the path to the global BuildJob_pom.xml on the server
      2. Settings file:

        1. Select Settings in file system.
        2. File path: configure the path to the Maven settings file, for example /opt/maven/apache-maven-3.3.9/conf/settings.xml.

          path_to_maven2.png

           

  9. On the Post-build Actions tab, add a post-build action.

    1. Choose AWS CodePipeline Publisher.

      post-build.png

    2. Do not add any output locations.
    3. Click Save.

 

Create the DeployToNexus Jenkins project

  1. Connect to the Jenkins web UI.
  2. Click New Item.

    new_item.png

  3. Enter an Item Name: DeployToNexus
  4. Choose Freestyle Project, then click OK.
  5. The Project configuration page appears.

    project_config2.png

     

  6. On the Source Code Management tab, select AWS CodePipeline, then configure the values as shown below:

    1. AWS Region = EU_CENTRAL_1
    2. AWS Access Key: use your access key
    3. AWS Secret Key: use your secret key
    4. Category = Build
    5. Provider = JenkinsDeployToNexus

      src_code_mgmt3.png

       

  7. On the Build Triggers tab, select Poll SCM and set the schedule to poll as often as you want, for example every minute (* * * * *).

    build_triggers.png

     

  8. On the Build tab, invoke a Maven target associated with Talend CI-Builder to deploy the artifacts to Nexus.

    1. Add a build step with type Invoke top-level Maven targets.

      build_step.png

    2. In the Goals field, enter the following command:

      deploy -fn -e
    3. Click Advanced, then configure the following values:

      1. POM: enter the path to the global BuildJob_pom.xml on the server
      2. Properties: Enter the Maven parameter to indicate the URL of the Nexus snapshots repository for deployment, for example http://localhost:8081/nexus/content/repositories/snapshots/
      3. Settings file:

        1. Select Settings in file system.
        2. File path: configure the path to the Maven settings file, for example /opt/maven/apache-maven-3.3.9/conf/settings.xml.

          path_to_maven3.png

           

  9. On the Post-build Actions tab, add a post-build action.

    1. Choose AWS CodePipeline Publisher.

      post-build.png

    2. Do not add any output locations.
    3. Click Save.
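
Before relying on Jenkins, it can be useful to try the equivalent Maven invocations by hand on the EC2 instance. The sketch below is an approximation of what the three build steps run, not an exact reproduction of what Jenkins executes. The POM path assumes BuildJob_pom.xml sits in the CI-Builder directory under /opt/talend. The deploy line assumes the deploy phase together with the Maven deploy plugin's altDeploymentRepository property in its id::layout::url form, and a repository id of tac matching settings.xml. Commands are only printed unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: command-line approximations of the three Jenkins build steps.
POM="/opt/talend/Talend-CI-Builder-20161216_1026-V6.3.1/BuildJob_pom.xml"   # assumption
SETTINGS="/opt/maven/apache-maven-3.3.9/conf/settings.xml"
SNAPSHOTS_URL="http://localhost:8081/nexus/content/repositories/snapshots/"

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# BuildSources: generate the sources with Talend CI-Builder.
run mvn -f "$POM" -s "$SETTINGS" org.talend:ci.builder:6.3.1:local-generate

# RunTests: run the tests, never failing the build (-fn) and showing errors (-e).
run mvn -f "$POM" -s "$SETTINGS" test -fn -e

# DeployToNexus: deploy to the snapshots repository (property value is an assumption).
run mvn -f "$POM" -s "$SETTINGS" deploy -fn -e \
    "-DaltDeploymentRepository=tac::default::$SNAPSHOTS_URL"
```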

 

Create and configure the AWS CodePipeline

If you are not familiar with CodePipeline, reading What is AWS CodePipeline? before proceeding may be helpful.

Create the Talend Pipeline

  1. Connect to AWS CodePipeline.
  2. Click Create Pipeline.

    • AWS may show you a Welcome screen with a Get Started button if you have never used CodePipeline before. Click Get Started.
  3. Pipeline name: TalendPipeline.

    pipeline_name.png

     

  4. Click Next Step.
  5. Source location: select GitHub in the drop-down list.

    src_location.png

     

  6. Connect to GitHub.

    1. Click Connect to GitHub. This opens a new window where you authorize AWS CodePipeline to access GitHub. Follow the instructions and provide the required credentials.

      connect_github.png

       

    2. Once the connection is configured, click the Repository field in the Create Pipeline wizard to display the list of your GitHub repositories, including the DEMOS repository that you created before.

      • Repository: Select the DEMOS repository
      • Branch: master

        Note: if you do not see the master branch, verify that you have previously created a Talend project hosted on GitHub through the Talend Administration Center dashboard.

         

        connect_github2.png

  7. Click Next Step.

  8. Add Jenkins.

    1. Build provider: select Add Jenkins

      build_provider.png

       

    2. Add Jenkins

      • Provider name = JenkinsBuildSources
      • Server URL = your Jenkins URL

        For example, http://xx.xxx.xx.xx:9080/jenkins, where xx.xxx.xx.xx is the IP address of the EC2 instance hosting Jenkins.

      • Project name = BuildSources

        add_jenkins.png

         

  9. Click Next Step.
  10. Beta—Deployment provider: select No Deployment.

    deployment.png

     

  11. Click Next Step.
  12. Create a Service role.

    1. Click Create Role to create a new Service role.

      This opens a window with the new Role Summary and a message saying Choose Allow to grant AWS CodePipeline read and write access to resources in your AWS account.

    2. Click Allow to create the role.

      A new role is created and the role name is automatically populated.

      servce_role.png

       

  13. Click Next Step.
  14. Review your pipeline details, then click Create pipeline to proceed with the creation of the pipeline.

    review.png

     

  15. Congratulations! You have successfully created your pipeline.

    pipeline_complete.png

     

The pipeline contains only two stages at this point. Since you have created three different Jenkins projects (BuildSources, RunTests, and DeployToNexus), you must edit the pipeline and add stages to call the other two projects.

 

Edit and configure the Talend pipeline

  1. Click Edit to start editing the pipeline.

    edit_pipeline.png

     

  2. Add a new stage right after the Build stage, then configure it.

    1. Click Stage.

      build_stage.png

       

    2. Call the new stage Tests.

      tests.png

       

    3. Click Action. This displays the Add Action form.

      • Action category = Test
      • Test Actions:

        1. Action Name = JenkinsTests
        2. Test provider = Add Jenkins
      • Add Jenkins

        1. Provider Name = JenkinsRunTests
        2. Server URL = your Jenkins URL

          For example, http://xx.xxx.xx.xx:9080/jenkins, where xx.xxx.xx.xx is the IP address of the EC2 instance hosting Jenkins.

        3. Project name = RunTests

    4. Click Add action to save the new action.

      add_action.png

       

  3. Add another stage after the Tests stage.

    1. Click Stage and call the new stage DeployToNexus.

      deployToNexus.png

       

    2. Click Action. This displays the Add Action form.

      • Action category = Build
      • Build Actions:

        1. Action Name = DeployToNexus
        2. Build provider = Add Jenkins
      • Add Jenkins

        1. Provider Name = JenkinsDeployToNexus
        2. Server URL = your Jenkins URL

          For example, http://xx.xxx.xx.xx:9080/jenkins, where xx.xxx.xx.xx is the IP address of the EC2 instance hosting Jenkins.

        3. Project name = DeployToNexus
    3. Click Add Action to save the new action.

      add_action2.png

       

  4. Click Save pipeline changes.

    save.png

     

You just saved changes to your pipeline. If you need to update your pipeline again, click Edit, make your changes, then save again.

 

Test the architecture

Your pipeline is fully configured and ready to be tested. To do so, you need to go back into Talend Studio, make changes to your sample Job, then push the changes to GIT.

hello_job.png

 

AWS CodePipeline detects the source changes and automatically launches a new build that goes through all the phases: JenkinsBuildSources > JenkinsTests > JenkinsDeployToNexus.

pipeline_test.png

 

When all Jenkins phases are successfully executed, a new artifact is deployed in your Nexus repository.

new_repo_item.png
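
The pipeline can also be inspected without the console. The sketch below uses the AWS CLI to list the latest status of each stage and, optionally, to start a run manually. It assumes the AWS CLI is installed and configured and that the pipeline lives in the Frankfurt region (eu-central-1). Commands are only printed unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: check the state of TalendPipeline and trigger a run from the command line.
PIPELINE="TalendPipeline"
REGION="eu-central-1"   # Frankfurt

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Show the latest execution status of every stage as a table.
run aws codepipeline get-pipeline-state --name "$PIPELINE" --region "$REGION" \
    --query 'stageStates[].[stageName,latestExecution.status]' --output table

# Start a new execution without pushing a change to the repository.
run aws codepipeline start-pipeline-execution --name "$PIPELINE" --region "$REGION"
```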

 

For more information about pipelines, see the AWS Create a Four-Stage Pipeline tutorial.

Version history
Revision #: 24 of 24
Last update: 09-19-2017 09:45 AM