Talend 7.2.1 CI zero install workflow explained

Introduction

Talend, as a Java code generator, leverages a standard Maven-based Continuous Integration (CI) implementation.

 

In the past, setting up a Talend CI environment was not a straightforward process. It required you to build or extend a dedicated CI agent/slave with a pre-installed and configured Talend CommandLine. This requirement added an extra step in the implementation of a common Java-based CI/CD environment. It also prevented the usage of existing standard Maven-based agents generally available on cloud CI managed services, such as AWS CodeBuild, Azure DevOps, and GitLab.

 

Talend 7.2.1 introduces a frictionless CI/CD implementation with the concept of Talend CI zero install workflow, where the required Talend CommandLine is now downloaded and installed, on demand, during the build process.

 

Implementation workflow

Talend Studio/CommandLine is built on the Eclipse platform. The zero install workflow is based on the Eclipse P2 provisioning system. P2 provides a way to automate the installation of applications on the Eclipse platform. For more information, see the Eclipse Foundation Equinox p2 documentation page.

 

The automated download and installation of the CommandLine is done using the Talend builder Maven plugin called CI Builder.

 

Important: The CI Builder plugin (builder-maven-plugin) implements the zero install workflow, so it must be available to Maven during the build. On its first execution, Talend Studio/CommandLine 7.2.1 unpacks its own version of the CI Builder plugin, along with the other Talend Maven plugins (cloudpublisher-maven-plugin and signer-maven-plugin), into the Studio/CommandLine local Maven .m2 repository (.m2 package: org.talend.ci). These plugins are then uploaded automatically to the Nexus/Artifactory talend-custom-libs-release repository during Studio sync, so in the CI environment Maven can access them directly from this repository. For this to work, the talend-custom-libs-release repository must be referenced in the pluginRepositories section of the Maven settings.xml file:

  <pluginRepositories>
    <pluginRepository>
      <id>central</id>
      <name>central</name>
      <url>${env.TALEND_NEXUS_URL}repository/maven-central/</url>
      <layout>default</layout>
    </pluginRepository>
    <pluginRepository>
      <id>talend-custom-libs-release</id>
      <name>nexus-third-party</name>
      <url>${env.TALEND_NEXUS_URL}repository/talend-custom-libs-release</url>
      <layout>default</layout>
    </pluginRepository>
  </pluginRepositories>

In 7.1.1, these Talend Maven plugins are unpacked into the Studio local .m2 repository. However, unlike in 7.2.1, they are not uploaded to the remote talend-custom-libs-release repository, so a separate installation/upload into a thirdparty repository is required.

 

If a 7.1.1-compatible configuration is required, for instance in the context of an upgrade to 7.2.1, Talend released a separate download package for CI Builder named Talend-CI-Builder-Maven-Plugin-20190620_1446-V7.2.1.zip, for a dedicated install in the thirdparty repository.

nexus3-thirdparty.png

 

CI Builder checks whether the CommandLine is installed locally. If not, it uses the value of the new Maven parameter, updatesite.path, to download and install it. The installation process is done in two steps:

  1. CI Builder downloads a set of p2 plugins from the update site.
  2. Then it runs the p2 installer to download, install, and configure the CommandLine from the same update site.

 

CI Builder checks for the presence of the CommandLine installation in the folder set by the product.path parameter. This parameter was required in Talend 7.1.1, but it is optional in 7.2.1. To avoid installation issues related to file permissions, Talend recommends that you do not set it.

  • If product.path is not set, the p2 installer uses the user's home directory as the root folder for the installation path.

    • Linux: ${user.home}/.installation/.commandline
    • Windows: C:\Users\<current user>\.installation\.commandline
  • If it is set, the plugin uses that path as the target location to install the CommandLine, for example, /opt/talend/commandline. The user running the Maven command (such as Jenkins in a Jenkins agent) must have write access to this directory. If it doesn't, the installation, and therefore the build, will fail.

By omitting the product.path parameter, you guarantee that each Maven user will have write access to their own home directory.
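
The lookup described above can be sketched in a few lines of shell. This is only an illustration of the documented behavior, not CI Builder's actual code; the function names and the PRODUCT_PATH variable are hypothetical.

```shell
#!/bin/sh
# Sketch of the documented lookup only; this is not CI Builder's real code.

# Print the directory where the CommandLine is expected.
# $1 (optional): the product.path value; empty or absent means "not set".
resolve_commandline_dir() {
  if [ -n "${1:-}" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "$HOME/.installation/.commandline"
  fi
}

# Say whether a build would find an existing installation or not.
commandline_status() {
  cl_dir="$(resolve_commandline_dir "${1:-}")"
  if [ -d "$cl_dir" ]; then
    echo "present: $cl_dir"
  else
    echo "missing: $cl_dir"
  fi
}

# PRODUCT_PATH is a hypothetical variable standing in for -Dproduct.path.
commandline_status "${PRODUCT_PATH:-}"
```

Running it on a build agent before the first build should report the directory as missing; on a persistent agent, later runs should report it as present.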

 

Workflow diagram

The following diagram summarizes the overall workflow:

maven-build.png

 

Execution log

As mentioned earlier, the first time the Jenkins Job is executed, CI Builder checks whether a Talend CommandLine is installed at the product.path location. If not, it starts the P2 installation using the updatesite.path value. The Maven build log looks like this:

 

job_log1.png

 

In the context of a persistent build agent (VM or physical server) where the filesystem is preserved, subsequent builds will leverage the newly installed CommandLine without a new installation.

 

job_log2.png

 

Talend license file

The newly installed CommandLine does not contain the license file. The license file location is set by the new license.path parameter. The value can be a local file path or a URL from a download site.

 

Talend 7.2.1 Maven command

A Maven command in Talend 7.2.1 CI looks like this:

mvn -s <settings.xml path> \
-f <project name>/poms/pom.xml \
-am -pl <comma separated list of jobs modules> \
-D generation.type=local \
-D license.path=<local path or url> \
-D updatesite.path=<local path or url> \
clean package

Note the use of the new license.path and updatesite.path parameters.

 

Note: The generation.type=local parameter is still required, even though generation.type=server, needed to use the CommandLine in server mode, has been deprecated since Talend 7.1.1.
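
As a concrete illustration, the following sketch fills in the placeholders of the command above from shell variables. All values (settings path, project name, module path, license location, update site URL) are hypothetical and must be adapted. By default the script only prints the command; set DRY_RUN=0 to run Maven.

```shell
#!/bin/sh
# Sketch: assemble the Talend 7.2.1 CI Maven call from variables.
# Every default value below is a placeholder, not a real location.
SETTINGS_XML="${SETTINGS_XML:-./settings.xml}"
PROJECT_NAME="${PROJECT_NAME:-MYPROJECT}"
JOB_MODULES="${JOB_MODULES:-jobs/process/my_job_0.1}"
LICENSE_PATH="${LICENSE_PATH:-/opt/talend/license}"
UPDATESITE_PATH="${UPDATESITE_PATH:-http://updates.example.com/talend/p2}"

talend_build() {
  # Build the argument list in the function's positional parameters.
  set -- mvn -s "$SETTINGS_XML" \
    -f "$PROJECT_NAME/poms/pom.xml" \
    -am -pl "$JOB_MODULES" \
    -Dgeneration.type=local \
    "-Dlicense.path=$LICENSE_PATH" \
    "-Dupdatesite.path=$UPDATESITE_PATH" \
    clean package
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"   # print only
  else
    "$@"        # run Maven
  fi
}

talend_build
```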

 

Maven parameters

In addition to the updatesite.path and license.path parameters, additional parameters are available for the configuration of the zero install:

  • forceUpdate: Forces the installation of the CommandLine from the update site, even if a local installation is present
  • p2Installer.path: Sets the folder where the p2 installer is installed. The default value is ${user.home}/.installation/.p2Installer

In addition to the product.path parameter for the CommandLine installation, you can also use the p2Installer.path to specify a path other than the default user home directory, though Talend recommends letting CI Builder use the default value.

 

If the HTTP site is secured with Basic Authentication, the provided updatesite.remote.user and updatesite.remote.password are used as credentials.

updatesite.remote.user
updatesite.remote.password

 

If the license.path value is a URL, the license file can also be downloaded automatically. license.remote.user and license.remote.password are the credentials for the Basic Authentication.

license.remote.user
license.remote.password

 

Talend Job dependencies are listed in the Job pom.xml definition and are resolved (downloaded) by Maven using the Maven settings.xml file. Talend Custom components, however, are not listed in the pom.xml file and are unknown to Maven. They must be downloaded by CommandLine. For this reason, you must set the location (Maven repository) of these components in the install-path/configuration/config.ini file.

components.nexus.url
components.nexus.repository
components.nexus.user
components.nexus.password

 

Prior to 7.2.1, those parameters were configured during the installation of CommandLine. In 7.2.1, if present, these parameters are used to update the newly installed config.ini file:

nexus.url=<nexus url>
nexus.lib.repo=<nexus repository>
nexus.user=<nexus user>
nexus.password=<nexus password>
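
To illustrate the kind of update involved, the sketch below sets a key in a properties-style file such as config.ini by hand: it replaces the line if the key exists and appends it otherwise. It is not CI Builder's implementation; the function name and the CONFIG_INI, NEXUS_URL, and NEXUS_REPO variables are hypothetical.

```shell
#!/bin/sh
# Set KEY=VALUE in a properties-style file.
# $1: file path, $2: key, $3: value
set_ini_value() {
  ini_file="$1"; ini_key="$2"; ini_value="$3"
  ini_tmp="$(mktemp)"
  # Keep every line whose key (text before the first "=") differs.
  awk -F= -v k="$ini_key" '$1 != k' "$ini_file" > "$ini_tmp"
  # Append the new value for the key.
  printf '%s=%s\n' "$ini_key" "$ini_value" >> "$ini_tmp"
  # Write back in place so the file keeps its permissions.
  cat "$ini_tmp" > "$ini_file"
  rm -f "$ini_tmp"
}

# Only touch a file when CONFIG_INI is explicitly provided.
if [ -n "${CONFIG_INI:-}" ]; then
  set_ini_value "$CONFIG_INI" nexus.url "${NEXUS_URL:-}"
  set_ini_value "$CONFIG_INI" nexus.lib.repo "${NEXUS_REPO:-}"
fi
```

With zero install, this step normally should not be needed by hand; the sketch can help when checking what a freshly installed config.ini is expected to contain.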

 

In addition to the p2 and CommandLine installation, CI Builder can also download and install Studio/CommandLine patches from a comma-separated list of URLs. patch.remote.user and patch.remote.password provide the credentials for Basic Authentication.

patch.path (comma separated list of patch urls)
patch.remote.user
patch.remote.password
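
When several patches are involved, the patch.path value can be assembled rather than typed by hand. The following is a minimal sketch; the helper name and the patch URLs are hypothetical.

```shell
#!/bin/sh
# Join patch URLs (one per argument) into a comma-separated value.
join_patch_urls() {
  joined=""
  for url in "$@"; do
    if [ -z "$joined" ]; then
      joined="$url"
    else
      joined="$joined,$url"
    fi
  done
  printf '%s\n' "$joined"
}

# Hypothetical patch locations.
PATCH_PATH="$(join_patch_urls \
  "https://updates.example.com/patches/Patch_A.zip" \
  "https://updates.example.com/patches/Patch_B.zip")"

# Option to append to the Maven command line.
echo "-Dpatch.path=$PATCH_PATH"
```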

 

P2 update site setup

As stated in the Talend Software Development Life Cycle Best Practices Guide, on the Before scheduling the execution of your artifacts page, the update site can be set using a standard HTTP server (Apache, NGINX, and others) or an application server such as Tomcat.

 

AWS S3 web site

Using the HTTP server approach, instead of installing and configuring a dedicated HTTP server for the update site, you can configure an AWS S3 bucket as a static content web site.

 

Follow the AWS documentation on static web site hosting to create and configure an S3 bucket for this purpose.

When the S3 web site is set (including the public access bucket policy), download the Talend_Full_Studio_p2_repository-20190620_1446-V7.2.1.zip file from the link provided in your license email. Unpack it and upload the full content to the bucket using the AWS S3 web site or AWS CLI.
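
With the AWS CLI, the upload can be sketched as follows. The bucket name and the name of the unpacked folder are assumptions, and the bucket and its public access policy are assumed to exist already. By default the script only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Hypothetical names: adjust to your environment.
BUCKET="${BUCKET:-my-talend-p2-site}"
P2_DIR="${P2_DIR:-./Talend_Full_Studio_p2_repository-20190620_1446-V7.2.1}"

# Run a command, or just print it when DRY_RUN=1 (the default here).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

publish_update_site() {
  # Enable static web site hosting on the bucket.
  run aws s3 website "s3://$BUCKET/" --index-document index.html
  # Upload the unpacked p2 repository content to the bucket root.
  run aws s3 sync "$P2_DIR" "s3://$BUCKET/"
}

publish_update_site
```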

 

An S3 bucket web site has been set for this purpose. The content looks like this:

s3bucket.png

 

Note: The index.html file is optional. It's not part of the p2 package. It was added to display a default html page when accessing the site URL from a browser.

 

The update site URL naming convention is based on the bucket name and the region where you created the bucket:

http://<bucket name>.s3-website-<region>.amazonaws.com

s3bucketwebsite.png
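
As a small illustration, the update site URL can be derived from the bucket name and region. The bucket name and region below are hypothetical. Note that some AWS regions use a dot instead of a dash after s3-website, so check the endpoint shown in the S3 console for your bucket.

```shell
#!/bin/sh
# Print the S3 static web site endpoint (dash-style regions).
# $1: bucket name, $2: AWS region
s3_website_url() {
  printf 'http://%s.s3-website-%s.amazonaws.com\n' "$1" "$2"
}

# Hypothetical bucket and region.
s3_website_url my-talend-p2-site us-east-1
# prints: http://my-talend-p2-site.s3-website-us-east-1.amazonaws.com
```

The resulting URL is the value to pass as updatesite.path.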

 

Alternatives

 

Azure Storage

Similar to AWS S3, Azure provides a way to configure Azure Storage as a static web site. For more information, see the Microsoft Azure Static website hosting in Azure Storage page.

 

Nexus

To consolidate all the Talend CI assets in a common location, Nexus can be used to store the p2 site, in addition to the Talend Maven Job dependencies and third-party repositories.

 

Nexus 2 Pro natively supports the P2 repository type. You can add this feature to the Nexus OSS by installing an additional plugin. For more information, see the Sonatype P2 Repositories page.

 

At the time of this writing, Nexus3 doesn't support the P2 repository yet but an RFE ticket has been opened for it. For more information, see the Sonatype Nexus 3.x P2 Repository Format Support page.

 

Artifactory

Like Nexus 2 Pro, Artifactory supports P2 repositories. For more information, see the JFrog Artifactory P2 Repositories page.

 

Implementation: Jenkins

The following section shows you how to leverage the CI zero install in Jenkins.

 

P2 update site as global variable

Instead of copying the value of the update site into each Jenkins Job, Talend recommends that you create a Jenkins Global property that will be injected as an environment variable in each of them.

 

  1. In Jenkins, navigate to the Manage Jenkins > Configure System menu.

    jenkins-config-system.png

     

  2. Scroll to the Global properties section and add a new property/variable. Use the same technique for the Nexus (or Artifactory) root URL.

    jenkins_globalvars2.png

 

Talend license file as a secret file

As mentioned earlier, the license.path can identify either a local file path or a URL where the license can be downloaded from.

 

If the license is downloaded from a URL, the site should be secured by, at least, Basic Authentication. AWS S3 bucket web site doesn't support a direct security layer of this type (unless combined with Amazon CloudFront) so the other technique used here is to save the license file as a secret file in Jenkins. The file is encrypted in Jenkins and copied, on demand, on the build agent requesting it. The file is deleted by Jenkins at the end of the build. This will guarantee some protection against sharing the license file. Other CI platforms, such as Azure DevOps and GitLab, support a similar feature.

 

  1. In Jenkins, navigate to the Credentials > System > Global credentials (unrestricted) > Add Credentials menu.

  2. Add a new credential of type Secret file. Set a unique ID and upload the license file using the Browse... button.

    jenkins_talendlicense.png

    The next sections show you how these two variables can be leveraged in the Jenkins Jobs.

 

Jenkins Job: Freestyle

  1. In a Jenkins Freestyle Job definition, scroll to the Bindings section, then select Secret file from the Add drop-down list.

    job-add-secretfile.png

     

  2. Provide a name for the global variable that will reference the path of the license file copied by Jenkins during the build.

  3. In the Credentials list, select the Secret file > Specific credentials you previously set (it may be selected automatically).

    jenkins_license_secretfile.png

     

  4. In the Invoke top-level Maven targets build step, you can now initialize the license.path and updatesite.path Maven parameters with their respective environment variables.

    jenkins_maven_step.png

     

Jenkins Job: Declarative Pipeline

The same approach can be used in a Pipeline Job.

 

Set the TALEND_721_LICENSE variable using the credentials directive in the environment section. It is then referenced in the Maven command as an environment variable (env.TALEND_721_LICENSE), in the same way as env.TALEND_CMDLINE_UPDATE_SITE.

jenkins-pipeline.png
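
Inside a shell build step, the two variables can be checked and turned into Maven options as sketched below. This is not the content of the Pipeline shown above; it assumes Jenkins has exported TALEND_721_LICENSE (the path of the secret file on the agent) and TALEND_CMDLINE_UPDATE_SITE (the global property), and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch of a shell build step. Jenkins is assumed to have exported:
#   TALEND_721_LICENSE          path of the secret license file on the agent
#   TALEND_CMDLINE_UPDATE_SITE  URL of the p2 update site (global property)

# Return non-zero, with a message, if either variable is missing.
check_talend_env() {
  if [ -z "${TALEND_721_LICENSE:-}" ]; then
    echo "TALEND_721_LICENSE is not set" >&2
    return 1
  fi
  if [ -z "${TALEND_CMDLINE_UPDATE_SITE:-}" ]; then
    echo "TALEND_CMDLINE_UPDATE_SITE is not set" >&2
    return 1
  fi
}

# Print the two zero install options, one per line.
talend_ci_options() {
  printf '%s\n' \
    "-Dlicense.path=$TALEND_721_LICENSE" \
    "-Dupdatesite.path=$TALEND_CMDLINE_UPDATE_SITE"
}

if check_talend_env; then
  talend_ci_options
fi
```

Failing early on a missing variable gives a clearer error than a Maven build that stops during the CommandLine installation.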

 

Container-based build agent

One important aspect of the introduction of zero install is the flexibility to set up container-based build agents, which are starting to become the standard for a CI/CD environment. Jenkins provides different ways to implement these build agents, thanks to a vast plugin ecosystem that includes Docker, Kubernetes, and Cloud providers' own container-based solutions such as AWS ECS and Fargate.

 

This article uses Docker. Numerous plugins exist to configure Docker container build agents. The section below shows one of the most popular.

 

Docker plugin

If you already have a Jenkins server set up, verify that you have the Docker plugin installed. If not, go to the menu Manage Jenkins > Manage Plugins and install the Docker plugin:

docker_plugin.png

 

Docker cloud

You need to configure the Docker installation. This is done on the Jenkins Cloud tab.

  1. Navigate to Manage Jenkins > Configure System.

  2. Scroll to the end of the page to add a new Cloud entry. Select Docker.

    docker_cloud.png

    Note: You can use Kubernetes or AWS ECS as well, but they require additional configuration. That will be the topic of another article.

  3. Enter a unique name and the host URI. You can set multiple Docker cloud definitions based on the number of agents you have.

    For this example, Docker has been installed on the Jenkins server and will be used as the main Docker agent provider, hence the name talend-721-docker-local.

  4. Because the Docker daemon is installed on the Jenkins server, the Docker Host URI needs to be set using the UNIX socket connector unix:///var/run/docker.sock, or left blank if the DOCKER_HOST environment variable is set on the server.

    docker_cloud-def.png

     

    As the documentation mentions (see the screenshot above), if the daemon is installed on a remote agent, it must be configured appropriately and the Docker Host URI must be set to tcp://<agent host>:2376. Additional server credentials can be set if necessary.

 

Docker Agent templates

As mentioned in the introduction, Talend 7.2.1 CI zero install allows you to leverage standard agents for the build process without the need to pre-install the CommandLine.

 

To demonstrate this, configure an agent based on the official Docker Hub Maven image.

maven-agent.png

In addition to Maven, the image already contains the required JDK and Git client.

 

  1. Click the Docker Agent templates button to create the agent definition.

    docker-agent-template.png

     

  2. On the configuration page, set a unique label for the agent. This will be the label used in the Jenkins Jobs. For the Docker image, adding the name maven specifies the latest Maven image. If a specific version is required, you can indicate it, for example maven:3.6.1-jdk-11.

    maven-agent-def.png

     

    Important: If the Talend build uses the Maven docker profile (-P docker) to build Job images, the agent needs to use the Docker daemon from the host (the Jenkins server in this case) as the maven image doesn't have Docker installed.

  3. To configure the agent for this purpose, click the Container settings button and add /var/run/docker.sock:/var/run/docker.sock in the Volumes field.

    docker_cloud-def2.png

    Talend recommends keeping this configuration by default to support all types of builds.

     

Due to the ephemeral nature of containers, the build environment created in the container will be discarded at the end of each build, including the CommandLine installation. This guarantees a consistent and reproducible build environment. However, even though the CommandLine size has been reduced considerably, it still takes a non-negligible time (about two minutes, depending on your environment) to download and install it each time.

 

If build speed is a concern, you can make the CommandLine installation persistent across all the builds and among all the containers generated from the same template by mounting a dedicated volume. Using the official maven image, the default maven user is root and its home directory is set to /root.

 

Based on the default value of product.path (${user.home}/.installation/.commandline), the CommandLine will be installed at /root/.installation/.commandline. To make the CommandLine installation persistent, set a Docker volume to point to this folder. The volume can be a named volume, or can just point to a host folder. This example uses the second option, and points to the host /tmp folder.

 

On the Docker agent template > Container Settings tab, add the mapped volume /tmp:/root/.installation/.commandline under the existing /var/run/docker.sock:/var/run/docker.sock volume you set earlier.

cmdline_volume_mount.png
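
To check the volume mappings outside Jenkins, a roughly equivalent docker run command can be printed with the sketch below. The Jenkins Docker plugin starts the container itself, so this is only a manual approximation; the image tag and host folder are the examples used in this article, and the function name is hypothetical.

```shell
#!/bin/sh
# Values taken from the examples in this article; adjust as needed.
IMAGE="${IMAGE:-maven:3.6.1-jdk-11}"
CMDLINE_CACHE="${CMDLINE_CACHE:-/tmp}"

# Print a docker run command with both volumes mounted.
# Arguments: the command to run inside the container.
agent_run_command() {
  echo docker run --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v "$CMDLINE_CACHE:/root/.installation/.commandline" \
    "$IMAGE" "$@"
}

agent_run_command mvn -version
```

Copying the printed command into a terminal on the Docker host starts a throwaway container with the same mounts, which is a quick way to confirm that the CommandLine folder persists between runs.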

 

Usage

Freestyle Job

  1. In a Freestyle Job, check the option Restrict where this project can be run (just above the Source Code Management section) and enter the name of the agent you set earlier.

     

    Be sure that it is recognized by Jenkins by checking the message underneath.

    freestyle-docker-agent.png

     

Pipeline Job

In a Pipeline Job, the agent is set in the agent directive using the label option.

pipeline-docker-agent.png

 

Version history
Revision 50 of 50. Last update: 08-13-2019 04:46 AM