How to schedule a Talend Job with Kubernetes

This article explains how to schedule a Talend Job as a Kubernetes Job and how to use Kubernetes as a Job orchestrator.


Sources for this project are available in the attached Zip files.


Talend Job



  1. Create a Talend Job to generate random data and display it in the output console.




  2. Configure the tRowGenerator component as follows:

    • Identifier: A random value for each row

    • Firstname: A generated first name

    • Lastname: A generated last name

    • City: A generated city

  3. Specify the number of rows to generate using the context variable numRow, as shown in Figure 2. Exposing the row count as a context variable lets you configure the Job from Kubernetes later.






You will find the Job Logrow 0.1 in the file attached to this article.



Before you can build a container image, you need to build the Job package. To build a Job, follow these steps:

  1. Right-click your Job, then select Build Job.


    Figure 4: Build Job

  2. Keep the default configuration and click Finish.
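Before containerizing the Job, it can help to run the built package locally. The sketch below assumes the build produced an archive named Logrow_0.1.zip that contains the launcher Logrow/Logrow_run.sh, which is the usual layout of a standalone Talend Job build; adjust the names if your archive differs. The variables JOB_ZIP, WORKDIR, and NUMROW are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Assumptions: the archive name and launcher path follow the usual layout
# of a standalone Talend Job build; adjust them to match your own build.
JOB_ZIP="${JOB_ZIP:-Logrow_0.1.zip}"
WORKDIR="${WORKDIR:-./logrow-local}"
NUMROW="${NUMROW:-3}"

# Talend launchers accept --context_param to override a context variable.
RUN_ARGS="--context_param numRow=${NUMROW}"

if [ -f "$JOB_ZIP" ]; then
  mkdir -p "$WORKDIR"
  unzip -o -q "$JOB_ZIP" -d "$WORKDIR"
  # Run the Job with the overridden row count.
  sh "$WORKDIR/Logrow/Logrow_run.sh" $RUN_ARGS
else
  echo "Archive $JOB_ZIP not found. Command to run after unzipping:"
  echo "sh Logrow/Logrow_run.sh $RUN_ARGS"
fi
```

The same --context_param override is what the container's run command relies on later, with the value taken from an environment variable.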




Configuring the Dockerfile

Now that you have a zip file containing your batch process, you need to configure a Dockerfile before you can build the Docker image. For this example, the Dockerfile and all other mentioned files are in the file attached to this article.


The Dockerfile is composed of three sections:

  • Arguments
  • Java download
  • Job configuration



Arguments

This section contains all the arguments needed to configure the build.




Java download

This section uses a multi-stage build, with a step dedicated to downloading a JRE. This download uses parameters from the arguments section.




Job configuration

This section explains how to build the container image for your Job. You can split this section into multiple parts:


  • Arguments: map some of the parameters defined in the first section of the Dockerfile



  • Labels: allow you to define and document your image



  • Environment variables: provide information and help to configure the running process


    In this example, the variable NUMROW allows you to configure the context variable numRow.


  • Installation: installs the JVM and your Job in a folder /opt/talend



  • Run User: the process runs as the user talend



  • Run command: the CMD runs the Job at each startup, using the environment variable NUMROW to overwrite the context variable
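To make the sections above concrete, here is a minimal sketch of such a Dockerfile, written to a hypothetical folder logrow-image by a small shell script. It is not the Dockerfile attached to this article. In particular, it swaps the JRE download step for a base image that already ships a JRE and uses the first build stage only to unpack the Job archive. The base images (alpine:3.19, eclipse-temurin:8-jre), the argument names JOB_NAME and JOB_VERSION, and the archive layout are all assumptions.

```shell
#!/bin/sh
# Writes a sketch Dockerfile into a hypothetical build folder.
BUILD_DIR="${BUILD_DIR:-./logrow-image}"
mkdir -p "$BUILD_DIR"

# The quoted 'EOF' keeps ${...} literal so Docker expands it, not the shell.
cat > "$BUILD_DIR/Dockerfile" <<'EOF'
# Arguments (assumed names): identify the Job archive to package.
ARG JOB_NAME=Logrow
ARG JOB_VERSION=0.1

# Stage 1: unpack the Job archive (stands in for the JRE download stage).
FROM alpine:3.19 AS unpack
ARG JOB_NAME
ARG JOB_VERSION
COPY ${JOB_NAME}_${JOB_VERSION}.zip /tmp/job.zip
RUN mkdir -p /opt/talend && unzip -q /tmp/job.zip -d /opt/talend

# Stage 2: runtime image. Assumption: a base image that already has a JRE.
FROM eclipse-temurin:8-jre
ARG JOB_NAME
ARG JOB_VERSION

# Labels: document the image.
LABEL description="Talend Job ${JOB_NAME}" version="${JOB_VERSION}"

# Environment variables: NUMROW feeds the numRow context variable.
ENV NUMROW=10 JOB_NAME=${JOB_NAME}

# Installation: place the Job in /opt/talend, owned by the talend user.
COPY --from=unpack /opt/talend /opt/talend
RUN useradd --system --home-dir /opt/talend talend \
    && chown -R talend /opt/talend

# Run user.
USER talend
WORKDIR /opt/talend

# Run command: start the Job, overriding numRow from NUMROW.
CMD ["/bin/sh", "-c", "sh ${JOB_NAME}/${JOB_NAME}_run.sh --context_param numRow=${NUMROW}"]
EOF

echo "Wrote $BUILD_DIR/Dockerfile"
```

Because the run command goes through a shell, NUMROW is read when the container starts, so a different value can be supplied at run time without rebuilding the image.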



Once your Dockerfile is configured, copy the Job zip file you built earlier into the same folder as the Dockerfile.



Building the Docker image

  1. To build the image, run the following command in the folder where the Dockerfile and the zip file are located:

    docker build -t username/logrow:0.1.0 .

    Replace username with your Docker Hub username.


  2. At the end of the build process, you should see something like:

    Successfully built 6ef71caf6a90
    Successfully tagged username/logrow:0.1.0


  3. Test your image by running the following command:

    docker run --rm -i -e NUMROW=3 username/logrow:0.1.0






Pushing your Docker image

Push your image to a registry so that the Kubernetes cluster can pull it. This example uses the public Docker Hub registry. Replace username in the command below with your own username, or you will not be able to push your image.

docker push username/logrow:0.1.0


A second option, common in many companies, is to use a private registry.
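For a private registry, the usual approach is to retag the image with the registry host as a prefix and push that tag. The sketch below uses registry.example.com as a placeholder host; it is not a real registry, and the variable names are hypothetical.

```shell
#!/bin/sh
# Hypothetical values: replace with your own registry host.
REGISTRY="${REGISTRY:-registry.example.com}"
LOCAL_IMAGE="username/logrow:0.1.0"
REMOTE_IMAGE="${REGISTRY}/${LOCAL_IMAGE}"

# If the registry requires authentication, run: docker login "$REGISTRY"
if command -v docker >/dev/null 2>&1; then
  # Add a second tag that includes the registry host, then push it.
  docker tag "$LOCAL_IMAGE" "$REMOTE_IMAGE"
  docker push "$REMOTE_IMAGE"
else
  echo "docker not found; image to push would be $REMOTE_IMAGE"
fi
```

The image reference used in the Kubernetes deployment then needs the same registry prefix, and a registry that requires credentials typically also needs an image pull secret in the cluster.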


Configuring Minikube and Helm



This example simulates a Kubernetes cluster with Minikube, which runs a single-node cluster hosted on a VirtualBox machine. To install Minikube, see Install Minikube on the Kubernetes documentation page.


Helm is a package manager for Kubernetes applications. It is very useful when you want to deploy multiple configurations as a single package. To install Helm, see Installing Helm on the Helm documentation page.



To prepare the environment, you need to initialize Minikube and Helm.



To initialize Minikube, type the following commands:

minikube start
minikube dashboard


You should see the Kubernetes dashboard:





To initialize Helm, type the following command:

helm init


You should see output similar to the following:

$HELM_HOME has been configured at /Users/username/.helm.
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Happy Helming!
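The helm init command and Tiller exist only in Helm 2. Helm 3 removed both, so no initialization step is needed there. The sketch below checks which major version is installed before initializing. helm_major is a hypothetical helper written for this example, and the parsing assumes the version text contains a tag such as v2.16.1 or v3.12.0.

```shell
#!/bin/sh
# helm_major (hypothetical helper): extracts the major version number
# from the text printed by "helm version --short".
helm_major() {
  printf '%s\n' "$1" | sed -n 's/.*v\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

if command -v helm >/dev/null 2>&1; then
  # --client keeps Helm 2 from contacting Tiller before it is installed.
  VERSION_TEXT="$(helm version --short --client 2>/dev/null)"
  case "$(helm_major "$VERSION_TEXT")" in
    2) helm init ;;
    3) echo "Helm 3 detected: no Tiller, nothing to initialize." ;;
    *) echo "Could not determine the Helm version from: $VERSION_TEXT" ;;
  esac
else
  echo "helm not found on PATH"
fi
```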


Deploying your Job

You are ready to deploy your application. In this example, you are deploying a Kubernetes CronJob.


A CronJob runs a Job on a schedule. In Talend Administration Center terms, it is comparable to a Job scheduled in the Job Conductor.


This application is composed of two Kubernetes objects:

  • ConfigMap: a key/value object that contains the numRow context value

    This way, you are able to change the configuration of a Job without having to redeploy.

  • CronJob: contains your Talend Job container specifications


Understanding a Helm chart

A Helm chart is composed of multiple files that, once deployed, represent a release. In this example, you will deploy a package named demo-job.




The demo-job folder contains the following:

  • Chart.yaml: represents the definition of a chart

  • values.yaml: contains values for all variables used to configure your templates



The templates folder contains the configuration files for your Kubernetes objects:

  • configMap.yaml: creates a ConfigMap object that defines the value of the numRow variable. In the data section, use the key numrow to map to the container NUMROW environment variable.

  • cronJob.yaml: contains the definition of the CronJob, such as the container image to use and the mapping of the NUMROW environment variable to the numrow key of the ConfigMap.
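As an illustration of how these four files fit together, the script below writes a minimal chart skeleton into a hypothetical folder sketch/demo-job, so that it does not overwrite the attached chart. The file contents are assumptions, not copies of the attached files: the values keys image, schedule, and numRow are invented for this example, and the CronJob uses the batch/v1 API, whereas the cluster in this article reports batch/v1beta1 (batch/v1 requires Kubernetes 1.21 or later).

```shell
#!/bin/sh
# Writes an illustrative chart skeleton; contents are assumptions.
CHART_DIR="./sketch/demo-job"
mkdir -p "$CHART_DIR/templates"

cat > "$CHART_DIR/Chart.yaml" <<'EOF'
apiVersion: v1
name: demo-job
version: 0.1.0
description: Runs the Logrow Talend Job on a schedule
EOF

cat > "$CHART_DIR/values.yaml" <<'EOF'
image: username/logrow:0.1.0
schedule: "*/5 * * * *"
numRow: 10
EOF

# ConfigMap: the numrow key holds the row count as a string.
cat > "$CHART_DIR/templates/configMap.yaml" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-{{ .Chart.Name }}
data:
  numrow: {{ .Values.numRow | quote }}
EOF

# CronJob: maps the container variable NUMROW to the ConfigMap key numrow.
cat > "$CHART_DIR/templates/cronJob.yaml" <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: {{ .Release.Name }}-{{ .Chart.Name }}
spec:
  schedule: {{ .Values.schedule | quote }}
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: {{ .Chart.Name }}
              image: {{ .Values.image | quote }}
              env:
                - name: NUMROW
                  valueFrom:
                    configMapKeyRef:
                      name: {{ .Release.Name }}-{{ .Chart.Name }}
                      key: numrow
EOF

echo "Wrote chart skeleton to $CHART_DIR"
```

With a release named my-release, both templates render the object name my-release-demo-job, which is what lets the CronJob find its ConfigMap.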



Deploying your Helm chart

  1. To deploy your chart, run the following command from inside the helm folder:

    helm install --name my-release --namespace talend ./demo-job

    This command contains:

    • --name my-release: configures the name of the release (replace my-release with desired release name)

    • --namespace talend: configures a new namespace called talend (replace as necessary)

    • ./demo-job: replace demo-job with the name of your package folder



    NAME:   my-release
    LAST DEPLOYED: Mon Mar 26 16:32:57 2018
    NAMESPACE: talend
    ==> v1/ConfigMap
    NAME                 DATA  AGE
    my-release-demo-job  1     0s
    ==> v1beta1/CronJob
    NAME                 KIND
    my-release-demo-job  CronJob.v1beta1.batch


  2. Verify that your objects were correctly deployed:
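Besides the dashboard, one way to check is from the command line with kubectl. The sketch below assumes the release name my-release and the namespace talend from the install command, and that kubectl points at the Minikube cluster.

```shell
#!/bin/sh
# Names taken from the helm install command; change them if yours differ.
NAMESPACE="talend"
RELEASE="my-release"
OBJECT_NAME="${RELEASE}-demo-job"

if command -v kubectl >/dev/null 2>&1; then
  # List the ConfigMap and CronJob created by the release.
  kubectl get configmap,cronjob --namespace "$NAMESPACE"
  # Show the schedule and recent events of the CronJob.
  kubectl describe cronjob "$OBJECT_NAME" --namespace "$NAMESPACE"
else
  echo "kubectl not found; objects to check: configmap/$OBJECT_NAME cronjob/$OBJECT_NAME"
fi
```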





One thing to understand when you deploy a CronJob:

  • A CronJob is a Kubernetes object that creates Jobs on a schedule. For each execution of a Job, Kubernetes creates a pod. The pod is where the container runs, and where you can find the logs.
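The chain from scheduled object to Job to pod can be followed with kubectl. This is a sketch under the assumption that the release runs in the namespace talend; it reads the logs of the most recently created pod, which should contain the rows printed by the Talend Job.

```shell
#!/bin/sh
NAMESPACE="talend"

if command -v kubectl >/dev/null 2>&1; then
  # Jobs created by the schedule, then the pods created for those Jobs.
  kubectl get jobs --namespace "$NAMESPACE"
  kubectl get pods --namespace "$NAMESPACE"

  # Pick the newest pod (sorted oldest to newest, so take the last line).
  POD="$(kubectl get pods --namespace "$NAMESPACE" \
    --sort-by=.metadata.creationTimestamp -o name | tail -n 1)"

  if [ -n "$POD" ]; then
    kubectl logs "$POD" --namespace "$NAMESPACE"
  else
    echo "No pods yet; wait for the next scheduled run."
  fi
else
  echo "kubectl not found; pods would be listed in namespace $NAMESPACE"
fi
```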







If you want to change the number of rows generated, edit the ConfigMap my-release-demo-job and change the value of numrow.
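The same change can be made without the dashboard by patching the ConfigMap. In this sketch, NEW_NUMROW is a hypothetical variable holding the new row count; the ConfigMap name and namespace are the ones used in this example.

```shell
#!/bin/sh
NAMESPACE="talend"
CONFIGMAP="my-release-demo-job"
NEW_NUMROW="${NEW_NUMROW:-5}"

# ConfigMap values are strings, so the number is quoted in the JSON patch.
PATCH="{\"data\":{\"numrow\":\"${NEW_NUMROW}\"}}"

if command -v kubectl >/dev/null 2>&1; then
  kubectl patch configmap "$CONFIGMAP" --namespace "$NAMESPACE" \
    --type merge -p "$PATCH"
else
  echo "kubectl not found; patch would be: $PATCH"
fi
```

Environment variables are resolved when a container starts, so the new value applies to pods created by later scheduled runs, not to a pod that is already running.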

Version history
Revision: 21 of 21
Last update: 02-25-2019 01:10 AM