Controlling access to data when Jobs run on a Talend execution server is a critical business requirement that clients look for in a platform. The need for compliance may stem from regulation or from internal business confidentiality. Users therefore often create separate service accounts to manage access control on the JobServer and execute Job tasks accordingly. Done outside the Talend toolset, however, this can be cumbersome and time-consuming, and it can limit audit capabilities.
When using Talend Administration Console (TAC), one improvement is to deploy multiple JobServers, each mapped to an individual service account that controls access to data. Once you lock the service account and JobServer combinations, no user can be configured to use the RUN AS capability for executing JobServer tasks, because sudo capability (in other words, running programs with the security privileges of another user) can be curtailed when creating the service accounts. Following this design approach, this article describes a simple Job design that automates the creation of multiple JobServers and configures them in TAC to enable access control when executing Jobs.
As shown in the diagram above, automating the creation of multiple JobServers involves the steps described below.
Note: This process is explained in the Talend Data Fabric installation guide provided as part of your product installation, and is available in the Talend Help Center. Navigate to Installing and Configuring Talend server modules, then scroll down to the Installing and configuring your JobServers section.
Details of the configurable default values of the Job context are listed below:
Edit the Default.properties file to set values that match the host environment where you plan to deploy the JobServer, using ports that are available on that host. Importantly, ensure there are no firewalls or other software rules restricting access to the configured ports.
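Before settling on port values, it can help to confirm that nothing is already listening on them. The following is a minimal Python sketch of such a check, not part of the Talend tooling. The function names are hypothetical, and the only parameter name taken from this article is MONITORING_PORT; any other port parameters in your Default.properties would need to be added by you.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if no process accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def busy_ports(ports: dict[str, int], host: str = "127.0.0.1") -> list[str]:
    """Return the names of the configured ports that are already in use."""
    return [name for name, port in ports.items() if not port_is_free(port, host)]


if __name__ == "__main__":
    # MONITORING_PORT and 8878 come from the example in this article;
    # extend the dictionary with the other ports from your properties file.
    print(busy_ports({"MONITORING_PORT": 8878}))
```

A check like this only detects local listeners. It says nothing about firewall rules between TAC and the JobServer host, which still need to be verified separately.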
If you don’t want to modify the Default.properties file directly, you can set the parameters at runtime by passing them inline when you execute the Job, for example:
JobServer_run.bat --context=Default --context_param service_account=svc_talend --context_param MONITORING_PORT=8878 ……
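When several JobServers are created in a row, the inline parameters can be assembled programmatically rather than typed by hand. The sketch below is one possible way to do this in Python; `build_job_command` is a hypothetical helper, and only the launcher name, the `--context` and `--context_param` options, and the two parameters shown above are taken from this article.

```python
import subprocess


def build_job_command(launcher: str, context: str, params: dict[str, object]) -> list[str]:
    """Build the argument list for a Job launcher with inline context parameters."""
    cmd = [launcher, f"--context={context}"]
    for name, value in params.items():
        # Each override is passed as: --context_param name=value
        cmd += ["--context_param", f"{name}={value}"]
    return cmd


if __name__ == "__main__":
    cmd = build_job_command(
        "JobServer_run.bat",
        "Default",
        {"service_account": "svc_talend", "MONITORING_PORT": 8878},
    )
    print(" ".join(cmd))
    # To actually launch the Job, uncomment the next line:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a parameter value contains spaces.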
After the Job finishes successfully, a new folder will be created in the same folder where the Archive file is placed. The folder will be named according to the service account name given.
Follow the instructions in the README.txt file to enable the JobServer as a Service.
In TAC, access to resources is provided through the project(s). All the resources are tied to the project, and then you give users access to the project(s).
In a Production TAC, you can create projects with the None storage option as shown below. In this case, there is no source repository like SVN or Git behind the project. The project is simply a label to which resources are attached, both for deploying the binaries and for security purposes.
Depending on how closely you want to manage access to the JobServers, and thus to the Service Accounts, you may decide to create one or more such projects. You can even create 21 such projects to enable you to manage access to the JobServers on a very granular level.
Once you have the projects created, you can assign Ops Users the Operation Manager role and give them Read access to the project(s). In the diagram below, User1 has access to two projects named TENANT1 and TENANT2.
Set up one JobServer for each Service Account by duplicating the JobServer directory so that you have one directory for each JobServer, as shown below:
Set up the RUN_AS_WHITELIST parameter for each JobServer, as shown below, to ensure that only the whitelisted user can execute Jobs through this JobServer.
The whole JobServer folder for each Service Account is owned by that Service Account. So only a root user, or a user that is allowed to sudo to that Service Account, can see the configuration files.
The JobServer is set up to start under that Service Account.
The Service Account has access to all folders required for Jobs running under that Service Account to work.
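The directory duplication and whitelist steps above could be scripted along the following lines. This is a sketch under stated assumptions, not the Job described in this article: the function names are hypothetical, the location of the configuration file inside the JobServer directory is passed in by the caller rather than assumed, and the code looks for an existing property whose key ends with RUN_AS_WHITELIST instead of guessing the full key name.

```python
import shutil
from pathlib import Path


def set_run_as_whitelist(properties_text: str, account: str) -> str:
    """Set the value of the existing property whose key ends with RUN_AS_WHITELIST."""
    lines = properties_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # skip comments and lines that are not key=value pairs
        key = stripped.split("=", 1)[0].strip()
        if key.endswith("RUN_AS_WHITELIST"):
            lines[i] = f"{key}={account}"
            found = True
    if not found:
        # Refuse to invent a key name; the property must already exist.
        raise KeyError("no RUN_AS_WHITELIST property found")
    return "\n".join(lines) + "\n"


def clone_jobserver(template_dir: Path, target_root: Path, account: str,
                    config_rel_path: str) -> Path:
    """Copy a JobServer directory for one service account and whitelist that account.

    config_rel_path is the path of the JobServer properties file relative to
    the JobServer directory (an assumption to verify against your installation).
    """
    target = target_root / account
    shutil.copytree(template_dir, target)  # fails if the target already exists
    config = target / config_rel_path
    updated = set_run_as_whitelist(config.read_text(encoding="utf-8"), account)
    config.write_text(updated, encoding="utf-8")
    return target
```

The sketch does not change file ownership or register the service. Handing the new directory over to the service account and starting the JobServer under that account remain separate, privileged steps.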
You need a Kerberos keytab for each Service Account. The keytab is placed in the home directory of that Service Account, so only processes started under that Service Account will be able to access the keytab file.
As an example, details on how to create a keytab on Windows are available on this ktpass page, but follow the correct process for your operating system.
Set up the JobServers in the TAC with the correct ports as shown below.
Associate the Server in the TAC to the corresponding project(s). You can associate one JobServer to one project, or many JobServers to one project. This depends on the level of granularity you want. In the beginning, it may be easier to assign one JobServer to one project.
Once you have set up all the projects and servers, limit the Rights of the Operation Manager to only the two items related to Job Conductor, as shown below.
Since the Production TAC only has projects with the None storage option, you need to provide Jobs as Zip files or from Nexus. When the Job Conductor imports a Zip from disk or from Nexus, it reads the jobInfo.properties file, which is included in the Zip file of the Job, to determine which project to link the Job to. To attach the Job to a project label for a different tenant, modify the project= attribute in this file to match your project in Production. This allows, for example, a binary Job built from a project called xyz_dev to be attached to project TENANT1 in Production. The change can be automated through a build process.
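As an illustration of how a build step might retarget a Job archive, the following Python sketch copies a Zip file and rewrites the project= line in jobInfo.properties along the way. The function names are hypothetical, and the sketch assumes the file is UTF-8 compatible text and is identified by its base name wherever it sits in the archive. Check both assumptions against a Zip produced by your own build.

```python
import zipfile


def set_project(properties_text: str, new_project: str) -> str:
    """Replace the value of the project= line, leaving all other lines unchanged."""
    out = []
    for line in properties_text.splitlines():
        if line.startswith("project="):
            line = f"project={new_project}"
        out.append(line)
    return "\n".join(out) + "\n"


def retarget_job_zip(src_zip: str, dst_zip: str, new_project: str) -> int:
    """Copy src_zip to dst_zip, pointing jobInfo.properties at new_project.

    Returns the number of jobInfo.properties files that were rewritten.
    """
    rewritten = 0
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.rsplit("/", 1)[-1] == "jobInfo.properties":
                data = set_project(data.decode("utf-8"), new_project).encode("utf-8")
                rewritten += 1
            # Reusing the original ZipInfo keeps names and timestamps intact.
            dst.writestr(info, data)
    return rewritten
```

For example, `retarget_job_zip("job.zip", "job_prod.zip", "TENANT1")` would produce a copy of the archive linked to TENANT1. Writing to a new file rather than editing in place keeps the original build artifact available for other environments.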
When creating and editing tasks in the Job Conductor, a user can only associate a Job with a JobServer that they have access to. As a result, a user can never use the RUN AS feature of TAC to run a Job under a Service Account that they do not have access to.