Connecting Talend with Git + SSH

Overview

This article covers connecting Talend to Git repositories using the SSH protocol. Git is a version control system that Talend supports, similar to SVN. Git can store repositories locally or remotely. The key difference is centralization: in SVN, the repository lives in one centralized location, whereas in Git every developer can have a local copy of the repository. You can push, pull, clone, and merge not just from a central location, but also from another developer’s codebase, and you can look at diffs to compare code changes and decide which to keep.

 

While using Talend, you may not need to do most of these; you will likely just be cloning, pushing (committing), and pulling from a centralized location (either from a remote URL or a local Git Server installation).

 

Requirements

  1. Git Bash

    https://git-scm.com/downloads

  2. Git GUI client (required if using Windows)

    https://git-scm.com/download/gui/windows

    If your TAC (Talend Administration Center) is hosted on Windows, Git Bash or a Git client must be installed to create an SSH key. If you are using Linux, OpenSSH is typically included with the OS.

  3. One of the following supported distributions of Git

    1. Bitbucket (Talend version 6.2.1+)

      https://bitbucket.org/product

    2. GitHub

      https://github.com/

    3. GitLab

      https://about.gitlab.com/

  4. Talend version 6.x (refer to the license email you received from support@talend.com for your version)

    https://www.talend.com/download/

 

Operating System Recommendations

  1. Red Hat Linux 7

    https://www.redhat.com

  2. Windows 7

    https://www.microsoft.com/en-us/software-download/windows7

  3. Windows 10

    https://www.microsoft.com/en-us/windows/

 

Connecting to Git with SSH

  1. To verify an RSA key can be created, open a Git Bash window and run the command:

    ssh -v

    The output should be similar to the following:

    ssh_v.png

    Note: If you are using Git 2.10 on Windows, see the note in step 6.

     

  2. List the contents of the ~/.ssh directory by running the command:

    ls -a ~/.ssh

    The result should be similar to the following:

    $ ls -a ~/.ssh
    ls: /c/Users/emmap1/.ssh: No such file or directory

    Note: If a .ssh folder doesn't exist, you will see a No such file or directory error. In that case, a .ssh folder is needed under c:\Users\user. As the output in the next step shows, ssh-keygen creates the folder if it is missing.

     

  3. Type ssh-keygen. You will be prompted to save the key in the default location: c:\Users\user\.ssh\id_rsa.

    Entering a passphrase when prompted is optional. The complete session will look like the following:

    $ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/c/Users/emmap1/.ssh/id_rsa):
    Created directory '/c/Users/emmap1/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /c/Users/emmap1/.ssh/id_rsa.
    Your public key has been saved in /c/users/emmap1/.ssh/id_rsa.pub.
    The key fingerprint is:
    e7:94:d1:a3:02:ee:38:6e:a4:5e:26:a3:a9:f4:95:d4 emmap1@EMMA-PC

     

  4. After the RSA key pair is generated, review the .ssh folder and verify that the id_rsa and id_rsa.pub files are listed:

    ls_a_ssh.png

     

  5. Create an SSH config file (~/.ssh/config) and add these parameters:

    Host bitbucket.org
     IdentityFile ~/.ssh/id_rsa

    Note: The space between IdentityFile and the key path is required. Keep the leading space before IdentityFile as well; do not delete it.

     

  6. In Git Bash, type the following command to connect to Bitbucket and create a known_hosts file:

    ssh -T git@bitbucket.org

    Note: If you are using Git 2.10 on Windows, a known issue prevents ssh-agent from starting automatically.

    1. To start it in Git Bash, type ssh-agent -s.

    2. Type eval $(ssh-agent).

    3. Type ssh-add -l. This should report that the agent has no identities.

    4. Add an identity by typing ssh-add ~/.ssh/id_rsa.

    5. Run ssh-keygen to overwrite the id_rsa file. Your ssh-agent was not running when the previous file was created, so an improper key may have been generated as a result. After regenerating the key, run ssh-add ~/.ssh/id_rsa again so that the agent holds the new key.

  7. From your Git distribution, click your avatar > Settings.

    bit_hub_lab_settings.png

     

  8. Scroll down to the SSH keys portion in the Security tab.

    Bitbucket

    sec_tab_bit.png

     

    GitHub

    sec_tab_hub.png

     

    GitLab

    sec_tab_lab.png

     

  9. Perform a cat command on the id_rsa.pub file:

    cat ~/.ssh/id_rsa.pub

    cat.png

     

  10. Copy the contents of the id_rsa.pub file and place it into the Key portion of the SSH keys tab. Click Save.

    Bitbucket

    key_bit.png

     

    GitHub

    key_hub.png

     

    GitLab

    key_lab.png

     

    Git is now set up for use with SSH. The next steps will connect TAC to it.
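
The directory and config-file steps above can also be scripted. The following is a minimal Bash sketch, not an official Talend or Git procedure. The function names ensure_ssh_dir and add_host_identity are hypothetical, and the permission modes (700 for the folder, 600 for the config file) are an assumption based on common OpenSSH practice rather than something this article prescribes. Key generation with ssh-keygen is deliberately left as the manual step described above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of Git or Talend.

# Create the .ssh directory if it is missing and restrict it to the owner.
ensure_ssh_dir() {
  local ssh_dir="$1"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
}

# Append a Host/IdentityFile entry to <ssh_dir>/config unless an entry
# for the same host already exists, so re-running does not duplicate it.
add_host_identity() {
  local ssh_dir="$1" host="$2" key_file="$3"
  local config="$ssh_dir/config"
  touch "$config" && chmod 600 "$config"
  if grep -qxF "Host $host" "$config"; then
    return 0
  fi
  printf 'Host %s\n IdentityFile %s\n' "$host" "$key_file" >> "$config"
}
```

For example, running ensure_ssh_dir ~/.ssh followed by add_host_identity ~/.ssh bitbucket.org '~/.ssh/id_rsa' should leave a config file with the same two lines shown in step 5. The tilde in the key path is quoted so that it is written literally and expanded later by ssh.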

     

Connecting TAC

  1. Create a new repository.

    Bitbucket

    repo_bit.png

     

    GitHub

    repo_hub.png

     

    GitLab

    repo_lab.png

     

  2. Expand the I’m starting from scratch section, then copy the Git URL shown after git remote add origin, in this case:

    git@bitbucket.org:arthur-talend/bitbucket-test.git

    For Bitbucket and GitHub, the Git URL can also be found in the project itself; refer to the images below:

    Bitbucket

    giturl_bit.png

     

    GitHub

    giturl_hub.png

     

    GitLab

    giturl_lab.png

     

  3. Create a new repository in TAC, enter the correct details (including the URL you just copied), check the connection, and then click Save.

    conn_bit.png

     

    Note: If the Talend services run under an account other than your own, the .ssh folder may resolve to a different directory. The environment can be set up to work around this issue.

    1. In System Environment Variables > PATH, add your Git cmd path, normally C:\Program Files\Git\cmd.

    2. Alternatively, set the HOME environment variable to a folder of your choice. For details, see the Git documentation: https://git-scm.com/book/en/v2/Git-Internals-Environment-Variables.

       

  4. Refresh your repository page for the repo; it should look a little different now.

    refresh.png

     

    Note: Notice the SSH URL and the Branch and Watcher.

     

  5. Log in to Studio (remember to give your user the correct credentials as well as project authorizations).

    Note: Continue with the instructions in this step if you are using SSH2; if not, skip to the next step.

    Perform these instructions on a machine that has Talend Studio installed but cannot establish a connection to Git.

    1. If you are having issues connecting Studio to Git, use the ssh-keyscan command to add the host to the known_hosts file by typing the following:

      ssh-keyscan -H {host.url} >> known_hosts
    2. Create and append the public RSA key to the authorized keys file.

      auth_keys.png

       

      Type the following command:

      cat id_rsa.pub >> authorized_keys
    3. If you still can't connect to Git, open Git Bash and type this command to find the location from which Git is trying to read the keys:

      ssh -vT git@bitbucket.org

      The first two parameters in the output show where the configurations and files are being read from.

      read_keys.png

      If everything is set up properly, the output of the ssh -vT command shows that you are logged in as the Git user. Sometimes, however, your user will be logged in as "anonymous" instead. In this case, generate a new key that carries your email address as its comment:

      ssh-keygen -t rsa -C "you@email.com"

       

    4. Repeat the steps above to place all of the files (known_hosts, authorized_keys, config, and id_rsa.pub) in the folder from which Git is reading the files (identified above), and then try logging in again.

      Note: If you still cannot log in, SSH is unable to identify one or more of your configurations and thus cannot properly connect to the Git server.

  6. To finish, navigate to the source tab within Bitbucket, then click your repository. Notice that the structure now looks like that of a correct Talend repository.

    repo_final.png
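
When working through the SSH2 troubleshooting in step 5, it can help to confirm quickly which of the four files are present in the folder that SSH reads from. The following is a minimal Bash sketch under that assumption; the function name check_ssh_files is hypothetical, and it only checks that the files exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of Git or Talend.

# Report which of the four files named in step 5 exist in the given
# directory. Returns 0 if all are present and 1 if any is missing.
check_ssh_files() {
  local dir="$1" missing=0 f
  for f in known_hosts authorized_keys config id_rsa.pub; do
    if [ -f "$dir/$f" ]; then
      echo "found:   $f"
    else
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}
```

Pass it the folder identified by the ssh -vT output, for example check_ssh_files ~/.ssh.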

     

Caution

Talend 6.1.1 does not work with Bitbucket over SSH. Talend supports Bitbucket starting in 6.2.1, so Talend 6.1.1 can be set up with Bitbucket only over HTTPS, not SSH. If you try to set up Talend 6.1.1 with Bitbucket over SSH, you will receive a mismatched protocol error in the console.

Note: In Talend 6.3.1 and later, if you are using LDAP(S), there are no user settings available.

 

Git Troubleshooting Guide

What is the first troubleshooting information needed?

The most important piece of information is the logs. They will normally tell you what the issue is, for example that the RSA fingerprint is not recognized or that the host is unknown. Other possible issues are that the user is wrong or the host is incorrect. For instance, in the URL:

git@bitbucket.org:user-talend/bitbucket-test.git
  • The first git is the user. If you have a local server implementation, you might not have called your user git, so try the user you used to log in (or try the service account).
  • The host in the URL is bitbucket.org, but sometimes the host is not mapped in the hosts file. Check to ensure that there are no inconsistencies.

The new URL could look something like this:

talend@user-pc.talend.org:user-talend/talend-repo.git
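
To check the user and host parts described above, the URL can be split mechanically. The following is a minimal Bash sketch, assuming an scp-style SSH URL of the form user@host:path; the function name parse_git_ssh_url is hypothetical and is not part of Git or Talend.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of Git or Talend.

# Split an scp-style Git SSH URL (user@host:path) into its three parts.
parse_git_ssh_url() {
  local url="$1"
  case "$url" in
    *@*:*) ;;  # looks like user@host:path, continue
    *) echo "not an scp-style SSH URL: $url" >&2; return 1 ;;
  esac
  local user="${url%%@*}"   # everything before the first @
  local rest="${url#*@}"    # everything after the first @
  local host="${rest%%:*}"  # everything before the first :
  local path="${rest#*:}"   # everything after the first :
  printf 'user=%s\nhost=%s\npath=%s\n' "$user" "$host" "$path"
}
```

For the Bitbucket URL above, this prints user=git, host=bitbucket.org, and path=user-talend/bitbucket-test.git on separate lines. The host value is the name to compare against the hosts file, and the user value is the account to compare against the one you log in with.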

 

What issues have been encountered when trying to set up SSH?

The most common issues are due to connectivity with the central repository. The Log in to Studio step offers a couple of troubleshooting steps that check for inconsistencies in the SSH setup. Another issue arises if you are not using OpenSSH as the SSH client. For example, Tectia does not have the same SSH commands, and some limitations exist. In that case, you will need assistance from the provider of your SSH implementation.

 

Git is set up, my client is implemented, and when I check the connection in TAC it says that the connection is OK. Yet when I try to connect, why doesn’t it allow me to save?

The most common reason for this is that the RSA key is not trusted. When this occurs, go back to the SSH keys section of your account settings and check the key, as there may be an issue with it. Another option is to add the key directly to the project instead of to the user. To do this, go to the project itself > Settings > Deploy keys and add the RSA key to the project.

 

I'm getting an error that states “stashing local changes did not successfully complete”.

There are two issues to check:

  1. Permissions can be an issue. Check the permissions on the folder named in the error, and find out which distributed locking mechanism is in use. For instance, if the repository is on NFS, try using the Git repository URL from Bitbucket or GitHub instead to see if it works. If it does, the problem lies in how the OS was configured.
  2. This could be an issue with the workspace. Try using a new workspace, either by deleting the existing one and creating a new one, or by specifying a new workspace. If you are using CommandLine, you can make a copy of the commandline-workspace and then delete it, letting CommandLine recreate the entire workspace.
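
The copy-then-delete approach for the CommandLine workspace can be sketched in Bash as follows. This is an illustration only: the function name backup_and_reset_workspace is hypothetical, the workspace path depends on your installation, and stopping CommandLine before running it is an assumption rather than a step stated in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of Talend.

# Copy a workspace folder to a timestamped backup, then delete the
# original so that it can be recreated. Prints the backup path.
backup_and_reset_workspace() {
  local ws="${1%/}"
  if [ ! -d "$ws" ]; then
    echo "workspace not found: $ws" >&2
    return 1
  fi
  local backup
  backup="${ws}.bak.$(date +%Y%m%d%H%M%S)"
  # Delete the original only if the copy succeeded.
  cp -R -p "$ws" "$backup" && rm -rf "$ws" && printf '%s\n' "$backup"
}
```

Call it with the path to the workspace folder, for example backup_and_reset_workspace /path/to/commandline-workspace, where the path shown is a placeholder.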

Version history: revision 12 of 12, last updated 09-16-2017 09:13 PM