
Architecture, Best Practices, and How To's

Overview

This document covers connecting Talend to Git repositories using the SSH protocol. Git is a version control system that Talend supports, similar to SVN. Git can store repositories locally or remotely. The key difference is centralization: in SVN, the repository is in one centralized location. The advantage of Git is that you can keep a local copy of a repository, and you can push, pull, clone, and merge not just from a central location, but also from another developer's codebase. You can look at diffs to compare code changes and decide which to keep.

While using Talend, you may not need most of these operations; you will likely just be cloning, pushing (committing), and pulling from a centralized location (either a remote URL or a local Git server installation).

Requirements

Git Bash: https://git-scm.com/downloads
Git GUI client (required if using Windows): https://git-scm.com/download/gui/windows
If you are using Windows to host your TAC, Git Bash or a Git client must be installed to create an SSH key. If you are using Linux, OpenSSH is normally included with the OS.

One of the following supported Git hosting services:
    Bitbucket (Talend version 6.2.1+): https://bitbucket.org/product
    GitHub: https://github.com/
    GitLab: https://about.gitlab.com/

Talend version 6.x (refer to the license email you received from support@talend.com for your version): https://www.talend.com/download/

Operating System Recommendations

Red Hat Linux 7: https://www.redhat.com
Windows 7: https://www.microsoft.com/en-us/software-download/windows7
Windows 10: https://www.microsoft.com/en-us/windows/

Connecting to Git with SSH

To verify that an RSA key can be created, open a Git Bash window and run the command:

    ssh -v

The output should be similar to the following:

[Screenshot of the command output]

Note: If you are using Git 2.10 on Windows, see the note below.

List the contents of the ~/.ssh directory by running the command:

    ls -a ~/.ssh

The result should be similar to the following:

    $ ls -a ~/.ssh
    ls: /c/Users/emmap1/.ssh: No such file or directory

Note: If a .ssh folder doesn't exist, you will see a No such file or directory error. You must create a .ssh folder at c:\Users\user\.ssh.

Type ssh-keygen. You will be prompted to save the key in the default location, c:\Users\user\.ssh\id_rsa. Entering a passphrase is optional when prompted. The complete exchange will look like the following:

    $ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/c/Users/emmap1/.ssh/id_rsa):
    Created directory '/c/Users/emmap1/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /c/Users/emmap1/.ssh/id_rsa.
    Your public key has been saved in /c/users/emmap1/.ssh/id_rsa.pub.
    The key fingerprint is:
    e7:94:d1:a3:02:ee:38:6e:a4:5e:26:a3:a9:f4:95:d4 emmap1@EMMA-PC

After the RSA key pair is generated, review the .ssh folder and verify that the id_rsa and id_rsa.pub files are listed:

[Screenshot of the .ssh folder]

Create an SSH config file (~/.ssh/config) and add these parameters:

    Host bitbucket.org
     IdentityFile ~/.ssh/id_rsa

Note: The space before and after IdentityFile is required; do not delete it.

In Git Bash, type the following command to add Bitbucket and create a known_hosts file:

    ssh -T git@bitbucket.org

Note: If you are using Git 2.10 on Windows, there is an issue with ssh-agent that prevents it from starting automatically. To work around it in Git Bash:
    Type ssh-agent -s to start the agent.
    Type eval $(ssh-agent).
    Type ssh-add -l. This should report that you have no identities.
    Add an identity by typing ssh-add ~/.ssh/id_rsa.
    Run ssh-keygen to overwrite the id_rsa file. Your ssh-agent was not running when the previous file was created, so an improper key may have been generated as a result.

In your Git hosting service, click your avatar > Settings.

Scroll down to the SSH keys portion in the Security tab.

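The .ssh folder and config file steps earlier in this section can also be scripted. The following is a minimal sketch, not an official Talend procedure. SSH_DIR and CONFIG_FILE are hypothetical variable names introduced here for illustration, and SSH_DIR defaults to a scratch directory so that nothing under your real ~/.ssh is touched while you try it.

```shell
#!/usr/bin/env bash
# Sketch: create the .ssh directory and a minimal SSH config file.
# SSH_DIR is a hypothetical variable (an assumption, not part of Git or Talend).
# It defaults to a scratch directory; set SSH_DIR="$HOME/.ssh" to apply it for real.
SSH_DIR="${SSH_DIR:-$(mktemp -d)}"

# Create the directory if it is missing and restrict it to the current user.
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

CONFIG_FILE="$SSH_DIR/config"

# Append the Bitbucket host entry only if it is not already present,
# so an existing config file is never overwritten.
if ! grep -qs '^Host bitbucket.org$' "$CONFIG_FILE"; then
  cat >> "$CONFIG_FILE" <<'EOF'
Host bitbucket.org
    IdentityFile ~/.ssh/id_rsa
EOF
fi

# Restrict the config file to the current user.
chmod 600 "$CONFIG_FILE"

echo "Wrote $CONFIG_FILE"
```

Appending instead of overwriting is a deliberate choice: if a config file already exists with entries for other hosts, those entries are left in place.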
[Screenshots: Bitbucket, GitHub, GitLab]

Perform a cat command on the id_rsa.pub file:

    cat ~/.ssh/id_rsa.pub

Copy the contents of the id_rsa.pub file and paste it into the Key field of the SSH keys tab. Click Save.

[Screenshots: Bitbucket, GitHub, GitLab]

Git is now set up for use with SSH. The next steps connect TAC to it.

Connecting TAC

Create a new repository.

[Screenshots: Bitbucket, GitHub, GitLab]

Expand the I'm starting from scratch section, then copy the Git URL shown after git remote add origin, in this case:

    git@bitbucket.org:arthur-talend/bitbucket-test.git

For Bitbucket and GitHub, the Git URL can also be found in the project itself; refer to the images below:

[Screenshots: Bitbucket, GitHub, GitLab]

Create a new repository in TAC, enter the correct details (including the URL you just copied), check the connection, and then click Save.

Note: If you are not using your own account to run the Talend services, your .ssh folder may be located in a different directory. The environment can be set up to work around this issue. In System Environment Variables > PATH, add your Git cmd path, normally C:\Program Files\Git\cmd. Alternatively, set the HOME environment variable to a folder of your choice. For details, see the Git documentation: https://git-scm.com/book/en/v2/Git-Internals-Environment-Variables.

Refresh the page for your repository; it should look a little different now.

Note: Notice the SSH URL, the Branch, and the Watcher.

Log in to Studio (remember to give your user the correct credentials as well as project authorizations).

Note: Continue with the instructions in this step if you are using SSH2; if not, skip to the next step. These instructions are performed on a machine that has Talend Studio installed but is unable to obtain a connection.

If you are having issues connecting Studio to Git, use the ssh-keyscan command to add the host to the known_hosts file by typing the following:

    ssh-keyscan -H {host.url} >> known_hosts

Create the authorized_keys file and append the public RSA key to it by typing the following command:

    cat id_rsa.pub >> authorized_keys

If you still can't connect to Git, open Git Bash and type this command to find the location from which Git is trying to read the keys:

    ssh -vT git@bitbucket.org

The first two parameters in the output show where the configurations and files are being read from. If everything is set properly, running ssh -vT logs you in as the Git user. However, sometimes the ssh -vT command isn't sufficient, and your user will be logged in as "anonymous". In this case, you will need to use this command:

    ssh-keygen -t rsa -C "you@email.com"

Repeat the steps above to place all of the files (known_hosts, authorized_keys, config, and id_rsa.pub) within the folder from which Git is reading the files (identified above), and then try logging in again.

Note: If you still cannot log in, SSH is unable to identify one or more of your configurations and thus cannot properly connect to the Git server.

To finish, navigate to the Source tab within Bitbucket, then click your repository. Notice that the structure now looks like that of a correct Talend repository.

Caution

Talend 6.1.1 and Bitbucket do not work together over SSH. Talend supports Bitbucket starting in 6.2.1, so Talend 6.1.1 can be set up with Bitbucket over HTTPS only, not SSH. If you try to set up Bitbucket with SSH and Talend 6.1.1, you will receive a mismatched protocol error in the console.

Note: Talend 6.3.1 and later: if using LDAP(S), there are no user settings available.

Git Troubleshooting Guide

What is the first troubleshooting information needed?

The most important piece of information is the logs.
The logs will normally tell you what the issue is, for example that the RSA fingerprint is not recognized or that a host is unknown. Other possible issues are that the user is wrong or the host is incorrect. For instance, in the URL:

    git@bitbucket.org:user-talend/bitbucket-test.git

The first git is the user; if you have a local server implementation, you might not have named your user git, so try the user you used to log in (or try the service account). The host in the URL is bitbucket.org, but sometimes the host is not bound within the hosts file. Check to ensure that there are no inconsistencies. The new URL could look something like this:

    talend@user-pc.talend.org:user-talend/talend-repo.git

What issues have been encountered when trying to set up SSH?

The most common issues are due to connectivity with the central repository. The Log in to Studio step offers a couple of troubleshooting steps that can check for inconsistencies in the implementation of SSH. Another issue arises if you are not using OpenSSH as the SSH client. For example, Tectia does not have the same SSH commands, and some limitations exist. This will require assistance from the provider of your SSH implementation.

Git is set up, my client is implemented, and when I check the connection in TAC it says that the connection is OK. Yet when I try to connect, why doesn't it allow me to save?

The most common reason for this is that the RSA key is not trusted. When this occurs, go back to the SSH key deployment section of your account. You may want to re-add the key, as there may be an issue with it. Another option is to add the key directly to the project instead of to the user. To do this, go to the project itself > Settings > Deploy keys and add the RSA key to the project.

I'm getting an error that states "stashing local changes did not successfully complete".

There are two issues to check. First, permissions can be an issue.
Check the permissions on the folder named in the error, and check which distributed locking mechanism is in use. For instance, if you are on NFS, try using the Git repository URL from Bitbucket or GitHub instead to see if it works. If it does, the problem lies in how the OS was configured.

Second, this could be an issue with the workspace. Try using a new workspace, either by deleting the existing one and creating a new one, or by specifying a new workspace. If you are using CommandLine, you can make a copy of the commandline-workspace and then delete the original, letting CommandLine recreate the entire workspace.
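
The troubleshooting guide above points out that in a URL such as git@bitbucket.org:user-talend/bitbucket-test.git, the part before the @ is the user and the part after it is the host. As a rough aid when checking a URL for inconsistencies, the sketch below splits such a URL into its parts. parse_git_ssh_url and the GIT_USER, GIT_HOST, and GIT_PATH variables are hypothetical names introduced here; they are not Git or Talend commands, and the sketch assumes the user@host:path form shown in this article.

```shell
#!/usr/bin/env bash
# Sketch: split an scp-style Git SSH URL (user@host:path) into its parts.
# parse_git_ssh_url is a hypothetical helper, not a Git or Talend command.
parse_git_ssh_url() {
  local url="$1"
  local user_host="${url%%:*}"   # everything before the first ':'
  GIT_PATH="${url#*:}"           # everything after the first ':'
  GIT_USER="${user_host%%@*}"    # part before '@' (the SSH user)
  GIT_HOST="${user_host#*@}"     # part after '@' (the host to check)
}

parse_git_ssh_url "git@bitbucket.org:user-talend/bitbucket-test.git"
echo "user=$GIT_USER host=$GIT_HOST path=$GIT_PATH"
# prints: user=git host=bitbucket.org path=user-talend/bitbucket-test.git
```

With the user and host separated, you can compare the user against the account you actually log in with, and the host against the entries in your hosts file and SSH config.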