Architecture, Best Practices, and How-Tos

Create JobServers using a DI Job, then configure them in TAC.
Overview

SAP is a popular ERP system that thousands of companies use to store transaction data and master data. Companies investing in Big Data and Apache Hadoop technologies want to extract data from their legacy systems, such as SAP, and load it into Hadoop to provide transformed or raw data to their analytics teams, who can then draw insights from it.

Environment

Talend Studio 6.2.1
SAP ECC 6.0 EhP6
Hortonworks 2.6.4

Prerequisites

1. Set up Kerberos and get a ticket:
   - Install the Kerberos client from the MIT site.
   - Update your security policies.
   - Configure the krb5.ini file.
   - Add the big data nodes to the hosts file on the local system.
   - Get a ticket.

2. Configure SAP connectivity:
   - Install the Talend function module on your SAP system.
   - Install the sapjco jar files from SAP on the Studio computer.
   - Create the SAP connection metadata.
   - In the SAP Connection window, click Check to make sure the connectivity works. If the check succeeds, a confirmation is shown.
   - Log in to SAP ECC and make sure the MARA table has data. To do this, use transaction SE16.

3. Configure Hadoop connectivity:
   - In Talend Studio, log in to your project and select the Metadata menu.
   - Right-click Hadoop Cluster and click Create Hadoop Cluster.
   - Select the distribution and version of your Hadoop cluster, select one of the options to load the configuration, and click Next.
   - Enter the Ambari information and click Next. The system retrieves your cluster information and populates the remaining data.
   - In the Hadoop Cluster Connection window, click Check Services to make sure the services are running.

Build job

1. Open Talend Studio, log in to your project, and navigate to the Metadata menu.
2. Right-click the SAP connection and select Retrieve SAP table.
3. Enter the name of the SAP table that you want to extract data from and click Search.
4. Select the SAP table and click Next to review the schema. Then click Finish. The MARA table appears in the list of SAP Tables.
5. Right-click Job Designs and click Create a standard Job. Give your Job a name.
6. Drag the MARA table onto the canvas. The Studio automatically creates a tSAPTableInput component with the label MARA. In the Component tab, enter a filter condition if needed.
7. Drag a tHDFSOutput component from the palette to the canvas. Connect the two components using a row1 (Main) link.
8. Select the subjob in the canvas and click Basic settings in the Component tab. Select Show subjob title and enter a title.
9. Drag a tHDFSConnection component from the palette and connect it to the subjob using an OnSubjobOk link.
10. Select the tHDFSConnection component and, in the Component tab, change the Property Type to Repository. Select the HDFS connection you created.
11. Click the tHDFSOutput component. In the Component tab, select Use an existing connection and select the connection from the drop-down menu. Enter a file name for the HDFS output file.
12. Drag a tSAPConnection component from the palette and connect it to the tHDFSConnection component using an OnSubjobOk link.
13. Click the tSAPConnection component and select the SAP connection from the repository.

Run Job

1. From Advanced Settings on the Run tab, set the minimum and maximum memory settings.
2. Click Run.

After the Job runs, a message is displayed in the Hadoop environment.
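The Kerberos prerequisites (configuring krb5.ini, adding the cluster nodes to the hosts file, and getting a ticket) can be sketched as follows. This is a minimal illustration, not the configuration used in the article: the realm EXAMPLE.COM, the KDC host, the node names, the IP addresses, and the principal are all hypothetical placeholders, and the helper functions are invented for this example. A real krb5.ini usually needs additional site-specific settings.

```python
# Sketch only: every realm, host name, IP address, and principal below is a
# hypothetical placeholder. Replace them with the values for your own cluster.


def render_krb5_ini(realm, kdc_host, domain):
    """Return a minimal krb5.ini body for a single Kerberos realm."""
    lines = [
        "[libdefaults]",
        f"    default_realm = {realm}",
        "",
        "[realms]",
        f"    {realm} = {{",
        f"        kdc = {kdc_host}",
        f"        admin_server = {kdc_host}",
        "    }",
        "",
        "[domain_realm]",
        f"    .{domain} = {realm}",
    ]
    return "\n".join(lines) + "\n"


def render_hosts_entries(nodes):
    """Turn a {fqdn: ip} mapping into hosts-file lines: IP, FQDN, short name."""
    rows = []
    for fqdn, ip in nodes.items():
        short_name = fqdn.split(".")[0]
        rows.append(f"{ip}\t{fqdn}\t{short_name}")
    return "\n".join(rows) + "\n"


def kinit_command(principal):
    """Return the kinit invocation that requests a ticket for the principal."""
    return ["kinit", principal]


if __name__ == "__main__":
    print(render_krb5_ini("EXAMPLE.COM", "kdc.example.com", "example.com"))
    print(render_hosts_entries({
        "namenode.example.com": "10.0.0.11",
        "datanode1.example.com": "10.0.0.12",
    }))
    print(" ".join(kinit_command("etl_user@EXAMPLE.COM")))
```

The script only prints the text. Copy the first block into your krb5.ini file, append the second to the local hosts file, and run the printed kinit command in a terminal to request the ticket.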
Implement the rolling users for database access feature of Vault.
Use Jobs to capture performance benchmarks between Talend and the Azure Cloud MS SQL Server.
Use Curator to manage your Elasticsearch cluster.
Steps to migrate a Talend ESB product from version 6.x to 7.0.1.
Set up context groups while configuring Hadoop.
Assumptions, prerequisites, and migration procedures.
Outlines the problem and challenges of MDM migration.
Set up Spark Jobs in Talend Studio to use Dynamic Context.
Includes creating a basic Chef cookbook.
Migrate the missing repositories and users from Nexus 3 to Nexus 2.
Set the CPU affinity for MDM on Linux.
Suggestions for using Talend Data Preparation.
Don't alter existing components; create a custom component to add new features.
Steps to generate a Google access token from a JSON key.
Use / to avoid overriding historical data during the synchronization processes.
Install Decanter and enable Camel Trace and Event features.