In this tutorial, you will create Hadoop cluster metadata automatically by connecting to the Cloudera Manager.
This tutorial uses Talend Data Fabric Studio version 6 and a Cloudera CDH version 5.4 Hadoop cluster.
1. Create a new Hadoop cluster metadata definition
The Hadoop Configuration Import wizard opens.
2. Select the automatic configuration method
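There are different ways to create Hadoop cluster metadata. This tutorial uses the automatic method, which retrieves the configuration from the Cloudera Manager.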
3. Connect to the Cloudera Manager
The Cloudera Manager is an end-to-end application for managing Cloudera CDH clusters. To retrieve the connection information and create the corresponding metadata, you will connect to the Cloudera Manager.
The cluster named Cluster 1 appears in the Discovered clusters list.
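As an independent cross-check of what the wizard discovers, the cluster list can also be read from the Cloudera Manager REST API. The following is a minimal sketch, not a description of how the wizard works internally. It assumes the default Cloudera Manager port 7180 and API version v10 (the version shipped with Cloudera Manager 5.4); the host name, user, and password are placeholders you supply.

```python
import base64
import json
import urllib.request


def clusters_url(host, port=7180, api_version="v10"):
    """Build the Cloudera Manager REST URL that lists managed clusters.

    Port 7180 and API version v10 are assumptions; adjust them to your setup.
    """
    return f"http://{host}:{port}/api/{api_version}/clusters"


def parse_cluster_names(payload):
    """Extract cluster names from the JSON body of a /clusters response."""
    return [item["name"] for item in json.loads(payload).get("items", [])]


def discover_clusters(host, user, password, port=7180, api_version="v10"):
    """Query Cloudera Manager with HTTP basic auth and return cluster names."""
    request = urllib.request.Request(clusters_url(host, port, api_version))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_cluster_names(response.read().decode("utf-8"))
```

Calling discover_clusters with your own Cloudera Manager host and credentials should return a list that includes the cluster name shown in the Discovered clusters list.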
The wizard detects configuration files and lists the corresponding services. In this tutorial, we will keep the default configuration and create metadata definitions for YARN, HDFS, Hive and HBase. The definition for Spark is not available.
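The service selection made in this step can be sketched as a filter over a service list. The example below assumes a JSON payload shaped like the response of the Cloudera Manager endpoint /clusters/{clusterName}/services, where each item carries a name and a type. The sample payload and its service names are hypothetical and were not taken from a real cluster.

```python
import json

# Service types this tutorial creates metadata for (Spark is left out).
WANTED_TYPES = frozenset({"YARN", "HDFS", "HIVE", "HBASE"})

# Hypothetical sample payload, for illustration only.
SAMPLE_SERVICES = json.dumps({
    "items": [
        {"name": "hdfs", "type": "HDFS"},
        {"name": "yarn", "type": "YARN"},
        {"name": "hive", "type": "HIVE"},
        {"name": "hbase", "type": "HBASE"},
        {"name": "spark_on_yarn", "type": "SPARK_ON_YARN"},
    ]
})


def select_services(payload, wanted=WANTED_TYPES):
    """Return the sorted names of services whose type is in `wanted`."""
    services = json.loads(payload).get("items", [])
    return sorted(s["name"] for s in services if s.get("type") in wanted)
```

With the sample payload, select_services(SAMPLE_SERVICES) keeps the HDFS, YARN, Hive, and HBase entries and drops the Spark entry.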
4. Create metadata corresponding to the listed services except Spark
The Checking Hadoop Services window opens. The status of both the NameNode and the Resource Manager is 100%.
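If a check does not reach 100%, a quick way to rule out basic network problems is to test whether the service ports accept TCP connections from your machine. This is only a rough approximation and does not reproduce the checks that the Studio performs. The host names are placeholders, and the ports you pass in (commonly 8020 for the NameNode and 8032 for the Resource Manager on CDH) are assumptions to verify against your cluster configuration.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_services(endpoints, timeout=3.0):
    """Map each service label to the reachability of its (host, port) pair."""
    return {
        label: is_reachable(host, port, timeout)
        for label, (host, port) in endpoints.items()
    }
```

For example, check_services({"NameNode": ("namenode.example.com", 8020), "Resource Manager": ("rm.example.com", 8032)}) returns a dictionary with one boolean per service.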
5. Inspect the metadata created in the Repository
The metadata definitions are now available in the Repository and ready to be used in a Talend Job.
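When inspecting the metadata, it can help to compare the stored connection values with the cluster's client configuration files, such as core-site.xml. The sketch below reads the name/value pairs from a Hadoop configuration file. The sample document and its host name are hypothetical; fs.defaultFS is the standard Hadoop property that holds the NameNode URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml content, for illustration only.
SAMPLE_CORE_SITE = (
    "<configuration>"
    "<property>"
    "<name>fs.defaultFS</name>"
    "<value>hdfs://namenode.example.com:8020</value>"
    "</property>"
    "</configuration>"
)


def read_hadoop_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a dict of name -> value."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            # A missing <value> element is stored as an empty string.
            properties[name.strip()] = (prop.findtext("value") or "").strip()
    return properties
```

Reading the sample with read_hadoop_properties(SAMPLE_CORE_SITE)["fs.defaultFS"] gives the NameNode URI, which you can compare with the value shown in the HDFS metadata definition.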