Cloudera Manager provides a single platform to manage and monitor all the nodes and services that make up a CDH cluster. It accomplishes this through agents: each managed host runs an agent that receives configuration information from Cloudera Manager and sends back performance statistics. EMC Isilon cannot have the Cloudera Manager agent installed on it. Fortunately, the Cloudera API (http://cloudera.github.io/cm_api/) allows an unmanaged object to be added to Cloudera Manager. Not only can it be added as a host, but it can also be assigned roles. Using the API we can add Isilon as a host and assign it the roles necessary to run HDFS. Cloudera Manager can then push the configuration to all the nodes in the cluster.
Prerequisites
A running Cloudera Cluster with Cloudera Manager. See these links:
Configure Hadoop with Cloudera Manager
EMC Isilon configured for HDFS with correct permissions for Cloudera. See these links:
Cloudera permission on EMC Isilon
Integrate Isilon with the HDFS service
From the main page click the drop down arrow to the right of the Cluster name. Select “Rename Cluster” |
Rename the default cluster name to a name without any spaces in it. This makes it easier to use the API in later steps. In this example we name the cluster “Cluster1” |
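To illustrate why a name without spaces is easier to work with: the REST API addresses a cluster by name in the URL path (/api/v1/clusters/{clusterName}), so any spaces have to be percent-encoded. The following is a minimal Python 3 sketch; the helper name cluster_url, the port 7180, and the sample name "Cluster 1 - CDH4" are assumptions made for illustration and should be adjusted to your environment.

```python
from urllib.parse import quote


def cluster_url(manager_host, cluster_name, port=7180, version=1):
    """Build the REST URL for a cluster (hypothetical helper).

    The cluster name is part of the URL path, so it must be percent-encoded.
    """
    # safe="" encodes every reserved character, including "/"
    encoded = quote(cluster_name, safe="")
    return "http://%s:%d/api/v%d/clusters/%s" % (
        manager_host, port, version, encoded)


# A name with spaces needs escaping; "Cluster1" can be used as typed.
print(cluster_url("cdhmanager.ebc.emc.local", "Cluster 1 - CDH4"))
print(cluster_url("cdhmanager.ebc.emc.local", "Cluster1"))
```

The Python client handles this encoding for you, but a simple name avoids quoting problems when the same URL is typed into curl or a browser.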
The cluster name is now changed. |
The Cloudera API client code is needed to interact with the Cloudera API. From the Manager server, download it from GitHub with the following command: wget https://github.com/cloudera/cm_api/archive/master.zip |
Unzip the file with the following command: unzip master |
Change directory into the cm_api-master/python directory with the following command: cd cm_api-master/python |
Install the API client with the following command: python setup.py install. The CM API client is now installed on your system. |
Now we use Python and the CM API client to add our Isilon cluster to Cloudera Manager. It is added to the cluster, the hdfs and mapreduce services, and the default rack, and is given the NameNode role in hdfs. Start the interpreter with the command python, then enter the following:

    import socket
    from cm_api.api_client import ApiResource

    # Connect to Cloudera Manager (replace host and credentials with your own)
    api = ApiResource("cdhmanager.ebc.emc.local", username="admin", password="admin", version=1)

    # Look up the cluster and its services by name
    cluster = api.get_cluster("Cluster1")
    hdfs = cluster.get_service("hdfs1")
    mapred = cluster.get_service("mapreduce1")

    # Register Isilon as a host on the default rack; the FQDN is used as both host ID and hostname
    name = 'isilon1.cto.emc.local'
    host = api.create_host(name, name, socket.gethostbyname(name), "/default")

    # Assign the NameNode role to the Isilon host
    nn = hdfs.create_role("hdfs01-nn", "NAMENODE", name)

After running the API commands, some errors and warnings appear in Cloudera Manager. Because the Isilon cluster does not have an agent, the error in the host service will always be present. The warnings in the HDFS service will be rectified in later steps. Click on the HDFS service. |
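The create_host call depends on socket.gethostbyname resolving the Isilon name, and a DNS failure in the middle of the session is easy to misread. The sketch below checks resolution first and returns the four arguments in the order the walkthrough passes them. The helper host_args and its resolve parameter are hypothetical and not part of cm_api; the stand-in resolver and the address 192.0.2.10 exist only so the example runs without DNS.

```python
import socket


def host_args(fqdn, rack="/default", resolve=socket.gethostbyname):
    """Return (host_id, hostname, ip, rack) for create_host, or raise ValueError.

    Hypothetical helper: `resolve` can be swapped out for testing.
    """
    try:
        ip_address = resolve(fqdn)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        raise ValueError("cannot resolve %r: %s" % (fqdn, exc))
    # The walkthrough uses the FQDN as both the host ID and the hostname.
    return (fqdn, fqdn, ip_address, rack)


# Demonstration with a stand-in resolver, so no DNS lookup happens here.
print(host_args("isilon1.cto.emc.local", resolve=lambda name: "192.0.2.10"))

# In the real session this would be used as:
#   host = api.create_host(*host_args("isilon1.cto.emc.local"))
```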
Once the HDFS service page opens, click on the Instances tab. |
We see all the nodes and their roles listed. Notice the Isilon host is stopped but its health is good. It also has the NameNode role assigned to it. |
In the middle of the page click “Add” |
Add the DataNode and SecondaryNameNode roles to Isilon. |
Now Isilon has 3 roles. |
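The two extra roles can presumably also be added with the same create_role call that was used for the NameNode, instead of through the web interface. The sketch below only plans and applies the calls. The helper names, the "-dn" and "-snn" role name suffixes, and the role type strings "DATANODE" and "SECONDARYNAMENODE" are assumptions; check them against your Cloudera Manager version before relying on them. FakeService is a stand-in so the example runs without a Cloudera Manager server.

```python
def extra_isilon_roles(host_id, prefix="hdfs01"):
    """Plan the DataNode and SecondaryNameNode roles for the Isilon host.

    Returns (role_name, role_type, host_id) tuples. The name suffixes and
    role type strings are assumptions, not confirmed by the walkthrough.
    """
    return [
        (prefix + "-dn", "DATANODE", host_id),
        (prefix + "-snn", "SECONDARYNAMENODE", host_id),
    ]


def apply_roles(hdfs_service, plan):
    """Call create_role once per planned role and collect the results."""
    return [hdfs_service.create_role(name, role_type, host_id)
            for name, role_type, host_id in plan]


class FakeService:
    """Stand-in for the hdfs service object; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def create_role(self, name, role_type, host_id):
        self.calls.append((name, role_type, host_id))
        return name


fake = FakeService()
apply_roles(fake, extra_isilon_roles("isilon1.cto.emc.local"))
for call in fake.calls:
    print(call)
```

In a real session, the hdfs object returned by cluster.get_service("hdfs1") would take the place of FakeService.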
Click the box next to “Name” so that all hosts are selected. |
Click “Actions for Selected” and choose stop. |
Click “Stop”. The HDFS service is stopped on all nodes. |
The stop command runs successfully on all nodes. |
Using the drop down, sort the hosts by the role. Select “NameNode”. You will see the role listed on both Isilon and the Cloudera Manager as we defined in Table1. Select the Manager. |
Click “Actions for Selected” and choose Delete. |
Confirm the deletion of the NameNode role from the Cloudera Manager server by clicking “Delete” |
Using the drop down, sort the hosts by the role. Select “SecondaryNameNode”. You will see the role listed on both Isilon and the Cloudera Manager as we defined in Table1. Select the Manager. |
Click “Actions for Selected” and choose Delete. |
The roles are now set under the HDFS service so that all HDFS connections will be made to Isilon |
Click the “Home” button in the top left corner. Notice we have more warnings under the HDFS service but no errors. |
Click the arrow to the right of the cluster name. Select “Deploy Client Configuration”. The new HDFS configuration will be pushed out to the cluster. |
Confirm deployment by clicking “Deploy Client Configuration” |
The configuration is successfully deployed. |
Confirm the configuration on any of the nodes with the following command: cat /etc/hadoop/conf.cloudera.hdfs1/core-site.xml | grep isilon1 NOTE: replace isilon1 with the name of your Isilon cluster. |
Back on the home screen select the arrow to the right of the mapreduce service. Click “Restart” |
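The same check can be sketched in Python by parsing the file rather than grepping it, which also shows which property carries the Isilon name. The sample XML below is invented for illustration: the property fs.defaultFS, the port 8020, and the second property are assumptions about what a deployed core-site.xml might contain, not output from a real cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the standard Hadoop <configuration> layout.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://isilon1.cto.emc.local:8020</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>65536</value>
  </property>
</configuration>
"""


def properties_mentioning(xml_text, needle):
    """Return {property name: value} for every value containing `needle`."""
    root = ET.fromstring(xml_text)
    hits = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="")
        value = prop.findtext("value", default="")
        if needle in value:
            hits[name] = value
    return hits


print(properties_mentioning(SAMPLE, "isilon1"))

# On a cluster node, read the deployed file instead of SAMPLE:
#   with open("/etc/hadoop/conf.cloudera.hdfs1/core-site.xml") as f:
#       print(properties_mentioning(f.read(), "isilon1"))
```

An empty result would suggest the client configuration has not been deployed to that node yet.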
Confirm restart by clicking “Restart”. The oozie and hive services will also restart. The Cloudera Hadoop cluster is now ready to use Isilon for HDFS. |
Resolve HDFS warnings
On the home screen click the warning wrench next to the HDFS service. |
Of the 6 warnings listed, 3 can be resolved. The 3 warnings related to Isilon cannot be resolved because of the lack of an agent on the Isilon cluster. Clicking on a warning brings you to the configuration page where it can be resolved. Select the defaults and the warning will be resolved.
On the home page, all the warnings that are not related to Isilon are now resolved.