In previous posts I showed how to set up VMware Big Data Extension and integrate it with Isilon for HDFS. This blog will show you how to use BDE to deploy the Hortonworks Data Platform (HDP).
When you deploy the Big Data Extension (BDE) vApp, the Apache Hadoop 1.2.1 distribution is included in the OVA that you download and deploy. You can add and configure other Hadoop distributions, like Hortonworks Data Platform (HDP), using the command line. One of the benefits of VMware Big Data Extension is the ability to configure, deploy, and run multiple Hadoop distributions from different vendors.
To set up HDP, first locate the tarball distributions on the Hortonworks repo website. The website is:
http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/
Browsing the website, you can see links to different versions, documentation, and utilities. For this blog we use 1.0.7, but other versions should work the same way.
VMware Big Data Extension uses a Ruby script called config-distro.rb, located in the /opt/serengeti/sbin directory on the Serengeti vApp management server, to retrieve the tarball packages and update the manifest that Chef uses to automate deployment. We run this utility and give it the correct URL for each package we want to deploy.
When the Serengeti vApp is deployed, a template VM is deployed along with the management server VM. The template VM runs CentOS 5 and is used to deploy all the nodes that make up a Hadoop cluster. The management VM uses Chef to deploy the packages to the template and configure it accordingly. In future blogs I'll show you how to make a custom template using CentOS 6. If you use a CentOS 6 template, you also have to use the matching HDP distro.
We use the following URLs to retrieve the 1.0.7 packages for CentOS 5:
Hadoop: http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/hadoop-1.0.2.tar.gz
Pig: http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/pig-0.9.2.tar.gz
Hive: http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/hive-0.8.1.tar.gz
HBase: http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/hbase-0.92.0.tar.gz
ZooKeeper: http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/zookeeper-3.3.4.tar.gz
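All five URLs share the same base path, so one way to keep them consistent is to build them from a few variables. This is only a sketch: the variable names are made up for illustration, and the version numbers are the ones listed above.

```shell
# Base path and package versions are taken from the list above.
# The variable names are illustrative and not part of BDE.
HDP_VERSION="1.0.7"
HDP_OS="centos5"
HDP_BASE="http://public-repo-1.hortonworks.com/HDP-${HDP_VERSION}/repos/${HDP_OS}/tars"

HADOOP_URL="${HDP_BASE}/hadoop-1.0.2.tar.gz"
PIG_URL="${HDP_BASE}/pig-0.9.2.tar.gz"
HIVE_URL="${HDP_BASE}/hive-0.8.1.tar.gz"
HBASE_URL="${HDP_BASE}/hbase-0.92.0.tar.gz"
ZOOKEEPER_URL="${HDP_BASE}/zookeeper-3.3.4.tar.gz"

# Print the URLs so they can be checked by eye before they are used.
for url in "$HADOOP_URL" "$PIG_URL" "$HIVE_URL" "$HBASE_URL" "$ZOOKEEPER_URL"; do
  echo "$url"
done
```

If curl is available, replacing the echo with curl -sfI "$url" is one way to check that each tarball is reachable before handing the URLs to config-distro.rb.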
Log in to the management server using either PuTTY or the VMware console.
Run the config-distro.rb command with the correct options. For this example the command and its options are as follows (note that this is all one line):
config-distro.rb --name HortonWorksHDP --vendor HDP --version 1.0.7 --hadoop http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/hadoop-1.0.2.tar.gz --pig http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/pig-0.9.2.tar.gz --hive http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/hive-0.8.1.tar.gz --hbase http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/hbase-0.92.0.tar.gz --zookeeper http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars/zookeeper-3.3.4.tar.gz --hve true
After the command is executed, the packages are retrieved and placed in the proper directory on the management server, and the Chef manifest is updated.
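Because the one-line command is long and easy to mistype, it can help to wrap it in a small function with one option per line and do a dry run first. The sketch below uses only the options and values from the command above. The RUN variable and the function name are hypothetical helpers, and it assumes the script can be invoked by its full path in /opt/serengeti/sbin.

```shell
# URLs and option values are copied from the command above.
# RUN and add_hdp_distro are illustrative names, not part of BDE.
BASE="http://public-repo-1.hortonworks.com/HDP-1.0.7/repos/centos5/tars"

# Dry run: "echo" prints the command instead of running it.
# Set RUN="" on the management server to actually execute it.
RUN="echo"

add_hdp_distro() {
  $RUN /opt/serengeti/sbin/config-distro.rb \
    --name HortonWorksHDP \
    --vendor HDP \
    --version 1.0.7 \
    --hadoop "$BASE/hadoop-1.0.2.tar.gz" \
    --pig "$BASE/pig-0.9.2.tar.gz" \
    --hive "$BASE/hive-0.8.1.tar.gz" \
    --hbase "$BASE/hbase-0.92.0.tar.gz" \
    --zookeeper "$BASE/zookeeper-3.3.4.tar.gz" \
    --hve true
}

add_hdp_distro
```

With RUN left as echo, the function only prints the assembled command, which makes it easy to compare against the one-line form before running it for real.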
Change to the /opt/serengeti/www/distros directory and list its contents (ls). You should see a directory with the name you passed to config-distro.rb, and a file called “manifest”.
Open the manifest file with a text editor. In this example we use vim.
Look through the file. The manifest now contains the information Chef needs to retrieve the packages during automated deployment.
Change to the /opt/serengeti/www/specs directory and list its contents (ls). You will see directories that contain information for the different Hadoop distributions, and a file called map. Open the map file with a text editor.
Scroll through until you find the section for “HDP”. There are 4 sections. Each section represents a configuration option during deployment. Make sure the version listed is the same as the package you retrieved. If it is not, edit it and save the file. If no change is needed, exit without saving.
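As an alternative to scrolling through both files in an editor, the two checks above can be sketched with grep. The file paths come from the steps above. The helper name is made up, and the search strings assume the manifest mentions the --name value and the map file mentions the vendor string "HDP" literally.

```shell
# Paths come from the steps above. DISTRO_NAME and VENDOR match the --name
# and --vendor values passed to config-distro.rb. show_matches is a made-up
# helper name.
DISTRO_NAME="HortonWorksHDP"
VENDOR="HDP"
MANIFEST="/opt/serengeti/www/distros/manifest"
MAP="/opt/serengeti/www/specs/map"

# Print every line of file $1 that contains the literal string $2, prefixed
# with its line number. Returns non-zero if the file is missing or if
# nothing matches.
show_matches() {
  [ -f "$1" ] || { echo "not found: $1" >&2; return 1; }
  grep -n -F -- "$2" "$1"
}

show_matches "$MANIFEST" "$DISTRO_NAME" || echo "no $DISTRO_NAME entry in $MANIFEST"
show_matches "$MAP" "$VENDOR" || echo "no $VENDOR section in $MAP"
```

The line numbers in the output show where to jump in vim if a version needs to be corrected.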
Restart the Tomcat service: service tomcat restart
In the web client, under the Big Data Extensions tab, click Hadoop Distributions.
Verify that the distribution you just configured is present.
To download the EMC Hadoop Starter Kit, which shows how to configure BDE with Isilon for HDFS and how to configure HDP, go to the following link: