Install Hortonworks

Instead of installing Hadoop and all the other components yourself, you will use a preconfigured Docker image. Hortonworks provides a Data Platform Sandbox as a container image that you can load into Docker. To download it, go to https://hortonworks.com/downloads/#sandbox and select DOWNLOAD FOR DOCKER.

You will also need to download the start_sandbox_hdp_version.sh script, which simplifies launching the container in Docker. You can download the script from GitHub at: https://gist.github.com/orendain/8d05c5ac0eecf226a6fed24a79e5d71a.

Now you will need to load the image into Docker, using the following command:

docker load -i <image name>

Replace <image name> with the name of the file you downloaded. It will be similar to HDP_2.6.3_docker_10_11_2017.tar, but it will change depending on your version. To confirm that the sandbox image has been loaded, run the following command:

docker images

The output, if you have no other images, should look as it does in the following screenshot:

To use Ambari, the web-based GUI, you will want a domain name that resolves to the sandbox. To set that up, you will need the IP address of the container. You can get it by running two commands:

docker ps
docker inspect <container ID>

The first command lists the running containers along with their container IDs. The second command takes a container ID and returns a lot of information about that container, with the IP address towards the end. Alternatively, you can take advantage of the Linux command line and get just the IP address by using the following command:

docker inspect $(docker ps --format "{{.ID}}") --format="{{json .NetworkSettings.IPAddress}}"

The previous command combines the two earlier commands into one. The $() syntax is command substitution: the shell runs docker ps first and passes its output to docker inspect as the container ID. The --format option on docker ps restricts that output to the ID alone, and the --format option on docker inspect restricts its output to the IP address. The text between {{ and }} is a Go template. The output of this command should be an IP address, for example, 172.17.0.2. Because the template uses the json function, the address is printed inside double quotes.
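The single-command form assumes that exactly one container is running; with several, docker ps returns several IDs. As a minimal sketch of a more defensive variant, the function below picks the first running container whose name matches a pattern. The function name sandbox_ip and the default pattern "sandbox" are assumptions made for illustration (check the actual name in the docker ps output); they are not part of the Hortonworks tooling.

```shell
#!/bin/sh
# Print the IP address of the first running container whose name matches
# a pattern. The default pattern "sandbox" is an assumption; adjust it to
# the name shown by `docker ps`.
sandbox_ip() {
    pattern="${1:-sandbox}"
    # -q prints only container IDs; --filter name=... narrows the list.
    id=$(docker ps -q --filter "name=${pattern}" | head -n 1)
    if [ -z "$id" ]; then
        echo "no running container matches '${pattern}'" >&2
        return 1
    fi
    # Without the json template function, the address has no quotes.
    docker inspect --format '{{.NetworkSettings.IPAddress}}' "$id"
}
```

Calling sandbox_ip with no argument uses the default pattern; sandbox_ip hdp would match any running container with "hdp" in its name.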

Now that you have the IP address of the container, you should update your hosts file using the following command:

echo '172.17.0.2 sandbox.hortonworks.com sandbox-hdp.hortonworks.com sandbox-hdf.hortonworks.com' | sudo tee -a /etc/hosts

The previous command pipes the output of echo, which is the text you want in your /etc/hosts file, to the sudo tee -a /etc/hosts command. This second command uses sudo to run as root. The tee command writes its input both to a file and to the terminal (STDOUT). The -a option tells tee to append to the file rather than overwrite it, and /etc/hosts is the file you want to append to. Now, in your browser, you will be able to use names instead of the IP address.
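Running the echo command twice appends the same line twice. The following is a rough sketch of an idempotent alternative that skips the append when the first hostname is already in the file. The helper add_hosts_entry and its optional third argument (a file path, defaulting to /etc/hosts) are hypothetical names introduced here so the function can be tried against a scratch file first.

```shell
#!/bin/sh
# Append "IP names..." to a hosts file unless the first hostname is
# already there. add_hosts_entry is a hypothetical helper; the file
# argument defaults to /etc/hosts.
add_hosts_entry() {
    ip="$1"
    names="$2"
    file="${3:-/etc/hosts}"
    # Keep only the text before the first space: the first hostname.
    first_name="${names%% *}"
    # -q: no output, -F: fixed string, so dots are literal (substring match).
    if grep -qF "$first_name" "$file"; then
        echo "$first_name already present in $file" >&2
        return 0
    fi
    if [ -w "$file" ]; then
        # Writable file (for example, a scratch copy): append directly.
        echo "$ip $names" >> "$file"
    else
        # Otherwise append as root; discard tee's copy on STDOUT.
        echo "$ip $names" | sudo tee -a "$file" > /dev/null
    fi
}
```

For example, add_hosts_entry 172.17.0.2 'sandbox.hortonworks.com sandbox-hdp.hortonworks.com sandbox-hdf.hortonworks.com' adds the line once, and a second call leaves the file unchanged.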

Now you are ready to launch the image and browse to your Hadoop framework.
