Hadoop Cluster Deployment

Table of Contents

Hadoop Cluster Deployment

Credits

About the Author

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers and more

Why Subscribe?

Free Access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Errata

Piracy

Questions

1. Setting Up Hadoop Cluster – from Hardware to Distribution

Choosing Hadoop cluster hardware

Choosing the DataNode hardware

Low storage density cluster

High storage density cluster

NameNode and JobTracker hardware configuration

The NameNode hardware

The JobTracker hardware

Gateway and other auxiliary services

Network considerations

Hadoop hardware summary

Hadoop distributions

Hadoop versions

Choosing Hadoop distribution

Cloudera Hadoop distribution

Hortonworks Hadoop distribution

MapR

Choosing OS for the Hadoop cluster

Summary

2. Installing and Configuring Hadoop

Configuring OS for Hadoop cluster

Choosing and setting up the filesystem

Setting up Java Development Kit

Other OS settings

Setting up the CDH repositories

Setting up NameNode

JournalNode, ZooKeeper, and Failover Controller

Hadoop configuration files

NameNode HA configuration

JobTracker configuration

Configuring the job scheduler

JobQueueTaskScheduler

FairScheduler

CapacityTaskScheduler

DataNode configuration

TaskTracker configuration

Advanced Hadoop tuning

hdfs-site.xml

mapred-site.xml

core-site.xml

Summary

3. Configuring the Hadoop Ecosystem

Hosting the Hadoop ecosystem

Sqoop

Installing and configuring Sqoop

Sqoop import example

Sqoop export example

Hive

Hive architecture

Installing Hive Metastore

Installing the Hive client

Installing Hive Server

Impala

Impala architecture

Installing Impala state store

Installing the Impala server

Summary

4. Securing Hadoop Installation

Hadoop security overview

HDFS security

MapReduce security

Hadoop Service Level Authorization

Hadoop and Kerberos

Kerberos overview

Kerberos in Hadoop

Configuring Kerberos clients

Generating Kerberos principals

Enabling Kerberos for HDFS

Enabling Kerberos for MapReduce

Summary

5. Monitoring Hadoop Cluster

Monitoring strategy overview

Hadoop Metrics

JMX Metrics

Monitoring Hadoop with Nagios

Monitoring HDFS

NameNode checks

JournalNode checks

ZooKeeper checks

Monitoring MapReduce

JobTracker checks

Monitoring Hadoop with Ganglia

Summary

6. Deploying Hadoop to the Cloud

Amazon Elastic MapReduce

Installing the EMR command-line interface

Choosing the Hadoop version

Launching the EMR cluster

Temporary EMR clusters

Preparing input and output locations

Using Whirr

Installing and configuring Whirr

Summary

Index
