Contents
Chapter 1: Big Data and the Hadoop Ecosystem
Developing Enterprise Applications with Hadoop
Chapter 2: Storing Data in Hadoop
Combining HDFS and HBase for Effective Data Storage
Managing Metadata with HCatalog
Choosing an Appropriate Hadoop Data Organization for Your Applications
Chapter 3: Processing Your Data with MapReduce
Your First MapReduce Application
Designing MapReduce Implementations
Chapter 4: Customizing MapReduce Execution
Controlling MapReduce Execution with InputFormat
Reading Data Your Way with Custom RecordReaders
Organizing Output Data with Custom Output Formats
Writing Data Your Way with Custom RecordWriters
Optimizing Your MapReduce Execution with a Combiner
Controlling Reducer Execution with Partitioners
Using Non-Java Code with Hadoop
Chapter 5: Building Reliable MapReduce Apps
Unit Testing MapReduce Applications
Local Application Testing with Eclipse
Using Logging for Hadoop Testing
Reporting Metrics with Job Counters
Defensive Programming in MapReduce
Chapter 6: Automating Data Processing with Oozie
Oozie Parameterization with Expression Language
Validating Information about Places Using Probes
Designing Place Validation Based on Probes
Implementing Oozie Workflow Applications
Implementing Workflow Activities
Implementing Oozie Coordinator Applications
Implementing Oozie Bundle Applications
Deploying, Testing, and Executing Oozie Applications
Using the Oozie Console to Get Information about Oozie Applications
Chapter 8: Advanced Oozie Features
Building Custom Oozie Workflow Actions
Adding Dynamic Execution to Oozie Workflows
Using Uber Jars with Oozie Applications
Real-Time Applications in the Real World
Using HBase for Implementing Real-Time Applications
Using Specialized Real-Time Hadoop Query Systems
Using Hadoop-Based Event-Processing Systems
A Brief History: Understanding Hadoop Security Challenges
Oozie Authentication and Authorization
Security Enhancements with Project Rhino
Putting It All Together: Best Practices for Securing Hadoop
Chapter 11: Running Hadoop Applications on AWS
Options for Running Hadoop on AWS
Understanding the EMR-Hadoop Relationship
Automating EMR Job Flow Creation and Job Execution
Orchestrating Job Execution in EMR
Chapter 12: Building Enterprise Security Solutions for Hadoop Implementations
Security Concerns for Enterprise Applications
What Hadoop Security Doesn’t Natively Provide for Enterprise Applications
Approaches for Securing Enterprise Applications Using Hadoop
Simplifying MapReduce Programming with DSLs