Book Description
Organizations are eager to capitalize on real-time data analysis, move beyond batch processing for time-critical insights, and excel at big data in a predictable, reliable way. But performance has been an issue for distributed systems like Hadoop, especially when a single cluster has to serve multiple tenants or multiple workloads. The worst part? You may not even know you have a performance issue.
In this report, Chad Carson and Sean Suchter from Pepperdata describe the performance challenges of running multi-tenant distributed computing environments, especially within a Hadoop context. After examining pros and cons of current solutions for these problems, you’ll learn how to use real-time, intelligent software that tracks and dynamically adjusts each application’s usage of physical hardware. Get ahead of your Hadoop operations for faster, better decision-making and faster, better business returns.
With this report, you’ll explore:
- How Hadoop and other multi-tenant distributed systems work, and why performance matters
- Business-visible symptoms of performance problems: late jobs, inconsistent runtimes, and underutilized hardware
- Scheduling challenges in multi-tenant systems
- Symptoms and solutions for CPU performance limitations
- Physical and virtual limits of node memory—and what happens when you run out
- Identifying and solving problems caused by disk and network performance limits and other typical bottlenecks
- Solutions for monitoring performance and accurately allocating cluster costs among users and business units