1.6 Programming the data center

One example of building deterministic systems out of indeterministic component computations is the data center constructed from commercial, off-the-shelf (COTS) components. Many well-known web services companies have demonstrated the economic advantages of using COTS hardware as the basic building block of highly reliable data centers. Such an environment becomes practical when infrastructure software relieves developers of the need to concern themselves with the intricacies of how the data center partitions work among its hardware components. Instead, application developers can focus on higher-level concerns, such as specifying the algorithms to use when servicing an incoming request.

Example: MapReduce. A good example of an infrastructure that makes programming COTS clusters easier is MapReduce.[13] With MapReduce, a user provides a data set along with the algorithms to operate on that data, and submits both as a job to the MapReduce infrastructure software. The MapReduce software, in turn, distributes the workload required to compute the job across the available cluster nodes and returns a result to the user. (In Chapter 9 you will learn how to build an actor-based MapReduce data-processing engine.)
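
To make that division of labor concrete, the following is a minimal, single-machine sketch in Scala of the MapReduce programming model: the user supplies only the map and reduce functions, while the mapReduce helper is a hypothetical stand-in for the infrastructure software. A real engine would, of course, distribute this work across cluster nodes rather than run it in one process.

  // A minimal, single-machine sketch of the MapReduce programming model.
  // The mapReduce helper and the word-count job below are hypothetical
  // illustrations; a real engine would distribute this work across nodes.
  object MapReduceSketch {

    // The user supplies a map function (record => key/value pairs) and a
    // reduce function (key plus all values for that key => reduced values).
    def mapReduce[K, V, K2, V2](
        records: Seq[(K, V)],
        map: (K, V) => Seq[(K2, V2)],
        reduce: (K2, Seq[V2]) => Seq[V2]): Map[K2, Seq[V2]] = {
      val intermediate = records.flatMap { case (k, v) => map(k, v) }
      val grouped = intermediate.groupBy(_._1).map {
        case (k2, pairs) => k2 -> pairs.map(_._2)
      }
      grouped.map { case (k2, values) => k2 -> reduce(k2, values) }
    }

    def main(args: Array[String]): Unit = {
      val docs = Seq("doc1" -> "the quick brown fox", "doc2" -> "the lazy dog")
      val counts = mapReduce[String, String, String, Int](
        docs,
        (_, text) => text.split("\\s+").toSeq.map(word => word -> 1),
        (_, ones) => Seq(ones.sum))
      println(counts) // Map(the -> List(2), quick -> List(1), ...)
    }
  }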

An important aspect of MapReduce is that, upon submitting a job, a user can reasonably expect some result back. For instance, should a node executing part of a MapReduce job fail to return results within a specified time period, the MapReduce software restarts that component job on another node. Because it guarantees that a result is returned, MapReduce not only allows a compute-intensive job to scale across a cluster of nodes but, more significantly, lends reliability guarantees to the computation. It is that reliability aspect that makes MapReduce suitable for COTS-based compute clusters.
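
The following Scala sketch illustrates that restart-on-timeout idea under stated assumptions: the runOn helper is a hypothetical stand-in for dispatching a task to a cluster node (here it simply runs the task locally), and a task whose node misses the deadline is resubmitted to the next node in the list.

  import java.util.concurrent.TimeoutException
  import scala.concurrent.{Await, ExecutionContext, Future}
  import scala.concurrent.duration._
  import ExecutionContext.Implicits.global

  // A sketch of restarting a timed-out component job on another node.
  object RestartOnTimeout {
    type Node = String

    // Hypothetical stand-in: runs the task locally; a real engine would
    // dispatch it to the given cluster node.
    def runOn[A](node: Node, task: () => A): Future[A] = Future(task())

    def runWithRetry[A](nodes: List[Node], task: () => A,
                        deadline: FiniteDuration): A =
      nodes match {
        case node :: rest =>
          try Await.result(runOn(node, task), deadline)
          catch {
            case _: TimeoutException if rest.nonEmpty =>
              runWithRetry(rest, task, deadline) // restart on another node
          }
        case Nil =>
          throw new IllegalStateException("no nodes available")
      }

    def main(args: Array[String]): Unit = {
      val sum = runWithRetry(List("node-1", "node-2"), () => (1 to 100).sum, 1.second)
      println(sum) // 5050
    }
  }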

While a developer using MapReduce can expect to receive a result back, exactly when the result will arrive cannot be known prior to submitting the job: the user knows only that a result will be received, not when. More generally, the system guarantees that the computation is eventually brought to completion, but a developer using the system cannot, in advance, place a bound on how long that computation will run.

Intuitively, the reason is easy to see: as the infrastructure software partitions the computation, it must communicate with other system components; it must, for instance, send messages to individual cluster nodes and await their replies. Such communication incurs latencies, and those latencies affect how long it takes to return a result. You cannot tell, before submitting a job, how large those latencies will be.
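
The following Scala sketch illustrates that "eventually, but not when" property using standard futures: the caller registers interest in the result rather than waiting for a fixed, pre-chosen amount of time, and the unknown latency is simulated with a random sleep. The submitJob function is purely illustrative.

  import scala.concurrent.{Await, ExecutionContext, Future}
  import scala.concurrent.duration.Duration
  import scala.util.{Failure, Random, Success}
  import ExecutionContext.Implicits.global

  // A sketch of "eventually, but not when": the reply arrives after a
  // latency that cannot be known at submission time.
  object EventuallyButNotWhen {
    def submitJob(input: Seq[Int]): Future[Int] = Future {
      Thread.sleep(Random.nextInt(2000)) // unknown communication/compute latency
      input.sum
    }

    def main(args: Array[String]): Unit = {
      val start  = System.currentTimeMillis()
      val result = submitJob(1 to 10)
      // We know a result will arrive, but not when, so we register a
      // callback instead of blocking for a fixed, pre-chosen time.
      result.onComplete {
        case Success(sum) =>
          println(s"result $sum arrived after ${System.currentTimeMillis() - start} ms")
        case Failure(e) =>
          println(s"job failed: $e")
      }
      Await.ready(result, Duration.Inf) // keep the demo JVM alive until done
      Thread.sleep(100)                 // give the callback a moment to print
    }
  }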

Although some MapReduce implementations aim to ensure that a job returns some (possibly incomplete) results within a specified amount of time, the actor model of concurrent computation is more general: it acknowledges that we may not know in advance how long a concurrent computation will take. Put another way, you cannot, in advance, place a time bound on how long a concurrent computation will run. That is in contrast to traditional sequential algorithms, which model computations with well-defined execution times for a given input.

By acknowledging the property of unbounded computation times, actors aim to provide a more realistic model of concurrent computing. Varying communication latencies are easy to grasp in the case of distributed systems or clusters, but even on a four-core processor it is not possible to tell in advance how long it will take cores 2, 3, and 4 to send their replies back to core 1. All we can say is that the replies will eventually arrive.

At the same time, unboundedness does not imply infinite time: while infinity is an intriguing concept, it is of limited use in realistically modeling computations. The actor model does require that a concurrent computation terminate in finite time; it simply acknowledges that it may not be possible to tell, in advance, just how long that time will be.

In the actor model, unboundedness and indeterminism—or, unbounded indeterminism—are key attributes of concurrent computing. Although these characteristics are also present in primarily sequential systems, they are pervasive in concurrent programs. Acknowledging these attributes of concurrency and providing a model that allows a developer to reason about concurrent programs in the face of those attributes are the prime goals of actors.


Footnotes for Chapter 1:

[1] Hewitt et al., "A Universal Modular ACTOR Formalism for Artificial Intelligence" HewittBS73

[2] Haller and Odersky, "Scala Actors: Unifying Thread-based and Event-based Programming" HallerO09

[3] See http://akka.io/.

[4] Sutter, "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency" Sutter05

[5] Goetz et al., Java Concurrency in Practice Goet06a

[6] Gropp et al., Using MPI: Portable Parallel Programming with the Message-Passing Interface gropp1

[7] Hoare, "Communicating Sequential Processes" Hoar78a

[8] See http://golang.org/.

[9] Armstrong et al., Concurrent Programming in Erlang Armstrong-etal95

[10] Haller and Odersky, "Capabilities for Uniqueness and Borrowing" HallerO10

[11] John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach HennessyP11

[12] Hoare, "Communicating Sequential Processes" Hoar78a

[13] Dean and Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters" Dean-Ghemawat08
