1.2 Actors versus threads

In a concurrent program, many independently executing threads, or sequential processes, work together to fulfill an application's requirements. Research on concurrent programming has mostly focused on defining how concurrently executing sequential processes can communicate such that a larger process (for example, a program that executes those processes) can proceed predictably.

The two most common ways for concurrent threads to communicate are synchronization on shared state and message passing. Many familiar programming constructs, such as semaphores and monitors, are based on shared-state synchronization, and most developers of concurrent programs know them well; Java programmers, for example, can find such constructs in the java.util.concurrent package in common Java distributions.[5] Among the biggest challenges for anyone using shared-state concurrency are avoiding concurrency hazards, such as data races and deadlocks, and achieving scalability.
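To make the shared-state style concrete, the following is a minimal sketch (the class name and counter are illustrative, not from the text) of two threads synchronizing on shared state with a `ReentrantLock` from java.util.concurrent:

```java
import java.util.concurrent.locks.ReentrantLock;

// Two threads increment a shared counter; the lock serializes access
// to the shared state so no updates are lost.
public class SharedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    int count = 0;

    void increment() {
        lock.lock();       // acquire exclusive access to the shared state
        try {
            count++;       // without the lock, this read-modify-write would race
        } finally {
            lock.unlock(); // always release, even if an exception occurs
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedCounter c = new SharedCounter();
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.count); // prints 200000
    }
}
```

Removing the lock calls makes the program exhibit exactly the hazards the text names: the two unsynchronized read-modify-write sequences race, and the final count becomes unpredictable.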

Message passing is an alternative way of synchronizing cooperating threads. There are two important categories of systems based on message passing. In channel-based systems, messages are sent to channels (or ports) that processes can share. Several processes can then receive messages from the same shared channels. Examples of channel-based systems are the Message-Passing Interface (MPI)[6] and systems based on the Communicating Sequential Processes (CSP) paradigm,[7] such as the Go language.[8] Systems based on actors (or agents, or Erlang-style processes[9]) form the second category of message-passing concurrency. In these systems, messages are sent directly to actors; you don't need to create intermediary channels between processes.
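The channel-based style can be sketched in Java by letting a `BlockingQueue` stand in for a shared channel (the class and message strings here are illustrative assumptions, not part of any library's channel API):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// A BlockingQueue serves as a shared channel: any thread holding a
// reference to it may send (put) or receive (take) messages.
public class ChannelSketch {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(10);

        Thread producer = new Thread(() -> {
            try {
                channel.put("hello"); // send a message into the channel
                channel.put("world");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // The receiver takes messages from the same shared channel,
        // in the order they were sent.
        System.out.println(channel.take() + " " + channel.take()); // prints hello world
        producer.join();
    }
}
```

In an actor-based system, by contrast, the queue would be an actor's private mailbox: senders address the actor directly rather than a channel object that several receivers might share.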

An important advantage of message passing over shared-state concurrency is that it makes it easier to avoid data races. A data race happens whenever two processes access the same piece of data concurrently and at least one of the accesses is mutating (that is, changing the value of) the data. For example, two Java threads concurrently accessing the same field of the same instance, such that one of the threads reassigns the field, constitutes a data race. If processes communicate only by passing messages, and those messages are immutable, then data races are avoided by design.
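The "immutable messages" discipline described above can be sketched as follows; the `Reading` record and queue-based hand-off are illustrative assumptions, not a prescribed design:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// An immutable message type: all state is fixed at construction.
// Sharing a reference to it between threads cannot cause a data race,
// because no access to it is a mutating access.
record Reading(String sensor, double value) { }

public class ImmutableMessages {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Reading> inbox = new LinkedBlockingQueue<>();

        Thread sender = new Thread(() -> inbox.add(new Reading("temp", 21.5)));
        sender.start();
        sender.join();

        Reading msg = inbox.take();
        // The receiver may read msg freely and concurrently with anyone
        // else holding the reference; there is nothing to mutate.
        System.out.println(msg.sensor() + "=" + msg.value()); // prints temp=21.5
    }
}
```

If `Reading` instead exposed a settable field, concurrent access by sender and receiver would reintroduce exactly the field-reassignment race the paragraph describes.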

Aside from such low-level data races, higher-level data races exist. For example, a process may depend on receiving two messages in a certain order. If it is possible that the two messages are sent concurrently to that process, the program contains a race condition; this means that in some runs the program enters an invalid state through concurrent modification of shared state, namely the state of the (shared) receiving process. Leaving data races aside for a moment, anecdotal evidence suggests that message passing in practice also reduces the risk of deadlock.
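The higher-level race can be made concrete with a small sketch (class and message names are illustrative): two messages are sent concurrently, so the receiver cannot rely on their arrival order even though each message is immutable and the queue itself is thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Two senders run concurrently; "init" and "update" may arrive in
// either order. A receiver that requires "init" first contains a
// race condition, with no low-level data race anywhere in sight.
public class OrderingRace {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>();

        Thread t1 = new Thread(() -> inbox.add("init"));
        Thread t2 = new Thread(() -> inbox.add("update"));
        t1.start(); t2.start();
        t1.join(); t2.join();

        List<String> received = new ArrayList<>();
        received.add(inbox.take());
        received.add(inbox.take());
        // Both messages always arrive, but their order is nondeterministic.
        System.out.println(received.contains("init")
                && received.contains("update")); // prints true
    }
}
```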

A potential disadvantage of message passing is its communication overhead: processes must create and send messages, and, to support asynchronous communication, those messages are often buffered in queues before they can be received.

By contrast, shared-state concurrency enables direct access to shared memory, as long as that access is properly synchronized. To reduce the communication overhead of message passing, large messages should not be transferred by copying the message state; instead, only a reference to the message should be sent. However, this reintroduces the risk of data races when several processes have access to the same mutable (message) data. Providing static checkers that can verify that programs passing mutable messages by reference do not contain data races is an ongoing research effort; one example is a Scala compiler plug-in for uniqueness types.[10]
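The trade-off between copying and reference passing can be sketched as follows (the class name and payload are illustrative). Sending a defensive copy costs an extra allocation and copy, but it guarantees that the sender's later mutations cannot race with the receiver; passing the raw reference would be cheaper but unsafe, which is precisely what uniqueness-style checkers aim to make provably safe.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Copying the message state decouples sender and receiver: the
// sender's subsequent writes do not affect what the receiver sees.
public class CopyVersusReference {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<int[]> channel = new LinkedBlockingQueue<>();
        int[] payload = {1, 2, 3};

        channel.put(payload.clone()); // safe: receiver owns its own copy
        payload[0] = 99;              // sender mutates its version afterward...

        int[] received = channel.take();
        // ...but the receiver's copy is unaffected. Had we put(payload)
        // directly, sender and receiver would share mutable state.
        System.out.println(Arrays.toString(received)); // prints [1, 2, 3]
    }
}
```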
