Barriers are a common and simple space-based technique used to synchronize a group of processes. A barrier is a particular point in a distributed computation that every process in a group must reach before any process can proceed further. For instance, a distributed application may start many “worker” processes and have them wait until some initial conditions hold before they are allowed to start processing. Some distributed computations may also proceed in phases, where all processes need to complete phase one before proceeding as a group to phase two.
Barriers are easy to implement with a shared variable: if we have n processes that need to reach a certain point in the computation before any process can continue, then to create a barrier we write a shared variable with the value zero into a space. As each process reaches the barrier, it takes the barrier entry, increments its value, and returns it to the space. Because take removes the entry from the space, only one process can update the counter at a time, so no increment is lost. Each process then reads the barrier entry with a template that specifies a counter value of n, which causes the process to wait until all other processes have reached the barrier.
Let's look at a simple demonstration of a barrier using the SharedVar from Chapter 3. Here is a worker class that makes use of a barrier:
import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

// SharedVar (Chapter 3) and SpaceAccessor are the book's helper classes
// and must be on the classpath.
public class Worker {
    public static void main(String[] args) {
        JavaSpace space = SpaceAccessor.getSpace();
        int id = Integer.parseInt(args[0]);
        int numWorkers = Integer.parseInt(args[1]);

        try {
            // let worker zero create the barrier
            if (id == 0) {
                SharedVar barrier = new SharedVar("barrier", 0);
                space.write(barrier, null, Lease.FOREVER);
            }

            System.out.println("Worker " + id + " before the barrier");

            // barrier point - take the barrier entry and
            // increment its counter
            SharedVar template = new SharedVar("barrier");
            SharedVar result = (SharedVar)
                space.take(template, null, Long.MAX_VALUE);
            result.increment();
            space.write(result, null, Lease.FOREVER);

            // wait until barrier entry has been updated by
            // all workers; the third argument of read is a
            // timeout, so Long.MAX_VALUE means "wait indefinitely"
            SharedVar barrier = new SharedVar("barrier", numWorkers);
            space.read(barrier, null, Long.MAX_VALUE);

            // proceed past the barrier
            System.out.println("Worker " + id + " past the barrier");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The worker's main method expects two command line parameters: a unique integer identifier for the process (a number between 0 and n-1) and the total number of workers participating; these parameters are assigned to the local variables id and numWorkers, respectively.
Following along, the worker then enters the barrier part of the code. If the process has an id of zero, it is responsible for writing the barrier entry into the space with a value of zero. Each worker then does some work before reaching the barrier; in this example the work is just a println statement, which for worker zero prints:
Worker 0 before the barrier
Once we reach the barrier we create a template, remove the matching barrier entry from the space, increment its value, and return the entry to the space. As each process reaches this fragment of code, the number in the barrier entry will approach numWorkers.
The worker then creates a barrier template, specifying a counter value of numWorkers, and calls read. This read will block until all processes have reached the barrier and incremented the barrier entry. Once that occurs, the value of the barrier entry equals numWorkers, so every blocked read matches and all processes proceed to do post-barrier work, which in this example is a println statement that, for worker zero, prints:
Worker 0 past the barrier
That is really all there is to a barrier. Give this example a try by running several workers.
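If a running space is not at hand, the same protocol can be tried inside a single JVM. The following is a minimal sketch, not the JavaSpaces API: TinySpace is a hypothetical in-memory stand-in that holds one counter and only mimics the blocking behavior of take, write, and read, and each worker is a thread rather than a separate process. BarrierDemo and its run method are likewise names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory stand-in for a space that holds a single counter entry.
// NOT the JavaSpaces API; it only mimics the blocking behavior of take/write/read.
class TinySpace {
    private Integer counter = null; // null: entry absent or currently taken

    synchronized void write(int value) {
        counter = value;
        notifyAll(); // wake any thread blocked in take or read
    }

    synchronized int take() throws InterruptedException {
        while (counter == null) wait();
        int value = counter;
        counter = null; // the entry is removed, so only one taker holds it
        return value;
    }

    // Blocks until the entry is present and its value equals 'expected'.
    synchronized void read(int expected) throws InterruptedException {
        while (counter == null || counter != expected) wait();
    }
}

class BarrierDemo {
    // Runs numWorkers threads through the counter barrier and
    // returns the messages in the order they were logged.
    static List<String> run(int numWorkers) throws InterruptedException {
        TinySpace space = new TinySpace();
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < numWorkers; i++) {
            final int id = i;
            Thread worker = new Thread(() -> {
                try {
                    if (id == 0) space.write(0);      // worker zero creates the barrier
                    log.add("Worker " + id + " before the barrier");
                    space.write(space.take() + 1);    // take, increment, write back
                    space.read(numWorkers);           // wait for all workers
                    log.add("Worker " + id + " past the barrier");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) worker.join();
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        run(4).forEach(System.out::println);
    }
}
```

The order of workers within each group varies from run to run, but every "before the barrier" message is logged ahead of the first "past the barrier" message, which is the property the barrier is meant to provide.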
Note that we have presented one method of creating a barrier, but there are many variations on this theme: barriers can be initially set to numWorkers and decremented each time a worker reaches the barrier. Other times, barriers aren't dependent on the number of processes, but rather some external condition (such as the total tasks computed by a set of processes), and a master process places an entry into the space when it is okay to move past the barrier. We will see an example of this in Chapter 11.
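Both variations can be sketched in the same spirit. The code below is an illustration under stated assumptions, not the Chapter 11 example: NamedSpace is a hypothetical in-memory stand-in for a space (not the JavaSpaces API), and the entry names "barrier", "tasksDone", and "go" are invented here. The countdown method starts the barrier at the number of workers and waits for zero; the masterGate method has a master release the workers once a task tally reaches a target.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for a space of named integer entries.
// NOT the JavaSpaces API; it only mimics blocking take/write/read.
class NamedSpace {
    private final Map<String, Integer> entries = new HashMap<>();

    synchronized void write(String name, int value) {
        entries.put(name, value);
        notifyAll();
    }

    synchronized int take(String name) throws InterruptedException {
        while (!entries.containsKey(name)) wait();
        return entries.remove(name);
    }

    // Blocks until the named entry is present with the expected value.
    synchronized void read(String name, int expected) throws InterruptedException {
        while (!Integer.valueOf(expected).equals(entries.get(name))) wait();
    }
}

class BarrierVariants {
    // Variation 1: the barrier starts at n and each worker decrements it.
    // Returns how many workers got past the barrier.
    static int countdown(int n) throws InterruptedException {
        NamedSpace space = new NamedSpace();
        space.write("barrier", n);
        AtomicInteger passed = new AtomicInteger();
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            workers[i] = new Thread(() -> {
                try {
                    space.write("barrier", space.take("barrier") - 1);
                    space.read("barrier", 0); // wait until everyone has arrived
                    passed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();
        return passed.get();
    }

    // Variation 2: the barrier depends on total tasks completed; a master
    // writes a "go" entry when the tally is reached.
    // Returns true if no worker passed before the master released the gate.
    static boolean masterGate(int n, int tasksPerWorker) throws InterruptedException {
        NamedSpace space = new NamedSpace();
        space.write("tasksDone", 0);
        AtomicBoolean released = new AtomicBoolean(false);
        AtomicInteger early = new AtomicInteger();
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int t = 0; t < tasksPerWorker; t++) {
                        space.write("tasksDone", space.take("tasksDone") + 1);
                    }
                    space.read("go", 1); // blocked until the master writes "go"
                    if (!released.get()) early.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        // The calling thread plays the master.
        space.read("tasksDone", n * tasksPerWorker);
        released.set(true);
        space.write("go", 1);
        for (Thread w : workers) w.join();
        return early.get() == 0;
    }
}
```

In the second variation the workers never inspect the tally themselves; only the master decides when the condition holds, which is what makes this form suitable when the barrier depends on something other than the number of processes.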