4.4. Barrier Synchronization

Barriers are a common and simple space-based technique used to synchronize a group of processes. A barrier is a particular point in a distributed computation that every process in a group must reach before any process can proceed further. For instance, a distributed application may start many “worker” processes and have them wait until some initial conditions hold before they are allowed to start processing. Some distributed computations may also proceed in phases, where all processes need to complete phase one before proceeding as a group to phase two.

Barriers are easy to implement with a shared variable: If we have n processes that need to reach a certain point in the computation before any processes can continue, then to create a barrier we write a shared variable to a space with the value zero. As each process reaches the barrier, it takes the barrier entry, increments its value, and returns it to the space. Each process then reads the barrier entry with a template that specifies a counter value of n, which causes the process to wait until all other processes have reached the barrier.

Let's look at a simple demonstration of a barrier using the SharedVar from Chapter 3. Here is a worker class that makes use of a barrier:

public class Worker {
  public static void main(String[] args) {
    JavaSpace space = SpaceAccessor.getSpace();

    int id = Integer.parseInt(args[0]);
    int numWorkers = Integer.parseInt(args[1]);

    try {
       // let worker zero create the barrier
       if (id == 0) {
           SharedVar barrier =
             new SharedVar("barrier", 0);
           space.write(barrier, null, Lease.FOREVER);
       }

       System.out.println("Worker " + id +
        " before the barrier");

       // barrier point - take the barrier entry and
       // increment its counter
       SharedVar template = new SharedVar("barrier");
       SharedVar result = (SharedVar)
         space.take (template, null, Long.MAX_VALUE);
       result.increment();
       space.write(result, null, Lease.FOREVER);

       // wait until barrier entry has been updated by
       // all workers
       SharedVar barrier =
         new SharedVar("barrier", numWorkers);
       space.read(barrier, null, Lease.FOREVER);

      // proceed past the barrier
      System.out.println("Worker " +
        id + " past the barrier");

      } catch (Exception e) {
        e.printStackTrace();
      }
  }
}

The worker's main method expects two command line parameters: a unique integer identifier for the process (a number between 0 and n-1) and the total number of workers participating; these parameters are assigned to id and numWorkers fields, respectively.

Following along, the worker then enters into the barrier part of the code. If the process has an id of zero, then it is responsible for writing the barrier entry into the space with a value of zero. First we do some work before reaching the barrier; in this example the work is just a println statement that prints:

Worker 0 before the barrier

Once we reach the barrier we create a template, remove the matching barrier entry from the space, increment its value, and return the entry to the space. As each process reaches this fragment of code, the number in the barrier entry will approach numWorkers.

The worker then creates a barrier template, specifying a counter value of numWorkers, and calls read. This read will block until all processes have reached the barrier and incremented the barrier entry. Once that occurs, the value of the barrier is at numWorkers, and all processes complete their read operation and proceed to do post-barrier work, which in this example is a println statement that prints:

Worker 0 past the barrier

That is really all there is to a barrier. Give this example a try by running several workers.

Note that we have presented one method of creating a barrier, but there are many variations on this theme: barriers can be initially set to numWorkers and decremented each time a worker reaches the barrier. Other times, barriers aren't dependent on the number of processes, but rather some external condition (such as the total tasks computed by a set of processes), and a master process places an entry into the space when it is okay to move past the barrier. We will see an example of this in Chapter 11.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset