6.2 Monitoring actors

There are several scenarios in which you need to monitor the life cycle of a group of actors. In particular, you can significantly simplify error handling and fault tolerance in a concurrent system through monitoring. Here are some examples:

  • Scenario A. We want to be notified when an actor terminates normally or abnormally. For instance, we might want to replace an actor that terminated because of an unhandled exception. Or we might want to rethrow the exception in a different actor that can handle it.
  • Scenario B. We want to express that an actor depends on some other actor in the sense that the former cannot function without the latter. For instance, in a typical master-slave architecture the work that a slave does is useless if the master has crashed. In this case, it is desirable for all slaves to terminate automatically whenever the master crashes, to avoid needless consumption of resources such as memory.

Both of the above scenarios require us to monitor an actor's life cycle. In particular, they require us to be notified when an actor terminates, normally or abnormally. The actors package provides special support for managing such notifications. However, before diving into those monitoring constructs it is helpful to look at the ways in which actors can terminate.

Actor termination

There are three reasons why an actor terminates:

  1. The actor finishes executing the body of its act method.
  2. The actor invokes exit.
  3. An exception propagates out of the actor's body.

The first reason is a special case of the second one: After executing an actor's body, the runtime system invokes exit implicitly on the terminating actor. The exit method can be invoked with or without passing an argument.

The Actor trait contains the following two method definitions (omitting the modifiers):

  def exit(): Nothing
  def exit(reason: AnyRef): Nothing

Both methods have result type Nothing, which means that invocations never return normally: an exception is thrown in all cases. The Throwable that exit throws should never be caught inside the actor, since it is used for internal life-cycle control. Invoking exit (with or without argument) terminates the current actor's execution. The reason parameter indicates why the actor is terminating. Invoking exit without an argument is equivalent to passing the Symbol[1] 'normal to exit; it indicates that the actor terminated normally. Examples of arguments that indicate abnormal termination are:

  • Exceptions that the actor cannot handle
  • Message objects that the actor cannot process
  • Invalid user input

Exceptions that propagate out of an actor's body lead to that actor's abnormal termination. In the following section, you will learn how actors can react to the termination of other actors. We will show the difference between normal and abnormal termination as seen from an outside actor. More importantly, we will see how to obtain the exit reason of another actor that terminated abnormally.

Linking actors

An actor that wants to receive notifications when another actor terminates must link itself to the other actor. Actors that are linked together implicitly monitor each other.

For example, Listing 6.2 shows a slave actor, which is supposed to do work on behalf of a master actor. The work that the slave does is useless without the master, since the master manages all results produced by the slave; the slave depends on its master. This means that whenever the master crashes, its dependent slave should terminate, since otherwise it would only needlessly consume resources. This is where links come into play. Using the link method, the slave actor links itself to the master actor to express the fact that it depends on it. As a result, the slave is notified whenever its master terminates.

    object Master extends Actor {
      def act() {
        Slave ! 'doWork
        react {
          case 'done =>
            throw new Exception("Master crashed")
        }
      }
    }
    object Slave extends Actor {
      def act() {
        link(Master)
        loop {
          react {
            case 'doWork =>
              println("Done")
              reply('done)
          }
        }
      }
    }
Listing 6.2 - Linking dependent actors.

By default, termination notifications are not delivered as messages to the mailbox of the notified actor. Instead, they have the following effect:

  • If the terminating actor's exit reason is 'normal, no action is taken.
  • If the terminating actor's exit reason is different from 'normal, the notified actor automatically terminates with the same exit reason.

In our master-slave example, this means that the master actor's termination, caused by the unhandled exception, results in the slave actor's termination; the slave's exit reason is the same as the master's, namely an instance of UncaughtException. The purpose of class UncaughtException is to provide information about the context in which the exception was thrown, such as the actor, the last message that actor processed, and the sender of that message. The next section shows how to use that information effectively.

Let's use the interpreter shell to interact with the two actors:

  scala> Slave.start()
  res0: scala.actors.Actor = Slave$@190c99

  scala> Slave.getState
  res1: scala.actors.Actor.State.Value = Suspended

  scala> Master.start()
  Done
  res2: scala.actors.Actor = Master$@395aaf

  scala> Master.getState
  res3: scala.actors.Actor.State.Value = Terminated

  scala> Slave.getState
  res4: scala.actors.Actor.State.Value = Terminated

Right after starting the Slave, its state is Suspended. When the Master starts, it sends a 'doWork request to its Slave, which prints Done to the console and replies to the Master with 'done. Once the Master receives 'done, it throws an unhandled exception causing it to terminate abnormally. Because of the link between Slave and Master, this causes the Slave to terminate automatically. Therefore, both actors are in state Terminated at the end.

Trapping termination notifications. In some cases, it is useful to receive termination notifications as messages in a monitoring actor's mailbox. For example, a monitoring actor may want to rethrow an exception that isn't handled by some linked actor. Or, it may want to react to normal termination, which is not possible by default.

You can configure actors to receive all termination notifications as normal messages in their mailbox using the Boolean trapExit flag. In the following example, actor b links itself to actor a:

  val a = actor { ... }
  val b = actor {
    self.trapExit = true
    link(a)
    ...
  }

Note that before actor b invokes link, it sets its trapExit member to true; this means that whenever a linked actor terminates (normally or abnormally), b receives a message of type Exit (see below). Therefore, actor b is going to be notified whenever actor a terminates (assuming that actor a did not terminate before b's invocation of link).

    val a = actor {
      react {
        case 'start =>
          val somethingBadHappened = true
          if (somethingBadHappened)
            throw new Exception("Error!")
          println("Nothing bad happened")
      }
    }
    val b = actor {
      self.trapExit = true
      link(a)
      a ! 'start
      react {
        case Exit(from, reason) if from == a =>
          println("Actor 'a' terminated because of " + reason)
      }
    }
Listing 6.3 - Receiving a notification because of an unhandled exception.

Listing 6.3 makes this more concrete by having actor a throw an exception. The exception causes a to terminate, resulting in an Exit message to actor b. Running it produces the following output:

  Actor 'a' terminated because of UncaughtException(...)

Exit is a case class with the following parameters:

  case class Exit(from: AbstractActor, reason: AnyRef)

The first parameter tells us which actor has terminated. In Listing 6.3, actor b uses a guard in the message pattern to only react to Exit messages indicating that actor a has terminated. The second parameter of the Exit case class indicates the reason why actor from has terminated.

When a linked actor terminates because of an unhandled exception, the resulting Exit message has an instance of UncaughtException as its reason. UncaughtException is a case class with the following fields:

  • actor: Actor: the actor that threw the uncaught exception
  • message: Option[Any]: the (optional) message the actor was processing; None if the actor did not receive a message
  • sender: Option[OutputChannel[Any]]: the (optional) sender of the most recently processed message
  • thread: Thread: the thread on which the actor was running when the exception was thrown
  • cause: Throwable: the exception that caused the actor to terminate

Since UncaughtException is a case class, it can be matched against when receiving an Exit message. For instance, in Listing 6.3 we can extract the exception that caused actor a to terminate directly from the Exit message:

  react {
    case Exit(from, UncaughtException(_, _, _, _, cause))
        if from == a =>
      println("Actor 'a' terminated because of " + cause)
  }

Running Listing 6.3 with the above change results in the following output:

  Actor 'a' terminated because of java.lang.Exception: Error!

When an actor's trapExit member is true, the actor is also notified when a linked actor terminates normally; for instance, when it finishes the execution of its body. In this case, the Exit message's reason field is the Symbol 'normal. You can try this yourself by changing the local variable somethingBadHappened to false. The output of running the code should then look like this:

  Nothing bad happened
  Actor 'a' terminated because of 'normal
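
The notification rules described so far (the default behavior and the effect of trapExit) boil down to a single decision. The following library-free sketch is only a model of those rules, not the implementation used by the actors package. The names ExitNotification, Outcome, Ignored, TerminatedWith, Delivered, and onLinkedExit are made up for illustration, and Symbol("normal") denotes the same value as the literal 'normal.

```scala
object ExitNotification {
  // Same value as the symbol literal 'normal used in the text.
  val Normal: AnyRef = Symbol("normal")

  // Hypothetical outcomes for the notified (linked) actor.
  sealed trait Outcome
  case object Ignored extends Outcome                     // nothing happens
  case class TerminatedWith(reason: AnyRef) extends Outcome // notified actor exits too
  case class Delivered(reason: AnyRef) extends Outcome     // stands in for an Exit message

  // Models what happens to a linked actor when its peer exits with `reason`.
  def onLinkedExit(trapExit: Boolean, reason: AnyRef): Outcome =
    if (trapExit) Delivered(reason)          // every notification becomes a message
    else if (reason == Normal) Ignored       // default: normal exits are not reported
    else TerminatedWith(reason)              // default: abnormal exits propagate
}
```

Under this model, onLinkedExit(false, "boom") yields TerminatedWith("boom"), whereas onLinkedExit(true, "boom") yields Delivered("boom"), which corresponds to b receiving an Exit message in Listing 6.3.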

Restarting crashed actors

In some cases, it is useful to restart an actor that has terminated because of an unhandled exception. By resetting a crashed actor's state, or at least parts of it, chances are that the actor can successfully process outstanding messages in its mailbox. Alternatively, upon restart the outstanding messages could be retrieved from the crashed actor's mailbox and forwarded to a healthy actor.

Listing 6.4 shows how to create a keep-alive actor that monitors another actor, restarting it whenever it crashes. The idea is that the keep-alive actor first links itself to the monitored actor (the patient), and then invokes keepAlive. The keepAlive method works as follows: When receiving an Exit message indicating the abnormal termination of patient (in this case, reason != 'normal), we re-link self to patient and restart it. Finally, keepAlive invokes itself recursively to continue monitoring the patient.

  // assumes `self` linked to `patient` and `self.trapExit == true`
  def keepAlive(patient: Actor): Nothing = {
    react {
      case Exit(from, reason) if from == patient =>
        if (reason != 'normal) {
          link(patient)
          patient.restart()
          keepAlive(patient)
        }
    }
  }
Listing 6.4 - Monitoring and restarting an actor using link and restart.

You may wonder why we link self to the patient actor before restarting it. After all, keepAlive assumes that this link already exists. The reason is that self automatically unlinks itself when receiving an Exit message from patient. We do this to avoid leaking memory through links that are never removed. Since in most cases terminated actors are not restarted, this behavior is a good default.
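
To see why the re-link matters, consider a toy model that involves no actors at all. The names RelinkModel and restarts are hypothetical. The only behavior the model encodes is the one described above: a link is removed once its Exit message has been delivered, and a monitor can only restart a patient whose crash it was notified about.

```scala
object RelinkModel {
  // Returns how often the monitor restarts the patient when up to
  // `crashingRequests` requests would each crash a running patient.
  def restarts(crashingRequests: Int, relink: Boolean): Int = {
    var linked = true   // keepAlive assumes the link exists initially
    var alive = true
    var count = 0
    for (_ <- 1 to crashingRequests) {
      if (alive) {
        alive = false         // the patient crashes
        if (linked) {         // an Exit message reaches the monitor...
          linked = relink     // ...and the link is gone unless re-established
          alive = true        // the monitor restarts the patient
          count += 1
        }
      }
    }
    count
  }
}
```

With relink set to true, three crashing requests lead to three restarts. Without the re-link, only the first crash is observed, so the patient is restarted once and then stays dead.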

    val crasher = actor {
      println("I'm (re-)born")
      var cnt = 0
      loop {
        cnt += 1
        react {
          case 'request =>
            println("I try to service a request")
            if (cnt % 2 == 0) {
              println("sometimes I crash...")
              throw new Exception
            }
          case 'stop =>
            exit()
        }
      }
    }
  
    val client = actor {
      react {
        case 'start =>
          for (_ <- 1 to 6) { crasher ! 'request }
          crasher ! 'stop
      }
    }

    actor {
      self.trapExit = true
      link(crasher)
      client ! 'start
      keepAlive(crasher)
    }
Listing 6.5 - Using keepAlive to automatically restart a crashed actor.

  def renderImages(url: String) {
    val imageInfos = scanForImageInfo(url)
    self.trapExit = true
    val dataFutures = for (info <- imageInfos) yield {
      val loader = link {
        react { case Download(info) =>
          throw new Exception("no connection")
          reply(info.downloadImage())
        }: Unit
      }
      loader !! Download(info)
    }
    var i = 0
    loopWhile (i < imageInfos.size) {
      i += 1
      val Input = dataFutures(i-1).inputChannel
      react {
        case Input ! (data @ ImageData(_)) =>
          renderImage(data)
        case Exit(from, UncaughtException(_, Some(Download(info)),
                                          _, _, cause)) =>
          println("Couldn't download image "+info+
                  " because of "+cause)
      }
    }
  }
Listing 6.6 - Reacting to Exit messages for exception handling.

Listing 6.5 shows how to use our keepAlive method to automatically restart an actor whenever it crashes. Actor crasher is the actor that we want to monitor and restart. It maintains a counter such that whenever the counter is even, handling a 'request message results in an exception being thrown. Since the exception is not handled, it causes the actor to crash. We can also tell the crasher to stop, thereby terminating it normally. The client actor waits for a 'start message, and then sends several requests to crasher, some of which cause crashes.

The last actor, the keep-alive actor, links itself to the crasher with trapExit set to true. It is important that the keep-alive actor links itself to the crasher before it sends 'start to the client. Otherwise, the client could cause the crasher to terminate without sending an Exit message to the keep-alive actor; since the Exit message would never be received, the crasher actor would not be restarted. Running the code in Listing 6.5 produces the following output:

  I'm (re-)born
  I try to service a request
  I try to service a request
  sometimes I crash...
  I'm (re-)born
  I try to service a request
  I try to service a request
  sometimes I crash...
  I'm (re-)born
  I try to service a request
  I try to service a request
  sometimes I crash...
  I'm (re-)born

As you can see, the crasher actor processes six 'request messages. Every second message results in a crash, causing the keep-alive actor to restart it. Restarting the crasher re-runs its body, producing a rebirth message.
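
To check why six requests lead to exactly three crashes, the counter logic of the crasher can be replayed sequentially. This is a sketch without any actors. CrasherTrace and trace are illustrative names, and the replay assumes that each restart happens before the next request is processed and that pending requests survive the restart.

```scala
import scala.collection.mutable.ListBuffer

object CrasherTrace {
  // Replays the console output of the crasher for `requests` 'request messages.
  def trace(requests: Int): List[String] = {
    val out = ListBuffer("I'm (re-)born")   // printed when the body first runs
    var cnt = 0
    for (_ <- 1 to requests) {
      cnt += 1                              // the loop body increments before each react
      out += "I try to service a request"
      if (cnt % 2 == 0) {
        out += "sometimes I crash..."
        out += "I'm (re-)born"              // the restart re-runs the body...
        cnt = 0                             // ...which re-initializes the local counter
      }
    }
    out.toList
  }
}
```

Calling CrasherTrace.trace(6) yields 13 lines, three of which are crash messages, and the last line is a rebirth message, since the sixth request crashes the actor once more before 'stop is handled.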

Exception handling using futures

One advantage of futures over simple asynchronous messages is that they make it easy to identify to which request they correspond. Basically, each future represents the asynchronous request that created the future. We can leverage this property of futures for exception handling. Let's revisit the image downloader example of Chapter 5. In the following section, we will show how you can extend Listing 5.10 to handle exceptions that may be thrown during image retrieval (for instance, IOExceptions).

Listing 6.6 shows the renderImages method extended with code to handle uncaught exceptions in the downloader actors. The idea is as follows: First, the actor that renders the images sets its trapExit member to true, which enables it to receive termination notifications from linked actors. Second, the renderer actor links itself to each downloader actor. For this, we use one of the link methods defined in the Actor object. The variant we use takes a code block (more precisely, a by-name parameter of type => Unit) as an argument, creates a new actor to execute that block, and links the caller to the newly created actor. More importantly, you link and start the new actor in a single, atomic operation to avoid a subtle race condition. Between starting the new actor and linking to it, the newly created actor could die, which would result in a lost Exit message. This is the main reason the Actor object provides a link method that takes a code block as an argument.

Note that you have to add an explicit type annotation to the react expression. The reason is that the return type of react is Nothing, which is compatible with both link methods since Nothing is a subtype of every other type. By adding the : Unit type ascription, we force the compiler to select the link method that takes a code block.
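
The following minimal sketch isolates the atomic variant of link. It assumes the scala.actors package described in this chapter is on the class path. AtomicLink and awaitChildExit are illustrative names, and receive is used on the calling thread only so that the observed exit reason can be returned. Because the child's block ends in a println, its type is Unit and no type ascription is needed here.

```scala
import scala.actors.Actor._
import scala.actors.Exit

object AtomicLink {
  // Starts a short-lived child and returns the exit reason seen by the caller.
  def awaitChildExit(): AnyRef = {
    self.trapExit = true   // deliver termination notifications as Exit messages

    // Racy two-step alternative (not used here):
    //   val child = actor { println("child running") }
    //   link(child)   // the child may already be gone, so no Exit arrives

    // Creating, linking, and starting happen in one step.
    val child = link {
      println("child running")
    }

    // Block until the notification for this particular child arrives.
    receive {
      case Exit(from, reason) if from == child => reason
    }
  }

  def main(args: Array[String]): Unit =
    println("child terminated because of " + awaitChildExit())
}
```

Since the child finishes its body without an exception, the returned reason is expected to be the Symbol 'normal.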

After the renderer actor has sent out all download requests, it loops trying to receive ImageData objects from each future's input channel. To handle uncaught exceptions in the downloader actors, the renderer also reacts to Exit messages. Whenever an Exit message with an UncaughtException as its reason is received, we use a nested pattern to extract the message the terminated actor was processing. This enables us to easily access the corresponding ImageInfo, since it was passed as part of the Download message.


Footnotes for Chapter 6:

[1] In Scala, Symbols are similar to strings, except that they are always interned, which makes equality checks fast. Also, the syntax for creating Symbols is slightly more lightweight compared to strings.
