There are several scenarios in which you need to monitor the life cycle of a group of actors. In particular, you can significantly simplify error handling and fault tolerance in a concurrent system through monitoring. Here are some examples:
Both of the above scenarios require us to monitor an actor's life cycle. In particular, they require us to be notified when an actor terminates, normally or abnormally. The actors package provides special support for managing such notifications. However, before diving into those monitoring constructs it is helpful to look at the ways in which actors can terminate.
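There are three reasons why an actor terminates:
1. The actor finishes executing its body.
2. The actor invokes exit.
3. An exception propagates out of the actor's body.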
The first reason is a special case of the second one: After executing an actor's body, the runtime system invokes exit implicitly on the terminating actor. The exit method can be invoked with or without passing an argument.
The Actor trait contains the following two method definitions (omitting the modifiers):
  def exit(): Nothing
  def exit(reason: AnyRef): Nothing
Both methods have result type Nothing, which means that invocations never return normally: an exception is thrown in all cases. The particular Throwable instance that is thrown should never be caught inside the actor, since it is used for internal life-cycle control. Invoking exit (with or without an argument) terminates the current actor's execution. The reason parameter indicates why the actor is terminating. Invoking exit without an argument is equivalent to passing the Symbol 'normal to exit; it indicates that the actor terminated normally. Examples of arguments that indicate abnormal termination are:
Exceptions that propagate out of an actor's body lead to that actor's abnormal termination. In the following section, you will learn how actors can react to the termination of other actors. We will show the difference between normal and abnormal termination as seen from an outside actor. More importantly, we will see how to obtain the exit reason of another actor that terminated abnormally.
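To see why a result type of Nothing means that a call can only leave by throwing, here is a minimal sketch in plain Scala. It does not use the actors package: ExitSignal, terminate, and exitReasonOf are hypothetical names introduced for illustration, and the sketch catches the throwable only because exitReasonOf plays the role of the runtime system.

```scala
object NothingSketch {
  // Hypothetical stand-in for the throwable used for life-cycle control.
  final class ExitSignal(val reason: AnyRef) extends Throwable

  // Result type Nothing: the only way out of these methods is by throwing.
  def terminate(reason: AnyRef): Nothing = throw new ExitSignal(reason)
  def terminate(): Nothing = terminate(Symbol("normal"))

  // Plays the role of the runtime: runs a body and reports its exit reason.
  // A body that simply finishes is treated like an implicit normal exit.
  def exitReasonOf(body: => Unit): AnyRef =
    try { body; Symbol("normal") }
    catch { case signal: ExitSignal => signal.reason }
}
```

In this sketch, exitReasonOf { terminate("disk full") } yields the string "disk full", while a body that finishes on its own, or that calls terminate() without an argument, yields the Symbol normal.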
An actor that wants to receive notifications when another actor terminates must link itself to the other actor. Actors that are linked together implicitly monitor each other.
For example, Listing 6.2 shows a slave actor, which is supposed to do work on behalf of a master actor. The work that the slave does is useless without the master, since the master manages all results produced by the slave; in other words, the slave depends on its master. This means that whenever the master crashes, its dependent slave should terminate, since otherwise it would only needlessly consume resources. This is where links come into play. Using the link method, the slave actor links itself to the master actor to express the fact that it depends on it. As a result, the slave is notified whenever its master terminates.
  import scala.actors.Actor

  object Master extends Actor {
    def act() {
      Slave ! 'doWork
      react {
        case 'done =>
          throw new Exception("Master crashed")
      }
    }
  }

  object Slave extends Actor {
    def act() {
      link(Master)
      loop {
        react {
          case 'doWork =>
            println("Done")
            reply('done)
        }
      }
    }
  }
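By default, termination notifications are not delivered as messages to the mailbox of the notified actor. Instead, they have the following effect:
- If the exit reason of the terminated actor is 'normal, the notification is ignored.
- If the exit reason is anything other than 'normal, the notified actor terminates with the same exit reason.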
In our master-slave example, this means that the master actor's termination, which the unhandled exception causes, results in the slave actor's termination; the exit reason of the slave actor is the same as for the master actor, namely an instance of UncaughtException. The purpose of class UncaughtException is to provide information about the context in which the exception was thrown, such as the actor, the last message that actor processed, and the sender of that message. The next section shows how to use that information effectively.
Let's use the interpreter shell to interact with the two actors:
  scala> Slave.start()
  res0: scala.actors.Actor = Slave$@190c99

  scala> Slave.getState
  res1: scala.actors.Actor.State.Value = Suspended

  scala> Master.start()
  Done
  res2: scala.actors.Actor = Master$@395aaf

  scala> Master.getState
  res3: scala.actors.Actor.State.Value = Terminated

  scala> Slave.getState
  res4: scala.actors.Actor.State.Value = Terminated
Right after starting the Slave, its state is Suspended. When the Master starts, it sends a 'doWork request to its Slave, which prints Done to the console and replies to the Master with 'done. Once the Master receives 'done, it throws an unhandled exception causing it to terminate abnormally. Because of the link between Slave and Master, this causes the Slave to terminate automatically. Therefore, both actors are in state Terminated at the end.
Trapping termination notifications. In some cases, it is useful to receive termination notifications as messages in a monitoring actor's mailbox. For example, a monitoring actor may want to rethrow an exception that isn't handled by some linked actor. Or, it may want to react to normal termination, which is not possible by default.
You can configure actors to receive all termination notifications as normal messages in their mailbox using the Boolean trapExit flag. In the following example, actor b links itself to actor a:
  val a = actor { ... }
  val b = actor {
    self.trapExit = true
    link(a)
    ...
  }
Note that before actor b invokes link it sets its trapExit member to true; this means that whenever a linked actor terminates (normally or abnormally) it receives a message of type Exit (see below). Therefore, actor b is going to be notified whenever actor a terminates (assuming that actor a did not terminate before b's invocation of link).
  import scala.actors._
  import scala.actors.Actor._

  val a = actor {
    react {
      case 'start =>
        val somethingBadHappened = true
        if (somethingBadHappened)
          throw new Exception("Error!")
        println("Nothing bad happened")
    }
  }

  val b = actor {
    self.trapExit = true
    link(a)
    a ! 'start
    react {
      case Exit(from, reason) if from == a =>
        println("Actor 'a' terminated because of " + reason)
    }
  }
Listing 6.3 makes this more concrete by having actor a throw an exception. The exception causes a to terminate, resulting in an Exit message to actor b. Running it produces the following output:
Actor 'a' terminated because of UncaughtException(...)
Exit is a case class with the following parameters:
case class Exit(from: AbstractActor, reason: AnyRef)
The first parameter tells us which actor has terminated. In Listing 6.3, actor b uses a guard in the message pattern to only react to Exit messages indicating that actor a has terminated. The second parameter of the Exit case class indicates the reason why actor from has terminated.
The termination of a linked actor that some unhandled exception caused results in an Exit message in which reason is equal to an instance of UncaughtException; it is a case class with the following fields:
Since UncaughtException is a case class, it can be matched against when receiving an Exit message. For instance, in Listing 6.3 we can extract the exception that caused actor a to terminate directly from the Exit message:
  react {
    case Exit(from, UncaughtException(_, _, _, _, cause))
        if from == a =>
      println("Actor 'a' terminated because of " + cause)
  }
Running Listing 6.3 with the above change results in the following output:
Actor 'a' terminated because of java.lang.Exception: Error!
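The nested pattern above relies only on ordinary case-class matching, so it can be tried without the actors package. The following sketch uses hypothetical stand-in classes, ExitMsg and Uncaught, whose field names and types are placeholders; they mirror only the shape used in the text, namely five fields with the cause in the last position.

```scala
object NestedPatternSketch {
  // Hypothetical stand-ins; field names and types are placeholders.
  case class Uncaught(actor: String, message: Option[Any], sender: Option[String],
                      thread: String, cause: Throwable)
  case class ExitMsg(from: String, reason: AnyRef)

  def describe(msg: Any, watched: String): String = msg match {
    // The guard restricts the match to the watched actor; the nested
    // pattern pulls the cause out of the reason in a single step.
    case ExitMsg(from, Uncaught(_, _, _, _, cause)) if from == watched =>
      "crashed: " + cause.getMessage
    // Any other reason reported by the watched actor.
    case ExitMsg(from, reason) if from == watched =>
      "terminated: " + reason
    // Messages about other actors, or unrelated messages.
    case _ =>
      "ignored"
  }
}
```

Because the more specific case comes first, a reason that wraps an exception is reported through its cause, and every other reason from the watched actor falls through to the second case.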
When an actor's trapExit member is true, the actor is also notified when a linked actor terminates normally; for instance, when it finishes the execution of its body. In this case, the Exit message's reason field is the Symbol 'normal. You can try this yourself by changing the local variable somethingBadHappened to false. The output of running the code should then look like this:
  Nothing bad happened
  Actor 'a' terminated because of 'normal
In some cases, it is useful to restart an actor that has terminated because of an unhandled exception. By resetting a crashed actor's state, or at least parts of it, chances are that the actor can successfully process outstanding messages in its mailbox. Alternatively, upon restart the outstanding messages could be retrieved from the crashed actor's mailbox and forwarded to a healthy actor.
Listing 6.4 shows how to create a keep-alive actor that monitors another actor, restarting it whenever it crashes. The idea is that the keep-alive actor first links itself to the monitored actor (the patient), and then invokes keepAlive. The keepAlive method works as follows: When receiving an Exit message indicating the abnormal termination of patient (in this case, reason != 'normal), we re-link self to patient and restart it. Finally, keepAlive invokes itself recursively to continue monitoring the patient.
  // assumes `self` linked to `patient` and `self.trapExit == true`
  def keepAlive(patient: Actor): Nothing = {
    react {
      case Exit(from, reason) if from == patient =>
        if (reason != 'normal) {
          link(patient)
          patient.restart()
          keepAlive(patient)
        }
    }
  }
You may wonder why we link self to the patient actor before restarting it. After all, keepAlive assumes that this link already exists. The reason is that self automatically unlinks itself when receiving an Exit message from patient. We do this to avoid leaking memory through links that are never removed. Since in most cases terminated actors are not restarted, this behavior is a good default.
  val crasher = actor {
    println("I'm (re-)born")
    var cnt = 0
    loop {
      cnt += 1
      react {
        case 'request =>
          println("I try to service a request")
          if (cnt % 2 == 0) {
            println("sometimes I crash...")
            throw new Exception
          }
        case 'stop =>
          exit()
      }
    }
  }

  val client = actor {
    react {
      case 'start =>
        for (_ <- 1 to 6) { crasher ! 'request }
        crasher ! 'stop
    }
  }

  actor {
    self.trapExit = true
    link(crasher)
    client ! 'start
    keepAlive(crasher)
  }
  def renderImages(url: String) {
    val imageInfos = scanForImageInfo(url)
    self.trapExit = true
    val dataFutures = for (info <- imageInfos) yield {
      val loader = link {
        react {
          case Download(info) =>
            // Simulated failure; the reply on the next line is never reached.
            throw new Exception("no connection")
            reply(info.downloadImage())
        }: Unit
      }
      loader !! Download(info)
    }
    var i = 0
    loopWhile (i < imageInfos.size) {
      i += 1
      val Input = dataFutures(i-1).inputChannel
      react {
        case Input ! (data @ ImageData(_)) =>
          renderImage(data)
        case Exit(from, UncaughtException(_, Some(Download(info)), _, _, cause)) =>
          println("Couldn't download image "+info+
                  " because of "+cause)
      }
    }
  }
Listing 6.5 shows how to use our keepAlive method to automatically restart an actor whenever it crashes. Actor crasher is the actor that we want to monitor and restart. It maintains a counter such that whenever the counter is even, handling a 'request message results in an exception being thrown. Since the exception is not handled, it causes the actor to crash. We can also tell the crasher to stop, thereby terminating it normally. The client actor waits for a 'start message, and then sends several requests to crasher, some of which cause crashes.
The last actor, the keep-alive actor, links itself to the crasher with trapExit set to true. It is important that the keep-alive actor links itself to the crasher before the client starts. Otherwise, the client could cause the crasher to terminate without sending an Exit message to the keep-alive actor; since the Exit message would never be received, the crasher actor would not be restarted. Running the code in Listing 6.5 produces the following output:
  I'm (re-)born
  I try to service a request
  I try to service a request
  sometimes I crash...
  I'm (re-)born
  I try to service a request
  I try to service a request
  sometimes I crash...
  I'm (re-)born
  I try to service a request
  I try to service a request
  sometimes I crash...
  I'm (re-)born
As you can see, the crasher actor processes six 'request messages. Every second message results in a crash, causing the keep-alive actor to restart it. Restarting the crasher re-runs its body, producing a rebirth message.
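The output above follows directly from the counter logic: restarting re-runs the body, so cnt starts again at zero and every second request of each incarnation crashes. As a rough check, the following plain-Scala model replays that logic without any actors. The object CrasherTrace and its trace method are hypothetical names introduced here, and the model assumes that every crash is followed by exactly one restart.

```scala
import scala.collection.mutable.ListBuffer

object CrasherTrace {
  // Replays the counter logic of the crasher for a number of requests
  // and records the lines that would be printed.
  def trace(requests: Int): List[String] = {
    val out = ListBuffer("I'm (re-)born")
    var cnt = 0
    for (_ <- 1 to requests) {
      cnt += 1
      out += "I try to service a request"
      if (cnt % 2 == 0) {
        out += "sometimes I crash..."
        out += "I'm (re-)born" // assumed: the keep-alive actor restarts the body
        cnt = 0                // the restarted body re-initializes its counter
      }
    }
    out.toList
  }
}
```

For six requests, the model yields three crashes and four rebirth lines, which matches the 13 lines of output shown above.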
One advantage of futures over simple asynchronous messages is that they make it easy to identify to which request they correspond. Basically, each future represents the asynchronous request that created the future. We can leverage this property of futures for exception handling. Let's revisit the image downloader example of Chapter 5. In the following section, we will show how you can extend Listing 5.10 to handle exceptions that may be thrown during image retrieval (for instance, IOExceptions).
Listing 6.6 shows the renderImages method extended with code to handle uncaught exceptions in the downloader actors. The idea is as follows: First, the actor that renders the images sets its trapExit member to true, which enables it to receive termination notifications from linked actors. Second, the renderer actor links itself to each downloader actor. For this, we use one of the link methods defined in the Actor object. The variant we use takes a code block (more precisely, a by-name parameter of type => Unit) as an argument, creates a new actor to execute that block, and links the caller to the newly created actor. More importantly, you link and start the new actor in a single, atomic operation to avoid a subtle race condition. Between starting the new actor and linking to it, the newly created actor could die, which would result in a lost Exit message. This is the main reason the Actor object provides a link method that takes a code block as an argument.
Note that you have to add an explicit type annotation to the react expression. The reason is that the return type of react is Nothing, which is compatible with both link methods since Nothing is a subtype of every other type. By adding the : Unit type ascription, we force the compiler to select the link method that takes a code block.
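The effect of the ascription can be reproduced in plain Scala with two overloads shaped like the two link variants. This is only a sketch under assumptions: Target, link, and never are hypothetical stand-ins defined here, not the methods of the Actor object, and the sketch shows only the case with the ascription in place.

```scala
object AscriptionSketch {
  class Target // hypothetical stand-in for an actor reference

  // Two overloads shaped like the two link variants described in the text.
  def link(to: Target): String = "link(to)"
  def link(body: => Unit): String = "link(body)" // body is never evaluated here

  // Like react, this method has result type Nothing.
  def never(): Nothing = throw new RuntimeException("not evaluated")

  // The ascription gives the block static type Unit, so the overload
  // expecting a Target is not applicable and the by-name one is chosen.
  val chosen: String = link { never(): Unit }
}
```

Here chosen is "link(body)". Since the by-name argument is never forced in this sketch, never() does not run and no exception is thrown.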
After the renderer actor has sent out all download requests, it loops trying to receive ImageData objects from each future's input channel. To handle uncaught exceptions in the downloader actors, the renderer also reacts to Exit messages. Whenever an Exit message with an UncaughtException as its reason is received, we use a nested pattern to extract the message the terminated actor was processing. This enables us to easily access the corresponding ImageInfo, since it was passed as part of the Download message.
[1] In Scala, Symbols are similar to strings, except that they are always interned, which makes equality checks fast. Also, the syntax for creating Symbols is slightly more lightweight compared to strings.