Chapter 3

Defining Resilience

Andrew Hale

Tom Heijer

Pictures of Resilience

Resilience first conjures up in the mind pictures of bouncing back from adversity: the boxer dancing courageously out of his corner despite the battering in the previous round; the kidnap victim, like Terry Waite, emerging from months of privation as a prisoner of terrorist organisations, smiling and talking rationally about his experiences; the victim of a severe handicap, like Stephen Hawking or Christopher Reeve, still managing to make a major contribution to society or a chosen cause. If we were to apply this image to organisations, the emphasis would come to fall on responding to disaster: rapid recovery from a disastrous fire by reopening production in a temporary building; restoring confidence among local residents after a major chemical leak by full openness over the investigation and involvement in decisions about improved prevention measures; or restoring power on the network after a major outage by drafting in extra staff to work around the clock. This captures some of the essentials, with an emphasis on flexibility, coping with unexpected and unplanned situations and responding rapidly to events, with excellent communication and mobilisation of resources to intervene at the critical points. However, we would argue that we should extend the definition a little more broadly, in order to encompass also the ability to avert the disaster or major upset, using these same characteristics. Resilience then describes also the characteristic of managing the organisation’s activities to anticipate and circumvent threats to its existence and primary goals. This is shown in particular in an ability to manage severe pressures and conflicts between safety and the primary production or performance goals of the organisation.

This definition can be projected easily onto the model which Rasmussen proposed to understand the drift to danger (Figure 3.1, adapted from Rasmussen & Svedung, 2000).


Figure 3.1: Rasmussen’s drift to danger model

This aspect of resilience concentrates on the prevention of loss of control over risk, rather than recovery from that loss of control. If we use the representation of the bowtie model of accident scenarios (Figure 3.2, adapted from Visser, 1998), we are locating resilience not only on the right-hand side of the centre event, but also on the left.

Reverting to Rasmussen’s model, resilience is the ability to steer the activities of the organisation so that it may sail close to the area where accidents will happen, but always stays out of that dangerous area. This implies a very sensitive awareness of where the organisation is in relation to that danger area and a very rapid and effective response when signals of approaching or actual danger are detected, even unexpected or unknown ones. The picture this conjures up is of a medieval ship with wakeful lookouts, taking constant soundings as it sails in the unknown and misty waters of the far north or west, alert for icebergs, terrible beasts or the possibility of falling off the edge of the flat earth. We cannot talk of resilience unless the organisation achieves this feat consistently over a long period of time. As the metaphor of the ship implies, resilience is a dynamic process of steering and not a static state of an organisation. It has to be worked at continuously and, like the voyage of the Flying Dutchman, the task is never ended and the resilience can always disappear or be proven ineffective in the face of particular threats.


Figure 3.2: Bowtie model

How Do We Recognise Resilience When We See It?

These characteristics of resilience are worked out in more detail elsewhere in this book. However, we wish to add one more aspect of a definition at this point. This is something we need to consider when asking the question: ‘how do we recognise a resilient organisation?’ One answer to this is to rely on measuring the sort of characteristics which have been set out above. Are they present in the organisation or system? However, this still leaves us with the dilemma as to whether these characteristics are indeed the ones that are crucial in achieving that dynamic avoidance of accidents and disasters. We would really like to have confirmation with an outcome measure over a long period. The obvious one is the safety performance of the organisation or activity over a long period, coupled with its survival (and preferably prosperity) on the basis of other performance measures such as production, service, productivity and quality. The question then arises whether a safe organisation is by definition a resilient one and whether one which has accidents is by definition not resilient. We shall explore the first question in Chapter 9 by looking at railways, but suffice it here to say that the answer is ‘no’. Organisations can be safe without being resilient. A few observations about the second question are given below.

Is Road Traffic Resilient?

Let us take the road system and ask whether it is resilient. It undoubtedly has many accidents. It is high up on the list of worldwide killers compiled by the World Health Organisation. In the Netherlands it kills more than ten times as many people as die from work accidents. So it does not look good for its credentials as a resilient system. However, let us look at it in terms of risk. In setting up a study into interaction behaviour between traffic participants (Houtenbos et al., 2004) we made some rough estimates of risk for the road system in the Netherlands. We took ‘back of the envelope’ figures such as the following:

•  1.3 × 10¹¹ vehicle kilometres per year (cf.,

•  An estimated average of 5 encounters/km with other road users; encounters being meetings with other traffic at road junctions or during overtaking manoeuvres, pedestrians or cyclists crossing the road, etc., where we can consider that a potential for accident exists. (We have taken a range from 1 encounter/km for motorways on which 38% of Dutch vehicle kilometres are driven, up to 20/km on 50kph roads in urban areas, on which 26% of the kilometres are driven (, with intervening numbers for 80 and 100kph roads. These are figures we estimated ourselves and have no empirical basis.)

This means that there are some 6.5 × 10¹¹ encounters/year for a death toll of a little over 1000/year. The vast majority of these deaths are the result of an encounter between at least two traffic participants. In a more accurate analysis we would need to subtract one-sided accidents where the vehicle leaves the road due to excessive speed, loss of adhesion, etc. This gives us a risk of death of 1.5 × 10⁻⁹/encounter. We know that there are far more injury and damage accidents than deaths, but even if this is a factor of 10,000 more, the accident rate per encounter is still only 1.5 × 10⁻⁵. This is extremely low, particularly when we consider that typical figures used in quantitative risk assessment to estimate the probability of human error in even simple tasks are never below 10⁻⁴. Is the road system resilient on the basis of this argument? Given the exposure to situations which could lead to accidents, we would argue that it is, despite the accident toll. This led us to conclude that the participants manage these encounters very effectively using local clues and local interactions, and to decide that we needed to study much more thoroughly how the encounters are handled before jumping to all sorts of conclusions about the need to introduce complex information technology to improve the safety of these interactions in the system.
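The back-of-the-envelope arithmetic above can be checked in a few lines. The following is only a sketch in Python: the variable and function names are our own illustrative choices, the death toll is rounded to 1000 (the text says "a little over 1000"), and the factor of 10,000 is the deliberately generous multiplier used in the argument, not a measured ratio.

```python
# Rough figures quoted in the text for the Dutch road system.
VEHICLE_KM_PER_YEAR = 1.3e11   # vehicle kilometres per year
ENCOUNTERS_PER_KM = 5          # authors' own estimate, no empirical basis
DEATHS_PER_YEAR = 1000         # "a little over 1000"; rounded here (assumption)
ACCIDENTS_PER_DEATH = 10_000   # generous multiplier used in the argument
HUMAN_ERROR_FLOOR = 1e-4       # lowest human error probability cited for simple tasks


def events_per_encounter(vehicle_km, encounters_per_km, events_per_year):
    """Return the number of events per encounter for one year of traffic."""
    encounters = vehicle_km * encounters_per_km
    return events_per_year / encounters


encounters_per_year = VEHICLE_KM_PER_YEAR * ENCOUNTERS_PER_KM
death_risk = events_per_encounter(
    VEHICLE_KM_PER_YEAR, ENCOUNTERS_PER_KM, DEATHS_PER_YEAR
)
# Scale deaths up to all injury and damage accidents.
accident_risk = death_risk * ACCIDENTS_PER_DEATH

print(f"encounters per year:     {encounters_per_year:.1e}")
print(f"deaths per encounter:    {death_risk:.1e}")
print(f"accidents per encounter: {accident_risk:.1e}")
print(f"below human error floor: {accident_risk < HUMAN_ERROR_FLOOR}")
```

Changing ENCOUNTERS_PER_KM between 1 and 20, the range assumed for motorways and urban roads, shows how sensitive the per-encounter risk is to the exposure estimate, which is the point of the argument.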

In the other direction, we can ask whether one of the reasons why the aviation system has such a good safety performance – particularly in the space between take-off and landing – is simply that airspace, certainly compared with the roads in the Netherlands, is amazingly empty and so encounters, even without air traffic management (ATM) control, would be very rare. By this we do not mean to call into question whether aviation is ultra-safe or not, but to ask whether safety performance alone can be seen as proof of resilience without taking into account the issue of exposure to risk. We do this particularly because we see some similar (potential) mechanisms in both systems, which lead to high safety performance, and which we would see as characteristic of resilience. Notable among these is the (potential) presence of simple local rules governing interactions, without the need for central controlling intervention. Hoekstra (2001) demonstrated with simulations just how successful such local rules for governing encounters under ‘free flight’ rules (with no ATM control) could be. We see just such ‘rules’ being used by drivers on the road at intersections (Houtenbos et al., 2004) and they also seem to work amazingly well most of the time.


In this short note we are pleading for two things. The first is that what is interesting for safety is preventing accidents and not just surviving them. If resilience is used with its common meaning of survival in adversity, we do not see it to be of interest to us. If its definition is extended to cover the ability, in difficult conditions, to stay within the safe envelope and avoid accidents, it becomes a useful term. We would, however, ask whether we do not already have other terms for that phenomenon, such as high reliability organisations, or organisations with an excellent safety culture.

Our second plea is that we should consider resilience against the background of the size of the risk. You can be resilient in situations of very high risk and still have a quite substantial number of accidents. The significant thing is that there are not far more. You can also fail to be resilient in conditions of low risk and have no accidents. We will argue in Chapter 9 that you can also perform well in major risk conditions without being resilient.

Nature of Changes in Systems

Yushi Fujita

Humans in systems (e.g., operators, maintenance people) are essentially alike and are, in general, adaptive and proactive. These are admirable qualities, but of limited scope. Adaptive and proactive behaviors can change systems continuously, but humans at the front end alone may or may not be able to recognise the potential impact that the changes can have on the system, especially the impact when several changes are put into effect simultaneously. Humans at the back end (e.g., administrators, regulators) tend to have sanguine ideas such as that the system is always operated as planned, and that rules and procedures can fix the system at an optimal state. Mismatches caused by these two tendencies constitute latent hazards, which may cause the system to drift to failures.

