Preface: Seeking Resilience

Christopher Nemeth

Becoming Resilient is the second text in the Ashgate series “Resilience Engineering in Practice (REiP).” While Ashgate Publishing’s “Resilience Engineering Perspectives” series has explored what the field of resilience engineering (RE) is, REiP takes a practical approach to RE. The chapters in this text seek answers to the challenging questions that arise when concepts from prior texts are applied to actual problems. Their reports show that while the first successful steps have been made, there is still a lot to do in order to develop RE from an initial concept into an approach that will change the way systems are developed and operated.

Opportunities for Engineering Practice

Creating systems that are ready to evolve in response to unforeseen conditions also requires a new way to think about design and engineering. Designers and engineers typically develop systems, and engineers are entrusted with ensuring systems are built to operate according to requirements. But new approaches such as RE call for new abilities. What professional abilities are needed to create systems that have the resilient characteristics that chapters in this text describe? What skills and opportunities will engineers need in order to develop systems that can adapt to meet unforeseen demand?

For years, radio towers used a strong base to withstand the effects of high winds. Their rigid design, though, limited how high they could be built. The invention of slender radio masts, held in place by guy wires, made taller towers possible by allowing the structure to move in response to the wind instead of standing rigidly against it. Engineering practice faces a similar transition.

Engineers have traditionally sought ways to maintain sufficient margins to assure safe performance. In the process they have developed a resistance to sources of variability that could affect those margins. This may fit well-bounded, stable domains, where sources of variability are fairly well known. However, poorly bounded and ill-behaved domains are increasing in number and importance. Domains such as these routinely make demands that can only be met by socio-technical systems (Hollnagel and Woods, 2005), which are goal-directed collaborative assemblies of people, hardware and software. In these systems, elements operate collectively, not individually. Woods (2000) referred to the interaction of all system elements as “agent-environment mutuality.” Their performance and interaction produce outcome behavior, and the data that can be gathered on their performance can be compared against requirements.

Engineering is the application of science and mathematics “by which the properties of matter and the sources of energy in nature are made useful to people” (Merriam-Webster, 2013). Systems engineering (SE), which has a significant role in RE, integrates multiple elements into a whole that is intended to serve a useful purpose. SE “… is an interdisciplinary approach and means to enable the realization of successful systems” that “focuses on defining customer needs and required functionality early in the development cycle, documenting requirements, then proceeding with design synthesis and system validation while considering the complete problem: operations, performance, test, manufacturing, cost and schedule, training and support, disposal” (INCOSE, 2013). To do this, SE “… integrates all the disciplines and specialty groups into a team effort forming a structured development process that proceeds from concept to production to operation” and “considers both the business and the technical needs of all customers with the goal of providing a quality product that meets the user needs” (INCOSE, 2013). The process assembles elements into a coherent whole, but how does that whole operate? How does it respond to demands? What happens when it reaches the upper bounds of its ability to withstand a challenge? Answers to these and other questions will come from new approaches by those who develop these systems.

A Resilient Outlook

In order to develop systems that function in a resilient manner, engineering has the opportunity to grow in a number of different ways. Effective engineering ensures positive outcomes from a system’s performance. To make resilience routine, engineers might foresee what may go wrong, develop new tools, and use good design to model adaptive solutions. Here are a number of initiatives that can make that intention reality.

Reconceptualize. Mapping all possible interdependencies among system elements is too difficult, because many of them are hidden. Instead, approach the design problem at a higher level that allows for anticipation, as well as needed change, from simple reconfiguration to more complex needs to expand and adapt. In addition to centralized command architectures and flat architectures, consider multi-role, multi-echelon networks.

Study what goes right. In contrast to the traditional safety focus on failures, resilience engineering emphasizes the importance of focusing on what works, on what goes right (Hollnagel, 2014). This requires us to pay attention to that which we routinely neglect simply because it “just happens.” Learning to do so is not very difficult, since it is a question of changing what we look for rather than digging deeper. Contrary to traditional safety thinking, breadth is more important than depth.

Cultivate requisite imagination. The measure of an organization’s success is the ability to anticipate changes in risk before failure and loss occur, that is, to create foresight (Woods, 2000). Adamski and Westrum (2003) describe requisite imagination as the ability to foresee what might go wrong and to maintain a questioning attitude throughout the development process. Aspects of practice they consider essential to this trait include thoroughly defining the task to be performed, identifying organizational constraints, matching the world of the system designer with that of the system user, considering the operational environment and the domain where work will be performed, surveying past failures, using controls appropriate to the tasks to be performed, accounting for potential erroneous actions, and taking conventions and constraints into account. This is no small job, and it calls for further research to understand how to support the tasks that this kind of foresight requires.

Develop new tools to develop and operate systems. System engineering tools and knowledge management tools must incorporate human and organizational risk. Based on empirical evidence, develop ways to control or manage a system’s ability to adapt. This includes developing ways for a system to monitor its own adaptive capacity so that it can make changes in anticipation of future opportunities or disruptions. Provide feedback, using knowledge about operations to identify when conditions call for launching analyses of key system features.

Resilience engineering includes operational oversight, which is typically termed “management.” However, this type of oversight means more than what management normally implies. RE is similar to management because it requires the system to be self-aware: able to reflect on how well it has adapted. It is different because it goes beyond operations to include research and development to ensure the system has the traits that make it adaptable in the first place.

Management that correctly understands the operations of any system will also be likely to correctly estimate how well its strategies will work when unforeseen challenges occur. While management points of view influence how systems are to be configured, they may not reflect the realities of operational demands. Managers who don’t understand the operator’s point of view at the sharp end can miss the demands and constraints operators face. Well-intentioned efforts that do not reflect an understanding of sharp-end issues can produce both doctrinal and technological surprise. For example, healthcare organizations experience the results of such misunderstandings, which include software development cost overruns, alert overload, mode errors that include operating on the wrong patient or patient site, and a general increase in the number of shortcuts needed to compensate for cumbersome and inflexible technology.

Management’s part in this includes balancing production pressure with protection from loss. Foster a culture that encourages reporting. Respond with repair or authentic reform when circumstances call for it. Enable front-line supervisors to make important decisions in order to be aware of and act on problems as evidence begins to develop. Understand operations well enough to know when they are encroaching on safety boundaries.

Create ways to monitor the development and occurrence of unforeseen situations. Complex systems are dynamic and need means to not only monitor performance but also make deliberate adjustments to anticipate, and respond to, unforeseen situations. Operators know how they engage and deal with these. Frontline workers have identified 8–12 workplace and task factors that can make work difficult, including interfaces with other groups, input information that is partial or missing, and staff and resource shortages (Reason, 1997). Wreathall (2001; 2006) and Wreathall and Merritt (2003) reviewed sets of indicators that map onto aspects of resilience. Such measures point to the onset of problems in normal work practices as pressures grow. They also reveal where workers develop adjustments to compensate for those pressures. Management is usually unaware of changing demands or of the need for workplace adjustments. These indicators are chosen to reveal such circumstances. They can also expose situations management may not know about, and show where current plans may not be adequate to handle changing demands.

Develop tools to signal how to make production versus safety tradeoffs and sacrifice decisions. Enable an organization to know when to relax production goals in order to reduce the risk of coming too close to safety boundaries, even under uncertain conditions. Learn how organizations consider and make these decisions, as well as what is needed to support them.

Cultivate ways to visualize and foresee side effects. Develop means to show how systems adjust their performance to handle unexpected situations. Show how pressure from other units or echelons affects a particular portion of a system. Take interactions with other systems into account, and be aware of the implications that interactions present, such as cascading effects.

Promote and use good design. The development of well-considered prototypes makes it possible to evaluate how, and how well, solutions can adapt. Norman (2011) contends that “Good design can help tame complexity, to relish its depth, richness, not by making things less complex—for the complexity is required — but by managing the complexity.” Good design can be used to model discoveries of adaptations to change and uncertainty. The prototypes that result offer compelling evidence of a feasible future that others can understand and evaluate.

Acknowledge and manage variability. New configurations introduce uncertainty, and bounding a problem to exclude uncertainty does not eliminate it. Embrace non-linear approaches to explore how systems and networks adapt to change and disruption.

Like RE, each of these opportunities challenges the imagination to move professional practice from what is known to what it can, and needs to, become.

Reading Guide

A brief comment follows each chapter to draw the reader’s attention to key points that connect with the book’s theme and, occasionally, to describe how the chapter relates to the one that follows.

Each chapter examines the need for RE in actual settings, including healthcare, nuclear power, aviation, railway tunnels, construction, and disaster recovery. The chapters explore practical issues that will need to be resolved, and new approaches that will be needed to make RE feasible. Understand how systems work in reality. Translate a system description into a prescription to improve its adaptive ability. Negotiate the differences between work-as-imagined and work-as-done, and learn from the experience. Learn and anticipate as a way to cope with fundamental surprise. Pay attention to more subtle safety indicators such as process safety and organizational hazards. Analyze adaptation as a way to improve system monitoring and systemic learning. Understand how interplay among multiple levels and actors can influence socio-technical systems. Translate team training from an individual approach to a distributed cognition approach that recognizes tasks are variable. Gain and retain a new perspective that makes it possible to notice what couldn’t be seen before and that, once it is noticed, compels one to act. Triangulate incomplete or ambiguous readings from multiple sources during an event, and determine what gaps in these readings say about the role of human cognition in achieving resilience.

The concept of adaptive systems refers to a result of the way systems perform. “Becoming Resilient” implies that what we describe in these pages is a process. As each chapter shows, the process can include new approaches to methods, organizational structures, and work processes.

System concepts take time to evolve, and RE will likewise need time to develop the science, measures, and means that other approaches already have in place.

Acknowledgement

The author is grateful to David Woods, John Wreathall, and Erik Hollnagel for their insightful comments during the development of this preface.
