Chapter 16

From Myopic Coordination to Resilience in Socio-technical Systems. A Case Study in a Hospital

Anne Sophie Nyssen

In socio-technical systems, the overall functioning requires, by definition, a coordination of actions and decisions of the different agents involved in the task. It seems reasonable to assume that the resilience capacity in such systems therefore also should require the coordination of local and spontaneous coping strategies distributed in time and space to recover from surprise, unexpected events or crises. Indeed, the history of accident investigations contains many instances of dramatic coordination failures. The purpose of this chapter is to show why the study of coordination mechanisms is so crucial to the resilience of socio-technical systems. We illustrate this importance using an example from one kind of socio-technical system, a health care system. First, we introduce the concept of coordination as a component of resilience in socio-technical systems like hospitals and show how they handle coordination requirements. Then, we describe how practitioners adapt to the unexpected using emergent coordination mechanisms. We conclude by developing our main argument that resilience of socio-technical systems largely depends on their ability to project themselves outside the ‘local immediate’ when an unexpected event arises in order to be able to develop a coordinated spatio-temporal solution.

Introduction

In socio-technical systems, the overall functioning by definition requires a coordination of actions and decisions of the different agents involved in the task. It seems reasonable to assume that the resilience capacity in such systems therefore also should require the coordination of local and spontaneous coping strategies distributed in time and space to recover from surprise, unexpected events or crises. Indeed, the history of accident investigations contains many instances of dramatic coordination failures. All organisations, including socio-technical systems, develop capabilities to detect and deal with unexpected events. These are part of the uncertainty of the world that every complex system must face and learn to cope with. Part of the capabilities developed by such systems includes the development of rules, procedures, standardisation and training programmes. These centralised systems organise and control the interactions both inside the system and between the system and its environment in order to maintain continuing alertness and to control the safety boundaries of the organisation. However, it is accepted that it is impossible to anticipate and write down a rule for every circumstance. Furthermore, the collection and analysis of unexpected past events, such as accidents, do not allow the prediction of future ones.

In that context, system resilience also seems to depend on the front-line agents’ capacity for flexibility, local autonomy and creativity, which allows them to adapt to changing and unexpected circumstances. This flexibility is generally considered as a positive contributor to – or even the base of – the resilience of complex systems, through the ‘loose coupling’ they introduce between the system’s agents and the internal and external constraints. However, contrary to frequent assumptions about sharp end performance, local, spontaneous actions and decisions are not always virtuous (positive).

In our view, resilience in socio-technical systems relies more on the dynamic of the interactions between the agents than on each individual or sub-component’s ability to adapt. It is the dynamic of these interactions that allows such a system to cope successfully (or not) with unexpected events, irregular variations or crises.

The purpose of this chapter is to show why the study of coordination mechanisms is so crucial to the resilience of socio-technical systems. We illustrate this importance using an example from one kind of socio-technical system, a health care system. First, we introduce the concept of coordination as a component of resilience in socio-technical systems like hospitals and show how they handle coordination requirements. Then, we describe how practitioners adapt to the unexpected using emergent coordination mechanisms. We then examine some conventional dimensions of safety and resilience and show how they may fail in socio-technical systems. We conclude by summarising the main arguments.

Coordination as a Component of Resilience in Socio-Technical Systems like Hospitals

Coordination in socio-technical systems covers two parts: a movement of division and distribution of actions among different agents and a movement of integration of actions and decisions distributed in time and space (Savoyant and Leplat, 1983, Pavard, 1994). All socio-technical systems, like hospitals, are confronted with both. Today, a patient will very seldom visit only one hospital department and furthermore rarely sees only one physician during their stay. Multiple departments, professional skills and technical devices are brought together in order to provide a complete health service and also to provide uninterrupted care around the clock. This specialisation and continuing process means that more and more information has to be exchanged between departments as well as between individual operators. Hospitals themselves have even become specialised because of economic pressures, so that a patient may have to go to several hospitals and health care institutions to be properly taken care of. This obviously raises the coordination challenge to the inter-organisation level and makes it necessary to also consider resilience at that level.

Classically, we can distinguish the situations of coordination on the basis of four dimensions (Bonabeau and Theraulaz, 1994):

•  the compatibility of the goals of each agent involved in the task

•  the sharing of common resources

•  the skills of the agent in relation to the task

•  the type of interaction – face to face and synchronous communication (same place and same period) or distant and asynchronous communication (different place and different period).

Each may be a source of tension in socio-technical systems when an unexpected event arises and hence have implications for resilience. For example, confronted with the same patient situation, the surgeon would prefer to operate as soon as possible while the anaesthetist would prefer to stabilise some parameters. There is one common goal related to the patient’s health but two contradictory sub-goals related to the emergency strategy. Conflicts may also appear when all the agents actively decide to cooperate, for example, when different practitioners are called to an operating room to deal with a cardiac arrest. Their actions and decisions must be coordinated if we don’t want this assistance to end in chaos. The agent’s experience in relation to the task also influences the exchange of information and the coordination. When practitioners repeatedly work together, a reduction of verbal information exchanges is observed as practitioners get to know each other. Any regularly repeated action by a member of a team becomes an act of implicit communication, a signal that triggers actions by other team members, hence allowing synchronization, just as explicit verbal communication would do (Nyssen and Javaux, 1996). This relation changes during an unexpected event or crisis: the greater the trouble, the greater are the demands for information centred on the task (Bressole et al., 1994).

Another major change in the communication and cooperation modes in hospitals comes from the introduction of new computer-based technology that allows distance between the operators and the task (e.g., robotic surgery) and/or between the operators (e.g., electronic patient’s file). These technologies deeply transform the coordination situations from face-to-face and synchronous communication to distant and asynchronous communication in which information is mainly built up incrementally by accretion. In robotic surgery, we have shown how the technical device profoundly changes the structure of the surgeons’ task and, hence, the mode of cooperation between the surgeon and his assistant. It favours an explicit division of work and an explicit leadership based on the communication of orders, continuous control of the work and requests for confirmation. Whether the interaction is synchronic or asynchronic may be a critical factor in view of the emergency response capacity of an organisation (Johansson and Hollnagel, 2007). When confronted with unusual problems, people in charge are naturally prompted to intervene, although the problem could have required sharing information distributed over time and space and a collective response.

These developments help us to envision the coordination challenge for socio-technical systems confronted with unexpected events and crises. But they do not specify how organisations deal with these challenges.

The Organisation’s Approach to Coordination

When we consider the organisational side of coordination, we can identify three sets of coordination requirements: vertical, lateral and longitudinal.

•  Vertical coordination requirement. All organisations, including hospitals, structure their work and activities through a vertical distribution of decision-making responsibilities and roles. Because of their various positions in the hierarchy, individuals are likely to have differential access to information and knowledge. The exchange of information is required to keep the subordinates’ supervisor aware of the situation and decisions. A vertical flow of information is needed, both bottom-up and top-down, to provide the right people with the right information to carry out the task.

•  Lateral coordination requirement. Another important aspect of complex organisations is the existence of different fields of expertise that are institutionalised and organised into different units or departments, which have different technologies and specific subcultures. In the collective activity, each expert has only a partial representation of the situation. The task requires the coordination of the different experts’ activities and, in many cases, the evaluation and integration of the information from these various sources into one global base of knowledge, if not a shared representation.

•  Longitudinal coordination requirement. Many organisations operate all around the clock, requiring exchange of information between the different skills. The tasks themselves are made up of subsets of activities or sequences of actions that must be executed in the proper form and with the appropriate timing. These process constraints shape the coordination of work either at the longitudinal or the lateral coordination levels. Furthermore, many processes are dynamic and are subject to modification; the supervisor must continuously update his/her representation of the situation.

Any organisation positively organises its coordination by developing a series of conventional management tools that specify ad hoc patterns of behaviour and directly or indirectly shape the interactions and the communication between the agents. These tools supposedly enable the agents to manage everyday and unexpected situations more effectively.

This coordination approach is based on the Common Knowledge Theory (Lewis, 1969, Krauss and Fussell, 1990) and is derived from the classical assumption that the success of coordination lies in the extent to which the community and individual agents are prepared to understand and share ‘common ground’. In work studies, the idea of ‘common ground’ shared by the individuals who perform a collective activity led to the concepts of ‘functional referential’ (Leplat, 1991), or ‘mental model’ (Norman, 1987) that define work processes and allow group members to organise their activities.

For example, the aviation industry has attempted to reduce the problems of cooperation between humans and automatic systems by organising both human–machine and human–human communication, using a straightforward and predefined division and distribution of tasks (e.g., Pilot Flying and Pilot Non-Flying), a codification and standardisation of the communication language, a principle of systematic verbalisation of the main intentions, perceptions and actions (call-outs), a principle of systematic crosschecking of actions and understandings, and mandatory training of so-called ‘non technical skills’ (Crew Resource Management).

Although health care practitioners point out that ‘improving communication’ is an important corrective strategy (Kluger et al., 2000), communication has not received much attention in hospitals. Better training, better techniques and better standards of equipment have been recommended in order to improve the patients’ safety, but not much effort has been made on communication training and tools. Coordination then relies more on a series of conventional management tools such as hierarchy, work organisation processes, patient processes, procedures, daily lab rounds, patients’ files, handover meetings and the like. We can identify such conventional tools for each coordination level.

Vertical Coordination Tools

The work organisation in a hospital allocates the simplest part of the collective patient task to the novices and the more complex part to the more experienced workers. Novices’ work is commonly monitored by residents and/or seniors. In many hospitals, a phone communication network is organised in a cascade to provide, at all times, help from those who are more experienced. The basic functioning of vertical coordination is that the novices do their work, with some internal and external ‘sentry markers’ (events, parameters) which tell them that it is time to call for assistance from someone at a higher level of expertise. The key issue is therefore the relevance of these ‘sentry markers’ that alert the novices and suggest a call at the right time. A previous study (De Keyser and Nyssen, 1993) showed that trainees often fail to estimate correctly how long they should reasonably wait before calling, and so call too late: either they overestimate their competences, or they underestimate the speed of the patient’s deterioration, or sometimes they do not want to lose face.

Lateral Coordination Tools

The resources of the hospital (either technical or human) are both specialised and limited and a careful coordination of the activities, both in time and in space across the facilities, is required. This resource management is based on multiple planning activities, which are organised according to different time frames: they may start up to a year beforehand, and then evolve from annual, to monthly, to daily and hourly schedules. Computerised systems are used to exchange information between people from different areas in order to gather the right people at the right time and place and achieve lateral coordination. The work process itself also defines how the tasks are organised among the agents and shape their interactions. Individual and collective patterns and sequences of behaviour are defined into procedures.

Longitudinal Coordination Tools

There are rotations of multiple teams in charge of the patient around the clock in a hospital. A transition period is generally planned in the agent’s schedule to allow for data transmission briefings.

One important tool for longitudinal (and lateral) coordination is the patient’s records, in either paper or computerised form. For both the hospital and the team, they are a means to trace and memorise the state of the patient and the actions of the different agents around the clock. They contain the history of the case, contextual information, the distributed dynamic diagnosis and the treatment process. Each agent is supposed to fill in the patient’s records with their contribution to the actions and information and to transmit their knowledge to the next agent. By accumulating information over time and from different agents, the patient’s records play a critical role in coordination. They are intended to produce the global representation of the patient’s situation to enable an isolated agent to solve dynamic problem situations.

The hospital normally takes for granted that the staff will adhere to these coordination principles and, in so doing, assumes that coordination problems are solved. However, in the following case description, I will show how these centralised coordination mechanisms fail to organise the activities of different agents when an unexpected event arises and hence reveal the adaptation strategies and the resilience capacity of the system.

A Catastrophic Experience

One night on a weekend, a 16 year old patient showed up at the reception desk of the emergency room of Hospital A for respiratory distress related to a problem of chronic asthma. The clinical examination was done by a resident who prescribed treatment (inhaled bronchodilator and antibiotic therapy) and let the patient go home.

Later in the night, his respiratory distress symptoms reappeared at a higher degree. In the early morning, the patient showed up at the emergency room of a larger hospital (B) in the same area. The patient was in an agitated and anxious state. He was directed to a room of unit X where he was examined by resident 1 (R1) and monitored (ECG, arterial blood, pulse). The clinical examination did not show any acute respiratory problems. R1 gave the patient some treatment and kept him for close monitoring. At the end of his shift (9 am), R1 transmitted all the information concerning all the patients to the resident 2 (R2). In the middle of the morning, the patient felt better and R2, after examination, decided to let the patient go.

In the early afternoon, the patient showed up again at the emergency department of hospital B, still complaining about respiratory distress. At the reception desk, the secretary recognised the patient. For her, there was no real emergency and this time she decided to refer the patient to Unit Z. Unit Z had just been created in the emergency department in order to take care of minor emergencies. General practitioners (GPs) from outside the hospital were used in that unit for patient care. In that unit, the patient was examined by a GP who was on call in the hospital for her fifth time. The nurse, who was working with the GP, also recognised the patient and informed her about the case. The nurse went to get the patient’s records and, at the same time, asked R2 to come and see the patient.

The GP and R2 were both in the room and examined the patient. The patient showed some signs of tachycardia, nasal flaring, hypoxemia and anxiety. R2 decided to give the patient oxygen and started corticoids and the GP proposed to give him some anxiolytics. In the afternoon, the patient complained about chest pains. The GP added some analgesic to the treatment and called a psychologist who did not diagnose any particular mental problems.

At the end of the afternoon, a nurse came to see the GP to tell her that the patient was not complaining anymore and seemed better. She asked her to come and examine the patient in order to see if he could go home. After close examination, she decided to let the patient go. R2 heard later that the patient had left.

In the evening, the patient went into respiratory arrest at home. He was taken by ambulance to hospital B but too late.

Case Analysis: An Emergence-through-use Approach of Coordination

As for many accident analyses in complex systems, it is not easy to identify where and how the case went wrong through the course of events. Leaving aside the benefit of hindsight, each decision, each action seems to be relevant in its finite temporal and spatial interval. The overall failure appears to come from a lack of integration of the decisions and actions distributed in time and space and across the agents; that is to say from a lack of coordination.

The centralised organisation of coordination presented above failed to achieve the level of integration of knowledge and action that was necessary for handling the problem situation. Confronted with an unusual situation, the agents organised their behaviour through direct and local interactions with the work environment, based on their understanding of the situation. However, this local process of coordination disorganised the standardised sequence of operations. Let us analyse in detail the dynamic of the interaction and the coordination failure.

•  Hospitals A and B are two units in the same urban area. Both are capable of taking care of a chronic asthmatic patient. For such a ‘routine’ problem, they are two equivalent resources, two interchangeable structures. Hence going to hospital A first, then to hospital B should not have been an issue for the patient. However, coordination is not organised between the two units. There was no communication of any type (verbal or written) between the two hospitals. Consequently, each unit re-starts the diagnosis process: creating its own patient records, its own diagnosis and its own treatment process. Each diagnosis process seems to be relevant in its finite temporal and spatial interval. But because of the slow dynamic of the problem, the move from one hospital to another impaired the detection of the overall representation of the problem. However, it must be noted that this lack of communication could have been overcome by the application of a recent national measure that appoints the patient as the agent who keeps his exam record files.

•  There is a procedure written by the chief of the department that organises the patient orientation across the different sub-units of the emergency department, and so lateral coordination. However, this algorithm was not used by the agents in the case described above. They were not really informed about it, and they were not involved in its development process. Furthermore, the algorithm does not cover the case of a patient who comes back again several times in the same day. Actually, the first line agent, the receptionist, achieves coordination without formal procedure. The receptionist identifies the degree of emergency, the nature of the problem and attempts to match the demands with the available resources. This matching is mediated by direct interactions with doctors and nurses, and supported by a computerised system that gives an external representation of the workload of the different sub-units. This global representation of the team’s activities is permanently updated, yet does not keep track of past activities. Its goal is to help the synchronic management of resources. It was not intended to help the agents for longitudinal coordination.

•  The coordination between GP and R2 is something that emerges from a set of local interactions rather than by the implementation of the procedure that explicitly organises the transmission of information between physicians. In the case study, it was the nurse who detected the presence of the patient the second time and organised the transmission of information through the patient’s records and by direct verbal interaction between R2 and GP.

By doing this, the nurse created a system in which everyone believed that the others knew everything about the task, hence shared the same understanding in a fully cooperative work. This emergent movement of increasing expertise resulted in a pattern of overlapping expertise rather than of cooperating work. In fact, the knowledge of the medical task was represented most redundantly, but the centralised knowledge necessary for a dynamic problem solving was paradoxically represented least redundantly, and was lost across the agents as well as in vertical and lateral coordination.

•  Each physician used the patient’s records but everything happened as if each agent started a new reasoning process instead of integrating the information recorded in the patient’s records and constructing a global dynamic representation. Clearly, the individual performance of the agents was not improved by the use of this external static memory source assumed to achieve longitudinal coordination. Even the direct verbal interaction among individuals did not really help to alert the physicians and provide any diagnostic benefit. In contrast, these local interaction and local coordination processes might actually have worked against the efficiency of the problem solving process by confusing each person’s role and creating some kind of ‘stammering’ in the reasoning and treatment process.

Discussion

Part of the benefit of being a socio-technical system facing a crisis or an unexpected event comes from the juxtaposition of agents, either human or technical. The different agents create redundancy; each agent can detect signals or dangers, update the process representation with new information and interactions with the environment and formulate a regulation plan. The increase in the number of agents has by its nature an adaptive value for the system, at least to a certain extent. This is a very simple form of redundancy relying on the increase in resources for signal detection, diagnosis and action plans, bringing benefit in terms of resilience. But, there is no increase in expertise among the agents who are interchangeable. In our case study the patient himself, when confronted with an unexpected evolution of his symptoms, adopted this strategy by increasing the number of hospitals he interacted with. A beneficial aspect of this strategy is that by replaying the game every time, each agent can detect someone else’s error and improve safety. However, our case study shows that this strategy may not be optimal in some problem situations.

The delivery of appropriate medical care depends on obtaining information from different sources that could specify the cause of the patient’s symptoms. In many cases, this is an iterative task. The complexity comes from the fact that these sources are distributed in space and in time. When our patient decided to move from one hospital to another, he involuntarily created a rupture of this task, affecting the critical integration of the temporal aspects of the problem. The detection of the dynamic pattern of the problem and its repetition over time was impaired by the distribution in space and in time of the diagnosis and decision making process. The health care system lost its resilience.

A second approach relating to resilience may be found in the way hospitals have dealt with the coordination problems by developing centralised tools such as written procedures, work processes, automated systems that specify the work and the activities across time and distance and guide interactions. The above situation shows how these centralised coordination mechanisms may fail when the tool does not cover a particular case. Reasons for these failures can be, for instance, that the agents are not familiar with the conventional tools, that the computerised systems are not designed with the coordination needs in mind and so the critical information for coordination is not saved or not transmitted.

Correlatively, our case study clearly demonstrates that when agents are faced with an unexpected event, they often rely on ‘emergence-through-use’ coordination instead of referring to centralised tools. Each agent seems to organise the activities through direct and local interactions in his/her work environment.

By analogy, recent research that studies coordination in insects and non human societies shows local coordination mechanisms at the origin of very complex patterns of adaptation (Bonabeau and Theraulaz, 1994, Gilbert and Conte, 1995). For instance, Reynolds (1987) demonstrated that the flocking behaviour of birds can be simulated by assuming that individual birds make local adjustments based on the velocity and bearing of neighbouring birds. Thus, despite appearances, such complex flocking behaviour is generated by local coordination processes rather than by global centralised ones.
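The local adjustment idea described above can be illustrated with a deliberately simplified sketch. It is not Reynolds’s original model (which also includes separation and cohesion rules): it keeps only an alignment rule, and the names `step`, `spread` and `simulate`, as well as the parameters `radius` and `rate`, are assumptions chosen for illustration. Each simulated bird looks only at neighbours within a fixed radius, yet the flock as a whole tends towards a common velocity.

```python
import math
import random


def step(positions, velocities, radius=5.0, rate=0.2):
    """One update: each bird nudges its velocity toward the mean velocity of
    the neighbours it can see within `radius`, then moves.
    No bird has access to a global view of the flock."""
    new_velocities = []
    for i, (px, py) in enumerate(positions):
        neighbours = [
            velocities[j]
            for j, (qx, qy) in enumerate(positions)
            if j != i and math.hypot(px - qx, py - qy) <= radius
        ]
        vx, vy = velocities[i]
        if neighbours:
            mean_vx = sum(v[0] for v in neighbours) / len(neighbours)
            mean_vy = sum(v[1] for v in neighbours) / len(neighbours)
            vx += rate * (mean_vx - vx)  # local adjustment only
            vy += rate * (mean_vy - vy)
        new_velocities.append((vx, vy))
    new_positions = [
        (px + vx, py + vy)
        for (px, py), (vx, vy) in zip(positions, new_velocities)
    ]
    return new_positions, new_velocities


def spread(velocities):
    """Mean distance of each velocity from the flock average (0 = aligned)."""
    n = len(velocities)
    mean_vx = sum(v[0] for v in velocities) / n
    mean_vy = sum(v[1] for v in velocities) / n
    return sum(math.hypot(v[0] - mean_vx, v[1] - mean_vy) for v in velocities) / n


def simulate(n=30, steps=50, seed=0):
    """Return the velocity spread before and after `steps` local updates."""
    rng = random.Random(seed)
    positions = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]
    velocities = [(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)) for _ in range(n)]
    before = spread(velocities)
    for _ in range(steps):
        positions, velocities = step(positions, velocities)
    return before, spread(velocities)


before, after = simulate()
print(f"velocity spread before: {before:.3f}, after: {after:.3f}")
```

The point of the sketch is only that a global pattern (a shared heading) can arise without any agent computing it; it says nothing about whether that pattern is the right one for the situation.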

A strong message from High Reliability Organisations’ commitment to resilience is about their sensitivity to the front end operations and their ability to distribute the control and decision making to the low level members (Weick and Sutcliffe, 2001). These members are closest to the problem and are better able to adapt as the tempo of operation quickly changes and unexpected problems arise. A central assumption is that when people have a well developed situational awareness, they can make the continuous adjustments necessary to respond to the dynamics of the situation and the unexpected. This flexibility of the decision structures is an important issue for resilience in large scale organisations like hospitals facing the unexpected.

But local regulations are not always positive, contrary to some naive assumptions about sharp end performance. In robotic systems, Mataric (1992) has shown that distributed control can lead to a kind of ‘myopia’ (short-sightedness). When there is no centralised supervisor that possesses a global representation of the operations, the group of robots can fall into the trap corresponding to a ‘local minimum.’ We have shown in our case study that each agent seems to give precedence to their own current perception of the situation based on his/her local and real time interactions with the patient, and re-starts the reasoning process instead of continuing it, falling into the trap of ‘myopia’. Within each spatio-temporal window, the problem appears as a ‘routine’ emergency problem and each physician copes adequately with the emergency symptoms. In emergency departments, people in charge naturally focus on emergency but in doing so the system fails to capture the global pattern of the problem. Apparently, the patient was well at the hospital, went home and the symptoms started again. The process is not linear. The challenge for dealing with the problem is to understand this pattern. Otherwise, the management of the problem will always stop too early. This understanding would have required that, beside the emergency symptoms, one physician sat back, shared questions with the other physicians in charge, reconstituted historically the different isolated responses, integrated the external factors’ influence and anticipated the lethal process when the patient is pushed outside the hospital. However, as Lagadec (2004) mentioned in his analysis of the 2003 French heatwaves that killed nearly 15,000 people, in most cases, the culture during crisis is you act – you do not have time to think.
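The ‘local minimum’ trap mentioned above can be made concrete with a toy sketch. It is only an analogy, not a model of Mataric’s robots or of the case study: the cost landscape and the function name `greedy_descent` are hypothetical. An agent that sees only its immediate neighbours stops as soon as nothing nearby looks better, even though a better state exists further away.

```python
def greedy_descent(cost, start):
    """Move to the lower-cost adjacent cell until neither neighbour is lower.
    The agent only ever inspects the cells next to its current position."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(cost)]
        best = min(neighbours, key=lambda p: cost[p])
        if cost[best] >= cost[pos]:
            return pos  # nothing nearby is better: the agent stops here
        pos = best


# Hypothetical landscape: a shallow dip at index 1, the true minimum at index 5.
cost = [5, 3, 4, 6, 2, 0, 1]

local_stop = greedy_descent(cost, start=0)
global_best = min(range(len(cost)), key=lambda p: cost[p])

print("myopic agent stops at index", local_stop, "with cost", cost[local_stop])
print("global view finds index", global_best, "with cost", cost[global_best])
```

Within its own window the myopic agent behaves sensibly at every step; the failure is visible only to an observer who holds the whole landscape, which is the role a global representation of the patient’s history would have played.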

Conclusions – ‘Enhancing Projection outside the Local Immediate’

From a development perspective, there is a difference between everyday life coping strategies and resilience capacity. The difference is not the intensity of the event but its pathologic impact that requires a reorganisation of the system in order to maintain safety and survive the crisis (Bowlby, 1973, Cyrulnik, 2003).

The issue in this chapter is not to argue for or against one of the two organisation’s approaches of coordination described above: centralised or emergent. It is evident that the two approaches of coordination are clearly embedded in work practices and can both be beneficial in terms of resilience. However, these approaches may not be sufficient in today’s large scale organisations and should be complemented by the capacity to coordinate responses over time and space.

My argument is that resilience of socio-technical systems largely depends on their ability to bypass the myopic cognitive bias mentioned above in order to be able to develop a coordinated spatio-temporal solution. This requires two competencies from the systems: the ability to cope with the unexpected as it arises (this was effectively done in our case study partly thanks to the local spontaneous coordination processes), and the ability to keep on projecting themselves into the future beyond the present but taking into account the past.

Projection refers to a process of symbolisation of the diversity and complexity of all eventualities. It requires the ability to keep interactions going internally as well as with the external environment during the crises, in order to be able to capture the history, to read the changing circumstances and to be prepared for what the future holds. This is fundamental for coordinating the responses over time.

Following this argument, a rich avenue for future research on resilience design is a better understanding of how to enable systems to project themselves outside the ‘local immediate’ when an unexpected event arises, and thus how to represent and record the histories of the local agent-environment coupling adjustments distributed in time and in space in order to enhance a coordinated spatio-temporal regulation process. What kind of tools are the most appropriate to tackle this new coordination requirement? We have seen that traditional tools allow data saving and sharing, but mainly in an asynchronic mode. In our view, this discontinuity may favour the repetition of the regulation process instead of its step-by-step refinement over time. The challenge is to inject learning into the adjustment process. This may require synchronic interaction for collective decision-making expertise to elaborate coordinated responses over time.

From this perspective, the study of interaction becomes an important paradigm to capture the resilience capacity of socio-technical systems. The idea of interaction as an instrument of development of cognition, and thus serving adaptation, is not new. It is central to Piaget’s theories (1967, 1992). Adaptation, in his constructivist framework, is achieved through agent-environment interactions via the conjunction of two processes: (a) the assimilation of new experiences into existing structures, and (b) the accommodation of these structures, that is, adaptation of existing ones and/or the creation of new ones. The latter, learning through accommodation, occurs for the purpose of ‘conceptual equilibration’ and the elimination of perturbations.

At a metaphorical level, the resilience capacity in socio-technical systems becomes observable and defined through the study of interactions and coordination modes inside the system and between the system and its environment.
