1 Introduction

The Dutch railway network is one of the busiest in Europe in terms of rail traffic. It is also technically complex, due to its high number of switches, double tracks, and associated signalling (ProRail 2011). This makes it highly vulnerable to disruptions. A disruption is an event or a series of events that leads to substantial deviations from planned operations (Nielsen 2011). Disruptions result in growing dissatisfaction among travellers, extra expenses, and revenue losses. Consequently, responding to disturbances in a timely manner in order to restore services rapidly has become an important objective. To do so, operators must assess the nature and state of the disruption and adjust operations before it becomes impossible to control (Johansson and Hollnagel 2007). Under the influence of restructuring policies, the Dutch railway system has undergone major changes over the past decades, resulting in the separation of infrastructure management and rail operations. This has turned disruption management into an inter-organizational challenge (De Bruijne and Van Eeten 2007; Schulman and Roe 2007).

Railway disruption management involves the rescheduling of three interdependent key resources: (1) rail infrastructure capacity, which is managed by ProRail, the infrastructure manager; (2) train crew; and (3) rolling stock, the latter two being managed by the train operating companies (TOCs).Footnote 1 Control of these key resources is distributed among the multiple, geographically separated control centres of both organizations, all of which enjoy partial autonomy and have the authority to adapt plans. The tight coupling between the resources makes disruption management a complex puzzle and requires the control centres to work closely together. Coordination can be achieved through predefined plans and procedures, but given the dynamic and uncertain environment in which operators work, real-time adaptation of plans is often necessary (Johansson and Hollnagel 2007). In practice, situations during a disruption often changed faster than the parties involved could communicate, and the decentralized control made it difficult to manage disruptions with a national impact (Goodwin et al. 2012).

This is why ProRail and Dutch Railways established a joint Operational Control Centre Rail (OCCR) in 2010. The co-location of both parties was intended to encourage communication and coordination in order to reduce recovery time during disruptions. In the OCCR, ProRail and Dutch Railways monitor railway traffic at a national level and can intervene in local operations when necessary. This makes it possible to synchronize adaptation by the different local control centres, while safeguarding the ability of local operators to quickly respond to small disruptions. In the literature, this kind of control has been termed polycentric control (Branlat and Woods 2010; Woods and Branlat 2010). Polycentric control seeks to sustain a dynamic balance between the two layers of control—those closer to the basic processes with a narrower field of view and scope and those farther removed with a wider field of view and scope—as situations evolve and priorities change.

Nevertheless, this kind of large-scale coordination is not easy in a complex and dynamic environment (Ritter et al. 2007). It also depends on how geographically and organizationally separated teams carry out their roles and manage interdependencies across the different levels of control (Johansson and Hollnagel 2007; Woods and Branlat 2010). During the past year, there have been several large-scale disruptions in the Dutch railway system in which the situation became ‘out of control’ and no one really knew what was going on or what should be done. Effective leadership is thus important to orchestrate the actions of the multiple teams involved in the management of a disruption (DeChurch et al. 2011). However, studies on leadership in multiteam systems operating in non-routine and dynamic environments are scarce, and multiteam system research is a relatively young field based primarily on laboratory studies (Zaccaro and DeChurch 2012). As such, much can be learned about how leadership processes manifest themselves and influence the adaptation process in a real-world context.

In this paper, we are interested in the role of leadership behaviours of the OCCR during the management of large-scale disruptions. This leads us to the following research question:

How do leader teams in the OCCR provide leadership during the management of disruptions and which challenges affect their leadership?

To answer this research question, we have analysed the management of two large disruptions. Before we introduce these cases, we will first take a closer look at the development of the OCCR and its established role and responsibilities in Sect. 2. In Sect. 3, we will look at adaptation in a multiteam system and the role of leadership. This section provides a framework for studying leadership behaviours. The methods are described in Sect. 4, followed by brief case descriptions in Sect. 5. The results of the study are provided in Sect. 6 and discussed in Sect. 7. The conclusions are presented in Sect. 8.

2 Disruption management in the Dutch rail system

The establishment of the OCCR has created a structure with two layers of control, at the regional and national levels (see Fig. 1). ProRail currently has thirteen regional traffic control centres that are responsible for railway traffic in specified geographical areas. ProRail controls and monitors all train movements, and its traffic controllers assign paths to all TOCs. Regional traffic controllers monitor the railway traffic in their designated areas and optimize traffic flows. In addition, train dispatchers are responsible for the safe allocation of railway tracks on the sections assigned to them. Similarly, Dutch Railways has five regional operations control centres that monitor railway traffic and manage train crew and rolling stock schedules. Operators of ProRail and Dutch Railways in the OCCR also monitor traffic and operations at the national level. They coordinate the activities of the different regional operators and regulate shared resources, such as rolling stock. In addition, the creation of the OCCR means that many parties involved in the management of railway disruptions that used to be physically separated are now co-located. These include not only ProRail’s traffic control and Dutch Railways’ operations control, but also the teams responsible for Incident Management, Asset Management, and contractors.

Fig. 1 Different roles involved in traffic management and their lines of communication

If a disruption occurs, ProRail’s train dispatchers and regional traffic controllers assess its impact on rail traffic. Only the train dispatchers have real-time information on the position of trains, and they therefore play a central role in the communications with people at the location of the incident (Schipper et al. 2015). The Back Office places a notification with details on the disruption in the communication system (ISVL), which can be accessed by most parties in the rail system. During this first phase of the disruption management process, the regional control centres of ProRail and Dutch Railways take the lead to prevent the disruption from propagating. Nevertheless, the operators in the OCCR have the authority to overrule all decisions made by the regional control centres. The regional traffic controller then shares an overview of the remaining rail infrastructure capacity with national traffic control and operations control in the OCCR. The national traffic controller checks whether this distribution of the remaining capacity negatively impacts other regions; national traffic controllers maintain a global overview of traffic flows using time–distance diagrams. A contingency plan is then selected together with Dutch Railways’ network operations controllers. These predefined plans contain alternative timetables for the most common disruptions. Before the contingency plan is implemented, a final feasibility check is made with the regional control centres, e.g. whether train drivers are available to operate the trains. The implementation of the contingency plan initiates the second phase of the disruption management process, in which recovery of the rail infrastructure commences. Once rail capacity has fully recovered, rail services are fully restored; this is the third phase.Footnote 2
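
To make the hand-off between the two layers of control concrete, the following minimal sketch renders the first-phase steps as a simple selection-and-feasibility check. It is purely illustrative: the three phases follow the text above, but all names (ContingencyPlan, phase_one, the capacity figures) are our own invention and do not correspond to any actual ProRail or Dutch Railways system.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Tuple

# Illustrative sketch only: hypothetical names, not an actual ProRail/NS system.

class Phase(Enum):
    CONTAINMENT = auto()   # phase 1: isolate the disruption, select a contingency plan
    RECOVERY = auto()      # phase 2: repair of the rail infrastructure
    RESTORATION = auto()   # phase 3: full restoration of train services

@dataclass
class ContingencyPlan:
    name: str
    tracks_needed: int    # infrastructure capacity the alternative timetable requires
    drivers_needed: int   # crew the alternative timetable requires

def phase_one(remaining_tracks: int, available_drivers: int,
              plans: List[ContingencyPlan]) -> Tuple[Optional[ContingencyPlan], Phase]:
    """Sketch of the first phase: the regional capacity overview is checked
    against each predefined plan, including the final feasibility check with
    the regional control centres (are enough train drivers available?)."""
    for plan in plans:
        fits_infrastructure = plan.tracks_needed <= remaining_tracks
        crew_available = plan.drivers_needed <= available_drivers
        if fits_infrastructure and crew_available:
            return plan, Phase.RECOVERY  # implementing the plan starts phase two
    # No predefined plan matches: plans must be adjusted by hand (see Sect. 6.3).
    return None, Phase.CONTAINMENT

# Example: a disruption leaves 4 tracks and 10 drivers; only the reduced plan fits.
plans = [ContingencyPlan("full service", 6, 14),
         ContingencyPlan("reduced service", 4, 9)]
print(phase_one(4, 10, plans))
```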

3 Leadership in a multiteam system

The adaptive capacity of complex systems has been found to depend on the balance between the distribution of authority and autonomy across local control centres and the capacity to avoid a fragmented response to disruptions (Woods and Branlat 2011a, b; Woods and Shattuck 2000). In the literature on resilience engineering, the answer to this trade-off is sought in polycentric control (Branlat and Woods 2010; Ostrom 1999; Woods and Branlat 2010). Polycentric control seeks to sustain a dynamic balance between local and distant centres of control, which are in constant interplay as situations evolve and as a result of the activities and progress at each centre (Branlat and Woods 2010). Although research on polycentric control is accumulating, little is known about how it works and how a dynamic balance between the two layers of control should be maintained. As mentioned in the introduction, managing the interactions of the control centres (both horizontally and vertically) is not an easy task. It requires multiple teams, working at different locations and with different organizational backgrounds, goals, and responsibilities, to effectively align their activities.

There is a growing body of literature on these so-called multiteam systems (MTS), i.e. networks of distinct yet interdependent (component) teams that address highly complex and dynamic environments (Shuffler et al. 2014; Zaccaro et al. 2012). MTS are formally defined as ‘two or more teams that interface directly and interdependently in response to environmental contingencies toward the accomplishment of collective goals’ (Mathieu et al. 2001: 290). In contrast to most studies on teamwork, which focus on individuals within a single team, MTS research examines how multiple teams function together, in order to grasp the unique opportunities, challenges, and complexities of these systems (Marks et al. 2005). For instance, although teams might be effective at within-team coordination, the system itself may still fail to adapt to a disruption due to an inability to meet between-team coordination requirements (Luciano et al. 2015). MTS research has stressed the importance of leader teams (e.g. representatives of the component teams), situated hierarchically above the component teams, who have system-wide responsibilities and the task of managing the interdependencies among component teams (Davison et al. 2012). Studies have shown that effective leadership has a positive influence on inter-team coordination and overall MTS performance (e.g. DeChurch and Marks 2006; DeChurch et al. 2011). It is therefore important to look at the behaviour of these leaders in the adaptation process of an MTS (Zaccaro and DeChurch 2012).

Location, timing, and the type and severity of the incident all influence the adaptation process and the capacity of the system to adjust operations before it becomes impossible to control (Golightly et al. 2013). The way in which operators respond to a disruption (both individually and as a team) will also be context specific, depending on individual characteristics such as experience, knowledge, and flexibility (Maynard et al. 2015). Nevertheless, Burke et al. (2006) argue that team adaptation follows a cyclical process consisting of four phases: (a) situation assessment, (b) plan formulation, (c) plan execution, and (d) learning. In this study, we look at the first three phases. In their model, Burke and colleagues stress the importance of teamwork competencies, such as mutual monitoring, communication, backup behaviour, and leadership, during the phase of plan execution. We believe that these teamwork competencies are also important in MTS settings, but argue (and will show later in the paper) that they matter not only during plan execution, but also during the phases of situation assessment and plan formulation.

First of all, adaptation requires the ability to quickly recognize cues that signal the need for adaptive action. However, as Uitdewilligen and Waller (2012) observe, since there are many component teams in an MTS, situation assessment is highly distributed, and the situation awareness of teams is therefore distributed as well. To create a compatible understanding of the situation between teams, it is essential to share crucial information. Exchanging appropriate information and providing each other with regular updates helps teams maintain a compatible situation awareness of the dynamic environment and thereby ensures coordinated behaviour. MTS leaders can facilitate communication and the timely and accurate exchange of information between component teams to maintain situation awareness. Moreover, during moments of stress, component team members might not be able to maintain an awareness of the system as a whole (Uitdewilligen and Waller 2012). Leader teams can act as an information hub to create an overall understanding of the operational environment and of the potential future development trajectories of the system. The latter is important for formulating a plan, or picking a contingency plan, that brings the MTS’s capabilities, resources, and actions into line with the emergent dynamics in the operating environment. The quality of this plan depends on how well it fosters and maintains this alignment (Zaccaro and DeChurch 2012).

Leader teams also have an important role in monitoring the performance of component teams in terms of their progress towards system-level goals (Zaccaro and DeChurch 2012). For example, leader teams can provide feedback in the form of verbal suggestions or corrective behaviours in the event of errors or performance discrepancies (Marks et al. 2001). Component team members may also struggle to perform their tasks due to a high workload. In this case, leader teams can provide backup behaviour by prompting other component teams to provide help, by shifting workload to other teams, or by proactively offering help with specific tasks. Finally, given the dynamic environment in which MTSs operate, it is crucial that this environment is continuously monitored, both internally (the status and needs of the teams) and externally (environmental conditions) (Marks et al. 2001). If unexpected changes occur within an MTS’s performance environment and the contingency plan no longer seems appropriate, it must be decided whether to reconsider, abandon, or adjust the original plan (ibid.). Leader teams play an important role in monitoring the system, identifying impending and actual blockages to goal accomplishment, and, where necessary, adapting the course of action (Zaccaro and DeChurch 2012).

In Table 1, we have summarized the above-mentioned leadership functions and provided behavioural markers. Behavioural markers are descriptions of, in this particular case, observable leadership behaviours (Dietz et al. 2015). Some of the markers have been adopted from theory on individual teams. We have translated these markers to make them suitable for the multiteam context of our study.

Table 1 Important components of effective leadership and their behavioural markers

Leadership behaviours are assumed to have an important influence on the relationship between the adaptation process and its outcome (Maynard et al. 2015; Zaccaro and DeChurch 2012). However, it is difficult to quantify and compare outcomes given the unique characteristics of disruptions and their contexts. We therefore relate leadership to system performance through its ability to secure the adaptive capacity of the system. Woods and Branlat (2011a, b) have identified three basic patterns of adaptive failure in complex systems: (a) decompensation, (b) working at cross-purposes, and (c) getting stuck in outdated behaviours. These patterns can eventually lead to a system breakdown and thus need to be avoided, or recognized and escaped from. Decompensation occurs when disruptions grow and cascade faster than operators can respond; the capacity of operators to maintain control can then suddenly collapse, and the capacity of the system to respond to immediate demands may be lost. Working at cross-purposes is the result of a lack of coordination between the different control centres (both horizontally and vertically) and produces conflicting goals that undermine the system’s over-arching goals. The last pattern is at play when people hold on to initial assessments of situations and lack the capacity to revise plans as conditions change. As a result, the tactics or strategies chosen do not match the actual challenges, and there is a risk of failure to adapt.

In this study, we look at the adaptation process in two cases, with an emphasis on the communication and coordination processes between the teams in the OCCR and the teams in the local control centres. We are especially interested in whether and how leadership behaviours are applied to prevent or correct the system from falling into one of the three maladaptive traps (see Fig. 2).

Fig. 2 Analytical framework

4 Methods

4.1 Case selection

To examine the leadership of leader teams in the OCCR, two cases of high-impact disruptions were studied. These disruptions were selected because of their non-routine characteristics and rapidly changing environmental conditions, factors that increase the risk of adaptive failures and therefore necessitate effective leadership. In case 1, we examined leadership during a winter storm that challenged the ability of local operators to stay in control. In case 2, we studied the management of a broken overhead wire at the largest train station in the Netherlands. Following Woods and Cook (2006), these cases do not serve as examples of successful or unsuccessful adaptation, but we believe they are valuable for revealing patterns in teamwork and leadership behaviours in a naturalistic environment.

Many teams are involved in the management of disruptions, each with their own tasks and responsibilities. For instance, the ability to swiftly recover from a disruption depends greatly upon how quickly maintenance teams are able to repair rail infrastructure. As the focus of this study is on the leadership of leader teams in the OCCR, we have focused our analysis on the interactions between ProRail’s local and national traffic control teams and Dutch Railways’ local and national operations control teams.

4.2 Data collection

To examine the leadership of the leader teams in the OCCR, ProRail provided access to recordings of 102 telephone conversations between national and regional traffic controllers during both disruptions. Unfortunately, Dutch Railways was unable to provide us with recordings of the telephone conversations between their operators in the OCCR and the local control centres. However, a large number of documents were obtained from both ProRail and Dutch Railways. We examined shift reports written by the operators of both organizations involved in managing the disruptions, event reports on both disruptions, and the communication system logs. In addition, the winter storm case was evaluated internally by ProRail and Dutch Railways. This extensive evaluation report includes a careful examination of the communication between ProRail’s national and regional traffic controllers and was used as complementary data. For the broken overhead wire case, we conducted our own evaluation, which included nine interviews with operators directly involved in the management of the disruption. All of these interviews were tape-recorded and transcribed. The evaluation was presented to a group of managers from ProRail and Dutch Railways for expert feedback on the findings. Finally, ten follow-up interviews were held with managers and operators to clarify events and leadership behaviours. As some of the interviews with operators were held during their shift, it was not possible to tape-record them; instead, detailed notes were taken. All other interviews were recorded and transcribed.

4.3 Data analysis

The telephone conversations (102 in total) were transcribed and then coded to capture leadership behaviours. The software package ATLAS.ti was used to systematically code the data. Instead of the more common quantitative approach of measuring behavioural markers as a frequency or on a scale, a qualitative approach was chosen, which involved labelling the leadership functions. Fragments of the telephone conversations were labelled according to the markers provided for the leadership functions (Table 1). For instance, if a national traffic controller informed a regional traffic controller that he would be rerouting international trains, this fragment was coded as proactively assisting component teams. This qualitative approach made it possible to provide a rich description of leadership behaviours, and of the challenges to leadership, on the basis of a systematic analysis. The telephone conversations were also used to identify indicators of the three adaptive traps; such an indicator might be a request for help, signalling that an operator is at risk of losing the capacity to adapt. In a second step, we used our additional data to complement these initial findings, identify patterns in the behaviours of the leader teams, and relate this behaviour to the three adaptive traps.
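
As an illustration of this coding step, the sketch below shows how fragments of a transcribed call could be labelled with behavioural markers and with indicators of the adaptive traps, and then tallied to reveal patterns across the 102 conversations. It is a simplified, hypothetical stand-in for the ATLAS.ti workflow; the label names are examples loosely based on Table 1, and the data are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical stand-in for the ATLAS.ti coding step; labels and data invented.

@dataclass
class Fragment:
    call_id: int
    speaker: str          # e.g. "NTC" (national) or "RTC" (regional traffic controller)
    text: str
    codes: List[str] = field(default_factory=list)

# Example marker labels (cf. Table 1) and adaptive-trap indicators (cf. Sect. 3).
MARKERS = {"proactive_assistance", "information_exchange", "performance_monitoring"}
TRAP_INDICATORS = {"request_for_help", "conflicting_actions", "outdated_assessment"}

def code(fragment: Fragment, *labels: str) -> Fragment:
    """Attach one or more labels to a conversation fragment."""
    for label in labels:
        assert label in MARKERS | TRAP_INDICATORS, f"unknown code: {label}"
        fragment.codes.append(label)
    return fragment

# A national traffic controller announcing the rerouting of international trains
# would be coded as proactive assistance to the component teams.
f = code(Fragment(17, "NTC", "I will reroute the international trains."),
         "proactive_assistance")

def tally(fragments: List[Fragment]) -> Dict[str, int]:
    """Count coded fragments per label to identify patterns across the calls."""
    counts: Dict[str, int] = {}
    for frag in fragments:
        for label in frag.codes:
            counts[label] = counts.get(label, 0) + 1
    return counts

print(tally([f]))  # {'proactive_assistance': 1}
```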

5 Case descriptions

Before we move on to the results of our study, we will first give a brief description of both cases. A more detailed timeline of the events in both cases is provided in Tables 2 and 3.

Table 2 Main events in the first case: winter storm
Table 3 Main events in the second case: broken overhead wire

Case 1 Winter storm

The first case occurred on a winter’s day in 2014. Around 5:30 a.m., a massive snow storm caused numerous malfunctions of switches and guarded crossings in the southern part of the Netherlands. This region is managed by two regional traffic control centres, Eindhoven and Roosendaal. Within two hours, twenty-six malfunctions had been reported. Prior to the storm, cuts to the rail service had been made to add some slack to the system. Nonetheless, due to the diminishing rail capacity, regional traffic controllers and train dispatchers struggled to keep rail traffic flowing. Around 9:15 a.m., the regional traffic controller in Eindhoven temporarily stopped all rail traffic to get an overview of the situation and regain control. This came as quite a surprise to the operators in the OCCR, who were unaware of the severity of the situation. Around 11 a.m., the regional traffic control centres regained control and rail service was gradually restored. However, it took another six hours to get the rail service completely up and running, due to the limited availability of train crew and rolling stock.

Case 2 Broken overhead wire

Early on a Monday morning in 2015, a train broke an overhead wire upon entering Utrecht Central Station, the largest and most important station in the Netherlands. Power was automatically cut from the overhead system in the vicinity of the broken wire, depriving six platform tracks and two rail tracks of power. Normally, it is possible to restore power to unaffected groups remotely, but due to construction work at the station, the groups had been rearranged and power had to be restored manually. This made it difficult for the train dispatchers and regional traffic controllers to estimate the available rail capacity, and it consequently took almost one and a half hours to implement a contingency plan. Despite this contingency plan, train dispatchers and regional traffic controllers kept struggling to keep the rail traffic flowing, as there was often no crew on the trains. With all platform tracks occupied, trains were queuing up to enter Utrecht Central Station. The overhead wire was repaired around noon, but it took several more hours to fully restore train services.

6 Results

In this section, we show how the different leadership behaviours manifested themselves during the management of the disruptions. To structure the description of our findings, we use the three basic patterns of adaptive failure to examine whether and how the different leadership behaviours were used to prevent or correct the system from falling into one of the three traps.

6.1 Decompensation

The pattern of decompensation can be observed in both cases. In the winter storm case, for example, ProRail’s local operators were confronted with cascading failures as the snow caused more and more malfunctions of switches and level crossings. As a result, operators quickly fell behind the tempo of events. For instance, due to problems with the crossing barriers, train dispatchers had to give verbal instructions to each train driver before a train could pass a level crossing. These verbal instructions greatly increased their workload and caused severe delays to the train services. These delays and the loss of rail capacity made it very difficult for the regional traffic controllers to keep the rail traffic running. For the local operators of Dutch Railways, updating the crew schedules became a major bottleneck during both disruptions. Since last-minute changes to the crew schedule must be announced by phone, the communication workload increased rapidly and operators struggled to get in contact with the train crews. Hence, there were too few operators to manage all the anomalies, and the overview of the train crew was soon lost, as one of the coordinators of Dutch Railways (LBC) explains:

LBC: If you have one or two phone calls on paper, but not in the system, you are lost. A thousand people will start to phone you and they all just want one thing: they want to know what they should do and if they will be back on time at the end of their shift.

As a result, trains often could not depart because there was no crew assigned to them. With the platforms still being occupied, arriving trains could not enter the stations. This caused a further escalation of the situation and an increase in workload, since train drivers and conductors on the trains queued outside the station had to be rescheduled.

6.1.1 Backup behaviour by leader teams

To resolve the above-mentioned deadlock and prevent local controllers from completely losing control, the coordinator of Dutch Railways in the OCCR decided to switch to the highest emergency level (code red M3 + P3) during both disruptions. This ‘code red’ procedure is designed to gain more and better control over the rescheduling of train crew and involves several measures. First of all, management tables were placed at the largest stations; all crew members arriving at a station had to report to this table to be registered. Registration at these tables enables local operators to update the systems and to re-assign crew to trains. Secondly, the coordinators of Dutch Railways decided to redistribute the rescheduling of the crew on long-distance trains among the other local control centres and the operators in the OCCR. In addition, operators in the OCCR took over the management of the rolling stock, so that additional capacity became available at the regional control centres for the rescheduling of train crew. Nevertheless, as both cases show, it took quite some time to fully regain control, and sometimes it was even easier to simply wait until the next shift of train crew and start with a clean sheet.

Likewise, although they were not acting according to a formalized procedure, we noticed that ProRail’s national traffic controllers proactively assisted the regional traffic controllers by rerouting international and cargo trains, updating the communication system (ISVL) with details on the disruption and on verbal agreements, arranging locomotives to tow stranded trains, and cleaning up the timetables. The latter is a task that is easily shed during periods of high workload.

6.1.2 Performance monitoring and recognizing workload problems

However, the cases also show that operators in the OCCR struggle to determine whether local operators are exhausting their capacity to adapt and backup needs to be provided. Since national traffic controllers have only a general overview of the traffic flows, they cannot determine the seriousness of local situations by means of the traffic control systems. Hence, it is important to have regular contact with local operators to monitor their performance. Yet, as the overhead wire case clearly showed, national traffic controllers increasingly struggled to contact regional traffic controllers in order to create a shared understanding and discuss the need for backup. Moreover, national traffic controllers often waited passively to be called for help instead of proactively offering assistance. In the telephone conversations, however, we identified only one clear request for assistance by a regional traffic controller, and the help offered was sometimes rejected even though the operators were faced with a huge workload. This greatly increases the risk of intervening only when the capacity to adapt has already been lost.

As Branlat and Woods (2010) observe, it is important to detect a developing problem at an early stage in order to respond and avoid a decompensation collapse. The key information then is how hard operators are working to stay in control. In both cases, we noticed that little time and effort was invested in discussing the performance of the regional control centres and potential future risks. For example, in the winter storm case, the information shared between the regional and national traffic controllers mainly consisted of an enumeration of all the malfunctions. Although the national traffic controller acknowledged the seriousness of the situation at an early stage, this information was never translated into a shared understanding of the impact of all the malfunctions on the train service and on the local operators’ ability to stay in control. Hence, the national traffic controller was unaware that the regional traffic controllers were nearing their capacity limits.

Moreover, even when there are clear signals that local operators are struggling to stay in control, operators in the OCCR do not always recognize the seriousness of the situation and respond to these signals immediately. For example, in the winter storm case, the regional traffic controller specifically asked for help, but the national traffic controller responded by asking the regional traffic controller to first log the remaining rail capacity in the communication system. So, instead of discussing the operational situation by phone, the national traffic controller had to make sense of the situation on the basis of a simple text message, and thereby missed important contextual information. This reliance on communication systems to monitor the performance of the local control centres entails other risks. Due to the high workload, local operators were often unable to update the system with new information on disturbances and verbal agreements. Hence, operators in the OCCR might have made sense of the operational situation on the basis of outdated or incomplete information. Furthermore, the lack of new information in the communication system might falsely give the impression that everything is under control.

6.2 Working at cross-purposes

Contingency plans form an important coordination mechanism in the Dutch railway system, as they tell the operators of ProRail and Dutch Railways which trains should be cancelled and when and where trains should be short-turned. However, before a contingency plan can be implemented, the disrupted area first has to be isolated to prevent congestion and a propagation of the disruption to other areas. The workload of local operators can peak sharply during this first phase of the disruption management process, especially if the disruption occurs at a major station, as in the second case, where trains have to be shunted and a lot of rescheduling work has to be done. Moreover, coordinating these activities requires a great deal of dialogue between the control centres. For instance, the regional traffic controller has to warn neighbouring traffic controllers about the situation and order them to stop trains from moving into the affected area. ProRail’s regional traffic controller also has to consult with the regional monitor of Dutch Railways to decide where trains should be short-turned and what should be done with the trains stranded in the disrupted area. Hence, this first phase of the disruption management process is characterized by local improvisation and little control over the situation by the OCCR. A national traffic controller outlines the situation in the broken overhead wire case:

NTC: We don’t have a contingency plan ready, but they (local control centres) are very active in short-turning trains. They are very busy at all locations, but how exactly and what they are precisely doing, I don’t know. They are still writing everything down.

However, in both cases we noticed that information is often no longer shared properly during stressful situations, as people tend to focus on their own task. For example, regional traffic controllers often told national traffic control that they experienced updating and reading the messages in the communication system as an administrative burden, which had lower priority than trying to keep traffic flowing. Moreover, telephone lines quickly became overloaded and communication broke down due to the large volume of direct communication between operators. This caused control centres to work at cross-purposes, as teams acted on the basis of incomplete information and faulty assumptions. In the second case, for instance, neighbouring traffic control centres were unaware of the difficulties that operators in the disrupted area were experiencing in keeping the traffic flowing, and therefore kept sending trains to the disrupted area. As a result, trains were queuing up before the station, which made it more difficult to isolate the disrupted area and halt the spread of the disruption. Similarly, Dutch Railways’ local control rooms started to make use of each other’s resources, such as train personnel, without consultation. There were also instances in which train drivers relied on (incorrect) information from ProRail’s train dispatchers because they could not get in contact with their own organization.

6.2.1 Orchestrating action and managing the flows of communication

The OCCR has the important task of developing an overall understanding of operational conditions. To create this overall understanding, the coordinators of the different teams co-located in the OCCR regularly come together to share and discuss the information received from local operators and to decide on a shared course of action. The various parties then inform the local operators of the decisions that have been made, in order to orchestrate their activities. However, as we observed especially in the broken overhead wire case, the overall understanding of the situation created by the coordinators in the OCCR can quickly become outdated, as one of the national coordinators rail (LCR) explains:

LCR: What you repeatedly see is that we are running behind the facts here in the OCCR. What often happens is that we are discussing things that are already outdated. So, while we are creating a shared understanding, the situation outside has already changed completely.

Hence, it is important that local operators provide regular situation updates so that the operators in the OCCR can update their overall understanding of the situation. Despite this, we noticed that these big-picture updates were very scarce. Instead, the operators in the OCCR had to actively collect the information themselves, which was made difficult by the overloaded telephone lines. In fact, in the broken overhead wire case, a pattern emerged in which the neighbouring traffic control centres were actually providing the national traffic controllers with important new information when contacting them for guidance.

This information disadvantage negatively influenced the OCCR’s ability to monitor performance and take control when needed. First of all, since decision making by the coordinators in the OCCR was based on already outdated information, their decisions were often no longer feasible and new rounds of decision making had to be started. As a result, the role of the OCCR became reactive instead of proactive. Moreover, developing a collective understanding on the basis of new information takes quite some time, which conflicts with local operators’ need for quick decisions in order to intervene in an escalating situation. For example, in the winter storm case, the regional traffic controller single-handedly decided to stop the rail traffic in his area of control while the operators in the OCCR were still discussing newly obtained information about the situation outside. Had this decision been coordinated better, it might have had less impact on the management of the train crew, and services could have been restored sooner. Finally, instead of treating the OCCR as a hub for information collection and dissemination, local control centres often bypassed it for information and consultation, seeking direct contact with the local operators managing the disruption in order to receive firsthand information. This is illustrated by the following fragment of a conversation between a regional traffic controller and a national traffic controller.

RTC: I will discuss matters with Utrecht. Not to be rude, but I prefer to listen to Utrecht instead of you, because with them I have a shorter line of communication (…) If you tell me that they will be able to manage things and the regional traffic controller over there says he is not, then I will run into problems with them.

Dutch Railways tries to solve these synchronization issues between and with the local control rooms by scheduling regular conference calls with its shift leaders to obtain periodic situation updates. In addition, the coordinator of Dutch Railways in the OCCR can make use of four ‘cards’ (punctuality, control, large traffic flows, and rolling stock) that are assigned to each control centre to match the operational environment. These cards indicate the priorities for each control area and provide guidelines for achieving those goals. For instance, during these major disruptions the coordinator assigned the ‘control’ card (preventing the propagation of disruptions) to all regional control rooms in order to shift to a clear chain of command, in which there should be no discussion about decisions made by the operators in the OCCR. Nevertheless, applying these cards is not without difficulties when a trade-off has to be made between the goals of carrying passengers and achieving a balance in the rolling stock, as one of the coordinators of Dutch Railways (LBC) explains:

LBC: A chain of command starts with good agreements and communication and there you have it… Good communication is often difficult because you can’t get into contact with each other. I’m also convinced that not everyone fully understands what these cards actually mean. You should actually do a check. We are currently playing the ‘rolling stock’ card, but do you know what that means? It means that I can cancel a passenger train to free up a train driver, because rolling stock has first priority. There should be no discussion then about the fact that the train is full with passengers and that cancelling the train will lead to a crowded platform.
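
The trade-off described in this quote can be read as a simple priority rule attached to each control area. The sketch below renders that reading: the four card names follow the text, but the decision function and its logic are our hypothetical interpretation, not an actual Dutch Railways rule set.

```python
from enum import Enum

# Hypothetical rendering of the 'card' mechanism; not an actual NS rule set.

class Card(Enum):
    PUNCTUALITY = "punctuality"
    CONTROL = "control"                 # prevent propagation of disruptions
    TRAFFIC_FLOWS = "large traffic flows"
    ROLLING_STOCK = "rolling stock"

def may_cancel_full_train(card: Card, frees_up_driver: bool) -> bool:
    """Illustrates the trade-off in the quote: under the 'rolling stock' card,
    even a full passenger train may be cancelled to free up a train driver."""
    if card is Card.ROLLING_STOCK:
        return frees_up_driver          # rolling stock balance has first priority
    if card is Card.CONTROL:
        return True                     # containment overrides ridership (assumption)
    return False                        # punctuality/flow cards: keep the train running

# Example assignment of cards to control areas (invented).
assignments = {"Utrecht": Card.ROLLING_STOCK, "Eindhoven": Card.CONTROL}
print(may_cancel_full_train(assignments["Utrecht"], frees_up_driver=True))  # True
```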

6.3 Getting stuck in outdated behaviour

The success of the Dutch disruption management model largely depends upon the capacity of local operators to quickly make a correct situation assessment so that a contingency plan matching operational conditions can be implemented. In a time-compressed and dynamic environment, however, the information needed for an accurate assessment of the situation is often lacking, while decisions have to be made quickly to prevent the situation from escalating (Salas et al. 2001). Regional traffic controllers normally deal with this issue by relying on their experience. In other words, they anticipate that a situation will unfold in line with earlier experiences and start to manage the disruption according to the anticipated contingency plan (cf. Schipper et al. 2015). This is not an easy task, however, when disruptions are cascading, as in the first case, or when operators are confronted with a new and complex situation, as in the second case. In those cases, the understanding of the situation often needs to be adjusted on the basis of new insights (Uitdewilligen and Waller 2012). In the broken overhead wire case, this led to a tension between the desire to implement a contingency plan and the need to remain vigilant to changes in the environment.

6.3.1 Plan formulation and remaining vigilant to changes in the environment

The previous section highlighted the risks of managing a disruption without a shared plan. To reduce these risks, operators in the OCCR tried to formulate and implement a contingency plan as soon as possible. Hence, national traffic controllers urged regional traffic controllers to quickly make an assessment of the remaining infrastructure capacity. However, train dispatchers and regional traffic controllers found it difficult to make an accurate assessment of the complex and evolving situations, either because much was still unknown or because any assessment of the situation was soon outdated. For example, in the broken overhead wire case it took quite some time to investigate the break in the overhead wire and restore power to the overhead lines, while in the winter storm case the number of malfunctions reached a total of twenty-six within an hour. Moreover, in both cases we observed that regional traffic controllers struggled to divide their attention between making a situation assessment and keeping the traffic flowing to prevent a propagation of the disruption; they often preferred to focus on the latter.

In the second case, the implementation of a contingency plan was further delayed because the unique circumstances meant that the predefined plans were not applicable. Hence, plans had to be adjusted by hand to the specific circumstances, which is a time-consuming task. In the meantime, the situation deteriorated rapidly: local operators of Dutch Railways were struggling to assign crew members to trains, platform tracks remained occupied, and trains were queuing up in front of the station. Consequently, the issue was no longer just a loss of infrastructure capacity due to the broken overhead wire, as operators struggled to keep control over all resources. Hence, the alternative service plan being implemented no longer matched operational conditions. In fact, the OCCR’s desire to swiftly move on to the plan execution phase conflicted with the capacity limits of the local operators, as one of the team leaders of the regional traffic control rooms explains:

Team Leader: When they (OCCR) want to implement a contingency plan, which in my view happens more often, we are still in the first phase of managing the disruption. Dealing with the shunting of trains so we can get an overview of the situation and to see what is still possible. At that point, there is already a logging in ISVL that we will operate according to this contingency plan. When that logging was made we had seven trains waiting for a red signal! (…) I believe that there has been a check, but the desire of National Traffic Control (to quickly implement the contingency plan) and what we could manage in practice, didn’t match.

6.3.2 Adjusting plans to unexpected changes in the environment

The national traffic controller had indeed checked with the regional traffic controllers whether they thought the alternative service plan could be implemented. Although the regional traffic controllers agreed with the plan, they soon had to revise their judgement and make additional cuts to the train service. There was actually considerable doubt among the national traffic controllers as to whether the regional traffic controllers had made an accurate assessment of the available capacity and whether the contingency plan could be implemented. This concern was never fully expressed to the regional traffic controllers, nor did the parties take the time to jointly make a sound assessment of the situation in order to detect any mistakes. In fact, the operators of ProRail and Dutch Railways in the OCCR decided not to significantly adjust the plans, but to hold on to the chosen contingency plan in order to create stability, and to re-assess the situation later on to see whether additional measures were needed.

Nevertheless, in this case the contingency plan did not lead to a stable train service, as there was not enough capacity to run all the trains according to the alternative service plan. Instead of stability, incremental adjustments had to be made to the contingency plan to match it to the changing conditions. This kind of re-planning is not without risks. Not only does it lead to unreliable information for passengers, since trains are cancelled at the last minute, but it also causes confusion among the control centres. Revising a plan requires a great deal of renewed coordination between the different control centres and increases their communicative burden and workload. This makes the decision to revise a plan in progress a difficult one and highlights the importance of making an accurate assessment of the situation. In practice, though, operators in the Dutch railway system (at both levels of control) often tend to simplify conditions and make an optimistic estimate of the possibilities for running trains, as a travel information employee (MRI) explains:

MRI: What you could witness here was the classical rail spasm, which you see often, to say let’s try and see what happens (…) The problem is that you are totally unpredictable for the passengers. At best you are predictable in terms of underperformance (…) I wonder if we would have had the same problems if we had made bigger cuts to the train service. Then afterwards, we could have seen what was still possible and if there was room for more. Now we make initial cuts in the train service and start to clean up the mess. However, the mess doesn’t become any smaller and we still have to make additional cuts.

7 Discussion

The analysis of these two large-scale disruptions has shown that leadership is not an easy task in an MTS adapting under stress and that adaptive failures form a serious threat to the system. Table 4 summarizes the barriers to leadership found in the previous section and contrasts them with the markers from Table 1. The main findings are discussed below.

Table 4 Summary of the barriers to leadership observed in the cases

First of all, we have seen that decompensation is a serious issue in the Dutch railway system during large-scale disruptions, as local operators fell behind the tempo of events. To avoid this maladaptive trap, it is important that workload problems are noticed quickly and that workload is redistributed or assistance is offered proactively. We observed two specific issues with backup behaviour: providing backup, and requesting and accepting it. First, operators in the OCCR often struggled to adequately monitor the performance of local operators in order to detect whether they might need backup and how this should be provided. Besides performance monitoring, it is therefore important that local operators themselves indicate that they need assistance and that help is accepted when needed.

However, when confronted with increasing demands, local operators are not always able to recognize and express their need for assistance. As Smith-Jentsch et al. (2009) note, backup providers and recipients weigh up the likely costs and benefits of coordinating backup prior to offering or requesting it. The interviews revealed that regional traffic controllers often refuse help because they prefer to manage things on their own. Regional traffic controllers fear relinquishing control over their process and losing sight of the overall picture in their own region. Moreover, some regional traffic controllers actually believe that asking for help is a sign of weakness. Studies have shown that factors like trust, team orientation, and the experience of working together have a positive effect on offering and requesting backup (Fiore et al. 2003; Smith-Jentsch et al. 2009). However, given the setting of distributed teams and continuously changing team compositions, it can be expected that these factors will be less developed and that requests for assistance will be context specific, as one of the traffic coordinators of ProRail describes:

Traffic coordinator: It strongly depends on who is on the other side of the phone. A good regional traffic controller knows when to hand things over, instead of wanting to do everything themselves. If I call them and tell them, I will call your colleague, or I will take over this part of your work, they shouldn’t mind.

Secondly, the telephone conversations revealed that signals of backup needs were not always recognized by the operators in the OCCR as a legitimate need for help. This is partly because the information shared was so detailed that the operators in the OCCR were unable to grasp the core message. This shows that merely sharing information about the situation at hand is not enough: it needs to be translated into information that is meaningful to others. The telephone conversations also showed that operators in the OCCR regularly failed to ask for clarification of the information received in order to create a shared understanding of the situation. The interviews revealed that operators in the OCCR are often hesitant to cross-check information for fear of intervening in the work of the local operators.

Another key issue we identified was the information gap of the teams in the OCCR during the management of the disruptions. We expected the OCCR to have an overall understanding of the situation during the disruptions in order to orchestrate the activities of the local control centres. Instead, we found that the OCCR’s situation awareness quickly became degraded or outdated due to the amount of information that had to be shared between teams, inadequate communication lines, and the pace at which the environment changed during the disruptions. This shows that effective leadership is not just the result of the actions of the leader teams: component teams play a critical role in facilitating the performance of leader teams by maintaining their situation awareness (Salmon et al. 2008). As such, component teams should be aware of the kind of information the leader teams need and provide regular updates, something that is easily neglected when confronted with a high workload.

Finally, this study has revealed important tensions between coordination by plan and the need to remain vigilant to changes. Research has shown that the adaptability of teams depends on the speed with which environmental changes are recognized and appropriate responses are enacted (Burke et al. 2006). However, local control centres need to contain the disruption and make an accurate situation assessment simultaneously, and as the cases illustrate, the latter is not always easy. The local operators were nevertheless under pressure from the OCCR to move quickly to the implementation of a contingency plan. The broken overhead wire case showed that situation assessment and plan execution are thus not always strictly separated steps; these activities actually overlapped and even conflicted. This resulted in an oversimplification of conditions and, ultimately, in the need to revise the contingency plan. Hence, disruption management is not always a single, linear process, but may involve several rounds of assessment, rectification, and adjustment of plans (Golightly et al. 2013). This creates an important challenge for operators in the OCCR, who have to decide between holding on to an initial assessment and revising a plan in progress, the latter of which involves a great deal of renewed coordination between the teams involved in the disruption management process.

8 Conclusion

While most studies have focused on the contribution of leadership to the adaptation of single teams, leadership in a multiteam setting poses additional challenges. Both in theory (with the development of the concept of polycentric control) and in practice (with the development of the OCCR), there is a strong belief that complex networks of control centres—which pursue their own sub-goals and operate in a dynamic and turbulent environment—need a higher level of control to coordinate their activities. The main aim of this study was to further investigate the role of MTS leadership in a real-world setting. We therefore examined the role of leader teams in the management of two large-scale disruptions. This study has shown that operators in the OCCR experienced difficulties in recognizing workload problems before local operators lost the capacity to control the situation; were confronted with outdated situation awareness when coordinating the activities of the local control centres; and tended to oversimplify conditions in order to swiftly implement standard contingency plans.

The challenges to MTS leadership identified in this study show that it cannot be expected that polycentric control will instantly occur, simply by placing a leader team above the component teams. Leadership in a MTS requires effective teamwork between component and leader teams in which the component teams should actually facilitate the leader teams in their role. This requires specific interventions, such as joint training sessions, in order to gain a better understanding of how other teams function and to improve communication and coordination skills (Wilson et al. 2005).

Naturally, we are aware of the limitations of our study. The two case studies analysed show the behaviour of a specific group of operators dealing with a specific disruption. It is therefore difficult to generalize the insights of this study, although we should point out that these findings are embedded in broader longitudinal research. A larger body of knowledge on the management of disruptions was collected over a three-year period through many hours of observation at the different control centres, interviews with operators, and the study of evaluation reports on other disruptions. Hence, the detailed descriptions of the findings from these two cases are embedded in a broader understanding of disruption management in the Dutch railway system and of the behaviour of operators at both levels of control.

With this research, we have shown some of the difficulties of providing leadership in an MTS. We believe, however, that leadership is important to MTS effectiveness. Further empirical research on leadership in various multiteam systems is therefore needed to increase our understanding of the unique challenges of leadership processes in MTS and of how to deal with them. In addition, this study has not focused on leadership behaviours prior to and following the management of a disruption, although these transition phases can be important to the effectiveness of leadership during the management of disruptions. Moreover, the coordination between the different leader teams within the OCCR fell outside the scope of this study, but poses an interesting challenge in terms of balancing the needs of one’s own team or organization against those of the system as a whole. Future research on these topics could further our understanding of leadership in an MTS.