A Next Step in Disruption Management: Combining Operations Research and Complexity Science

Railway systems occasionally get into a state of out-of-control, meaning that barely any train is running, even though the required resources (infrastructure, rolling stock and crew) are available. These situations can be caused either by large disruptions or by unexpected propagation and accumulation of delays. Because of the large number of affected resources and the absence of detailed, timely and accurate information, currently existing methods cannot be applied in out-of-control situations. Most of the contemporary approaches assume that there is only one single disruption with a known duration, that all information about the resources is available, and that all stakeholders in the operations act as expected. Another limitation is the lack of knowledge about why and how disruptions accumulate and whether this process can be predicted. To tackle these problems, we develop a multidisciplinary framework aiming at reducing the impact of these situations and, if possible, avoiding them. The key elements of this framework are (i) the generation of early warning signals for out-of-control situations using tools from complexity science and (ii) a set of rescheduling measures robust against the features of out-of-control situations, using tools from operations research.


Introduction
The phrase 'no news is good news' is particularly true for train operating companies; when the railways do make the headlines of the daily news, the item is usually filled with images of stranded passengers, overcrowded trains and blank information screens. These situations are typically caused by very large disruptions, such as extreme weather conditions or power shutdowns. Due to the complexity of railway operations, dispatchers have trouble reacting to these events, allowing the disruption to spread through the system. For this reason, research in disruption management, aiming at providing dispatchers with computerized support for generating modified timetables, rolling stock and crew schedules after disruptions, has recently received increased attention.
However, currently existing methods often require assumptions that severely limit their applicability to very large disruptions, when effective rescheduling is needed the most. In particular, the current state-of-the-art in railway disruption management is only able to deal with isolated, well-defined disruptions. It is usually assumed that there is only one single disruption such as a partial or complete track blockage, that the duration is known, that all information about the resources is correct, and that all stakeholders in the operations act as expected; Cacchiani et al (2014) present a broad range of examples. In practice, these assumptions are not always met. Supposedly real-time management information systems for the timetable, rolling stock and crew may lag behind, especially when disruptions cause many deviations from the regular schedules. Next to that, train drivers and conductors may not be aware of, or may even ignore, rescheduling decisions made by dispatchers. Furthermore, the duration of a disruption often depends on the time needed for repairing malfunctioning or broken infrastructure, which can take longer or shorter than expected.
In this research, we aim to reduce the gap between theory and practice by analyzing situations where the shortcomings of current techniques are most prominent, so-called out-of-control situations. With this term, we refer to situations where the disruption causes dispatchers to no longer have an overview of the system, limiting their ability to make viable rescheduling decisions. As a result, such a situation can eventually lead to the termination of all railway traffic in a large part of the railway network. Out-of-control situations can arise after extreme incidents (e.g. power shutdowns in a major or crucial part of the network) or combinations of large disruptions. In railway systems, these disruptions easily accumulate and spread over the network due to the high utilization of the infrastructure and strong links between resource schedules. In such situations, decision making becomes slower and less effective due to the uncertainty in the disruption duration and the availability of resources. On top of that, the decision making process may lack updated information or manpower to adapt adequately to the situation. The decisions can then turn out unworkable, leading to barely any train being able to run, even though all resources might be available.
In order to develop effective countermeasures that mitigate the impact of out-of-control situations, it is necessary to better understand how multiple (primary) disruptions cause large-scale problems. We focus here on delay propagation and amplification. The complex interaction between various elements of the railway system (infrastructure, timetable, rolling stock and crew schedule, dispatchers and information systems) ultimately leads to amplification of delay on a large scale. Some attempts have been made to capture these interactions (Monechi et al, 2017), but disruption phenomena on the macro-scale have proven hard to capture. We propose a data-driven approach to capture these interactions, exploiting similarities between the railway system and other multi-layered systems, e.g. electricity networks (Buldyrev et al, 2010) or climate and vegetation systems (Tirabassi et al, 2014; Yin et al, 2016). The generated insights can be used to develop new disruption management techniques aiming to reduce the impact of out-of-control situations and, if possible, avoid them.
The contribution of this paper is a multi-disciplinary framework for dealing with out-of-control situations, comprising two main parts. The first part involves the detection and prediction of large disruptions using tools from physics and complexity science (CS), with the aim of providing dispatchers sufficient time for responding to the situation. This allows us to study the evolution towards out-of-control situations and, ultimately, to predict them. The second part involves a number of countermeasures that can be applied in (near) out-of-control situations, based on techniques from operations research (OR). The core idea is to completely decouple the operations in the disrupted region from the rest of the railway network. Next to that, we propose the use of self-organizing, local scheduling principles for rolling stock and crew, which are robust against the features of out-of-control situations and also relieve pressure on dispatchers.
The remainder of this paper is structured as follows. In Section 2, we give a detailed description of out-of-control situations, how they arise and what is currently done to prevent them. In Section 3, we discuss the current state of the art of railway disruption management. In Section 4, we describe the framework for dealing with out-of-control situations. We conclude the paper in Section 5.

Out-of-control situations
Extreme events can heavily disrupt the schedule of a train operating company. When this happens, dispatchers are confronted with a very complex problem, as the number of affected resources is large and typically, together with the duration of the disruption, uncertain. This can cause gaps in the information flow, such that the decisions of dispatchers may be based on outdated information, making matters worse. In these situations, the railway system can get into a state of out-of-control, which we qualitatively define as a situation 'where dispatchers cease to have an overview of the system and consequently decide to terminate all railway traffic in the affected region, even though the required resources (infrastructure, rolling stock and crew) might be available.' Out-of-control situations usually occur after the amplification of multiple initial disruptions. However, this process cannot easily be predicted, because the consequences of a disruption vary a lot. Often the problem is confined to one particular train, track or train line. In other situations, the disruption may propagate and be amplified through time and space. An example is the case where the delayed train carries crew members that need to be transported towards other trains, which then in turn will also be delayed. These kinds of amplification effects may lead to large-scale disruptions and eventually trigger an out-of-control situation.
One of the most extensive analyses of out-of-control situations can be found in a report of the Dutch Ministry of Infrastructure after a harsh winter (Nederlandse Spoorwegen, ProRail, Ministerie van Infrastructuur en Milieu, 2012), with multiple out-of-control situations occurring in the Dutch railway system. As we will provide insights into out-of-control situations using the findings of this report, and will also give three examples of these situations in the Netherlands, we now briefly discuss the organization of the Dutch railway system. The Dutch railway system consists of about 7,000 kilometers of tracks. The maintenance and management of the infrastructure is the responsibility of ProRail. Next to that, ProRail is responsible for the timetable during the real time operations. Netherlands Railways (NS) is by far the largest operator of passenger trains, handling over one million passenger trips each day. In the real time operations, NS handles the rescheduling of rolling stock and crew and is responsible for providing the correct information to the passengers. Because of the temporal density of the Dutch railway schedule, disruptions can easily spread. The decision making takes place at nineteen different locations: five regional centers of NS, thirteen traffic control centers of ProRail and one national control center.
Nederlandse Spoorwegen, ProRail, Ministerie van Infrastructuur en Milieu (2012) find three main causes of out-of-control situations in the Dutch railway system:
- The local nature of decision making. Because dispatchers have a locally restricted area of authority, the global picture is not always available. For example, to reduce workload, dispatchers might directly coordinate a route for a train through their area without registering this train in the system; this leads to so-called 'ghost trains'.
- The fragmentary decision making process. In the Dutch railway system, the decision making is not only fragmented in terms of (spatial) area, but also spread across different organizations and coordination levels.
- The loss of routine through the usage of all kinds of additional measures on such days. In anticipation of extreme weather, timetables are often adapted prior to these events. However, it is argued that this might have a negative impact in these situations, because dispatchers normally rely strongly on their routine and experience with the timetable.
It must be noted that these reported causes of out-of-control situations are not only found in the Dutch railway system, but are actually features of many railway systems around the world. For example, Schipper and Gerrits (2018), who compared the practices of disruption management across several European railway systems, find that the Belgian and Austrian railways have a similar level, and the German railways a higher level, of decentralization compared to the Dutch railway system. Acting on the report of the Dutch ministry, many changes have been made in the Dutch railway operations to reduce the chance of these events emerging. The rescheduling procedures have been reshaped in order to accelerate the decision making process. NS also refined the reduced timetable that is used on days where extreme weather is expected. While this certainly improves the controllability of the system, the downside of the reduced timetable is that about 20% of all trains are canceled (even 50% in the Randstad, the densely populated area in the west of the Netherlands), strongly reducing the transport capacity (Trap et al, 2017). Furthermore, as the decision to operate the reduced timetable is based on weather forecasts, in some cases it turns out that the measure was not necessary after all. Finally, as illustrated in the next section, not all out-of-control situations are caused by extreme weather conditions, again highlighting the inadequacy of the current approach.

Case studies
To illustrate how the railway network gets into a state of out-of-control, we next present three case studies of such situations in the Dutch railway network.

3 February 2012 - Winter weather
Extreme weather is a major factor in the triggering of out-of-control situations, since it often causes multiple large disruptions around the same time.
It is estimated that out-of-control situations with causes related to extreme weather happened about ten times during the period 2009-2012. The case of 3 February 2012 has been analyzed in a report to the Dutch Ministry of Infrastructure and Environment (Nederlandse Spoorwegen, ProRail, Ministerie van Infrastructuur en Milieu, 2012).
We start with some numbers from the mentioned report. On this day, there were 305 infrastructure disruptions (about two to three times more than usual), of which 20 were switch disruptions that lasted more than half an hour. Furthermore, there were 250 problems with rolling stock, including six broken trains (daily average between one and two trains). Also, an adapted timetable was used. The number of trains delayed because of missing personnel was 89, two times higher than usual. Throughout the day, there was an increasing number of schedule alterations performed by dispatchers. Another feature that is typical for out-of-control situations is that the information flow contained gaps, especially for passengers.
The evolution of the delay on the day is visualized in Fig. 1. Initially, the disrupted area was confined around Amsterdam, but later spread towards Rotterdam and Roosendaal. At the beginning of the evening, the delay even reached the far east of the Netherlands (Enschede). Interestingly, the area between Utrecht and Den Bosch remained rather unaffected.

17 January 2017 - Electric outage
Besides extreme weather conditions, there are also other causes of out-of-control situations. An example is 17 January 2017, when a power outage happened in large parts of Amsterdam. The power was restored at 7:15. As expected, this disruption had a significant impact on the railway traffic around Amsterdam during the morning. Incorrect data in the information systems of ProRail and NS hindered all traffic to and from Amsterdam until after 10:00. Furthermore, when the systems were up and running again, dispatchers were faced with a very large workload since the resource schedules were heavily disrupted. As a result, trains were running irregularly for the majority of the day. It eventually took until 21:00 before the regular service was restored (see Appendix for a visualization of the delay evolution).

18 January 2018 - Storm
Unlike (general) winter conditions, storms have a more direct impact on the infrastructure, for example in the form of fallen trees. Early in the morning, there was a collision with a person at Heerenveen, which resulted in some problems in the morning, visible as a high-delay signature around Zwolle (Zl) (see Appendix for a visualization of the delay evolution). Soon after this, the storm kicked in and, because of fallen trees and damaged overhead lines, the fire department ordered the closing of several stations. Subsequently, the decision was made to cancel all train activity up to 14:00. This was extended to 16:00, and ultimately no trains were running until 17:00. Around 17:00, the storm had subsided and dispatchers tried to restart operations. However, the limited overview of the whereabouts of rolling stock and crew strongly limited the possibilities of dispatchers. For this reason, it was decided to broadcast a negative travel advice for the rest of the day, even though the storm had already passed.

Comparison
The three cases reflect different evolutions of disrupted situations. During the first (3 February 2012), many trains were still running and the delay had a lot of time to spread across the country. The second (17 January 2017) and third (18 January 2018) are cases where a large part of the system came to a standstill. To put the three case studies in perspective, we compare the total (summed) delay in Fig. 2. It is visible that 17 January 2017 (red) on average returns to a normal state in the evening, while 3 February 2012 (black) kept its disrupted state up to the end of the day. Furthermore, the gradual increase on 18 January 2018 (orange) points to the standstill of some trains and the cancellation of many others (because the curve would be much more irregular otherwise). The positions of the total delay maxima throughout the day also vary across the three dates.
Summarizing, we can say that in these out-of-control situations, the problems differ greatly in shape, magnitude and time of the day. The spread of the delay depended on whether parts of the network were shut down. Comparing these events to regular days, one finds that the accumulated delay on disrupted days may (on average) return to regular values, but not always.

Literature Review on Disruption Management
When a disruption occurs, the timetable, rolling stock circulation and crew schedule need to be adjusted to obtain a new feasible schedule. Since solving this problem in an integrated manner leads to unacceptably long computation times, the problem is, both in theory and in practice, decomposed and solved sequentially. First, the timetable is adjusted. The modified timetable then serves as input for the rolling stock rescheduling problem. Finally, both the adjusted timetable and rolling stock schedule are input for the crew rescheduling problem. It must be noted that such a sequential approach can lead to the situation where no feasible solution exists for one of the later stages due to a decision made in an earlier stage. Hence, it is sometimes necessary to re-solve the timetabling or rolling stock rescheduling problem until an overall feasible solution is found (Dollevoet et al, 2017). Recent surveys of proposed methods and algorithms for the different steps are presented in Cacchiani et al (2014) and in Ghaemi et al (2017b).
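The sequential decomposition with re-solving can be sketched as a simple loop. This is only an illustrative skeleton under our own assumptions, not the procedure of Dollevoet et al (2017): the three solver callables, the convention that a stage returns None when it finds no feasible solution, and the attempt counter used to request an alternative timetable are all hypothetical.

```python
from typing import Callable, Optional, Tuple


def sequential_rescheduling(
    solve_timetable: Callable[[int], object],
    solve_rolling_stock: Callable[[object], Optional[object]],
    solve_crew: Callable[[object, object], Optional[object]],
    max_attempts: int = 5,
) -> Optional[Tuple[object, object, object]]:
    """Solve the three stages in sequence, retrying from the first stage on failure.

    Assumption: solve_timetable(attempt) returns a (possibly different)
    timetable for each attempt index; the later stages return None when
    no feasible schedule exists for their input.
    """
    for attempt in range(max_attempts):
        timetable = solve_timetable(attempt)
        rolling_stock = solve_rolling_stock(timetable)
        if rolling_stock is None:
            continue  # earlier decision blocks this stage: re-solve the timetable
        crew = solve_crew(timetable, rolling_stock)
        if crew is None:
            continue  # no feasible crew schedule: start over with a new timetable
        return timetable, rolling_stock, crew
    return None  # no overall feasible solution within the attempt limit


if __name__ == "__main__":
    # Hypothetical stub solvers, used only to exercise the loop.
    result = sequential_rescheduling(
        lambda k: k,
        lambda t: "rolling stock" if t >= 1 else None,
        lambda t, rs: "crew" if t >= 2 else None,
    )
    print(result)
```

In this sketch a failure in a later stage always restarts from the timetable; a real implementation could also re-solve only the rolling stock stage.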

Timetable rescheduling
Timetable rescheduling deals with finding a new feasible timetable by canceling, retiming, rerouting or reordering train services. Of the three rescheduling phases, timetable rescheduling has received the most attention in the literature. Approaches differ in the type of incident that has occurred (either a small disturbance in the timetable or a more serious disruption such as a track blockage), in the level of detail at which the railway infrastructure is considered (either macroscopic or microscopic) and in the extent to which the inconvenience of passengers is taken into account. Objectives are usually to stay close to the regular timetable and minimize the total or maximum delay.
Many microscopic approaches formulate timetable rescheduling problems as job scheduling problems, in which a number of operations (the passing of trains) with certain operation times (running times) have to be scheduled on machines (block sections), see e.g. D'Ariano et al (2007). In case of small delays, such models can be solved within a reasonable amount of time. Macroscopic approaches use a higher level representation of the railway network, which has the advantage that additional aspects can be incorporated. For example, Schöbel (2007) introduces the problem of delay management, where one decides whether trains depart on time or should wait for delayed feeder trains. The objective in delay management is usually to minimize the total delay of all passengers combined. More recently, this problem has been extended with the routing of passengers (Dollevoet et al, 2012) and the capacities of stations (Dollevoet et al, 2014).
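The wait-or-depart trade-off in delay management can be illustrated with a toy calculation for one single connection. This is not the model of Schöbel (2007); the function name, the parameters and all numbers are hypothetical, and the sketch assumes that passengers who miss the connection simply take the next train one headway later.

```python
def total_passenger_delay(wait: bool, feeder_delay: int, slack: int,
                          n_transfer: int, n_onboard: int, headway: int) -> int:
    """Total passenger delay in minutes for one connection (toy model).

    feeder_delay: arrival delay of the feeder train
    slack:        buffer in the planned transfer time
    n_transfer:   passengers transferring from the feeder train
    n_onboard:    passengers already on the connecting train
    headway:      time until the next connecting train
    """
    # Minutes the connecting train would have to be held to keep the connection.
    hold = max(0, feeder_delay - slack)
    if hold == 0:
        return 0  # the slack absorbs the delay, no decision is needed
    if wait:
        # Everyone on the connecting train, including transferring
        # passengers, is delayed by the holding time.
        return hold * (n_onboard + n_transfer)
    # Depart on time: transferring passengers miss the connection
    # and are delayed by one headway.
    return headway * n_transfer


if __name__ == "__main__":
    for wait in (True, False):
        print(wait, total_passenger_delay(wait, 8, 3, 40, 150, 30))
```

With these hypothetical numbers (8 minutes feeder delay, 3 minutes slack, 40 transferring and 150 onboard passengers, 30-minute headway), waiting costs 950 passenger-minutes and departing on time 1200, so waiting is preferable; with many more passengers on board the balance tips the other way.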
Only a few contributions consider timetable rescheduling after larger disruptions. Louwerse and Huisman (2014) introduce the problem of finding a new timetable in case of partial or complete blockades. Additional constraints are added to increase the probability that a feasible rolling stock schedule exists for the modified timetable. Veelenturf et al (2015) extend this model by considering a larger part of the network, allowing rerouting of trains and incorporating the transition from the regular timetable to the modified timetable and back. Ghaemi et al (2017a) propose a different mixed integer programming formulation for the same problem, incorporating railway infrastructure on a microscopic level. In a follow-up paper, Ghaemi et al (2018) study the impact of uncertain disruption duration estimations on the rescheduling strategy and passenger delays by combining the rescheduling model with a passenger assignment model and a probabilistic disruption time prediction model.

Rolling stock rescheduling
The rescheduling of rolling stock calls for adapting the rolling stock circulation to the modified timetable by changing the compositions of certain trains. Sometimes, this implies that shunting movements are canceled or that new shunting movements are introduced. In case no train units are available, train services must be canceled. Hence, the goal is usually to minimize a combination of the number of canceled trains, the number of changed shunting movements and the difference with the planned end-of-day inventory at the stations.
Nielsen et al (2012) present a rolling horizon approach for rescheduling rolling stock. In this approach, the rolling stock is rescheduled periodically, as information about the disruption is updated. The model used is based on a mixed integer programming formulation of the rolling stock scheduling problem proposed in Fioole et al (2006). Kroon et al (2014) use the same model but also take passenger flows into account when rescheduling the rolling stock. Since disruptions can cause passengers to take different paths, their model tries to facilitate this change in demand by adapting the rolling stock schedule. To solve the problem, the authors iteratively compute a rolling stock schedule and simulate the corresponding passenger flows, until a satisfactory overall solution is found. In Veelenturf et al (2017) this model is extended by also allowing small timetable adjustments, namely introducing stops of trains at stations where they would normally not call. Haahr et al (2016) compare the composition model used by Nielsen et al (2012) and Kroon et al (2014) with a path based model and conclude that both models are fast enough to be used in rescheduling contexts.

Crew rescheduling
When the timetable and rolling stock schedule are updated, it is known which tasks need to be executed by the train drivers and conductors. Crew rescheduling involves assigning these tasks to the crew members. Often, many changes are necessary to the crew schedules as disruptions cause many duties to become infeasible. For example, a train driver on a delayed train might arrive too late for his next task, such that this task must be performed by a different train driver. Many (labor) restrictions need to be respected when reassigning tasks, the most important one being that a crew duty should always end at the planned crew base. If a task cannot be assigned to any crew member, it must be canceled. This is especially undesired for driving tasks, as this requires the rolling stock schedule to be updated once more. Therefore, the objective in crew rescheduling is usually minimizing the number of canceled tasks and changes to duties. Huisman (2007) addresses crew rescheduling in the context of scheduled maintenance operations. As the number of possible duties is very large, the problem is solved using a combination of column generation and Lagrangian relaxation. Potthoff et al (2010) consider the crew rescheduling problem when a disruption has occurred that causes a blockage of a route. To keep the problem size tractable, first a core problem with a limited number of tasks is solved. In case the solution contains canceled tasks, tasks that are in some sense close to canceled tasks are added to the core problem. This process is repeated until all tasks are covered or a time limit is exceeded. Veelenturf et al (2012) extend the crew rescheduling problem by also allowing retiming of trips. This increases scheduling flexibility, such that more tasks can be covered. In Veelenturf et al (2014), uncertainty with respect to the length of the disruption is taken into account by requiring that duties have feasible completions in a number of different scenarios. 
A completely different approach to crew rescheduling is taken by Abbink et al (2010). In this paper, train drivers are represented by driver-agents. In case the duties of some drivers have become infeasible, the driver-agents try to solve this by swapping tasks between drivers.

Takeaways
As is clear, there is a vast amount of literature on disruption management for railway systems. However, only a few contributions (Ghaemi et al, 2018; Nielsen et al, 2012; Veelenturf et al, 2014) take the uncertainty that comes with major disruptions into account, at least to some extent. Furthermore, the largest disruptions that are considered in the literature are complete blockages of one route for a number of hours. For larger (combinations of) disruptions, the performance of current models is unknown. On top of that, the effectiveness of the proposed methods is completely dependent on the data accuracy in information systems and the willingness of stakeholders to cooperate, two assumptions that are often violated in case of larger disruptions. These observations lead us to the conclusion that the current state-of-the-art of railway disruption management is unable to cope with out-of-control situations.

Framework for dealing with out-of-control situations
As we have seen in the previous section, existing disruption management techniques are ineffective when it comes to preventing or reducing the impact of out-of-control situations. Therefore, in this section we propose a new framework for dealing with out-of-control situations. The framework is visualized in Fig. 3. It contains six steps, which can be divided into two parts. In the first part, tools from CS are used to generate early warning signals in case an out-of-control situation is likely to occur, and to determine which part of the network is most affected. In the second part, techniques from OR are used to find appropriate rescheduling measures, with the aim to prevent the out-of-control situation and maintain a high quality service. Of the six steps that the framework contains, only Step 3 can be solved using existing methods. For all other steps, new methods need to be developed.
The key concept of the framework is the disrupted region. In Step 2, this region is identified and completely decoupled from the rest of the network, i.e. no trains or crews are allowed to move from the disrupted region to the non-disrupted region or vice versa. As a consequence, passengers who need to travel from within the disrupted region to the rest of the network or the other way around can do so by transferring at one of the boundary stations. This drastic measure is taken in order to isolate the disruption and to prevent it from propagating further through the network. Furthermore, by decoupling the appropriate disrupted region, it can be assumed that outside the disrupted region complete information is available, such that we can use tailored disruption management strategies for both parts.

[Fig. 3: The six steps of the framework. Complexity Science: Step 1: Anticipate amplification using early warning metrics; Step 2: Identify and isolate the disrupted region. Operations Research (effective measures): Step 3: Reschedule the non-disrupted region; Step 4: Modify line system inside the disrupted region; Step 5: Schedule resources inside the disrupted region; Step 6: Manage the passenger flows.]
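To make the notion of boundary stations concrete, the following minimal sketch treats the network as an undirected graph of stations. It rests on our own assumptions: the disrupted region is given as a set of stations, and a boundary station is taken to be a station inside that region with at least one direct track connection to a station outside it. The station labels are hypothetical.

```python
from typing import Iterable, Set, Tuple


def boundary_stations(edges: Iterable[Tuple[str, str]], disrupted: Set[str]) -> Set[str]:
    """Stations of the disrupted region that border the non-disrupted region.

    edges:     undirected track connections between stations
    disrupted: stations assigned to the disrupted region
    """
    boundary = set()
    for a, b in edges:
        # A connection crosses the region border if exactly one end is disrupted.
        if (a in disrupted) != (b in disrupted):
            boundary.add(a if a in disrupted else b)
    return boundary


if __name__ == "__main__":
    # Hypothetical line network A - B - C - D - E with a branch C - F.
    tracks = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("C", "F")]
    print(boundary_stations(tracks, {"C", "D", "F"}))
```

In this toy network, stations C and D are the points where passengers would transfer between the two decoupled parts, while F lies strictly inside the disrupted region.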
It must be noted that a possible seventh step of the framework would be to recouple the two parts and transition back to the regular timetable once the disruption is over. However, such an operation is highly complex and could easily lead to repeated loss of control. Hence, the safest option is to maintain the two parts separate for the rest of the day. During the night, sufficient time is available to set up the resources again in order to start the regular timetable the next day.
In the remainder of this section, we will consider every step in more detail and indicate how techniques from CS and OR can be used to support the decisions that are required to be made in every step.
Step 1: Anticipate amplification using early warning metrics
In order to prevent out-of-control situations from happening, it is essential to provide dispatchers with early warning signals for these situations, giving them sufficient time to respond and take the necessary measures. In the Complexity Science literature, early warning signals are derived in different manners. The most common approach is to look at statistical metrics like increased autocorrelation and variance (Scheffer et al, 2009; Thompson and Sieber, 2011). These are quite established in physical systems, but cannot directly be applied to the railway system due to its high degree of heterogeneity and discontinuity of processes. Therefore, we suggest the creation of a statistical model.
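For reference, the generic indicators mentioned above (variance and lag-1 autocorrelation in a sliding window) can be computed as in the following sketch. It only illustrates the classical metrics on a hypothetical series of total delay per time step; as argued above, they are not expected to carry over directly to railway data, and the window length is an arbitrary assumption.

```python
from statistics import mean, pvariance
from typing import List, Sequence, Tuple


def lag1_autocorrelation(xs: Sequence[float]) -> float:
    """Lag-1 autocorrelation of a sequence; 0.0 for a constant sequence."""
    m = mean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    num = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(len(xs) - 1))
    return num / denom


def rolling_indicators(series: Sequence[float], window: int) -> List[Tuple[float, float]]:
    """(variance, lag-1 autocorrelation) for every sliding window of the series."""
    out = []
    for start in range(len(series) - window + 1):
        w = series[start:start + window]
        out.append((pvariance(w), lag1_autocorrelation(w)))
    return out


if __name__ == "__main__":
    # Hypothetical total delay (minutes) per time step.
    delay = [0, 1, 0, 1, 5, 9]
    for variance, autocorr in rolling_indicators(delay, window=4):
        print(variance, autocorr)
```

A sustained rise of both indicators across consecutive windows is what the cited literature interprets as an early warning signal.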
The statistical model would contain all dynamic interactions of all delayed trains across an area. Finding these dynamic interactions is difficult for a number of reasons. First and foremost, the railway system is highly heterogeneous, meaning that the interactions between trains differ from situation to situation: lines, train type, train direction, infrastructure capacity consumption, number of crew members and exogenous factors such as accidents or technical problems all distinguish one situation from another. Second, the behavior of the people involved (drivers, dispatchers, passengers, emergency services) is not necessarily systematic. And third, the system comprises multiple network layers (infrastructure, rolling stock, train crew and an information/decision network) instead of one. Large-scale disruptions may be amplified more strongly in these kinds of systems, as seen in the example of a major disruption in the Italian intertwined electricity-internet network (Buldyrev et al, 2010).
There are attempts in the literature to capture these systematic dynamics. Monechi et al (2017) analyzed railway logistics from Germany and Italy, and found a number of dynamic interactions, one of which is backward propagating delay. Kecman and Goverde (2015) used Dutch railway data and focused on quantifying parameters of running and dwell times, which are important (fluctuation-driven) uncertainties in microscopic models. Goverde (2010) took an analytical approach to describing the system, using the timetable and a parametrization of quantities like dwell times to build a forward integration model. Furthermore, Ball et al (2016) showed the equilibrium diagram of a simple model connecting the rolling stock layer with a crew layer, illustrating the effect of interdependent networks. These papers illustrate different approaches to defining structural railway dynamics, but there is no overall consensus on a macroscopic approach (Monechi et al, 2017), making it hard to make accurate predictions for large disruptions. Looking to applications in other fields, CS provides many examples of systems in which the specific dynamics are not fully known or where the interactions are highly heterogeneous. For example, Sebille et al (2012) used a transfer matrix method to predict the movement of plastics in the ocean. Another example is the interaction between forest and savanna systems, where Hirota et al (2011) showed various types of macroscopic pattern formation.
In applying these existing methods to the railway system, we distinguish two levels of statistical models. First, it should be emphasized that we treat delay as the state variable: the propagation and amplification of delay can be seen as a proxy for the magnitude of problems in the railway system. A first-order model would contain mainly advection and diffusion of delay, which can be derived from (lag-corrected) correlations or using more advanced methods like singular spectrum analysis. These processes give a first hint of how the effects of one specific disruption spread through the network. A second-order model would also contain dynamic interactions: in the case of multiple disruptions, interactions may lead to amplification effects. It is necessary to take these effects into account, because they are important in the growth of out-of-control situations. This second-order model can be derived by analyzing the macro-evolution of delay, which is for example captured in a so-called transfer matrix (as in Sebille et al (2012)) that is calculated directly from data. Using the methods described above allows for the prediction of the evolution of delay in time, which (depending on the robustness) allows the application of early warning metrics.
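To make the transfer matrix idea concrete, the following minimal sketch estimates a row-stochastic matrix from observed sequences of delay classes by counting transitions, and then propagates a distribution forward in time. It is an illustration under stated assumptions, not the model used in this paper: the discretization into three delay classes, the function names `estimate_transfer_matrix` and `propagate`, and the example data are all hypothetical.

```python
def estimate_transfer_matrix(trajectories, n_states):
    """Estimate transition probabilities by counting observed transitions.

    trajectories: list of sequences of state indices, one entry per time step.
    Assumption: states are hypothetical delay classes, e.g. 0 = on time,
    1 = moderate delay, 2 = severe delay.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for trajectory in trajectories:
        for current, following in zip(trajectory, trajectory[1:]):
            counts[current][following] += 1

    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Assumption: a state never observed as a starting point persists.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
        else:
            matrix.append([count / total for count in row])
    return matrix


def propagate(distribution, matrix, steps=1):
    """Push a probability distribution over states forward by `steps` steps."""
    n = len(matrix)
    for _ in range(steps):
        distribution = [
            sum(distribution[i] * matrix[i][j] for i in range(n))
            for j in range(n)
        ]
    return distribution


# Hypothetical observations of the delay class of one area over time.
observed = [[0, 1, 2], [0, 1, 1], [0, 0, 1]]
transfer = estimate_transfer_matrix(observed, n_states=3)
forecast = propagate([1.0, 0.0, 0.0], transfer, steps=2)
```

An early warning metric could then, for instance, monitor the forecast probability mass in the most severe class; which threshold to use is left open here.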
Step 2 Identifying and isolating the disrupted region The region that is decoupled from the rest of the network is referred to as the disrupted region. The boundary of this region is not trivially given by one single metric (e.g. accumulated delay), because multiple logistic factors are important to consider when decoupling any region from the rest of the system.
First and foremost, one needs to consider whether it is necessary to decouple a region at all. If early warning indicators anticipate a large disrupted system, there are many alternative countermeasures to consider and the system might also remain controllable (although disrupted). Second, in some situations (e.g. when a station is completely disrupted), several stations or tracks may be forced to be at the boundary of the disrupted region. Third, one needs to identify tracks that have a large impact on the propagation of the delay throughout the country. By removing all dependencies along these diffusion regions, the spread of delay will be strongly reduced. These tracks can be identified using the statistical models used to create the early warning signals. Fourth, the amount of rolling stock both within and outside the disrupted region needs to be considered. Locking a large disrupted region when there are very few trains in the area reduces the efficiency of the logistics. Fifth and finally, the size of the control area should not be too large, as the service level within the region is likely to be lower than in the rest of the network, since self-organizing strategies will be used to schedule the resources within the disrupted region. But it should not be too small either, because the robustness of the self-organization may drop if there is no room for adaptation.
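The third consideration, identifying tracks with a large impact on delay propagation, can be illustrated with a toy calculation. The sketch below assumes a purely linear propagation model in which each track passes a fixed fraction of the delay at its origin to its destination per time step; it then ranks tracks by how much the total delay drops when the track is removed. The model, the weights and the names `total_delay` and `rank_tracks` are hypothetical simplifications, not the statistical model described in this paper.

```python
def total_delay(weights, initial, steps):
    """Sum of delay over all stations and time steps.

    weights: {(origin, destination): fraction of delay passed on per step}
    initial: {station: delay at time 0}
    Assumption: delay moves along tracks and does not persist at a station.
    """
    current = dict(initial)
    total = sum(current.values())
    for _ in range(steps):
        following = {}
        for (origin, destination), weight in weights.items():
            if origin in current:
                following[destination] = (
                    following.get(destination, 0.0) + weight * current[origin]
                )
        current = following
        total += sum(current.values())
    return total


def rank_tracks(weights, initial, steps):
    """Rank tracks by the reduction in total delay obtained by removing them."""
    baseline = total_delay(weights, initial, steps)
    impact = {}
    for track in weights:
        without = {t: w for t, w in weights.items() if t != track}
        impact[track] = baseline - total_delay(without, initial, steps)
    return sorted(impact.items(), key=lambda item: item[1], reverse=True)


# Hypothetical four-station network with an initial delay at station A.
example_weights = {("A", "B"): 0.5, ("B", "C"): 0.5, ("B", "D"): 0.5}
ranking = rank_tracks(example_weights, {"A": 10.0}, steps=2)
```

In this example the track from A to B ranks first, because cutting it stops all onward propagation. A real boundary decision would also have to weigh the other four considerations listed above.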
Step 3 Rescheduling the non-disrupted region Outside the disrupted region complete information is available, so conventional disruption management techniques can be applied to reschedule the railway traffic in this part of the network. The rescheduling of the crew is the most complicated part, as crew duties must end at their fixed base and it is likely that some crew members outside the disrupted region have their base inside the disrupted region (and vice versa). This problem can be addressed, for example, by requiring that the duties of such crew members end near the boundary between the two regions, taking into account the expected time it takes for them to travel back to their base. An additional challenge is that, from the perspective of the non-disrupted region, the separation of the two regions can be considered as a combination of track blockages, a type of disruption that has not yet been considered in the existing literature. Since computation times are likely to increase with the size of the disruption, dedicated (possibly heuristic) algorithms need to be developed in order to find good solutions in a reasonable amount of time.
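The rule of ending duties near the boundary can be sketched as a small calculation. The example below picks, for a crew member whose base lies in the other region, the boundary station with the shortest expected travel time to the base, and derives the latest time at which the crew member can be released there. The function name `plan_duty_end`, the station labels and the travel times are hypothetical; a real crew rescheduling model would embed such a constraint in an optimization problem rather than evaluate it per person.

```python
def plan_duty_end(base, boundary_stations, expected_travel, duty_end_limit):
    """Pick the boundary station closest (in expected travel time) to the base.

    expected_travel: {(station, base): expected minutes to travel back}
    duty_end_limit: latest allowed arrival at the base, in minutes after midnight.
    Returns (station, latest_release): the crew member must be released at
    `station` no later than `latest_release` to reach the base on time.
    """
    station = min(boundary_stations, key=lambda s: expected_travel[(s, base)])
    latest_release = duty_end_limit - expected_travel[(station, base)]
    return station, latest_release


# Hypothetical data: two boundary stations and one crew base.
travel = {("B1", "Base"): 30, ("B2", "Base"): 50}
station, latest_release = plan_duty_end("Base", ["B1", "B2"], travel, 600)
```

With these numbers the duty would end at B1, with release no later than minute 570.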
Step 4 Determining a modified line system for the disrupted region When the disrupted region is decoupled from the rest of the network, it is unlikely that the original line system, specifying which lines are operated at which frequencies, can be maintained. There are two main reasons for this. Firstly, as the platforms at the boundary stations are divided between the disrupted and the non-disrupted region, and turning a train takes more time than simply continuing in the same direction, the railway infrastructure is unlikely to allow for the same number of trains as in the regular line system. Secondly, as there is only a limited amount of rolling stock available within the disrupted region at the time of decoupling, and trains are not allowed to transfer between the regions, it is possible that there is insufficient rolling stock available to operate the regular line plan. As such, it will generally be necessary to modify the line system for the disrupted region. Evidently, a model for modifying the line plan should take both the infrastructure and the available rolling stock into account, effectively moving line planning from the strategic to the operational setting. As few existing line planning models take the available infrastructure and rolling stock into account (see Schöbel (2012)), this problem asks for novel mathematical models that (partially) integrate timetabling and rolling stock scheduling into the line planning problem.
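The rolling stock argument can be made tangible with a back-of-the-envelope calculation: a line operated at f trains per hour with a round trip time of T minutes (including turning) needs roughly ceil(f * T / 60) units. The sketch below uses this relation in a simple greedy heuristic that lowers frequencies until the available units suffice. It is only an illustration of the trade-off; it ignores infrastructure capacity and passenger demand, and the names `units_needed` and `reduce_line_plan` as well as the example lines are hypothetical, not the novel models called for above.

```python
def units_needed(frequency_per_hour, round_trip_minutes):
    """Units required to run a line at the given frequency (integer ceiling)."""
    return (frequency_per_hour * round_trip_minutes + 59) // 60


def reduce_line_plan(lines, available_units, min_frequency=1):
    """Greedily lower line frequencies until the rolling stock suffices.

    lines: {line name: (regular frequency per hour, round trip minutes)}
    Returns {line name: frequency}, or None if even the minimum frequencies
    need more units than are available.
    """
    plan = {name: frequency for name, (frequency, _) in lines.items()}

    def total_units():
        return sum(units_needed(plan[name], lines[name][1]) for name in plan)

    while total_units() > available_units:
        candidates = [name for name in plan if plan[name] > min_frequency]
        if not candidates:
            return None
        # Assumption: thin out the line that currently ties up the most units.
        heaviest = max(
            candidates, key=lambda name: units_needed(plan[name], lines[name][1])
        )
        plan[heaviest] -= 1
    return plan


# Hypothetical regular line plan: 7 units needed, only 6 available.
regular = {"L1": (4, 60), "L2": (2, 90)}
modified = reduce_line_plan(regular, available_units=6)
```

Here the heuristic reduces L1 from four to three trains per hour and leaves L2 unchanged.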
Step 5 Scheduling rolling stock and crew in the disrupted region Since out-of-control situations are characterized by great uncertainty regarding the exact whereabouts of the rolling stock and crew, it is not possible to communicate detailed instructions to the crew. Instead, the idea is to provide a strategy on what task to do next. This way, we reduce the dependence on central traffic controllers and avoid having to wait for clearance from dispatchers that are faced with incomplete information.
Given that a workable line plan is generated in the previous step, it should be possible to find appropriate strategies that restore a stable service in the disrupted region as soon as possible. Simple principles could be used to determine when trains should depart after arriving at a station, and which rolling stock units are used to operate the different lines. For the scheduling of the crew, more intricate strategies are required, as some crew members eventually need to exit the disrupted region in order to end at their base, and the other way around. By employing agent-based modeling, the best performing strategies can be identified.
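As an illustration of such a simple principle, the sketch below simulates rolling stock units at a single hub that each decide locally which line to operate next, using the rule "serve the line whose last departure was longest ago". The setting (one hub, fixed round trip times, no crew, no infrastructure conflicts), the rule and all names are hypothetical; the sketch only shows how a candidate strategy could be plugged into a small agent-based simulation and evaluated.

```python
import heapq


def longest_waiting(time, last_departure):
    """Local rule: choose the line that was served longest ago.

    A line that has not been served yet counts as waiting longest.
    """
    return min(
        last_departure,
        key=lambda line: -1 if last_departure[line] is None else last_departure[line],
    )


def simulate(round_trip, n_units, horizon, choose_line):
    """Event-driven simulation of units repeatedly departing from one hub.

    round_trip: {line: minutes until a unit is back at the hub}
    Returns {line: list of departure times before `horizon`}.
    """
    last_departure = {line: None for line in round_trip}
    departures = {line: [] for line in round_trip}
    events = [(0, unit) for unit in range(n_units)]  # (time available, unit id)
    heapq.heapify(events)
    while events:
        time, unit = heapq.heappop(events)
        if time >= horizon:
            continue
        line = choose_line(time, last_departure)
        last_departure[line] = time
        departures[line].append(time)
        heapq.heappush(events, (time + round_trip[line], unit))
    return departures


def max_headway(times):
    """Largest gap between consecutive departures (0 if fewer than two)."""
    return max((b - a for a, b in zip(times, times[1:])), default=0)


# Hypothetical setting: two lines, two units, one hour of operation.
result = simulate({"X": 20, "Y": 30}, n_units=2, horizon=60,
                  choose_line=longest_waiting)
```

Alternative rules can be passed as `choose_line` and compared on measures such as `max_headway` per line, which mirrors the idea of identifying the best performing strategies by simulation.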
Step 6 Managing the passenger flows In the sixth and final step of the framework, the passenger flows are managed. Since the line plan in the disrupted region is adjusted, passengers also have to be routed differently through the network. Furthermore, since the disrupted region is not operated using a fixed timetable, it will be a challenge to provide the passengers with proper information on how to travel to their destination.
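One simple way to think about rerouting passengers on the adjusted line plan is as a shortest path problem on the modified network. The sketch below applies Dijkstra's algorithm to a graph whose edge weights are expected travel times in minutes; without a fixed timetable these would have to be estimates. The graph, the weights and the function name `fastest_route` are hypothetical, and the sketch ignores capacity and crowding.

```python
import heapq


def fastest_route(graph, origin, destination):
    """Dijkstra's algorithm on expected travel times.

    graph: {station: {neighbouring station: expected minutes}}
    Returns (total minutes, list of stations), or None if unreachable.
    """
    best = {origin: 0}
    previous = {}
    queue = [(0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # outdated queue entry
        for neighbour, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if destination not in best:
        return None
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return best[destination], path[::-1]


# Hypothetical network before and after a line between B and C is dropped.
regular_network = {"A": {"B": 10, "C": 25}, "B": {"C": 10}, "C": {}}
modified_network = {"A": {"B": 10, "C": 25}, "B": {}, "C": {}}
before = fastest_route(regular_network, "A", "C")
after = fastest_route(modified_network, "A", "C")
```

In this example the advised route from A to C changes from travelling via B (20 minutes) to the direct connection (25 minutes) once the line between B and C is no longer operated.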

Conclusion
Many methods have been proposed over the years for rescheduling railway systems after disruptions. However, in out-of-control situations, which are characterized by a very large number of affected resources and a high degree of uncertainty, these methods are less effective. In this paper, we proposed a new multidisciplinary framework for dealing with such situations, in order to close this gap between theory and practice. In the coming years, we plan to further develop the steps in this framework and to test its performance using simulation and serious gaming.
Appendix - Delay evolution of other case studies