1 Introduction

Critical infrastructure lifelines, such as electrical power, water and wastewater systems, and road networks, are essential for supporting the continued functioning of communities, and are closely linked to the stability of urban populations (Ouyang 2014; Bristow 2019). As such, damage to these lifelines due to a disaster or malicious attack can have a profound impact on the well-being and livelihood of residents in urban areas. Increasingly complex infrastructure networks produce interdependencies between systems that are challenging to assess (Loggins and Wallace 2015). To prevent the propagation of failures within and between systems in a disaster setting, these interdependencies must be identified and protected.

Generally, disaster risk management can be grouped into four phases: mitigation, preparedness, response, and recovery (Rubin et al. 1985; Berke et al. 1993). Of these four, recovery is often regarded as the most poorly understood and researched (Rodríguez et al. 2018). Recovery is closely linked with the resilience of infrastructure systems and should be considered concurrently with resilience, given that a key component of resilience is a timely return to predisaster (Haimes 2009) or even superior conditions. Research on recovery is therefore critically important to fill this gap. Understanding recovery through assessment of previous cases or through modeling can inform local communities of best practices and areas where improvements in planning and preparation at various scales may be required.

In response to the necessity of supporting their populations after disasters, many nations develop plans that identify, classify, and establish strategies for the protection and continued operation of their infrastructure systems. These plans, while necessary for promoting unified goals, standards, and requirements at a national scale, do not address specific issues that are experienced within individual communities. As such, they should encourage plans for protection that are made at regional and local scales (Rodríguez et al. 2018). The variation in population, needs, and concerns of dissimilar urban areas requires individual infrastructure protection assessment to provide value within the local context (Bristow 2019). Individuals can play a role in protecting their own homes and livelihoods, but must also trust that broader infrastructure systems will be in place within a reasonable amount of time to continue to support their recovery in the aftermath of a disaster (Onuma et al. 2017).

Understanding the processes that contribute to recovery of these broader systems can improve community preparedness, thereby increasing the resilience of populations and expediting their recovery after disruptive events. Many assessment methodologies provide useful results and findings, but are often limited in their scope or resolution. Therefore, the objective of this study is to demonstrate data-driven modeled estimates of multi-infrastructure restoration at the city scale that can be used to inform local communities at an individual neighborhood level about the outages and recovery processes that they may face after a disaster.

For this work, features from existing resilience and recovery tools are used to provide a novel assessment methodology that integrates probable damage, restoration priority, and dependencies within systems to illustrate the dynamics of urban recovery. In the sections that follow, hazard assessment tools are introduced, a study area, infrastructure systems, and hypothetical hazard are defined, and trial results are presented with key findings and suggestions for future research highlighted.

2 Background

Various data-driven methodologies exist for the purposes of hazard assessment and recovery modeling that serve a wide variety of needs. Researchers such as Miles (2018) model the progress of long-term recovery for individuals after a disaster by tracking milestones like housing inspections and insurance payouts. Others focus on the optimization of resource allocation to best serve the needs of populations after disaster. Hu et al. (2016) examine how limited government resources may best be distributed after disaster, while Lubashevskiy et al. (2014) model the transfer of resources from undamaged locations to areas impacted by disaster.

Other research focuses on the loss of functionality and resilience of specific systems after disruption, such as Duffey’s (2019) and Nateghi’s (2018) studies of restoration and resilience modeling for electrical power systems after failure due to a range of hazards. Further, He and Nwafor (2017) propose a method for optimizing the recovery of gas pipelines to minimize loss after disaster, and Khatavkar and Mays (2019) investigate the performance of water distribution systems subject to limited water or power availability after failure. In addition, Pagano et al. (2019) provide an analysis of other methods of assessing resilience for water distribution systems.

Numerous authors have focused on the effects of disasters and disruptions on transportation systems and road networks. Ganin et al. (2017) and Sullivan et al. (2010) propose models to assess the efficiency and route performance of transportation networks after various types of disaster. In the case of Muriel-Villegas et al. (2016), this work is expanded beyond the traditional assessment of urban systems to address the performance of interurban transportation routes to remote locations.

Many other studies and methods exist in the research areas mentioned, and each provides valuable insight into the loss and recovery of functionality for specific types of infrastructure systems after disruptions and offers essential information that can be used in broader disaster assessment. In the case of this work, an assessment of multi-infrastructure systems is necessary.

For such systems, various other methodologies have been proposed. Many contain models of the dependencies between sectors within larger infrastructure systems to assess how failures propagate through the system and components recover over time (Cavdaroglu et al. 2013; Setola et al. 2016; Stergiopoulos et al. 2016). Others propose metrics for assessing the resilience of systems and their recovery after disruptive events (Henry and Emmanuel Ramirez-Marquez 2012; Tran et al. 2017). With the exception of Ouyang and Wang’s (2015) study of recovery times of gas and power systems subject to hurricane hazards, these methods primarily construct a framework for including recovery times in their models rather than providing information about the source of recovery time inputs. Although these frameworks are valuable, including repair times in models provides a more comprehensive understanding of the effects of disaster and the resulting lack of functionality that a community may expect.

One tool commonly used in North America that provides disaster damage assessments and recovery time estimates is Hazus, a software package developed by the United States Federal Emergency Management Agency (FEMA). Hazus inputs include hazard categorization, geographic information, and infrastructure system data for a location of interest. This information is processed in Hazus and results are produced by applying data from empirical studies of previous hazards to the input information. Results include the probability of damage to different infrastructure components, as well as the likelihood of recovery for various systems after a specified amount of time (FEMA 2011). Dependencies within and between infrastructure systems are not explicitly modeled, so outputs are based on prior recovery trends from similar systems and expert judgement.

To address the recovery dynamics of multi-infrastructure systems by incorporating the interdependencies that connect them as well as recovery time and resource constraints, Bristow and Hay present GMOR, the Graph Model for Operational Resilience (Bristow and Hay 2017; Bristow 2019). GMOR models the components of infrastructure systems and the dependencies between them along with failure, repair time, and required repair resource information to track the recovery of systems over time. Further details on GMOR’s functionality can be found in Bristow and Hay (2017) and Bristow (2019).

Inputs to GMOR can come from a variety of data sources at different scales. For example, if data for one infrastructure system only exists at the neighborhood level, these data can still be integrated into a model where other system information is available at the street or building level. In addition, repair time and resource parameters can be adjusted in the model to accurately reflect conditions in the area being modeled. In this way, GMOR overcomes limitations of other modeling platforms by not constraining models to a single infrastructure system and by allowing for the dependencies between systems to be modeled as well. This work therefore fills a gap in previous research by providing a model of multiple infrastructure systems and assessing their failures as well as their recovery over time with specific time parameters and dependency relationships. In addition, our work provides an assessment of an area (detailed in Sect. 3.1) not previously studied in this way.

3 Materials and Method

For the study presented here, physical entities for water distribution, wastewater collection, electrical power, and road networks are represented within a GMOR model by unique identifying parameters. Beyond these systems, others such as buildings, maintenance facilities, and supply networks may be added to the model in the future. The GMOR parameters include details like the type of entity (function, resource, event, or system), spatial information (if present), and any other entities within the model on which a given component is dependent (Bristow and Hay 2017). The parameters are combined into a city-scale model and used to generate five hundred randomized trials in a Monte Carlo fashion that allows for estimates of the recovery timeline of the infrastructure.

Many of the parameters used in the GMOR model for this study are derived from Hazus information produced in a previous study conducted by The Geological Survey of Canada on the vulnerability of infrastructure in the selected case study area (Journeay et al. 2015). Hazus data utilized include the probability of occurrence for varying levels of damage as well as repair and recovery parameters for the different infrastructure systems. Further details of this integration are included in the following sections.

Previous work by Bristow (2019) using data from the government study was limited in its scope and discussion of results, and was primarily developed to introduce an updated approach to simulation in GMOR. The work presented here encompasses the whole District of North Vancouver and introduces an additional capability in GMOR for modeling mutually exclusive damage states that was not available in earlier trials.

The incorporation of dependencies is a key feature of GMOR that offers an improvement over modeling approaches that use individualized repair times for components without representing their reliance on other necessary systems and processes. GMOR only shows that an entity is functional once its upstream dependencies are also functional and a specified delay (equal to the time required to repair the entity) has passed.
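The functionality rule described above can be sketched briefly in Python. This is an illustration only and does not reproduce GMOR's implementation: the entity names, repair delays, and the functional_time helper are hypothetical, and the sketch assumes that all repairs start at day zero and proceed in parallel, so that an entity becomes functional at the later of its own repair completion and the time at which its last upstream dependency becomes functional.

```python
from functools import lru_cache

# Hypothetical entities: repair delay in days and upstream dependencies.
REPAIR_DAYS = {"power": 3.0, "pump_station": 2.0, "pipes": 5.0, "water_service": 0.0}
DEPENDS_ON = {
    "power": [],
    "pump_station": ["power"],
    "pipes": [],
    "water_service": ["pump_station", "pipes"],
}


@lru_cache(maxsize=None)
def functional_time(entity: str) -> float:
    """Day on which an entity becomes functional.

    Assumption: repairs run in parallel from day 0, so the entity is
    functional once its own repair is done AND every upstream dependency
    is functional.
    """
    upstream = [functional_time(dep) for dep in DEPENDS_ON[entity]]
    return max([REPAIR_DAYS[entity]] + upstream)


for name in DEPENDS_ON:
    print(name, functional_time(name))
```

In this sketch the pump station is repaired after two days but is not functional until power returns on day three, which is the distinction between repair and functionality that the dependency model is intended to capture.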

Functionality of all systems in a given neighborhood is represented by a single entity within the model that indicates that water, wastewater, power, and road networks within that neighborhood have all been restored, and that the neighborhood path dependencies to the water distribution and wastewater treatment facilities (see Sect. 3.2) are intact. The recovery time shown for this entity is simply equal to the highest calculated time for any one of the included systems. This entity allows users to quickly evaluate neighborhood recovery as a whole, while also having the opportunity to evaluate the recovery times of individual systems within each neighborhood.

For prioritizing the repairs done to the various infrastructure systems within the district, a random order is applied in the model. That is, there is no one neighborhood that is consistently prioritized for repairs before the others in the model.

Random ordering reflects the uncertainty involved in the location of damages that occur within a system and the possible location of repair resources at the time of a disaster. In addition, without input from the district regarding facilities or areas that should be prioritized, and given the possible conflicting priorities of different stakeholders, a random ordering is considered appropriate. Should further information or discussions offer insight into locations or regions that should be repaired before others, a specific ordering may be applied to the model.
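One way to picture the random ordering is to shuffle the list of neighborhoods independently in every trial and repair them one after another. The sketch below assumes a single crew working sequentially, which is a simplification made here for illustration and not a statement about how GMOR allocates resources; the neighborhood names and repair durations are hypothetical.

```python
import random

# Hypothetical repair durations (days) for one system in four neighborhoods.
REPAIR_DAYS = {"N1": 4.0, "N2": 10.0, "N3": 2.0, "N4": 6.0}


def completion_times(repair_days, rng):
    """Shuffle the repair order, then accumulate durations for one crew."""
    order = list(repair_days)
    rng.shuffle(order)  # no neighborhood is consistently repaired first
    clock, finished = 0.0, {}
    for name in order:
        clock += repair_days[name]
        finished[name] = clock
    return finished


rng = random.Random(42)  # fixed seed so the sketch is repeatable
trials = [completion_times(REPAIR_DAYS, rng) for _ in range(500)]

# Individual completion times depend on the order; the overall total does not.
mean_n3 = sum(t["N3"] for t in trials) / len(trials)
total = max(trials[0].values())
print(f"mean completion for N3: {mean_n3:.1f} days; all repaired after {total} days")
```

Under this simplification the completion time of any one neighborhood varies widely between trials, while the time at which all neighborhoods are repaired is the same in every trial.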

3.1 Study Area

The case study involves the District of North Vancouver (DNV), a municipality located in the southwest portion of British Columbia, Canada, across a marine inlet from the City of Vancouver. This largely suburban municipality is home to approximately 86,000 residents (Statistics Canada 2017), most of whom live in the southern portion bordering the City of North Vancouver. As seen in Fig. 1, the northern part of the district primarily consists of sparsely populated, forested terrain.

Fig. 1 The study area and landmarks surrounding the District of North Vancouver. Map tiles by Stamen Design (http://stamen.com), under CC BY 3.0. Data by OpenStreetMap, under ODbL. Landmarks, borders, and orientation information added

The district is situated in an area that is subject to seismic hazard and is at risk of a large earthquake in the future. An earlier study (Journeay et al. 2015), conducted by the district in partnership with Natural Resources Canada (NRCan) and a number of other research partners, examined the effects of an earthquake on local infrastructure systems. This study evaluated the likelihood and effects of known hazards in the region before completing a comprehensive assessment of a reference-case magnitude 7.3 shallow earthquake centered in the nearby Strait of Georgia. Hazard information for this earthquake was then processed with Hazus to determine the direct effects and probabilities of failure of the various infrastructure systems of interest (DNV 2015; Journeay et al. 2015).

Hazus outputs include the estimated level of damage to various infrastructure components and, for some infrastructure systems, the probability of system recovery after a certain number of days. Instead of using these data directly, however, the goal of this model is to offer an estimate of recovery time for each component in the study area in order to provide an overall view of multi-infrastructure restoration at the city scale. To achieve this, the damage state reported by Hazus is coupled with repair times for the various systems. These repair times are gathered from federal partners involved in the earlier study as well as Hazus documentation (FEMA 2011), which is derived in part from a study of earthquake damage data developed by the Applied Technology Council (1985), as well as expert judgement.

Separating damages and repair times provides flexibility to the process of modeling in GMOR. If improvements are made to recovery time estimates or local availability of resources, these can be quickly incorporated into an updated GMOR model. In addition, this separation allows dependencies to be added to the GMOR model that are not represented within Hazus.

3.2 Included Infrastructure Systems

Infrastructure sectors included in this study are potable water distribution, wastewater collection, power distribution, and road and highway networks. Each of these systems is separated into zones based on neighborhood boundaries within the district shown in the inset of Fig. 1.

One water treatment facility in the northern part of the district and one wastewater treatment facility in the south are included in the study. Water distribution and wastewater collection networks are connected by neighborhood paths to central water supply and wastewater treatment facilities by a process explained using Fig. 2 as a simple illustration. In this example, zone A contains a water treatment and distribution facility that provides potable water to residents in zones A through D. Here it is important to make a distinction between repairs and functionality. For example, the water piping system in zone D could be quickly repaired after a hazard, or may even be undamaged, but that does not mean that it is immediately functional. Repaired pipes cannot convey water to residents unless the system feeding them is functional first.

Fig. 2 Illustration of the process used to determine functionality of the water distribution system in the model. In order for the potable water system in zone D to function and provide water to residents, the treatment facility and piping systems in zone A, as well as the piping systems in zone B or zone C, first need to be repaired

The GMOR framework accounts for this so that, in order for the water distribution system in zone D to be functional (that is, to provide water for residents there), the pipe network in zone D, pipe network in zone B or zone C, and pipe network and treatment plant in zone A need to be functional as well. In this way, the individual neighborhood systems are dependent on one another to provide a functional route from each to the main supply. In future models, higher resolution zoning may be established to more accurately represent the paths, connections, and separations between systems that are not neatly contained within individual neighborhood boundaries.
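The zone D example can be written as a short calculation in which "and" dependencies take the latest time and "or" dependencies take the earliest. The repair completion times below are hypothetical, and the sketch assumes that repairs in all zones proceed in parallel; it is not GMOR code.

```python
# Hypothetical repair completion times (days) for the pipe networks in Fig. 2.
PIPES_REPAIRED = {"A": 6.0, "B": 20.0, "C": 9.0, "D": 3.0}
PLANT_REPAIRED = 4.0  # treatment facility in zone A


def zone_d_functional(pipes, plant):
    """Zone D needs its own pipes AND zone A (plant and pipes) AND (B OR C)."""
    zone_a = max(plant, pipes["A"])    # plant and pipes must both be functional
    via_b = max(zone_a, pipes["B"])    # route A -> B -> D
    via_c = max(zone_a, pipes["C"])    # route A -> C -> D
    first_route = min(via_b, via_c)    # either route is sufficient
    return max(pipes["D"], first_route)


print(zone_d_functional(PIPES_REPAIRED, PLANT_REPAIRED))  # 9.0
```

Although the pipes in zone D are repaired after three days in this example, residents there receive water only on day nine, when the route through zone C becomes functional.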

Power distribution and road systems are represented as isolated entities within neighborhoods, with no reliance on systems in bordering neighborhoods. Since power generation is largely located far outside the district, the repair time of transmission lines entering into the district is not included in this study. It is assumed that they remain intact or are rapidly repaired due to their significant impact on the district and surrounding areas.

In the same way, damage to local roadways is assumed to present the most immediate disruption for residents. Redundancy in road systems, the capacity of repair crews to pass minor obstacles, and an overall lack of extensive damage to the road network observed in the study likely result in negligible delays to access repairs compared to the length of the repairs themselves. In addition, modeling each individual road segment and its connection to other roads is computationally complex, and the unknown location of repair crews relative to damaged components at the time of a disaster restricts modeling to a neighborhood scale.

In this model, water distribution and wastewater collection network failures and probabilities of failure for power and road networks are correlated to those established (but unpublished) for use in the previous district study of a magnitude 7.3 earthquake (Journeay et al. 2015).

3.3 System Failures and Recovery Time Parameters

Recovery time parameters and failure data for infrastructure systems are sourced from Hazus documentation and federal partner data and estimates. The application of this data to the current study is described below.

3.3.1 Water Distribution and Wastewater Collection

For water distribution and wastewater collection systems, the estimated times for pipe repairs are shown in Table 1. These repair times are scaled by the number of breaks and leaks within a neighborhood. The levels of damage and availability of repair crews are held constant, but recovery time varies based on the distribution indicated in the table.
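The scaling of pipe repair time by the number of breaks and leaks can be illustrated as follows. The per-repair parameters, the normal distribution truncated at zero, and the number of crews are assumptions made for this sketch; the values in Table 1 are not reproduced here.

```python
import random

# Hypothetical per-repair parameters in crew-days: (mean, standard deviation).
BREAK_DAYS = (0.5, 0.1)
LEAK_DAYS = (0.25, 0.05)


def neighborhood_repair_days(n_breaks, n_leaks, n_crews, rng):
    """Sample one total pipe repair time (days) for a neighborhood."""
    total = 0.0
    for _ in range(n_breaks):
        total += max(0.0, rng.gauss(*BREAK_DAYS))  # truncate at zero
    for _ in range(n_leaks):
        total += max(0.0, rng.gauss(*LEAK_DAYS))
    return total / n_crews  # assumed: crews share the work evenly


rng = random.Random(1)
# Damage (40 breaks, 80 leaks) and crews (2) are held constant across trials.
samples = [neighborhood_repair_days(40, 80, 2, rng) for _ in range(500)]
mean_days = sum(samples) / len(samples)
print(f"mean of 500 trials: {mean_days:.1f} days")
```

Because the damage and crew counts are fixed, all of the trial-to-trial variation in this sketch comes from the sampled repair durations.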

Table 1 Summary of repair time parameters by repair task

There is a single water distribution facility in the district. In order for water distribution systems in other neighborhoods to be functional, this facility must first be repaired. The system is largely gravity-fed but is supported by a transmission pumping system in the water distribution facility. Repair time for the pumping system is derived by weighting Hazus information with probabilities of damage and is calculated as a distribution with a mean of 2.83 days and a standard deviation of 1.34 days.
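One possible reading of this weighting is a probability-weighted mixture of the repair time distributions for each damage state, whose mean and standard deviation follow from the law of total variance. The sketch below uses hypothetical probabilities and repair parameters, so it does not reproduce the 2.83-day and 1.34-day values, and it should not be taken as the exact procedure used in the study.

```python
import math

# Hypothetical damage-state probabilities and repair time parameters (days).
STATES = {
    "slight":    {"p": 0.6, "mean": 1.0, "sd": 0.5},
    "moderate":  {"p": 0.3, "mean": 3.0, "sd": 1.5},
    "extensive": {"p": 0.1, "mean": 7.0, "sd": 3.0},
}


def weighted_repair_time(states):
    """Mean and standard deviation of a probability-weighted mixture."""
    mean = sum(s["p"] * s["mean"] for s in states.values())
    # E[X^2] of the mixture is the weighted sum of each state's E[X^2].
    second_moment = sum(
        s["p"] * (s["sd"] ** 2 + s["mean"] ** 2) for s in states.values()
    )
    return mean, math.sqrt(second_moment - mean ** 2)


mean, sd = weighted_repair_time(STATES)
print(f"mean = {mean:.2f} days, sd = {sd:.2f} days")
```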

The district has one wastewater treatment facility located near its southern border. Just as the water distribution system in each neighborhood is joined to the water distribution facility through adjacent neighborhoods, the wastewater collection network in each neighborhood is joined to the wastewater treatment facility. Each individual neighborhood must be able to reach the facility by means of functioning neighborhood wastewater collection network entities in order to be restored to full function itself. Future studies may explicitly model individual pipe segments, redundancy provided by parallel systems, or critical sewer lines, but that complexity was not included here for the sake of computational efficiency and because of a lack of available data.

3.3.2 Power Distribution

For the electrical power distribution system, the paths of power lines and locations of other key parts of the system were not available. Instead, a rough outage estimate produced in conjunction with the previous study (Journeay et al. 2015) indicated a 68.1% chance that 75,000 customers would be without power for six months. This estimate was then weighted by population at a neighborhood level to determine probabilities of outage and recovery times. Because path connections between neighborhoods are unknown, the power distribution system in each neighborhood is treated as independent for the purpose of repairs. As mentioned previously, it is assumed that power lines feeding into the district are functional, so the recovery modeled is for lines completely within the boundaries of the district. Low resolution provincial data indicate that multiple feeder lines enter the district, so the power system in the area has redundant capacity. In addition, the power transmission and distribution system is established and maintained by a provincial organization. As a result, the district does not have much of an influence in decision making regarding which lines are repaired first, and the model presented here only considers an approximate overall recovery timeline.

As indicated in Sect. 3.2, the power distribution network was not set to fail in all neighborhoods in every trial. This differs from the water distribution and wastewater collection systems, which are both set to fail in every neighborhood in every trial. Instead, the probability of failure and recovery time for the power system in each neighborhood is determined using federal partner data scaled by neighborhood population. These probabilities range from a minimum of less than 1% to a maximum of 100%, indicating that certain neighborhoods fail in every trial. Average repair time in individual neighborhoods ranges from less than one day to almost 450 days. Due to a lack of available data, the standard deviation for each of these averages was set to 50% of the mean, following the approach used for similar parameters in Hazus documentation.
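A single trial of the power system can be sketched as a failure draw followed, where the system fails, by a repair time draw with a standard deviation equal to half the mean. The neighborhood names, probabilities, and mean repair times below are hypothetical, and the normal distribution truncated at zero is an assumption of this sketch.

```python
import random

# Hypothetical neighborhoods: (probability of failure, mean repair time in days).
POWER = {"N1": (1.00, 400.0), "N2": (0.30, 60.0), "N3": (0.01, 0.5)}


def sample_power_outage(params, rng):
    """Return one trial's outage duration per neighborhood (None = no failure)."""
    outages = {}
    for name, (p_fail, mean_days) in params.items():
        if rng.random() < p_fail:
            # Standard deviation is 50% of the mean; the normal shape and
            # the truncation at zero are assumptions.
            outages[name] = max(0.0, rng.gauss(mean_days, 0.5 * mean_days))
        else:
            outages[name] = None  # system did not fail in this trial
    return outages


rng = random.Random(7)
trials = [sample_power_outage(POWER, rng) for _ in range(500)]
failures = {n: sum(t[n] is not None for t in trials) for n in POWER}
print(failures)
```

A neighborhood with a failure probability of 100% fails in all 500 trials of this sketch, while one with a probability of 1% fails in only a handful.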

3.3.3 Road Networks

For road networks, Hazus repair times are given by the time required to repair a 1 km segment of road based on the level of damage it experiences (Applied Technology Council 1985; FEMA 2011). These so-called “damage states” are grouped into four categories—no damage, slight damage, moderate damage, and extensive/complete damage. Slight damage is characterized by a few inches of road settlement, moderate damage is characterized by several inches of settlement, and complete damage is characterized by a few feet of settlement.

The probability of occurrence of each damage state for road segments was produced by Hazus for the Journeay et al. (2015) district study. These probabilities are correlated with Hazus repair times and weighted by the length of roadway to provide an overall probability for each damage state at a neighborhood scale. The individual repair time parameters are included in Table 1. As demonstrated by our results, the methodology used to establish failure probabilities and repair time parameters can result in extreme outliers in subsequently reported repair times. These outliers are further discussed in Sect. 4.
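The length weighting described above amounts to a weighted average of segment-level damage state probabilities. The segment lengths and probabilities in this sketch are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical road segments in one neighborhood: length (km) and the
# probability of each damage state for that segment.
SEGMENTS = [
    {"km": 2.0, "p": {"none": 0.90, "slight": 0.07, "moderate": 0.02, "extensive": 0.01}},
    {"km": 0.5, "p": {"none": 0.60, "slight": 0.25, "moderate": 0.10, "extensive": 0.05}},
    {"km": 1.5, "p": {"none": 0.80, "slight": 0.12, "moderate": 0.06, "extensive": 0.02}},
]


def neighborhood_probabilities(segments):
    """Length-weighted average of segment damage-state probabilities."""
    total_km = sum(s["km"] for s in segments)
    states = segments[0]["p"].keys()
    return {
        st: sum(s["km"] * s["p"][st] for s in segments) / total_km
        for st in states
    }


probs = neighborhood_probabilities(SEGMENTS)
print({state: round(p, 3) for state, p in probs.items()})
```

Because each segment's probabilities sum to one, the length-weighted neighborhood probabilities also sum to one.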

Road networks in this study are not joined by a path to a central hub or network. While it is certainly possible to model them with a linear connection to a repair facility or maintenance yard for the purpose of incorporating access time into the model, it is assumed that crew and material locations vary day to day. Predicting their locations for modeling purposes adds computational complexity while likely not increasing accuracy.

The damage state methodology used in this case study was recently developed and added to the functionality of GMOR. This method allows for the probability of occurrence for each damage state (reported by Hazus) to be input into the GMOR model at a neighborhood level. Previous models lacked this capability and therefore required significant manual intervention to produce the same results. Inputs from different sources (such as damage state estimates from one source and repair times from another) can be added and correlated as well, which provides flexibility for modeling complex systems.
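Mutually exclusive damage states can be illustrated by drawing exactly one state per neighborhood in each trial and then looking up the repair time associated with that state. The probabilities and repair times below are hypothetical and stand in for inputs that may come from different sources; the sketch is not GMOR code.

```python
import random

# Hypothetical neighborhood-level inputs: damage-state probabilities from one
# source and repair times (days) from another.
STATE_PROB = {"none": 0.80, "slight": 0.12, "moderate": 0.06, "extensive": 0.02}
REPAIR_DAYS = {"none": 0.0, "slight": 1.0, "moderate": 3.0, "extensive": 20.0}


def sample_damage_state(state_prob, rng):
    """Draw exactly one damage state, so that the states are mutually exclusive."""
    states = list(state_prob)
    weights = [state_prob[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


rng = random.Random(3)
draws = [sample_damage_state(STATE_PROB, rng) for _ in range(500)]
counts = {s: draws.count(s) for s in STATE_PROB}
mean_repair = sum(REPAIR_DAYS[s] for s in draws) / len(draws)
print(counts, f"mean repair time: {mean_repair:.2f} days")
```

Rare but severe states contribute few draws yet long repair times, which is one way in which extreme outliers can arise in the reported repair times.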

3.4 Interdependent Systems and Restoration Resources

Dependencies between systems are limited in this study due to their functional separation within the district. The exception to this is water distribution and wastewater treatment facilities, which depend on electrical power within their respective neighborhoods. This dependency is discussed further in Sects. 4.1 and 4.2. In addition, it is understood that road access is required to perform repairs on many systems. As mentioned in Sect. 3.2, however, the time required to access damaged components is considered less of a concern than the damage of the components themselves.

Information from the district and federal partners indicates that repair crews are specialized in the work that they perform, so cross-sector restoration resource sharing is unlikely. In different municipalities or studies with different levels of resolution, there may be conflicts in restoration timing due to resource sharing across sectors such as equipment or labor that would need to be included in the model.

4 Results and Discussion

Results are presented here for the different infrastructure systems explored in the case study. Separating the results into the varying domains of interest gives insight into which systems might be the most exposed in an earthquake and may benefit from additional preparation and protection. Figure 3 shows the initial failure characterization indicated for each infrastructure sector studied. Note the differing scales in the legend for each system and how the systems respond differently to the same hazard in each neighborhood.

Fig. 3 Failure characterization by neighborhood for (a) water distribution; (b) wastewater collection; (c) electrical power distribution; and (d) road and highway networks. Water and wastewater are categorized by average number of breaks per neighborhood, while power and road networks are categorized by total failures (that is, the number of times the system failed in each neighborhood) indicated out of 500 trials

4.1 Water Distribution

The variation in the recovery times of the water distribution system is shown in Fig. 4a. The percentage on the vertical axis of the figure indicates the share of neighborhoods (out of 49) that are recovered at the time indicated on the horizontal axis. The “Average” line plots the average time taken (in all trials) for one neighborhood to recover, then two neighborhoods, and so on, regardless of the location or population of the neighborhood. The mean repair time for an individual neighborhood across all trials is 55 days with a standard deviation of 15 days. As shown in Fig. 4a, the shortest time observed for the repair of all neighborhoods in the district is 57 days (best case), while the longest is 88 days (worst case).
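The construction of the “Average” line can be sketched by sorting the neighborhood recovery times within each trial and then averaging each rank across trials. The recovery times below are randomly generated placeholders and are not model output.

```python
import random

N_NEIGHBORHOODS, N_TRIALS = 49, 500
rng = random.Random(11)

# Hypothetical recovery times (days): one list of 49 values per trial.
trials = [
    [rng.uniform(5, 90) for _ in range(N_NEIGHBORHOODS)] for _ in range(N_TRIALS)
]

# Sort within each trial so that position k holds the time at which the
# (k + 1)-th neighborhood recovered, whichever neighborhood that was.
ranked = [sorted(t) for t in trials]

# Average each rank across trials to obtain the "Average" curve.
average_curve = [
    sum(r[k] for r in ranked) / N_TRIALS for k in range(N_NEIGHBORHOODS)
]

# Best and worst cases: the trials with the fastest and slowest full recovery.
best = min(ranked, key=lambda r: r[-1])
worst = max(ranked, key=lambda r: r[-1])

for k in (0, 24, 48):
    share = 100 * (k + 1) / N_NEIGHBORHOODS
    print(f"{share:5.1f}% recovered after {average_curve[k]:.1f} days on average")
```

Sorting before averaging is what makes the curve independent of the location or population of any particular neighborhood.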

Fig. 4 Recovery time by sector for (a) water distribution; (b) wastewater collection; (c) electrical power distribution; and (d) road and highway networks. Scales vary on the horizontal and vertical axes, and the horizontal axis is logarithmic for (c) and (d). The best and worst cases correspond to the fastest and slowest overall recovery time, respectively, for each system shown

Figure 4a shows recovery only for the pipelines within the water distribution system and does not include the dependence of some neighborhoods on electrical power distribution. This decision was made to better show trends of recovery in the district and eliminate extreme outliers from the data set. Further, it is assumed that emergency provisions (such as earthquake-resistant backup power) can be made for water systems to supply neighborhoods as necessary. Widespread power recovery may take longer, but functionality for critical systems like water distribution is likely to be restored sooner.

4.2 Wastewater Collection

The average repair time of the wastewater collection network for any neighborhood in any trial is 43 days, with a standard deviation of 25 days. For the whole district, the average recovery time for the wastewater collection network is 87 days (Fig. 4b), with a standard deviation of 5 days. The lower variability in overall recovery time illustrates the effect that ordering and path dependencies have on neighborhood recovery. While individual neighborhood recovery times can vary drastically due to the order in which they are addressed, the overall recovery time remains relatively more consistent as repairs in well-connected neighborhoods trigger recovery for many dependent neighborhoods at once.

The maximum recovery time calculated is 103 days, and the minimum is 72 days. The variability in the wastewater collection network is smaller than that of the water distribution system, indicating lower individual variability in recovery times at a neighborhood level. This is confirmed by the source data, where the average deviation in wastewater collection network recovery time is lower than the deviation in the water distribution network.

As is the case for the water distribution system, there is a dependence on power for the wastewater collection network in one of the district’s neighborhoods. This dependency was assumed negligible for the purposes of these results for the reason indicated in Sect. 4.1—the expected use of backup power.

4.3 Electric Power Distribution

Average recovery time of the electric power distribution system for any neighborhood in a trial was found to be just over 61 days with a standard deviation of over 123 days. These values include trials and scenarios in which the power network did not fail. As such, many of the repair times considered in the average and standard deviation calculations are zero, which skews the results. If these scenarios are ignored, average recovery time jumps to almost 158 days with a standard deviation of 155 days. This large increase in required repair time indicates that the electric power distribution system is very sensitive to disruption in that failures have a substantial impact on the resources and time required to return the system to an operational state.
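The relationship between the two sets of statistics can be checked with a short calculation. If non-failures are recorded as zero days, the overall mean equals the share of failures multiplied by the mean over failures, and the overall standard deviation follows from the second moment. The sketch below uses the rounded values quoted above, so its results are approximate.

```python
import math

# Rounded values quoted in the text (days).
mean_all = 61.0      # all neighborhood-trial pairs, zeros included
mean_failed = 158.0  # only pairs in which the power system failed
sd_failed = 155.0

# With zeros for non-failures: mean_all = share_failed * mean_failed.
share_failed = mean_all / mean_failed

# Second moment of the zero-inflated data, then its standard deviation.
second_moment = share_failed * (sd_failed ** 2 + mean_failed ** 2)
sd_all = math.sqrt(second_moment - mean_all ** 2)

print(f"implied share of failures: {share_failed:.1%}")
print(f"implied overall standard deviation: {sd_all:.0f} days")
```

With these rounded inputs the implied share of failures is roughly 39%, or about 19 of the 49 neighborhoods per trial, and the implied overall standard deviation is approximately 123 days, in line with the value reported above.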

The greatest number of neighborhood failures experienced in a single trial is 26, and the least is 12. That is, out of the 49 neighborhoods in the district, at least 12 and at most 26 are predicted to fail in any trial. While it may be useful to see how many neighborhoods fail in each trial to observe areas that are particularly vulnerable, the number of failures is not an accurate indicator of recovery time in the district given the high variability in individual neighborhood recovery time.

The overall average recovery time for the district for the power system is just over 531 days, with a minimum of 245 days, a maximum of 1013 days, and a standard deviation of 140 days, as shown in Fig. 4c. There is a high degree of variability in the repair time of the power network, largely because it is tied to the population in each neighborhood as noted in Sect. 3.3.2. Should further information be provided by the utility regarding the specifics of the distribution system, other methodologies for outage estimation, such as the work done by Duffey (2019), can be incorporated into the model.

4.4 Road Networks

As mentioned in Sect. 3.3.3, the methodology used to establish road failure probabilities can lead to extreme outliers in the dataset. While the average repair time per neighborhood presented in Fig. 6d does not show these outliers, their effect is clearly demonstrated in the box plot included in Fig. 7a.

The mean repair time for this dataset is 451 days, with a standard deviation of 519 days. This includes eight scenarios in which no failures were indicated, so the minimum repair time for the road network in the district is zero days, while the maximum is almost 4700 days.

As demonstrated in Fig. 7a, repair values are largely concentrated in the lower end of the range shown, and the median repair time is only 301 days. The outliers are shown in this dataset because it is important to recognize the effect that a catastrophic (or targeted) disaster could have on the district. Disaster recovery planning activities, however, should largely focus on the more realistic scenarios represented by the center of the box plot.
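The gap between the mean and the median can be illustrated with a small sample. The sketch below uses made-up repair times, not the study's data, with one extreme trial standing in for the catastrophic outliers. It computes the mean, the median, and the quartile cut points that bound the box in a box plot.

```python
import statistics

# Hypothetical district road repair times in days, one value per trial.
# The single extreme value mimics a catastrophic outlier.
repair_days = [0, 120, 200, 250, 301, 350, 420, 600, 4700]

mean_days = statistics.mean(repair_days)
median_days = statistics.median(repair_days)
# Three cut points that split the sorted sample into four equal-probability groups.
q1, q2, q3 = statistics.quantiles(repair_days, n=4)

print(f"mean = {mean_days:.0f} days, median = {median_days} days")
print(f"box spans {q1:.0f} to {q3:.0f} days")
```

A single extreme trial pulls the mean well above the median while leaving the quartiles almost unchanged, which is why the center of the box plot is the more useful guide for planning.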

Figure 4d shows the road network recovery over time in the district. Note the logarithmic scale on the horizontal axis and the large discrepancy in average versus worst-case scenario recovery. The shortest recovery time is not shown in Fig. 4d because a number of trials indicated a repair time of zero days, which cannot be plotted on a logarithmic axis.

4.5 End Users

The end users entity within the model refers to the overall time at which all included systems have been repaired within a single trial. This recovery in the district averages 702 days, with a standard deviation of 404 days. The shortest recovery time indicated is 252 days, while the longest is 4685 days. In general, the repair time of either the road network or the power distribution network governs recovery in the district as a whole.

The best overall district recovery time for the scenarios tested is longer than any of the fastest recovery times for individual systems. This is the result of the low likelihood of an ideal recovery scenario for one sector occurring in the same trial as the ideal scenario for all other systems. However, combining ordering and resource parameters from the best-case scenarios for all trials can help inform decision making that leads to improved repair time for the whole district.
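This behavior follows from taking the slowest system in each trial as the district recovery time. A minimal sketch, assuming that all repairs are measured from the same starting point and using illustrative repair-time ranges that are not taken from the study's data (the function names are hypothetical and are not part of GMOR):

```python
import random

SYSTEMS = ["water", "wastewater", "power", "roads"]

def simulate_trials(n_trials, seed=0):
    """Return hypothetical per-trial repair times (days) for each system."""
    rng = random.Random(seed)
    # Illustrative (low, high) ranges only; not taken from the study's data.
    ranges = {"water": (60, 110), "wastewater": (70, 105),
              "power": (245, 1000), "roads": (0, 1500)}
    return [{s: rng.uniform(*ranges[s]) for s in SYSTEMS} for _ in range(n_trials)]

def end_user_recovery(trial):
    """District recovery is complete once the slowest system is repaired."""
    return max(trial.values())

def governing_system(trial):
    """Name of the system whose repair time sets the district recovery time."""
    return max(trial, key=trial.get)

trials = simulate_trials(1000)
best_overall = min(end_user_recovery(t) for t in trials)
best_per_system = {s: min(t[s] for t in trials) for s in SYSTEMS}

# The best overall time can never beat the slowest of the per-system bests.
assert best_overall >= max(best_per_system.values())
print(f"best overall: {best_overall:.0f} days")
print({s: round(d) for s, d in best_per_system.items()})
```

The inequality holds for any set of trials, because the best time for each system usually comes from a different trial.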

Figure 5 shows the pattern of complete recovery for the systems within the district. The horizontal axis of the figure is logarithmic, so the substantial horizontal jumps in the figure represent large individual repair times for specific entities. In general, these jumps are the result of a combination of a rarely experienced high damage state and a repair time in the top range of a normal distribution. These extreme scenarios should certainly be considered when planning for disasters but are by no means representative of the level of damage and repair times that are likely to be experienced in the district.

Fig. 5 Overall recovery time of all systems within the district. Note the logarithmic scale on the horizontal axis, and the categorization of best and worst repair times based on the fastest and slowest repair times indicated, respectively

The central portions of the box plots shown in Fig. 7 represent a more realistic view of the repair times that planners, emergency service personnel, and residents should prepare for following a disaster. The “End Users” plot represents the overall recovery time for the neighborhoods within the district.

4.6 Neighborhood Recovery

Recovery time for each system in the model at a neighborhood level is shown in Fig. 6. Some observations may be made by comparing the recovery patterns observed in Fig. 6 to the initial failure characterization shown in Fig. 3 for each neighborhood. The first is that the initial failures do not always coincide directly with increased repair times. In fact, factors other than initial failures can play a significant role in recovery times. For the water distribution and wastewater collection systems, for example, recovery time is influenced by the path dependencies noted in Sect. 3.2. As such, northernmost neighborhoods tend to recover water service sooner and southwestern neighborhoods tend to recover wastewater service sooner due to the proximity of these areas to primary facilities. This is not always the case, however, due to local failures within neighborhoods that may have an important impact on recovery times as well (Fig. 7).

Fig. 6 Recovery time by neighborhood for the (a) water distribution; (b) wastewater collection; (c) electrical power distribution; and (d) road and highway networks. Note the differing scales indicated in the legend for each system

Fig. 7 Box plots for (a) all studied infrastructure sectors in the district; and (b) water and wastewater systems. Note the different scales on the vertical axis of (a) and (b). “End Users” is the entity used to represent overall recovery in the district. The bottom and top tails of the plots represent the lowest and highest quartiles of recovery time, respectively, while the box represents the second and third quartiles with the median indicated by a horizontal line

For the power distribution and road and highway networks, the correlation of initial failures and repair times is limited by other factors. Unlike the water and wastewater systems, the repairs for roads and power networks at a neighborhood level are considered independent with no reliance on centralized systems or adjacent neighborhood functionality. In the case of the power distribution network, the limited correlation is primarily related to the methodology used to determine outage time. The outage time estimate uses population as a proxy for the size of the power network in the neighborhood. As a result, neighborhoods with higher populations are assumed to have larger power networks that require more time to repair when they are damaged. Therefore, the population of a neighborhood is a strong predictor of the recovery time of the power network in this model.

The road network repair time is influenced by both the total roadway length in a neighborhood and the number of initial failures. Neighborhoods that have fewer roadways and few initial failures are always repaired much more quickly than neighborhoods with many roadways and many initial failures. Neighborhoods with few roadways but many failures, or many roadways but few failures, experience varied repair times due to a combination of these effects.

4.7 Discussion of Results

These results can offer insights to emergency management organizations and planners by estimating the duration of emergency or supplementary services needed after a disaster. Because the results are provided at a neighborhood level, infrastructure managers can prioritize the areas that will need the most attention for repairs and arrange for adequate resources in those locations. Residents can also be informed within their neighborhoods about how they can best prepare themselves for a disaster, especially by understanding which lifelines might be most at risk in their area and how they can function if those lifelines are out of service.

Associating these results with economic data can also help quantify losses in the district in the aftermath of a disaster. If managers and planners have a better understanding of how a disaster will impact their systems and residents, they may be driven to invest in disaster risk reduction strategies, which will improve outcomes for the district as a whole.

In addition, these results provide an indication of areas where uncertainty surrounding repair times is most prevalent. This is especially obvious in the wide range of repair times shown for road and highway systems and power distribution networks. Identifying where this uncertainty exists provides an opportunity to improve models and explore different methodologies to further understand and reduce uncertainty in the future.

Seeing where uncertainties exist can also provide an opportunity to understand their root causes and respond to them appropriately. In some cases, no immediate response is needed, but it is valuable to know what conditions may be like. This is apparent for the road and highway network where, in many cases, slight damage to road segments is indicated. Slight damage is characterized by at most a few inches of ground settlement. Given the condition of roads currently in use in many cities, roads with this sort of settlement would still be passable by residents (pending inspection and verification of safety), and their repair would simply be added to normal maintenance operation schedules. In this model, however, the road network is not considered functional in a neighborhood until these repairs are made. As such, the repair time shown in the model is likely significantly higher than what would be realistically experienced in a city. Knowing about minimal damage, however, would still serve operators by preparing them for ongoing restoration work in addition to regular maintenance operations.

5 Conclusion

This study presents a city-scale, data-driven model of multi-infrastructure recovery in a suburban municipality. Key findings indicate that repair times for electrical power and road networks are highly variable and influential in the overall recovery of neighborhoods within the district. Other infrastructure systems are individually variable in their repair times, but do not affect overall recovery to the same extent that power and road networks do. Water and wastewater systems are expected to recover most quickly for the studied magnitude 7.3 earthquake. It is important to note that these repair times represent full recovery of the systems studied rather than the time at which society can adequately function and prevent further losses after a disaster. Partial functionality in some sectors may provide an opportunity to expedite repairs in others. Therefore, an awareness of the ways in which neighborhoods can continue to function while disaster recovery occurs is essential to improving outcomes for residents after a disaster.

Lessons learned from this study can inform the process of developing models for future work and improve the understanding of recovery in urban and suburban areas. Recognition of the influence of different infrastructure sectors on recovery will guide data collection for future studies in order to reduce uncertainty throughout the modeling process. Already, this work can be used to guide communities in understanding the general order and timing of recovery for their critical functions. Knowing that water and wastewater systems will be recovered long before power systems, for example, gives stakeholders the opportunity to focus on the ways in which they can prepare their communities for different types of outages. Concentrating resources on certain systems and providing hardening or redundancy for those systems can make post-disaster recovery smoother for communities. The ability to model all infrastructure systems within one platform provides a distinct advantage over other modeling processes.

There are many opportunities to advance this form of data-driven modeling. The uncertainty in repair times for some infrastructure sectors is substantial and can be challenging to characterize due to the data required about the underlying infrastructure network. Obtaining these data requires close collaboration with infrastructure operators. This collaboration also lends itself to moving beyond random resource prioritization and towards a better representation of the needs of members in the community. Connecting with infrastructure operators tends to be difficult, however. Many are bound by business interests or confidentiality agreements and are hesitant to share information. As such, data from operators can come in a variety of forms, from detailed information about outages and repair times to generic descriptions of outage areas and plans for recovery. A beneficial feature of GMOR is its flexibility to incorporate inputs from many sources, such as data acquired directly from infrastructure operators or the other modeling methodologies discussed in Sect. 2. Testing various models for different infrastructure sectors can provide a better understanding of uncertainties encountered in each and inform best practices for modeling in the future.

Another opportunity to advance this modeling approach is to expand the geographic area that is included in the model. Doing so provides a more complete understanding of the boundaries of the systems represented and how they interact across neighborhood and municipal boundaries. Further work may also focus on including additional infrastructure systems such as telecommunications or natural gas distribution, as well as more thoroughly incorporating the interconnections between systems. In addition, disaster risk reduction scenarios may be considered and incorporated to demonstrate their effect on improving community recovery times. Hardening specific systems, for example, may provide widespread benefits to others within the study area.

Key to successful implementation of results from this and future studies is close collaboration with stakeholders and others who may be using this information. This is especially crucial in communicating uncertainties to decision makers so they can make choices that best reflect the needs of the communities that they represent.

Further, while this study did not incorporate other functions that may exist within neighborhoods, such as fuel stations or important buildings, future studies can include these entities to inform priorities for repair. For example, if a certain neighborhood contains a hospital and emergency shelter, it may be more critical to send repair resources to that neighborhood before another neighborhood in an industrial area, or an area with low population. As more data become available and more needs are identified, modeling should be continually improved to best protect infrastructure systems and the residents and communities that they support.
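One way such priorities could be expressed is as a score that orders neighborhoods for repair. The sketch below is hypothetical: the `Neighborhood` fields, the weights in `priority_score`, and the example district are assumptions made for illustration, and do not represent the ordering logic used in this study.

```python
from dataclasses import dataclass
import random

@dataclass
class Neighborhood:
    name: str
    population: int
    has_hospital: bool = False
    has_shelter: bool = False

def priority_score(n: Neighborhood) -> float:
    """Hypothetical score: critical facilities dominate, population breaks ties."""
    # Assumes populations below one million so the last term stays under 1.
    return 2.0 * n.has_hospital + 1.0 * n.has_shelter + n.population / 1_000_000

def prioritized_order(neighborhoods):
    """Repair order with the highest-scoring neighborhoods first."""
    return sorted(neighborhoods, key=priority_score, reverse=True)

def random_order(neighborhoods, seed=None):
    """Baseline: a shuffled repair order, standing in for randomized ordering."""
    shuffled = list(neighborhoods)
    random.Random(seed).shuffle(shuffled)
    return shuffled

district = [
    Neighborhood("industrial", population=500),
    Neighborhood("residential", population=12_000),
    Neighborhood("medical", population=4_000, has_hospital=True, has_shelter=True),
]
print([n.name for n in prioritized_order(district)])
# prints ['medical', 'residential', 'industrial']
```

In practice, the weights would need to come from stakeholder input rather than fixed constants.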

Lessons learned from this study indicate the importance of community and stakeholder engagement throughout the modeling process as a means of reducing uncertainty and providing more valuable results. In contrast to the randomized ordering used in this model, community input can inform priorities for repair and minimize uncertainty related to these processes. In addition, local plans and estimates of repair time and outage duration may provide a more accurate understanding of specific infrastructure systems compared to the national- or international-scale data used here and in many other studies.

Uncertainty in results may also be reduced by gathering information from communities and understanding their needs. If there are certain roads, for example, that would provide a great benefit to the community if quickly returned to operation, then they can be pinpointed in the output of results with a higher degree of accuracy than the aggregate results shown here. The context within which a community operates can only be identified through close collaboration and ongoing communication and should therefore be an important part of future initiatives and practice in this area.