Introduction

Modern societies rely on infrastructure systems to provide essential services supporting societal well-being, economic prosperity, and quality of life. Such systems allow ease of movement for people, goods and services, and deliver power, water and other utilities to households and businesses. The complexity and interdependence of these systems have increased over time as designers and planners have taken advantage of opportunities afforded by new technologies, and responded to increasing pressures to provide more efficient and cost-effective infrastructure services. One side-effect of this has been an increased potential for cascading failures, such that small-scale initial failures in one system can result in events of catastrophic proportions across the wider network (see for example Little 2002; Dueñas-Osorio and Vemuru 2009; Buldyrev et al. 2010; Lu et al. 2018).

Critical infrastructure networks have complex mechanisms in place for planning, financing, funding, design, construction and operation. Resilience and the emerging concept of resilience engineering within infrastructure are among the main concerns of those managing such complex systems (LRF 2014, 2015), alongside stewardship, sustainability, financing and funding mechanisms and project delivery and management (Aktan 2013). There is, therefore, a need to understand how resilience engineering techniques can be applied to interdependent critical infrastructure systems, and to identify examples of best practices in this area.

This paper is based on the findings from one of a series of reports produced for the Resilience Shift programme (The Resilience Shift 2018) which aimed to identify the applications of resilience engineering in relevant sectors and to determine any gaps in the understanding, communication and improvement of resilience. The paper provides a perspective on the current practice and future opportunities for resilience engineering in the critical interdependent infrastructure sectors of energy, water and transport. It centres on a review of academic literature and other relevant reports and research programmes linked to resilience engineering and the related topics of performance-based engineering and adaptive capacity, especially at the design and planning stages. It focuses on the identification of recent examples of the methodologies and implementation of resilience engineering in a range of geographic contexts, particularly where interdependencies between sectors have an impact on the methodologies or practices used. Unlike other engineering disciplines, resilience engineering has emerged through academia rather than through the experience and knowledge of engineers and planners. While this means there are only limited numbers of organisations or infrastructure providers explicitly using resilience engineering as part of their safety or business management philosophy, it provides an opportunity to embed the concept of resilience engineering within operational guidelines based on best practice rather than on incremental experience.

The structure of the remainder of the paper is as follows: Sect. 2 introduces resilience and associated concepts, setting out a range of definitions of terms and identifying where commonalities exist. The various models and frameworks used as part of resilience engineering implementation and monitoring are summarised in Sect. 3, while Sect. 4 focuses on current practices and existing approaches. Section 5 considers the opportunities and potential barriers associated with these methodologies and practices, and some of the implications of implementing and embedding resilience engineering in the future of infrastructure planning and design. Concluding remarks are given in Sect. 6.

Characterisations of resilience

Before assessing the evolution of resilience engineering, this section explores the range of definitions of resilience, resilience engineering and resilience capacity, as well as the related concept of performance-based engineering. Resilience engineering first emerged within several different academic areas and has subsequently evolved to be a practical measure utilised in infrastructure development and elsewhere in industry. This has resulted in a diversity of definitions and characterisations, but there are also commonalities between sectors and between different engineering disciplines, which are explored below.

Resilience and resilience engineering in infrastructure

For many years, ‘resilience’ has been a term used across a range of different physical, social and ecological disciplines. Many definitions exist, and many reviews have been undertaken to attempt to clarify these definitions (see for example Zhou et al. 2008; Haimes 2009; Rose 2009; Aven 2011; Bhamra et al. 2011; Alexander 2013; Francis and Bekera 2014; Baum 2015; Bergström et al. 2015; Righi et al. 2015; Teodorescu 2015; Woods 2015; Cimellaro et al. 2016; Hosseini et al. 2016; Ibanez et al. 2016; Connelly et al. 2017; Kurth et al. 2018; Patriarca et al. 2018; Xue et al. 2018). The many distinctions and interpretations are dependent on which aspect of a resilient system is under scrutiny (Sikula et al. 2015; Lofquist 2017). For example, socio-ecological resilience can relate to issues of security, protection, emergency response, business continuity, environmental and ecological issues, social issues related to human health, safety, and general welfare. In contrast, engineering resilience is related to the integrity of physical infrastructure systems. Table 1 gives a brief outline of some of these various definitions of ‘resilience’ in three of these contexts: organisational, socio-ecological and physical.

Table 1 Varied definitions of resilience

This diversity in usage can imply difficulties in interpretation and measurement of resilience (Francis and Bekera 2014). In order to overcome these difficulties, a number of studies have attempted to reduce these complications, and consolidate the numerous definitions of resilience (examples include Hollnagel et al. 2006; Francis and Bekera 2014; Woods 2015; Connelly et al. 2017; Hollnagel 2017). Four main principles emerge from these studies that are common to many of the definitions of resilience: (i) anticipate; (ii) absorb; (iii) adapt; (iv) recover. For infrastructure systems, these principles relate to how a system is managed, how well it functions at thresholds of service delivery, how well a system can cope with changing conditions, and the time taken to return to normal conditions following a disruption (Connelly et al. 2017).

Following on from this framing of resilience, we can then characterise what is meant by resilience engineering. It could be seen as engineering which aims to enhance an infrastructure system’s performance with regard to the following three key features (as set out in NIAC 2009):

  • Robustness: The ability to maintain critical operations and functions in the face of crisis (i.e. absorbing and adapting).

  • Resourcefulness: The ability to prepare for, respond to and manage a crisis or disruption as it unfolds (i.e. involving all four of the key principles).

  • Rapid recovery: The ability to return to and/or reconstitute normal operations as quickly and efficiently as possible after a disruption (i.e. recovering).

In order to achieve this, resilience engineering should consider how organisations and systems function as a whole. It will assess changes in the adaptive capacity of these entities as they confront disruptions, change and pressures (Woods 2006). Hollnagel (2017) suggests that this assessment should consider four basic abilities which give an overview of how an organisation or system functions: how it responds, how it monitors, how it learns, and how it anticipates. Resilience engineering then comprises the ways in which these four capabilities can be established and managed.

Clearly these abilities will, at least to some extent, have been considered previously in the fields of risk governance and safety management (Righi et al. 2015). However, while traditional approaches have been based on hindsight and probabilities of failure, resilience engineering promotes robust yet flexible risk models, responding to disruptions more proactively (Dekker et al. 2008). For example, resilience engineering can be applied in the realm of air traffic management systems (EUROCONTROL 2009), where established approaches to risk and safety mainly focus on the things that go wrong, whereas resilience engineering within safety focuses on the whole set of outcomes, i.e., things that go right as well as things that go wrong.

Resilience capacity

One of the main factors emerging from the discussion of definitions of resilience and resilience engineering in Sect. 2.1 is that of adaptability. If we assume that a resilient system is one which can absorb, adapt and recover from unexpected events, then the resilience capacity can be considered as the combination of these three capacities, as shown in the ‘resilience triangle’ in Fig. 1 (adapted from Francis and Bekera 2014).

Fig. 1 Resilience triangle. Reproduced with permission from Francis and Bekera (2014)

Absorptive capacity is a measure of how much stress the system can withstand before an event impacts on its performance level, while restorative capacity is a measure of how quickly a system can return to some functional level after the event. Central to the idea of resilience, however, is adaptive capacity (Bergström et al. 2015; Lundberg and Johansson 2015). Francis and Bekera (2014) define adaptive capacity as “the ability of a system to adjust to undesirable situations by undergoing some changes”. This is distinct from absorptive capacity, in that adaptive systems change in response to adverse impacts in order to allow them to continue to function, especially if absorptive capacity has been exceeded. This capacity is enhanced by the system’s ability to anticipate any disruption prior to the event, to recognise unanticipated events, and to reorganise after the occurrence of an adverse event. In a network or distribution system, adaptive capacity allows flows through the system via alternate paths if the usual or preferred route is disrupted (Turnquist and Vugrin 2013).
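The rerouting behaviour described above can be illustrated with a minimal sketch. The five-node network, its node names and the `shortest_path` helper are hypothetical and are not taken from Turnquist and Vugrin (2013); the sketch only assumes an undirected network in which every link has equal cost, so that a breadth-first search finds a shortest route.

```python
from collections import deque


def shortest_path(links, origin, destination):
    """Breadth-first search over undirected links.

    Returns the list of nodes on a shortest route, or None if the
    destination cannot be reached from the origin.
    """
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    previous = {origin: None}  # also serves as the visited set
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if node == destination:
            # Walk back through the predecessors to rebuild the route.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical network: A-B-D is the preferred route, A-C-E-D a longer alternative.
links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
preferred = shortest_path(links, "A", "D")

# Disrupt the link B-D and search again on the remaining links.
disrupted = [link for link in links if link != ("B", "D")]
rerouted = shortest_path(disrupted, "A", "D")
```

In this toy case the flow from A to D survives the loss of link B-D by taking the longer route through C and E. A network with no such alternative would return `None`, which corresponds to adaptive capacity being exhausted.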

A related example from the transport sector of how capacity of road networks can be adapted is provided by the finalists of a competition launched by the UK’s National Infrastructure Commission to find “innovative and creative ideas on how to deliver a world-class road network in the UK ready for connected and autonomous vehicles (CAVs)” (National Infrastructure Commission 2018b). The finalists could all be considered to provide aspects of adaptive capacity to road networks (and thereby to reduce the ‘undesirable situation’ of high congestion levels), especially in relation to future automated vehicle use. This is achieved through (i) time-sensitive kerbside use to accommodate separately cyclists, pedestrians, automated vehicles or freight; (ii) segregated CAV zones enabling enhanced network capacity; (iii) CAV fleet routing to help optimise local traffic flows; and (iv) enhanced traffic management using CAV-based data to influence traffic light sequences, or to provide speed optimisation information to vehicles travelling between traffic signal junctions.

Performance-based and specification-based engineering

A related topic to resilience engineering is that of ‘performance-based’ engineering (PBE), which has emerged from an architectural context (see for example Kolarevic and Malkawi 2005; Oxman 2008; Mosalam et al. 2018). PBE aims to design buildings and structures to broaden their capabilities in terms of how they are used (Whalley 2005), and to enhance performance, particularly in response to seismic activity (see for example Priestley 2000; Ghobarah 2001; PEER 2010). Therefore, while there is some common ground with resilience engineering, PBE is a more broadly-focused approach applied at the design stage, as opposed to resilience engineering which has a more specific focus but covers the whole life of infrastructure systems.

PBE contrasts with the more traditional approach in infrastructure system design, specification-based engineering (SBE), which is more prescriptive and process-oriented. SBE has served the engineering community well for many years, and established techniques and guidelines are easier to implement than a transition towards a performance-based approach (Aktan et al. 2007). However, the design, construction, evaluation, and preservation of constructed facilities that has been based only on implicit or qualitative descriptions of performance may not meet modern expectations. PBE is a more ‘product-oriented’ approach, such that the desired performance characteristics of the constructed system are described in terms of rational and measurable quantitative indicators, rather than the ‘bricks and mortar’ of the facility. Both PBE and SBE are likely to be needed at the planning and design stage, but while it has been acknowledged to be an important step towards building a resilient and sustainable civil infrastructure, use of performance-based engineering has yet to become a mainstream part of infrastructure planning and design (Li et al. 2011; Minsker et al. 2015; Ghosn et al. 2016).

Summary

While many definitions of resilience have emerged through various disciplines, four main principles have emerged which apply to infrastructure: resilient infrastructure systems should be able to anticipate and absorb any disruptions, then adapt and recover quickly. Resilience engineering should, therefore, be able to assess the adaptive capacity of a system, and result in systems which are designed to respond resourcefully to disruptions, maintain critical functions, and return to normal operations efficiently and quickly. Based on these definitions, the remainder of this paper discusses the use of resilience engineering both in practical situations and as a theoretical tool.

Resilience engineering as a theoretical tool

A review of 250 papers on resilience engineering carried out by Righi et al. (2015) found that around half were related to theoretical aspects of resilience engineering. This reflects the reality that other engineering disciplines have emerged through ‘hands-on’ experience, trial and error, and iterative learning and development. Resilience engineering, however, has emerged through a more academic route, with engineers only recently beginning to apply the methods proposed in the literature to real life situations. However, this theoretical work still provides a good background to the potential future options for resilience engineering in infrastructure. This section summarises the various methods in use, and existing and proposed approaches to the application of resilience engineering in the three critical infrastructure sectors of energy, water and transport, and in studies focusing on cross-sector interdependence. While many of the examples below are sector-specific, the methodologies used could be applicable across other sectors.

In addition to sectoral-based approaches, much work has been done to develop models and methods capable of analysing interdependent infrastructure systems (for a more detailed overview, see for example Yusta et al. 2011; Tamvakis and Xenidis 2013; Huang et al. 2014; Ouyang 2014). Johansson and Hassel (2008) suggest these methods can be divided into two categories: empirical and predictive approaches. Empirical approaches examine past events in order to increase understanding of infrastructure dependencies. Predictive approaches mainly focus on modelling or simulation, examining how interconnected infrastructures interact, for example, to assess how disturbances cascade through the systems. In contrast, Ouyang (2014) provides a more detailed classification, grouping modelling and simulation approaches into six types: empirical approaches, agent based approaches, system dynamics based approaches, economic theory based approaches, network based approaches, and ‘others’. This range of types allows more differentiation between approaches, and is reflected in the selection of models and frameworks discussed below.
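As a minimal illustration of a predictive, network based approach, the sketch below propagates an initial failure through a small dependency map. The asset names, the dependency structure and the failure rule are all hypothetical: the rule assumes that an asset fails as soon as any asset it depends on has failed, which is far simpler than the models reviewed in the studies cited above.

```python
def cascade(depends_on, initial_failures):
    """Return the set of failed assets once the cascade stops spreading.

    Assumption (illustrative only): an asset fails as soon as any asset
    it depends on has failed.
    """
    failed = set(initial_failures)
    changed = True
    while changed:
        changed = False
        for asset, suppliers in depends_on.items():
            if asset not in failed and any(s in failed for s in suppliers):
                failed.add(asset)
                changed = True
    return failed


# Hypothetical cross-sector dependencies: each asset lists what it relies on.
depends_on = {
    "substation": [],
    "water_pump": ["substation"],
    "rail_signalling": ["substation"],
    "treatment_works": ["water_pump"],
    "road_depot": [],
}
failed = cascade(depends_on, {"substation"})
```

Here the loss of one energy asset disables water and rail assets in turn, while the road depot, which has no dependencies in this toy map, is unaffected.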

Resilience engineering in infrastructure modelling

The ability to model energy and water supply and distribution systems, transport networks, and the interdependencies between these sectors gives planners and modellers an opportunity to observe how these systems-of-systems would respond to unexpected shocks and changes given certain levels of demand or disruption, as well as interpreting the impact of new and disruptive technologies such as smart grids (Martins et al. 2017; Zoppi et al. 2017), hybrid heating technologies (Clegg and Mancarella 2018) or electric and automated vehicles (Wardziński 2008).

Different modelling approaches are appropriate in different contexts, and the range of approaches across the energy and transport sectors in particular is quite diverse (Hickford et al. 2017). A number of example modelling applications are summarised in Table 2.

Table 2 Examples of modelling approaches used in resilience engineering

The examples given in Table 2 show the diversity of the models currently being used to assess resilience and the impact of resilience engineering, with examples across the range of Ouyang’s classification (2014). While there is diversity in modelling techniques, a common feature of many of these studies is that they sit within a framework which helps define the scale of the problem and potential solution space. A range of frameworks identified from the literature is discussed in the next section.

Applying frameworks to assess system resilience

Conceptual frameworks can be useful at a planning stage, as well as offering a methodology for monitoring outcomes using a variety of metrics. As with the modelling techniques, the types of frameworks, methodologies and metrics are diverse, but they tend to emerge from the four principles of resilience defined in Sect. 2: anticipate, absorb, adapt and recover.

For example, public transport systems are vulnerable to the threat of terrorism, especially given the large numbers of people that are often confined in low-security areas. Cox et al. (2011) focus on how a system recovers after disruption, using operational metrics to assess resource allocation and to assist in designing security and recovery strategies. This framework is applied to a case study example of London’s transportation system. Pant et al. (2016) used their network and interdependency model to devise a vulnerability assessment framework, representing critical infrastructures as complex interdependent social-technological systems. By assessing the vulnerability of a system, this framework gives insight into how the system would need to adapt given particular disruption types. It is applied to the rail network in Great Britain to examine the potential impact on rail travel of infrastructure failure and flooding.

Similar approaches have been adopted in the energy sector. For example, the Adaptation and Resilience in Energy Systems (ARIES) programme in the UK provided a risk framework to assess the resilience of energy systems and to ensure a balance between changing patterns of demand and supply. The framework helps to identify how energy providers can best anticipate the physical and economic impacts of climate change on current and new energy generation technologies, and provides a range of adaptation options (ARCC 2018). Ji et al. (2017) consider a range of modelling approaches and metrics which could provide insight into how disruptions or damage to the energy distribution network caused by adverse weather events might be limited spatially (i.e. absorb), and how services could be quickly restored (i.e. recover).

Labaka et al. (2015) build on this approach by also considering the different types of resilience (technical, organisational, economic, social) in a ‘holistic resilience framework for Critical Infrastructures’ which aims to help determine how actors should respond resourcefully in emergency situations, with a case study application given for a Southern European nuclear power station.

Some frameworks support decision-making at the design and planning stage, helping to anticipate any issues that may arise during the operation of infrastructure. For instance, Lin and Gerber (2014) developed a framework based on multidisciplinary design optimisation to provide rapid iteration with performance feedback. This is ‘designing in performance’, a concept also put forward by Ajah (2009) using the FRAME concept (Flexibility, Reliability, Availability, Maintainability and Economics), which incorporates performance indicators at the design stage of energy infrastructure. An ‘anticipatory’ analytical planning framework is put forward by Hellström (2007) who suggests that system disruption can be caused by design flaws becoming incrementally embedded in a system, and considers how to mitigate system vulnerability and the potentially systemic and disruptive nature of technological change.

In the water sector, frameworks have tended to be developed around the recovery process. For example, a water distribution system ‘resilience index’, based on demand, capacity and water quality has been used to assess the functionality (and recovery) following an extreme event. Scenario events were applied in a case study in the small town of Calascibetta, Sicily, to determine how different stresses impact the local distribution network, and how quickly the network recovers (Cimellaro et al. 2015). Cost effectiveness is also a common objective in water distribution system design. An example ‘reliability index’ combines cost minimisation with a focus on water quality, as part of multi-objective optimisation to assess water distribution systems, applying a scenario-based case study in the town of Jahrom, Iran (Shokoohi et al. 2017).

In general, however, literature relating to design, planning and use of water infrastructure tends to focus on developing countries, where water quality and distribution systems are likely to be relatively poor. Engaging stakeholders in decision-making can help develop a future vision for infrastructure development, such as the strategy development aiming to improve the commitment to water resources in Latin America and the Caribbean (Miralles 2014). The stakeholder engagement also sought to address issues relating to future climate change and the potential impact of El Niño, with case studies in the region considering a range of topics including fire, flood, landslides and droughts, with each intervention or measure needing to be adapted to the particular context (CAF 2013, 2014, 2016).

The example framework studies discussed above reflect the wide range of approaches used across infrastructure sectors, although there are usually elements of the four principles of resilience evident in the framework development. Design and planning frameworks tend to be predictive, and can aid in anticipating disruptive events, while empirical approaches tend to revolve around how a system recovers and adapts.

Quantifying infrastructure resilience

Quantifying system resilience is an important step towards assessing system change, aiming to capture the complex behaviour of interdependent infrastructures, and assess changing performance following disruption (Sansavini 2016). Various approaches have been used, including probabilistic, graph theory, fuzzy inference, and analytical methods (see for example Cox et al. 2011; Turnquist and Vugrin 2013; Huang et al. 2014; Cimellaro et al. 2015, 2016; D’Lima and Medda 2015; Ibanez et al. 2016; Zimmerman et al. 2016; Nan and Sansavini 2017; Ouyang 2017; Zhang et al. 2018). The metrics identified in these studies are understandably varied, given the range of frameworks and models that they support.

In general, metrics tend to apply to a range of measurable impacts on networks, based on loss of functionality and recovery. For example, Turnquist and Vugrin (2013) adopt three distinct measures to assess system resilience: systemic impact (SI) to measure the level of disruption, total recovery effort (TRE) based on costs of recovery, and resilience-enhancing investments (REI) which are the costs incurred to improve and adapt the system beyond the original resilience capacity. Ganin et al. (2016) use critical functionality (CF) to quantify resilience. CF is defined as “a metric of system performance set by the stakeholders, to derive an integrated measure of resilience”, and hence can be adapted to different contexts. Nan and Sansavini (2017) suggest that resilience cannot be adequately addressed considering one single system capability, and aim to identify resilience capabilities based on absorptive, adaptive and restorative capacities.
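One simple way to turn a functionality record into a single integrated measure is to take the normalised area under the curve. The sketch below is an assumption made for illustration, not the formulation of any of the studies cited above: functionality is expressed as a fraction of the nominal level, the samples are hypothetical, and the area is approximated with the trapezoidal rule.

```python
def integrated_resilience(times, functionality):
    """Normalised area under a functionality curve (trapezoidal rule).

    functionality is assumed to be a fraction of the nominal level, so an
    undisturbed system scores 1.0 and deeper or longer losses score lower.
    """
    if len(times) != len(functionality) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (functionality[i] + functionality[i - 1]) * width
    return area / (times[-1] - times[0])


# Hypothetical record: functionality halves after a disruption, then recovers.
times = [0, 1, 2, 3, 4]
cf = [1.0, 0.5, 0.5, 1.0, 1.0]
score = integrated_resilience(times, cf)  # 0.75 for this record
```

Because the score depends only on the sampled curve, the same calculation can be applied to whichever performance measure stakeholders choose, which is the flexibility that the CF definition is intended to provide.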

Leu et al. (2010) use graph theory tools to assess the impact of disruption to transport networks, including nodal connectivity and spatial distribution of risk. In general, though, transport resilience metrics tend to be related to reliability, based on journey times or efficiency of the network, and how the network responds to disruption. Tang and Heinimann (2018) utilise the ‘R4 Resilience Triangle’ (Robustness, Redundancy, Resourcefulness, and Rapidity) (Bruneau et al. 2003) to assess the impact of congestion based on spatial–temporal traffic patterns. Ganin et al. (2017) assess the impact on traffic flow of disruption, measuring traffic delays and the availability of alternative routes. The metric adopted by both Cox et al. (2011) and D’Lima and Medda (2015) is based on passenger journey data which is used to estimate the time to recovery after a disruption on the London Underground.

Recovery time is also prominent in metrics related to the access and quality of water in post-disaster situations (Cimellaro et al. 2015). Francis and Bekera (2014) propose an extension of the functional metrics relating to recovery time, by including aspects of resilience capacity (see Fig. 1). In energy systems, a similar approach to network disruption and recovery is used. For example, Panteli et al. (2017) consider metrics based on the rapidity and extent of loss of performance, together with the recovery rate in order to assess the resilience of power systems.
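Metrics of this kind can be read directly from a performance record. The following sketch is illustrative only and does not reproduce the definitions used by Panteli et al. (2017): the function name, the returned keys and the sample data are hypothetical, and the record is assumed to contain a single disruption that starts and ends at the nominal performance level.

```python
def curve_metrics(times, performance):
    """Summarise one disruption: depth of loss, speed of loss, speed of recovery.

    Assumes performance starts at its nominal level, falls to a single
    minimum (possibly a plateau) and returns to nominal within the record.
    """
    nominal = performance[0]
    lowest = min(performance)
    if lowest >= nominal:
        raise ValueError("no loss of performance in this record")

    first_low = performance.index(lowest)
    last_low = len(performance) - 1 - performance[::-1].index(lowest)

    # Last sample still at nominal before the drop begins.
    start = max(i for i in range(first_low) if performance[i] >= nominal)
    # First sample back at nominal once recovery is complete.
    end = next(
        (i for i in range(last_low, len(performance)) if performance[i] >= nominal),
        None,
    )
    if end is None:
        raise ValueError("performance never returns to its nominal level")

    loss = nominal - lowest
    return {
        "extent_of_loss": loss,
        "degradation_rate": loss / (times[first_low] - times[start]),
        "recovery_rate": loss / (times[end] - times[last_low]),
    }


# Hypothetical record: performance in percent, sampled hourly.
times = [0, 1, 2, 3, 4, 5, 6]
performance = [100, 100, 40, 40, 70, 100, 100]
metrics = curve_metrics(times, performance)
```

For this record the system loses 60 percentage points within one hour and regains them over two hours, so the degradation rate is twice the recovery rate.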

As well as measurable impacts, there are also more comparative (or relative) metrics, such as weighting some aspects of infrastructure networks as more influential or critical than others in a matrix-based Analytic Network Process (Huang et al. 2014). These are not specific measures of infrastructure resilience, but can provide insights into which aspects of interdependent infrastructure networks have greater influence and importance given certain criteria.

While a number of quantification methods have been used, they are in general incomplete and cover a very narrow field of applications, possibly due to the diverse range of definitions and contexts for resilience, which vary among researchers and disciplines. This may also be a consequence of the dominance of traditional mathematical approaches in systems engineering, such as probabilistic methods or graph theory, which may not be entirely appropriate for resilience engineering. Further insights are, therefore, required in this area, and other techniques, such as entropy theory, may yield better, more complete assessments of resilience (Tamvakis and Xenidis 2013), as discussed below.

Summary

A wide range of modelling approaches, frameworks built around those models, and quantification methods exist in the field of resilience engineering. However, there are some similarities between different frameworks based on the underlying principles of resilience. Predictive models and frameworks tend to relate to planning and design, with a focus on how a system might anticipate future disruptions, while empirical frameworks and models are more focused on how the current system responds to disruption, how it can adapt, and how quickly it can recover back to the original state.

Resilience engineering in research and practice

Overview of research studies and practical applications

As stated earlier, much of the existing literature focuses on the theory of resilience, safety management or non-infrastructure systems. However, there are research programmes and national strategies which aim to place resilience engineering and related practices at the centre of infrastructure planning and design and increase the resilience of critical national infrastructure. A selection of these cross-sector and sector-specific programmes and studies are summarised below. Greater detail on these projects and studies is given in Hickford et al. (2017), and further resources are available on the project websites given below.

Knowledge-sharing and best practice demonstration is a common theme among research studies involving resilience engineering. This paper has emerged from one example of such knowledge sharing, The Resilience Shift programme (The Resilience Shift 2018) which aims to identify and promote best practice in resilience for infrastructure engineering and design. There are many other examples of projects focused on best practice or collaboration between disciplines. For instance, in Europe, recent EU-funded projects INTACT (2018) and RESILENS (2018) have focused on demonstrating methods and tools that will help advance the resilience of critical infrastructure, including the creation of web-based wiki-tools. A similar web-based approach emerged from the International Risk Governance Council (IRGC 2018), an independent non-profit foundation providing insight into systemic risks that have impacts on society. IRGC has developed a web-based Resource Guide on Resilience for researchers and practitioners. The guide is a collection of authored pieces reviewing existing concepts, approaches and illustrations or case-studies for comparing, contrasting and integrating risk and resilience, and for developing resilience.

Knowledge-sharing was also evident in DARWIN (2018), part of the Horizon 2020 research programme, which focused on improving the effectiveness and adaptability of responses to natural disasters (e.g. flooding, earthquakes) and man-made disasters (e.g. cyber-attacks) by developing and sharing resilience management guidelines aimed at all stakeholders in critical infrastructure management. In the UK, the EPSRC-funded ARCC Network (Adaptation and Resilience in the Context of Change) has aimed to give policymakers and practitioners the best evidence on resilience and adaptation in the built environment and infrastructure sectors, integrating knowledge with uptake of research outputs (ARCC 2018).

Further understanding in design, innovation, efficiency and resilience of interdependent critical infrastructure systems was the aim of Critical Resilient Interdependent Infrastructure Systems and Processes (CRISP) in the US, which fostered an interdisciplinary research community to create new approaches and engineering solutions for infrastructure design and operation. A multi-disciplinary collaborative approach was adopted for Future Resilience for African Cities and Lands (FRACTAL) in southern Africa, which aimed to advance and integrate scientific knowledge about regional climate responses to human activities, enabling climate-sensitive decisions at the city-regional scale (particularly decisions relating to water, energy and food with a lifetime of 5–40 years) (FRACTAL 2018).

In addition to sharing knowledge, there are a number of research programmes or institutions which also generate novel research into resilience in infrastructure, providing tools and evidence to inform decision-makers about investments in infrastructure, and the longer term impacts of those decisions. For example, in the UK the National Infrastructure Commission (NIC) has been created as an independent organisation providing the UK government with impartial, expert advice on major long-term infrastructure challenges. Its publications have included assessments of infrastructure planning for smart energy systems and transport systems, and of the impact of technologies on such systems. One of its ‘policy insights’ is that cross-sector planning can help policymakers recognise the resilience implications for the entire infrastructure network (National Infrastructure Commission 2018a).

The NIC outputs have been informed by a number of studies, for example by the Infrastructure Transitions Research Consortium (ITRC), an EPSRC-funded collaborative programme developing models to investigate long-term planning and risk evaluation for the design of critical national infrastructures. The second phase of the ITRC programme, Multi-Scale Infrastructure Systems Analytics (MISTRAL), aims to enhance these modelling capabilities to inform infrastructure decision-making across scales, from local to global (ITRC 2018), and will continue to provide evidence for the NIC. Such programmes require large computing power and resources, as the shift towards big-data science in resilience engineering continues (Sala 2015). In the UK, modelling and visualisation capabilities have been enhanced by the creation of the Data and Analytics Facility for National Infrastructure (DAFNI), a secure data storage and high-performance computing facility (DAFNI 2018).

The long-term aim of these analytical frameworks and tools is to provide evidence for decision-making at a national or international scale. This in turn leads to national programmes and plans to provide resilient and cost-effective infrastructure. The NIC is one such body which provides evidence to the UK government, but there are many other programmes and plans in place globally, and examples from Japan, Australia and Canada are given below.

Japan’s National Resilience initiative has been set up in response to the natural and nuclear disasters of 2011. The initiative includes the ‘Fundamental Plan for National Resilience’ aimed at building resilience in critical energy, water, transport and other lifeline infrastructures (National Resilience Promotion Office 2015; DeWit 2016). Disaster-resilient renewable energy systems have been among the largest markets in Japan’s private-sector spending since 2011. Other core markets include earthquake-proofing of infrastructure, reinforcement of transport systems, disaster-relief robotics, communications resilience, and training of specialist leadership.

In Australia, the Critical Infrastructure Resilience Strategy has been created, aiming to ensure the continued operation of critical infrastructure in the face of hazards (Australian Government 2015). There are four outcomes from this strategy that could deliver more resilient infrastructure: (1) business-government partnerships, ensuring information sharing and collaboration on risk and resilience initiatives; (2) risk management of the operating environment, aiming to increase sectoral and cross-sectoral understanding of critical infrastructure assets or networks; (3) risk-based strategic understanding and management; and (4) an understanding of organisational resilience, building capacity within organisations for unexpected events.

Canada’s Action Plan for Critical Infrastructure, and the associated National Strategy (Public Safety Canada 2013), recognise that responsibilities for critical infrastructure in Canada are shared by federal, provincial and territorial governments, local authorities, and critical infrastructure owners and operators. One of the key aspects of the strategy is knowledge sharing via the National Cross Sector Forum, which links critical infrastructure operators, who can inform the development of comprehensive emergency management plans, with government bodies, who have information on risks and threats relevant to operators.

A range of other national frameworks for resilience are set out in the OECD’s (2015) review, which identifies whether the particular policy drivers relate to economy, society, institutions, environment, natural disaster, or a combination of these factors, and the role that local government and cities are likely to play in the policy framework.

Summary

While much of the literature focuses on theoretical aspects of resilience engineering, the topic is central to many emerging national strategies and to national and international research programmes. Knowledge-sharing, best-practice demonstration and interdisciplinary collaborations can all help to promote the core ideas of resilience engineering in infrastructure design and operation. Such collaborations are providing tools and evidence to inform decision-makers about investments in infrastructure, and the longer-term impacts of those decisions.

Opportunities and potential barriers

Very few organisations or infrastructure providers are explicitly using resilience engineering as part of their safety or business management philosophy, nor are they systematically integrating principles of resilience engineering into management routines. This can become a bottleneck for the evolution of resilience engineering, as theory building would benefit from the observation of experiences of large-scale ‘building in’ of resilience engineering by an infrastructure provider (Righi et al. 2015). Tamvakis and Xenidis (2013) note that “current methods [of resilience quantification] are mostly incomplete and largely dependent on concepts and approaches which emanate from other well-established and well-elaborated methodological frameworks, thus failing to provide solutions in the context of resilience engineering”.

Further to this lack of evidence of current practice, there are issues around the wide diversity of definitions and frameworks, which adds potential confusion during infrastructure planning and design. The Resilience Shift programme, together with the examples of other related programmes set out in Sect. 4, suggests that there is a movement towards a more standardised approach which will help put resilience engineering at the centre of the future of infrastructure planning and design.

This paper has highlighted the numerous approaches to modelling of infrastructure and related systems, but the accuracy and relevance of such models are dependent on good quality data. This may not be too problematic for developed countries, but infrastructure planning is also crucial in the developing world and in post-disaster and post-conflict situations, and reliable data can be much harder to acquire in such contexts. It may be beneficial to find approaches which can provide useful and usable outputs while relying on minimal data.

Another potential barrier is the lack of coordination in governance, planning and delivery (‘silo-based’ thinking), especially for decisions about assets with long build times and asset lives. This can cause problems in the rural environment in particular (Freeman and Hancock 2016). Greater collaboration between government, industry, not-for-profits and communities would help alleviate these problems, as would acknowledgement of interdependencies and cross-sector approaches to planning, design and implementation. Some of the programmes discussed above are designed to achieve these aims, but such collaboration may still be difficult to realise in practice.

That said, some outcomes from this review offer insight into future opportunities. For instance, Aktan et al. (2013) suggest that future civil engineers involved in the planning and design of critical infrastructures will need access to various classes of operating infrastructure to study and experiment with in the context of coordinated, multi-discipline, problem-focused field research. They suggest that these engineers will need access to actual infrastructures through ‘living infrastructure laboratories’, part of an academe-industry-government partnership that includes infrastructure stewards as champions for innovation. Such laboratories would provide best-practice demonstrations for performance-based engineering and lifecycle asset-management of infrastructures. An early infrastructure-based example of this idea from the 1990s was the Instrumented City project (Chen and Bell 2002), whereby traffic data from various locations in Leicester and Nottinghamshire was collected and analysed from a central location, to help inform numerous studies, particularly of city-centre traffic management and air quality issues.

This idea of monitoring a city or region has evolved, particularly as mobile phone use and computing power have increased, and Smart Cities are now an established concept, using information technology and communication systems to gain an understanding of how infrastructure is used. Cosgrave et al. (2013) suggest that Smart Cities can be considered as an “information marketplace” made up of a combination of Living Labs (where the city is used as a real-world testing ground for new ideas and technologies) and Innovation Districts (small areas of growth, usually made up of mostly start-up companies and creative industries). Living Labs are often pre-planned and structured, with clear aims and a focus on product development. This aligns well with the concept of resilience engineering, since it allows monitoring of how well infrastructure copes with stress and disruptions in the real-world environment, although reviews of case study examples suggest that Living Labs are still an emerging research area (Pieter and Dimitri 2015; Santonen et al. 2017).

Another interesting concept yet to be fully explored is entropy theory, one of the themes of the research of Tamvakis and Xenidis (2012, 2013). They assert that entropy can be considered as a measurable system property synonymous with resilience, since both describe some aspect of disorder or uncertainty or lack of information about the configuration of the separate modules of a system. In their view, a methodological framework based on entropy theory better captures the underlying interrelations of these system modules, and so provides a more appropriate and effective framework for quantifying the resilience of infrastructure systems than other resilience quantification methods (such as probabilistic, graph theory, fuzzy inference and analytical methods), especially given that entropy is directly and explicitly measurable in a single metric. As yet, though, no such frameworks have been developed and put into practice in the realm of resilience engineering and infrastructure.
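
To illustrate what a 'single metric' of uncertainty about module configuration could look like, the following minimal sketch computes the Shannon entropy of the observed states of a set of system modules. It is an illustration only and is not the framework proposed by Tamvakis and Xenidis: the choice of Shannon entropy, the function name and the module state labels ("up", "degraded", "down") are all assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(states):
    """Return the Shannon entropy (in bits) of the empirical
    distribution of the observed module states.

    `states` is a hypothetical list with one state label per module.
    """
    counts = Counter(states)
    total = len(states)
    # H = -sum(p * log2(p)) over the states that actually occur
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical snapshots of a four-module system
all_operational = ["up", "up", "up", "up"]
mixed_condition = ["up", "down", "up", "down"]

h_ordered = shannon_entropy(all_operational)   # no uncertainty about module state
h_mixed = shannon_entropy(mixed_condition)     # two equally likely states
```

Under these assumptions the fully operational snapshot has zero entropy, while the evenly split snapshot has an entropy of one bit. How such a value would map onto the resilience of a real infrastructure system is precisely the open question noted above.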

Conclusions

This paper has reviewed a large quantity of literature which describes studies of resilience engineering in a range of contexts, and has also discussed a number of national and international programmes which aim to increase awareness, cooperation and knowledge sharing on the resilience of critical infrastructure systems. It is unfortunate (although perhaps unsurprising) that at this stage much of the literature is focused on the theory of resilience, safety management or non-infrastructure systems, which has limited the opportunities to review resilience engineering of interdependent infrastructure systems in practice.

However, such theoretical work can still provide a good background to the potential future options for resilience engineering in infrastructure, and numerous papers and studies have been considered here to this end. These papers illustrate the range of existing and proposed approaches to the application of resilience engineering in the three critical infrastructure sectors of energy, water and transport, and thereby highlight the lack of consistency in resilience engineering at present. Still, there are also commonalities in the approaches used based around the four main principles of resilience: anticipate, absorb, adapt, recover. These commonalities could help focus future research questions and research directions in a field which has been identified by industry stakeholders as a key research priority area (LRF 2015).

A number of key barriers have been identified during this work which may pose difficulties in achieving a consistent and widespread application of resilience engineering across infrastructure sectors. In addition to a lack of consistency in the definitions and approaches used, these include the absence of large scale implementations to allow benchmarking and practice-based learning, difficulties in obtaining the required data (particularly in developing contexts), a lack of coordination in infrastructure governance, planning and delivery, and difficulties in transferring theoretical knowledge from academia to practitioners.

Unlike other engineering disciplines, resilience engineering has emerged through academia rather than through the experience and knowledge of engineers and planners. This is a potential barrier, and this review has identified that currently very few organisations or infrastructure providers are explicitly using resilience engineering as part of their safety or business management philosophy. For this to change, further research into the implicit methods and practices in place throughout infrastructure planning should be undertaken.

However, the fact that the field of practical resilience engineering is still in its infancy presents very significant opportunities to get things right, particularly in a context where coordinated planning and decision-making for infrastructure systems is becoming more common (for example through bodies such as the UK National Infrastructure Commission). Further work should be carried out to identify best practice in infrastructure planning, design, operation and governance, and to continue developing quantitative measures of resilience based around the four main principles of resilience. Taking advantage of these opportunities would assist in transferring potentially transformative ideas around resilience and performance-based engineering from theory into practice, and help deliver resilient infrastructure systems which are better able to meet the demands of the economy and society in an ever more uncertain world.