1 Introduction

Resilience management aims to increase the ability of a system to respond to adverse events (e.g., Linkov et al. 2014; Linkov and Trump 2019). It widens the perspective of traditional reliability and risk-management approaches, so that besides identifying and reducing vulnerabilities, it also considers the recovery and adaptability of the system (Folke 2006; Manyena 2006; Linkov et al. 2014; de Bruijn et al. 2017). It also emphasizes continuing to perform the critical functions that are needed to keep the system working (Fox-Lent et al. 2015). In recent years, resilience management has gained much attention in many fields, including social–ecological system management (Ruiz-Mallén and Corbera 2013), natural resource management (Brown and Williams 2015), and organizational management (Annarelli and Nonino 2016).

In this paper, we develop and apply a structured resilience management framework for assessing the resilience of watercourse regulation and, in particular, the operational management of water reservoirs. The main research questions are how the application of the resilience matrix approach (Linkov et al. 2013a) can help to identify measures to increase the resilience of the operative watercourse regulation process, and what the best practices are for applying the approach in this context. The topic is particularly interesting in Finland, where lakes cover 33,500 km², which is approximately 10% of the inland area. There are altogether 242 watercourse regulation permits, and the lakes whose water levels are affected by these cover approximately one-third (10,000 km²) of the total lake area. The original motivation for regulation has typically been hydropower production or flood prevention, but nowadays other objectives, such as recreation and the state of the aquatic ecosystem, are increasingly considered in operative decision-making.

The focus of this paper is on reservoir operation, which is an issue of emerging concern, as flood damage is predicted to increase due to climate change and population growth (Raje and Mujumdar 2010; Watts et al. 2011; Wilby and Keenan 2012). There is already much research on how to optimally operate a dam in the case of natural hazards causing unwanted water levels (e.g., Kotzee and Reyers 2016; de Bruijn et al. 2017; Opdyke et al. 2017). However, there can also be threats related to human-caused incidents as well as to workforce/infrastructure, which can lead to non-optimal decisions on controlling the flow at the dam (van Leuven 2011). Together with already difficult water conditions, these threats can lead to severe negative impacts if not dealt with appropriately. The recent trend of digitalization has provided excellent means to automate many phases of the process, which, however, has simultaneously increased the dependence on the undisturbed functioning of the systems (e.g., Rajasegarar et al. 2008; Paul et al. 2018). In extreme cases, the related threats can even trigger flooding, for example, if erroneous measurements of water levels lead to the adjustment of the water flow in the wrong direction. The particular focus of this paper is to consider how to prepare for the fundamental reasons behind various threats with the aim of increasing the resilience of reservoir operation.

There is a clear need for systemic approaches to manage resilience, as systems typically consist of numerous different elements interacting with each other. Various approaches have been introduced to enable structured and transparent analysis of threats (see, e.g., Ganin et al. 2016; Sharifi et al. 2017). This paper focuses on the use of the resilience matrix approach of Linkov et al. (2013a), which provides a unifying framework to assess system resilience, and guidelines for assessing metrics that need to be developed and combined to measure overall system resilience. The matrix was originally developed to understand how the doctrine of network-centric warfare applies to disaster resilience, but it has also been applied in other contexts such as cyber systems (Linkov et al. 2013b), community resilience assessment (Fox-Lent et al. 2015), and coastal system resilience (Rosati et al. 2015).

The paper is structured as follows. Section 2 describes the characteristics of the operational decision-making process of watercourse regulation, and discusses how to increase the resilience of this process by applying the resilience matrix approach to identify and analyze the threats in various phases of the process. Section 3 presents the actual resilience matrix applied to the watercourse regulation case and describes the process of creating the matrix. The applicability of the matrix is discussed in Sect. 4, and Sect. 5 concludes the paper.

2 Resilience in reservoir operation

2.1 The reservoir operation process

Making operative decisions on a dam is essentially a task of controlling the flow, so that the water levels and flows in different parts of the watercourse are balanced optimally. In spite of being “a one-dimensional problem” of only controlling the flow over time, decision-making on reservoir operation is not straightforward. The water-level variation has various impacts related to ecological, economic, and social objectives, and the desired water levels can vary among these objectives (e.g., Marttunen and Suomalainen 2005; Marttunen et al. 2006). For reviews of the cases dealing with multiple objectives in water management or in water systems, see, for example, Hajkowicz and Collins (2007), Hajkowicz and Higgins (2008), or Lai et al. (2008).

The threats related to reservoir operation can have an impact either on the inflow coming to the reservoir or on the outflow from the reservoir through the dam. The dam operator can only influence the latter, assuming that there are no regulating structures upstream. Van Leuven (2011) has grouped the possible threats related to water systems into three categories: (1) natural disasters, (2) human-caused incidents, and (3) workforce/infrastructure threats. Of these threats, natural hazards, such as extreme rainfall or drought, typically affect the inflow coming to the reservoir. Then, the task of the dam operator is to control the outflow, taking the inflow into account, so that the water level in the reservoir stays within the desired limits. However, the other two types of threats (i.e., those related to human-caused incidents or workforce/infrastructure) can directly affect the work of the dam operator, so that the outflow is controlled non-optimally or even harmfully. These threats are more diverse and they can have an effect in all the phases of the dam operation process. We have identified the following six main phases (i.e., the critical functions) of this process (Fig. 1):

Fig. 1 Phases of the decision-making process in the operational watercourse regulation. The numbers refer to the phases of the dam operation process mentioned in the text

  1. Observations on the watercourse (water level, flow, amount of water in snow, etc.)

  2. Registering the observations into the data management system

  3. Prediction of the water flows based on the observations and weather forecasts

  4. Decision about the flow based on the hydrological predictions (possibly including discussions with colleagues)

  5. Adjustment of the sluice gates

  6. Informing the other operators and population about new conditions

The phases of the process are chained, so that an impact on one phase is likely to also have an indirect impact on all the subsequent phases. For example, an error in data is likely to lead to an erroneous prediction, which can in turn lead to a non-optimal decision, and so on. In this respect, it is very important to consider not only the impacts of the threats on a single phase of the process, but also the interactions between the phases.
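
The chaining described above can be illustrated with a minimal Python sketch. The phase labels are shortened versions of the six phases listed in the text, and the helper name `affected_phases` is hypothetical. The sketch assumes a strictly linear chain, which is a simplification of the process in Fig. 1.

```python
# Shortened labels for the six phases listed in the text (labels are illustrative).
PHASES = [
    "observation",
    "data registration",
    "flow prediction",
    "flow decision",
    "gate adjustment",
    "informing",
]


def affected_phases(first_disturbed_phase: int) -> list[str]:
    """Return the disturbed phase and every subsequent phase.

    Phases are numbered 1..6 as in the text. The assumption here is that
    a disturbance always propagates to all later phases in the chain.
    """
    if not 1 <= first_disturbed_phase <= len(PHASES):
        raise ValueError("phase number must be between 1 and 6")
    return PHASES[first_disturbed_phase - 1:]


# A data error registered in phase 2 can indirectly affect phases 2-6.
print(affected_phases(2))
```

Under this assumption, a disturbance in phase 1 can reach the whole process, whereas a disturbance in phase 6 affects only the informing step.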

2.2 From risk management to resilience management

The information required to estimate the inflow to the reservoir includes estimating the expected rainfall, but also other variables such as the amount of the water in the ground. The related uncertainties can usually be presented as probability distributions. Consequently, the analysis of related risks can be typically carried out with traditional risk-management approaches, which take both the probability and the possible impact of the event into account (Bogardi and Kundzewicz 2002). In the context of watercourse regulation, risk-management approaches have been applied especially for evaluating flood risks (e.g., Plate 2002; Apel et al. 2004) and risks related to dam safety practices including surveillance, dam safety reviews, emergency preparedness, and operation and maintenance procedures (e.g., Hartford and Baecher 2004; Zhang et al. 2016). Based on these approaches, various kinds of systems have also been developed to provide automated control for the reservoir operation utilizing, for example, genetic algorithms (e.g., Chang and Chang 2001), multi-objective optimization (e.g., Dittmann et al. 2009), or dynamic programming (Li et al. 2010).

A weakness of risk-management approaches is that they can only deal with risks whose probability or impact can be estimated. However, in dam operation, there are also many threats that are unknown before their occurrence, and that can be realized either as sudden shocks or as stresses that slowly build up (Delaney et al. 2010; Park et al. 2013; Merz et al. 2015). They can relate to, for example, climate change, terror attacks, political instability, development of technology, or organizational culture. Risk-management approaches have also been applied to respond to these kinds of threats (e.g., Harrald et al. 2004; Danso-Amoako et al. 2012). However, more comprehensive approaches are needed to adequately estimate the impacts and probabilities of these threats due to the very complex interferences of the system (Park et al. 2013).

Resilience management aims to increase the resilience of the system by also considering unexpected threats and focusing on the system functionality (Park et al. 2013). It complements the risk-based approaches by identifying the critical functions of the system and changing the way of doing things, so that the functioning of the system can be assured regardless of the characteristics of the disturbance. In this respect, it is often more appropriate to use the term vulnerability of the system instead of risk-related terminology. The fundamental difference between these is that risk is a measure of the probability and severity of adverse effects, whereas vulnerability can be seen as “the manifestation of the inherent states of the system that can be subjected to a natural hazard or be exploited to adversely affect that system” (Aven 2011). With this interpretation, resilience can be seen as an ability of the system to cope with vulnerabilities caused by any events, including unknown ones. Roughly speaking, the design objective in risk management is to minimize the probability and extent of the failure and, in resilience management, the consequences of the failure (Park et al. 2013). Yet, the challenge is how to prepare for disturbances that are too complex to understand or impossible to anticipate (Merz et al. 2015).

2.3 Systematic frameworks for dealing with the system resilience

Various kinds of general frameworks have been developed to better understand the resilience and vulnerability of the system (e.g., Nelson et al. 2007; Folke et al. 2010; Butler et al. 2017). The principles and ideas of these frameworks have also been adapted to different fields to develop customized operational frameworks including, for example, natural resource management (Plummer and Armitage 2010; Bakkensen et al. 2017) or disaster resilience (Cimellaro et al. 2010). Furthermore, frameworks to deal with a specific sector, such as the marine sector to cope with climate change (Davidson et al. 2013) or agroecosystems (Cabell and Oelofse 2012), have been developed.

The fundamental idea behind structured approaches is to first open up the problem and enlarge the perspective by identifying the different elements of the problem and the links between them (divergent phase) (Montibeller et al. 2008; Franco and Montibeller 2010). Then, all the elements of the problem are combined with the aim of getting a comprehensive overall view of the problem (convergent phase). Besides process support, structured approaches can also provide means for other tasks of the process, such as elicitation of stakeholders’ preferences or creation of better alternatives (McDaniels 2019).

Many developed approaches have introduced lists of criteria or measures for characterizing resilience. For example, Sharifi and Yamagata (2016) suggest that any resilient system should entail the following characteristics: robustness, stability, flexibility, resourcefulness, coordination capacity, redundancy, diversity, foresight capacity, independence, connectivity and interdependence, collaboration capacity, agility, adaptability, self-organization, creativity and innovation, efficiency, and equity. Yet, many approaches also provide means for quantifying resilience capacity in terms of the defined measures (e.g., Angeler and Allen 2016; Platt et al. 2016; Quinlan et al. 2016; Bakkensen et al. 2017; Tran et al. 2017). At best, these can provide a transparent tool for assessing resilience and for considering the different aspects of it. On the other hand, measuring and monitoring of only a narrow set of indicators may reduce the understanding of system dynamics that is needed to apply resilience thinking (Quinlan et al. 2016).

Linkov et al. (2013a) have introduced a resilience matrix approach, which provides a mapping of system domains across an event management cycle of resilience functions (Table 1). The basic idea of the matrix is that to create resilience, achievement in all sectors of the system must be reached (Linkov and Trump 2019). This is done by systematically considering all types of the threats in different stages of the disruptive event management cycle.

Table 1 Resilience matrix of Linkov et al. (2013a) providing guidelines for resilience metrics that need to be developed and combined to measure overall system resilience

The columns of the matrix are based on the report of the National Academy of Sciences (NAS), which describes resilience as the ability to (i) prepare and plan for, (ii) absorb, (iii) recover from, and (iv) more successfully adapt to adverse events (National Research Council 2012). The rows of the matrix consist of the four domains of the network-centric warfare doctrine: (i) physical, (ii) information, (iii) cognitive, and (iv) social (Alberts and Hayes 2003). They were initially influenced by the advances of military theory, but the classification can be easily adapted to different disciplines of civil society, too (Fox-Lent et al. 2015). The cells of the matrix describe what is important for achieving the different dimensions of resilience, and, in this way, support a transparent connection between resilience policies and potential outcomes. The approach was initially designed to be applied semi-quantitatively as a guideline for selecting appropriate measurements to judge the functionality of a system from a broader perspective (Linkov and Trump 2019). However, it can also be applied in a more qualitative way by just identifying appropriate measures for increasing resilience, or more quantitatively by defining metrics and assessing the system performance with respect to these.

The process of assessing resilience with the matrix approach includes the following phases: (1) definition of the system boundary and range of threat scenarios under consideration, (2) identification of the critical functions of the system to be maintained, (3) selection of the indicators and generation of scores for system performance in each cell for each critical function, and (4) aggregation of the matrices to create an overall resilience rating (Fox-Lent et al. 2015).
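
Phases (3) and (4) of this process can be sketched in Python as follows. The domain and stage names come from the matrix described above. The 0 to 1 score scale, the helper names, and the use of an unweighted mean for aggregation are assumptions made for illustration only; they are not taken from Fox-Lent et al. (2015), who may aggregate differently.

```python
from statistics import mean

# Rows and columns of the resilience matrix as described in the text.
DOMAINS = ("physical", "information", "cognitive", "social")
STAGES = ("prepare", "absorb", "recover", "adapt")


def empty_matrix():
    """Create a 4 x 4 matrix with no scores yet (None marks an unscored cell)."""
    return {domain: {stage: None for stage in STAGES} for domain in DOMAINS}


def aggregate(matrices):
    """Combine the matrices of several critical functions cell by cell.

    Assumption: each score lies in [0, 1] and cells are combined with an
    unweighted mean. Cells that are unscored in every matrix stay None.
    """
    overall = empty_matrix()
    for domain in DOMAINS:
        for stage in STAGES:
            scores = [m[domain][stage] for m in matrices
                      if m[domain][stage] is not None]
            overall[domain][stage] = mean(scores) if scores else None
    return overall


# Hypothetical scores for two critical functions.
observation = empty_matrix()
observation["physical"]["prepare"] = 0.5
prediction = empty_matrix()
prediction["physical"]["prepare"] = 1.0

overall = aggregate([observation, prediction])
print(overall["physical"]["prepare"])
```

Keeping unscored cells as `None` rather than zero avoids penalizing a critical function for cells that are simply not relevant to it.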

In recent years, systematic resilience assessment approaches have been increasingly used to assess the resilience of water-management practices in the context of watercourse regulation. According to Simonovic and Arunkumar (2016), resilience-based approaches are powerful tools for selecting proactive and adaptive responses of a multipurpose reservoir to a disturbing event that cannot be achieved using the traditional measures. Merz et al. (2015) recognize that surprise is a neglected element in flood risk assessment and management, and discuss possible approaches to better understanding the complexity of flood risk systems and cognitive biases in human perception. Much of the earlier work has focused on the physical risks to dam security (e.g., Isomäki et al. 2012), but in recent years, the role of societal issues has also been emphasized (e.g., Molarius et al. 2015). For example, Koks et al. (2015) present an approach for evaluating flood risk-management strategies based on the joint assessment of hazard, exposure, and social vulnerability. In spite of the increasing interest in systematic resilience assessment approaches, to our knowledge, there are no studies related to watercourse regulation that systematically assess the different dimensions of resilience in different phases of the disruptive event management cycle. To fill this gap, this paper presents a framework based on the resilience matrix of Linkov et al. (2013a) to assess the resilience of the operational management process of water reservoirs.

3 Resilience matrix for reservoir operation

3.1 The process for creating and applying the resilience matrix

The main aim of this paper is to support reservoir operators and supervisors of watercourse regulation in their task of assessing the resilience of reservoir operation by means of a general framework. The other aim is to evaluate the applicability of the framework in the identification of actions to improve the resilience of the reservoir operation process (see Fig. 1). Compared to the earlier applications of the resilience matrix in different fields of society (e.g., Roege et al. 2014; Fox-Lent et al. 2015; Rosati et al. 2015; Zussblatt et al. 2017), this is a quite specifically defined operational process. However, according to Fox-Lent et al. (2015), the resilience matrix approach is scalable to any size of system, which motivated us to also test its applicability to this kind of process.

In Finland, all 242 watercourse regulation permits are supervised by 13 Regional Centres for Economic Development, Transport and the Environment (ELY Centres) (SYKE 2015). This number includes 48 state-owned permits, which are also operated by ELY Centres. Therefore, we actively involved representatives of the ELY Centres in the development and testing of the resilience matrix approach. The project group included water-management engineers and system analysis experts from SYKE (Finnish Environment Institute). In addition, the knowledge of SYKE experts in other fields (such as hydrology and flood risk management) was utilized in the planning process.

The process of creating and testing the resilience matrix is presented in Fig. 2. First, we defined the system boundaries of our case to be the reservoir operation including all the phases of the operational decision-making process on the dam. The considered threats include threats from all the three threat types of van Leuven (2011), i.e., those related to natural hazards, human-caused actions, and workforce/infrastructure. The focus is on those threats that can have an effect on the outflow from the lake. Thus, we do not explicitly consider the natural hazards affecting the inflow (such as excessive rain) as threats, but treat these as boundary conditions.

Fig. 2 The process of creating and testing the resilience matrix for watercourse regulation

The realization of the possible threats to the system may lead to either too high or too low water levels/flows. The most severe damage typically occurs in severe flood situations. Thus, to make the possible consequences of the threats more concrete, we assume that the water level of the regulated lake or river is already high when the threat is realized. Then, the time frame to respond to the threat is also much shorter, which emphasizes the need for careful preparation for the threats. However, in the case of low water levels, the fundamental reasons leading to the realization of the threat are usually the same as in the case of too high a water level. For example, in a high water-level situation, the most harmful position of a seized sluice gate is typically “fully closed”, whereas in the case of a low water level, it is “fully open”. Nevertheless, the reason leading to the seized gate (e.g., power shortage) can be the same in both cases. Thus, the analysis of the threats themselves can be generalized to also include low water-level situations, even though the impacts are then opposite.

The next step of the process is to identify the possible actions for preparing for each of the identified threats. We did not find it reasonable to create separate indicators for measuring the system performance, as the level of implementation of the actions can be seen as an indicator of how well one has managed to prepare for the threat. In this respect, the process was somewhat different from the one presented in Fox-Lent et al. (2015), whose phases three and four can be combined in our process into a general phase of assessing the resilience. Hence, the phases of our process include:

  1. Definition of the system boundaries and range of threat scenarios under consideration.

  2. Identification of the critical functions of the system to be maintained.

  3. Defining the criteria and questions for assessing resilience in the case of operational watercourse regulation.

The critical functions of our system are the different phases of the operative decision-making process (Fig. 1). In each phase, the focus is on those issues related to that particular phase, but a successful operation of each phase also requires that there have not been any problems in the preceding phases. For example, a measurement error caused by malfunctioning equipment can lead to inaccurate water level and flow predictions, and consequently, to poor decisions made by the reservoir operator. We consider this to be fundamentally a physical threat of phase 1, and thus, the possible actions related to this threat are listed under physical issues. On the other hand, this threat can be a trigger for some other threats in the following phases of the process. For example, there can be a cognitive issue of not noticing clearly wrong information due to lack of regulation experience, which would be correspondingly dealt with under cognitive issues.

The identification of possible threats included research of the literature and interactive collaboration with experts. The study was part of the “From Failand to Winland” project (https://winlandtutkimus.fi), in which two expert workshops had been arranged beforehand for identifying threats related to the water security in general. We utilized the results of these workshops when creating a list of preliminary threats for operational regulation. This list was then used as a basis in Workshop 1 that was arranged in conjunction with the annual meeting of the people working in the field of reservoir operation. The workshop, which included interactive group work, was held to identify any new threats missing from the preliminary list and to discuss these threats. On the basis of this workshop, we modified our list of possible threats.

In the next phase, we created an e-mail questionnaire for the reservoir operators and the supervisors of the watercourse regulation projects. In the questionnaire, they were asked to evaluate how important it is to prepare for each threat. For each threat, they were also asked to describe a situation in which the threat could occur, to get a more concrete view of the threats in practice. The respondents were also requested to identify the 3–5 most relevant threats. In addition, they were asked whether these kinds of threats have occurred in the watercourses in their area, and whether actions to improve resilience against these threats have already been implemented. The three most relevant threats identified by the respondents are structure failures, lack of resources for high-quality water management, and the reduction of expertise at both the organizational and individual level. Of the threats that have materialized, the most common is the malfunction of the water-level measurement equipment, as this had happened in every area.

On the basis of the workshop and questionnaire, we created the initial resilience matrix of Linkov et al. (2013a) for assessing the resilience of different phases of the reservoir operation process (described in detail in Sect. 3.2). The rows of the matrix are the threats identified in the earlier phases divided into four categories, and the columns are the four main stages of the disaster management cycle (Linkov et al. 2013a). In theory, the matrix should have been filled in for each critical function (i.e., the phases of the regulation in our case), but we only filled in one matrix that was common to all the functions.

The initial resilience matrix was presented in Workshop 2, arranged for the reservoir operators and the supervisors of regulation. In the workshop, the matrix was completed through discussions in small groups. The group work also included discussion about the applicability of the approach. The participants were also asked to fill in a questionnaire about the approach and its applicability as well as its pros and cons.

Overall, the participation rate of the ELY Centre representatives in the process was high. The questionnaire got responses from the representatives of 11 different ELY Centres, and Workshop 2 was attended by representatives of 8 different ELY Centres. The participating ELY Centres included almost all the ELY Centres that own permits and the majority of ELY Centres that supervise permits (see Table 2).

Table 2 The number of permits supervised and owned by the ELY Centres whose representative(s) responded to the questionnaire and attended the workshop, respectively

3.2 The content of the resilience matrix

The initial idea of the resilience matrix of Linkov et al. (2013a) is to carry out the assessment separately for each critical function, which, in our case, would mean each phase of the reservoir operation process. However, most of the phases are impacted by only a few types of threats. For example, physical threats mainly concern the early phases of the process, whereas social and cognitive threats mainly concern the latter phases of the process. Separate assessment of each phase would therefore have left several cells of the resilience matrix blank. Thus, we decided to make an overall assessment, so that all the critical functions are considered together. However, if there are some impacts that specifically concern a certain critical function, these would be explicitly mentioned in the assessment.

To support the identification of how the phases are impacted by the threats, we created a supplementary matrix that shows which phases are impacted by each threat (Table 3). This matrix separates the cases of the threat causing missing and erroneous information, as the actions for responding to these threats can be quite different (e.g., Kotamäki et al. 2009). That is, if information is missing, it can often be noticed quite rapidly, and corresponding actions can be taken to collect the missing information and to adjust the decisions accordingly. However, if the information is erroneous, it may appear that things are in order, even if they are not. For example, if a failure in the water-level measurement equipment causes missing water-level information, it will typically be noticed very soon. However, if the sluice gates are stuck at a certain level, the operator may assume that the water is at a good level, even if there are potentially severe problems. Consequently, he/she can even take actions that worsen the situation. As can be seen in Table 3, some of the threats can concern many of the phases, but there are also threats that only impact one or two of the phases.

Table 3 Direct impacts of the threats to the phases of the operational watercourse regulation (“Miss.” stands for missing information and “Err.” for erroneous information)

The next phase was to create the actual resilience matrix of Linkov et al. (2013a). As mentioned above, the aim of our resilience matrix is somewhat different from the original purpose, as it does not have any fixed measures for estimating the level of resilience. Instead, each cell of the matrix describes issues that should be taken into account to achieve resilience in this particular cell. In this respect, the matrix can be considered as a checklist of issues to be considered. Yet, on the basis of how well these issues are addressed, one can make an estimate of the level of resilience, which is, however, more qualitative than quantitative. Nevertheless, the fundamental objective is the same as in the original matrix, i.e., to increase the resilience of the system by identifying beforehand the weak parts of the overall system.

Sabotage and hacking into the system are threats whose impacts are typically realized through other threats. For example, sabotage can cause construction failures, and hacking into the system can cause malfunctioning of the watershed simulation and forecasting system. For these kinds of threats, the prepare and adapt stages of disaster management are typically related to the threat itself, but the absorb and recover stages also relate to the consequent impact that the sabotage or hacking has caused.

3.3 Application of the resilience matrix

In practice, the obtained resilience matrix (Table 4) can be used as a checklist when considering the issues that should be taken into account to make an individual reservoir operation process more resilient towards various kinds of threats. The expected users of the matrix are the persons responsible for the reservoir operation as well as the persons in the regional ELY Centres responsible for the supervision of the regulation permits. This list of issues can also be used to support the qualitative assessment of resilience in different phases of reservoir operation process.

Table 4 Resilience matrix for assessing the threats related to operational watercourse regulations

For the practical application of the matrix, we created an Excel form for evaluating the resilience of a single operation structure. The first part of the form lists the possible threats as well as the possible actions against them that are identified in Table 4. For each action, the user is asked to evaluate whether it has been implemented for the reservoir operation structure (Scale: Yes, Partly, No, Not relevant). The user is also asked to provide reasoning and/or comments for the response. Table 5 presents an exemplary extract of the Excel form filled in for evaluating the resilience of a single dam operation structure in the case of mechanical measurement device failures. A similar assessment is made for all the threats listed in Table 4, and overall, the form consists of 106 rows of actions in 17 different categories. The full (non-filled) form is attached as supplementary material.

Table 5 An exemplary extract of the Excel form filled in for evaluating the resilience of a single dam operation structure in the case of mechanical measurement device failures

Many actions in the initial resilience matrix are larger entities that span several stages, so that, for example, the plan/prepare stage deals with acquiring the needed equipment and the absorb stage with the actual use of this equipment. For conciseness, these are not presented separately in our form, but combined into one action. However, the following numbers are presented in brackets after each action to indicate the stages of the disaster management cycle to which the action belongs: 1 = plan/prepare, 2 = absorb, 3 = recover, 4 = adapt/learn.

In the second part of the form, the user is asked to provide plans for implementing those actions that were identified in the first part as only partly or not implemented. For each of these, the user is asked to provide:

  • suggestions for the actions needed to fix the issue

  • estimates about the benefits of the suggested actions (Scale: Large, Moderate, and Small) and verbal reasoning for this

  • estimates about the costs of implementing the suggested actions (Scale: Large, Moderate, and Small) and verbal reasoning for this

  • estimates about the feasibility of implementing the suggested actions (Scale: Easy, Intermediate, and Difficult) and verbal reasoning for this
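The two parts of the form can be illustrated with a small record structure. The following Python sketch is only an illustration and not the actual Excel form: the class, field, and function names (`ActionAssessment`, `FollowUpPlan`, `needs_follow_up`) are assumptions, the example responses are invented, and the separate verbal reasoning for each estimate is simplified into a single `reasoning` field. Only the response scales are taken from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Response scales as described in the text.
IMPLEMENTATION_SCALE = ("Yes", "Partly", "No", "Not relevant")
MAGNITUDE_SCALE = ("Large", "Moderate", "Small")
FEASIBILITY_SCALE = ("Easy", "Intermediate", "Difficult")


def _check(value: str, scale: tuple, field_name: str) -> None:
    """Reject responses that are not on the given scale."""
    if value not in scale:
        raise ValueError(f"{field_name} must be one of {scale}, got {value!r}")


@dataclass
class FollowUpPlan:
    """Second part of the form (hypothetical field names)."""
    suggestion: str
    benefit: str
    cost: str
    feasibility: str
    reasoning: str = ""  # simplification: one field instead of one per estimate

    def __post_init__(self) -> None:
        _check(self.benefit, MAGNITUDE_SCALE, "benefit")
        _check(self.cost, MAGNITUDE_SCALE, "cost")
        _check(self.feasibility, FEASIBILITY_SCALE, "feasibility")


@dataclass
class ActionAssessment:
    """First part of the form: one row per action (hypothetical field names)."""
    action: str
    implemented: str
    comment: str = ""
    plan: Optional[FollowUpPlan] = None

    def __post_init__(self) -> None:
        _check(self.implemented, IMPLEMENTATION_SCALE, "implemented")


def needs_follow_up(rows: List[ActionAssessment]) -> List[ActionAssessment]:
    """Return the actions that are only partly or not implemented."""
    return [row for row in rows if row.implemented in ("Partly", "No")]


# Invented example responses; they are not results from the case study.
rows = [
    ActionAssessment(
        "Availability of personnel and replacement parts ensured for disruptive events",
        "Yes",
    ),
    ActionAssessment("Contracts for rapid repairing of equipment", "Partly"),
    ActionAssessment("Hypothetical action X", "Not relevant"),
]

pending = needs_follow_up(rows)
pending[0].plan = FollowUpPlan(
    suggestion="(hypothetical) renegotiate the service contract",
    benefit="Moderate",
    cost="Small",
    feasibility="Easy",
)
print([row.action for row in pending])
```

Validating each response against its scale at construction time mirrors the fixed answer options of the form, and the filter reflects the rule that plans are requested only for actions answered "Partly" or "No".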

We tested the approach by evaluating the resilience of a dam controlling a middle-sized lake in South Ostrobothnia, Finland, in collaboration with the local ELY Centre. The analysis forced the reservoir operators to systematically go through and consider how they have prepared for the possible threats. For the vulnerabilities identified, the analysis prompted them to consider possible actions to fix the issues as well as the cost-effectiveness of these actions. We do not present the results of the actual analysis here in full, as the aim of presenting the case is just to demonstrate the use of the approach. Furthermore, presenting possible vulnerabilities publicly would itself create a vulnerability.

4 Discussion

4.1 Applicability of the resilience matrix approach in our case

The initial idea of the resilience matrix approach is to provide a framework for developing application-specific quantitative and qualitative metrics for each phase of threat management. For example, Fox-Lent et al. (2015) developed quite specific quantitative metrics for measuring the performance of the system in each cell of the matrix, and in this way provided a transparent framework for assessing the resilience of the system. We also considered creating metrics for measuring the performance in each cell, but ended up describing qualitatively how to improve the resilience of the system. The main reason was that, besides identifying the issues needing improvement, we also wanted to find out why the issues are not in order and how they can be fixed. In this respect, the verbal explanations are essential. In addition, we thought that any single metric (or even a set of metrics) would have oversimplified the assessment, and thus would not have been able to highlight all the nuances of the system’s resilience. We found this kind of qualitative approach to be sufficient in our case, as it provided a checklist of the wide variety of issues to be considered and raised awareness of their different dimensions.

Naturally, in future applications of the approach, there may be a need for a quantitative assessment instead of a qualitative one. The proposed approach can also be applied quantitatively by using numerical scales for estimating the preparedness level of each action. In addition, to give a visual overview of the preparedness, information about the preparedness level can be added to Table 4, for example, by highlighting each action with stoplight colors (Yes = green, Partly = yellow, No = red, and Not relevant = gray). Furthermore, these estimates could be aggregated with some technique to obtain one overall index for each cell (such as in Linkov and Trump 2019, p. 91).

Especially in cases where one wants to compare the resilience capacity of several dams, some aggregate metrics could be useful for quickly giving a comprehensive view of the most vulnerable dams. It is also possible to calculate approximate overall indices for our case. In practice, this would require that for each cell of the resilience matrix, one identifies all the actions related to that cell and calculates the average of their values, for example, using a numerical scale: Yes = 100%, Partly = 50%, and No = 0%. Actions concerning multiple stages are counted in each of these stages, and actions with the value “Not relevant” are not considered at all. For example, the first three actions in Table 5 are all counted in the “Physical/Plan/prepare” cell of the resilience matrix. Of these actions, “Availability of personnel and replacement parts ensured for disruptive events” is additionally counted in the “Physical/Absorb” cell, and “Contracts for rapid repairing of equipment” in both the “Physical/Absorb” and “Physical/Recover” cells. Table 6 presents an example of an evaluation of the performance of a single dam with this kind of overall indices. Yet, when analyzing these indices, one has to keep in mind that they do not account for either the severity or the likelihood of the threats, but treat all the threats equally.

Table 6 An example of the evaluation of a single dam using an index for measuring the resilience capacity of the dam on each cell of the resilience matrix
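The averaging rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculation behind Table 6: the function name `cell_indices`, the row layout (dimension, stage codes, response), the placeholder action, and the example responses are all hypothetical. The numerical scale and the stage codes (1 = plan/prepare, 2 = absorb, 3 = recover, 4 = adapt/learn) follow the text.

```python
from collections import defaultdict

# Numerical scale from the text; "Not relevant" is deliberately absent.
SCORES = {"Yes": 100.0, "Partly": 50.0, "No": 0.0}
# Stage codes shown in brackets after each action in the form.
STAGES = {1: "Plan/prepare", 2: "Absorb", 3: "Recover", 4: "Adapt/learn"}


def cell_indices(rows):
    """Average the scores of all actions falling into each matrix cell.

    rows: iterable of (dimension, stage_codes, response) tuples.
    An action with several stage codes is counted once in each stage;
    actions answered "Not relevant" are skipped entirely.
    """
    scores_per_cell = defaultdict(list)
    for dimension, stage_codes, response in rows:
        if response == "Not relevant":
            continue
        score = SCORES[response]
        for code in stage_codes:
            scores_per_cell[(dimension, STAGES[code])].append(score)
    return {cell: sum(vals) / len(vals) for cell, vals in scores_per_cell.items()}


# Invented responses for illustration only (not case-study results).
example_rows = [
    ("Physical", (1,), "Yes"),          # hypothetical placeholder action
    ("Physical", (1, 2), "Partly"),     # availability of personnel and parts
    ("Physical", (1, 2, 3), "No"),      # contracts for rapid repairing
    ("Physical", (4,), "Not relevant"), # skipped, so no Adapt/learn cell
]

indices = cell_indices(example_rows)
for cell, value in sorted(indices.items()):
    print(cell, f"{value:.0f}%")
```

With these invented responses, the "Physical/Plan/prepare" cell averages three actions (100, 50, and 0), the "Physical/Absorb" cell two (50 and 0), and the "Physical/Recover" cell one (0). Because every action carries equal weight, the sketch shares the limitation noted above: it says nothing about the severity or likelihood of the underlying threats.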

In our case, the critical functions are the phases of the reservoir operation (see Fig. 1). Using a resilience matrix approach requires that the links between different phases are explained as additional information. Yet, as mentioned earlier, we did not consider it reasonable to create an individual resilience matrix for each phase, but only created a common matrix for all the phases. In general, this worked fairly well, but in some cells, especially related to the failures of the equipment, it would have been good to have separate matrices for different phases. We solved this problem by considering mechanical measurement device failures and dam operation device failures as separate threats instead of only considering mechanical failures. This seemed to be an adequate solution for our purposes.

Our first plan was to make the actual resilience assessment in a cell-wise manner, in which the assessment would have been conducted for each of the cells of the initial matrix (Table 4). However, each cell could include several different actions, and the level of preparedness could vary between these. Thus, we decided to reduce the dimensionality of the matrix into one dimension, where all the possible actions for the threats are presented as a list (Table 5). In this way, we could add new evaluation dimensions, which included information about whether the action has been implemented, suggestions for the measures needed to fix the issue as well as estimates about the benefits, costs, and feasibility of the measures. From the viewpoint of the practical usefulness of the approach, this was essential, as the cost–benefit ratios of the actions are a crucial issue in the prioritization of the actions.
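To illustrate how the added evaluation dimensions could support prioritization, the sketch below ranks suggested actions by a crude benefit-to-cost ratio and breaks ties by feasibility. This is an assumption made purely for illustration: the form itself records only verbal scales, and the numerical coding (Small = 1, Moderate = 2, Large = 3), the ratio, the tie-breaking rule, and all names and example entries are hypothetical rather than part of our assessment.

```python
# Hypothetical ordinal coding of the verbal scales used in the form.
LEVEL = {"Small": 1, "Moderate": 2, "Large": 3}
EFFORT = {"Easy": 0, "Intermediate": 1, "Difficult": 2}


def prioritize(plans):
    """Sort plans by descending benefit/cost ratio, then by easier feasibility."""
    def sort_key(plan):
        ratio = LEVEL[plan["benefit"]] / LEVEL[plan["cost"]]
        return (-ratio, EFFORT[plan["feasibility"]])
    return sorted(plans, key=sort_key)


# Invented entries for illustration only.
plans = [
    {"action": "A", "benefit": "Moderate", "cost": "Large", "feasibility": "Easy"},
    {"action": "B", "benefit": "Large", "cost": "Small", "feasibility": "Difficult"},
    {"action": "C", "benefit": "Moderate", "cost": "Moderate", "feasibility": "Easy"},
    {"action": "D", "benefit": "Small", "cost": "Small", "feasibility": "Intermediate"},
]

ranked = [plan["action"] for plan in prioritize(plans)]
print(ranked)
```

Treating ordinal answers as numbers is a strong simplification, so a ranking of this kind would at most serve as a starting point for discussion alongside the verbal reasoning recorded in the form.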

4.2 Strengths and weaknesses of the approach

After the second stakeholder workshop, the participants were asked to identify the strengths and weaknesses of the approach, which are summarized in Table 7. The strengths included that it produces a comprehensive and clear overall view of the vulnerabilities and how to deal with them. These are also noted by Linkov and Trump (2019), who emphasize the relative simplicity and transparency of the approach as well as the ease with which it can be combined with multi-criteria decision analysis methods (see e.g. Belton and Stewart 2002) for further evaluation and risk assessment needs.

Table 7 Strengths and weaknesses/development needs of the approach

We applied the approach so that the matrix was preliminarily filled in by the research group on the basis of information obtained from the first workshop, and this matrix was then complemented with information obtained from the second workshop. This was considered to be a good way to proceed, as the prefilled information helped the participants to understand what kind of input was required from them. On the other hand, there are risks related to the availability of already existing information, as it can easily anchor the thinking of the participants to certain issues (e.g. Montibeller and von Winterfeldt 2015). In our case, this was not a problem, as the prefilled information was gathered from the same experts using open questions.

Another identified strength of the approach was that the classifications in both dimensions of the matrix were seen as a good way to systematically analyze the threats. Both of these characteristics were already considered important objectives when planning the process, and in this respect, the process can be seen to have been successful. The framework also helped stakeholders to concretize the actions needed for improving the resilience. On the other hand, for some issues it was difficult to define the cell to which they should belong. For example, preparedness exercises or training of the personnel can be considered a means of preparing for almost any threat, but we classified them only under the most relevant threats. In this respect, we think that it is more important to have the issue somewhere in the matrix than to classify it in all the cells or to leave it out because it does not fit fully in any of the cells. Thus, our guidance was not to let the framework restrict the thinking, but in these cases simply to put the action in the cell to which it most appropriately belongs.

Our project shows that it is essential to involve the people operating the watercourse in the process. Our research group had strong expertise in regulation in general, but lacked operational expertise. Therefore, the workshop discussions with the regulation operators were very important, as they brought up issues of concern especially from an operator’s point of view. For example, the workshop participants strongly emphasized the issues related to the outsourcing of several tasks (e.g., maintenance and service) and the careful preparation of agreements that clearly define responsibilities between the contractors, especially in emergency situations. However, our process did not include the people actually carrying out the countermeasures, such as personnel from rescue services or dam safety authorities. In retrospect, it would have been useful to also have their views, especially in the discussion about the threats related to the physical dimension. On the other hand, our aim was to cover all four dimensions of the resilience matrix in a balanced way. Thus, including, for example, rescue service personnel in the process could have shifted the focus of the discussion too strongly toward the physical dimension. Nevertheless, this brought up an important point: enough attention has to be paid to how the process is carried out, for example, who is involved in the process and how.

Some of the workshop participants thought that the terminology was not fully clear, and the recover and adapt columns were not seen as relevant in the context of reservoir operation. However, there were also opposing views, as some other participants thought that the analysis after an adverse event has not received enough attention. We think that one reason for not considering the recover and adapt stages very relevant is that, typically, the impacts of an adverse event are not realized instantly, but only over time through a water-level increase. Consequently, the actions for recovery can be seen to be conducted simultaneously with the actions for absorbing the impact, which may blur the line between these stages.

Another reason for not considering the adapt stage very relevant in this context might be that in watercourse regulation, the state of the watercourse can usually be returned to its initial state after the threat has been removed. In this respect, the term “learn”, which has been used in other contexts (e.g., Pearson and Mitroff 1993; Kotzee and Reyers 2016), would, perhaps, be a better term for describing the activity after recovery related to the learning from the event. It is mentioned as one part of adaptation in the initial resilience matrix, but in our case, its role should have been emphasized more.

One weakness of the approach mentioned in the feedback was that all the threats are treated similarly in the matrix. Thus, the approach does not take a stance on the severity of the possible impacts or on their probability. In our case-specific evaluation form, we also asked the user to estimate the possible benefits of implementing the actions, which indirectly requires estimating the severity of the possible impacts and their probability. These could have been estimated separately, but we did not see it as necessary, as it would have required quite a profound analysis of the severity of the impacts and their probabilities. Yet, if this kind of analysis were to be done, traditional risk analysis approaches can provide means to take these kinds of considerations into account. In this respect, the resilience approach and traditional risk analysis should be considered complementary approaches, and the applicability of each depends much on the objectives of the analysis. However, neither of the approaches is very good at “thinking about the unthinkable” (Aligica and Weinstein 2009; Phillips and Tayebi 2012), in which respect some other approaches, such as scenario analysis (Schoemaker and Tetlock 2012), could also be used alongside them.

The dimension of time is taken into account in the resilience matrix by considering the stages of the disruptive event management cycle separately. However, the assessment is carried out in a static way that does not explicitly consider temporal characteristics or the dynamics between the different stages over time (Linkov and Trump 2019). This issue was partly covered by showing in Table 3 which phases of the dam operation process the threats target. However, to fully cover the dynamics of the system, more systematic approaches such as system dynamics (e.g., Fiksel 2006) are needed.

5 Conclusions

In this study, we developed and tested a structured framework based on the resilience matrix of Linkov et al. (2013a) for assessing the resilience of reservoir operation in Finland. The matrix was developed in close collaboration with the experts from the Regional Centres for Economic Development, Transport and the Environment, who have a central role in the supervision of the watercourse regulation permits and in many cases also in their operation.

Overall, the resilience matrix proved to be an applicable approach for systematically analyzing the possible threats related to the decision-making process of reservoir operation. The strengths of the approach identified by the stakeholders include that it gives a structured and comprehensive overall view of the vulnerabilities and how to deal with them. In addition, it helps stakeholders to concretize the actions needed for improving the resilience in all the phases of the decision-making process and in the different stages of the disruptive event management cycle. A weakness of the approach is that it does not take into account either the severity of the possible impacts or their probability. Thus, traditional risk management approaches are still needed alongside it to deal with these aspects.

The matrix for reservoir operation was developed at a general level, but we also applied it to a single reservoir operation case. In this case, the cost efficiency of the actions to respond to the threats was an important issue to be considered, as there was also a need to prioritize the actions. For this, we created a form that broadened the assessment to cover the costs, benefits, and feasibility of the actions along with the assessment of the preparedness. With this form, we were able to operationalize the matrix approach in practice by identifying the most essential actions for improving the resilience of the reservoir operation process.