1 Introduction

Car crashes, building fires, hurricanes, pandemics, and terrorism, both small- and large-scale incidents, require the collaboration of multiple actors to limit the consequences for society and meet the needs of those affected. Increased interconnectedness and dependency on critical infrastructures mean that not only traditional first responders, but also other actors, such as governmental agencies and private companies, can be involved in the response to emergencies and disasters (Ansell et al. 2010). In these circumstances, minimizing the consequences of the initial event and its cascading effects (Rinaldi et al. 2001; Bruijne and van Eeten 2007) is not only a matter for individual actors, but also requires a joint effort involving several actors (Uhr 2017; Nohrstedt et al. 2018; Frykmer 2020). Therefore, not only the individual capability to respond is important, but also the capability to collaborate. How to improve collective disaster response management is thus an essential question, but not an easy one to answer. For instance, the complexity of societal disturbances brings into play a multitude of factors that may impact the outcome of the event and its consequences. As a result, it is difficult to distinguish between the effects of factors that we can influence, such as resource management, and factors that we cannot influence, like the weather. Due to this difficulty, it has been hard to identify the measures or interventions that are scientifically proven to have positive effects on the capability for collective response.

Disaster response management research has mainly developed within the social sciences, with a traditional focus on explanatory research. This type of research is strong in developing theory, but less so in terms of providing solutions for improving practical problems (Watts 2017). Less attention has traditionally been given to normative research with a focus on providing evidence for how to improve professional practice and we argue that the field would benefit from a framework for conducting such research. It should be noted that normative conclusions, for example, how one should arrange disaster response, are not uncommon. There are a multitude of handbooks and guidelines describing such conclusions in detail. What is often missing is the evidence, that is, the arguments that support the normative claims. Using insights from other domains, where normative research is more common, we suggest a complementary approach to disaster response management research where the generation of such evidence is central. Our aim is to contribute both methodologically and conceptually by focusing on how to investigate the effects of potential interventions. In short, it is about investigating what works—or not.

First, we analyze the problem with studying the effects of interventions, or “what works or not,” in emergency and disaster response management and introduce an approach supporting the development of design knowledge that can suggest improvements to the field. The approach draws upon experimental methods used in combination with explanatory field studies to investigate how different situational conditions affect disaster response management. Second, we illustrate this approach by empirically investigating what effect a particular intervention, here one related to “goal alignment,” has on the collective response. We chose goal alignment as the example because such interventions have been implemented in national frameworks (FEMA 2017; MSB 2017) despite a lack of clear, unambiguous literature and, to the best of our knowledge, of empirical studies that show their effectiveness in a disaster context. Consequently, the real effects are unknown, and the decision to implement such an intervention lacks rigorous scientific support. It remains unclear whether, in a response situation, aligned goals inevitably improve outcomes, or not.

2 A Design Science Approach to Research on Emergency and Disaster Response Management

Much of disaster response management research has focused on describing and explaining various phenomena, such as emergence (Quarantelli et al. 1966; Dynes 1970; David 2006), improvisation (Wachtendorf 2004; Frykmer et al. 2018), and sensemaking (Weick 1988; Combe and Carrington 2015). The research has thus been concerned with how the world works (with respect to disaster management). At the same time, there are numerous books, reports, and guidelines describing how disaster management should be conducted in practice (IASC 2010; Coppola 2011; UNHCR 2015). The knowledge contained in the first type of publication helps us understand why things happen, and it might even allow us to make predictions. The second type of publication contains knowledge about what we should do in certain circumstances.

There are several ways to categorize science. Here we make use of Aken’s (2004, p. 224) distinction between: “…three categories of scientific disciplines: (1) The formal sciences, such as philosophy and mathematics. (2) The explanatory sciences, such as the natural sciences and major sections of the social sciences. (3) The design sciences, such as the engineering sciences, medical science and modern psychotherapy.” The mission of the explanatory sciences corresponds to our description of the first research output described above, that is, it seeks to describe, explain, and possibly predict some type of phenomenon. The mission of a design science, on the other hand, corresponds to the knowledge contained in, for example, guidelines or handbooks, that is, it develops knowledge of how to best achieve goals in a specific professional context.

We acknowledge that this categorization of sciences is coarse and that, in practice, any science most likely contains aspects of both explanatory and design character. Nevertheless, we use this distinction when describing a central claim that we wish to make: that research on emergency and disaster response management is much more developed in terms of its explanatory ambitions than when it comes to design. With this claim, we mean that arguments supporting conclusions of an explanatory nature are generally stronger and more salient compared to arguments of a design nature. When scrutinizing the arguments supporting the explanatory claims, these are most likely found in scientific papers where, for example, theories, models, or constructs are proposed. Partly, this is what good scientific conduct is all about, that is, transparency of method and data, logical consistency, and so on. On the other hand, if the focus is on response research that claims how something should be (done) in order to achieve some kind of goal, it is often hard to clearly understand the arguments supporting such conclusions. There can be several reasons for this. Here we wish to highlight two types of studies in which this might be the case.

The first type involves explanatory studies where the authors overstate the normative importance of the findings. Overstatement is a problem that has been observed in fields such as medicine, where the clinical importance of the results is sometimes exaggerated (Shinohara et al. 2017). Indicative of this problem are studies, based on data from one or a few disasters, that investigate some phenomenon of interest. The main focus here is on providing typical explanatory claims like “A led to B in a specific disaster.” Alternatively, if the study involves several disasters, one might see claims like “A leads to B, in disasters in general.” However, even if B is something desirable, whether such results can support the inference that “you should do A in disaster response (since it leads to B)” is questionable. For example, there might be better ways of achieving B than A, or A might be so costly that its cost outweighs the benefits of achieving B.

The second type involves studies of a theoretical nature where some method or model for how to solve some type of problem in a disaster response setting is suggested. In this case, the arguments supporting the implementation of the method/model in question rely on some basic assumptions from which the method/model is “derived.” An example is the principle of maximization of expected utility in decision situations involving uncertainty. The principle is supported by very strong conceptual arguments (Neumann and Morgenstern 1944), but, as Kahneman and Tversky (1979) demonstrate, it is a poor explanatory model of human decision making. Although the principle is theoretically appealing, there are often simpler strategies that outperform it (Gigerenzer and Goldstein 1996).

Both types of studies are relevant in disaster response research, but neither of them leads to strong normative arguments, that is, evidence, with respect to what works in practice. Therefore, our ambition is to contribute to the development of response management research by describing how “design knowledge” (knowledge intended to improve professional practice) can be generated in a transparent and logically consistent way. To that end, we draw upon the design science literature in fields where such research is more developed, notably organizational research (Romme 2003), information systems research (Hevner et al. 2004), and management research (Aken 2004).

2.1 Developing Design Knowledge

Our focus is on propositional (“know that”), rather than procedural (“know how”) knowledge. Procedural knowledge is essential when implementing various measures to improve disaster response management, but to implement such measures, we first need to conclude that they are likely to work. For that, propositional knowledge is essential. Here, we use the term (propositional) knowledge to mean “justified beliefs.” This is in line with the Society for Risk Analysis Glossary (SRA 2018) and with Aven (2018), Aven and Renn (2019), and Hansson and Aven (2014). The key idea is that knowledge is the same as the most epistemically-warranted statements, in other words, “justified beliefs” about nature, humans, physical constructions, and so on. However, the justified beliefs we seek when pursuing design knowledge are not related to how things are but rather to what should be. Put differently, design knowledge is not about understanding disaster response management per se, but what we should do in order to achieve something that is desirable in this context. Design knowledge is thus normative or prescriptive, rather than descriptive. More precisely, it has the logical form of a so-called “design proposition” (Aken 2004, p. 227): “If you want to achieve O in context C, do something like I.”

The design proposition includes an objective (O), which is something one wants to achieve; a context (C) in which the knowledge is claimed to be applicable; and an intervention (I), which describes what should be done in order to achieve the objective (O) in the specific context (C).

Design propositions vary greatly in terms of how concrete they are. A proposition may, for example, be an algorithm specifying a very precise method to do something or, more likely in the context of response management, a heuristic similar to the general proposition given above (that is, it includes the notion of “…do something like…”). Importantly, the proposition does not need to be condensed into an algorithm or resemble the statement above. It reflects the intervention-outcome logic of a specific proposition, but the actual description might be contained in, for example, a guideline, book, or instruction video (Aken 2005b). The ongoing construction of a body of design knowledge is a key activity in any professional context. It requires constantly asking questions about how to best achieve purposes relevant to the profession in question, whether curing people from a disease or managing the consequences of an emergency or a disaster. There must be a continuous evaluation of which statements (design propositions) are most justified (epistemically warranted); the process necessarily includes producing new statements and refining old ones. Changes can be justified by either empirical testing or reasoning.

2.2 Using Experiments to Support the Development of Design Knowledge

We suggest that controlled experiments become an integral part of the emergency and disaster response management research agenda, in order to develop design knowledge in the area of response management. Although various scholars have used experimental settings to examine different aspects of response management (Brehmer 1992; Pramanik et al. 2015; Danielsson 2016; Kalkman et al. 2018), examples are few. Moreover, as far as we know, there have been no efforts to explicitly support the development of design knowledge. We argue that controlled experiments have several benefits in the context of emergency and disaster response management research.

We draw upon an analogy with the development of modern medicine to underline our point. Up until the twentieth century, “it was not unusual for a sick person to be better off if there was no physician available because letting an illness take its natural course was less dangerous than what a physician would inflict. And treatments seldom got better, no matter how much time passed” (Tetlock and Gardner 2015, p. 28). Many treatments were available, and occasionally they were changed, giving the impression that they were improved; however, in most cases, they did not have any effect at all. The breakthrough that brought medicine from a practice-based craft into a research-based discipline was the scientization of the field (Aken 2005a). The key was an increased use of controlled experiments to test and evaluate treatments (design propositions), and the accumulation of general design knowledge that could be taught to new students and practitioners. Although medicine and disaster response management are different, the use of controlled experiments to develop design knowledge should be equally important in the two fields.

Although experiments have certainly been paramount to the development of modern medicine, we must acknowledge the limitations of the experimental approach to generating design knowledge in a context such as emergency and disaster response management. One obvious problem is that it is impossible to control these adverse events and, therefore, it is difficult to study the effects of an intervention, even if data are collected from several events. There are so-called field experiments, where the effects of different policies are investigated (Falk and Heckman 2009). However, as with antiterrorism interventions (Arce et al. 2011), it is hard to imagine actors conducting experiments during actual events. Therefore, in the remainder of this article, “experiment” refers to an experiment run in the laboratory. One major concern raised about such experiments in the present context is their inability “to incorporate factors that are crucial to much real-life decision-making” (Eiser et al. 2012, p. 14). Similar opinions have also been described more generally in other areas of research (Leonard and Donnerstein 1982).

These concerns are part of a wider discussion related to the external validity of experiments, notably, the extent to which results can be generalized to other contexts and, specifically, the extent to which they can be generalized to a real-world context (sometimes referred to as ecological validity). Such concerns are, of course, important in explanatory research, where the aim is to explain and/or predict phenomena relevant to response management. But if the purpose is to support development of an artefact, the situation is different. In the latter case, the key question is whether the experimental context is a valid model of the practical context that it seeks to represent. As Mäki (2005, p. 306) notes, “[experiments are] mini-worlds that are directly examined in order to indirectly generate information about the uncontrolled maxi-world outside the laboratory.” Thus, whether an experiment is valid or not should be judged by the extent to which we have reason to believe that the effect of the intervention in the experimental context is correlated with its effect in practice. Although an experiment may have little external (ecological) validity (that is, the experimental context is unlike the practical context and results cannot be generalized), this does not discount the experimental method. From a design perspective, a single experiment could support development of the artefact by asking questions like, “based on the results of the experiment, do we have reason to believe that intervention I will lead to outcome O in context C?” The answer should then be used as a basis to determine if, and if so how, the development process should continue.

This highlights the fact that the purpose of an experiment is essential in determining whether it is a valid model. If the purpose is to support a decision early on in the development process, the model might be very simple compared to the practical context. On the other hand, if the purpose is to support a decision later in the development process, a more complex model might be warranted. To exemplify and make use of the analogy with medicine: a decision whether to continue the early development of a new drug (an intervention) might be based on experiments involving mice. Thus, even though we know that it is difficult to generalize from mice to humans (Leenaars et al. 2019), such experiments are still extremely valuable because they support development decisions, that is, whether to continue developing the drug or not. Similarly, if we are developing design propositions in the field of emergency and disaster management, we could use experiments as a basis for design choices. For example, is it worth pursuing the development of the intervention of interest, or should we focus on something else?

These ideas can be combined into a model that shows how to relate and integrate explanatory and design research in the field of response management. The model, which builds upon Kuechler and Vaishnavi (2008), is illustrated in Fig. 1.

Fig. 1 A model showing how to integrate explanatory and design research in emergency and disaster response management. Source: Adapted from Kuechler and Vaishnavi (2008)

Both explanatory and design research address what we call the practical context. This is the context in which we would like our design propositions to be valid. Explanatory research can produce statements that suggest a cause and effect relationship in a specific context. For example, the phenomenon of “drift into failure” (Dekker and Pruchnicki 2014) explains why disasters happen in high-risk industries: pressures of scarcity and competition lead to the normalization of signals of danger, thereby eroding safety margins and eventually leading to a disaster. Such explanatory statements can then be transformed into a prescriptive statement, or a design proposition, linking the desired effect to an objective, and the cause to some kind of intervention. An example of a prescriptive statement involving an intervention supposedly leading to fewer failures is taken from studies of high reliability organizations: “Continuously communicate rich, real-time information about the health of the system and any anomalies or incidents; this should be accurate, sufficient, unambiguous and properly understood; be aware that juniors are unlikely to speak up” (Denyer et al. 2008, p. 406). However, even if the supporting evidence for an explanatory statement is strong, it might not be sufficient to support the corresponding prescriptive statement. This is because justifying a prescriptive statement requires two elements. First, to support the cause/effect relationship between the intervention and the objective, we need to show that the intervention actually leads to the objective. Second, we require evidence to support the claim that the design proposition in question is the best one available, given our current level of knowledge and the practical context in which it is valid. There may, for example, be other interventions that achieve the same objective, but cost less or have fewer side effects.

This brings us to the experimental context, shown in Fig. 1. In this setting, the independent variable represents the proposed intervention, and the dependent variable represents the objective (what is measured and evaluated). Obviously, the context differs when running experiments that investigate cause and effect derived from a practical context. Here, the idea is, first, to identify factors that are needed to test the intervention, and evaluate them against the desired objective. Then, these factors are replicated in an experimental context that resembles the relevant parts of the practical context. Below, we apply the ideas outlined in this section to a study of goal alignment in emergency and disaster response management.

3 Investigating Goal Alignment in Disaster Response Management Using a Design Science Approach

Ekman and Uhr (2015) state that managerial efforts during emergencies and disasters are primarily concerned with providing direction and coordination to a variety of responders, so that various needs can be met, in space and time. Both direction and coordination are tightly linked to goals, because without goals, response efforts lack purpose. In a multiorganizational response operation where inter- and intraorganizational goals coexist, goal alignment is expected to improve outcomes (Aldrich 2019). In that case, efforts must be made not only to articulate goals, but also to harmonize them and address potential conflicts. Conflicts can be the result of, for example, a lack of resources, or a competitive culture among responding actors (Stirrat 2006).

Beyond the disaster research arena, the importance of aligning inter- and intraorganizational goals can be found in management literature and the concept of incentive alignment. The idea is that aligning incentives of, for instance, actors in a supply chain (Lee 2004; Narayanan and Raman 2004), CEOs and owners of firms (Tosi et al. 1997; Fong and Tosi 2007), or networks of small enterprises (Biswas 2011) is beneficial to all parties, and the end product. Conversely, if incentives are misaligned and individuals or organizations only focus on their own goals, the overall goal may not be achieved (Lee 2004; Narayanan and Raman 2004). The harmonization of incentives is a recommended intervention in, for instance, supply chain management (Lee 2004; Narayanan and Raman 2004) and business management (Tosi et al. 1997). In emergency and disaster response management practice, scholars have for decades discussed the idea that managers need to apply a holistic perspective (McEntire et al. 2002). Further, concepts such as “unity of effort” (FEMA 2010, 2017) or “joint direction” (MSB 2017) are used by national authorities to underline the importance of understanding how response actors’ goals are related, and how they should be aligned with the overall goal. A situation of misaligned goals, and where response actors act upon intraorganizational goals rather than the overall goal, can be compared to stove piping (Rodriguez et al. 2007; Phillips et al. 2016). For example, stove piping of information among agencies in the United States was found to have hampered efforts to connect the dots, and prevent the September 11 terrorist attack (Kramer 2011; Phillips et al. 2016).

The perceived benefits of goal alignment, however, may not be as straightforward as suggested by the literature and frameworks referenced above. For example, literature on bureau-politics argues that the realities of crisis management, where tensions or conflict between response actors may prevail, can in fact be beneficial to decision quality and to the avoidance of groupthink (Rosenthal et al. 1991). This research implies that a degree of goal misalignment can actually be constructive to the outcome of emergency and disaster response management. Further, there is literature pointing to difficulties in pursuing goal alignment in response management. For instance, the tension between local expectations from partners and vertical directions from superiors may cause frontline individuals engaged in response activities to deviate from centrally set goals (Kalkman and Groenewegen 2018). Kalkman et al. (2018) focus on how collective decisions are negotiated in a context of divergent perceptions and interests, which highlights how collective response can still come about despite these difficulties. Also, the management paradox concept points out how adopting a comprehensive perspective that aligns inter- and intraorganizational goals can be challenging for managers. In a response operation, a manager must have a comprehensive understanding of the details of their own organization and, at the same time, think holistically and laterally at the network level to be able to agree on shared goals (Uhr 2017). In this context, a challenge is that human cognition is constrained by bounded rationality (Simon 1996). In the specific context of emergency and disaster response, where individuals are part of a collaborative network with inter- and intraorganizational goals, the management paradox and bounded rationality imply that it can be problematic to understand the overall perspective and achieve a shared direction.

In light of this context, the perceived benefits of goal alignment can be disputed. Using design science terminology, explanatory research on goal alignment indicates both that goal alignment is beneficial for the outcome of response efforts and that it may not be, and it also underscores difficulties in actually achieving goal alignment. This ambiguity in the literature implies that, despite the perceived benefits of goal alignment, there is a risk that actually achieving goal alignment poses a problem for joint response efforts or, worse, that it may in fact degrade outcomes. Nevertheless, the goal alignment concepts stressed in national response management frameworks can be seen as interventions serving to improve the collective response. To the best of our knowledge, such interventions have not been justified to any appreciable extent; in other words, they have not been evaluated against the desired objective of improving response management. We therefore argue that providing information that either supports or discredits the benefits of aligning goals is valuable and, in this study, we aim to provide such information by conducting an experiment that tests a goal alignment intervention against its desired objective. It should also be noted that “aligning goals” can be achieved in different ways. Thus, besides the examples drawn from national frameworks, there are many possible interventions that qualify as “goal alignment interventions.” Examples include establishing mandatory meetings between various actors in a joint response where the overall goals are explicitly discussed, or introducing a new training program focusing on raising awareness among the responding organizations with respect to common goals. In our study, we are interested in investigating whether this class of interventions, namely goal alignment interventions, actually contributes to a better response, so as to provide design knowledge to support improvements in the field.

4 Method

Based on previous research, we created a computer game called MikroRisk. The purpose of the game was to replicate some of the salient features of response management and to use it to test an intervention aimed at increasing goal alignment. We identified four salient features of the practical context that should be represented in the experimental context: (1) threats that need to be managed; (2) time pressure; (3) the potential to collaborate and share resources; and (4) the potential to positively influence the outcome of the situation. The independent variable was goals (aligned or misaligned), and the dependent variable was the outcome of the game, specifically, the number of consequences (fewer consequences indicated better response management). Our hypothesis, matching implemented interventions, was: When individuals in a collaborative, disaster-like setting are faced with aligned goals, there are likely to be fewer consequences than when they are faced with misaligned goals.

4.1 Experimental Design

The experiment was designed as follows: there were three players in each game, and each game contained a number of rounds. All players were in the same room, and each individual player sat in front of a computer screen that showed a 10 column × 8 row grid. The first five rows were made up of the player’s distribution of threats, represented in scope and time. The sixth row contained 100 “resources” (fire trucks), distributed across the 10 columns. These resources could be moved to any column, or shared between the players (bottom two rows). The only restriction on moving resources was that they could only be moved one column at a time (per round), or directly to a buffer called “own resources.” In each round, players were presented with individual threats that appeared in the top row of the grid. For example, Player one could be presented with 45 threats, Player two 60 threats, and Player three 70 threats, distributed over a number of columns. In each round, the threats moved down one row, so that, by round 5, the threats that appeared in the first round were above the resources shown in the bottom row (similar to the logic of a game of Tetris, see Fig. 2). As players were able to see approaching threats, they were able to plan ahead, and decide to move their resources or not.
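
The scrolling logic of the threat rows can be illustrated with a minimal Python sketch. It is not the MikroRisk implementation: the names `empty_grid` and `advance` are hypothetical, and the sketch assumes that each threat row is stored as a list of ten per-column threat counts and that a row leaving the fifth threat row is the one that meets the resource row.

```python
from collections import deque
from typing import Deque, List, Sequence

N_COLUMNS = 10      # columns in the grid, as described in the text
N_THREAT_ROWS = 5   # the first five rows hold approaching threats


def empty_grid() -> Deque[List[int]]:
    """Return five empty threat rows; index 0 is the top row."""
    return deque([0] * N_COLUMNS for _ in range(N_THREAT_ROWS))


def advance(grid: Deque[List[int]], new_threats: Sequence[int]) -> List[int]:
    """Start a new round: insert new threats at the top, shift all rows down.

    Returns the row that leaves the threat area and (by assumption) reaches
    the resource row, where it is compared with the available resources.
    """
    if len(new_threats) != N_COLUMNS:
        raise ValueError("a threat row must have one entry per column")
    arriving = grid.pop()                 # bottom threat row leaves the area
    grid.appendleft(list(new_threats))    # new threats appear in the top row
    return arriving
```

Under these assumptions, threats introduced in the first round sit in the fifth row during round 5, directly above the resources, which matches the description above.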

Fig. 2 Example of the MikroRisk screen during a game round

If, when the threat reached the sixth row, there were enough resources to match it, the consequences for that particular round were recorded as 0. However, if there were not enough resources, a consequence was recorded. Throughout the game, consequences were continuously summed to give the total number for both the individual player and the group. Each player could see both the group and their individual total. Participants were encouraged to discuss their situation (such as how many threats they faced, or how they planned to move resources), and were free to share resources or keep them for themselves. They were not, however, allowed to view each other’s computer screens. After each player made their decision, they moved their resources to meet a threat (or not), and the next round began. In this article, data from 12 rounds are analyzed. MikroRisk was tested internally by the authors and other colleagues, and an external pilot study was conducted with a group of fire and rescue service professionals. This testing enabled us to correct errors, and adjust the premises and conditions of the game before introducing it to the study’s participants.
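
The scoring rule can be sketched as follows. The text does not state how large a recorded consequence is, so this sketch assumes that it equals the shortfall in each column (threats minus resources, never below zero), which is consistent with the stated maximum being reached when no threats are matched. The function names `round_consequences` and `running_totals` are hypothetical.

```python
from typing import Iterable, List, Sequence


def round_consequences(threats: Sequence[int], resources: Sequence[int]) -> int:
    """Consequences for one round, assuming one consequence per unmatched threat.

    A column with enough resources contributes 0; otherwise it contributes
    the shortfall (assumption, not stated in the source).
    """
    if len(threats) != len(resources):
        raise ValueError("threats and resources must cover the same columns")
    return sum(max(0, t - r) for t, r in zip(threats, resources))


def running_totals(per_round: Iterable[int]) -> List[int]:
    """Cumulative consequences after each round, as shown to the players."""
    totals: List[int] = []
    total = 0
    for consequences in per_round:
        total += consequences
        totals.append(total)
    return totals
```

A group total would then be the sum of the three players' individual totals after round 12.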

4.2 Experimental Conditions and Participants

Participants were divided into two experimental groups. In condition A, they were told that “the goal of the game is to limit the number of consequences for the group” (a single goal, representing goal alignment). In condition B, they were told that “the goal is to minimize the number of consequences for the group while, at the same time, minimizing individual consequences” (dual goals, representing goal misalignment). The facilitator gave these instructions to participants before the game began. In this instruction, participants were also told that they needed to complete the game within one hour, in order to create some time pressure. Players were randomly assigned to groups of three, and these groups were then randomly allocated to condition A or B. The outcome was measured as the total number of consequences after completion of round 12.
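
The two-step randomization can be sketched in Python. This is an illustration only: the function name `assign_conditions` is hypothetical, and the sketch assumes that conditions are balanced across groups as far as possible, which the text does not specify.

```python
import random
from typing import Hashable, Iterable, List, Tuple


def assign_conditions(
    participants: Iterable[Hashable], seed: int = 0
) -> List[Tuple[str, list]]:
    """Randomly form groups of three, then randomly allocate groups to A or B."""
    rng = random.Random(seed)          # seeded for reproducibility
    ids = list(participants)
    if len(ids) % 3 != 0:
        raise ValueError("number of participants must be a multiple of three")

    # Step 1: random assignment of players to groups of three.
    rng.shuffle(ids)
    groups = [ids[i:i + 3] for i in range(0, len(ids), 3)]

    # Step 2: random allocation of groups to conditions (balanced by assumption).
    half = len(groups) // 2
    labels = ["A"] * half + ["B"] * half
    if len(groups) % 2:                # odd number of groups: one extra label
        labels.append(rng.choice(["A", "B"]))
    rng.shuffle(labels)
    return list(zip(labels, groups))
```

In a study with several recruited cohorts, such a routine would presumably be applied within each cohort separately, so that every game contains players from the same cohort.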

Our aim was to simulate goal misalignment in condition B. Using the ideas of incentive alignment, we therefore created a context in which players were faced with a trade-off between inter- and intraorganizational goals, that is, where players were faced with misaligned inter- and intraorganizational goals, and where the incentive to focus on intraorganizational goals could override efforts to think holistically. Threats were distributed among players and rounds in such a way that players had to share resources, possibly at the expense of individual consequences, to minimize the group’s consequences. The instructions in condition B acted as an incentive, or trigger, to make this trade-off between inter- and intraorganizational goals. In theory, all threats in all rounds could be matched (0 consequences), provided the group shared resources. On the other hand, if no threats were matched during the 12 rounds, this would result in the maximum, 1,570 consequences. A realistic, worst-case estimate could be envisaged in which all players were passive. In this case, all resources are left in their original positions, leading to approximately (due to the random distribution of threats) 620 consequences for the group.

Participants in our study were recruited in two ways, resulting in a convenience sample. First, we approached a regional emergency management training facility, where we were given access to professionals. Second, we approached student groups enrolled in courses connected to emergency and disaster management. Together, these strategies resulted in three groups: one consisted of fire and rescue service commanders (24 participants), and two were made up of students. The fire and rescue service commanders ranged in age from 24 to 56, and two of them were women. One student group was studying for an international Master’s in Disaster Risk Management (aged 21 to 33, 15 women and 11 men), and the other consisted of Swedish engineering students (aged 21 to 31, 31 women and 29 men). Participation was voluntary and no compensation was offered. The distribution of participants and conditions is presented in Table 1. Informed consent was obtained from all 111 participants, and data were processed in accordance with prevailing data protection legislation, and anonymized for publication.

Table 1 Participants per experimental condition

5 Results

Given that 1,570 was the worst-case scenario, and 620 consequences reflected a laissez-faire attitude, our participants performed fairly well. The average total number of consequences was 280 across all six combinations of condition and group. It seems that participants made an effort to achieve the first goal they were presented with (to minimize the total number of consequences for the entire group) and were committed to the game.
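To make the comparison with the two reference points concrete, the short sketch below expresses an observed total as a share of the theoretical maximum (1,570) and of the passive baseline (approximately 620). The function and constant names are illustrative assumptions; only the three numbers come from the text.

```python
MAX_CONSEQUENCES = 1570  # no threats matched in any of the 12 rounds
PASSIVE_BASELINE = 620   # approximate: all resources left in their original positions


def relative_performance(total_consequences):
    """Express a total as a share of the maximum and of the passive baseline."""
    return {
        "share_of_maximum": total_consequences / MAX_CONSEQUENCES,
        "share_of_passive_baseline": total_consequences / PASSIVE_BASELINE,
    }


# The reported average of 280 consequences.
summary = relative_performance(280)
```

For the reported average, this gives roughly 18% of the maximum and roughly 45% of the passive baseline.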

An analysis of variance (ANOVA) was run to investigate if the single or dual goal condition affected the results. The mean number of total consequences for each condition is shown in Table 2.

Table 2 Mean number of total consequences by experimental condition and group

The analysis revealed no statistically significant difference between the two conditions. It seems our results are, to some extent, contrary to previous research. The findings do not seem to support the idea that goal alignment reduces consequences. Moreover, for two of the three groups, the goal alignment condition (minimizing group consequences, regardless of individual consequences) resulted in a higher number of consequences than the goal misalignment condition.
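For readers unfamiliar with the test, the following sketch shows how a one-way ANOVA F statistic compares the two conditions. It rests on several assumptions: the scores are invented placeholders, not the study's data; the text does not state whether the analysis was one-way or also included the participant group as a factor; and the function name `one_way_anova_f` is hypothetical.

```python
from statistics import mean


def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of samples.

    Each sample is a list of scores from one condition.
    """
    all_values = [x for sample in samples for x in sample]
    grand_mean = mean(all_values)

    # Variation of the condition means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of individual scores around their own condition mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical total consequences per game group (NOT the study's data).
condition_a = [300, 260, 310, 270]  # single goal (aligned)
condition_b = [270, 290, 240, 300]  # dual goals (misaligned)

f_stat, df_between, df_within = one_way_anova_f([condition_a, condition_b])
```

An F value well below 1, as these placeholder scores produce, means that the difference between condition means is small relative to the spread within conditions. Obtaining a p-value requires the F distribution; a statistics library such as SciPy (`scipy.stats.f_oneway`) computes both in one call.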

6 Discussion

How to improve emergency and disaster response management is a key question that is not easy to answer. Complex societal perturbations are made up of a multitude of factors that may impact outcomes and consequences. It is particularly difficult to distinguish between the direct effects of interventions (for example, resource management or goal alignment), and factors that we cannot influence (for example, the weather). Consequently, it has been difficult to identify interventions that can be scientifically proven to have positive effects on the collective response capability. Here, we argue that a combination of explanatory studies and experimental research can offer more rigorous support for aspects of effectiveness in the context of disaster response management. Although we have pointed out some key aspects of such an approach, much more work is needed to, for example, develop valid experimental models that can be used to test interventions. In the present context, the model corresponds to the computer game. In its present form, the argument for using the game as a model of response management relies on similarity, that is, on the game capturing some salient features of response management. A much stronger argument could be built, however, if the predictions offered by a model were found to correlate with some important variable in the context of interest. This would also help strengthen scientific rigor with respect to design research in a research field mainly based on explanatory studies.

Turning to our experiment, it seems that our hypothesis (that goal alignment improves outcomes) is not supported. Our results show no statistically significant difference in performance between misaligned and aligned goals. Referring to Fig. 1, we conclude that a change in the independent variable (goal alignment) did not lead to a significant change in the dependent variable (performance). We propose two possible reasons for the nonsignificant results. The first relates to the design of our experiment, which relies on the relationship between the experimental context (our model) and the practical context. Unfortunately, little research has confirmed the relationship between experimental and practical contexts in our field; therefore, the value of our experimental results can only be established by evaluating design assumptions. We selected a few salient features of emergency and disaster response management situations, and created a trade-off situation in which we expected participants to choose between goals. There are, however, significant uncertainties regarding the validity of these assumptions. We might have overlooked other contextual factors that could affect the outcome, such as reciprocity or competition, or individual preferences and traits. Because the experimental context was a computer game, it may have failed to create the sense of urgency that is typical of real disaster situations, as also discussed in previous experimental studies in the field (Pramanik et al. 2015; Kalkman et al. 2018). Finally, the trade-off between goals that we created may not have been strong enough to put pressure on the participants to act upon the intraorganizational goal.

The second reason for our nonsignificant results could simply be that our underlying hypothesis is wrong. It could be that our participants were not affected by misaligned goals but, rather, acted upon the interorganizational goal in both conditions. This suggests that the problem of misaligned goals might not, in fact, be a serious concern in the context of overall response management performance. However, many factors were not controlled. For instance, participants could have influenced each other to focus on the overall goal, and disregard conflicting goals. Relating to the experimental study conducted by Kalkman, Kerstholt, and Roelofs (2018), it could be that the participants negotiated divergent interests (that is, their intraorganizational goal) to reach collective decisions, allowing for a coordinated response to take place despite the misaligned goals.

Both of these reasons might be valid, suggesting that further inquiry is warranted. From the design science perspective, our results imply that an intervention stating that emergency and disaster response actors should mainstream their goals in order to improve the outcome would lack robust scientific support—at least in the way we have chosen to operationalize it. Put into practice, a goal alignment intervention may lead to inefficient use of resources when activities are focused on aligning goals rather than on activities that potentially give “more bang for the buck,” that is, may result in better response management outcomes.

The assumption that goal alignment improves emergency and disaster response management, and that goal alignment interventions should therefore be implemented in national frameworks, seemed reasonable, given some of the current research (Narayanan and Raman 2004; Aldrich 2019). However, our results suggest that this hypothesis could be wrong, which in turn points to other strands of literature (Rosenthal et al. 1991). Confirming the disharmony in goal alignment research, our findings imply that testing of interventions is imperative to the field. Our study contributes information to support development decisions for improving the field of emergency and disaster response management, as our results indicate that goal alignment interventions might not be a simple solution to improve collective performance in disaster management. We argue that if emergency and disaster response management interventions are developed and implemented without being empirically tested, this might worsen rather than improve response. In practice, the move from explanatory conclusions of cause and effect to normative interventions or recommendations risks leading to ineffective, or even damaging, interventions being implemented. Our study shows the importance of questioning explanatory hypotheses; a thorough investigation is a sound, pragmatic approach that should be encouraged. Untested hypotheses should not be transformed into practice just because they might seem reasonable. Our study also highlights the need to be more careful when suggesting interventions intended to improve response management, due to the lack of rigorous research and robust conclusions. Clarity regarding which hypotheses are found to be valid, and which are deemed false, can help advance scientific fields such as emergency and disaster response management.

7 Conclusions

The present article makes two contributions. First, we introduce an approach that supports the development of design knowledge in the field of emergency and disaster response management. This field is a domain that traditionally leans towards explanatory research. At the same time, professionals are asking for improvements, not an explanation of what is already happening. Our proposed approach is an attempt to meet these demands. We argue that the integration of explanatory and design approaches, notably through the use of experiments, will make it possible to investigate and propose improvements and, at the same time, increase scientific rigor. Our approach and this empirical study support the argument that explanatory statements should not be developed into normative interventions without testing and evaluating them. It is, of course, too early to draw any definitive conclusions, but our empirical results suggest that the approach could be an important way to improve rigor and draw robust conclusions regarding the effectiveness of various interventions.

Second, we experimentally investigated goal alignment interventions in response management. Here, the aim was to illustrate how the general approach for generating design knowledge can be applied to a concrete problem. We illustrated how results from previous studies of goal alignment can be used as a basis for suggesting a design proposition and then testing it in an experiment. The experiment involved groups of three participants playing a game designed to reproduce some salient features of real response management situations. Our results were statistically nonsignificant: we found no detectable difference in performance between participants faced with aligned goals and those faced with misaligned goals. More scientific evidence of the function of interventions related to goal alignment in emergency and disaster situations is required in order to provide more concrete practical advice regarding whether it is a good intervention or not. Nevertheless, the results serve as a reminder that it might be prudent not to assume that goal alignment will automatically lead to a better collective response.