1 Introduction

Catching as many near misses as possible is considered crucial to preventing accidents, not only for on-site safety and risk management but also for organizational performance overall. It is common knowledge today that multiple “close calls” or “weak signals” precede an accident (Turner 1976; Tamuz 1987; Reason 1990; Sitkin 1992; March et al. 1996; Weick et al. 1999; Roberto et al. 2006; Hopkins 2010). Thus, experts and professionals are expected to be informed of those problems and to take precautionary measures. In many accidents, however, it is also reported that experts and professionals are aware of near misses but do not change a chosen course of action (e.g., Gioia 1992; Vaughan 1996; Columbia Accident Investigation Board 2003; National Commission on Terrorist Attacks upon the United States 2004; Hopkins 2010). In this article, the authors submit that knowledge of low-probability near misses actually reinforces beliefs in existing organizational routines, and that so-called uncertainty makes no difference in the reinforcement, proposing the concept of “justification shift”. Justification shift is the underestimation of the risks of known near misses vis-à-vis the overestimation of the reliability of existing routines. When justification shift occurs, the decision criterion on a chosen course of action changes from how risky a reported event is to how existing routines allow someone to disregard the risk.

Justification shift has three prerequisites. First, it requires near misses, or “gaps” between what is intended and what is actually occurring. The gaps signal to experts and professionals that existing routines carry risks that may lead to accidents. Second, however, a program, project, or activity has to go through a series of successes in spite of the gaps. Successes here are different from being free of errors and problems. On the contrary, those enterprises experience issues, that is, near misses, but the issues contribute to organizational learning without causing disasters and without demanding that experts and professionals abandon existing routines. In other words, the enterprises go through not a total failure but an “overall success” with which existing organizational routines are tested, proven or improved, and retained. Third and finally, a history of overall successes, and the routines that accordingly survive, places a “burden of proof” on those who want to change a chosen course of action based on the risks signaled by the gaps. Overall successes and gaps by themselves may not lead to accidents as long as informed experts and professionals successfully prove their points with available data and change existing routines. However, with a series of overall successes, proving risks in existing routines becomes a demanding task whereas keeping the status quo becomes easier. Unless experts and professionals overcome the burden, existing routines are considered more reliable than reported cases of near misses, and hence, known risks are disregarded.

In the process, the “void” created by a lack of clearly defined requirements and sufficient data, that is, uncertainty, may aggravate the burden and thus make justification shift more frequent and more severe. However, as explained in this article, uncertainty does not make a difference in the quality of organizational decisions under conditions in which experts and professionals have to decide with the data available from fixed probabilities of near misses.

To explore how justification shift occurs and whether uncertainty aggravates the shift, the authors have developed a model for an agent-based simulation, drawing mainly on the literature on organizational learning, routines, high-reliability organizations, and organizational cognition, and on the classic case of the space shuttle Challenger accident. In the next section, a concept model of justification shift is developed from existing studies, and a theoretical question about uncertainty is proposed. The agent-based model of justification shift is then explained, followed by a section on the results of the simulation. In the last section, the authors discuss the theoretical implications of the results, what justification shift means for efforts to utilize knowledge of near misses in general, and future research.

2 Literature review: how justification shift may occur

In the existing literature, near misses are welcomed for a reason; they are helpful in preventing accidents by exposing hidden or unnoticed risks and by suggesting how accidents may occur. Although this benefit is true only in retrospect, researchers argue that it is a good strategy to record and study near misses for organizations and industries. In the field of safety and risk management, the near-miss management system has been an emerging tool that managers and engineers rely on to improve processes, standards, and procedures (e.g., Tamuz 1987; Phimister et al. 2003; Macrae 2010; Oktem et al. 2010). In the field of organizational management, near misses, once they occur, are not a curse that managers should never touch again but a blessing to embrace for a better future. For example, some argue that near misses are useful to supplement small-n datasets of large-scale accidents and to explain how those accidents occur (Tamuz 1987; March et al. 1996). Others even propose to intentionally cause near misses under controlled conditions so that a project can be improved by trial and error (Sitkin 1992; Harvard Business Review 2011). Researchers on high-reliability organizations argue that sharing concerns about near misses without hesitation is an essential step to avoid accidents in disaster-prone organizations, such as the military and nuclear power plants (Weick et al. 1999).

Near misses in these studies are not always severe incidents posing imminent threats to human life and property, but their message is clear: near misses are precious learning opportunities and should not be wasted. However, whether organizational learning actually occurs once members are informed of near misses and of possible risks in organizational routines is a different story. Organizational routines consist of visible and invisible factors that define how members interact among themselves and with technologies, such as culture, rules, standards, codes, procedures, and forms (Levitt and March 1988; Becker 2004; Pentland and Feldman 2005). Those routines evolve as members retain workable ones and discard unworkable ones each time they encounter uncertain situations, and thus routines are repositories of the results of organizational learning (Nelson and Winter 1982; Feldman 1984; Battenhausen and Murninghan 1985; Gersick and Hackman 1990; Cohen 1996; Cohen and Bacdayan 1996; Feldman and Pentland 2003; Becker 2004; Pentland and Feldman 2005). Organizational learning occurs when members cannot attain their aspiration levels, in other words, when organizational performance falls below their expectations (March and Simon 1958; Levitt and March 1988). The question, therefore, is whether near misses are sufficient to motivate members to change existing routines, and the existing literature provides conflicting views on this point.

A classic view on organizational routines offers that changes in the routines need external shocks, which are clear and imminent threats to organizational performance (Cyert and March 1963; Nelson and Winter 1982; Feldman 1984; Gersick and Hackman 1990). In this view, accidents and extremely severe near misses are necessary for change. A more recent view submits that members proactively change routines without those shocks by judging which routines are working well in a certain environment (Edmondson et al. 2001; Feldman 2003; Feldman and Pentland 2003; Howard-Grenville 2005; Levinthal and Rerup 2006). In this case, accidents and severe near misses are not necessary, but it may be sufficient for members to observe minor gaps between what is intended and what is actually occurring. In addition to these two views, case studies and investigation reports on large-scale accidents suggest that near misses do not lead to changes in organizational routines (e.g., Presidential Commission on the Space Shuttle Challenger Accident 1986; Starbuck and Milliken 1988; Vaughan 1996; Columbia Accident Investigation Board 2003; National Commission on Terrorist Attacks upon the United States 2004; The BP US Refineries Independent Safety Review Panel 2007; Mahler 2009; Hopkins 2010; National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling 2011). The case of the 1986 space shuttle Challenger, among others, illustrates how near misses are insufficient for organizational learning and changes in routines.

The Challenger exploded 73 seconds after its launch on January 28, 1986, killing all seven astronauts on board and grounding all space shuttle flights for more than two years. The direct cause of the explosion was said to be the weather conditions (Presidential Commission on the Space Shuttle Challenger Accident 1986). A combination of cold temperature and strong wind compromised the sealing capability of the O-rings in the joints of the Solid Rocket Boosters (SRBs) attached to the propellant tank. Hot gas leaking from the SRBs reached the tank filled with oxygen and hydrogen, causing the explosion that disintegrated the orbiter Challenger (Presidential Commission on the Space Shuttle Challenger Accident 1986). After the accident, the National Aeronautics and Space Administration (NASA) was criticized not only because the accident occurred but also because its managers and its contractor recognized the problems with the O-rings yet kept expanding the allowable limits of acceptable risk (Presidential Commission on the Space Shuttle Challenger Accident 1986).

The SRB joints were considered a critical item whose failure would lead to the loss of human life or of a space shuttle (Presidential Commission on the Space Shuttle Challenger Accident 1986). Thus, in-flight problems of the O-rings that seal the joints can be called near misses, and before the Challenger, which was the 25th flight of the space shuttle, there were 14 flights with O-ring problems (Presidential Commission on the Space Shuttle Challenger Accident 1986:129–131). In most of them, the primary O-rings closer to the solid propellant eroded or experienced blow-by, but the secondary ones worked to prevent hot gas from leaking out of the boosters. However, in flight 51-C in 1985 with the orbiter Discovery, the primary O-ring was penetrated by soot, and the secondary ring was affected by the heat of the burning propellant. These near misses triggered actions in NASA and the contractor, Morton Thiokol, Inc. (MTI). They conducted ground tests, heightened the criticality level of the SRBs as a component, set an action item for MTI, circulated memos, established a task force in MTI and searched for different designs, imposed a launch constraint, and so on (Presidential Commission on the Space Shuttle Challenger Accident 1986). In spite of these actions, their conclusion was always to keep the space shuttle flying with the risk of near misses and, thus, of accidents. In addition, their justifications for the flights evolved as the problems became worse (Presidential Commission on the Space Shuttle Challenger Accident 1986).

For example, after a primary O-ring eroded, they justified subsequent flights by noting that the primary rings were doing their job, that there were sufficient margins, or that there were still the secondary rings. Even after the primary O-ring was penetrated in flight 51-C, subsequent flights were approved on the grounds that the rings had been exposed to hot gas only for a limited time and that the secondary O-ring would still be working. According to the report of the Presidential Commission on the Challenger accident, “NASA and Thiokol accepted escalating risks apparently because they ‘got away with it last time’ … despite a history of persistent O-ring erosion and blow-by …” (Presidential Commission on the Space Shuttle Challenger Accident 1986:148). In short, instead of causing NASA and the contractor to change their routines, near misses provided them with an excuse to proceed with known risks until the accident actually occurred.

On this insufficiency of near misses to change routines, Dillon and Tinsley (2008), referring to the case of the space shuttle Columbia accident in 2003, provide an explanation from the perspective of managers’ psychology. According to the article, near misses shift managers’ risk perceptions in an optimistic direction, and as managers are exposed to more cases of near misses, they become more likely to underestimate risks. For the managers, near misses in hindsight mean not danger or failure but cases of success in which actual disasters were avoided. Thus, statistical risks are not correlated with perceived risks in their minds. Since a task ends as a success (in spite of near misses) and perceived risks are low, there is no reason to change existing routines. This explanation provides helpful insights for examining how justification shift occurs. It clearly supports the idea that “gaps” between what is intended and what is actually occurring and a series of “overall successes” may cancel the perceived impact of near misses in individuals’ minds.

However, organizational learning and changes in routines, or lack thereof, involve actions among members as the problem of O-rings induced various actions in NASA and MTI. They discussed the problem in and beyond their organizations and tried to sell how serious the problem was, and some of them eventually tried to delay the launch of the Challenger. However, their actions, or interactions, never changed what they were doing. There seems to have been something more than individual risk perceptions in the Challenger case. Here, it is necessary to pay attention to the third prerequisite of justification shift, burden of proof.

Burden of proof is a legal concept under which a claiming party is obliged to make reasonable people believe that a claim is true beyond a reasonable doubt (Simon and Mahan 1971; Jeffries and Stephan 1979). For example, in criminal cases in countries where defendants are presumed innocent until proven guilty, prosecutors have to persuade juries and/or judges with “proof” reasonable enough to believe that the defendants committed the crimes. A similar burden rests with those who want to change existing routines: they are responsible for providing proof, from cases of near misses, that existing routines do not work as intended and for making their counterparts believe that changing the routines is a reasonable choice. However, the burden may work as a barrier to such changes and thus to organizational learning, especially when a project, program, or organization goes through a series of overall successes. The existing literature and the case of the Challenger suggest how overall successes create the burden and how the burden becomes a barrier to changing existing routines.

As described in the paragraphs on the insufficiency of near misses, organizational routines evolve as organizational learning proceeds. Each time members encounter uncertain situations, they first rely on similar cases to decide what to do, and then evaluate how their choices do or do not work (Weick 1979, 1995; Battenhausen and Murninghan 1985). Workable choices are selected and retained, becoming routines, whereas unworkable ones are discarded. In the process, as Dillon and Tinsley (2008) explain, near misses are considered overall successes, not disasters or failures that demand that existing routines be discarded. Although minor tuning of the routines may occur, no drastic changes in the way of doing business follow near misses. Burden of proof emerges in parallel with this evolutionary process of organizational routines in the following manner.

First, organizational routines that survive the process become proven and reliable ones. At least, each time members learn from their choices, the repositories of their learning, that is, organizational routines, become the best available to them. Second, as similar near misses are repeated without causing accidents (in other words, as an organization goes through a series of overall successes), the best available routines repeatedly prove themselves, reinforcing their reliability. It is assumed that under the same conditions, the routines are constantly effective. According to the literature on organizational learning (March and Simon 1958; Cyert and March 1963; Herriott et al. 1985; Levitt and March 1988; Huber 1996), repeated overall successes reinforce members’ choice to keep relying on the routines. Finally, someone who is concerned about the risks that repeated near misses signal has to face colleagues who rely on the best available routines. S/he has to prove that the routines are dangerous enough that colleagues are persuaded of the reasonable need to change them. If the burden is successfully overcome, the routines in question may be changed and no accident may occur. However, with a series of overall successes, proving risks in existing routines becomes a demanding task whereas keeping the status quo becomes easier. The case of the Challenger casts light on how difficult the task is.

The difficulty of overcoming burden of proof is highlighted in the teleconference between NASA and MTI the night before the Challenger launch. Although the fatal launch decision was the result of actions by experts and professionals over more than five years, which some called a “normalization of deviance” (Vaughan 1996) or “fine-tuning the odds” until a system breaks down (Starbuck and Milliken 1988), the mechanism of burden of proof is the same in the teleconference and in the long-term process. Discussions revolve around why the best available routines will not work as expected this time although members are facing the possibility of the same near misses. In retrospect, it is clear that the weather conditions, especially the cold temperature, were decisive in the case of the Challenger (Presidential Commission on the Space Shuttle Challenger Accident 1986), but to the experts and professionals in the teleconference, this was what they had been doing for several years. As a result, the teleconference became a conflict between those who believed that the risks that the near misses suggested were high enough to delay the launch and those who believed that the routines that had survived with the near misses were reliable enough to launch the Challenger.

At the teleconference, the believers of the risks, who were mainly engineers of MTI but included the Vice President and Director of Engineering, recommended not launching the Challenger below 53 degrees Fahrenheit because in flight 51-C the primary O-ring of an SRB joint had been penetrated at that temperature (Presidential Commission on the Space Shuttle Challenger Accident 1986). However, the believers of routines also had reasons not to support what their counterparts demanded. First, there was no requirement about the temperature of the SRB joints for launching a shuttle or canceling a launch (Presidential Commission on the Space Shuttle Challenger Accident 1986). This absence of a criterion left room for different opinions about how risky the condition was. Second, the MTI engineers could not quantify the risk of the cold temperature because the available data were insufficient; there were only 24 flights before the Challenger and far fewer cases of primary O-ring burn-through at extremely cold temperatures (Presidential Commission on the Space Shuttle Challenger Accident 1986). The lack of data added uncertainty to the demand of the believers of the risks and left room for discussion on how reliable the SRB joints were under the condition. Thus, the believers of routines wanted an explanation of why the launch at the cold temperature would not work this specific time when previous launches had. In addition, the explanation had to be persuasive enough for them to give up their beliefs in the proven, best available routines.

In the conflict between the two camps, MTI switched its position from anti-launch to pro-launch. It justified the new recommendation for the following reasons (Presidential Commission on the Space Shuttle Challenger Accident 1986). First, the data were not conclusive about the relationship between the cold temperature and O-ring behavior. Second, the secondary O-ring would work even if the primary one did not, although the cold temperature might make the rings harder and slower in sealing the joints. Finally, the solid rocket motors would not behave significantly differently between 51-C and this launch of the Challenger, although the O-rings of the latter might be 20 degrees colder than those of the former. In short, the believers of the risks could not persuade the believers of routines that it would be different this time. Prior cases of near misses, or a series of overall successes, were used to justify existing routines, which in this case meant launching space shuttles with the expectation that the secondary rings would do their job. Burden of proof was not overcome, and the Challenger accident occurred.

The above review shows how justification shift occurs and how the decision criteria on a chosen course of action change from the risk of a reported event to the reliability of existing routines (Fig. 1).

Fig. 1 A process of the justification shift

First, gaps between what is intended and what is occurring, such as near misses, are not enough by themselves to change routines. Second, each time overall successes are repeated, routines are proven reliable, and their reliability is reinforced. At the same time, burden of proof builds up for those who are concerned about risks in the routines. Third, if those who have concerns fail to overcome the burden, the reliability of the routines is overestimated whereas the risks of the gaps are underestimated. At this stage, a lack of clear requirements and sufficient data aggravates the burden of proof and may make justification shift more likely and more severe. In Fig. 1, “psychological distances” from cases of near misses, in addition to a series of overall successes, also contribute to reinforcing the reliability of existing routines. The distances determine how directly agents experience the cases. Dillon and Tinsley (2008) suggest that the more frequently overall successes are repeated, the more likely managers are to underestimate the risks of near misses. However, the behavior of experts and professionals at the teleconference in the Challenger case suggests that frequent exposure to near misses may not be the only cause of the underestimation.

First, most of the experts and professionals in NASA and MTI went through the same history of overall successes, and some of them supported the recommendation to delay the launch (Presidential Commission on the Space Shuttle Challenger Accident 1986). Second, the supporters initially included the management of MTI, especially the Vice President and Director of Engineering, and a manager of NASA (Presidential Commission on the Space Shuttle Challenger Accident 1986). Thus, their positions and functions did not determine the fault line between the supporters and the others. Third, however, most of the MTI management was not as determined about the recommendation as its engineers were, and at the end of the teleconference, the management of MTI, including the Vice President and Director of Engineering, flipped the recommendation (Presidential Commission on the Space Shuttle Challenger Accident 1986). The clear difference between those who were tenacious and those who were not was the “distance,” in other words, how closely and directly they had monitored the problems of the O-rings. Engineers who had directly studied the O-rings never changed their opinions (Presidential Commission on the Space Shuttle Challenger Accident 1986). In addition, although the Vice President and Director of Engineering switched his position, he had been informed of what those engineers were struggling for. This episode of the teleconference also indicates that experts and professionals are more likely to believe that existing routines are risky as their psychological distances become smaller, and more likely to believe that the risk is small as the distances become larger.

The process in Fig. 1 raises an important question about how justification shift occurs. The lack of requirements and data is a reason why near misses are precious learning opportunities (Tamuz 1987; March et al. 1996; Weick et al. 1999; Phimister et al. 2003; Oktem et al. 2010). It is a contradiction if near misses are necessary to fill the “void” but the “void” makes near misses ineffective in helping experts and professionals overcome burden of proof. How does the void make it difficult to avoid justification shift under the condition in which experts and professionals have to rely only on near misses as grounds for their concerns? So far, it is clear that accidents may eradicate the void, as NASA changed its safety routines after the Challenger, although another accident, the Columbia, cast doubt on whether the changes were truly effective (e.g., Columbia Accident Investigation Board 2003; Mahler 2009). To answer this question, the authors have developed an agent-based model and run it on a simulation toolkit, Repast Simphony. The model is explained in the next section.

3 An agent-based model of the justification shift

In the model, agents are individual experts and professionals. Figure 2 shows the relationship between the concept model in Fig. 1 and equations in the agent-based model.

Fig. 2 Equations in the agent-based model of the justification shift

Justification shift is measured by differences between the agents’ beliefs in the risks of near misses and their beliefs in the reliability of existing routines. If values of the reliability belief are equal to or larger than those of the risk belief, justification shift is considered to have occurred. In other words,

$$\mathit{JS}_{(t)} =\begin{cases} \mbox{Occurs}& \mbox{if } \mathit{RN}_{(t)i} \leq \mathit{RR}_{(t)i}\\ \mbox{Does not occur}& \mbox{if }\mathit{RN}_{(t)i} > \mathit{RR}_{(t)i}, \end{cases} $$

where $JS_{(t)}$ denotes justification shift at time $t$, $RN$ denotes agents’ beliefs in the risks of near misses, $RR$ denotes their beliefs in the reliability of existing routines, and $i$ is an index for agents.

Initial values of $RN$ and $RR$, denoted $RN_{(t-1)i}$ and $RR_{(t-1)i}$, depend on agents’ psychological distances from cases of near misses in addition to the probabilities that events end as overall successes. The episode of the teleconference suggests that individual experts’ and professionals’ values of $RN$ are inversely related to the distances, or the directness with which they are exposed to cases of near misses, and that values of $RR$ are proportional to the distances. Hence, agents’ initial values of $RN$ and $RR$ are defined as $RN_{(t-1)i} = [\Pr(NM)_{t-1} \cdot (1/D_{(t-1)i})]/100$ and $RR_{(t-1)i} = [(1/\Pr(NM)_{t-1}) \cdot D_{(t-1)i}]/100$, where $\Pr(NM)$ is the probability that near misses occur and events end as overall successes, $D$ is the distance between an agent and near misses, $0 < \Pr(NM) \leq 1$, and $D \in \{0.1, 0.2, \ldots, 1.0\}$. The denominators of the equations control the ranges of $RN$ and $RR$ so that their values do not explode. It is necessary to take the inverse of the probability of near misses in the equation for $RR$ because otherwise values of $RR$ would become smaller as the probability decreases, which is illogical. The above ranges of $\Pr(NM)$ and $D$ are set for the following reasons. First, if no near misses occur, that is, if $\Pr(NM)_{t-1} = 0$, no justification shift follows. Second, if $D = 0$, someone is directly involved in a near miss, such as a victim of a severe incident, and no justification shift follows in this case either. In the model, values of $\Pr(NM)$ are controlled for comparison across runs of the simulation, but those of $D$ are randomly assigned to individual agents because, as the episode of the Challenger teleconference shows, none of their positions, functions, or physical distances from near misses seems to predetermine the values.
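As a concrete illustration, the following Python sketch computes one agent’s initial beliefs and applies the justification-shift test defined above. It is a minimal reading of the two equations, not the authors’ Repast Simphony implementation; the function and variable names (initial_beliefs, justification_shift, pr_nm, distance) are this sketch’s own, and the example value of 0.150 is one of the probabilities quoted later in the text.

```python
import random

def initial_beliefs(pr_nm: float, distance: float) -> tuple[float, float]:
    """Initial beliefs of one agent, following the equations in the text:
    RN = [Pr(NM) * (1/D)] / 100 and RR = [(1/Pr(NM)) * D] / 100."""
    rn = (pr_nm * (1.0 / distance)) / 100.0   # belief in risks of near misses
    rr = ((1.0 / pr_nm) * distance) / 100.0   # belief in reliability of routines
    return rn, rr

def justification_shift(rn: float, rr: float) -> bool:
    """Justification shift occurs when RN <= RR."""
    return rn <= rr

# Pr(NM) is fixed per condition; D is drawn at random from {0.1, 0.2, ..., 1.0}.
pr_nm = 0.150
distance = random.choice([round(0.1 * k, 1) for k in range(1, 11)])
rn0, rr0 = initial_beliefs(pr_nm, distance)
print(rn0, rr0, justification_shift(rn0, rr0))
```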

With these various values of $RN_{(t-1)}$ and $RR_{(t-1)}$, agents negotiate for or against their beliefs. The negotiation determines how agents’ values of $RN_{(t)}$ and $RR_{(t)}$ change from those at $t-1$, and thus whether justification shift occurs at time $t$. As explained in the previous section, justification shift occurs unless believers of risks overcome burden of proof. Thus, agents become believers of risks and face the burden if their values of $RN_{(t-1)}$ are larger than their values of $RR_{(t-1)}$. Conversely, agents whose values of $RR_{(t-1)}$ are at least as large as their values of $RN_{(t-1)}$ become those who have to be persuaded by the former group of agents. In the process, the void created by lack of requirements and data may make the burden more taxing. The question is to what extent the void increases the burden.

For individual agents, values of $RR_{(t-1)}$ reflect how attached they are to existing routines, and values of $RN_{(t-1)}$ represent to what extent they believe the risks of near misses. The episode of the Challenger teleconference suggests that the more attached agents are to the routines and the less they believe in the risks, the more uncomfortable they are with insufficient data and the more comfortable they are with the lack of requirements. In negotiating their positions, believers of risks try to sell their own values of $RN_{(t-1)}$ and $RR_{(t-1)}$, but they have to overcome the attachment and disbelief on the side of their counterparts, which are the differences $RN_{(t-1)i} - RN_{(t-1)j}$ and $|RR_{(t-1)i} - RR_{(t-1)j}|$, where $j$ is an index for believers of routines. The sum of the differences equals the burden that believers of risks face in the negotiation to persuade a believer of routines, thus $B_{(t-1)j} = RN_{(t-1)i} - RN_{(t-1)j} + |RR_{(t-1)i} - RR_{(t-1)j}|$, where $B$ denotes burden of proof. However, $B_{(t-1)j}$ is a minimum value of the burden because it does not accommodate the effects of the void created by lack of requirements and lack of data. As the case of the Challenger shows, these effects seem to be more onerous when cases of near misses are rare, because the infrequency of near misses makes it difficult to obtain sufficient data to prove or disprove beliefs in risks and in the reliability of routines. In other words, the effects become more cumbersome as the probability of near misses, $\Pr(NM)$, becomes lower. To satisfy this condition, the equation for burden of proof is changed into $AB_{(t-1)j} = B_{(t-1)j} + [B_{(t-1)j} \cdot \log_{10}(1/\Pr(NM)_{t-1})] = B_{(t-1)j} \cdot [1 + \log_{10}(1/\Pr(NM)_{t-1})]$, where $AB$ is the “aggravated” burden of proof. The aggravated burden of proof may increase the degree of justification shift by making it more difficult for believers of risks to negotiate with believers of routines.
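The two burden equations translate directly into the short Python sketch below; it is a transcription of $B$ and $AB$ as defined above, with illustrative names rather than the authors’ own code.

```python
import math

def burden(rn_i: float, rr_i: float, rn_j: float, rr_j: float) -> float:
    """Minimum burden of proof that a believer of risks (agent i) faces in
    persuading a believer of routines (agent j):
    B = (RN_i - RN_j) + |RR_i - RR_j|."""
    return (rn_i - rn_j) + abs(rr_i - rr_j)

def aggravated_burden(b: float, pr_nm: float) -> float:
    """Aggravated burden AB = B * [1 + log10(1/Pr(NM))]; the added term grows
    as near misses become rarer, standing in for the void left by scarce data
    and missing requirements."""
    return b * (1.0 + math.log10(1.0 / pr_nm))
```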

In the negotiation, agents make a decision as follows (Fig. 3). At the beginning of the process, agents check their own and other agents’ values of $RN$ and $RR$. If $RN_{i} > RR_{i}$, they are believers of risks, and they try to persuade believers of routines, that is, agents with $RN_{j} \leq RR_{j}$. The goal of the negotiation is to change the values of $RN_{j}$ and $RR_{j}$ to $RN_{i}$ and $RR_{i}$, but believers of routines resist the attempt depending on the values of their $AB_{(t-1)j}$. The values of $AB_{(t-1)}$ determine the probability that a believer of routines accepts the values of $RN_{i}$ and $RR_{i}$ of an agent in the negotiation, $\Pr(R_{ji})$, with the following equation: $\Pr(R_{ji}) = |\log_{1000}(1/AB_{(t-1)j})|$. The base of the logarithm, 1000, has been chosen so that the range of the probability becomes $0 \leq \Pr(R_{ji}) \leq 1$. Believers of risks are also vulnerable to the influence of their counterparts in the negotiation, especially if they are at the border between the two camps of believers, as demonstrated by MTI’s Vice President and Director of Engineering changing his opinion about the Challenger launch. Therefore, in the model, if the distance of a believer of risks from near misses, $D_{i}$, is smaller by 0.1 than the distance of a believer of routines that the former agent encounters, $D_{j}$, the former accepts the opinions of the latter, $RN_{j}$ and $RR_{j}$, with the probability $\Pr(R_{ij}) = |\log_{1000}(1/AB_{(t-1)i})|$. Since the distances are drawn from the set $D = \{0.1, 0.2, \ldots, 1.0\}$, this vulnerability emerges only for believers of risks at the border.
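The decision rule of Fig. 3 can be read roughly as in the Python sketch below. The dictionary-based agents, the interpretation of “smaller by 0.1” as exactly one step, and the treatment of the two acceptance rules as mutually exclusive in a single encounter are assumptions of this sketch, not details confirmed by the original model.

```python
import math
import random

def acceptance_probability(ab: float) -> float:
    """Pr(R) = |log_1000(1/AB)|: probability that an agent accepts the other
    camp's values of RN and RR, given the (aggravated) burden of proof AB."""
    return abs(math.log(1.0 / ab, 1000))

def negotiate(risk_believer: dict, routine_believer: dict) -> None:
    """One pairwise negotiation. risk_believer has RN > RR; routine_believer
    has RN <= RR. Keys 'rn', 'rr', 'd', and 'ab' are illustrative names."""
    if random.random() < acceptance_probability(routine_believer["ab"]):
        # The believer of routines is persuaded and adopts the risk believer's beliefs.
        routine_believer["rn"] = risk_believer["rn"]
        routine_believer["rr"] = risk_believer["rr"]
    elif (math.isclose(routine_believer["d"] - risk_believer["d"], 0.1)
          and random.random() < acceptance_probability(risk_believer["ab"])):
        # A believer of risks at the border (distance one step smaller) is
        # vulnerable and may adopt the routine believer's beliefs instead.
        risk_believer["rn"] = routine_believer["rn"]
        risk_believer["rr"] = routine_believer["rr"]
```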

Fig. 3 Agents’ decision rules

Changes in the above variables are compared across runs of the simulation to examine effects of the void on justification shift.

There are eight conditions for the comparison (Table 1). The probabilities of near misses in the table are based on hypothetical probabilities (0.042 and 0.583) and actual ones (0.150 and 1.000) of O-ring problems before the Challenger accident. The hypothetical probabilities are the simple odds that O-ring erosion or blow-by occurred in the history of the 24 space shuttle flights. If only the severest of those cases, flight 51-C, is considered, the odds are 0.042, that is, 1 out of 24 flights. On the contrary, if all cases are considered regardless of severity, the odds are 14/24 = 0.583. However, the Presidential Commission argued that if experts and professionals had examined the cases of heat distress of the O-rings as a function of launch temperature more seriously, the probabilities would have been 1.000 (4 out of 4 flights) at temperatures below 65 degrees Fahrenheit versus 0.150 (3 out of 20 flights) above 65 degrees (Presidential Commission on the Space Shuttle Challenger Accident 1986:146). With these four different probabilities, it is possible to examine whether the launch decision would have been different if experts and professionals had been more cautious, as the Commission suggested, even when the only available sources for their discussions were cases of near misses.

Table 1 Conditions for the Simulation (agents=34)

In addition, the behaviors of the variables are compared under two types of burden of proof, as shown in the far right column of Table 1. The two types of burden are necessary to examine the effects of the lack of requirements and data on justification shift. If there are differences in the behaviors between a condition with the minimum burden and one with the aggravated burden, and especially if the degrees and frequencies of justification shift are larger with the latter, it is possible to say that the lack of requirements and data makes experts and professionals more likely to underestimate the risks of known near misses vis-à-vis overestimating the reliability of existing routines. Otherwise, justification shift occurs with a series of overall successes, psychological distances between experts/professionals and near misses, and a minimum burden of proof that is not aggravated by the lack of requirements and data. Under the conditions with the minimum burden of proof, Conditions 1, 3, 5, and 7, $\Pr(R_{ji}) = |\log_{1000}(1/AB_{(t-1)j})|$ is changed into $\Pr(R_{ji}) = |\log_{1000}(1/B_{(t-1)j})|$, and $\Pr(R_{ij}) = |\log_{1000}(1/AB_{(t-1)i})|$ is changed into $\Pr(R_{ij}) = |\log_{1000}(1/B_{(t-1)i})|$. In the simulation, the number of agents is set to 34, which is the number of participants in the final phase of the teleconference the night before the Challenger launch (Presidential Commission on the Space Shuttle Challenger Accident 1986:111).
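In code, the eight conditions of Table 1 can be laid out as follows. The probability values are the ones quoted in the text, and the ordering (minimum burden for odd-numbered conditions, aggravated burden for even-numbered ones, probabilities ascending) follows the description above; this is an illustrative configuration, not the original one.

```python
# The eight simulation conditions (Table 1): four probabilities of near misses,
# each crossed with minimum vs. aggravated burden of proof. 34 agents, matching
# the final phase of the pre-launch teleconference.
N_AGENTS = 34

PROBABILITIES = (0.042, 0.150, 0.583, 1.000)   # Pr(NM) per pair of conditions
BURDEN_TYPES = ("minimum", "aggravated")        # Conditions 1,3,5,7 vs. 2,4,6,8

CONDITIONS = [
    {"condition": 2 * p_idx + b_idx + 1, "pr_nm": pr, "burden": burden_type}
    for p_idx, pr in enumerate(PROBABILITIES)
    for b_idx, burden_type in enumerate(BURDEN_TYPES)
]
```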

The simulation has been run a hundred times (100 runs) for each of the eight conditions until the steady state is reached. The steady state represents agents’ decisions after they have been exposed to a series of overall successes. In order to examine the effects of the void created by the lack of requirements and data, agents’ decisions at the steady state have been compared across the eight conditions in the following manner. First, the number of believers of routines who commit justification shift has been compared with the number who avoid it. Second, the number of believers of risks who fall into the shift has been compared with the number who avoid it. Finally, the degrees of justification shift have been compared across conditions using the mean of the differences between values of beliefs in the risks of near misses (RN) and values of beliefs in existing routines (RR) before and after agents’ negotiations. These comparisons have led to surprising results and interesting insights on justification shift and agents’ decisions, as explained in the next two sections.
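A sketch of these outcome measures follows. Reading the “degree” of justification shift as the change in the mean gap between RR and RN across the negotiations is this sketch’s interpretation of the text, and the data structures and names are illustrative rather than the authors’ own.

```python
from statistics import mean

def summarize_run(before: list, after: list) -> dict:
    """Outcome measures for one run at steady state. 'before' and 'after' hold
    one (RN, RR) pair per agent, prior to and following the negotiations."""
    routines_js = risks_js = 0
    for (rn0, rr0), (rn1, rr1) in zip(before, after):
        shifted = rn1 <= rr1                  # justification shift at steady state
        if rn0 <= rr0:                        # started as a believer of routines
            routines_js += shifted
        else:                                 # started as a believer of risks
            risks_js += shifted
    # Degree of justification shift: change in the mean gap between beliefs in
    # routines and beliefs in risks before vs. after the negotiations.
    degree = mean(rr - rn for rn, rr in after) - mean(rr - rn for rn, rr in before)
    return {"routines_js": routines_js, "risks_js": risks_js, "degree": degree}
```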

4 Results of the agent-based simulation on the justification shift

The most important finding from the simulation is that aggravated burden of proof tends not to make a difference in the frequencies and degrees of justification shift. Figures 4 and 5 show how many believers of routines and believers of risks fall prey to justification shift.

Fig. 4 Justification shift among believers of routines (n=3,400 per condition)

Fig. 5 Justification shift among believers of risks (n=3,400 per condition)

From these figures, four points immediately become clear. First, the number of agents that commit justification shift decreases as the probability of near misses becomes larger from Condition 1 to Condition 8, and all agents eventually avoid the shift, regardless of their beliefs, once the probability exceeds 50 %, starting with Condition 5. Second, the number of believers of routines also decreases as the probability becomes larger. Third, there are no major differences in the outcomes of justification shift between conditions with minimum burden of proof and with aggravated burden of proof, that is, between Conditions 1 and 2, 3 and 4, 5 and 6, and 7 and 8. Fourth, however, under Conditions 3 and 4, some believers of routines avoid justification shift whereas most believers of risks commit it, which raises a concern: under both conditions, only a fraction of the agents are as tenacious as the engineers at the teleconference on the Challenger launch were.

The first and second points do not seem surprising at all, and they show that less uncertainty, overall, may contribute to avoiding justification shift. Under Conditions 5 to 8, even believers of routines eventually and completely give up their beliefs and avoid justification shift. Higher probabilities of near misses under those conditions mean more cases, and thus more data on near misses and requirements based on the data. As more data on near misses accumulate, it becomes easier for believers of risks to refute their counterparts’ beliefs in existing routines. In addition, it is more likely that those accumulated cases evolve into stipulated requirements on what to do with the near misses, such as canceling a space shuttle launch in extremely cold temperatures. Therefore, burden of proof seems to become easier to overcome, and justification shift seems to become more avoidable. The two points also seem to support the argument by the Presidential Commission on the Challenger accident that if experts and professionals had examined the relationship between launch temperatures and O-ring malfunction more seriously, their decision might have been different. The probability of near misses under Conditions 7 and 8 is 1.000, based on the Commission’s finding that if the cut point of the examination had been set at 65 degrees Fahrenheit, the odds of malfunction were 100 % below that temperature (see Table 1). With that probability, the fewest agents among all eight conditions become believers of routines at the beginning, and those few believers eventually avoid justification shift.

However, the benefit of more data and the possibility of no Challenger accident seem to be overrated, for two reasons. First, between two conditions with the same probability of near misses, aggravated burden of proof does not make a difference in the frequencies and degrees of justification shift. Second, experts and professionals in the case of the Challenger focused on the effects of “cold” temperatures, concentrating on the severest case of the malfunction at 53 degrees, the launch temperature of flight 51-C, instead of the four flights below 65 degrees. As a result, the probabilities of near misses that they faced were far lower than 1.000. These reasons necessitate delving into the third and fourth points above.

The third point possibly provides an antithesis to the benefit of more data and less uncertainty. If aggravated burden of proof does not make a difference from minimum burden of proof, its lack of significance suggests that the void created by the lack of requirements and data does not actually worsen the quality of agents’ decisions, even when larger uncertainty increases the burden of proof on believers of risks. In other words, more uncertainty does not necessarily increase the frequency of justification shift among them. In addition, aggravated burden of proof does not make a difference in the degrees of justification shift. Table 2 shows the degrees of justification shift under Conditions 1 to 4, which are represented by sample means of the differences between values of beliefs in risks and values of beliefs in existing routines before and after negotiations. If the mean becomes larger, the negotiations cause agents’ beliefs to shift toward justifying existing routines. If not, the negotiations help agents avoid the justification. Statistical parameters other than the mean are described in the Appendix (Table 3) at the end of this article.

Table 2 Degrees of justification shift (sample means)

In Table 2, the degrees of justification shift are the same between two conditions if their probabilities of near misses are the same, as in Conditions 1 and 2 and in Conditions 3 and 4. Thus, not only is the frequency of justification shift almost the same, but its degrees are also the same regardless of burden of proof aggravated by the void in requirements and data. The same degrees of the shift suggest that more or less uncertainty does not make a difference in the precision of agents’ decisions, especially when the probability of near misses is lower than 20 %, as in the above four conditions. In short, the void caused by uncertainty does not make a difference in the frequency and precision of agents’ decisions as far as the decisions are based on known risks of near misses.

The lack of difference implies that unless experts and professionals are facing rare and clearly problematic situations in which the same near misses repeat in every event or once in every two events, justification shift is not likely to be avoidable. It also implies that the lack of requirements and data is not a reasonable excuse for experts and professionals to overestimate the reliability of existing routines given the available data on near misses. In other words, it is probably useless to expect that another trial will generate sufficient data to reduce the uncertainty. Instead, such a trial means taking the very risk that believers of risks warn of, without improving the quality of decisions. It is not worthwhile to disregard the concerns of the believers of risks with the excuse that there are still margins to keep going because requirements are incomplete and data are insufficient.

Conditions 3 and 4 in the above figures and table also illustrate the difficulty of avoiding justification shift and of making the lack of requirements and data an excuse to keep existing routines. Under these conditions, only a few percent of believers of routines give up their beliefs, and it is especially notable that more than 95 % of believers of risks commit justification shift. As an overall trend, it is clear that the number of believers of risks becomes larger and the ratio of “converters” among them decreases as the probability of near misses increases. However, experts and professionals make their decisions not on the overall trend but with a certain probability of near misses, which is 0.150 under Conditions 3 and 4. The probability is based on the odds of O-ring malfunction at launch temperatures above 65 degrees, 3 out of 20 space shuttle flights. If there had not been the four cases of malfunction below that temperature, which were used for the probability of 100 % under Conditions 7 and 8, experts and professionals in NASA and MTI would have had to make their decisions based on that probability and on the data from those three cases.

Unfortunately, as shown in Figs. 4 and 5, the decisions by experts and professionals under these conditions would have been to go ahead with the launch of the Challenger. Although some believers of routines give up their beliefs, 97–98 % of agents, regardless of their beliefs, commit justification shift, overestimating the reliability of existing routines. The number of believers of risks is larger than under Conditions 1 and 2, but the number of converters among them is clearly far larger than under Conditions 5 to 8. In addition, no major difference exists between the two conditions in the distribution of agents that commit justification shift in the figures or in the degrees of justification shift in Table 2. Thus, believers of risks face a burden of proof rendered by their counterparts’ beliefs, but their eventual choices are not influenced by the lack of requirements and data, that is, by uncertainty. The probability of near misses under these conditions, 0.150, means that near misses occur more than once in every ten events. Whether this probability is considered low enough to neglect those cases may depend on industries and situations. However, regardless of that judgment, the outcomes of agents’ negotiations tend to be overestimation of the reliability of existing routines vis-à-vis underestimation of the risks of near misses.

The results of the simulation show that aggravated burden of proof, that is, the void or uncertainty created by the lack of requirements and data, does not make a difference in the frequency and precision of agents’ decisions to avoid justification shift. This finding holds especially when agents have to make a decision based on the available data on near misses, in other words, a certain amount of data predetermined by the probabilities of cases of near misses. Under those conditions, keeping existing routines and going ahead with another event under an excuse of uncertainty is useless for enhancing the quality of decisions and is also equal to wasting the warnings of believers of risks. In such a situation, uncertainty is no more than an excuse to justify overestimation of existing routines although the available data suggest that those routines carry risks about which some experts and professionals raise a red flag.

Outcomes under Conditions 1 and 2 paradoxically highlight the problems of wasted warnings and justified routines. The probability in the two conditions, 0.042, hypothetically replicates the focus of the Challenger teleconference on the O-ring malfunction at 53 degrees, that is, flight 51-C, 1 out of 24 flights. With that probability, none of the agents becomes a believer of risks at the beginning, and none of them naturally gives up his or her beliefs because no negotiations occur. The extreme outcomes depict how the original no-launch recommendation by MTI was a rare case and how experts and professionals failed to embrace its value at the teleconference. According to the report of the Presidential Commission, it was actually rare that contractors recommended against a launch, although they were allowed to if necessary (Presidential Commission on the Space Shuttle Challenger Accident 1986). As described in the literature review, the rare warnings by MTI engineers were underestimated, and others’ beliefs in the reliability of the O-rings, that is, in the launch routines at the cold temperature, were overestimated. To justify the overestimation, the lack of requirements and data provided believers of routines a foothold, but from the results of the simulation, that foothold seems to have been no more than an excuse to keep the launch routines.

5 Discussions and conclusion

This article is about how knowledge of low-probability near misses reinforces the beliefs of experts and professionals in existing organizational routines and under what conditions underestimation of the risks of known near misses vis-à-vis overestimation of the routines occurs. To explain the process, an agent-based model has been developed by drawing on the literature of organizational studies and the real-life case of the Challenger accident. The results of the simulation show that uncertainty due to the lack of requirements and data does not make a difference in the frequencies and degrees of underestimation of the known risks of near misses vis-à-vis overestimation of the reliability of existing routines, that is, justification shift. It is also clear from the results that as the probability of near misses becomes unusually high, justification shift is less likely to occur.

The findings of this study lead to two noteworthy points concerning how knowledge of near misses contributes to justification shift. First, efforts to collect data on near misses and to accumulate knowledge of their cases seem to help avoid justification shift, as the overall trend in its frequency in the last section shows. Second, however, once experts and professionals have to decide whether existing routines are too risky to keep, making another trial to fill the void created by the lack of clear requirements and sufficient data will not serve the quality of the decision. The frequencies and degrees of justification shift do not change even with burden of proof aggravated by the lack of requirements and data; in other words, agents’ decisions are not likely to change even without the uncertainty. This point is true especially when experts and professionals have to rely on the data available on near misses at a specific time of decision, that is, on a fixed rate of occurrence of those near misses.

Near misses help to avoid justification shift only when their probabilities are higher than 50 %. Thus, reporting and recording cases of near misses so that experts and professionals can utilize the knowledge is a good practice. However, if the probabilities are that high, it is natural that experts and professionals will pay attention to the problems. The issue is what occurs when the probabilities are not as high. The results of the simulation show that, in such a case, they fail to avoid justification shift. Risky organizational routines then do not change, and experts and professionals keep working with the routines until external shocks force them to abandon the routines. In the process, the available data provide them not with reasons to change their course of action but with excuses to justify it, such as that the risk does not seem risky enough or that there are still margins to go. In other words, organizational routines are reinforced as they survive with near misses, and the reinforcement places burden of proof on those who believe the routines are risky and want to change them.

The above points imply that despite the benefits of the near-miss management system in general, it is another question whether the knowledge is utilized as the system expects when experts and professionals make a specific decision on their next move with low-probability near misses. The findings in this article suggest that uncertainty in requirements and data is not the cause of the unexpected response of experts and professionals to the knowledge, that is, of justification shift. At least as far as comparisons across the eight conditions in the simulation go, that uncertainty is not a problem in terms of the frequency and degrees of justification shift. The beliefs of experts and professionals, whether in the risks of near misses or in the reliability of existing routines, have already been established by the probability of overall successes and by psychological distances from cases of near misses. When believers of risks want to negotiate with believers of routines, they face burden of proof rendered by their counterparts’ beliefs, which may lead to justification shift under some conditions. However, even if burden of proof is aggravated by uncertainty in requirements and data, the outcomes of their negotiations are the same: the same frequencies and degrees of underestimating the risks of near misses and overestimating existing routines.

Then why is uncertainty mentioned as a reason to make another trial with a given probability of near misses, that is, with risky routines, as in the case of the Challenger? Existing studies provide two different explanations: data-driven culture (Roberto et al. 2006) and symbolic information gathering (Feldman and March 1981). A data-driven culture is a type of organizational culture that demands objective data to make a decision, especially for drastic changes. However, when experts and professionals have to make a decision with data available from past events, the culture puts them into a dilemma; to increase the accuracy of their decision, they need certainty in the data, but to enhance the certainty of the data, they need more trials. With this circular demand, the most likely outcome is doing nothing drastic, in other words, indecision in favor of existing routines. Symbolic information gathering, on the other hand, is an organizational attitude driven by a belief that gathering more information is a good sign to those inside and outside an organization. In an organization with this belief, experts and professionals tend to seek information in order to respond to incentives from the organization, simply to look for what easily catches attention, or to show how rational they are in making their decisions. However, the gathered information is not utilized for actual decision-making because the motive for collecting it, from the beginning, is not to use it. Information is even presented not as it is but as it conforms to the belief in the symbolic value of information gathering.

The findings in this article suggest that uncertainty becomes a reason for another trial because of symbolic information gathering by experts and professionals. Since more data do not change how justification shift occurs, a data-driven culture, even if such a culture actually exists in an organization, is more likely to be skeletal or ritualistic. On the contrary, mentioning uncertainty as a reason not to change existing routines preserves the appearance of a rational decision-maker, a good member who shares the organizational value of information gathering, and a smart individual who can select the information that deserves attention. As a result, uncertainty is presented as a reason not to stop but to continue a chosen course of action, although efforts to obtain more information will actually have no influence on the decision to keep existing routines.

Finally, this study has implications for further research on how knowledge of near misses may contribute to preventing accidents. The outcomes of this study show that individual beliefs about risks and about the reliability of existing routines may change while experts and professionals are discussing cases of near misses, and that those changes eventually become organizational choices. Therefore, explaining interpersonal dynamics and collective decision-making is necessary for understanding how the knowledge helps to avoid disasters. Regardless of industries and sectors, such as healthcare, transportation, telecommunications, process industries, and utilities, near misses and accidents become concerns when organizations face the public more than when an individual takes his or her own risk. In this regard, a true test of the effectiveness of the near-miss management system is whether organizations, not individual managers or engineers, respond to the knowledge. Individual choices certainly matter, but studying the relationship between knowledge of near misses and the prevention of accidents demands perspectives beyond individual perceptions, decisions, and behaviors regarding near misses. From these efforts, practical, useful, and effective approaches to near misses and accident prevention may also emerge.