Learning about systematic scientific misconduct is a shock to the scientific community (Stroebe, Postmes, & Spears, 2012). Uncovering research fraud typically leads to the retraction of the original article. Retractions are found in all major scientific disciplines (Grieneisen & Zhang, 2012), and although still relatively rare when compared with the overall number of scientific articles, retraction notices are clearly on the rise. Over the past decade, the number of retracted articles has increased approximately tenfold (Steen, 2011), whereas the total number of scientific papers has risen by only 44% (Van Noorden, 2011). Most retracted publications appear to result from misconduct, including fraud, duplicate publication, and plagiarism, whereas error accounts for only about 21% of retracted scientific articles (Fang, Steen, & Casadevall, 2012). But does a retraction of an original article cause readers to no longer believe in the article’s conclusions, or do they still believe in the reported findings even after the removal of their initial evidential basis? The present research examines the extent to which participants believe in the empirical findings of an original article after learning that the authors deliberately attempted to deceive, in that they fabricated their data.

Ideally, corrections ensure that people no longer believe in the perceived truth of misinformation. However, numerous studies have shown that corrections do not work as intended, in that individuals are influenced in their later judgments by misinformation even after correction. For instance, Loftus (1979) found that after witnessing an event, exposure to misleading information often leads a person to report something that was only suggested. This phenomenon has been labeled the misinformation effect (for a review, see Ayers & Reder, 1998). Although the underlying mechanisms are debated (e.g., McCloskey & Zaragoza, 1985), this line of research provides clear evidence that memory impairment may ensue from misleading postevent information.

Later work has shown that even when individuals clearly understand, believe, and remember the correction, they are still influenced by the misinformation (e.g., Ecker, Lewandowsky, Swire, & Chang, 2011). That is, people draw upon misinformation in memory even though they acknowledge the information to be invalid, a phenomenon known as the continued influence effect (Johnson & Seifert, 1994). In one study (Wilkes & Leatherbarrow, 1988), participants received different pieces of information regarding a warehouse fire in progress. Later, they learned a correction of a previous piece of information, which was then followed by a memory task. Results revealed that participants who learned the corrected information were more likely to refer to the discredited material than were participants in a control condition who did not learn the initial misinformation. Moreover, the former group of participants did not significantly differ from participants who learned the information without any correction. In terms of the underlying mechanism, Wilkes and Leatherbarrow suggested that people use the information to make elaborative inferences before the correction occurs. When the correction appears, they are able to acknowledge the misinformation as being invalid, but they are unable to edit all prior inferences from memory. Overall, these studies suggest that even when individuals are aware that an original article has been retracted due to fabricated data, they should be more likely to believe in the truth of the article’s findings than should individuals who were never exposed to the article.

This hypothesis can also be derived from social cognition research showing that individuals are typically reluctant to revise initial beliefs (Greitemeyer & Schulz-Hardt, 2003; Nickerson, 1998). In fact, people persevere in beliefs even after learning that the evidence on which the beliefs were originally based was invalidated (Ross, Lepper, & Hubbard, 1975; Ross, Lepper, Strack, & Steinmetz, 1977). In one study (Ross et al., 1975), participants were presented with a task requiring them to distinguish between authentic and unauthentic suicide notes. After participants received bogus feedback indicating that they had succeeded or failed at the test, they were debriefed about the false nature of the feedback and were asked to estimate their actual performance. Despite this invalidation, participants who were provided with false success feedback still believed that they had performed better than participants who were provided with false failure feedback.

Subsequent research (Anderson, 1983; Anderson, Lepper, & Ross, 1980; Anderson, New, & Speer, 1985) has suggested that attributional processes account for belief perseverance. Once a belief is formed, people generate explanations that fit the evidence. These explanations continue to imply that the belief is correct even after exposure to information that invalidates the evidence once used to support the belief. For instance, Anderson and colleagues (1980) gave participants different case histories of firefighters that indicated either a positive or a negative relationship between risk preference and job performance. Afterward, participants learned that the case histories were fictitious. Those participants who were initially led to believe that riskiness was positively associated with job performance continued to believe that high riskiness predicts job success, whereas those participants who were initially led to believe that riskiness was negatively associated with job performance continued to believe that high riskiness predicts job failure. Results further showed that the availability of causal arguments (explanation availability) was positively correlated with belief perseverance, suggesting that individuals are more likely to persist in their beliefs to the extent that there are relatively more explanations available to the individual to support the belief than to oppose the belief or to support an alternative belief. That is, after being informed that the original evidence has been discredited, the relative availability of supporting to conflicting explanations that were generated before debriefing underlies belief perseverance (Anderson et al., 1985). Anderson (1983) further showed that causal explanations were generated spontaneously and that belief perseverance lasted for at least 1 week after debriefing.

The field of psychology has recently been plagued by a number of fraud cases. Some of the best journals, such as Science, Journal of Experimental Psychology: General, Journal of Personality and Social Psychology, and Psychological Science, had to retract articles that were based on fraudulent data. But do readers take the retraction notice of an original article fully into account in that they no longer believe in the study’s conclusions? On the basis of previous research into the continued influence effect and the perseverance of beliefs, it was expected that the answer would have to be no. Concretely, it was predicted that readers would still believe in the conclusions of an article that was later known to contain fabricated data and subsequently retracted. It was further examined via mediational analyses whether causal explanations may account for participants’ persistent belief in the article’s findings.

The present research

In 2011, Sanna, Chang, Miceli, and Lundberg published an article in the Journal of Experimental Social Psychology. On the basis of the metaphorical relationship between heightened virtue and prosociality, they allegedly found that the embodiment of higher or lower elevation (e.g., sitting higher vs. lower) causally affects prosocial behavior. This article was later retracted, and the data were reportedly fabricated and invalid (Sanna, Chang, Miceli, & Lundberg, 2013). In the present study, some participants learned about the article’s findings, whereas others in the control condition did not. Afterward, participants in the debriefing condition were told that the article had been retracted. Participants in the no-debriefing condition and the control condition received no debriefing. It was predicted that participants in the debriefing condition would estimate the relationship between the embodiment of height and prosocial behavior to be stronger than would participants in the control condition. Moreover, participants in the debriefing condition should be more likely to generate causal explanations as to why the embodiment of height affects prosocial behavior than should participants in the control condition. These differences in the availability of causal explanations should then underlie differences in belief perseverance across the debriefing and the control conditions.

Previous research (e.g., Anderson et al., 1980) has shown that debriefing procedures do affect participants’ beliefs (although insufficiently), and thus it was predicted that participants in the no-debriefing condition would estimate the relationship between the embodiment of height and prosocial behavior to be stronger than would participants in the debriefing condition.

Method

Participants were 158 psychology students (107 female, 50 male; 1 did not respond to this item; mean age = 20.6 years, SD = 1.9) who were randomly assigned to one of three experimental conditions. There were 46 participants in the debriefing condition, 57 participants in the no-debriefing condition, and 55 participants in the control condition. Age of participants did not significantly differ across experimental conditions, F(2, 154) = 1.46, p = .235, η² = .02. The gender distribution was also relatively similar, χ²(2, N = 157) = 0.66, p = .717.

At the onset, participants learned that the researchers were interested in their perception of the relationship between the embodiment of height and prosocial behavior and were given some examples for elevated (e.g., riding up escalators) and lowered (e.g., riding down escalators) embodiment of height. To manipulate participants’ initial beliefs concerning the relationship between the embodiment of height and prosocial behavior, participants in the debriefing condition and the no-debriefing condition were told that research showed that the embodiment of height causally affects prosocial behavior. Note that participants did not actually read the retracted article but only learned a summary of the main findings. I will come back to this issue in the Discussion section. Participants were told that in several studies, the authors had found that (relative to a control condition) elevated physical height increased subsequent helping behavior, whereas lowered physical height decreased subsequent helping behavior. Participants in the control condition were not informed about the article. Afterward, participants were asked to estimate the true relationship between elevated physical height and prosocial behavior and lowered physical height and prosocial behavior, respectively. Both items were assessed on a scale from −5 (negative relationship) to +5 (positive relationship). After the item concerning the relationship between lowered physical height and prosocial behavior was reverse coded, both items were combined into an overall estimated initial relationship between the embodiment of height and prosocial behavior (Cronbach’s α = .86), with higher values representing a positive relationship.
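The scoring step described above can be sketched as follows. This is a minimal illustration, not the author's analysis script: the function names are hypothetical, and averaging the two items is an assumption, since the text says only that the items were "combined". On a scale that is symmetric around zero (−5 to +5), reverse coding amounts to flipping the sign.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = -5, 5  # response scale reported in the Method section


def reverse_code(score: float) -> float:
    """Reverse-code a rating on the symmetric -5..+5 scale (a sign flip)."""
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score {score} is outside the {SCALE_MIN}..{SCALE_MAX} scale")
    return -score


def belief_composite(elevated_item: float, lowered_item: float) -> float:
    """Combine both items into one belief score (higher = more positive relationship).

    Assumption: the two items are averaged; the article does not state
    whether a mean or a sum was used.
    """
    if not SCALE_MIN <= elevated_item <= SCALE_MAX:
        raise ValueError(f"score {elevated_item} is outside the scale")
    return mean([elevated_item, reverse_code(lowered_item)])
```

For a hypothetical participant who rates the elevated-height item at +4 and the lowered-height item at −2, the reverse-coded second item is +2 and the composite is 3.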

Participants were then asked to note all ideas that they had about the relationship between the embodiment of height and prosocial behavior. One rater, who was blind to the experimental condition and to the hypothesis of the study, coded the number of reasons as to (1) why the embodiment of elevated height increases prosocial behavior or why the embodiment of lowered height decreases prosocial behavior and (2) why the embodiment of elevated height decreases prosocial behavior or why the embodiment of lowered height increases prosocial behavior. Explanation availability was then calculated by subtracting (2) from (1), with higher scores indicating a positive influence of the embodiment of elevated height on prosocial behavior and a negative influence of the embodiment of lowered height on prosocial behavior, respectively.
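The difference score described above can be written out in a few lines. The class and field names below are hypothetical and serve only to illustrate the subtraction of category (2) from category (1); the counts in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedExplanations:
    """Rater-coded counts of reasons listed by one participant (hypothetical structure)."""
    supporting: int  # category (1): elevated height increases / lowered height decreases helping
    opposing: int    # category (2): elevated height decreases / lowered height increases helping

    def availability(self) -> int:
        """Explanation availability = category (1) count minus category (2) count."""
        if self.supporting < 0 or self.opposing < 0:
            raise ValueError("counts cannot be negative")
        return self.supporting - self.opposing
```

A participant who lists two supporting reasons and one opposing reason would receive `CodedExplanations(2, 1).availability()`, that is, a score of 1; a participant who lists only opposing reasons receives a negative score.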

After all participants had responded to some filler items, participants in the debriefing condition learned that the article had been retracted because of fabricated data. It was further stressed that there was no scientific evidence as to whether the embodiment of height indeed affects prosocial behavior. The debriefing was printed in bold and enlarged font. Participants in the no-debriefing condition and the control condition did not receive the debriefing. Then all participants responded to the same two items concerning the true relationship between elevated (lowered) physical height and prosocial behavior, which were combined into an estimated overall postdebriefing relationship between the embodiment of height and prosocial behavior (Cronbach’s α = .81). After the study was over, all participants were thanked and thoroughly debriefed about the aim of the study.

Results

There were no data exclusions, and all manipulations and all measures analyzed in the experiment are reported. Participants’ sex did not affect the main dependent measure and, thus, is not considered further.

Estimated initial beliefs

The manipulation check was successful. Participants’ initial estimated beliefs about the true relationship between the embodiment of height and prosocial behavior differed across experimental conditions, F(2, 155) = 12.78, p < .001, η² = .14. Planned contrasts revealed that participants in the control condition estimated the relationship between the embodiment of height and prosocial behavior to be weaker than did participants in the debriefing condition and participants in the no-debriefing condition, t(155) = 5.02, p < .001, whereas the debriefing condition did not differ from the no-debriefing condition, t(155) = 0.89, p = .375. Means and standard deviations are reported in Table 1.
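As a reading aid, the reported effect sizes can be related to the F statistics. The sketch below assumes a one-way between-subjects design, in which eta squared follows from F and its degrees of freedom as (F × df1) / (F × df1 + df2). The function name is hypothetical, and this is a consistency check on the reported values, not a reconstruction of the original analysis.

```python
def eta_squared_from_f(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover eta squared from a one-way ANOVA F statistic.

    Uses SS_effect / SS_total = (F * df_effect) / (F * df_effect + df_error),
    which holds when there is a single between-subjects factor.
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the text: initial beliefs and postdebriefing beliefs.
initial = eta_squared_from_f(12.78, 2, 155)
postdebriefing = eta_squared_from_f(9.17, 2, 155)
print(round(initial, 2), round(postdebriefing, 2))
```

Under this assumption, both results round to the effect sizes given in the text (.14 and .11).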

Table 1. Means and standard deviations (in parentheses) for estimated initial and postdebriefing beliefs concerning the relationship between the embodiment of height and prosocial behavior as a function of experimental condition

Estimated postdebriefing beliefs

Most important, participants’ postdebriefing estimated beliefs about the true relationship between the embodiment of height and prosocial behavior also differed across experimental conditions, F(2, 155) = 9.17, p < .001, η² = .11 (see Table 1). Planned contrasts revealed that participants in the debriefing condition estimated the relationship between the embodiment of height and prosocial behavior to be stronger than did participants in the control condition, t(155) = 2.03, p = .045. At the same time, the debriefing did affect participants’ postdebriefing beliefs, in that participants in the no-debriefing condition estimated the relationship between the embodiment of height and prosocial behavior to be stronger than did participants in the debriefing condition, t(155) = 2.04, p = .043. That is, participants did not fail to understand or believe the debriefing. They did adjust their beliefs after debriefing, but insufficiently.

Explanation availability

Explanation availability significantly differed across experimental conditions, F(2, 155) = 12.78, p < .001, η² = .14. Planned contrasts revealed that participants in the debriefing condition (M = 0.87, SD = 0.78) and participants in the no-debriefing condition (M = 0.61, SD = 0.73) generated more reasons why the embodiment of height causally affects prosocial behavior than did participants in the control condition (M = 0.02, SD = 1.08), t(155) = 4.92, p < .001, whereas the debriefing condition did not differ from the no-debriefing condition, t(155) = 1.47, p = .144. Explanation availability was significantly correlated with postdebriefing beliefs, r(158) = .50, p < .001.

Mediational analysis

In the following, the hypothesis was tested that explanation availability mediates the effect of the contrast between the debriefing condition and the control condition on postdebriefing beliefs. In fact, when the contrast and explanation availability were entered simultaneously, the regression equation accounted for substantial variance in postdebriefing beliefs, R² = .29, F(2, 98) = 20.10, p < .001. Moreover, explanation availability received a significant regression weight, t(98) = 5.94, β = .55, p < .001, whereas the contrast did not, t(98) = 0.41, β = .04, p = .686. A Sobel test was significant, Z = 3.58, p < .001, indicating that participants in the debriefing (relative to the control) condition clung to their beliefs because they were more likely to generate causal reasons as to why the embodiment of height affects prosocial behavior.
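For readers unfamiliar with the Sobel test, the following sketch shows the standard first-order formula, z = ab / sqrt(b²·SE_a² + a²·SE_b²), where a is the unstandardized effect of the condition contrast on the mediator and b is the effect of the mediator on the outcome with the contrast controlled. The function names are hypothetical, and the numbers in the usage note are illustrative only: the article does not report the unstandardized coefficients and standard errors, so the published Z = 3.58 cannot be reproduced from the text.

```python
import math


def sobel_z(a: float, se_a: float, b: float, se_b: float) -> float:
    """First-order Sobel statistic for the indirect effect a * b."""
    if se_a <= 0 or se_b <= 0:
        raise ValueError("standard errors must be positive")
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)


def two_tailed_p(z: float) -> float:
    """Two-tailed p value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Illustrative inputs only (not taken from the article).
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"z = {z:.2f}, p = {two_tailed_p(z):.4f}")
```

With these made-up inputs, the indirect effect is 0.2 and z is roughly 3.12. The Sobel test assumes a normally distributed product ab, which is one reason bootstrapped confidence intervals are often used instead.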

Discussion

A retraction of a scientific article aims to ensure that readers are alerted to the fact that the original article should not have been published and that its findings are not trustworthy. This mission is not easily accomplished. For one, readers of the original articles may fail to learn about the retraction notice. But even readers who are well aware that an article has been retracted may still accept the reported findings as true. In fact, the present research suggests that participants persevered in their initial beliefs of a scientific finding even after learning that the finding was based on fabricated data. Simply letting readers know that an article has been retracted does not guarantee that the message of the article goes away.

In fact, retracted articles continue to be cited. A study of 235 retracted publications in the biomedical literature over a 30-year period found that the retracted articles received more than 2,000 postretraction citations (Budd, Sievert, & Schultz, 1998). Only about 8% of the citations acknowledged the retraction; the remaining citations implied either explicitly or implicitly that the retracted articles represent valid research. It also made almost no difference in terms of the citations received whether the cause of retraction was an error or misconduct. More recent work (Neale, Dailey, & Abrams, 2010) corroborated that most citing papers (95%) did not indicate that the cited article was retracted. Moreover, this analysis showed that retracted articles did not have fewer citations than a comparison sample.

Mediation analyses showed that causal explanations underlie belief perseverance. The more participants generated explanations as to why the embodiment of height causally affects prosocial behavior, the more they clung to their belief even after the original evidence had been discredited. This finding may also suggest how perseverance could be decreased. In fact, increasing the availability of counterexplanations (i.e., explanations supporting competing relations) has been shown to lead to reduced perseverance (Anderson, 1982) and has been conceptually replicated in other studies (Anderson & Sechler, 1986; Lord, Lepper, & Preston, 1984). Likewise, research has shown that the provision of alternative causal information can eliminate the continued influence effect (Johnson & Seifert, 1994). Thus, a retraction of an original article that includes asking readers to consider reasons why the article’s conclusions are not valid could effectively achieve the result that the readers no longer believe in the findings reported by the original article. The journal’s editors may even suggest to readers possible reasons why a valid study might reach different conclusions than the retracted article (Slusher & Anderson, 1996).

Note that due to time considerations, participants in the experimental conditions did not read the original article but, rather, received a summary of the main findings. Of course, hearing about an article is not the same as actually reading it, so it would be informative for future research to examine the extent to which readers of an actual article still believe in the reported findings after the article was retracted. It would also be interesting to examine in future research whether the findings would be similar if the fraudulent evidence were highly plausible. As was pointed out by a reviewer, the relationship between embodiment of height and prosocial behavior seems counterintuitive. In fact, participants in the control condition who did not learn about the retracted article did not estimate the relationship between the embodiment of height and prosocial behavior to be significantly different from the scale midpoint. In contrast, in a case in which a relationship between two variables is intuitively obvious, the inherent plausibility remains even after the article is removed. Over time, that enduring plausibility might overwhelm the retraction notice, and thus it may be even harder to expunge a fraudulent article that reports plausible evidence from the literature.

It is important to note that psychology students participated in the present study. It would be interesting to examine whether academic psychologists would also cling to their belief in an original article’s findings after the article has been retracted. Nevertheless, most academic psychologists used to be psychology students. Moreover, psychology students do read articles that contain original data, so a retraction should ensure that all readers are successfully alerted to the fact that the article’s findings are not trustworthy. Hopefully, empirical work that has been retracted will not be perceived (and presented to others) as discredited but still likely to be true.