Literature is plagued with characters haunted by remorse, constantly ruminating on their immoral actions, from Shakespeare’s Macbeth, to Conrad’s Lord Jim, to Dostoevsky’s Raskolnikov in Crime and Punishment. These characters exemplify an all-too-human psychological experience whereby memories of unethical behaviors remain vivid, difficult to forget, ruminated upon, and frequently retrieved. Although scant, extant scientific evidence supports this observation. For instance, Evans, Ehlers, Mezey, and Clark (2007) interviewed 105 convicted criminals, half of whom suffered from repetitive intrusive memories of their crimes, and found that such memories tended to be very vivid, clear, and rich in sensory and cognitive content, with levels similar to those reported by assault and trauma victims (Ehlers, Hackmann, & Michael, 2004; Ehlers et al., 2002). More recently, Woodworth et al. (2009) asked 50 convicted murderers to reminisce about either their homicide or a past positive event, and found that memories for their criminal actions were more vivid and had more sensory components than their memories of positive events, as measured both by self-reported ratings and external coders.

These prior studies have explored differences in the phenomenological experience of remembering criminal versus noncriminal events. Less is known about the related question of whether memories for immoral actions are also more accurate than memories for morally permissible events. However, research on accuracy for emotional experiences indirectly suggests that memories for immoral actions, which are typically negative and emotionally charged (Escobedo & Adolphs, 2010; Stanley, Henne, Iyengar, Sinnott-Armstrong, & De Brigard, 2017), might be more accurately remembered. A wealth of evidence has demonstrated that experiences evoking strong emotions are more likely to be accurately remembered than experiences that do not (Bradley, Greenwald, Petry, & Lang, 1992; Cahill & McGaugh, 1995; Kensinger, 2004). Importantly, this emotional memory enhancement effect is more pronounced for negative emotional events relative to those with positive or neutral valence (Kensinger, 2007, 2009; Kensinger & Schacter, 2006). Because unethical actions tend to elicit negative feelings (Escobedo & Adolphs, 2010; Stanley et al., 2017), it is likely that the memory accuracy advantage for negative as opposed to neutral or positive experiences carries over when such negative emotions are linked to unethical actions. Given these results, one would expect unethical actions to be better remembered than ethically permissible actions, in terms of both phenomenology and accuracy.

However, a recent article by Kouchaki and Gino (2016) challenges this expectation. The authors report results from nine studies suggesting that people develop “unethical amnesia”: the process by which memories of unethical misdeeds become—according to the authors—“less clear, less detailed, and less vivid” over time, leading people to remember unethical actions “less well” compared with ethical ones. Unfortunately, this terminology seems to confuse two related but distinct notions: one having to do with the accuracy of the memory and another having to do with the experience, or phenomenology, of remembering. But, by definition, amnesia involves at least some memory loss, meaning that accuracy must be impaired to some extent (Cohen & Eichenbaum, 1993). In memory research, ample evidence has demonstrated that higher ratings in phenomenological characteristics, such as vividness and clarity, need not translate into higher accuracy (Rubin, Schrauf, & Greenberg, 2003; Otgaar, Scoboria, & Mazzoni, 2014). Dissociations between the accuracy and phenomenological characteristics of retrieved memories have been well documented in a variety of domains, including false memories (Payne, Neuschatz, Lampinen, & Lynn, 1997; Roediger & McDermott, 1995), flashbulb and traumatic memories (Talarico & Rubin, 2007), and recognition memory (Craik, Rose, & Gopie, 2015; Voss, Baym, & Paller, 2008; Voss & Paller, 2009). This evidence overwhelmingly suggests that people can vividly remember events that did not happen, just as they can dimly remember events that did.

As such, it is worth wondering whether the effect uncovered by Kouchaki and Gino (2016) pertains to the phenomenology of remembering unethical deeds, the accuracy of those memories, or both. Prima facie, their suggestion seems to be that unethical amnesia affects both. Indeed, they suggest that it does by including one study (Study 5) using memory accuracy as a dependent variable. In their other eight studies, the authors employed as dependent variables only phenomenological measures of participants’ recollections (i.e., modified versions of the Memory Characteristics Questionnaire, Johnson, Foley, Suengas, & Raye, 1988; and the Autobiographical Memory Questionnaire; Rubin et al., 2003). For their one particular study employing an objective memory accuracy measure, Kouchaki and Gino (2016) asked participants (N = 88) to read vignettes purportedly depicting either ethical or unethical behaviors. The vignettes did not describe an event that participants had necessarily personally experienced; rather, they were asked to imagine experiencing the event described in the vignette (i.e., cheating or not cheating on a test; see Appendix A). One week later, participants were presented with a recognition memory test that included 18 statements pertaining to information in the vignettes. The authors found a significant small-to-medium size effect (p = .049, Cohen’s d = 0.43), whereby participants who read the ethical vignette remembered, on average, almost one more statement (M = 15.23) than those who read the unethical vignette (M = 14.37).

To further investigate the extent to which Kouchaki and Gino’s (2016) reported “unethical amnesia” effect pertains to memory accuracy for imagined unethical versus ethical actions, we conducted three studies. Study 1 investigates the effect of condition (imagined unethical versus ethical action) on memory accuracy by directly replicating Kouchaki and Gino’s (2016) fifth study. Study 2 includes several additional variables to help increase our chances of finding the target effect from Kouchaki and Gino’s (2016) fifth study. Study 3 utilizes a vignette depicting a different type of moral violation to ensure that any accuracy effect is not specific to the particular vignette employed by Kouchaki and Gino (2016).

Study 1

Materials and method

Participants

A total of 290 individuals were recruited to participate in this study through Amazon Mechanical Turk (AMT) and completed the first session. Participant recruitment was restricted to fluent English speakers from the United States with a prior approval rating above 85%. Two hundred and twenty-eight individuals (78.6% of those from the first session) returned for the second session one week later (Mage = 35.92 years, SD = 10.39, age range: 20–69, 107 females, 121 males; see Footnote 1). Following the recommendation from Simonsohn (2015), our sample size was selected to ensure that we would have 2.5 times as many participants as the original study conducted by Kouchaki and Gino (2016). Assuming an alpha level of .05, with 228 participants we have statistical power at the recommended .80 level to detect even a small effect (Cohen’s d = .30) of condition (ethical vs. unethical) on the number of items correctly remembered using a two-sided independent-samples t test, even with an attrition rate of 25%. All studies reported here were approved by the Duke University Campus Institutional Review Board.
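The sample-size arithmetic in this paragraph can be checked with a short sketch. The variable names are illustrative only; the figures are the ones reported in the text (N = 88 in the original Study 5, 290 recruited and 228 returning in Study 1). The sketch checks the 2.5-times rule and the retention rate; it does not attempt a power calculation.

```python
import math

ORIGINAL_N = 88    # participants in Kouchaki and Gino's (2016) Study 5
MULTIPLIER = 2.5   # ratio recommended by Simonsohn (2015)
RECRUITED = 290    # completed the first session of Study 1
RETURNED = 228     # completed the second session of Study 1

# Smallest whole number of participants that satisfies the 2.5x rule.
target_n = math.ceil(MULTIPLIER * ORIGINAL_N)

# Share of first-session participants who came back one week later.
retention = RETURNED / RECRUITED

print(f"target sample size: {target_n}")
print(f"retention: {100 * retention:.1f}%")
print(f"target met: {RETURNED >= target_n}")
```

The returning sample of 228 exceeds the target of 2.5 times the original sample size.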

Materials

The ethical and unethical vignettes used in this study were identical to those used by Kouchaki and Gino (2016), and the 18 items used in the recognition memory test were also identical to those used by Kouchaki and Gino. The vignettes describe a situation that involves being tempted to cheat on a chemistry exam. The items in the recognition memory test consisted of nine true and nine false statements that asked about details common to both ethical and unethical vignettes. All materials are provided in Appendix A.

Procedure

In the first session, participants were randomly assigned to read either an ethical or unethical version of the cheating vignette. Before reading the vignette, participants received the following instruction: “Please read the short story on the next page. While reading, please take a first-person perspective and put yourself in the position of the main character.” One week later, after receiving an e-mail reminder, participants completed the second part of the study. In this second session, participants were asked to answer 18 true or false questions about the vignettes from the first session; the order of the statements was randomized. The procedure we employed was identical to the procedure from Kouchaki and Gino (2016).

Results and discussion

The purpose of Study 1 was to attempt to replicate Kouchaki and Gino’s (2016) fifth study, which, as mentioned, was their only study indexing objective memory accuracy. We found no statistically significant difference in the number of statements correctly identified between participants who read the vignette depicting the ethical behavior (n = 108, M = 12.52, SD = 2.53) versus the unethical cheating behavior (n = 120, M = 13.15, SD = 2.52), t(226) = 1.89, p = .06, 95% CI [−.03, 1.29], Cohen’s d = .26 (see Fig. 1; see also Footnote 2). Although it did not reach significance, this pattern of results was in the opposite direction to the one reported by Kouchaki and Gino (2016). Participants who read the vignette describing ethical behavior performed somewhat worse on the recognition memory test than participants who read the vignette describing unethical behavior. These results fail to replicate the effect reported by Kouchaki and Gino, and they cast doubt upon their claim, based solely on their fifth study, that memory accuracy is impaired for imagined unethical relative to ethical actions.
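For readers who wish to check the reported statistics, the following is a minimal sketch of an equal-variance independent-samples t test computed from group summaries. The function name and dictionary keys are hypothetical, and the sketch is not the analysis script used for the study. As a stated assumption, the p value and confidence interval use a normal approximation to the t distribution, which is close for samples of this size.

```python
import math
from statistics import NormalDist


def pooled_t_from_summary(n1, m1, sd1, n2, m2, sd2):
    """Equal-variance t test from summary statistics (group 2 minus group 1)."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    se = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    diff = m2 - m1
    t = diff / se
    d = diff / pooled_sd  # Cohen's d using the pooled standard deviation
    # Assumption: normal approximation instead of the exact t distribution.
    z = NormalDist()
    p_approx = 2 * (1 - z.cdf(abs(t)))
    half_width = z.inv_cdf(0.975) * se
    return {
        "df": df,
        "t": t,
        "d": d,
        "p_approx": p_approx,
        "ci_approx": (diff - half_width, diff + half_width),
    }


# Study 1 summaries as reported above: ethical group first, then unethical.
study1 = pooled_t_from_summary(108, 12.52, 2.53, 120, 13.15, 2.52)
print(study1)
```

Run on the Study 1 summaries, this gives t of about 1.88 and d of about 0.25 on 226 degrees of freedom, close to the reported values. The small discrepancies are expected because the inputs are rounded means and standard deviations.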

Fig. 1. Means and standard errors for the number of items correctly remembered in the ethical and unethical conditions of all three studies. Error bars indicate SEM.

Study 2

There are several reasons why Study 1 may have failed to replicate the results of Kouchaki and Gino (2016). For instance, the ability to imagine the events described in the vignettes may be a necessary precondition for identifying a difference in memory accuracy between ethical and unethical behaviors. If enough participants have difficulty simulating the events described in the vignettes, then we may not truly be indexing differences in memory accuracy for imagined ethical and unethical events. Furthermore, if enough participants do not believe that the cheating behavior described in the vignettes is unethical, then we may not actually be characterizing memory accuracy between ethical and unethical actions. To extend our first study and to further investigate the possibility of a memory accuracy effect for unethical relative to ethical events, we conducted a second replication. This time, we included two additional variables as covariates to increase our chances of finding an effect of condition on memory accuracy. Specifically, participants were asked to rate their ability to simulate the events in the vignettes as well as the moral wrongness of cheating more generally.

Materials and method

Participants

As in Study 1, 290 individuals were recruited to participate in this study through AMT and completed the first session. All individuals who had participated in Study 1 were prevented from participating in Study 2. Participant recruitment was restricted to fluent English speakers from the United States with a prior approval rating above 85%. Two hundred and thirty-two individuals (80.0% of those from the first session) returned for the second session one week later (Mage = 37.62 years, SD = 11.54, age range: 19–71, 90 females, 138 males). As before, assuming an alpha level of .05, we have the statistical power at the recommended .80 level to detect even a small effect (Cohen’s d = .30) of condition on the number of items correctly remembered using a two-sided independent-samples t test, allowing for an attrition rate of 25%.

Materials

The two vignettes and the 18 items for the recognition memory test used in Study 2 were identical to those in our Study 1 and to those used by Kouchaki and Gino (2016). All materials are provided in Appendix A.

Procedure

Procedures for Study 2 were identical to those of Study 1, except for the addition of two new items. In the first session, after participants read either an ethical or unethical version of the cheating vignette, they also rated on a 5-point scale their ability to imagine the events in the vignettes (1 = very difficult to imagine; 5 = very easy to imagine). Then, at the end of the second session, participants also rated on a 5-point scale how morally wrong it is to cheat on an exam (1 = not at all morally wrong; 5 = extremely morally wrong).

Results and discussion

The purpose of Study 2 was to attempt to replicate Kouchaki and Gino’s (2016) fifth study once more, and also to eliminate two potential problems that could have reduced our chances of finding an effect of condition on memory accuracy in Study 1. However, we did not find a significant difference in the number of statements correctly identified between participants who read the vignette depicting the ethical behavior (n = 119, M = 13.24, SD = 2.38) versus the unethical cheating behavior (n = 113, M = 13.15, SD = 2.45), t(230) = 0.27, p > .78, 95% CI [−.71, .54], Cohen’s d = .04 (see Fig. 1).

Next, we examined memory accuracy as a function of our two additional items relating to ease of imagining and moral wrongness. Most participants thought that the events in the vignettes were easy to imagine and that cheating is morally wrong, as evidenced by the distributions of both ratings being severely negatively skewed. However, it remains possible that an effect of unethical amnesia could be found only for participants who found the vignette easy to imagine, or who believed cheating to be unethical. Isolating the subset of participants who thought that the events in the vignettes were easy to imagine (ratings of 4 and 5 on the 5-pt scale) yielded no significant difference in memory accuracy between participants who read the ethical vignette (n = 106, M = 13.42, SD = 2.32) versus the unethical vignette (n = 99, M = 13.25, SD = 2.41), t(203) = 0.49, p > .62, 95% CI [−.49, .81], Cohen’s d = .07. Similarly, isolating the subset of participants who judged cheating on exams to be morally wrong (ratings of 4 and 5 on the 5-pt scale) yielded no significant difference in the number of statements correctly identified between participants who read the ethical vignette (n = 72, M = 12.86, SD = 2.55) versus the unethical vignette (n = 61, M = 13.38, SD = 2.22), t(131) = 1.23, p > .21, 95% CI [−.31, 1.34], Cohen’s d = .22. Lastly, we isolated the subset of participants who both thought that the events in the vignettes were easy to imagine and judged cheating on exams to be morally wrong (ratings of 4 and 5 on the 5-pt scale). As before, this did not yield a significant difference in the number of statements correctly identified between participants who read the ethical vignette (n = 65, M = 13.02, SD = 2.47) versus the unethical vignette (n = 53, M = 13.40, SD = 2.08), t(116) = 0.90, p > .37, 95% CI [−.46, 1.22], Cohen’s d = .17. Note that no matter how the subset of participants was selected for these follow-up analyses, there was still a greater number of participants in each of our follow-up analyses relative to the number of participants in Kouchaki and Gino’s (2016) entire fifth study.
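The subset selection used in these follow-up analyses can be sketched as follows. The record type, field names, and the six example records are hypothetical and serve only to illustrate the filtering step; they are not data from the study.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Participant:
    condition: str   # "ethical" or "unethical"
    correct: int     # recognition items answered correctly (out of 18)
    ease: int        # ease-of-imagining rating, 1-5
    wrongness: int   # moral wrongness rating, 1-5


def subset_means(participants, min_ease=1, min_wrongness=1):
    """Return (n, mean correct) per condition for participants meeting both cutoffs."""
    kept = [
        p for p in participants
        if p.ease >= min_ease and p.wrongness >= min_wrongness
    ]
    summary = {}
    for condition in ("ethical", "unethical"):
        scores = [p.correct for p in kept if p.condition == condition]
        summary[condition] = (len(scores), mean(scores) if scores else None)
    return summary


# Hypothetical records for illustration only; not data from the study.
sample = [
    Participant("ethical", 14, 5, 5),
    Participant("ethical", 12, 2, 4),
    Participant("ethical", 13, 4, 3),
    Participant("unethical", 15, 4, 5),
    Participant("unethical", 11, 5, 2),
    Participant("unethical", 13, 3, 4),
]

easy = subset_means(sample, min_ease=4)                    # ratings of 4 and 5
wrong = subset_means(sample, min_wrongness=4)              # ratings of 4 and 5
both = subset_means(sample, min_ease=4, min_wrongness=4)   # both cutoffs
print(easy, wrong, both)
```

Each subset would then be submitted to the same independent-samples comparison as the full sample, with degrees of freedom equal to the two subset sizes summed minus two.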

Study 3

Both Studies 1 and 2 cast doubt on the claim that memory accuracy is impaired for imagined unethical actions relative to ethical actions. To ensure that these findings are not specific to just one vignette describing one type of moral violation, the procedure of Study 3 is identical to the one from Study 2, but Study 3 uses a new vignette describing a different moral violation.

Materials and method

Participants

As in our prior two studies, 290 individuals were recruited to participate in this study through AMT and completed the first session. All individuals who had participated in Studies 1 and 2 were automatically prevented from participating in Study 3. Participant recruitment was restricted to fluent English speakers from the United States with a prior approval rating above 85%. Two hundred and twenty-eight individuals (78.62% of those from the first session) returned for the second session one week later (Mage = 37.64 years, SD = 11.15, age range: 20–68, 119 females, 104 males). As in the prior two studies, assuming an alpha level of .05, we have the statistical power at the recommended .80 level to detect even a small effect (Cohen’s d = .30) of condition on the number of items correctly remembered using a two-sided independent-samples t test, allowing for an attrition rate of 25%.

Materials

The ethical and unethical vignettes used in Study 3 and the 18 items used in the recognition memory test are provided in Appendix B. This time, the first-person vignette describes a driver who accidentally backs into a parked car in a parking lot. In the ethical condition, the driver leaves a note on the windshield of the damaged car with contact information, whereas in the unethical condition, the driver drives away without leaving a note. These vignettes were written to be similar in tone, structure, and length to the original cheating vignettes used by Kouchaki and Gino (2016). Mirroring Studies 1 and 2, the items in the recognition memory test consisted of nine true and nine false statements that asked about details common to both ethical and unethical vignettes.

Procedure

The procedure in Study 3 is the same as the procedure in Study 2, with the difference between the two studies being the content of the vignettes and the recognition memory test items.

Results and discussion

The purpose of Study 3 was to attempt to conceptually replicate our findings from Study 2 using a different type of moral violation. We did not find a significant difference in the number of statements correctly identified between participants who read the vignette depicting the ethical behavior (n = 117, M = 12.68, SD = 2.66) versus the unethical behavior (n = 111, M = 13.11, SD = 2.38), t(226) = 1.27, p > .20, 95% CI [−.24, 1.08], Cohen’s d = .17 (see Fig. 1).

As in Study 2, we examined potential differences in memory accuracy within subsets of our sample by isolating participants who (1) thought the events in the vignettes were easy to imagine (ratings of 4 and 5 on the 5-pt scale), (2) believed it is morally wrong to damage another car and drive off without leaving a note (ratings of 4 and 5 on the 5-pt scale), and (3) found the vignettes easy to imagine and believed the scenario described is morally wrong. As in Study 2, regardless of condition (ethical vs. unethical), most participants thought (1) that the events in the vignettes were easy to imagine and (2) that it is morally wrong to damage another car and just drive off without leaving any contact information.

Isolating the subset of participants who thought that the events in the vignettes were easy to imagine (ratings of 4 and 5 on the 5-pt scale), there was still no significant difference in the number of statements correctly identified between participants who read the vignette depicting the ethical behavior (n = 97, M = 13.05, SD = 2.57) versus the unethical behavior (n = 100, M = 13.19, SD = 2.45), t(195) = .39, p > .69, 95% CI [−.57, .85], Cohen’s d = .06. Furthermore, isolating the subset of participants who thought that damaging another car and driving off without leaving any contact information is morally wrong (ratings of 4 and 5 on the 5-pt scale), there was still no significant difference in the number of statements correctly identified between participants who read the vignette depicting the ethical behavior (n = 98, M = 12.80, SD = 2.66) versus the unethical behavior (n = 94, M = 13.29, SD = 2.33), t(190) = 1.36, p > .17, 95% CI [−.22, 1.20], Cohen’s d = .20. Finally, isolating the subset of participants who both thought that the events in the vignettes were easy to imagine and thought that damaging another car and driving off without leaving any contact information is morally wrong (ratings of 4 and 5 on the 5-pt scale), there was still no significant difference in the number of statements correctly identified between participants who read the vignette depicting the ethical behavior (n = 82, M = 13.10, SD = 2.62) versus the unethical behavior (n = 89, M = 13.30, SD = 2.37), t(169) = .54, p > .58, 95% CI [−.55, .96], Cohen’s d = .08. Note that no matter how the subset of participants was selected for these follow-up analyses, there was still a greater number of participants in each of our follow-up analyses relative to the number of participants in Kouchaki and Gino’s (2016) entire fifth study.

General discussion

In their thought-provoking paper, Kouchaki and Gino (2016) reported results from nine studies suggesting, according to the authors, that people develop “unethical amnesia,” which they characterize as “impaired,” “worse,” or “obfuscated” memory for unethical relative to ethical actions. However, their description of the results blurs a critical distinction between the phenomenology of our recollective experience and the accuracy of the retrieved memorial content. As mentioned, a wealth of evidence from several lines of research demonstrates that lower ratings in phenomenological characteristics (e.g., vivacity) do not necessarily result in lower memory accuracy, and vice versa (e.g., Otgaar et al., 2014; Roediger & McDermott, 1995; Talarico & Rubin, 2007; Voss et al., 2008). By definition, however, amnesia involves memory loss: A failure to accurately remember at least some information of past events is a necessary condition for identifying a case as one of amnesia. But only one of the nine studies reported by Kouchaki and Gino (2016) provides an objective measure of memory accuracy that could be used to identify a possible amnesia effect. We have reported the results of three studies aimed at replicating and extending the only study conducted by Kouchaki and Gino (2016) that used a memory accuracy measure, as opposed to self-reported measures of phenomenology.

We were unable to directly or conceptually replicate Kouchaki and Gino’s (2016) memory accuracy effect. Participants in our studies did not show a memory disadvantage for details of the unethical relative to the ethical vignette, even though the sample size in each of our studies was more than 2.5 times as large as the sample size from Kouchaki and Gino (2016; see Footnote 3). In fact, in our first study (the attempted direct replication of Kouchaki and Gino’s, 2016, fifth study), the pattern of results was in the opposite direction to the one reported by Kouchaki and Gino (2016). Although the results were not statistically significant (p = .06), participants who read the ethical vignette performed somewhat worse on the recognition memory test than those who read the unethical vignette. In our other two studies, however, memory performance did not differ as a function of whether participants were assigned to the ethical or unethical condition. In Kouchaki and Gino’s (2016) only study that measured objective memory performance on the recognition test, it was unclear whether participants could reasonably simulate the events described in the vignettes or whether participants thought that cheating on an exam was unethical. Even after we investigated whether their finding depended on participants’ ability to simulate the events or the perceived moral wrongness of the behaviors described, there was still no memory accuracy advantage for ethical relative to unethical actions.

It is also important to stress that the nine studies reported by Kouchaki and Gino (2016) investigated two different kinds of memories: those that involved imagined events and those that involved actually experienced events. Their one study that obtained a measure of memory accuracy only indexed recognition performance for imagined events. Our findings cast doubt on an unethical amnesia effect for imagined events specifically, so the lack of an unethical amnesia effect in our results does not necessarily generalize to actually experienced events. However, converging results from other lines of research cast doubt on the possibility that there is unethical amnesia involving the accuracy of remembered events that were actually experienced (i.e., not merely imagined). For instance, emotional memory enhancement effects show that people are better at accurately remembering details from negative emotional events than positive or neutral ones (Kensinger, 2007, 2009; Kensinger & Schacter, 2006). Given that memories of unethical actions tend to be both emotionally charged and more negative than memories of ethical actions, one might reasonably predict that memories for details of previously experienced unethical actions are more accurate than memories for details of previously experienced ethical actions. Taking our reported results and existing research on emotional memory enhancement together, we do not believe that there is, at present, compelling evidence for an “amnesia” effect, which is a serious cognitive impairment that renders individuals incapable of retrieving many (if any at all) details of past experiences.

Despite considerable evidence that negative, charged, and personally significant events tend to be better remembered than positive, neutral, or nonsignificant ones (Baumeister, Bratslavsky, Finkenauer, & Vohs, 2001), there are still some documented memory biases that could, in principle, reduce retrieval accuracy for some unethical actions relative to ethical ones. For example, research on self-enhancement and self-protection motivations suggests that people tend to remember certain kinds of positive information about themselves better than negative information about themselves (Alicke & Sedikides, 2009; Baumeister et al., 2001; Sedikides & Green, 2000). Relatedly, for memories of lying to or emotionally harming others, people judge their own past behaviors as less morally wrong and less negative than those in which other people lied to or emotionally harmed them (Stanley et al., 2017). The mechanisms responsible for motivated forgetting may, under some circumstances, enable people to forget undesirable details of past actions (Anderson & Hanslmayr, 2014). And people tend to forget the unethical actions of third parties when they benefit from those actions (Bell, Schain, & Echterhoff, 2014; Reczek, Irwin, Zane, & Ehrich, 2017). It is possible, therefore, that our memory for some moral violations could tap into these self-serving biases and thus reduce the accuracy of recollections for certain details. Further research is needed to explore the extent to which memory biases influence how accurately personal immoral behaviors are recalled.

We believe that our failed replications cast doubt on the reality of the effect uncovered by the only study Kouchaki and Gino (2016) reported with an accuracy measure. But what can we say about the phenomenology of ethical relative to unethical remembered events? Contrary to Kouchaki and Gino’s (2016) postulated “unethical amnesia” for phenomenological characteristics, many prior studies actually suggest that committing serious moral transgressions elicits vivid, detailed, and highly emotional unwanted memories, tantamount to those elicited by serious traumatic events (Cima & van Oorsouw, 2013; Ehlers et al., 2004; Evans et al., 2007; Scott, 2012; Woodworth et al., 2009). Given these prior findings, perhaps the phenomenological component of the purported unethical amnesia effect reported by Kouchaki and Gino (2016) is strictly confined to less serious moral violations, or perhaps only to cheating scenarios, which is the only type of unethical action they tested. While our data do not directly speak to the phenomenology effects reported by Kouchaki and Gino (2016), we do believe that there is a need for further research on both the phenomenology and accuracy of remembered (im)moral actions and decisions.