One: but not the same


Ordinary judgments about personal identity are complicated by the fact that phrases like “same person” and “different person” have multiple uses in ordinary English. This complication calls into question the significance of recent experimental work on this topic. For example, Tobia (Anal 75: 396–405, 2015) found that judgments of personal identity were significantly affected by whether the moral change described in a vignette was for the better or for the worse, while Strohminger and Nichols (Cogn 131: 159–171, 2014) found that loss of moral conscience had more of an effect on identity judgments than loss of biographical memory. In each case, however, there are grounds for questioning whether the judgments elicited in these experiments engaged a concept of numerical personal identity at all (cf. Berniūnas and Dranseika in Philos Psychol 29: 96–122, 2016; Dranseika in AJOB Neurosci 8: 184–186, 2017; Starmans and Bloom in Trends Cogn Sci 22: 566–568, 2018). In two pre-registered studies we validate this criticism while also showing a way to address it: instead of attempting to engage the concept of numerical identity through specialized language or the terms of an imaginary philosophical debate, we should consider how the identity of a person is described through the connected use of proper names, definite descriptions, and the personal pronouns “I”, “you”, “he”, and “she”. When the experiments above are revisited in this way, there is no evidence that the differences in question had an effect on ordinary identity judgments.

  1. We will return at the end of this paper to consider whether qualitative identity or difference is the only concept that can be expressed by the phrase “same person” in a statement like (1).

  2. To view the pre-registration visit

  3. We realized only after completing the study that there was an error in our wording of the statement in (Ci): while the vignette begins by saying that Phineas grew up in Brooklyn, this statement says instead that he was born there. A post hoc analysis revealed that overall agreement with the statement in (Ci), M = 2.21, SD = 1.53, was significantly lower than overall agreement with the statement in (Cii), M = 1.61, SD = 0.94: t-test with df = 267, p < .001. However, this difference is immaterial to our argument since, as we explain just below, there was no effect of direction of change on agreement with either of the two statements, and it was this effect that all of our predictions concerned. A similar point applies to prompt (B) in our Study 2.

  4. Notably, this exclusion did not affect our results: a post hoc analysis including all participants who finished the survey found a significant effect of direction of change on ratings of (A), F(1,289) = 12.2, p < .0001, but no significant effect of condition on ratings of the other prompts: for (B), χ²(3, N = 291) = 1.052, p = .789; for (Ci), F(1,289) = 0.216, p = .643; for (Cii), F(1,289) = 0.014, p = .903.

  5. F(1,266) = 10.31, p = .002, d = .39. For comparison, in the corresponding experiment in Tobia (2015), the mean response on an identical 7-point scale (1 = Strongly Agree With Art; 7 = Strongly Agree with Bart) was 3.26 (SD = 1.91) in the IMPROVEMENT condition, and 2.61 (SD = 1.67) in the DETERIORATION condition.

  6. Prompt (B): 93.1% of participants answered “1” in the condition of moral IMPROVEMENT and 92.0% answered “1” in the condition of moral DETERIORATION: χ²(1, N = 268) = 0.016, p = .898.

  7. Prompt (Ci): in the condition of moral IMPROVEMENT, M = 2.29, SD = 1.52; in the condition of moral DETERIORATION, M = 2.14, SD = 1.55: F(1,266) = 0.653, p = .42.

  8. Prompt (Cii): in the condition of moral IMPROVEMENT, M = 1.58, SD = 0.99; in the condition of moral DETERIORATION, M = 1.64, SD = 0.89: F(1,266) = 0.293, p = .589.

  9. Once again, this exclusion did not affect our results: a post hoc analysis including all participants who finished the survey found a significant main effect of direction of change on responses to (A), F(2,180) = 10.45, p < .0001, but no significant effect of condition on responses to the other prompts: for (B), F(2,180) = 0.811, p = .446; for (C), χ²(6, N = 183) = 4.140, p = .658.

  10. F(2,168) = 10.7, p < .001. For comparison, in the corresponding experiment in Strohminger and Nichols (2014), the mean responses on an identical 7-point scale (1 = Strongly Agree; 7 = Strongly Disagree) were 2.34 (SD = 1.43) in the UNCHANGED condition, 3.68 (SD = 1.72) in the MEMORY condition, and 4.77 (SD = 2.03) in the MORALITY condition. We thank Nina Strohminger for providing us with these statistics.

  11. F(1,114) = 21.68, p < .001, d = .86.

  12. F(1,110) = 10.01, p = .002, d = .60.

  13. F(1,112) = 2.414, p = .123, d = .29.

  14. Since our primary interest was not in the pattern of responses to question (A), we were not troubled by this lack of a statistically significant difference. We pause to emphasize, however, that failing to obtain a p-value of less than .05 for a previously observed effect does not amount to a non-replication of the earlier finding. While our observed effect size of d = .29 for the comparison between the MORALITY and MEMORY conditions was smaller than the effect size of d = .58 observed by Strohminger and Nichols, it was still not negligible, and there is no reason to see one or the other of these as the “true” size of the effect in question. (We thank Nina Strohminger for providing us with this last statistic.) Once there are further replications of these experiments, researchers should examine how many samples’ confidence intervals fall outside the range of the population effect size, rather than how many p-values are below a given threshold of statistical significance (Cumming, 2014).

  15. Statement (C): 86.0% of participants answered “1” in the UNCHANGED condition, 93.2% answered “1” in the MORALITY condition, and 90.9% answered “1” in the MEMORY condition: χ²(2, N = 171) = 1.77, p = .413.

  16. Statement (B): in the UNCHANGED condition, M = 1.23, SD = 0.80; in the MORALITY condition, M = 1.14, SD = 0.55; in the MEMORY condition, M = 1.22, SD = 0.69: F(2,168) = 0.282, p = .754.
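
The effect sizes reported in notes 5 and 11–13 can be related to their F statistics with a short calculation. What follows is a minimal sketch, not the analysis code used for these studies. It assumes two independent groups of equal size (the per-condition sample sizes are not stated in these notes), and the function names are hypothetical. Under those assumptions d ≈ 2√F / √df, and a large-sample normal approximation gives a rough 95% confidence interval of the kind recommended in note 14.

```python
import math


def cohens_d_from_f(f_value: float, df_error: int) -> float:
    """Approximate Cohen's d from a two-group F(1, df_error) statistic.

    Assumption: two independent groups of (roughly) equal size, so that
    t = sqrt(F) and d is approximately 2 * t / sqrt(df_error).
    """
    t_value = math.sqrt(f_value)
    return 2.0 * t_value / math.sqrt(df_error)


def approx_ci_for_d(d: float, n1: int, n2: int, z: float = 1.96):
    """Rough large-sample confidence interval for Cohen's d.

    Uses the common normal approximation to the standard error of d;
    z = 1.96 corresponds to a 95% interval.
    """
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2)))
    return d - z * se, d + z * se


if __name__ == "__main__":
    # (F, error df) pairs as reported in notes 5, 11, 12, and 13.
    reported = {
        "note 5": (10.31, 266),
        "note 11": (21.68, 114),
        "note 12": (10.01, 110),
        "note 13": (2.414, 112),
    }
    for label, (f_value, df_error) in reported.items():
        d = cohens_d_from_f(f_value, df_error)
        # Assumption: total N = df_error + 2, split evenly across two groups.
        n_per_group = (df_error + 2) // 2
        low, high = approx_ci_for_d(d, n_per_group, n_per_group)
        print(f"{label}: d = {d:.2f}, approx. 95% CI [{low:.2f}, {high:.2f}]")
```

Under the equal-groups assumption, the approximate interval for the comparison in note 13 contains both zero and the d = .58 reported by Strohminger and Nichols, which is in keeping with the caution expressed in note 14.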


We are grateful especially to Josh Knobe, as well as to Randy Clarke, Shaun Nichols, David Rose, Nina Strohminger, and two referees for this journal, for valuable feedback and discussion. JS’s research has been supported by an Academic Cross-Training Fellowship from the John Templeton Foundation, and compensation for experimental participants was provided by the Tufts University Center for Cognitive Studies. Author contributions, according to the CRediT taxonomy, were distributed as follows. Conceptualization: EL, JS, MT; Data curation: JS; Formal analysis: NB, JS; Funding acquisition: EL, JS; Investigation: NB, JS; Methodology: NB, EL, JS, MT; Project administration: JS; Visualization: NB, JS; Writing -- original draft: JS; Writing -- review & editing: NB, EL, JS, MT.

