Love and Power: Grau and Pury (2014) as a Case Study in the Challenges of X-Phi Replication

Review of Philosophy and Psychology

Abstract

Grau and Pury (Review of Philosophy and Psychology, 5, 155–168, 2014) reported that people’s views about love are related to their views about reference. This surprising effect was, however, not replicated in Cova et al.’s (in press) replication study. In this article, we show that the replication failure is probably due to the replication’s low power and that a meta-analytic reanalysis of the result in Cova et al. suggests that the effect reported in Grau and Pury is real. We then report a large, highly powered replication that successfully replicates Grau and Pury (2014). This successful replication is a case study in the challenges involved in replicating scientific work, and our article contributes to the discussion of these challenges.

Notes

  1. In this case, the effect may exist in both the original population and the population of the replication, but may not be observed in the replication.

  2. Other aspects of the replication might explain the different results (on the importance of these considerations, see Trafimow and Earp 2016): for instance, the vignettes were translated and the participants were from a different country.

  3. The power of the follow-up study is even lower.

  4. Meta-analyses are often done on many studies, but they can also be done on a few studies, or even on just a couple (Goh et al. 2016).

  5. dO differs from the effect size reported in Grau and Pury 2014 (η² = .04, corresponding to d = .408), which was used in the replication of Grau and Pury both to compute the desired sample size and to assess whether the observed effect size of the replication fell within the confidence interval of the original study.

  6. Unfortunately, preregistration only included the materials used in the experiment, but not the sample size and exclusion criteria.

  7. This difference was introduced in order to maximize power while keeping the sample size manageable.

  8. A third strategy is followed by Camerer et al. (2018).
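
Notes 3 to 5 turn on two ideas: the power of a study to detect an effect of a given size, and the pooling of a small number of studies. The following Python sketch illustrates both under loudly stated assumptions: a simple two-group, between-participants comparison, a normal approximation to the t-test, and a fixed-effect (inverse-variance) pooling rule. The function names `approx_power` and `fixed_effect_meta` and all sample sizes in the usage lines are hypothetical; this is not the G*Power computation or the meta-analysis reported in the article.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    Uses the normal approximation and ignores the (negligible)
    probability of rejecting in the wrong direction.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = d * sqrt(n_per_group / 2)
    return _STD_NORMAL.cdf(noncentrality - z_crit)


def fixed_effect_meta(studies):
    """Pool standardized mean differences with inverse-variance weights.

    `studies` is an iterable of (d, n1, n2) tuples. Returns the pooled d
    and an approximate 95% confidence interval.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for d, n1, n2 in studies:
        # Large-sample approximation to the sampling variance of d.
        var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
        weight = 1 / var_d
        total_weight += weight
        weighted_sum += weight * d
    pooled = weighted_sum / total_weight
    se = sqrt(1 / total_weight)
    z = _STD_NORMAL.inv_cdf(0.975)
    return pooled, (pooled - z * se, pooled + z * se)


if __name__ == "__main__":
    # Hypothetical sample sizes, chosen only to show how power grows with n.
    for n in (40, 95, 300):
        print(n, round(approx_power(0.408, n), 2))
    # Two made-up studies: (d, n in group 1, n in group 2).
    print(fixed_effect_meta([(0.45, 60, 60), (0.20, 45, 45)]))
```

The point of the sketch is qualitative: with a moderate effect, a small replication can easily have less than a 50% chance of reaching significance, while pooling two studies narrows the confidence interval around the combined estimate.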

References

  • Beebe, J.R., and R.J. Undercoffer. 2015. Moral valence and semantic intuitions. Erkenntnis 80: 445–466.

  • Beebe, J.R., and R.J. Undercoffer. 2016. Individual and cross-cultural differences in semantic intuitions: New experimental findings. Journal of Cognition and Culture 16: 322–357.

  • Camerer, C.F., et al. 2018. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour 2: 637–644.

  • Cohen, J. 1992. A power primer. Psychological Bulletin 112: 155–159.

  • Cova, F., et al. in press. Estimating the reproducibility of experimental philosophy. Review of Philosophy and Psychology.

  • Etz, A., and J. Vandekerckhove. 2016. A Bayesian perspective on the reproducibility project: Psychology. PLoS One 11: e0149794.

  • Faul, F., E. Erdfelder, A.-G. Lang, and A. Buchner. 2007. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods 39: 175–191.

  • Feltz, A., and E. Cokely. 2009. Do judgments about free will and responsibility depend on who you are? Personality differences in intuitions about compatibilism and incompatibilism. Consciousness and Cognition 18: 342–350.

  • Feltz, A., and E. Cokely. 2019. Extraversion and compatibilist intuitions: A ten-year retrospective and meta-analyses. Philosophical Psychology 32: 388–403.

  • Goh, J.X., J.A. Hall, and R. Rosenthal. 2016. Mini meta-analysis of your own studies: Some arguments on why and a primer on how. Social and Personality Psychology Compass 10: 535–549.

  • Grau, C., and C.L. Pury. 2014. Attitudes towards reference and replaceability. Review of Philosophy and Psychology 5: 155–168.

  • Hannikainen, I., et al. 2019. For whom does determinism undermine moral responsibility? Surveying the conditions for free will across cultures. Frontiers in Psychology 10: 2428.

  • Knobe, J. 2019. Philosophical intuitions are surprisingly robust across demographic differences. Epistemology & Philosophy of Science 56: 29–36.

  • Knobe, J. n.d. Difference and robustness in the patterns of philosophical intuition across demographic groups.

  • Kraut, R. 1986. Love de re. In Midwest studies in philosophy, vol. X, ed. P.A. French, T.E. Uehling, and H.K. Wettstein, 413–430. Minneapolis: University of Minnesota Press.

  • Kripke, S. 1980. Naming and necessity. Cambridge: Harvard University Press.

  • Machery, E. 2017. Philosophy within its proper bounds. Oxford: Oxford University Press.

  • Machery, E. Forthcoming. What is a replication? Philosophy of Science.

  • Machery, E., R. Mallon, S. Nichols, and S.P. Stich. 2004. Semantics, cross-cultural style. Cognition 92: B1–B12.

  • Machery, E., C. Olivola, and M. De Blanc. 2009. Linguistic and metalinguistic intuitions in the philosophy of language. Analysis 69: 689–694.

  • Machery, E., M. Deutsch, J. Sytsma, R. Mallon, S. Nichols, and S.P. Stich. 2010. Semantic intuitions: Reply to Lam. Cognition 117: 361–366.

  • Machery, E., J. Sytsma, and M. Deutsch. 2015. Speaker’s reference and cross-cultural semantics. In On reference, ed. A. Bianchi, 62–76. Oxford: Oxford University Press.

  • Makel, M.C., J.A. Plucker, and B. Hegarty. 2012. Replications in psychology research: How often do they really occur? Perspectives on Psychological Science 7: 537–542.

  • Nadelhoffer, T., T. Kvaran, and E. Nahmias. 2009. Temperament and intuition: A commentary on Feltz and Cokely. Consciousness and Cognition 18: 351–355.

  • Nagel, J. 2012. Intuitions and experiments: A defense of the case method in epistemology. Philosophy and Phenomenological Research 85: 495–527.

  • Nelson, L.D., J. Simmons, and U. Simonsohn. 2018. Psychology's renaissance. Annual Review of Psychology 69: 511–534.

  • Nozick, R. 1974. Anarchy, state, and utopia. New York: Basic Books.

  • Open Science Collaboration. 2015. Estimating the reproducibility of psychological science. Science 349: aac4716.

  • Schmidt, F.L. 1996. Statistical significance testing and cumulative knowledge in psychology: Implications for training of researchers. Psychological Methods 1: 115–129.

  • Schmidt, F. 2010. Detecting and correcting the lies that data tell. Perspectives on Psychological Science 5: 233–242.

  • Simmons, J.P., L.D. Nelson, and U. Simonsohn. 2011. False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22: 1359–1366.

  • Stich, S.P. 2013. Do different groups have different epistemic intuitions? A reply to Jennifer Nagel. Philosophy and Phenomenological Research 87: 151–178.

  • Stich, S.P., D. Rose, and E. Machery. n.d. Demographic differences in philosophical intuition – a reply to Knobe.

  • Sytsma, J., and E. Machery. 2010. Two conceptions of subjective experience. Philosophical Studies 151: 299–327.

  • Sytsma, J., J. Livengood, R. Sato, and M. Oguchi. 2015. Reference in the land of the rising sun: A cross-cultural study on the reference of proper names. Review of Philosophy and Psychology 6: 213–230.

  • Trafimow, D., and B.D. Earp. 2016. Badly specified theories are not responsible for the replication crisis in social psychology: Comment on Klein. Theory & Psychology 26: 540–548.

  • van Dongen, N., M. Colombo, F. Romero, and J. Sprenger. n.d. Intuitions about the reference of proper names: A meta-analysis.

  • Weinberg, J.M. 2007. How to challenge intuitions empirically without risking skepticism. Midwest Studies in Philosophy 31: 318–343.

Author information

Corresponding author

Correspondence to Edouard Machery.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

In this appendix, we report a series of analyses of the data collected for our paper that extend the analyses done in Grau and Pury (2014).

Type of Replicant: Paralleling Grau and Pury (2014), the effect of type of replicant was tested with a 2 (Love for Replicant: Loved One and Pet versus Acquaintance and Shoes [Unspecified]) × 2 (Humanness of Replicant: Loved One and Acquaintance versus Pet and Shoes) × 2 (Measure of Replaceability: Appropriateness versus Likelihood) repeated measures ANOVA (Table 3). As in Grau and Pury, Love had the strongest effect (equivalent to d = 2.02), with loved targets (marginal mean = 2.93, sd = .06) rated as less replaceable than nonloved targets (marginal mean = 4.48, sd = .05). This was followed by a strong effect for Humanness (equivalent to d = 1.15), again with human targets (marginal mean = 3.34, sd = .06) rated as less replaceable than nonhuman targets (marginal mean = 4.06, sd = .05). These main effects were modified by other, smaller effects and interactions. The same pattern and approximate effect sizes remained when Linguistic Reference was added as a between-participants variable. Thus, as in Grau and Pury, love and humanness strongly predicted replaceability.

Table 3 Within Participants ANOVA results for two Type of Replicant Effects (Loved and Humanness), Measure, and their interactions

Sentimentality: Unlike in Grau and Pury (2014), all participants were asked about the replaceability of both a pair of shoes that were a gift from a friend (High Sentiment) and a favorite pair (Low Sentiment). Thus, we were able to test the effect of sentiment within participants with a 2 (Sentiment: High versus Low) × 2 (Measure of Replaceability: Appropriateness versus Likelihood) repeated measures ANOVA. We found significant main effects of both Sentiment (F(1,714) = 12.00, p = .001, η² = .017, equivalent to d = .26) and Measure (F(1,714) = 45.32, p < .001, η² = .060, equivalent to d = .51) that were modified by a significant interaction of Sentiment and Measure (F(1,714) = 13.37, p = .02, η² = .018, equivalent to d = .27). Participants gave their highest ratings for the appropriateness of feeling the same way about their favorite shoes (m = 4.8, sd = 1.7), next for the appropriateness of feeling the same way about gift shoes (m = 4.6, sd = 1.8), then for the likelihood of feeling the same way about their favorite shoes (m = 4.5, sd = 1.8), and their lowest ratings for the likelihood of feeling the same way about gift shoes (m = 4.4, sd = 1.8). Means for the same type of shoes differed significantly by Measure (minimum t(714) = 4.04, p < .001). While participants rated it as more appropriate to feel the same way about replaced favorite shoes than about replaced gift shoes (t(714) = 4.84, p < .001), there was no significant difference between the likelihood of feeling the same way about the two pairs (t(714) = 1.49, p = .138). These findings parallel the between-participant findings of Grau and Pury, who found an effect of sentiment for objects on appropriateness ratings but not on likelihood ratings.
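
The η² values in the paragraph above can be checked against the reported F statistics. The sketch below assumes that they are partial eta squared values and uses the standard relation η²p = (F × df_effect) / (F × df_effect + df_error); the source does not state how the values were computed, so this is a consistency check under that assumption, and the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported in the text, each with df = (1, 714).
reported_f = {
    "Sentiment": 12.00,
    "Measure": 45.32,
    "Sentiment x Measure": 13.37,
}

for effect, f_value in reported_f.items():
    eta_sq = partial_eta_squared(f_value, df_effect=1, df_error=714)
    print(f"{effect}: partial eta squared = {eta_sq:.3f}")
```

Under this assumption, the three values round to the η² figures given in the text (.017, .060, and .018).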

Sentimentality for pets was tested by correlating the belief that a pet is like a member of the family with the likelihood of feeling the same way about a replaced pet (r = −.14, p < .001, r² = .018, equivalent to d = .27) and with the appropriateness of feeling the same way about that replacement (r = −.12, p = .002, r² = .014, equivalent to d = .24). These effects were smaller than the correlations of −.22 and −.21, respectively, reported in Grau and Pury (2014), but they were still statistically significant and in the same direction.
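
The "equivalent d" figures in this appendix are consistent with a common conversion from a proportion of variance explained (η² or r²) to Cohen's d, namely d = 2 × sqrt(v / (1 − v)), which for a correlation is the same as d = 2r / sqrt(1 − r²). The sketch below applies that conversion to the rounded values reported in the text. It is an assumption that this is the formula the authors used, and the function name `variance_explained_to_d` is hypothetical.

```python
from math import sqrt


def variance_explained_to_d(v):
    """Convert a proportion of variance explained (eta squared or r squared) to Cohen's d.

    Assumes the two-group conversion d = 2 * sqrt(v / (1 - v)).
    """
    if not 0 <= v < 1:
        raise ValueError("proportion of variance must be in [0, 1)")
    return 2 * sqrt(v / (1 - v))


# Rounded variance-explained values as reported in the appendix.
reported = {
    "Sentiment (eta squared)": 0.017,
    "Measure (eta squared)": 0.060,
    "Sentiment x Measure (eta squared)": 0.018,
    "Pet likelihood (r squared)": 0.018,
    "Pet appropriateness (r squared)": 0.014,
}

for label, v in reported.items():
    print(f"{label}: d = {variance_explained_to_d(v):.2f}")
```

Because the inputs are already rounded, the converted values can differ in the last digit from values computed on unrounded statistics.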

Cite this article

Machery, E., Grau, C. & Pury, C.L. Love and Power: Grau and Pury (2014) as a Case Study in the Challenges of X-Phi Replication. Rev.Phil.Psych. 11, 995–1011 (2020). https://doi.org/10.1007/s13164-020-00465-x

  • DOI: https://doi.org/10.1007/s13164-020-00465-x
