The phenomenon of regression toward the mean is notoriously liable to be overlooked or misunderstood; regression fallacies are easy to commit. But even when regression phenomena are duly recognized, it remains perplexing how they can feature in explanations. This article develops a philosophical account of regression explanations as “statistically autonomous” explanations that cannot be deepened by adducing details about causal histories, even if the explananda as such are embedded in the causal structure of the world. That regression explanations have statistical autonomy was first suggested by Ian Hacking and has recently been defended and elaborated by André Ariew, Yasha Rohwer, and Collin Rice. However, I will argue that these analyses fail to capture what regression’s statistical autonomy consists in and how it sets regression explanations apart from other kinds of explanation. The alternative account I develop also shows what is amiss with a recent denial of regression’s statistical autonomy. Marc Lange has argued that facts that can be explained as regression phenomena can in principle also be explained by citing a conjunction of causal histories. The account of regression explanation developed here shows that his argument is based on a misunderstanding of the nature of statistical autonomy.
Note that this is not to say that the instructor must have been wrong about the effects of punishment and praise on fighter pilots. Perhaps his feedback did have the hypothesized effect on performance over and above the effect of regression toward the mean. The point is that his observations provide no evidence for this.
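The point about evidence can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the equal split between stable skill and luck, and the decile cutoffs are all hypothetical choices made for illustration, not anything taken from the instructor's data. No feedback effect is modeled at all, yet the pilots with the worst first flights improve on average and those with the best first flights decline.

```python
import random
import statistics


def simulate_pilots(n_pilots=10_000, skill_sd=1.0, luck_sd=1.0, seed=0):
    """Return (first, second) performance scores for hypothetical pilots.

    Each score is a stable skill component plus independent luck.
    No effect of praise or punishment is modeled (assumption).
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_pilots):
        skill = rng.gauss(0.0, skill_sd)
        first.append(skill + rng.gauss(0.0, luck_sd))
        second.append(skill + rng.gauss(0.0, luck_sd))
    return first, second


def mean_change(first, second, selector):
    """Average change from first to second flight among selected pilots."""
    changes = [s - f for f, s in zip(first, second) if selector(f)]
    return statistics.mean(changes)


first, second = simulate_pilots()
cutoffs = statistics.quantiles(first, n=10)  # nine decile cut points
low, high = cutoffs[0], cutoffs[-1]

# Worst tenth (who would have been punished) and best tenth (praised).
change_after_bad = mean_change(first, second, lambda f: f <= low)
change_after_good = mean_change(first, second, lambda f: f >= high)

print(f"mean change after a poor first flight: {change_after_bad:+.2f}")
print(f"mean change after a good first flight: {change_after_good:+.2f}")
```

Because the same pattern appears with no feedback in the model, observing it in real pilots cannot by itself discriminate between a world in which feedback matters and one in which it does not.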
Although Ariew et al. (2017) present this as the final step, Rice et al. (in press) suggest that Galton’s explanatory schema included a third step: the interpretation of the modeled result as being applicable to the biological phenomenon by “justify[ing] the application of results obtained from highly idealized statistical models to real-world systems.” It is unclear what makes this final interpretation/justification step necessary, since the success of the first step already depended on the justification for the interpretation of the biological problem as a statistical one.
Letter from Francis Galton to George Darwin, 12 January 1877, Galton Papers, University College London (GALTON/3/3/7). The illustration of Galton’s two-stage quincunx that Ariew et al. (2017) include in their paper is taken from Stigler (1986), who reproduced it from Galton’s letter. ARR suggest that they are following Stigler’s analysis of how Galton explained intergenerational stability using this device. However, Stigler only asserts that Galton used it “to provide an analogue proof that a normal mixture of normal distributions was itself a normal” (Stigler 1986, pp. 280–281).
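The mathematical fact that the device is said to demonstrate can be stated compactly. The following is a standard textbook result given as a sketch, not a reconstruction of Galton’s or Stigler’s own presentation; the symbols $X$, $Y$, $\varepsilon$, $\mu$, $\sigma$, and $\tau$ are notation assumed here for illustration.

```latex
% Assumed notation: X is the first-stage position, Y the final position.
X \sim \mathcal{N}(\mu, \sigma^{2}), \qquad
Y \mid X = x \;\sim\; \mathcal{N}(x, \tau^{2}).

% Equivalently, Y = X + \varepsilon with \varepsilon independent of X:
Y = X + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \tau^{2}).

% A sum of independent normal variables is normal, so the mixture is normal:
Y \sim \mathcal{N}\!\left(\mu,\; \sigma^{2} + \tau^{2}\right).
```

On this reading, the second stage of the quincunx spreads each first-stage compartment by the same normal law, and the accumulated result is again a normal distribution with a larger variance.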
Although ARR briefly discuss this experiment, they fail to appreciate its import. They take the outcome of the experiment to be supported by the simulation of the two-stage quincunx, rather than the balancing quincunx. Immediately following their discussion of the two-stage quincunx, they write: “This is the same result seen in the sweet pea breeding experiment,” and “The sweet pea experiment acted exactly in the way that the [two-stage] quincunx predicts” (Ariew et al. 2017).
If the degree of deviation of offspring character values from the parental mean had varied with the parental character value, the uniform quincunxal pattern would not have been a good model of the action of family variability. Galton recognized this and reported that “if it had been otherwise, I cannot imagine, from theoretical considerations, how the typical problem could be solved” (Galton 1877, p. 291).
Galton noted that the order in which he modeled these two processes was arbitrary. It was only for modeling purposes that he needed to present the two processes as acting sequentially. Hence, the distribution in the middle is an artifact of the material simulation and has no real-world referent. It should not be mistaken for an intermediate ‘generation’.
Galton argued that the height of the mother needed to be multiplied by a factor of 1.08 before taking the average. The details of this calculation and Galton’s defense of the mid-parent concept need not concern us here.
Ariew A, Rice C, Rohwer Y (2015) Autonomous-statistical explanations and natural selection. Br J Philos Sci 66(3):635–658
Ariew A, Rohwer Y, Rice C (2017) Galton, reversion and the quincunx: the rise of statistical explanation. Stud Hist Philos Sci Part C Stud Hist Philos Biol Biomed Sci 66:63–72
Galton F (1872) On blood-relationship. Proc R Soc 20:294–402
Galton F (1877) Typical laws of heredity. Proc R Inst 8:282–301
Galton F (1886) Presidential address, section H, anthropology. Rep Br Assoc Adv Sci 55:1206–1214
Galton F (1889) Natural inheritance. Macmillan and Co, London
Hacking I (1983) The autonomy of statistical law. In: Rescher N (ed) Scientific explanation and understanding. University Press of America, Lanham, pp 3–19
Hacking I (1990) The taming of chance. Cambridge University Press, Cambridge
Hacking I (1992) Statistical language, statistical truth and statistical reason: the self-authentification of a style of scientific reason. In: McMullin E (ed) The social dimension of science. University of Notre Dame Press, Notre Dame, pp 130–157
Hempel C (1965) Aspects of scientific explanation and other essays in the philosophy of science. Free Press, New York
Hotelling H (1933) Review of the triumph of mediocrity in business, by Horace Secrist. J Am Stat Assoc 28(184):463–465
Kahneman D (2012) Thinking, fast and slow. Penguin Books, London
Lange M (2013) Really statistical explanations and genetic drift. Philos Sci 80(2):169–188
Lange M (2017) Because without cause. Oxford University Press, Oxford
Lipton P (2004) Inference to the best explanation. Routledge, London
Lipton P (2009) Causation and explanation. In: Beebee H, Hitchcock C, Menzies P (eds) The Oxford handbook of causation. Oxford University Press, Oxford
Morton V, Torgerson DJ (2003) Effect of regression to the mean on decision making in health care. BMJ 326(7398):1083–1084
Nesselroade JR, Stigler SM, Baltes PB (1980) Regression toward the mean and the study of change. Psychol Bull 88(3):622–637
Rice C, Rohwer Y, Ariew A (in press) Explanatory schema and the process of model building. Synthese
Salmon W (1971) Statistical explanation. In: Salmon W (ed) Statistical explanation and statistical relevance. University of Pittsburgh Press, Pittsburgh, pp 29–87
Schall T, Smith G (2000) Do baseball players regress toward the mean? Am Stat 54(4):231
Senn S (1997) Editorial—regression to the mean. Stat Methods Med Res 6(2):99–104
Senn SJ, Collie GS (1988) Accident blackspots and the bivariate negative binomial. Traffic Eng Control 29(3):168–169
Smith G (2018) What the luck? Bloomsbury Publishing Plc, London
Stigler SM (1986) The history of statistics: the measurements of uncertainty before 1900. Harvard University Press, Cambridge
Stigler SM (1999) Statistics on the table. Harvard University Press, Cambridge
Stigler SM (2010) Darwin, Galton and the statistical enlightenment. J R Stat Soc Ser A Stat Soc 173(3):469–482
I thank the audience of the Videnskabsteori Seminar at the Niels Bohr Institute and my colleagues in the Section for History and Philosophy of Science for helpful comments and suggestions. This work was supported by a Veni research grant from the Netherlands Organisation for Scientific Research (NWO), grant number 275-20-060.
Conflict of interest
The author declares that he has no conflict of interest.
Cite this article
Witteveen, J. Regression explanation and statistical autonomy. Biol Philos 34, 51 (2019). https://doi.org/10.1007/s10539-019-9705-z
- Regression toward the mean
- Regression explanation
- Statistical autonomy
- Statistical explanation
- Regression fallacy
- Francis Galton