Abstract
Simple extrapolation is the orthodox approach to extrapolating from clinical trials in evidence-based medicine: extrapolate the relative effect size (e.g. the relative risk) from the trial unless there is a compelling reason not to do so. I argue that this method relies on a myth and a fallacy. The myth of simple extrapolation is the idea that the relative risk is a ‘golden ratio’ that is usually transportable due to some special mathematical or theoretical property. The fallacy of simple extrapolation is an unjustified argument from ignorance: we conclude that the relative effect size is transportable in the absence of evidence to the contrary. In short, simple extrapolation is a deeply problematic solution to the problem of extrapolation.
Notes
An even less complicated strategy—simplest extrapolation—would let the assumption of generalizability go completely unchecked. Broadbent (2013) uses ‘simple extrapolation’ to denote this simplest approach.
Not all members of the EBM community recommend simple extrapolation. Howick et al. (2013) argue that several historical examples of failed extrapolations are enough to reject it as a robust strategy.
In math, two numbers (A, B) are ‘in the golden ratio’ if the ratio of the larger number (A) to the smaller number (B) is equal to the ratio of their sum to the larger number, or A/B = (A + B)/A = φ. The golden ratio (φ)—variously known as the ‘golden section’ or ‘divine proportion’—is a constant, approximately equal to 1.618. It is an important ratio in mathematics, but also crops up frequently in art, architecture and nature. Analogously, in medicine we often proceed as if relative risks are golden constants.
By contrast, some clinical researchers assume that quantitative effects typically differ across patient subgroups, whether inside or outside of a trial (Lubsen and Tijssen 1989; Bailey 1994). Salim Yusuf et al. suggest that “‘quantitative’ differences in the size of the treatment effect in different subgroups are quite likely to exist” (1984, p. 413). Meanwhile, in the first edition of Evidence-Based Medicine: How to Practice and Teach EBM, David Sackett et al. note: “this (constancy of RR) is a big assumption” (1997, p. 170).
If we construe RRs and ARRs as quantifying a change in probability of the outcome (as a ratio or as a difference, respectively), the problem does not go away, because negative probabilities and probabilities greater than 100% are incoherent. For instance, an RR of 2 applied to an untreated outcome probability of 60% implies a treated probability of 120%, while an ARR of 30% applied to an untreated risk of 20% implies a treated risk of −10%.
While the framework assumes determinism, we could adapt this formula to allow for indeterminism by inserting coefficients representing the probability of the outcome for each unique type of individual in the population. The RR would then represent the effect size as a change in probability of the outcome.
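A minimal sketch of such an adaptation (the type frequencies f_i and type-specific probabilities p_i are notation introduced here for illustration, not the paper’s own): partition the population into types i, where f_i is the frequency of type i and p_i(B|I) is the probability of outcome B for type i under intervention I. Then:

RR = (Σ_i f_i · p_i(B|I)) / (Σ_i f_i · p_i(B|¬I))

Under determinism each p_i is 0 or 1 and the formula reduces to the deterministic case, with the RR comparing outcome frequencies rather than changes in individual probabilities.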
Kravitz et al. (2004) provide several examples of treatment effect heterogeneity produced by particular genetic, behavioural and environmental variations.
One reviewer suggests that my criticism of GG relies on an argument from ignorance: I do not know of any justification for GG; therefore, GG is not justified. My argument is rather that GG is unjustified because it must itself make unjustified assumptions: either specious assumptions about the mathematical constancy of the RR or specious theoretical assumptions about homogeneity of individual treatment effects.
Elliott Sober (2009) provides a formal probabilistic justification for these ‘absence of evidence’ arguments.
It turns out that this RR is not generalizable to the 30–49% stenosis population (Rothwell et al. 2003).
References
Bailey, K. R. (1994). Generalizing the results of randomized clinical trials. Controlled Clinical Trials, 15(1), 15–23.
Bareinboim, E., & Pearl, J. (2013). A general algorithm for deciding transportability of experimental results. Journal of Causal Inference, 1(1), 107–134.
Barnett, H. J. M., Taylor, D. W., Haynes, R. B., Sackett, D. L., Peerless, S. J., Ferguson, G. G., et al. (1991). Beneficial effect of carotid endarterectomy in symptomatic patients with high-grade carotid stenosis. New England Journal of Medicine, 325(7), 445–453.
Broadbent, A. (2013). Philosophy of epidemiology. Basingstoke: Palgrave Macmillan.
Broadbent, A. (2015). Risk relativism and physical law. Journal of Epidemiology and Community Health, 69(1), 92–94.
Cartwright, N. (2010). What are randomised controlled trials good for? Philosophical Studies, 147(1), 59–70.
Cartwright, N. (2011). A philosopher’s view of the long road from RCTs to effectiveness. Lancet, 377(9775), 1400–1401.
Cartwright, N. (2012). Will this policy work for you? Predicting effectiveness better: How philosophy helps. Philosophy of Science, 79(5), 973–989.
Dans, A. L., Dans, L. F., Guyatt, G. H., & Richardson, S. (1998). Users’ guides to the medical literature. XIV. How to decide on the applicability of clinical trial results to your patient. Journal of the American Medical Association, 279(7), 545–549.
Douglas, H. (2000). Inductive risk and values in science. Philosophy of Science, 67(4), 559–579.
ECST Collaborative Group. (1991). MRC European Carotid Surgery Trial: Interim results for symptomatic patients with severe (70–99%) or with mild (0–29%) carotid stenosis. Lancet, 337(8752), 1235–1243.
Fuller, J. (2013a). Rhetoric and argumentation: How clinical practice guidelines think. Journal of Evaluation in Clinical Practice, 19(3), 433–441.
Fuller, J. (2013b). Rationality and the generalization of randomized controlled trial evidence. Journal of Evaluation in Clinical Practice, 19(4), 644–647.
Fuller, J., & Flores, L. J. (2015). The Risk GP Model: The standard model of prediction in medicine. Studies in History and Philosophy of Biological and Biomedical Sciences, 54, 49–61.
Glasziou, P. P., & Irwig, L. M. (1995). An evidence based approach to individualising treatment. British Medical Journal, 311(7016), 1356–1359.
Goulding, M. R., Rogers, M. E., & Smith, S. M. (2003). Public health and aging: Trends in aging—United States and worldwide. Journal of the American Medical Association, 289(11), 1371–1373.
Greenland, S., & Robins, J. M. (1986). Identifiability, exchangeability, and epidemiological confounding. International Journal of Epidemiology, 15(3), 413–419.
Guyatt, G., Rennie, D., Meade, M. O., & Cook, D. J. (2008). Users’ guides to the medical literature: Essentials of evidence-based clinical practice (2nd ed.). New York: McGraw-Hill Medical.
Hempel, C. G. (1965). Aspects of scientific explanation and other essays in the philosophy of science. New York: The Free Press.
Horton, R. (2000). Common sense and figures: The rhetoric of validity in medicine. Bradford Hill Memorial Lecture 1999. Statistics in Medicine, 19(23), 3149–3164.
Howick, J., Glasziou, P., & Aronson, J. K. (2013). Problems with using mechanisms to solve the problem of extrapolation. Theoretical Medicine and Bioethics, 34, 275–291.
Kravitz, R. L., Duan, N., & Braslow, J. (2004). Evidence-based medicine, heterogeneity of treatment effects, and the trouble with averages. Milbank Quarterly, 82(4), 661–687.
Lubsen, J., & Tijssen, J. G. (1989). Large trials with simple protocols: Indications and contraindications. Controlled Clinical Trials, 10(4 Suppl), 151s–160s.
Petticrew, M., & Chalmers, I. (2011). Use of research evidence in practice. Lancet, 378(9804), 1696.
Post, P. N., de Beer, H., & Guyatt, G. H. (2013). How to generalize efficacy results of randomized trials: Recommendations based on a systematic review of possible approaches. Journal of Evaluation in Clinical Practice, 19(4), 638–643.
Rothwell, P. M. (1995). Can overall results of clinical trials be applied to all patients? Lancet, 345(8965), 1616–1619.
Rothwell, P. M. (2005). External validity of randomised controlled trials: “To whom do the results of this trial apply?”. Lancet, 365(9453), 82–93.
Rothwell, P. M., Eliasziw, M., Gutnikov, S. A., Fox, A. J., Taylor, D. W., Mayberg, M. R., et al. (2003). Analysis of pooled data from the randomised controlled trials of endarterectomy for symptomatic carotid stenosis. Lancet, 361(9352), 107–116.
Rothwell, P. M., Mehta, Z., Howard, S. C., Gutnikov, S. A., & Warlow, C. P. (2005). Treating individuals 3: From subgroups to individuals: General principles and the example of carotid endarterectomy. Lancet, 365(9455), 256–265.
Sackett, D. L., Richardson, W. S., Rosenberg, W., & Haynes, R. B. (1997). Evidence-based medicine: How to practice and teach EBM. New York: Churchill Livingstone.
Schulz, U., & Rothwell, P. (2001). Sex differences in the angiographic and gross pathological appearance of carotid atherosclerotic plaques. Journal of the Neurological Sciences, 187(Suppl 1), S127.
Sheldon, T. A., Guyatt, G. H., & Haines, A. (1998). Getting-research findings into practice—When to act on the evidence. British Medical Journal, 317(7151), 139–142.
Sober, E. (2009). Absence of evidence and evidence of absence: Evidential transitivity in connection with fossils, fishing, fine-tuning, and firing squads. Philosophical Studies, 143(1), 63–90.
Steel, D. P. (2008). Across the boundaries: Extrapolation in biology and social science. Oxford: Oxford University Press.
Stegenga, J. (2015). Measuring effectiveness. Studies in History and Philosophy of Biological and Biomedical Sciences, 54, 62–71.
Straus, S. E., Glasziou, P., Richardson, W. S., & Haynes, R. B. (2011). Evidence-based medicine: How to practice and teach it (4th ed.). New York: Churchill-Livingstone.
The NNT Group. (2018). Therapy (NNT) Reviews. Retrieved from http://www.thennt.com/home-nnt/. Accessed Nov 2018.
Van Spall, H. G. C., Toren, A., Kiss, A., & Fowler, R. A. (2007). Eligibility criteria of randomized controlled trials published in high-impact general medical journals: A systematic sampling review. Journal of the American Medical Association, 297(11), 1233–1240.
Walton, D. (2006). Rules for reasoning from knowledge and lack of knowledge. Philosophia, 34(3), 355–376.
Weiss, N. S. (2006). Clinical epidemiology: The study of the outcome of illness. Oxford: Oxford University Press.
Worrall, J. (2002). What evidence in evidence-based medicine? Philosophy of Science, 69(Proceedings), S316–S330.
Yusuf, S., Collins, R., & Peto, R. (1984). Why do we need some large, simple randomized trials? Statistics in Medicine, 3(4), 409–422.
Acknowledgements
Thanks to Nancy Cartwright, Iain Chalmers, Luis Flores, Nicholas Howell, Ayelet Kuper, Mathew Mercuri, Jacob Stegenga, David Teira, Paul Thompson, and Ross Upshur for helpful and insightful feedback on earlier drafts of this paper. I am especially grateful to Mathew Mercuri for suggesting the analogy between unknown differences in extrapolation and unknown confounders in causal inference. Thanks also to the audience at the “Science and Certainty” workshop at the University of California, San Diego (2015) as well as the “Too Much Medicine” conference at Oxford University (2017) for comments and discussion of these ideas. I am thankful for funding support from the Canadian Institutes of Health Research and the McLaughlin Centre. I have no conflicts of interest to declare.
Appendix
We can model the question of whether or not to extrapolate the effect size using decision theory, by comparing the expected utility of extrapolating and intervening (EU_I) with the expected utility of not extrapolating and not intervening (EU_¬I). If we intervene with I, there are two possible scenarios: the intervention’s major benefit is generalizable (G), or the intervention’s major benefit is not generalizable (¬G). There is a partial expected utility associated with G (EU_G), as well as a partial expected utility associated with ¬G (EU_¬G). EU_G depends on the benefits and harms of the intervention, as well as their probabilities. We can predict the probability of a beneficial outcome (B) or harmful outcome (H) using the effect size and the untreated outcome probability (e.g. RR = p(B|I)/p(B|¬I), thus p(B|I) = RR × p(B|¬I)). If we assume one major benefit B (the primary outcome targeted by the intervention) and one major harm H (a major side effect or adverse event), then EU_G = p(B|I)u(B) + p(H|I)u(H), where p(B|I) and p(H|I) are the probabilities and u(B) and u(H) are the utilities (u(B) is positive and u(H) is negative). We can further assume that EU_¬G < EU_G; otherwise, there is no point in even considering intervention. Meanwhile, the total expected utility of intervening is: EU_I = p(G)EU_G + p(¬G)EU_¬G. Substituting the previous equations for EU_G and p(B|I):

EU_I = p(G)[RR × p(B|¬I)u(B) + p(H|I)u(H)] + p(¬G)EU_¬G
According to decision theory, we should implement the intervention if EU_I > EU_¬I. Simple extrapolation implies that this will usually be the case unless compelling evidence substantially raises p(¬G) (and thus substantially lowers EU_I). However, EU_I also depends on EU_G and EU_¬G, and thus on the benefits and harms of intervention and their effect sizes. Therefore, even if p(¬G) starts out low (a risky assumption in itself), EU_I might exceed EU_¬I by only a slim margin, so that raising p(¬G) just a little tips the balance in favour of EU_¬I.
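To make this sensitivity concrete, here is a minimal numerical sketch in Python. Every input value (the RR, the probabilities, the utilities, and EU_¬I computed as the untreated chance of benefit) is hypothetical, chosen only to illustrate the model above; none of it is drawn from the trials discussed in the paper.

```python
# Minimal numerical sketch of the appendix's decision model.
# All input values below are hypothetical illustrations, not trial data.

def eu_intervene(p_g, rr, p_b_untreated, p_h, u_b, u_h, eu_not_g):
    """Total expected utility of intervening:
    EU_I = p(G)*EU_G + p(notG)*EU_notG, where
    EU_G = RR * p(B|notI) * u(B) + p(H|I) * u(H).
    """
    eu_g = rr * p_b_untreated * u_b + p_h * u_h
    return p_g * eu_g + (1 - p_g) * eu_not_g

# Hypothetical inputs:
rr = 1.5              # relative risk of the beneficial outcome from the trial
p_b_untreated = 0.30  # p(B|notI): untreated probability of the benefit
p_h = 0.10            # p(H|I): probability of the major harm if treated
u_b, u_h = 1.0, -1.0  # utilities of the benefit and the harm
eu_not_g = 0.20       # partial expected utility if the benefit fails to generalize

# Expected utility of not intervening: the untreated chance of benefit, no harm
# (an assumption the appendix leaves implicit).
eu_no_intervene = p_b_untreated * u_b

for p_g in (0.70, 0.65):  # a small rise in p(notG): 0.30 -> 0.35
    eu_i = eu_intervene(p_g, rr, p_b_untreated, p_h, u_b, u_h, eu_not_g)
    verdict = "intervene" if eu_i > eu_no_intervene else "do not intervene"
    print(f"p(G) = {p_g:.2f}: EU_I = {eu_i:.4f} vs EU_notI = {eu_no_intervene:.2f} -> {verdict}")
```

In this toy case, raising p(¬G) from 0.30 to 0.35 flips EU_I from 0.3050 to 0.2975, reversing the verdict against EU_¬I = 0.30—exactly the sensitivity described above.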