The myth and fallacy of simple extrapolation in medicine

Abstract

Simple extrapolation is the orthodox approach to extrapolating from clinical trials in evidence-based medicine: extrapolate the relative effect size (e.g. the relative risk) from the trial unless there is a compelling reason not to do so. I argue that this method relies on a myth and a fallacy. The myth of simple extrapolation is the idea that the relative risk is a ‘golden ratio’ that is usually transportable due to some special mathematical or theoretical property. The fallacy of simple extrapolation is an unjustified argument from ignorance: we conclude that the relative effect size is transportable in the absence of evidence to the contrary. In short, simple extrapolation is a deeply problematic solution to the problem of extrapolation.

Notes

  1.

    An even less complicated strategy—simplest extrapolation—would let the assumption of generalizability go completely unchecked. Broadbent (2013) uses ‘simple extrapolation’ to connote this simplest approach.

  2.

    Not all members of the EBM community recommend simple extrapolation. Howick et al. (2013) argue that several historical examples of failed extrapolations are enough to reject it as a robust strategy.

  3.

    In mathematics, two numbers (A, B) are ‘in the golden ratio’ if the ratio of the larger number (A) to the smaller number (B) is equal to the ratio of their sum to the larger number, that is, A/B = (A + B)/A = φ. The golden ratio (φ), variously known as the ‘golden section’ or ‘divine proportion’, is a constant approximately equal to 1.618. It is an important ratio in mathematics, and it also crops up frequently in art, architecture and nature. Analogously, in medicine we often proceed as if relative risks are golden constants.

  4.

    In comparison, some clinical researchers assume that quantitative effects typically differ across patient subgroups inside or outside of a trial (Lubsen and Tijssen 1989; Bailey 1994). Salim Yusuf et al. suggest that “‘quantitative’ differences in the size of the treatment effect in different subgroups are quite likely to exist” (1984, p. 413). Meanwhile, in the first edition of Evidence-Based Medicine: How to Practice and Teach EBM, David Sackett et al. note: “this (constancy of RR) is a big assumption” (1997, p. 170).

  5.

    If we construe RRs and ARRs as quantifying a change in probability of the outcome (as a ratio or as a difference, respectively), the problem does not go away because negative probabilities and probabilities greater than 100% are incoherent.

  6.

    While the framework assumes determinism, we could adapt this formula to allow for indeterminism by inserting coefficients representing the probability of the outcome for each unique type of individual in the population. The RR would then represent the effect size as a change in probability of the outcome.

  7.

    Kravitz et al. (2004) provide several examples of treatment effect heterogeneity produced by particular genetic, behavioural and environmental variations.

  8.

    One reviewer suggests that my criticism of GG relies on an argument from ignorance: I do not know of any justification for GG; therefore, GG is not justified. My argument is rather that GG is unjustified because it must itself make unjustified assumptions: either specious assumptions about the mathematical constancy of the RR or specious theoretical assumptions about homogeneity of individual treatment effects.

  9.

    Elliott Sober (2009) provides a formal probabilistic justification for these ‘absence of evidence’ arguments.

  10.

    It turns out that this RR is not generalizable to the 30–49% stenosis population (Rothwell et al. 2003).

  11.

    Steel (2008) as well as Elias Bareinboim and Judea Pearl (2013) offer theories of extrapolation using structural models. Cartwright formulates sufficient conditions for external validity using the probabilistic theory of causality (2010) as well as causal equations (2012).
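The point in note 5, that treating RRs and ARRs as changes in probability does not rescue their constancy, can be illustrated with a small sketch. It assumes an effect size is carried unchanged to a population with a different baseline risk; the function names and all numbers are hypothetical and are not taken from the paper.

```python
def risk_from_relative_risk(baseline_risk: float, rr: float) -> float:
    """Treated risk predicted by carrying a fixed relative risk to a new baseline."""
    return rr * baseline_risk


def risk_from_absolute_reduction(baseline_risk: float, arr: float) -> float:
    """Treated risk predicted by carrying a fixed absolute risk reduction to a new baseline."""
    return baseline_risk - arr


def is_coherent_probability(p: float) -> bool:
    """A probability must lie in the closed interval [0, 1]."""
    return 0.0 <= p <= 1.0


# Hypothetical effect sizes and baseline risks, chosen only to expose the boundary problem.
cases = [
    ("ARR = 0.10, baseline 0.05", risk_from_absolute_reduction(0.05, 0.10)),
    ("RR = 2.0, baseline 0.60", risk_from_relative_risk(0.60, 2.0)),
    ("RR = 0.5, baseline 0.30", risk_from_relative_risk(0.30, 0.5)),
]
for label, predicted in cases:
    status = "coherent" if is_coherent_probability(predicted) else "incoherent"
    print(f"{label}: predicted treated risk {predicted:.2f} ({status})")
```

In this sketch the first two cases yield a negative probability and a probability above 1 respectively, so neither measure can be held fixed across every possible baseline; the third case shows that the same arithmetic is unproblematic when the baseline happens to be compatible with the effect size.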

References

  1. Bailey, K. R. (1994). Generalizing the results of randomized clinical trials. Controlled Clinical Trials, 15(1), 15–23.

  2. Bareinboim, E., & Pearl, J. (2013). A general algorithm for deciding transportability of experimental results. Journal of Causal Inference, 1(1), 107–134.

  3. Barnett, H. J. M., Taylor, D. W., Haynes, R. B., Sackett, D. L., Peerless, S. J., Ferguson, G. G., et al. (1991). Beneficial effect of carotid endarterectomy in symptomatic patients with high-grade carotid stenosis. New England Journal of Medicine, 325(7), 445–453.

  4. Broadbent, A. (2013). Philosophy of epidemiology. Basingstoke: Palgrave Macmillan.

  5. Broadbent, A. (2015). Risk relativism and physical law. Journal of Epidemiology and Community Health, 69(1), 92–94.

  6. Cartwright, N. (2010). What are randomised controlled trials good for? Philosophical Studies, 147(1), 59–70.

  7. Cartwright, N. (2011). A philosopher’s view of the long road from RCTs to effectiveness. Lancet, 377(9775), 1400–1401.

  8. Cartwright, N. (2012). Will this policy work for you? Predicting effectiveness better: How philosophy helps. Philosophy of Science, 79(5), 973–989.

  9. Dans, A. L., Dans, L. F., Guyatt, G. H., & Richardson, S. (1998). Users’ guides to the medical literature. XIV. How to decide on the applicability of clinical trial results to your patient. Journal of the American Medical Association, 279(7), 545–549.

  10. Douglas, H. (2000). Inductive risk and values in science. Philosophy of Science, 67(4), 559–579.

  11. ECST Collaborative Group. (1991). MRC European Carotid Surgery Trial: Interim results for symptomatic patients with severe (70–99%) or with mild (0–29%) carotid stenosis. Lancet, 337(8752), 1235–1243.

  12. Fuller, J. (2013a). Rhetoric and argumentation: How clinical practice guidelines think. Journal of Evaluation in Clinical Practice, 19(3), 433–441.

  13. Fuller, J. (2013b). Rationality and the generalization of randomized controlled trial evidence. Journal of Evaluation in Clinical Practice, 19(4), 644–647.

  14. Fuller, J., & Flores, L. J. (2015). The Risk GP Model: The standard model of prediction in medicine. Studies in History and Philosophy of Biological and Biomedical Sciences, 54, 49–61.

  15. Glasziou, P. P., & Irwig, L. M. (1995). An evidence based approach to individualising treatment. British Medical Journal, 311(7016), 1356–1359.

  16. Goulding, M. R., Rogers, M. E., & Smith, S. M. (2003). Public health and aging: Trends in aging—United States and worldwide. Journal of the American Medical Association, 289(11), 1371–1373.

  17. Greenland, S., & Robins, J. M. (1986). Identifiability, exchangeability, and epidemiological confounding. International Journal of Epidemiology, 15(3), 413–419.

  18. Guyatt, G., Rennie, D., Meade, M. O., & Cook, D. J. (2008). Users’ guides to the medical literature: Essentials of evidence-based clinical practice (2nd ed.). New York: McGraw-Hill Medical.

  19. Hempel, C. G. (1965). Aspects of scientific explanation and other essays in the philosophy of science. New York: The Free Press.

  20. Horton, R. (2000). Common sense and figures: The rhetoric of validity in medicine. Bradford Hill Memorial Lecture 1999. Statistics in Medicine, 19(23), 3149–3164.

  21. Howick, J., Glasziou, P., & Aronson, J. K. (2013). Problems with using mechanisms to solve the problem of extrapolation. Theoretical Medicine and Bioethics, 34, 275–291.

  22. Kravitz, R. L., Duan, N., & Braslow, J. (2004). Evidence-based medicine, heterogeneity of treatment effects, and the trouble with averages. Milbank Quarterly, 82(4), 661–687.

  23. Lubsen, J., & Tijssen, J. G. (1989). Large trials with simple protocols: Indications and contraindications. Controlled Clinical Trials, 10(4 Suppl), 151s–160s.

  24. Petticrew, M., & Chalmers, I. (2011). Use of research evidence in practice. Lancet, 378(9804), 1696.

  25. Post, P. N., de Beer, H., & Guyatt, G. H. (2013). How to generalize efficacy results of randomized trials: Recommendations based on a systematic review of possible approaches. Journal of Evaluation in Clinical Practice, 19(4), 638–643.

  26. Rothwell, P. M. (1995). Can overall results of clinical trials be applied to all patients? Lancet, 345(8965), 1616–1619.

  27. Rothwell, P. M. (2005). External validity of randomised controlled trials: “To whom do the results of this trial apply?”. Lancet, 365(9453), 82–93.

  28. Rothwell, P. M., Eliasziw, M., Gutnikov, S. A., Fox, A. J., Taylor, D. W., Mayberg, M. R., et al. (2003). Analysis of pooled data from the randomised controlled trials of endarterectomy for symptomatic carotid stenosis. Lancet, 361(9352), 107–116.

  29. Rothwell, P. M., Mehta, Z., Howard, S. C., Gutnikov, S. A., & Warlow, C. P. (2005). Treating individuals 3: From subgroups to individuals: General principles and the example of carotid endarterectomy. Lancet, 365(9455), 256–265.

  30. Sackett, D. L., Richardson, W. S., Rosenberg, W., & Haynes, R. B. (1997). Evidence-based medicine: How to practice and teach EBM. New York: Churchill Livingstone.

  31. Schulz, U., & Rothwell, P. (2001). Sex differences in the angiographic and gross pathological appearance of carotid atherosclerotic plaques. Journal of the Neurological Sciences, 187(Suppl 1), S127.

  32. Sheldon, T. A., Guyatt, G. H., & Haines, A. (1998). Getting research findings into practice: When to act on the evidence. British Medical Journal, 317(7151), 139–142.

  33. Sober, E. (2009). Absence of evidence and evidence of absence: Evidential transitivity in connection with fossils, fishing, fine-tuning, and firing squads. Philosophical Studies, 143(1), 63–90.

  34. Steel, D. P. (2008). Across the boundaries: Extrapolation in biology and social science. Oxford: Oxford University Press.

  35. Stegenga, J. (2015). Measuring effectiveness. Studies in History and Philosophy of Biological and Biomedical Sciences, 54, 62–71.

  36. Straus, S. E., Glasziou, P., Richardson, W. S., & Haynes, R. B. (2011). Evidence-based medicine: How to practice and teach it (4th ed.). New York: Churchill Livingstone.

  37. The NNT Group. (2018). Therapy (NNT) Reviews. Retrieved from http://www.thennt.com/home-nnt/. Accessed Nov 2018.

  38. Van Spall, H. G. C., Toren, A., Kiss, A., & Fowler, R. A. (2007). Eligibility criteria of randomized controlled trials published in high-impact general medical journals: A systematic sampling review. Journal of the American Medical Association, 297(11), 1233–1240.

  39. Walton, D. (2006). Rules for reasoning from knowledge and lack of knowledge. Philosophia, 34(3), 355–376.

  40. Weiss, N. S. (2006). Clinical epidemiology: The study of the outcome of illness. Oxford: Oxford University Press.

  41. Worrall, J. (2002). What evidence in evidence-based medicine? Philosophy of Science, 69(Proceedings), S316–S330.

  42. Yusuf, S., Collins, R., & Peto, R. (1984). Why do we need some large, simple randomized trials? Statistics in Medicine, 3(4), 409–422.

Acknowledgements

Thanks to Nancy Cartwright, Iain Chalmers, Luis Flores, Nicholas Howell, Ayelet Kuper, Mathew Mercuri, Jacob Stegenga, David Teira, Paul Thompson, and Ross Upshur for helpful and insightful feedback on earlier drafts of this paper. I am especially grateful to Mathew Mercuri for suggesting the analogy between unknown differences in extrapolation and unknown confounders in causal inference. Thanks also to the audience at the “Science and Certainty” workshop at the University of California, San Diego (2015) as well as the “Too Much Medicine” conference at Oxford University (2017) for comments and discussion of these ideas. I am thankful for funding support from the Canadian Institutes of Health Research and the McLaughlin Centre. I have no conflicts of interest to declare.

Author information

Corresponding author

Correspondence to Jonathan Fuller.

Appendix

We can model the question of whether or not to extrapolate the effect size using decision theory, by comparing the expected utility of extrapolating and intervening (EU_I) with the expected utility of not extrapolating and not intervening (EU_¬I). If we intervene with I, there are two possible scenarios: the intervention’s major benefit is generalizable (G), or it is not generalizable (¬G). There is a partial expected utility associated with G (EU_G), as well as a partial expected utility associated with ¬G (EU_¬G). EU_G depends on the benefits and harms of the intervention, as well as their probabilities. We can predict the probability of a beneficial outcome (B) or harmful outcome (H) using the effect size and the untreated outcome probability; for example, since RR = p(B|I)/p(B|¬I), we have p(B|I) = RR × p(B|¬I). If we assume one major benefit B (the primary outcome targeted by the intervention) and one major harm H (a major side effect or adverse event), then EU_G = p(B|I)u(B) + p(H|I)u(H), where p(B|I) and p(H|I) are the probabilities and u(B) and u(H) are the utilities (u(B) is positive and u(H) is negative). We can further assume that EU_¬G < EU_G; otherwise, there is no point in even considering intervention. Meanwhile, the total expected utility of intervening is EU_I = p(G)EU_G + p(¬G)EU_¬G. Substituting the expressions for EU_G and p(B|I) given above:

$$\text{EU}_{\text{I}} = \text{p}(\text{G})\left[\left(\text{RR} \times \text{p}(\text{B} \mid \neg\text{I})\right)\text{u}(\text{B}) + \text{p}(\text{H} \mid \text{I})\,\text{u}(\text{H})\right] + \text{p}(\neg\text{G})\,\text{EU}_{\neg\text{G}}$$

According to decision theory, we should implement the intervention if EU_I > EU_¬I. Simple extrapolation implies that this will usually be the case unless compelling evidence substantially raises p(¬G) (and thus substantially lowers EU_I). However, EU_I also depends on EU_G and EU_¬G, and thus on the benefits and harms of intervention and their effect sizes. Therefore, even if p(¬G) is low at the outset (a risky assumption), EU_I might be only slightly greater than EU_¬I, and raising p(¬G) by just a little might tip the balance in favour of not intervening.
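The sensitivity described in this appendix can be made concrete with a minimal numerical sketch. Everything numeric here is hypothetical and not from the paper. The sketch also adds two modelling assumptions that the appendix leaves open: that when the benefit fails to generalize the patient keeps the untreated chance of benefit but still bears the harm, and that not intervening carries no treatment harm.

```python
def eu_if_generalizable(rr, p_benefit_untreated, u_benefit, p_harm_treated, u_harm):
    """EU_G = p(B|I)u(B) + p(H|I)u(H), with p(B|I) = RR * p(B|not-I)."""
    p_benefit_treated = rr * p_benefit_untreated
    return p_benefit_treated * u_benefit + p_harm_treated * u_harm


def eu_of_intervening(p_generalizable, eu_g, eu_not_g):
    """EU_I = p(G)EU_G + p(not-G)EU_notG, where p(not-G) = 1 - p(G)."""
    return p_generalizable * eu_g + (1.0 - p_generalizable) * eu_not_g


def break_even_p_generalizable(eu_g, eu_not_g, eu_no_intervention):
    """Value of p(G) at which EU_I equals the utility of not intervening."""
    if eu_g <= eu_not_g:
        raise ValueError("the appendix assumes EU_notG < EU_G")
    return (eu_no_intervention - eu_not_g) / (eu_g - eu_not_g)


# Hypothetical inputs (not from the paper).
RR, P_B_UNTREATED, U_B = 1.2, 0.50, 10.0
P_H_TREATED, U_H = 0.05, -16.0

eu_g = eu_if_generalizable(RR, P_B_UNTREATED, U_B, P_H_TREATED, U_H)
# Assumption: if the benefit does not generalize, the patient keeps the untreated
# chance of benefit but still bears the treatment's harm.
eu_not_g = P_B_UNTREATED * U_B + P_H_TREATED * U_H
# Assumption: without intervention there is no treatment harm.
eu_no_intervention = P_B_UNTREATED * U_B

for p_g in (0.90, 0.75):
    eu_i = eu_of_intervening(p_g, eu_g, eu_not_g)
    decision = "intervene" if eu_i > eu_no_intervention else "do not intervene"
    print(f"p(G) = {p_g:.2f}: EU_I = {eu_i:.2f} vs {eu_no_intervention:.2f} -> {decision}")
```

With these illustrative inputs the break-even value of p(G) is about 0.8, so lowering p(G) from 0.90 to 0.75 is enough to reverse the recommendation. Different utilities or a different model of the non-generalizable case would move that threshold.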

About this article

Cite this article

Fuller, J. The myth and fallacy of simple extrapolation in medicine. Synthese (2019). https://doi.org/10.1007/s11229-019-02255-0

Keywords

  • Extrapolation
  • Clinical trials
  • Evidence-based medicine
  • External validity
  • Relative risk
  • Philosophy of medicine