The Virtues and Limitations of Randomized Experiments

Abstract

Despite the consensus promoted by the evidence-based medicine framework, many authors continue to express doubts about the superiority of randomized controlled trials. This paper evaluates four objections targeting the legitimacy, feasibility, and extrapolation problems linked to the experimental practice of random allocation. I argue that random allocation is a methodologically sound and feasible practice contributing to the internal validity of controlled experiments dealing with heterogeneous populations. I emphasize, however, that random allocation is solely designed to ensure the validity of causal inferences at the level of groups. By itself, random allocation cannot enhance test precision, does not contribute to external validity, and limits the applicability of causal claims to individuals.

Fig. 1

Notes

  1.

    For the purposes of the present discussion, we may treat the outcome “stress” as an operationalized variable and ignore issues related to its physical interpretation, such as the reality to which the variable ultimately refers and the extent to which the assessment method accurately measures this reality.

  2.

    “One of the continuing appeals of deterministic methods for case study researchers is the power of the methods. For example, Mill’s Method of Difference can determine causality with only two observations. This power can only be obtained by assuming that the observation with the antecedent of interest, A, B, C and the one without, B, C are exactly alike except for the manipulation of A, and by assuming deterministic causation and the absence of measurement error and interactions among antecedents” (Sekhon 2008, 286–87).

  3.

    Jaynes further generalizes the dilemma via an argument from infinite regression: “any specific experiment for which the existence of a physical probability is asserted is subject to physical analysis […] which will lead eventually to an understanding of its mechanism. But as soon as this understanding is reached, then this new experiment will also appear as an exceptional case […] where physical considerations obviate the usual postulates of physical probabilities” (2003, 324).

  4.

    The intraclass correlation coefficient ρ is the ratio of the between-group variance to the total variance (the sum of the within-group and between-group variances). Values of ρ range from 0, corresponding to no correlation of outcomes within a group (outcomes are independent both within and between groups), to 1, corresponding to the case in which all individual outcomes within a cluster are identical (the effective sample size reduces to the number of clusters rather than the number of individuals). As ρ increases, the sample size required to detect a significant difference for the variable under investigation also increases.
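    The quantitative link between ρ and required sample size is standardly expressed through the design effect, DEFF = 1 + (m − 1)ρ, where m is the cluster size (see Donner & Klar 2004). The sketch below is illustrative only and not taken from the paper: it uses the one-way ANOVA (method-of-moments) estimator of ρ, and the function names are my own.

```python
def icc_anova(clusters):
    """One-way ANOVA (method-of-moments) estimate of the intraclass
    correlation rho from a list of clusters (lists of outcomes).
    Assumes equal cluster sizes for simplicity."""
    k = len(clusters)            # number of clusters
    m = len(clusters[0])         # outcomes per cluster
    n = k * m                    # total number of individuals
    grand = sum(sum(c) for c in clusters) / n
    # between-cluster and within-cluster sums of squares
    ssb = sum(m * (sum(c) / m - grand) ** 2 for c in clusters)
    ssw = sum((x - sum(c) / m) ** 2 for c in clusters for x in c)
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    # rho = (MSB - MSW) / (MSB + (m - 1) * MSW)
    return (msb - msw) / (msb + (m - 1) * msw)


def design_effect(rho, m):
    """Sample-size inflation factor for a cluster design:
    DEFF = 1 + (m - 1) * rho."""
    return 1 + (m - 1) * rho
```

    The two endpoints of the note are recovered directly: with ρ = 0 the design effect is 1 (no inflation), while with ρ = 1 it equals the cluster size m, so the effective sample size collapses to the number of clusters.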

References

  1. Altman, D. G. (1985). Comparability of randomized groups. The Statistician, 34(1), 125–136.

  2. Altman, D. G., & Bland, J. M. (1999). Treatment allocation in controlled trials: Why randomise. British Medical Journal, 318, 1209.

  3. Andersen, H. (2012). Mechanisms: What are they evidence for in evidence-based medicine? Journal of Evaluation in Clinical Practice, 18(5), 992–999.

  4. Bowers, D. (2014). Medical statistics from scratch: An introduction for health professionals. Hoboken, NJ: Wiley.

  5. Button, K. S., Ioannidis, J. P. A., et al. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365–376.

  6. Cartwright, N. (2010). What are randomised controlled trials good for? Philosophical Studies, 147, 59–70.

  7. Chalmers, T. C., Celano, P., et al. (1983). Bias in treatment assignment in controlled clinical trials. New England Journal of Medicine, 309(22), 1359–1361.

  8. Deaton, A., & Cartwright, N. (2018). Understanding and misunderstanding randomized controlled trials. Social Science & Medicine, 210, 2–21.

  9. Diaconis, P., Holmes, S., et al. (2007). Dynamical bias in the coin toss. SIAM Review, 49(2), 211–235.

  10. Donner, A., & Klar, N. (2004). Pitfalls of and controversies in cluster randomization trials. American Journal of Public Health, 94(3), 416–422.

  11. Feinstein, A. R., & Horwitz, R. I. (1997). Problems in the “evidence” of “evidence-based medicine.” American Journal of Medicine, 103, 529–535.

  12. Fisher, R. A. (1947). The design of experiments (4th ed.). Edinburgh: Oliver and Boyd.

  13. Fuller, J., & Flores, L. (2015). The risk GP model: The standard model of prediction in medicine. Studies in History and Philosophy of Biological and Biomedical Sciences, 54, 49–61.

  14. Godlee, F. (2007). Milestones on the long road to knowledge. BMJ, 334(suppl 1), s2–s3.

  15. Godwin, M., Ruhland, L., et al. (2003). Pragmatic controlled clinical trials in primary care: The struggle between external and internal validity. BMC Medical Research Methodology, 3(28). https://doi.org/10.1186/1471-2288-3-28

  16. Greenfield, S., Kravitz, R., et al. (2007). Heterogeneity of treatment effects: Implications for guidelines, payment, and quality assessment. American Journal of Medicine, 120, S3–S9.

  17. Guyatt, G., & Djulbegovic, B. (2019). Evidence-based medicine and the theory of knowledge. In G. Guyatt, D. Rennie, M. O. Meade, & D. J. Cook (Eds.), Users’ guides to the medical literature: A manual for evidence-based clinical practice. New York, NY: JAMA/McGraw-Hill Education.

  18. Guyatt, G., Rennie, D., et al. (2015). Users’ guides to the medical literature: Essentials of evidence-based clinical practice (3rd ed.). New York: McGraw-Hill.

  19. Hernán, M., & Vanderweele, T. (2011). Compound treatments and transportability of causal inference. Epidemiology, 22(3), 368–377.

  20. Higgins, J. P., Thomas, T. J., et al. (2019). Cochrane handbook for systematic reviews of interventions. John Wiley & Sons.

  21. Hill, A. B. (1952). The clinical trial. New England Journal of Medicine, 247, 113–119.

  22. Hill, A. B. (1955). Principles of medical statistics (6th ed.). New York: Oxford University Press.

  23. Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945–960.

  24. Howick, J. (2011). The philosophy of evidence-based medicine. Oxford: BMJ Books.

  25. Howick, J., Glasziou, P., et al. (2013). Problems with using mechanisms to solve the problem of extrapolation. Theoretical Medicine and Bioethics, 34(4), 275–291.

  26. Howson, C., & Urbach, P. (2006). Scientific reasoning: A Bayesian approach (3rd ed.). Chicago: Open Court.

  27. Jaynes, E. T. (2003). Probability theory: The logic of science. Cambridge: Cambridge University Press.

  28. Kabisch, M., Ruckes, C., et al. (2011). Randomized controlled trials: Part 17 of a series on evaluation of scientific publications. Deutsches Ärzteblatt International, 108(39), 663–668.

  29. Kish, L. (1987). Statistical design for research. Wiley.

  30. Koepke, D., & Flay, R. (1989). Levels of analysis. In M. T. Braverman (Ed.), New directions for program evaluation: Evaluating health promotion programs. San Francisco: Jossey-Bass.

  31. Landis, S. C., Amara, S. G., et al. (2012). A call for transparent reporting to optimize the predictive value of preclinical research. Nature, 480, 187–191.

  32. Lavori, P. W., Louis, T. A., et al. (1986). Designs for experiments: Parallel comparisons of treatment. In C. Bailar & F. Mosteller (Eds.), Medical uses of statistics (pp. 61–82). Waltham, MA: New England Journal of Medicine.

  33. Leighton, J. P. (2010). Internal validity. In N. J. Salkind (Ed.), Encyclopedia of research design. Thousand Oaks, CA: SAGE.

  34. Lindley, D. V. (1982). The role of randomization in inference. Philosophy of Science Association, 2, 431–446.

  35. Mant, M. (1999). Can randomized trials inform clinical decisions about individual patients? Lancet, 353, 743–746.

  36. Miettinen, O. (1974). Confounding and effect-modification. American Journal of Epidemiology, 100(5), 350–353.

  37. Musters, A., & Tas, S. W. (2020). Room for improvement in clinical trials for rare diseases. Nature Reviews Rheumatology, 16, 131–132.

  38. Papineau, D. (1994). The virtues of randomization. British Journal for the Philosophy of Science, 45, 437–450.

  39. Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge University Press.

  40. Pearl, J., & Bareinboim, E. (2014). External validity: From do-calculus to transportability across populations. Statistical Science, 29, 579–595.

  41. Rosenbaum, P. R. (1988). Sensitivity analysis for matching with multiple controls. Biometrika, 75, 577–581.

  42. Rosenbaum, P. R. (1995). Observational studies. Springer.

  43. Rothwell, P. M. (2005). Subgroup analysis in randomized controlled trials: Importance, indications, and interpretation. Lancet, 365, 76–86.

  44. Rubin, D. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.

  45. Russo, F., & Williamson, J. (2007). Interpreting causality in the health sciences. International Studies in the Philosophy of Science, 21(2), 157–170.

  46. Russo, F., & Williamson, J. (2011a). Epistemic causality and evidence-based medicine. History and Philosophy of the Life Sciences, 33(4), 563–581.

  47. Russo, F., & Williamson, J. (2011b). Generic versus single-case causality: The case of autopsy. European Journal for Philosophy of Science, 1(1), 47–69.

  48. Sekhon, J. S. (2008). The Neyman-Rubin model of causal inference and estimation via matching methods. In J. M. Box-Steffensmeier, H. E. Brady, & D. Collier (Eds.), The Oxford handbook of political methodology (pp. 271–299). Oxford University Press.

  49. Sergent, C., Baillet, S., et al. (2005). Timing of the brain events underlying access to consciousness during the attentional blink. Nature Neuroscience, 8(10), 1391–1400.

  50. Shadish, W. R., Cook, T. D., et al. (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.

  51. Sharabiani, M., Aylin, P., et al. (2012). Systematic review of comorbidity indices for administrative data. Medical Care, 50(12), 1109–1118.

  52. Steel, D. (2008). Across the boundaries: Extrapolation in biology and social science. Oxford University Press.

  53. Upshur, R. E. (2002). If not evidence, then what? Or does medicine really need a base? Journal of Evaluation in Clinical Practice, 8, 113–119.

  54. Urbach, P. (1985). Randomization and the design of experiments. Philosophy of Science, 52, 256–273.

  55. Urbach, P. (1993). The value of randomization and control in clinical trials. Statistics in Medicine, 12, 1421–1431.

  56. Winch, R. F., & Campbell, D. T. (1969). Proof? No. Evidence? Yes. The significance of tests of significance. The American Sociologist, 4(2), 140–143.

  57. Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford: Oxford University Press.

  58. Worrall, J. (2007a). Evidence in medicine and evidence-based medicine. Philosophy Compass, 2(6), 981–1022.

  59. Worrall, J. (2007b). Why there’s no cause to randomize. British Journal for the Philosophy of Science, 58, 451–488.

Author information

Corresponding author

Correspondence to Tudor M. Baetu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Baetu, T.M. The Virtues and Limitations of Randomized Experiments. Acta Anal (2021). https://doi.org/10.1007/s12136-021-00497-7
