Detecting Moderator Effects Using Subgroup Analyses

Wang, Rui; Ware, James H.

doi:10.1007/s11121-011-0221-x

Published: 12 May 2011

Volume 14, pages 111–120, (2013)

Prevention Science

Abstract

In the analysis of prevention and intervention studies, it is often important to investigate whether treatment effects vary among subgroups of patients defined by individual characteristics. These “subgroup analyses” can provide information about how best to use a new prevention or intervention program. However, subgroup analyses can be misleading if they test data-driven hypotheses, employ inappropriate statistical methods, or fail to account for multiple testing. These problems have led to a general suspicion of findings from subgroup analyses. This article discusses sound methods for conducting subgroup analyses to detect moderators. Multiple authors have argued that, to assess whether a treatment effect varies across subgroups defined by patient characteristics, analyses should be based on tests for interaction rather than treatment comparisons within the subgroups. We discuss the concept of heterogeneity and its dependence on the metric used to describe treatment effects. We discuss issues of multiple comparisons related to subgroup analyses and the importance of considering multiplicity in the interpretation of results. We also discuss the types of questions that would lead to subgroup analyses and how different scientific goals may affect the study at the design stage. Finally, we discuss subgroup analyses based on post-baseline factors and the complexity associated with this type of subgroup analysis.
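Three points from the abstract can be illustrated with a minimal Python sketch that uses only the standard library. It is not taken from the article: the function names (risk_difference, relative_risk, two_sided_p, interaction_z_test, familywise_error) and every number are hypothetical, chosen for illustration. The sketch assumes two independent subgroups, approximately normal effect estimates, and, for the multiplicity calculation, independent tests.

```python
import math


def risk_difference(p_treated, p_control):
    """Treatment effect on the additive (absolute) scale."""
    return p_treated - p_control


def relative_risk(p_treated, p_control):
    """Treatment effect on the multiplicative (ratio) scale."""
    return p_treated / p_control


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def interaction_z_test(effect_a, se_a, effect_b, se_b):
    """z-test of whether two independent subgroup effects differ."""
    diff = effect_a - effect_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    return z, two_sided_p(z)


def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


# 1. Heterogeneity depends on the metric.
#    Hypothetical event risks (treated, control) in two subgroups.
subgroups = {"high_risk": (0.20, 0.40), "low_risk": (0.05, 0.10)}
for name, (p_t, p_c) in subgroups.items():
    print(name,
          "relative risk:", round(relative_risk(p_t, p_c), 2),
          "risk difference:", round(risk_difference(p_t, p_c), 2))

# 2. Interaction test versus separate within-subgroup tests.
#    Hypothetical effect estimates and standard errors.
effect_a, se_a = -0.50, 0.20
effect_b, se_b = -0.20, 0.20
print("p within subgroup A:", round(two_sided_p(effect_a / se_a), 3))
print("p within subgroup B:", round(two_sided_p(effect_b / se_b), 3))
z, p = interaction_z_test(effect_a, se_a, effect_b, se_b)
print("interaction z:", round(z, 2), "p:", round(p, 3))

# 3. Multiplicity: ten independent subgroup tests at the 0.05 level.
print("familywise error:", round(familywise_error(10), 2))
```

With these made-up inputs, the relative risk is identical in the two subgroups while the risk difference is not, so whether the effect looks heterogeneous depends on the scale chosen. The effect is nominally significant in subgroup A and not in subgroup B, yet the interaction test gives no clear evidence that the two effects differ. Ten independent tests at the 0.05 level carry roughly a 40% chance of at least one false positive.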

Acknowledgment

We dedicate this paper to our friend and colleague, Dr. Stephen W. Lagakos, who inspired the work and provided valuable insights and discussions on many aspects of subgroup analyses. We are grateful to Drs. Robert J. McMahon, David P. MacKinnon, Tyler VanderWeele, and three reviewers for their comments, which have improved the paper. This work was supported in part by grant AI24643 from the National Institutes of Health.

Author information

Authors and Affiliations

Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, SPH2, 4th Floor, Boston, MA, 02115, USA
Rui Wang & James H. Ware

Corresponding author

Correspondence to Rui Wang.

About this article

Cite this article

Wang, R., Ware, J.H. Detecting Moderator Effects Using Subgroup Analyses. Prev Sci 14, 111–120 (2013). https://doi.org/10.1007/s11121-011-0221-x

Issue Date: April 2013
DOI: https://doi.org/10.1007/s11121-011-0221-x
