Abstract
In this chapter we provide an overview of the principles and practice of subgroup analysis in late-stage clinical trials. For convenience, we classify subgroup analyses into two broad categories: data-driven and confirmatory. The two settings differ primarily in the scope and extent of pre-specification of patient subgroups. First, we review key considerations in confirmatory subgroup analysis based on one or more pre-specified patient populations. This includes a survey of multiplicity adjustment methods recommended for multi-population Phase III clinical trials and decision-making considerations that ensure clinically meaningful inferences across the pre-defined populations. Second, we consider key principles for data-driven subgroup analysis and contrast them with those of a guideline-driven approach. We review methods for principled data-driven subgroup analysis that have emerged in the last 10 years from the cross-pollination of machine learning, causal inference and multiple testing. We provide examples of recommended approaches to data-driven and confirmatory subgroup analysis, illustrated with data from Phase III clinical trials. We also illustrate common errors, pitfalls and misuses of subgroup analysis approaches in clinical trials, which often result from overly simplistic or naive methods. An overview of available statistical software and extensive bibliographical references are provided.
Acknowledgement
We are grateful to Lei Xu, Anthony Zagar, Lanju Zhang and the book’s editors for their insightful comments.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dmitrienko, A., Lipkovich, I., Dane, A., Muysers, C. (2020). Data-Driven and Confirmatory Subgroup Analysis in Clinical Trials. In: Ting, N., Cappelleri, J., Ho, S., Chen, D.-G. (eds) Design and Analysis of Subgroups with Biopharmaceutical Applications. Emerging Topics in Statistics and Biostatistics. Springer, Cham. https://doi.org/10.1007/978-3-030-40105-4_3
DOI: https://doi.org/10.1007/978-3-030-40105-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-40104-7
Online ISBN: 978-3-030-40105-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)