Abstract
GUIDE is a multi-purpose algorithm for classification and regression tree construction with special capabilities for identifying subgroups with differential treatment effects. It is unique among subgroup methods in having all these features: unbiased split variable selection, approximately unbiased estimation of subgroup treatment effects, treatments with two or more levels, allowance for linear effects of prognostic variables within subgroups, and automatic handling of missing predictor variable values without imputation in piecewise-constant models. Predictor variables may be continuous, ordinal, nominal, or cyclical (such as angular measurements, hour of day, day of week, or month of year). Response variables may be univariate, multivariate, longitudinal, or right-censored. This article gives a current account of the main features of the method for subgroup identification and reviews a bootstrap method for conducting post-selection inference on the subgroup treatment effects. A data set pooled from studies of amyotrophic lateral sclerosis is used for illustration.
Acknowledgements
The authors thank Tao Shen, Yu-Shan Shih and Shijie Tang for their helpful comments. Data used in the preparation of this article were obtained from the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) Database. As such, the following organizations and individuals within the PRO-ACT Consortium contributed to the design and implementation of the PRO-ACT Database and/or provided data, but did not participate in the analysis of the data or the writing of this report: Neurological Clinical Research Institute, MGH; Northeast ALS Consortium; Novartis; Prize4Life Israel; Regeneron Pharmaceuticals, Inc.; Sanofi; and Teva Pharmaceutical Industries, Ltd.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Loh, WY., Zhou, P. (2020). The GUIDE Approach to Subgroup Identification. In: Ting, N., Cappelleri, J., Ho, S., Chen, DG. (eds) Design and Analysis of Subgroups with Biopharmaceutical Applications. Emerging Topics in Statistics and Biostatistics. Springer, Cham. https://doi.org/10.1007/978-3-030-40105-4_6
DOI: https://doi.org/10.1007/978-3-030-40105-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-40104-7
Online ISBN: 978-3-030-40105-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)