The GUIDE Approach to Subgroup Identification

Design and Analysis of Subgroups with Biopharmaceutical Applications

Part of the book series: Emerging Topics in Statistics and Biostatistics (ETSB)

Abstract

GUIDE is a multi-purpose algorithm for classification and regression tree construction with special capabilities for identifying subgroups with differential treatment effects. It is unique among subgroup methods in having all these features: unbiased split variable selection, approximately unbiased estimation of subgroup treatment effects, treatments with two or more levels, allowance for linear effects of prognostic variables within subgroups, and automatic handling of missing predictor variable values without imputation in piecewise-constant models. Predictor variables may be continuous, ordinal, nominal, or cyclical (such as angular measurements, hour of day, day of week, or month of year). Response variables may be univariate, multivariate, longitudinal, or right-censored. This article gives a current account of the main features of the method for subgroup identification and reviews a bootstrap method for conducting post-selection inference on the subgroup treatment effects. A data set pooled from studies of amyotrophic lateral sclerosis is used for illustration.

Acknowledgements

The authors thank Tao Shen, Yu-Shan Shih and Shijie Tang for their helpful comments. Data used in the preparation of this article were obtained from the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) Database. As such, the following organizations and individuals within the PRO-ACT Consortium contributed to the design and implementation of the PRO-ACT Database and/or provided data, but did not participate in the analysis of the data or the writing of this report: Neurological Clinical Research Institute, MGH; Northeast ALS Consortium; Novartis; Prize4Life Israel; Regeneron Pharmaceuticals, Inc.; Sanofi; and Teva Pharmaceutical Industries, Ltd.

Author information

Corresponding author

Correspondence to Wei-Yin Loh.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Loh, WY., Zhou, P. (2020). The GUIDE Approach to Subgroup Identification. In: Ting, N., Cappelleri, J., Ho, S., Chen, D.-G. (eds) Design and Analysis of Subgroups with Biopharmaceutical Applications. Emerging Topics in Statistics and Biostatistics. Springer, Cham. https://doi.org/10.1007/978-3-030-40105-4_6
