A comparison of five recursive partitioning methods to find person subgroups involved in meaningful treatment–subgroup interactions

  • Regular Article
Advances in Data Analysis and Classification

Abstract

When multiple treatment alternatives are available for a medical problem, detecting treatment–subgroup interactions (i.e., relative treatment effectiveness that varies over subgroups of persons) is of key importance for personalized medicine and for the development of optimal treatment assignment strategies. Randomized clinical trials (RCTs) are often conducted without clear a priori hypotheses about the subgroups involved in treatment–subgroup interactions, and with a large number of pre-treatment characteristics in the data. In such situations, relevant subgroups (defined in terms of pre-treatment characteristics) have to be induced during the actual data analysis. This comes down to a problem of cluster analysis, the goal of which is to find clusters of persons that are involved in meaningful treatment–person cluster interactions. For such a cluster analysis, five recently proposed methods can be used, all of a recursive partitioning type. However, these five methods have been developed almost independently, and the relations between them are not yet understood. The present paper closes this gap. It starts by outlining the basic principles behind each method and illustrating each with an application to an RCT data set on two treatment strategies for substance abuse problems. Next, it compares the methods, focusing on major similarities and differences. The discussion concludes with practical advice for end users on selecting a suitable method, and with an important challenge for future research in this area.
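
To make the notion of a treatment–subgroup interaction concrete, the following is a minimal Python sketch; it is not an implementation of any of the five methods compared in the paper. It searches a single pre-treatment covariate for the threshold at which the estimated treatment effect differs most between the two resulting subgroups. The toy data, the field names (`age`, `t`, `y`), and the helper names are hypothetical. The actual methods use their own splitting criteria and stopping rules, which are not reproduced here.

```python
from statistics import mean


def treatment_effect(rows):
    """Mean outcome under treatment 1 minus mean outcome under treatment 0."""
    y1 = [r["y"] for r in rows if r["t"] == 1]
    y0 = [r["y"] for r in rows if r["t"] == 0]
    if not y1 or not y0:
        return None  # effect is undefined when one treatment arm is empty
    return mean(y1) - mean(y0)


def best_interaction_split(rows, covariate):
    """Return (gap, cut, effect_left, effect_right) for the threshold that
    maximizes the absolute difference in treatment effect between children."""
    values = sorted({r[covariate] for r in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2  # candidate threshold halfway between observed values
        left = [r for r in rows if r[covariate] <= cut]
        right = [r for r in rows if r[covariate] > cut]
        e_left, e_right = treatment_effect(left), treatment_effect(right)
        if e_left is None or e_right is None:
            continue
        gap = abs(e_left - e_right)
        if best is None or gap > best[0]:
            best = (gap, cut, e_left, e_right)
    return best


# Hypothetical toy trial: treatment 1 helps persons aged 40 or younger,
# treatment 0 helps older persons.
rows = [
    {"age": age, "t": t,
     "y": (6 if t == 1 else 4) if age <= 40 else (3 if t == 1 else 5)}
    for age in (20, 30, 40, 50, 60, 70)
    for t in (0, 1)
]

gap, cut, effect_young, effect_old = best_interaction_split(rows, "age")
print(cut, effect_young, effect_old)
```

On this toy data the selected cut point is 45, with treatment 1 better below the cut and treatment 0 better above it, which is a qualitative (disordinal) interaction. A recursive partitioning method would repeat such a search within each resulting subgroup.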

Notes

  1. A recursive algorithm is defined here as an algorithm that repeats the same procedure in a stepwise manner.

  2. Clinical Trials Network databases and information are available at www.ctndatashare.org.

  3. Note that Conversano and Dusseldorp (2010) introduce the use of STIMA in the case of a binary outcome variable \(Y\).

  4. Because the same splitting process is repeated in a stepwise manner, the STIMA algorithm is recursive according to the definition given in footnote 1. Under a more restrictive definition of recursive partitioning, in which what happens in one node may not depend, from a data-analytic perspective, on information in other nodes, STIMA cannot be called recursive: the splitting procedure in a node of the regression trunk depends on the current reference model and hence also on information from other nodes.

  5. Given that the total sum of squares is fixed, the \(\eta ^2\) values can be straightforwardly compared across the five methods.

  6. For the analyses reported in this paper, we used the party (Zeileis et al. 2008) and stima (Dusseldorp and Conversano 2013) R code for Model-based recursive partitioning and STIMA, respectively. For the other methods, we used software kindly made available by the authors of the methods: R code for Interaction Trees and Virtual Twins, and an Excel add-in for SIDES.
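
Note 5 relies on the fact that the total sum of squares depends only on the outcome values, not on how persons are partitioned. As a hedged illustration, the sketch below computes \(\eta ^2\) under the usual ANOVA definition (between-group sum of squares divided by total sum of squares). That definition is assumed here because the paper's own formula is not part of this excerpt, and the toy outcomes and group labels are hypothetical.

```python
from statistics import mean


def eta_squared(y, groups):
    """Proportion of the total sum of squares explained by a grouping.

    Assumes the standard definition eta^2 = SS_between / SS_total.
    """
    grand_mean = mean(y)
    # SS_total uses only the outcomes, so it is identical for every grouping.
    ss_total = sum((v - grand_mean) ** 2 for v in y)
    cells = {}
    for value, label in zip(y, groups):
        cells.setdefault(label, []).append(value)
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in cells.values()
    )
    return ss_between / ss_total


# Hypothetical outcomes for 12 persons in two subgroups and two treatments.
y = [6, 6, 6, 4, 4, 4, 3, 3, 3, 5, 5, 5]
subgroup = ["young"] * 6 + ["old"] * 6
treatment = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

eta_subgroup = eta_squared(y, subgroup)                      # subgroups only
eta_treatment = eta_squared(y, treatment)                    # treatments only
eta_cells = eta_squared(y, list(zip(subgroup, treatment)))   # subgroup x treatment cells
print(eta_subgroup, eta_treatment, eta_cells)
```

In this toy example the treatment main effect explains nothing, the subgroups alone explain a fifth of the variation, and the subgroup-by-treatment cells explain all of it. Because the denominator is the same in every call, the three values are directly comparable, which is the point made in note 5.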

References

  • Bala MM, Akl EA, Sun X, Bassler D, Mertz D, Mejza F, Vandvik PO, Malaga G, Johnston BC, Dahm P, Alonso-Coello P, Diaz-Granados N, Srinathan SK, Hassouneh B, Briel M, Busse JW, You JJ, Walter SD, Altman DG, Guyatt GH (2013) Randomized trials published in higher vs. lower impact journals differ in design, conduct, and analysis. J Clin Epidemiol 66(3):286–295. doi:10.1016/j.jclinepi.2012.10.005. http://www.sciencedirect.com/science/article/pii/S0895435612003174

  • Boonacker C, Hoes A, van Liere-Visser K, Schilder A, Rovers M (2011) A comparison of subgroup analyses in grant applications and publications. Am J Epidemiol 174(2):219–225

  • Breiman L (2001) Random forests. Mach Learn 45:5–32

  • Carroll KM, Ball SA, Nich C, Martino S, Frankforter TL, Farentinos C, Kunkel LE, Mikulich-Gilbertson SK, Morgenstern J, Obert JL, Polcin D, Snead N, Woody GE (2006) Motivational interviewing to improve treatment engagement and outcome in individuals seeking treatment for substance abuse: a multisite effectiveness study. Drug Alcohol Depend 81(3):301–312. doi:10.1016/j.drugalcdep.2005.08.002

  • Conversano C, Dusseldorp E (2010) Simultaneous threshold interaction detection in binary classification. In: Palumbo F, Lauro CN, Greenacre MJ (eds) Data analysis and classification, studies in classification, data analysis, and knowledge organization. Springer, Berlin, pp 225–232. doi:10.1007/978-3-642-03739-9_26

  • Dehejia RH (2005) Program evaluation as a decision problem. J Econ 125(1–2):141–173

  • Dixon D, Simon R (1991) Bayesian subset analysis. Biometrics 47(3):871–81

  • Dusseldorp E, Conversano C (2013) stima: simultaneous threshold interaction modeling algorithm. R package version 1.1. http://CRAN.R-project.org/package=stima

  • Dusseldorp E, Meulman JJ (2001) Prediction in medicine by integrating regression trees into regression analysis with optimal scaling. Methods Inf Med 40:403–409

  • Dusseldorp E, Meulman JJ (2004) The regression trunk approach to discover treatment covariate interaction. Psychometrika 69(3):355–374

  • Dusseldorp E, Conversano C, Van Os BJ (2010) Combining an additive and tree-based regression model simultaneously: stima. J Comput Graph Stat 19(3):514–530. doi:10.1198/jcgs.2010.06089

  • Foster J, Taylor J, Ruberg S (2011) Subgroup identification from randomized clinical trial data. Stat Med 30(24):2867–2880

  • Hayward RA, Kent DM, Vijan S, Hofer TP (2006) Multivariable risk prediction can greatly enhance the statistical power of clinical trial subgroup analysis. BMC Med Res Methodol 6(18):18. doi:10.1186/1471-2288-6-18

  • Kraemer H, Wilson G, Fairburn CG, Agras W (2002) Mediators and moderators of treatment effects in randomized clinical trials. Arch Gen Psychiatry 59(10):877–883. doi:10.1001/archpsyc.59.10.877

  • LeBlanc M, Crowley J (1993) Survival trees by goodness of split. J Am Stat Assoc 88:457–467

  • Lipkovich I, Dmitrienko A, Denne J, Enas G (2011) Subgroup identification based on differential effect search-a recursive partitioning method for establishing response to treatment in patient subpopulations. Stat Med 30(21):2601–2621

  • McGahan PL, Griffith JA, Parente R, McLellan AT (1986) Addiction severity index composite scores manual. Treatment Research Institute, Philadelphia

  • McLellan AT, Al Alterman (1992) A quantitative measure of substance abuse treatments: the treatment services review. J Nerv Mental Dis 180:101–110

  • R Development Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/

  • Rubin DB (1974) Estimating causal effects of treatment in randomized and nonrandomized studies. J Educ Psychol 66:688–701

  • Shaffer J (1991) Probability of directional errors with disordinal (qualitative) interaction. Psychometrika 56(1):29–38

  • Su X, Zhou T, Yan X, Fan J, Yang S (2008) Interaction trees with censored survival data. Int J Biostat 4(1):2

  • Su X, Tsai CL, Wang H, Nickerson DM, Li B (2009) Subgroup analysis via recursive partitioning. J Mach Learn Res 10:141–158

  • Tunis SR, Benner J, McClellan M (2010) Comparative effectiveness research: policy context, methods development and research infrastructure. Stat Med 29(19):1963–1976

  • Zeileis A, Hothorn T, Hornik K (2008) Model-based recursive partitioning. J Comput Graph Stat 17(2):492–514. doi:10.1198/106186008X319331

Author information

Corresponding author

Correspondence to L. L. Doove.

Additional information

The research reported in this paper was supported by the Fund for Scientific Research Flanders (G.0546.09N). We would like to thank Xiaogang Su, Ilya Lipkovich and Jared Foster for providing us with the software for Interaction Trees, SIDES and Virtual Twins. In addition, we would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. For the illustrative applications, we made use of data from a clinical trial conducted as part of the National Drug Abuse Treatment Clinical Trials Network (CTN), sponsored by the National Institute on Drug Abuse (NIDA). Specifically, data from CTN-0005, entitled ‘MI (Motivational Interviewing) to Improve Treatment Engagement and Outcome in Subjects Seeking Treatment for Substance Abuse’, were included. CTN databases and accompanying information are available at www.ctndatashare.org.

Appendix A

Table 6 provides the estimates for Model 2 within each regression-based split of the interaction tree shown in Fig. 2.

Table 6 Summary of the regression models at each split of the interaction tree for the substance abuse data

About this article

Cite this article

Doove, L.L., Dusseldorp, E., Van Deun, K. et al. A comparison of five recursive partitioning methods to find person subgroups involved in meaningful treatment–subgroup interactions. Adv Data Anal Classif 8, 403–425 (2014). https://doi.org/10.1007/s11634-013-0159-x

  • DOI: https://doi.org/10.1007/s11634-013-0159-x
