Abstract
When multiple treatment alternatives are available for a medical problem, the detection of treatment–subgroup interactions (i.e., relative treatment effectiveness varying over subgroups of persons) is of key importance for personalized medicine and the development of optimal treatment assignment strategies. Randomized clinical trials (RCTs) often lack clear a priori hypotheses about the subgroups involved in treatment–subgroup interactions, while the data contain a large number of pre-treatment characteristics. In such situations, relevant subgroups (defined in terms of pre-treatment characteristics) have to be induced during the actual data analysis. This amounts to a cluster analysis problem, the goal of which is to find clusters of persons that are involved in meaningful treatment–person cluster interactions. Five recently proposed methods can be used for such a cluster analysis, all of them of a recursive partitioning type. However, these five methods have been developed almost independently, and the relations between them are not yet understood. The present paper closes this gap. It starts by outlining the basic principles behind each method and by illustrating each method with an application to an RCT data set on two treatment strategies for substance abuse problems. Next, it compares the methods, focusing on major similarities and differences. The discussion concludes with practical advice for end users on selecting a suitable method, and with an important challenge for future research in this area.
Notes
A recursive algorithm is defined here as an algorithm involving a stepwise manner of repeating the same procedure.
Clinical Trials Network databases and information are available at www.ctndatashare.org.
Note that Conversano and Dusseldorp (2010) introduce the use of STIMA in the case of a binary outcome variable \(Y\).
As the same splitting process is repeated in a step-wise manner, the STIMA algorithm is recursive according to the definition given in footnote 1. Under a more restrictive definition of recursive partitioning, in which what happens in one node from a data-analytic perspective may not depend on information in other nodes, STIMA cannot be called recursive: the splitting procedure in a node of the regression trunk depends on the current reference model and, hence, also on information from other nodes.
Given that the total sum of squares is fixed, the \(\eta ^2\) values can be straightforwardly compared across the five methods.
For the analyses reported in this paper, we used the R packages party (Zeileis et al. 2008) and stima (Dusseldorp and Conversano 2013) for Model-based recursive partitioning and STIMA, respectively. For the other methods, we used software kindly made available by the authors of the methods: R code for Interaction Trees and Virtual Twins, and an Excel add-in for SIDES.
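The note on \(\eta ^2\) above can be illustrated with a minimal sketch. It assumes that \(\eta ^2\) is computed as the proportion of the total sum of squares accounted for by a partition of the persons into groups; the paper's exact computation is not given in this section. The function name `eta_squared` and the toy data are hypothetical. Because the total sum of squares depends only on the outcome values, two partitions of the same data can be compared directly on this quantity.

```python
from collections import defaultdict


def eta_squared(y, groups):
    """Proportion of the total sum of squares explained by a grouping.

    Assumption: eta^2 = 1 - SS_within / SS_total, where the groups stand in
    for the cells (e.g. subgroup by treatment) induced by a method.
    """
    if len(y) != len(groups):
        raise ValueError("y and groups must have the same length")
    grand_mean = sum(y) / len(y)
    ss_total = sum((v - grand_mean) ** 2 for v in y)
    if ss_total == 0:
        raise ValueError("outcome has no variance")

    # Collect outcome values per group label.
    by_group = defaultdict(list)
    for value, label in zip(y, groups):
        by_group[label].append(value)

    # Sum of squared deviations from each group's own mean.
    ss_within = 0.0
    for values in by_group.values():
        group_mean = sum(values) / len(values)
        ss_within += sum((v - group_mean) ** 2 for v in values)

    return 1.0 - ss_within / ss_total


# Hypothetical outcome values and two candidate partitions of the same persons.
y = [2.0, 4.0, 6.0, 8.0]
partition_a = ["a", "a", "b", "b"]
partition_b = ["a", "b", "a", "b"]

eta_a = eta_squared(y, partition_a)
eta_b = eta_squared(y, partition_b)
print(eta_a > eta_b)  # partition_a explains more of the fixed total variation
```

Since both calls share the same denominator, ranking the partitions by \(\eta ^2\) is equivalent to ranking them by their within-group sum of squares.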
References
Bala MM, Akl EA, Sun X, Bassler D, Mertz D, Mejza F, Vandvik PO, Malaga G, Johnston BC, Dahm P, Alonso-Coello P, Diaz-Granados N, Srinathan SK, Hassouneh B, Briel M, Busse JW, You JJ, Walter SD, Altman DG, Guyatt GH (2013) Randomized trials published in higher vs. lower impact journals differ in design, conduct, and analysis. J Clin Epidemiol 66(3):286–295. doi:10.1016/j.jclinepi.2012.10.005. http://www.sciencedirect.com/science/article/pii/S0895435612003174
Boonacker C, Hoes A, van Liere-Visser K, Schilder A, Rovers M (2011) A comparison of subgroup analyses in grant applications and publications. Am J Epidemiol 174(2):219–225
Breiman L (2001) Random forests. Mach Learn 45:5–32
Carroll KM, Ball SA, Nich C, Martino S, Frankforter TL, Farentinos C, Kunkel LE, Mikulich-Gilbertson SK, Morgenstern J, Obert JL, Polcin D, Snead N, Woody GE (2006) Motivational interviewing to improve treatment engagement and outcome in individuals seeking treatment for substance abuse: a multisite effectiveness study. Drug Alcohol Depend 81(3):301–312. doi:10.1016/j.drugalcdep.2005.08.002
Conversano C, Dusseldorp E (2010) Simultaneous threshold interaction detection in binary classification. In: Palumbo F, Lauro CN, Greenacre MJ (eds) Data analysis and classification, studies in classification, data analysis, and knowledge organization. Springer, Berlin, pp 225–232. doi:10.1007/978-3-642-03739-9_26
Dehejia RH (2005) Program evaluation as a decision problem. J Econ 125(1–2):141–173
Dixon D, Simon R (1991) Bayesian subset analysis. Biometrics 47(3):871–881
Dusseldorp E, Conversano C (2013) stima: simultaneous threshold interaction modeling algorithm. R package version 1.1. http://CRAN.R-project.org/package=stima
Dusseldorp E, Meulman JJ (2001) Prediction in medicine by integrating regression trees into regression analysis with optimal scaling. Methods Inf Med 40:403–409
Dusseldorp E, Meulman JJ (2004) The regression trunk approach to discover treatment covariate interaction. Psychometrika 69(3):355–374
Dusseldorp E, Conversano C, Van Os BJ (2010) Combining an additive and tree-based regression model simultaneously: stima. J Comput Graph Stat 19(3):514–530. doi:10.1198/jcgs.2010.06089
Foster J, Taylor J, Ruberg S (2011) Subgroup identification from randomized clinical trial data. Stat Med 30(24):2867–2880
Hayward RA, Kent DM, Vijan S, Hofer TP (2006) Multivariable risk prediction can greatly enhance the statistical power of clinical trial subgroup analysis. BMC Med Res Methodol 6(18):18. doi:10.1186/1471-2288-6-18
Kraemer H, Wilson G, Fairburn CG, Agras W (2002) Mediators and moderators of treatment effects in randomized clinical trials. Arch Gen Psychiatry 59(10):877–883. doi:10.1001/archpsyc.59.10.877
LeBlanc M, Crowley J (1993) Survival trees by goodness of split. J Am Stat Assoc 88:457–467
Lipkovich I, Dmitrienko A, Denne J, Enas G (2011) Subgroup identification based on differential effect search: a recursive partitioning method for establishing response to treatment in patient subpopulations. Stat Med 30(21):2601–2621
McGahan PL, Griffith JA, Parente R, McLellan AT (1986) Addiction severity index composite scores manual. Treatment Research Institute, Philadelphia
McLellan AT, Al Alterman (1992) A quantitative measure of substance abuse treatments: the treatment services review. J Nerv Mental Dis 180:101–110
R Development Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Rubin DB (1974) Estimating causal effects of treatment in randomized and nonrandomized studies. J Educ Psychol 66:688–701
Shaffer J (1991) Probability of directional errors with disordinal (qualitative) interaction. Psychometrika 56(1):29–38
Su X, Zhou T, Yan X, Fan J, Yang S (2008) Interaction trees with censored survival data. Int J Biostat 4(1):2
Su X, Tsai CL, Wang H, Nickerson DM, Li B (2009) Subgroup analysis via recursive partitioning. J Mach Learn Res 10:141–158
Tunis SR, Benner J, McClellan M (2010) Comparative effectiveness research: policy context, methods development and research infrastructure. Stat Med 29(19):1963–1976
Zeileis A, Hothorn T, Hornik K (2008) Model-based recursive partitioning. J Comput Graph Stat 17(2):492–514. doi:10.1198/106186008X319331
Additional information
The research reported in this paper was supported by the Fund for Scientific Research Flanders (G.0546.09N). We would like to thank Xiaogang Su, Ilya Lipkovich and Jared Foster for providing us with the software for Interaction Trees, SIDES and Virtual Twins. In addition, we would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. For the illustrative applications, we made use of data from a clinical trial conducted as part of the National Drug Abuse Treatment Clinical Trials Network (CTN), sponsored by the National Institute on Drug Abuse (NIDA). Specifically, data from CTN-0005, entitled ‘MI (Motivational Interviewing) to Improve Treatment Engagement and Outcome in Subjects Seeking Treatment for Substance Abuse’, were included. CTN databases and accompanying information are available at www.ctndatashare.org.
Cite this article
Doove, L.L., Dusseldorp, E., Van Deun, K. et al. A comparison of five recursive partitioning methods to find person subgroups involved in meaningful treatment–subgroup interactions. Adv Data Anal Classif 8, 403–425 (2014). https://doi.org/10.1007/s11634-013-0159-x
DOI: https://doi.org/10.1007/s11634-013-0159-x