A comparison of five recursive partitioning methods to find person subgroups involved in meaningful treatment–subgroup interactions

Doove, L. L.; Dusseldorp, E.; Van Deun, K.; Van Mechelen, I.

doi:10.1007/s11634-013-0159-x

Regular Article
Published: 24 December 2013

Advances in Data Analysis and Classification, Volume 8, pages 403–425 (2014)

L. L. Doove¹, E. Dusseldorp¹,², K. Van Deun¹ & I. Van Mechelen¹

Abstract

When multiple treatment alternatives are available for a medical problem, detecting treatment–subgroup interactions (i.e., relative treatment effectiveness that varies over subgroups of persons) is of key importance for personalized medicine and the development of optimal treatment assignment strategies. Randomized Clinical Trials (RCTs) often lack clear a priori hypotheses on the subgroups involved in treatment–subgroup interactions and include a large number of pre-treatment characteristics. In such situations, relevant subgroups (defined in terms of pre-treatment characteristics) have to be induced during the actual data analysis. This comes down to a problem of cluster analysis, the goal being to find clusters of persons that are involved in meaningful treatment–person cluster interactions. For such a cluster analysis, five recently proposed methods can be used, all of a recursive partitioning type. However, these five methods were developed almost independently, and the relations between them are not yet understood. The present paper closes this gap. It starts by outlining the basic principles behind each method and illustrating each with an application to an RCT data set on two treatment strategies for substance abuse problems. Next, it compares the methods, focusing on major similarities and differences. The discussion concludes with practical advice for end users on selecting a suitable method, and with an important challenge for future research in this area.
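
To make the idea of inducing subgroups by recursive partitioning concrete, the following is a minimal Python sketch on simulated data. It is not an implementation of any of the five methods compared in the paper. The split criterion (the absolute difference in estimated treatment effect between the two child nodes), the stopping rules, the variable names (`x1`, `x2`, `t`, `y`), and the simulated data are all assumptions made for illustration only.

```python
import random
from statistics import mean

def treatment_effect(rows):
    """Mean outcome under treatment 1 minus mean outcome under treatment 0."""
    y1 = [r["y"] for r in rows if r["t"] == 1]
    y0 = [r["y"] for r in rows if r["t"] == 0]
    if not y1 or not y0:
        return None  # effect is undefined when one treatment arm is empty
    return mean(y1) - mean(y0)

def best_split(rows, covariates, min_size):
    """Find the binary split whose child nodes differ most in treatment effect."""
    best = None
    for x in covariates:
        for cut in sorted({r[x] for r in rows})[:-1]:
            left = [r for r in rows if r[x] <= cut]
            right = [r for r in rows if r[x] > cut]
            if len(left) < min_size or len(right) < min_size:
                continue
            e_left, e_right = treatment_effect(left), treatment_effect(right)
            if e_left is None or e_right is None:
                continue
            gap = abs(e_left - e_right)
            if best is None or gap > best[0]:
                best = (gap, x, cut, left, right)
    return best

def grow(rows, covariates, depth=0, max_depth=2, min_size=40, min_gap=1.0):
    """Repeat the same split search in each node until a stopping rule applies."""
    node = {"n": len(rows), "effect": treatment_effect(rows)}
    split = best_split(rows, covariates, min_size) if depth < max_depth else None
    if split is None or split[0] < min_gap:
        return node  # leaf: one induced subgroup
    _, x, cut, left, right = split
    node.update(
        var=x,
        cut=cut,
        left=grow(left, covariates, depth + 1, max_depth, min_size, min_gap),
        right=grow(right, covariates, depth + 1, max_depth, min_size, min_gap),
    )
    return node

# Simulated trial (assumption): treatment helps when x1 > 0.5 and harms otherwise;
# x2 is a pre-treatment characteristic unrelated to the treatment effect.
rng = random.Random(0)
data = []
for _ in range(400):
    x1, x2, t = rng.random(), rng.random(), rng.randint(0, 1)
    effect = 2.0 if x1 > 0.5 else -2.0
    data.append({"x1": x1, "x2": x2, "t": t, "y": effect * t + rng.gauss(0, 0.5)})

tree = grow(data, ["x1", "x2"])
print("first split:", tree["var"], "at", round(tree["cut"], 2))
print("effect left:", round(tree["left"]["effect"], 2),
      "effect right:", round(tree["right"]["effect"], 2))
```

In this simulated setting the first split is expected to fall on `x1` near 0.5, separating persons for whom the treatment effect is negative from those for whom it is positive. Real applications would need a principled split criterion, pruning or significance control, and validation, which is where the methods discussed in the paper differ.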

Notes

1. A recursive algorithm is defined here as an algorithm involving a stepwise manner of repeating the same procedure.
2. Clinical Trials Network databases and information are available at www.ctndatashare.org.
3. Note that Conversano and Dusseldorp (2010) introduce the use of STIMA in the case of a binary outcome variable \(Y\).
4. As the same splitting process is repeated in a stepwise manner, the STIMA algorithm is recursive according to the definition given in footnote 1. Under a more restrictive definition of recursive partitioning, in which what happens in one node from a data-analytic perspective may not depend on information in other nodes, STIMA cannot be called recursive: the splitting procedure in a node of the regression trunk depends on the current reference model and, hence, also on information from other nodes.
5. Given that the total sum of squares is fixed, the \(\eta^2\) values can be straightforwardly compared across the five methods.
6. For the analyses reported in this paper, we used the R packages party (Zeileis et al. 2008) and stima (Dusseldorp and Conversano 2013) for Model-based recursive partitioning and STIMA, respectively. For the other methods, we used software kindly made available by the authors of the methods: R code for Interaction Trees and Virtual Twins, and an Excel add-in for SIDES.
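
The note above on \(\eta^2\) values can be illustrated with a small Python sketch. It assumes that \(\eta^2\) is the explained sum of squares divided by the total sum of squares, with the explained part taken from the cell means of a partition. The paper's exact model for the explained part is not reproduced here, and the toy outcomes and cell labels are hypothetical.

```python
from statistics import mean

def eta_squared(y, labels):
    """Between-cell sum of squares divided by the total sum of squares."""
    grand = mean(y)
    ss_total = sum((v - grand) ** 2 for v in y)
    cells = {}
    for v, g in zip(y, labels):
        cells.setdefault(g, []).append(v)
    ss_between = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in cells.values())
    return ss_between / ss_total

# Same outcomes, two different partitions (hypothetical cell labels).
y = [1, 2, 3, 4, 5, 6, 7, 8]
partition_a = ["a", "a", "a", "a", "b", "b", "b", "b"]
partition_b = ["a", "b", "a", "b", "a", "b", "a", "b"]

eta_a = eta_squared(y, partition_a)  # 32 / 42, about 0.762
eta_b = eta_squared(y, partition_b)  # 2 / 42, about 0.048
print(round(eta_a, 3), round(eta_b, 3))
```

Because both partitions are evaluated on the same outcomes, the denominator (42 here) is identical, so a larger \(\eta^2\) directly reflects a larger explained sum of squares.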

References

Bala MM, Akl EA, Sun X, Bassler D, Mertz D, Mejza F, Vandvik PO, Malaga G, Johnston BC, Dahm P, Alonso-Coello P, Diaz-Granados N, Srinathan SK, Hassouneh B, Briel M, Busse JW, You JJ, Walter SD, Altman DG, Guyatt GH (2013) Randomized trials published in higher vs. lower impact journals differ in design, conduct, and analysis. J Clin Epidemiol 66(3):286–295. doi:10.1016/j.jclinepi.2012.10.005. http://www.sciencedirect.com/science/article/pii/S0895435612003174
Boonacker C, Hoes A, van Liere-Visser K, Schilder A, Rovers M (2011) A comparison of subgroup analyses in grant applications and publications. Am J Epidemiol 174(2):219–225
Breiman L (2001) Random forests. Mach Learn 45:5–32
Carroll KM, Ball SA, Nich C, Martino S, Frankforter TL, Farentinos C, Kunkel LE, Mikulich-Gilbertson SK, Morgenstern J, Obert JL, Polcin D, Snead N, Woody GE (2006) Motivational interviewing to improve treatment engagement and outcome in individuals seeking treatment for substance abuse: a multisite effectiveness study. Drug Alcohol Depend 81(3):301–312. doi:10.1016/j.drugalcdep.2005.08.002
Conversano C, Dusseldorp E (2010) Simultaneous threshold interaction detection in binary classification. In: Palumbo F, Lauro CN, Greenacre MJ (eds) Data analysis and classification, studies in classification, data analysis, and knowledge organization. Springer, Berlin, pp 225–232. doi:10.1007/978-3-642-03739-9_26
Dehejia RH (2005) Program evaluation as a decision problem. J Econ 125(1–2):141–173
Dixon D, Simon R (1991) Bayesian subset analysis. Biometrics 47(3):871–881
Dusseldorp E, Conversano C (2013) stima: simultaneous threshold interaction modeling algorithm. http://CRAN.R-project.org/package=stima, R package version 1.1
Dusseldorp E, Meulman JJ (2001) Prediction in medicine by integrating regression trees into regression analysis with optimal scaling. Methods Inf Med 40:403–409
Dusseldorp E, Meulman JJ (2004) The regression trunk approach to discover treatment covariate interaction. Psychometrika 69(3):355–374
Dusseldorp E, Conversano C, Van Os BJ (2010) Combining an additive and tree-based regression model simultaneously: stima. J Comput Graph Stat 19(3):514–530. doi:10.1198/jcgs.2010.06089
Foster J, Taylor J, Ruberg S (2011) Subgroup identification from randomized clinical trial data. Stat Med 30(24):2867–2880
Hayward RA, Kent DM, Vijan S, Hofer TP (2006) Multivariable risk prediction can greatly enhance the statistical power of clinical trial subgroup analysis. BMC Med Res Methodol 6(18):18. doi:10.1186/1471-2288-6-18
Kraemer H, Wilson G, Fairburn CG, Agras W (2002) Mediators and moderators of treatment effects in randomized clinical trials. Arch Gen Psychiatry 59(10):877–883. doi:10.1001/archpsyc.59.10.877
LeBlanc M, Crowley J (1993) Survival trees by goodness of split. J Am Stat Assoc 88:457–467
Lipkovich I, Dmitrienko A, Denne J, Enas G (2011) Subgroup identification based on differential effect search-a recursive partitioning method for establishing response to treatment in patient subpopulations. Stat Med 30(21):2601–2621
McGahan PL, Griffith JA, Parente R, McLellan AT (1986) Addiction severity index composite scores manual. Treatment Research Institute, Philadelphia
McLellan AT, Alterman AI (1992) A quantitative measure of substance abuse treatments: the treatment services review. J Nerv Mental Dis 180:101–110
R Development Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Rubin DB (1974) Estimating causal effects of treatment in randomized and nonrandomized studies. J Educ Psychol 66:688–701
Shaffer J (1991) Probability of directional errors with disordinal (qualitative) interaction. Psychometrika 56(1):29–38
Su X, Zhou T, Yan X, Fan J, Yang S (2008) Interaction trees with censored survival data. Int J Biostat 4(1):2
Su X, Tsai CL, Wang H, Nickerson DM, Li B (2009) Subgroup analysis via recursive partitioning. J Mach Learn Res 10:141–158
Tunis SR, Benner J, McClellan M (2010) Comparative effectiveness research: policy context, methods development and research infrastructure. Stat Med 29(19):1963–1976
Zeileis A, Hothorn T, Hornik K (2008) Model-based recursive partitioning. J Comput Graph Stat 17(2):492–514. doi:10.1198/106186008X319331

Author information

Authors and Affiliations

Department of Psychology, Katholieke Universiteit Leuven, Tiensestraat 102-bus 3713, Leuven, Belgium
L. L. Doove, E. Dusseldorp, K. Van Deun & I. Van Mechelen
Netherlands Organisation for Applied Scientific Research TNO, PO Box 2215, 2301 CE, Leiden, The Netherlands
E. Dusseldorp

Corresponding author

Correspondence to L. L. Doove.

Additional information

The research reported in this paper was supported by the Fund for Scientific Research Flanders (G.0546.09N). We would like to thank Xiaogang Su, Ilya Lipkovich and Jared Foster for providing us with the software for Interaction Trees, SIDES and Virtual Twins. In addition, we would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. For the illustrative applications, we made use of data from a clinical trial conducted as part of the National Drug Abuse Treatment Clinical Trials Network (CTN) sponsored by the National Institute on Drug Abuse (NIDA). Specifically, data from CTN-0005, entitled ‘MI (Motivational Interviewing) to Improve Treatment Engagement and Outcome in Subjects Seeking Treatment for Substance Abuse’, were included. CTN databases and accompanying information are available at www.ctndatashare.org.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (R 18 KB)

Supplementary material 2 (R 1 KB)

Supplementary material 3 (R 2 KB)

Supplementary material 4 (R 1 KB)

Supplementary material 5 (R 3 KB)

Appendix A

Table 6 provides the estimates for Model 2 within each regression-based split of the interaction tree that is shown in Fig. 2.

Table 6 Summary of the regression models at each split of the interaction tree for the substance abuse data

About this article

Cite this article

Doove, L.L., Dusseldorp, E., Van Deun, K. et al. A comparison of five recursive partitioning methods to find person subgroups involved in meaningful treatment–subgroup interactions. Adv Data Anal Classif 8, 403–425 (2014). https://doi.org/10.1007/s11634-013-0159-x

Received: 06 March 2013
Revised: 05 December 2013
Accepted: 09 December 2013
Published: 24 December 2013
Issue Date: December 2014
DOI: https://doi.org/10.1007/s11634-013-0159-x
