Classification and regression tree analysis in public health: Methodological review and comparison with logistic regression
- Cite this article as:
- Lemon, S.C., Roy, J., Clark, M.A. et al. Ann. Behav. Med. (2003) 26: 172. doi:10.1207/S15324796ABM2603_02
Background: Audience segmentation strategies are of increasing interest to public health professionals who wish to identify easily defined, mutually exclusive population subgroups whose members share similar characteristics that help determine participation in a health-related behavior as a basis for targeted interventions. Classification and regression tree (C&RT) analysis is a nonparametric decision tree methodology that has the ability to efficiently segment populations into meaningful subgroups. However, it is not commonly used in public health.

Purpose: This study provides a methodological overview of C&RT analysis for persons unfamiliar with the procedure.

Methods and Results: An example of a C&RT analysis is provided and interpretation of results is discussed. Results are validated with those obtained from a logistic regression model that was created to replicate the C&RT findings. Results obtained from the example C&RT analysis are also compared to those obtained from a common approach to logistic regression, the stepwise selection procedure. Issues to consider when deciding whether to use C&RT are discussed, and situations in which C&RT may and may not be beneficial are described.

Conclusions: C&RT is a promising research tool for the identification of at-risk populations in public health research and outreach.
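To make the idea of segmenting a population into mutually exclusive subgroups concrete, the following is a minimal sketch of recursive binary partitioning in plain Python. It is not the authors' analysis: the data are invented for illustration, the variable names (`insured`, `smoker`, `screened`) are hypothetical, and the use of Gini impurity as the splitting criterion is an assumption, since the abstract does not state which criterion was used. Pruning and cross-validation, which a full C&RT analysis would normally include, are omitted.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcome labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(rows, outcome, predictors):
    """Return (predictor, weighted_impurity) for the binary predictor whose
    yes/no split gives the lowest weighted Gini impurity, or None."""
    best = None
    for name in predictors:
        yes = [r[outcome] for r in rows if r[name] == 1]
        no = [r[outcome] for r in rows if r[name] == 0]
        if not yes or not no:
            continue  # this predictor does not divide the group
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
        if best is None or score < best[1]:
            best = (name, score)
    return best


def grow(rows, outcome, predictors, path=()):
    """Recursively split rows; return leaves as (path, n, outcome_rate).
    A group becomes a leaf when no split lowers its impurity."""
    labels = [r[outcome] for r in rows]
    split = best_split(rows, outcome, predictors)
    if split is None or split[1] >= gini(labels):
        return [(path, len(rows), sum(labels) / len(rows))]
    name = split[0]
    remaining = [p for p in predictors if p != name]
    leaves = []
    for value in (1, 0):
        subset = [r for r in rows if r[name] == value]
        leaves += grow(subset, outcome, remaining, path + ((name, value),))
    return leaves


# Hypothetical toy data: (insured, smoker, screened), invented for this sketch.
raw = [
    (1, 0, 1), (1, 0, 1), (1, 0, 1),
    (1, 1, 1), (1, 1, 1), (1, 1, 0),
    (0, 0, 0), (0, 0, 0), (0, 0, 1),
    (0, 1, 0), (0, 1, 0), (0, 1, 0),
]
rows = [{"insured": i, "smoker": s, "screened": y} for i, s, y in raw]

leaves = grow(rows, "screened", ["insured", "smoker"])
for path, n, rate in leaves:
    label = " & ".join(f"{name}={value}" for name, value in path)
    print(f"{label}: n={n}, screened rate={rate:.2f}")
```

Each leaf is one subgroup defined by a chain of yes/no conditions, and every person falls into exactly one leaf. That property is what makes the resulting segments easy to describe and to target, in contrast to a logistic regression model, where subgroup effects of this kind would have to be specified explicitly as interaction terms.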