Random effects clustering in multilevel modeling: choosing a proper partition
A novel criterion is presented for estimating a latent partition of the observed groups based on the output of a hierarchical model. It is based on a loss function combining the Gini income inequality ratio and the predictability index of Goodman and Kruskal in order to achieve maximum heterogeneity of random effects across groups and maximum homogeneity of predicted probabilities inside estimated clusters. The index is compared with alternative approaches in a simulation study and applied in a case study concerning the role of hospital-level variables in the decision to perform a cesarean section.
Keywords: Hierarchical modelling; Model based clustering; Label switching; Bayesian nonparametric; Gini income inequality ratio; Goodman and Kruskal predictability index
Mathematics Subject Classification: 62C10, 62C12, 62H30, 62J12, 62J20
We would like to thank the Autonomous Region of Sardinia for providing the data used in Sect. 6. We also thank the editors and the two anonymous referees for their comments, which allowed us to consistently improve the quality of the paper in several parts.