Model-based approach for household clustering with mixed scale variables
The Ministry of Social Development in Mexico is in charge of creating and assigning social programmes that target specific needs in the population in order to improve quality of life. To better target these programmes, the Ministry aims to find clusters of households with the same needs, based on the demographic characteristics and the poverty conditions of each household. The available data consist of continuous, ordinal, and nominal variables, all of which come from a non-i.i.d. complex design survey sample. We propose a Bayesian nonparametric mixture model that jointly models a set of latent variables, as in an underlying variable response approach, associated with the observed mixed scale data, and that accommodates the different sampling probabilities. The performance of the model is assessed via simulated data. A full analysis of socio-economic conditions in households in the Mexican State of Mexico is presented.
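As a rough illustration of the underlying variable response idea mentioned above, the following minimal sketch maps jointly normal latent variables to one continuous, one ordinal, and one nominal observed variable. It is not the authors' model: the function name `latent_to_observed`, the cutpoints, the covariance matrix, and the rule for the nominal variable (a reference category when all of its latent components are negative, otherwise the category with the largest component) are all assumptions chosen for illustration, and the mixture structure and sampling weights are left out.

```python
import numpy as np


def latent_to_observed(z, cutpoints, n_nominal_latent):
    """Map latent rows to (continuous, ordinal, nominal) observations.

    Hypothetical column layout of z (an assumption of this sketch):
      column 0            -> continuous variable, observed directly
      column 1            -> latent variable behind the ordinal variable
      columns 2, 3, ...   -> latent variables behind the nominal variable
    """
    z = np.asarray(z, dtype=float)

    # Continuous variable: the latent value is taken as the observation.
    continuous = z[:, 0]

    # Ordinal variable: the category is the interval between cutpoints
    # into which the latent value falls (0, 1, ..., len(cutpoints)).
    ordinal = np.digitize(z[:, 1], cutpoints)

    # Nominal variable with n_nominal_latent + 1 categories:
    # category 0 if every latent component is negative,
    # otherwise 1 + index of the largest component.
    nominal_latent = z[:, 2:2 + n_nominal_latent]
    largest = nominal_latent.argmax(axis=1)
    nominal = np.where(nominal_latent.max(axis=1) < 0, 0, largest + 1)

    return continuous, ordinal, nominal


# Simulate a few households from a single multivariate normal component.
rng = np.random.default_rng(0)
mean = np.zeros(4)
cov = 0.5 * np.eye(4) + 0.5  # unit variances, correlation 0.5 (assumed)
z = rng.multivariate_normal(mean, cov, size=5)

continuous, ordinal, nominal = latent_to_observed(
    z, cutpoints=[-0.5, 0.5], n_nominal_latent=2
)
```

Because all observed variables are driven by one joint normal vector, dependence between variables of different scales is carried by the latent covariance matrix, which is what makes a single multivariate mixture kernel usable for mixed scale data.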
Keywords: Bayes nonparametrics, Complex design, Latent variables, Multivariate normal, Poisson–Dirichlet process
Mathematics Subject Classification: 62D05, 62G86, 62P25, 62H30
The authors are grateful for the constructive comments of a guest editor and two anonymous referees. The first author acknowledges support from Consejo Nacional de Ciencia y Tecnología, Mexico. The second author acknowledges support from Asociación Mexicana de Cultura, A. C., Mexico. The third author is also affiliated with the Collegio Carlo Alberto and acknowledges support of Grant CPDA154381/15 from the University of Padua, Italy.