Statistics and Computing, Volume 28, Issue 5, pp 1009–1031

Calibrating covariate informed product partition models

  • Garritt L. Page
  • Fernando A. Quintana


Abstract

Covariate informed product partition models incorporate the intuitively appealing notion that individuals or units with similar covariate values have, a priori, a higher probability of co-clustering than those with dissimilar covariate values. These methods have been shown to perform well when the number of covariates is relatively small. However, as the number of covariates increases, their influence on partition probabilities overwhelms any information the response may provide about clustering, often encouraging partitions with either a large number of singleton clusters or one large cluster, resulting in poor model fit and poor out-of-sample prediction. The same phenomenon is observed in Bayesian nonparametric regression methods that induce a conditional distribution for the response given covariates through a joint model. In light of this, we propose two methods that calibrate the covariate-dependent partition model by capping the influence that covariates have on partition probabilities. We demonstrate the new methods’ utility using simulation and two publicly available datasets.
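To make the calibration idea concrete, the sketch below computes an unnormalized PPMx-style prior weight for a partition: a product over clusters of a cohesion term and a covariate similarity term. This is an illustrative toy, not the authors' actual method or software: the DP-style cohesion, the Gaussian-kernel similarity, and the tempering exponent `temper` (e.g. 1/p for p covariates, capping how much the similarity product can dominate as p grows) are all assumptions chosen for illustration.

```python
import math

def cohesion(cluster_size, alpha=1.0):
    # DP-style cohesion: alpha * (|S_j| - 1)!  (an illustrative choice)
    return alpha * math.factorial(cluster_size - 1)

def similarity(xs):
    # Toy similarity for a cluster of covariate vectors: larger when the
    # vectors are close. With many covariates, the per-dimension sums of
    # squares accumulate and this term can overwhelm the cohesion.
    p = len(xs[0])
    total = 0.0
    for d in range(p):
        col = [x[d] for x in xs]
        m = sum(col) / len(col)
        total += sum((v - m) ** 2 for v in col)
    return math.exp(-total)

def ppmx_weight(partition, covariates, temper=1.0):
    """Unnormalized prior weight of a partition (list of index lists).

    temper in (0, 1] caps covariate influence by raising each cluster's
    similarity to that power; temper = 1 recovers the uncalibrated weight.
    """
    w = 1.0
    for cluster in partition:
        xs = [covariates[i] for i in cluster]
        w *= cohesion(len(xs)) * similarity(xs) ** temper
    return w

# Three units with 4 covariates each: units 0 and 1 are similar, unit 2 is far.
covs = [[0.0] * 4, [0.1] * 4, [5.0] * 4]
w_full = ppmx_weight([[0, 1], [2]], covs, temper=1.0)
w_temp = ppmx_weight([[0, 1], [2]], covs, temper=0.25)
```

Because each cluster's similarity lies below one, tempering (an exponent below one) moves it back toward one, so `w_temp > w_full`: the covariates still favor co-clustering similar units, but can no longer swamp the cohesion or the response as the covariate dimension grows.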


Keywords: High-dimensional covariate space · Prediction · Covariate-based clustering · Mixture of experts · Random partition models



Acknowledgements

The authors would like to thank Peter Müller for helpful comments. The authors also thank the reviewers for their valuable suggestions, which substantially improved the presentation. Garritt L. Page gratefully acknowledges the financial support of FONDECYT Grant 11121131; Fernando A. Quintana was partially funded by FONDECYT Grant 1141057.

Supplementary material

Supplementary material 1 (PDF, 364 KB)



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of Statistics, Brigham Young University, Provo, USA
  2. Departamento de Estadística, Pontificia Universidad Católica de Chile, Santiago, Chile
