Clustering via Nonsymmetric Partition Distributions

  • Asael Fabian Martínez
Conference paper
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 301)

Abstract

Random partition models are widely used for clustering because of their appealing features. However, additional information about group properties is not straightforward to incorporate under this approach. To overcome this difficulty, a novel approach to inference on clustering is presented. By relaxing the symmetry property of random partition distributions, group sizes can be included in the computation of the probabilities. A Bayesian model is also given, together with a sampling scheme, and it is tested on simulated and real datasets.
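
The symmetry property mentioned in the abstract can be illustrated with a small sketch. It assumes the Ewens partition probability function of the Dirichlet process as a standard example of a symmetric random partition distribution; this choice, the function names `ewens_eppf` and `set_partitions`, and the parameter `theta` are illustrative assumptions and are not taken from the paper. The sketch shows that such a distribution depends on the block sizes only as an unordered collection, so it cannot distinguish "a small group followed by a large one" from the reverse.

```python
from math import factorial, isclose


def ewens_eppf(sizes, theta=1.0):
    """Probability of one specific set partition with the given block sizes.

    Assumed example: Ewens formula theta^k * prod((n_j - 1)!) / (theta)_n,
    where (theta)_n is the rising factorial. Not the paper's model.
    """
    n = sum(sizes)
    rising = 1.0
    for i in range(n):
        rising *= theta + i
    prob = theta ** len(sizes) / rising
    for size in sizes:
        prob *= factorial(size - 1)
    return prob


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or open a new block containing only `first`.
        yield [[first]] + partition


# Symmetry: reordering the block sizes leaves the probability unchanged.
p_small_first = ewens_eppf([1, 3], theta=2.0)
p_large_first = ewens_eppf([3, 1], theta=2.0)

# Sanity check: the probabilities of all partitions of 4 items add up to one.
all_partitions = list(set_partitions([1, 2, 3, 4]))
total = sum(ewens_eppf([len(b) for b in p], theta=2.0) for p in all_partitions)
```

Here `p_small_first` and `p_large_first` coincide, which is exactly the symmetry in question. A nonsymmetric distribution would instead be defined on ordered partitions, so that the position of each block, together with its size, can enter the probability.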

Keywords

Bayesian modeling; Density estimation; Ordered set partitions

Notes

Acknowledgements

I would like to thank two anonymous referees for many helpful comments made on a previous version of the paper.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Departamento de Matemáticas, Universidad Autónoma Metropolitana, Unidad Iztapalapa, Mexico City, Mexico