Clustering via Nonsymmetric Partition Distributions
Random partition models are widely used for clustering because of their appealing features. However, additional information about group properties is not straightforward to incorporate under this approach. To overcome this difficulty, a novel approach to inference about clustering is presented. By relaxing the symmetry property of random partition distributions, we are able to include group sizes in the computation of the probabilities. A Bayesian model is also given, together with a sampling scheme, and it is tested on simulated and real datasets.
Keywords: Bayesian modeling, Density estimation, Ordered set partitions
I would like to thank two anonymous referees for many helpful comments made on a previous version of the paper.