Abstract
We present a Bayesian nonparametric model for hierarchical clustering (HC). The model has two main components. The first is the random walk process from parent to child in the hierarchy, for which we apply the nested Chinese Restaurant Process (nCRP). The second is the diffusion process from parent to child, for which we employ the Hierarchical Dirichlet Process Mixture Model (HDPMM). This differs from the common Gaussian-to-Gaussian choice. We demonstrate the properties of the model and propose a Markov chain Monte Carlo procedure with analytical updating steps for inferring the model variables. Experiments on real-world datasets show that our method obtains reasonable hierarchies and strong empirical results on several well-known metrics.
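To make the first component concrete, the following is a minimal sketch of how a nested CRP assigns a data item to a root-to-leaf path. It is not the implementation from the paper: the `Node` class, the `sample_path` function, the fixed `depth`, and the single concentration parameter `alpha` shared by all nodes are assumptions chosen for illustration.

```python
import random


class Node:
    """Tree node; `count` is the number of items whose path passes through it."""

    def __init__(self):
        self.count = 0
        self.children = []


def sample_path(root, depth, alpha, rng):
    """Draw one path of `depth` edges from `root` using a CRP choice at each node.

    Assumption: a fixed depth and one shared concentration `alpha` > 0.
    """
    node = root
    node.count += 1
    path = [node]
    for _ in range(depth):
        # An existing child is chosen with probability proportional to its
        # count; a new child is created with probability proportional to alpha.
        weights = [child.count for child in node.children] + [alpha]
        idx = rng.choices(range(len(weights)), weights=weights)[0]
        if idx == len(node.children):
            node.children.append(Node())
        node = node.children[idx]
        node.count += 1
        path.append(node)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    root = Node()
    for _ in range(20):
        sample_path(root, depth=3, alpha=1.0, rng=rng)
    print("children of the root:", [c.count for c in root.children])
```

Larger values of `alpha` make new branches more likely, so the sampled tree becomes wider. The second component (the HDPMM attached to each node) is not covered by this sketch.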
Notes
- 1. http://jmcauley.ucsd.edu/data/amazon: the data is available upon request.
- 2.
Acknowledgements
We thank the reviewers for the helpful feedback. This research has been supported by SFI under the grant SFI/12/RC/2289_P2.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Huang, W., Laitonjam, N., Piao, G., Hurley, N.J. (2021). Inferring Hierarchical Mixture Structures: A Bayesian Nonparametric Approach. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12714. Springer, Cham. https://doi.org/10.1007/978-3-030-75768-7_17
DOI: https://doi.org/10.1007/978-3-030-75768-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75767-0
Online ISBN: 978-3-030-75768-7
eBook Packages: Computer Science (R0)