Abstract
Co-clustering has emerged as an important technique for mining contingency data matrices. However, almost all existing co-clustering algorithms perform hard partitioning, assigning each row and each column of the data matrix to exactly one cluster. Recently, a Bayesian co-clustering approach was proposed that instead gives each row and each column a probability distribution over row clusters and column clusters, respectively. That approach uses variational inference for parameter estimation. In this work, we modify the Bayesian co-clustering model and use collapsed Gibbs sampling and collapsed variational inference for parameter estimation. Our empirical evaluation on real data sets shows that both collapsed Gibbs sampling and collapsed variational inference find more accurate likelihood estimates than the standard variational Bayesian co-clustering approach.
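The collapsed Gibbs sampling mentioned in the abstract can be illustrated with a minimal sketch. The model below is an assumption made for illustration, not the paper's exact formulation: every matrix entry takes one of V discrete values, each entry carries a row-cluster label and a column-cluster label, rows and columns have symmetric Dirichlet priors (alpha1, alpha2) over their cluster proportions, and each co-cluster has a symmetric Dirichlet prior (beta) over values. The function name and all parameters are hypothetical. "Collapsed" here means the proportions and co-cluster parameters are integrated out, so the sampler only resamples labels from count tables.

```python
import numpy as np

def collapsed_gibbs_cocluster(X, K, L, V, alpha1=1.0, alpha2=1.0, beta=0.1,
                              n_iter=50, seed=0):
    """Illustrative collapsed Gibbs sampler (assumed model, see lead-in).

    X is an (R, C) integer matrix with entries in {0, ..., V-1}.
    Returns smoothed row and column cluster proportions from the last sample.
    """
    rng = np.random.default_rng(seed)
    R, C = X.shape
    z1 = rng.integers(K, size=(R, C))   # row-cluster label of each entry
    z2 = rng.integers(L, size=(R, C))   # column-cluster label of each entry

    # Count tables that replace the integrated-out parameters.
    n_row = np.zeros((R, K))            # entries in row i with row-cluster k
    n_col = np.zeros((C, L))            # entries in column j with column-cluster l
    n_val = np.zeros((K, L, V))         # value counts per co-cluster (k, l)
    n_kl = np.zeros((K, L))             # total entries per co-cluster
    for i in range(R):
        for j in range(C):
            k, l, x = z1[i, j], z2[i, j], X[i, j]
            n_row[i, k] += 1
            n_col[j, l] += 1
            n_val[k, l, x] += 1
            n_kl[k, l] += 1

    for _ in range(n_iter):
        for i in range(R):
            for j in range(C):
                k, l, x = z1[i, j], z2[i, j], X[i, j]
                # Remove the current entry from all counts.
                n_row[i, k] -= 1
                n_col[j, l] -= 1
                n_val[k, l, x] -= 1
                n_kl[k, l] -= 1
                # Joint conditional over (k, l), up to a constant.
                p = ((n_row[i][:, None] + alpha1)
                     * (n_col[j][None, :] + alpha2)
                     * (n_val[:, :, x] + beta)
                     / (n_kl + V * beta))
                p = p.ravel()
                p /= p.sum()
                k, l = divmod(rng.choice(K * L, p=p), L)
                # Add the entry back under its new labels.
                z1[i, j], z2[i, j] = k, l
                n_row[i, k] += 1
                n_col[j, l] += 1
                n_val[k, l, x] += 1
                n_kl[k, l] += 1

    row_mix = (n_row + alpha1) / (n_row + alpha1).sum(axis=1, keepdims=True)
    col_mix = (n_col + alpha2) / (n_col + alpha2).sum(axis=1, keepdims=True)
    return row_mix, col_mix
```

Each row of the returned matrices is a probability distribution over clusters, estimated from the final sample only. This soft membership is what distinguishes the Bayesian approach from hard partitioning; averaging over several samples would give a less noisy estimate.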
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, P., Domeniconi, C., Laskey, K.B. (2009). Latent Dirichlet Bayesian Co-Clustering. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04174-7_34
DOI: https://doi.org/10.1007/978-3-642-04174-7_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04173-0
Online ISBN: 978-3-642-04174-7
eBook Packages: Computer Science, Computer Science (R0)