Combinatorial Markov Random Fields

  • Ron Bekkerman
  • Mehran Sahami
  • Erik Learned-Miller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)


A combinatorial random variable is a discrete random variable defined over a combinatorial set (e.g., the power set of a given set). In this paper we introduce combinatorial Markov random fields (Comrafs), which are Markov random fields in which some of the nodes are combinatorial random variables. We argue that Comrafs are powerful models for unsupervised and semi-supervised learning, and we put them in perspective by showing their relationship to several existing models. Since it can be problematic to apply existing inference techniques for graphical models to Comrafs, we design two simple and efficient inference algorithms specific to Comrafs, both based on combinatorial optimization. We show that even such simple algorithms consistently and significantly outperform Latent Dirichlet Allocation (LDA) on a document clustering task. We then present Comraf models for semi-supervised clustering and transfer learning that demonstrate superior results in comparison to an existing semi-supervised scheme (constrained optimization).
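To make the definition of a combinatorial set concrete, the following minimal sketch enumerates two such sets for a small ground set: its power set, and the set of all its partitions. It is an illustration only, not the authors' code. The function names `power_set` and `set_partitions` are hypothetical, and treating a clustering as a partition of the items is an assumption of this sketch rather than something the abstract states.

```python
from itertools import combinations


def power_set(items):
    """Return every subset of `items` as a frozenset (illustrative helper)."""
    items = list(items)
    return [
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    ]


def set_partitions(items):
    """Yield every partition of `items` into non-empty blocks (illustrative helper)."""
    items = list(items)
    if not items:
        # The empty set has exactly one partition: the one with no blocks.
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Option 1: add `first` to one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # Option 2: put `first` in a new block of its own.
        yield [[first]] + partition


if __name__ == "__main__":
    # A random variable over either of these sets takes one whole subset
    # (or one whole partition) as its value.
    print(len(power_set("abc")))                # 8 subsets of a 3-element set
    print(len(list(set_partitions(range(4)))))  # 15 partitions of a 4-element set
```

Both sets grow very quickly with the size of the ground set, so exhaustive enumeration like this is only feasible for toy inputs.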


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ron Bekkerman (1)
  • Mehran Sahami (2)
  • Erik Learned-Miller (1)

  1. Department of Computer Science, University of Massachusetts, Amherst, USA
  2. Google Inc., Mountain View, USA
