Machine Learning Approaches to Link-Based Clustering

  • Zhongfei (Mark) Zhang
  • Bo Long
  • Zhen Guo
  • Tianbing Xu
  • Philip S. Yu


In this chapter, we have reviewed several state-of-the-art machine learning approaches to different types of link-based clustering. Specifically, we have presented spectral clustering for heterogeneous relational data, symmetric convex coding for homogeneous relational data, a citation model for clustering a special but popular kind of homogeneous relational data (textual documents with citations), a probabilistic clustering framework based on mixed membership for general relational data, and a statistical graphical model for dynamic relational clustering. We have demonstrated the effectiveness of these approaches through empirical evaluations.


Keywords: Relational Data; Latent Dirichlet Allocation; Normalized Mutual Information; Document Cluster; Relational Cluster



This work is supported in part by NSF grants IIS-0535162, IIS-0812114, IIS-0905215, and DBI-0960443, as well as by graduate research internships at Google Research Labs and NEC Laboratories America, Inc. Yun Chi, Yihong Gong, Xiaoyun Wu, and Shenghuo Zhu contributed to part of this material.



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Zhongfei (Mark) Zhang (1)
  • Bo Long (2)
  • Zhen Guo (3)
  • Tianbing Xu (3)
  • Philip S. Yu (4)

  1. Computer Science Department, SUNY, Binghamton, USA
  2. Yahoo! Labs, Yahoo! Inc., Sunnyvale, USA
  3. Computer Science Department, SUNY Binghamton, Binghamton, USA
  4. Department of Computer Science, University of Illinois at Chicago, Chicago, USA
