Statistics and Computing

, Volume 17, Issue 4, pp 395–416

# A tutorial on spectral clustering

Article

## Abstract

In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. On the first glance spectral clustering appears slightly mysterious, and it is not obvious to see why it works at all and what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.

### Keywords

Spectral clustering Graph Laplacian

## Preview

### References

1. Aldous, D., Fill, J.: Reversible Markov chains and random walks on graphs (in preparation). Online version available at http://www.stat.berkeley.edu/users/aldous/RWG/book.html
2. Bach, F., Jordan, M.: Learning spectral clustering. In: Thrun, S., Saul, L., Schölkopf, B. (eds.) Advances in Neural Information Processing Systems 16 (NIPS), pp. 305–312. MIT Press, Cambridge (2004) Google Scholar
3. Bapat, R., Gutman, I., Xiao, W.: A simple method for computing resistance distance. Z. Naturforsch. 58, 494–498 (2003) Google Scholar
4. Barnard, S., Pothen, A., Simon, H.: A spectral algorithm for envelope reduction of sparse matrices. Numer. Linear Algebra Appl. 2(4), 317–334 (1995)
5. Belkin, M.: Problems of learning on manifolds. Ph.D. Thesis, University of Chicago (2003) Google Scholar
6. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
7. Belkin, M., Niyogi, P.: Towards a theoretical foundation for Laplacian-based manifold methods. In: Auer, P., Meir, R. (eds.) Proceedings of the 18th Annual Conference on Learning Theory (COLT), pp. 486–500. Springer, New York (2005) Google Scholar
8. Ben-David, S., von Luxburg, U., Pal, D.: A sober look on clustering stability. In: Lugosi, G., Simon, H. (eds.) Proceedings of the 19th Annual Conference on Learning Theory (COLT), pp. 5–19. Springer, Berlin (2006) Google Scholar
9. Bengio, Y., Delalleau, O., Roux, N., Paiement, J., Vincent, P., Ouimet, M.: Learning eigenfunctions links spectral embedding and kernel PCA. Neural Comput. 16, 2197–2219 (2004)
10. Ben-Hur, A., Elisseeff, A., Guyon, I.: A stability based method for discovering structure in clustered data. In: Pacific Symposium on Biocomputing, pp. 6–17 (2002) Google Scholar
11. Bhatia, R.: Matrix Analysis. Springer, New York (1997) Google Scholar
12. Bie, T.D., Cristianini, N.: Fast SDP relaxations of graph cut clustering, transduction, and other combinatorial problems. J. Mach. Learn. Res. 7, 1409–1436 (2006)
13. Bolla, M.: Relations between spectral and classification properties of multigraphs. Technical Report No. DIMACS-91-27, Center for Discrete Mathematics and Theoretical Computer Science (1991) Google Scholar
14. Brémaud, P.: Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer, New York (1999)
15. Brito, M., Chavez, E., Quiroz, A., Yukich, J.: Connectivity of the mutual k-nearest-neighbor graph in clustering and outlier detection. Stat. Probab. Lett. 35, 33–42 (1997)
16. Bui, T.N., Jones, C.: Finding good approximate vertex and edge partitions is NP-hard. Inf. Process. Lett. 42(3), 153–159 (1992)
17. Chapelle, O., Schölkopf, B., Zien, A. (eds.): Semi-Supervised Learning. MIT Press, Cambridge (2006) Google Scholar
18. Chung, F.: Spectral Graph Theory. CBMS Regional Conference Series, vol. 92. Conference Board of the Mathematical Sciences, Washington (1997)
19. Dhillon, I.: Co-clustering documents and words using bipartite spectral graph partitioning. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 269–274. ACM Press, New York (2001)
20. Dhillon, I., Guan, Y., Kulis, B.: A unified view of kernel k-means, spectral clustering, and graph partitioning. Technical Report No. UTCS TR-04-25, University of Texas at Austin (2005) Google Scholar
21. Ding, C.: A tutorial on spectral clustering. Talk presented at ICML (2004). Slides available at http://crd.lbl.gov/~cding/Spectral/
22. Ding, C., He, X., Zha, H., Gu, M., Simon, H.: A min-max cut algorithm for graph partitioning and data clustering. In: Proceedings of the first IEEE International Conference on Data Mining (ICDM), pp. 107–114. IEEE Computer Society, Washington (2001)
23. Donath, W.E., Hoffman, A.J.: Lower bounds for the partitioning of graphs. IBM J. Res. Develop. 17, 420–425 (1973)
24. Fiedler, M.: Algebraic connectivity of graphs. Czechoslovak Math. J. 23, 298–305 (1973)
25. Fouss, F., Pirotte, A., Renders, J.-M., Saerens, M.: Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans. Knowl. Data Eng. 19, 355–369 (2007) Google Scholar
26. Fraley, C., Raftery, A.E.: Model-based clustering, discriminant analysis, and density estimation. J. Am. Stat. Assoc. 97, 611–631 (2002)
27. Giné, E., Koltchinskii, V.: Empirical graph Laplacian approximation of Laplace-Beltrami operators: large sample results. In: Proceedings of the 4th International Conference on High Dimensional Probability, pp. 238–259 (2005) Google Scholar
28. Golub, G., Van Loan, C.: Matrix Computations. Johns Hopkins University Press, Baltimore (1996)
29. Guattery, S., Miller, G.: On the quality of spectral separators. SIAM J. Matrix Anal. Appl. 19(3), 701–719 (1998)
30. Gutman, I., Xiao, W.: Generalized inverse of the Laplacian matrix and some applications. Bull. Acad. Serb. Sci. Arts (Cl. Math. Natur.) 129, 15–23 (2004)
31. Hagen, L., Kahng, A.: New spectral methods for ratio cut partitioning and clustering. IEEE Trans. Comput.-Aided Des. 11(9), 1074–1085 (1992)
32. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, New York (2001)
33. Hein, M.: Uniform convergence of adaptive graph-based regularization. In: Proceedings of the 19th Annual Conference on Learning Theory (COLT), pp. 50–64. Springer, New York (2006) Google Scholar
34. Hein, M., Audibert, J.-Y., von Luxburg, U.: From graphs to manifolds—weak and strong pointwise consistency of graph Laplacians. In: Auer, P., Meir, R. (eds.) Proceedings of the 18th Annual Conference on Learning Theory (COLT), pp. 470–485. Springer, New York (2005) Google Scholar
35. Hein, M., Audibert, J.-Y., von Luxburg, U.: Graph Laplacians and their convergence on random neighborhood graphs. J. Mach. Learn. Res. 8, 1325–1370 (2007)
36. Hendrickson, B., Leland, R.: An improved spectral graph partitioning algorithm for mapping parallel computations. SIAM J. Sci. Comput. 16, 452–469 (1995)
37. Joachims, T.: Transductive learning via spectral graph partitioning. In: Fawcett, T., Mishra, N. (eds.) Proceedings of the 20th International Conference on Machine Learning (ICML), pp. 290–297. AAAI Press (2003) Google Scholar
38. Kannan, R., Vempala, S., Vetta, A.: On clusterings: good, bad and spectral. J. ACM 51(3), 497–515 (2004)
39. Kempe, D., McSherry, F.: A decentralized algorithm for spectral analysis. In: Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pp. 561–568. ACM Press, New York (2004) Google Scholar
40. Klein, D., Randic, M.: Resistance distance. J. Math. Chem. 12, 81–95 (1993)
41. Koren, Y.: Drawing graphs by eigenvectors: theory and practice. Comput. Math. Appl. 49, 1867–1888 (2005)
42. Lafon, S.: Diffusion maps and geometric harmonics. Ph.D. Thesis, Yale University (2004) Google Scholar
43. Lang, K.: Fixing two weaknesses of the spectral method. In: Weiss, Y., Schölkopf, B., Platt, J. (eds.) Advances in Neural Information Processing Systems 18, pp. 715–722. MIT Press, Cambridge (2006) Google Scholar
44. Lange, T., Roth, V., Braun, M., Buhmann, J.: Stability-based validation of clustering solutions. Neural Comput. 16(6), 1299–1323 (2004)
45. Lovász, L.: Random walks on graphs: a survey. In: Combinatorics, Paul Erdös is Eighty, pp. 353–397. János Bolyai Math. Soc., Budapest (1993) Google Scholar
46. Lütkepohl, H.: Handbook of Matrices. Wiley, Chichester (1997) Google Scholar
47. Meila, M., Shi, J.: A random walks view of spectral segmentation. In: 8th International Workshop on Artificial Intelligence and Statistics (AISTATS) (2001) Google Scholar
48. Mohar, B.: The Laplacian spectrum of graphs. In: Graph Theory, Combinatorics, and Applications. Kalamazoo, MI, 1988, vol. 2, pp. 871–898. Wiley, New York (1991) Google Scholar
49. Mohar, B.: Some applications of Laplace eigenvalues of graphs. In: Hahn, G., Sabidussi, G. (eds.) Graph Symmetry: Algebraic Methods and Applications. NATO ASI Ser. C, vol. 497, pp. 225–275. Kluwer, Dordrecht (1997) Google Scholar
50. Nadler, B., Lafon, S., Coifman, R., Kevrekidis, I.: Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. In: Weiss, Y., Schölkopf, B., Platt, J. (eds.) Advances in Neural Information Processing Systems 18, pp. 955–962. MIT Press, Cambridge (2006) Google Scholar
51. Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: Dietterich, T., Becker, S., Ghahramani, Z. (eds.) Advances in Neural Information Processing Systems 14, pp. 849–856. MIT Press, Cambridge (2002) Google Scholar
52. Norris, J.: Markov Chains. Cambridge University Press, Cambridge (1997)
53. Penrose, M.: A strong law for the longest edge of the minimal spanning tree. Ann. Probab. 27(1), 246–260 (1999)
54. Pothen, A., Simon, H.D., Liou, K.P.: Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl. 11, 430–452 (1990)
55. Saerens, M., Fouss, F., Yen, L., Dupont, P.: The principal components analysis of a graph, and its relationships to spectral clustering. In: Proceedings of the 15th European Conference on Machine Learning (ECML), pp. 371–383. Springer, Berlin (2004) Google Scholar
56. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
57. Simon, H.: Partitioning of unstructured problems for parallel processing. Comput. Syst. Eng. 2, 135–148 (1991)
58. Spielman, D., Teng, S.: Spectral partitioning works: planar graphs and finite element meshes. In: 37th Annual Symposium on Foundations of Computer Science (Burlington, VT, 1996), pp. 96–105. CA: IEEE Comput. Soc. Press, Los Alamitos (1996) Google Scholar
59. Stewart, G., Sun, J.: Matrix Perturbation Theory. Academic, New York (1990)
60. Still, S., Bialek, W.: How many clusters? An information-theoretic perspective. Neural Comput. 16(12), 2483–2506 (2004)
61. Stoer, M., Wagner, F.: A simple min-cut algorithm. J. ACM 44(4), 585–591 (1997)
62. Tibshirani, R., Walther, G., Hastie, T.: Estimating the number of clusters in a dataset via the gap statistic. J. Roy. Stat. Soc. B 63(2), 411–423 (2001)
63. Van Driessche, R., Roose, D.: An improved spectral bisection algorithm and its application to dynamic load balancing. Parallel Comput. 21(1), 29–48 (1995)
64. von Luxburg, U., Belkin, M., Bousquet, O.: Consistency of spectral clustering. Ann. Stat. (to appear). See also Technical Report No. 134, Max Planck Institute for Biological Cybernetics (2004) Google Scholar
65. von Luxburg, U., Bousquet, O., Belkin, M.: On the convergence of spectral clustering on random samples: the normalized case. In: Shawe-Taylor, J., Singer, Y. (eds.) Proceedings of the 17th Annual Conference on Learning Theory (COLT), pp. 457–471. Springer, New York (2004) Google Scholar
66. von Luxburg, U., Bousquet, O., Belkin, M.: Limits of spectral clustering. In: Saul, L., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems (NIPS) 17, pp. 857–864. MIT Press, Cambridge (2005) Google Scholar
67. Wagner, D., Wagner, F.: Between min cut and graph bisection. In: Proceedings of the 18th International Symposium on Mathematical Foundations of Computer Science (MFCS), pp. 744–750. Springer, London (1993) Google Scholar