Advertisement

Kernel Spectral Clustering and Applications

Chapter

Abstract

In this chapter we review the main literature related to kernel spectral clustering (KSC), an approach to clustering cast within a kernel-based optimization setting. KSC represents a least-squares support vector machine-based formulation of spectral clustering described by a weighted kernel PCA objective. Just as in the classifier case, the binary clustering model is expressed by a hyperplane in a high dimensional space induced by a kernel. In addition, the multi-way clustering can be obtained by combining a set of binary decision functions via an Error Correcting Output Codes (ECOC) encoding scheme. Because of its model-based nature, the KSC method encompasses three main steps: training, validation, testing. In the validation stage model selection is performed to obtain tuning parameters, like the number of clusters present in the data. This is a major advantage compared to classical spectral clustering where the determination of the clustering parameters is unclear and relies on heuristics. Once a KSC model is trained on a small subset of the entire data, it is able to generalize well to unseen test points. Beyond the basic formulation, sparse KSC algorithms based on the Incomplete Cholesky Decomposition (ICD) and L0, \(L_{1},L_{0} + L_{1}\), Group Lasso regularization are reviewed. In that respect, we show how it is possible to handle large-scale data. Also, two possible ways to perform hierarchical clustering and a soft clustering method are presented. Finally, real-world applications such as image segmentation, power load time-series clustering, document clustering, and big data learning are considered.

Keywords

Community Detection Kernel Matrix Cluster Membership Group Lasso Soft Cluster 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

EU: The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007–2013)/ERC AdG A-DATADRIVE-B (290923). This chapter reflects only the authors’ views, the Union is not liable for any use that may be made of the contained information. Research Council KUL: GOA/10/09 MaNet, CoE PFV/10/002 (OPTEC), BIL12/11T; PhD/Postdoc grants. Flemish Government: FWO: projects: G.0377.12 (Structured systems), G.088114N (Tensor-based data similarity); PhD/Postdoc grants. IWT: projects: SBO POM (100031); PhD/Postdoc grants. iMinds Medical Information Technologies SBO 2014. Belgian Federal Science Policy Office: IUAP P7/19 (DYSCO, Dynamical systems, control and optimization, 2012–2017.)

References

  1. 1.
    Alzate, C., Sinn, M.: Improved electricity load forecasting via kernel spectral clustering of smart meters. In: ICDM, pp. 943–948 (2013)Google Scholar
  2. 2.
    Alzate, C., Suykens, J.A.K.: Multiway spectral clustering with out-of-sample extensions through weighted kernel PCA. IEEE Trans. Pattern Anal. Mach. Intell. 32(2), 335–347 (2010)CrossRefGoogle Scholar
  3. 3.
    Alzate, C., Suykens, J.A.K.: Sparse kernel spectral clustering models for large-scale data analysis. Neurocomputing 74(9), 1382–1390 (2011)CrossRefGoogle Scholar
  4. 4.
    Alzate, C., Suykens, J.A.K.: Hierarchical kernel spectral clustering. Neural Networks 35, 21–30 (2012)CrossRefzbMATHGoogle Scholar
  5. 5.
    Alzate, C., Espinoza, M., De Moor, B., Suykens, J.A.K.: Identifying customer profiles in power load time series using spectral clustering. In: Proceedings of the 19th International Conference on Neural Networks (ICANN 2009), pp. 315–324 (2009)Google Scholar
  6. 6.
    Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley, Boston (1999)Google Scholar
  7. 7.
    Ben-Israel, A., Iyigun, C.: Probabilistic d-clustering. J. Classif. 25(1), 5–26 (2008)MathSciNetCrossRefzbMATHGoogle Scholar
  8. 8.
    Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, New York (2006)zbMATHGoogle Scholar
  9. 9.
    Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008(10), P10,008 (2008)Google Scholar
  10. 10.
    Candes, E.J., Wakin, M.B., Boyd, S.: Enhancing sparsity by reweighted l1 minimization. J. Fourier Anal. Appl. (Special Issue on Sparsity) 14(5), 877–905 (2008)Google Scholar
  11. 11.
    Chung, F.R.K.: Spectral Graph Theory. American Mathematical Society, Providence (1997)zbMATHGoogle Scholar
  12. 12.
    De Brabanter, K., De Brabanter, J., Suykens, J.A.K., De Moor, B.: Optimized fixed-size kernel models for large data sets. Comput. Stat. Data Anal. 54(6), 1484–1504 (2010)Google Scholar
  13. 13.
    Deerwester, S.C., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41(6), 391–407 (1990)CrossRefGoogle Scholar
  14. 14.
    Delvenne, J.C., Yaliraki, S.N., Barahona, M.: Stability of graph communities across time scales. Proc. Natl. Acad. Sci. 107(29), 12755–12760 (2010)CrossRefGoogle Scholar
  15. 15.
    Dhanjal, C., Gaudel, R., Clemenccon, S.: Efficient eigen-updating for spectral graph clustering (2013) [arXiv/1301.1318]Google Scholar
  16. 16.
    Dhillon, I., Guan, Y., Kulis, B.: Kernel k-means, spectral clustering and normalized cuts. In: 10th ACM Knowledge Discovery and Data Mining Conf., pp. 551–556 (2004)Google Scholar
  17. 17.
    Dhillon, I., Guan, Y., Kulis, B.: Weighted graph cuts without eigenvectors a multilevel approach. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1944–1957 (2007)CrossRefGoogle Scholar
  18. 18.
    Espinoza, M., Joye, C., Belmans, R., De Moor, B.: Short-term load forecasting, profile identification and customer segmentation: a methodology based on periodic time series. IEEE Trans. Power Syst. 20(3), 1622–1630 (2005)CrossRefGoogle Scholar
  19. 19.
    Fowlkes, C., Belongie, S., Chung, F., Malik, J.: Spectral grouping using the Nyström method. IEEE Trans. Pattern Anal. Mach. Intell. 26(2), 214–225 (2004)CrossRefGoogle Scholar
  20. 20.
    Frederix, K., Van Barel, M.: Sparse spectral clustering method based on the incomplete cholesky decomposition. J. Comput. Appl. Math. 237(1), 145–161 (2013)Google Scholar
  21. 21.
    Friedman, J., Hastie, T., Tibshirani, R.: A note on the group lasso and a sparse group lasso (2010) [arXiv:1001.0736]Google Scholar
  22. 22.
    Huang, K., Zheng, D., Sun, J., Hotta, Y., Fujimoto, K., Naoi, S.: Sparse learning for support vector classification. Pattern Recogn. Lett. 31(13), 1944–1951 (2010)CrossRefGoogle Scholar
  23. 23.
    Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 1(2), 193–218 (1985)MathSciNetCrossRefzbMATHGoogle Scholar
  24. 24.
    Lancichinetti, A., Radicchi, F., Ramasco, J.J., Fortunato, S.: Finding statistically significant communities in networks. PLoS ONE 6(4), e18961 (2011)CrossRefGoogle Scholar
  25. 25.
    Langone, R., Suykens, J.A.K.: Community detection using kernel spectral clustering with memory. J. Phys. Conf. Ser. 410(1), 012100 (2013)CrossRefGoogle Scholar
  26. 26.
    Langone, R., Alzate, C., Suykens, J.A.K.: Modularity-based model selection for kernel spectral clustering. In: Proc. of the International Joint Conference on Neural Networks (IJCNN 2011), pp. 1849–1856 (2011)Google Scholar
  27. 27.
    Langone, R., Alzate, C., Suykens, J.A.K.: Kernel spectral clustering for community detection in complex networks. In: Proc. of the International Joint Conference on Neural Networks (IJCNN 2012), pp. 2596–2603 (2012)Google Scholar
  28. 28.
    Langone, R., Alzate, C., Suykens, J.A.K.: Kernel spectral clustering with memory effect. Phys. A Stat. Mech. Appl. 392(10), 2588–2606 (2013)MathSciNetCrossRefGoogle Scholar
  29. 29.
    Langone, R., Alzate, C., De Ketelaere, B., Suykens, J.A.K.: Kernel spectral clustering for predicting maintenance of industrial machines. In: IEEE Symposium Series on Computational Intelligence and data mining SSCI (CIDM) 2013, pp. 39–45 (2013)Google Scholar
  30. 30.
    Langone, R., Mall, R., Suykens, J.A.K.: Soft kernel spectral clustering. In: Proc. of the International Joint Conference on Neural Networks (IJCNN 2013), pp. 1–8 (2013)Google Scholar
  31. 31.
    Langone, R., Mall, R., Suykens, J.A.K.: Clustering data over time using kernel spectral clustering with memory. In: SSCI (CIDM) 2014, pp. 1–8 (2014)Google Scholar
  32. 32.
    Langone, R., Agudelo, O.M., De Moor, B., Suykens, J.A.K.: Incremental kernel spectral clustering for online learning of non-stationary data. Neurocomputing 139, 246–260 (2014)Google Scholar
  33. 33.
    Langone, R., Alzate, C., De Ketelaere, B., Vlasselaer, J., Meert, W., Suykens, J.A.K.: Ls-svm based spectral clustering and regression for predicting maintenance of industrial machines. Eng. Appl. Artif. Intell. 37, 268–278 (2015)Google Scholar
  34. 34.
    Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1), 2 (2007)CrossRefGoogle Scholar
  35. 35.
    Leskovec, J., Lang, K.J., Mahoney, M.: Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th International Conference on World Wide Web, WWW ’10, pp. 631–640. ACM, New York (2010)Google Scholar
  36. 36.
    Liao, T.W.: Clustering of time series data - a survey. Pattern Recogn. 38(11), 1857–1874 (2005)CrossRefzbMATHGoogle Scholar
  37. 37.
    Lin, F., Cohen, W.W.: Power iteration clustering. In: ICML, pp. 655–662 (2010)Google Scholar
  38. 38.
    Mall, R., Langone, R., Suykens, J.: FURS: fast and unique representative subset selection retaining large scale community structure. Soc. Netw. Anal. Min. 3(4), 1–21 (2013)CrossRefGoogle Scholar
  39. 39.
    Mall, R., Langone, R., Suykens, J.A.K.: Kernel spectral clustering for big data networks. Entropy (Special Issue on Big Data) 15(5), 1567–1586 (2013)Google Scholar
  40. 40.
    Mall, R., Langone, R., Suykens, J.A.K.: Self-tuned kernel spectral clustering for large scale networks. In: IEEE International Conference on Big Data (2013)CrossRefGoogle Scholar
  41. 41.
    Mall, R., Langone, R., Suykens, J.A.K.: Agglomerative hierarchical kernel spectral data clustering. In: Symposium Series on Computational Intelligence (SSCI-CIDM), pp. 1–8 (2014)Google Scholar
  42. 42.
    Mall, R., Langone, R., Suykens, J.A.K.: Multilevel hierarchical kernel spectral clustering for real-life large scale complex networks. PLoS ONE 9(6), e99966 (2014)CrossRefGoogle Scholar
  43. 43.
    Mall, R., Mehrkanoon, S., Langone, R., Suykens, J.A.K.: Optimal reduced sets for sparse kernel spectral clustering. In: Proc. of the International Joint Conference on Neural Networks (IJCNN 2014), pp. 2436–2443 (2014)Google Scholar
  44. 44.
    Mall, R., Jumutc, V., Langone, R., Suykens, J.A.K.: Representative subsets for big data learning using kNN graphs. In: IEEE International Conference on Big Data, pp. 37–42 (2014)Google Scholar
  45. 45.
    Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Proc. 8th Int’l Conf. Computer Vision, vol. 2, pp. 416–423 (2001)Google Scholar
  46. 46.
    Meila, M., Shi, J.: Learning segmentation by random walks. In: T.K. Leen, T.G. Dietterich, V. Tresp (eds.) Advances in Neural Information Processing Systems, vol. 13. MIT Press, Cambridge (2001)Google Scholar
  47. 47.
    Meila, M., Shi, J.: A random walks view of spectral segmentation. In: Artificial Intelligence and Statistics AISTATS (2001)Google Scholar
  48. 48.
    Mika, S., Schölkopf, B., Smola, A.J., Müller, K.R., Scholz, M., Rätsch, G.: Kernel PCA and de-noising in feature spaces. In: Kearns, M.S., Solla, S.A., Cohn, D.A. (eds.) Advances in Neural Information Processing Systems, vol. 11. MIT Press, Cambridge (1999)Google Scholar
  49. 49.
    Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. U. S. A. 103(23), 8577–8582 (2006)CrossRefGoogle Scholar
  50. 50.
    Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)CrossRefGoogle Scholar
  51. 51.
    Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: Dietterich, T.G., Becker, S., Ghahramani, Z. (eds.) Advances in Neural Information Processing Systems 14, pp. 849–856. MIT Press, Cambridge (2002)Google Scholar
  52. 52.
    Ning, H., Xu, W., Chi, Y., Gong, Y., Huang, T.S.: Incremental spectral clustering by efficiently updating the eigen-system. Pattern Recogn. 43(1), 113–127 (2010)CrossRefzbMATHGoogle Scholar
  53. 53.
    Novak, M., Alzate, C., langone, R., Suykens, J.A.K.: Fast kernel spectral clustering based on incomplete cholesky factorization for large scale data analysis. Internal Report 14–119, ESAT-SISTA, KU Leuven (Leuven, Belgium) (2015)Google Scholar
  54. 54.
    Peluffo, D., Garcia, S., Langone, R., Suykens, J.A.K., Castellanos, G.: Kernel spectral clustering for dynamic data using multiple kernel learning. In: Proc. of the International Joint Conference on Neural Networks (IJCNN 2013), pp. 1085–1090 (2013)Google Scholar
  55. 55.
    Puzicha, J., Hofmann, T., Buhmann, J.: Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. In: Computer Vision and Pattern Recognition, pp. 267–272 (1997)Google Scholar
  56. 56.
    Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. 105(4), 1118–1123 (2008)CrossRefGoogle Scholar
  57. 57.
    Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20(1), 53–65 (1987)CrossRefzbMATHGoogle Scholar
  58. 58.
    Schölkopf, B., Smola, A.J., Müller, K.R.: Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10, 1299–1319 (1998)CrossRefGoogle Scholar
  59. 59.
    Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)CrossRefGoogle Scholar
  60. 60.
    Strehl, A., Ghosh, J.: Cluster ensembles - a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617 (2002)MathSciNetzbMATHGoogle Scholar
  61. 61.
    Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., Vandewalle, J.: Least Squares Support Vector Machines. World Scientific, Singapore (2002)CrossRefzbMATHGoogle Scholar
  62. 62.
    Suykens, J.A.K., Van Gestel, T., Vandewalle, J., De Moor, B.: A support vector machine formulation to PCA analysis and its kernel version. IEEE Trans. Neural Netw. 14(2), 447–450 (2003)CrossRefGoogle Scholar
  63. 63.
    von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007)MathSciNetCrossRefGoogle Scholar
  64. 64.
    Williams, C.K.I., Seeger, M.: Using the Nyström method to speed up kernel machines. In: Advances in Neural Information Processing Systems, vol. 13. MIT Press, Cambridge (2001)Google Scholar
  65. 65.
    Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. 68(1), 49–67 (2006)MathSciNetCrossRefzbMATHGoogle Scholar
  66. 66.
    Zhu, J., Rosset, S., Hastie, T., Tibshirani, R.: 1-norm svms. In: Neural Information Processing Systems, vol. 16 (2003)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. 1.KU Leuven ESAT-STADIUSLeuven(Belgium)
  2. 2.IBM’s Smarter Cities Technology CenterDublin(Ireland)

Personalised recommendations