An Incremental Reseeding Strategy for Clustering

  • Xavier Bresson
  • Huiyi Hu
  • Thomas Laurent
  • Arthur Szlam
  • James von Brecht
Conference paper
Part of the Mathematics and Visualization book series (MATHVISUAL)

Abstract

We propose an easy-to-implement and highly parallelizable algorithm for multiway graph partitioning. The algorithm iterates three simple routines in alternation: diffusion, thresholding, and random sampling. We demonstrate experimentally that the proper combination of these ingredients leads to an algorithm that achieves state-of-the-art performance in terms of cluster purity on standard benchmark data sets. We also describe a coarsen, cluster, and refine approach, similar to Dhillon et al. (IEEE Trans Pattern Anal Mach Intell 29(11):1944–1957, 2007) and Karypis and Kumar (SIAM J Sci Comput 20(1):359–392, 1998), that removes an order of magnitude from the runtime of our algorithm while still maintaining competitive accuracy.
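
As a rough illustration of how the three routines named in the abstract might fit together, the following Python sketch alternates random sampling of seeds, diffusion of the seed indicators by a random walk, and thresholding by a per-vertex argmax. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name reseed_cluster, its parameters, the column-normalised random walk, the argmax rule, and the fixed seed count are all hypothetical choices, since the abstract does not specify these details (including any schedule for increasing the number of seeds).

```python
import numpy as np


def reseed_cluster(W, n_clusters, n_iters=20, n_seeds=1, n_diffusion_steps=10, rng=None):
    """Hypothetical sketch: alternate random sampling, diffusion, thresholding.

    W is assumed to be a dense, symmetric, nonnegative affinity matrix in which
    every vertex has positive degree.
    """
    rng = np.random.default_rng(rng)
    n = W.shape[0]
    # Column-normalise W so that one multiplication is one random-walk step
    # (an assumed choice of diffusion operator).
    P = W / W.sum(axis=0, keepdims=True)
    # Start from a random partition of the vertices.
    labels = rng.integers(n_clusters, size=n)
    for _ in range(n_iters):
        # Random sampling: draw seed vertices from each current cluster.
        F = np.zeros((n, n_clusters))
        for k in range(n_clusters):
            members = np.flatnonzero(labels == k)
            if members.size == 0:
                # An empty cluster is reseeded from the whole vertex set.
                members = np.arange(n)
            seeds = rng.choice(members, size=min(n_seeds, members.size), replace=False)
            F[seeds, k] = 1.0
        # Diffusion: spread each cluster's seed mass along the graph.
        for _ in range(n_diffusion_steps):
            F = P @ F
        # Thresholding: assign each vertex to the cluster with the largest mass.
        labels = F.argmax(axis=1)
    return labels


# Usage on a toy graph: two dense blocks joined by weak edges.
W = np.ones((6, 6))
W[:3, 3:] = 0.01
W[3:, :3] = 0.01
print(reseed_cluster(W, n_clusters=2, rng=0))
```

Each column of F is diffused independently, so the diffusion step is a single matrix product that parallelises naturally across clusters; this is one plausible reading of the "highly parallelizable" claim rather than a statement about the paper's implementation.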

Notes

Acknowledgements

XB is supported by NRF Fellowship NRFF2017-10.

References

  1. R. Andersen, F. Chung, K. Lang, Local graph partitioning using pagerank vectors, in Proceedings of the 47th Annual Symposium on Foundations of Computer Science (FOCS ’06) (2006), pp. 475–486
  2. R. Arora, M. Gupta, A. Kapila, M. Fazel, Clustering by left-stochastic matrix factorization, in International Conference on Machine Learning (ICML) (2011), pp. 761–768
  3. X. Bresson, T. Laurent, D. Uminsky, J. von Brecht, Multiclass total variation clustering, in Advances in Neural Information Processing Systems (NIPS) (2013)
  4. X. Bresson, T. Laurent, A. Szlam, J.H. von Brecht, The product cut, in Advances in Neural Information Processing Systems (NIPS) (2016)
  5. J. Bruna, S. Mallat, Invariant scattering convolution networks. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1872–1886 (2013)
  6. I.S. Dhillon, Y. Guan, B. Kulis, Weighted graph cuts without eigenvectors: a multilevel approach. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1944–1957 (2007)
  7. C. Garcia-Cardona, E. Merkurjev, A.L. Bertozzi, A. Flenner, A.G. Percus, Multiclass data segmentation using diffuse interface methods on graphs. IEEE Trans. Pattern Anal. Mach. Intell. 99, 1 (2014)
  8. G. Karypis, V. Kumar, A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)
  9. S. Lafon, A.B. Lee, Diffusion maps and coarse-graining: a unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Trans. Pattern Anal. Mach. Intell. 28(9), 1393–1403 (2006)
  10. F. Lin, W.W. Cohen, Power iteration clustering, in ICML (2010), pp. 655–662
  11. L. Lovász, M. Simonovits, Random walks in a convex body and an improved volume algorithm. Random Struct. Algorithms 4(4), 359–412 (1993)
  12. A.K. McCallum, Bow: a toolkit for statistical language modeling, text retrieval, classification and clustering (1996). http://www.cs.cmu.edu/~mccallum/bow
  13. D.A. Spielman, S.-H. Teng, Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems, in Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing (2004), pp. 81–90
  14. D.A. Spielman, S.-H. Teng, A local clustering algorithm for massive graphs and its application to nearly linear time graph partitioning. SIAM J. Comput. 42(1), 1–26 (2013)
  15. S. Mallat, Group invariant scattering. Commun. Pure Appl. Math. 65(10), 1331–1398 (2012)
  16. Z. Yang, T. Hao, O. Dikmen, X. Chen, E. Oja, Clustering by nonnegative matrix factorization using graph random walk, in Advances in Neural Information Processing Systems (NIPS) (2012), pp. 1088–1096
  17. S.X. Yu, J. Shi, Multiclass spectral clustering, in International Conference on Computer Vision (2003)
  18. X. Zhu, Z. Ghahramani, J. Lafferty, Semi-supervised learning using Gaussian fields and harmonic functions, in International Conference on Machine Learning (ICML) (2003), pp. 912–919

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Xavier Bresson (1)
  • Huiyi Hu (2)
  • Thomas Laurent (3)
  • Arthur Szlam (4)
  • James von Brecht (5)

  1. School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
  2. Google Inc, Mountain View, USA
  3. Department of Mathematics, Loyola Marymount University, Los Angeles, USA
  4. Facebook AI Research, New York, USA
  5. Department of Mathematics, California State University, Long Beach, USA
