A Distributed Block Chebyshev-Davidson Algorithm for Parallel Spectral Clustering

Journal of Scientific Computing

Abstract

We develop a distributed Block Chebyshev-Davidson algorithm to solve the large-scale leading eigenvalue problems that arise in spectral clustering. First, the efficiency of the Chebyshev-Davidson algorithm relies on prior knowledge of the eigenvalue spectrum, which can be expensive to estimate. In spectral clustering this cost is largely avoided, because the spectrum of the Laplacian or normalized Laplacian matrix can be estimated analytically, which makes the proposed algorithm very efficient for this application. Second, to handle big data, we develop a distributed, parallel version of the algorithm with attractive scalability: the parallel speedup is approximately \(\sqrt{p}\), where \(p\) denotes the number of processes. Numerical results demonstrate the algorithm's efficiency in spectral clustering and its scalability advantage over existing eigensolvers used for spectral clustering in parallel computing environments.
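To make the role of the spectrum bounds concrete, the following is a minimal single-process sketch of a Chebyshev polynomial filter applied to a normalized Laplacian. It is not the authors' distributed implementation. It assumes the standard fact that the eigenvalues of the normalized Laplacian \(I - D^{-1/2} A D^{-1/2}\) lie in \([0, 2]\), so the upper bound of the damped interval is known without any estimation. The function names, the adjacency-dictionary input format, the toy graph, and the cutoff 1.0 are all hypothetical choices for illustration, and the filter is left unscaled for brevity.

```python
import math


def normalized_laplacian_matvec(adj, v):
    """Return (I - D^{-1/2} A D^{-1/2}) v for an unweighted, loop-free graph.

    adj maps node i (0..n-1) to a list of its neighbours; this input
    format is an assumption made for this sketch.
    """
    n = len(v)
    deg = [len(adj[i]) for i in range(n)]
    out = []
    for i in range(n):
        s = sum(v[j] / math.sqrt(deg[i] * deg[j]) for j in adj[i])
        out.append(v[i] - s)
    return out


def chebyshev_filter(matvec, v, degree, a, b):
    """Apply T_degree((M - c I) / e) to v via the three-term recurrence.

    [a, b] is the interval to damp: components with eigenvalues inside it
    are multiplied by a factor of magnitude at most 1, while components
    with eigenvalues outside it are amplified.
    """
    c = (a + b) / 2.0  # centre of the damped interval
    e = (b - a) / 2.0  # half-width of the damped interval
    if degree == 0:
        return list(v)
    y_prev = list(v)
    y = [(w - c * x) / e for w, x in zip(matvec(v), v)]
    for _ in range(2, degree + 1):
        w = matvec(y)
        y_next = [2.0 * (wi - c * yi) / e - pi
                  for wi, yi, pi in zip(w, y, y_prev)]
        y_prev, y = y, y_next
    return y


def rayleigh_quotient(matvec, v):
    """Return v^T M v / v^T v, an estimate of where v sits in the spectrum."""
    w = matvec(v)
    return sum(wi * vi for wi, vi in zip(w, v)) / sum(vi * vi for vi in v)


if __name__ == "__main__":
    # Toy graph (hypothetical): two triangles joined by the edge 2-3.
    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
           3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

    def matvec(x):
        return normalized_laplacian_matvec(adj, x)

    start = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    # Damp [1.0, 2.0]; the upper end 2.0 is the analytic spectrum bound.
    filtered = chebyshev_filter(matvec, start, 10, 1.0, 2.0)
    print("Rayleigh quotient before:", rayleigh_quotient(matvec, start))
    print("Rayleigh quotient after: ", rayleigh_quotient(matvec, filtered))
```

On this toy graph the filtered vector's Rayleigh quotient drops toward the low end of the spectrum, which is the part spectral clustering uses. A practical solver would also scale the recurrence to avoid overflow at high degrees and would embed the filter in a block Davidson iteration; both are omitted here.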

Data Availability

Enquiries about data availability should be directed to the authors.

Notes

  1. http://graphchallenge.mit.edu.

  2. https://github.com/qiyuanpang/DistributedLEVP.jl.


Acknowledgements

We thank Aleksey Urmanov for helpful discussion and comments.

Funding

We thank Oracle Labs, Oracle Corporation, Austin, TX, for providing funding that supported research in the area of scalable spectral clustering and distributed eigensolvers. H. Y. was partially supported by the US National Science Foundation under awards DMS-2244988, DMS-2206333, and the Office of Naval Research Award N00014-23-1-2007.

Author information

Corresponding author

Correspondence to Haizhao Yang.

Ethics declarations

Conflict of interest

The authors have not disclosed any competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pang, Q., Yang, H. A Distributed Block Chebyshev-Davidson Algorithm for Parallel Spectral Clustering. J Sci Comput 98, 69 (2024). https://doi.org/10.1007/s10915-024-02455-y

  • DOI: https://doi.org/10.1007/s10915-024-02455-y
