Skip to main content
Log in

A mixed precision LOBPCG algorithm

  • Original Paper
  • Published:
Numerical Algorithms Aims and scope Submit manuscript

Abstract

The locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm is a popular approach for computing a few smallest eigenvalues and the corresponding eigenvectors of a large Hermitian positive definite matrix \(A\). In this work, we propose a mixed precision variant of LOBPCG that uses a (sparse) Cholesky factorization of \(A\) computed in lower precision as the preconditioner. To further enhance performance, a mixed precision orthogonalization strategy is proposed. To analyze the impact of reducing precision in the preconditioner on performance, we carry out a rounding error and convergence analysis of PINVIT, a simplified variant of LOBPCG. Our theoretical results predict and our numerical experiments confirm that the impact on convergence remains marginal. In practice, our mixed precision LOBPCG algorithm typically reduces the computation time by a factor of \(1.4\)\(2.0\) on both CPUs and GPUs.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

Data Availability

The results/data/figures in this manuscript have not been published elsewhere, nor are they under consideration (from you or one of your Contributing Authors) by another publisher. All of the material is owned by the authors and/or no permissions are required.

Notes

  1. https://sparse.tamu.edu

  2. For MPLOBPCG-schol, the number of LOBPCG iterations for constructing the initial guess is also counted.

References

  1. Balcan, D., Gonçalves, B., Hu, H., Ramasco, J.J., Colizza, V., Vespignani, A.: Modeling the spatial spread of infectious diseases: the GLobal Epidemic and Mobility computational model. J. Comput. Sci. 1(3), 132–145 (2010). https://doi.org/10.1016/j.jocs.2010.07.002

    Article  Google Scholar 

  2. Knyazev, A.: Recent implementations, applications, and extensions of the locally optimal block preconditioned conjugate gradient method (LOBPCG). arXiv:1708.08354(2017)

  3. Saad, Y.: Numerical Methods for Large Eigenvalue Problems, Revised SIAM, Philadelphia, PA, USA (2011)

    Book  MATH  Google Scholar 

  4. Neymeyr, K.: A geometric theory for preconditioned inverse iteration applied to a subspace. Math. Comp. 71(237), 197–216 (2002). https://doi.org/10.1090/S0025-5718-01-01357-6

    Article  MathSciNet  MATH  Google Scholar 

  5. Argentati, M., Knyazev, A., Neymeyr, K., Ovtchinnikov, E., Zhou, M.: Convergence theory for preconditioned eigenvalue solvers in a nutshell. Found. Comput. Math. 17, 713–727 (2017). https://doi.org/10.1007/s10208-015-9297-1

    Article  MathSciNet  MATH  Google Scholar 

  6. Knyazev, A.V.: Toward the optimal preconditioned eigensolver: Locally optimal block preconditioned conjugate gradient method. SIAM J. Sci. Comput. 23(2), 517–541 (2001). https://doi.org/10.1137/S1064827500366124

    Article  MathSciNet  MATH  Google Scholar 

  7. Abdelfattah, A., Anzt, H., Boman, E.G., Carson, E., Cojean, T., Dongarra, J., Fox, A., Gates, M., Higham, N.J., Li, X.S., Loe, J., Luszczek, P., Pranesh, S., Rajamanickam, S., Ribizel, T., Smith, B.F., Swirydowicz, K., Thomas, S., Tomov, S., Tsai, Y.M., Yang, U.M.: A survey of numerical linear algebra methods utilizing mixed-precision arithmetic. Int. J. High Perform. Comput. Appl. 35(4), 344–369 (2021). https://doi.org/10.1177/10943420211003313

    Article  Google Scholar 

  8. Higham, N.J., Mary, T.: Mixed precision algorithms in numerical linear algebra. Acta Numer. 31, 347–414 (2022). https://doi.org/10.1017/S0962492922000022

    Article  MathSciNet  MATH  Google Scholar 

  9. Carson, E., Higham, N.J.: Accelerating the solution of linear systems by iterative refinement in three precisions. SIAM J. Sci. Comput. 40(2), 817–847 (2018). https://doi.org/10.1137/17M1140819

    Article  MathSciNet  MATH  Google Scholar 

  10. Ogita, T., Aishima, K.: Iterative refinement for symmetric eigenvalue decomposition. Japan J. Indust. Appl. Math. 35(3), 1007–1035 (2018). https://doi.org/10.1007/s13160-018-0310-3

    Article  MathSciNet  MATH  Google Scholar 

  11. Ogita, T., Aishima, K.: Iterative refinement for symmetric eigenvalue decomposition II: clustered eigenvalues. Japan J. Indust. Appl. Math. 36(2), 435–459 (2019). https://doi.org/10.1007/s13160-019-00348-4

    Article  MathSciNet  MATH  Google Scholar 

  12. Ogita, T., Aishima, K.: Iterative refinement for singular value decomposition based on matrix multiplication. J. Comput. Appl. Math. 369, 112512 (2020). https://doi.org/10.1016/j.cam.2019.112512

    Article  MathSciNet  MATH  Google Scholar 

  13. Bujanović, Z., Kressner, D., Schröder, C.: Iterative refinement of Schur decompositions. Numer. Algorithms 92(1), 247–267 (2023). https://doi.org/10.1007/s11075-022-01327-6

    Article  MathSciNet  MATH  Google Scholar 

  14. Gao, W., Ma, Y., Shao, M.: A mixed precision Jacobi SVD algorithm. arXiv:2209.04626 (2022)

  15. Dongarra, J.J.: Algorithm 589: SICEDR: A FORTRAN subroutine for improving the accuracy of computed matrix eigenvalues. ACM Trans. Math. Software 8(4), 371–375 (1982). https://doi.org/10.1145/356012.356016

    Article  MATH  Google Scholar 

  16. Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. Johns Hopkins University Press, Baltimore, MD, USA (2013)

    Book  MATH  Google Scholar 

  17. Duersch, J.A., Shao, M., Yang, C., Gu, M.: A robust and efficient implementation of LOBPCG. SIAM J. Sci. Comput. 40(5), 655–676 (2018). https://doi.org/10.1137/17M1129830

    Article  MathSciNet  MATH  Google Scholar 

  18. Hetmaniuk, U., Lehoucq, R.: Basis selection in LOBPCG. J. Comput. Phys. 218(1), 324–332 (2006). https://doi.org/10.1016/j.jcp.2006.02.007

    Article  MathSciNet  MATH  Google Scholar 

  19. Yamazaki, I., Tomov, S., Dongarra, J.: Mixed-precision Cholesky QR factorization and its case studies on multicore CPU with multiple GPUs. SIAM J. Sci. Comput. 37(3), 307–330 (2015). https://doi.org/10.1137/14M0973773

    Article  MathSciNet  MATH  Google Scholar 

  20. Yamazaki, I., Tomov, S., Kurzak, J., Dongarra, J., Barlow, J.: Mixed-precision block Gram Schmidt orthogonalization. In: ScalA ’15: Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pp. 2–128 (2015). https://doi.org/10.1145/2832080.2832082

  21. Rohwedder, T., Schneider, R., Zeiser, A.: Perturbed preconditioned inverse iteration for operator eigenvalue problems with applications to adaptive wavelet discretization. Adv. Comput. Math. 34(1), 43–66 (2011)

    Article  MathSciNet  MATH  Google Scholar 

  22. Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia, PA, USA (2002)

    Book  MATH  Google Scholar 

  23. Chen, Y., Davis, T.A., Hager, W.W., Rajamanickam, S.: Algorithm 887: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate. ACM Trans. Math. Software 35(3), 22–12214 (2008). https://doi.org/10.1145/1391989.1391995

    Article  MathSciNet  Google Scholar 

  24. Shao, M., Oryspayev, D., Yang, C., Maris, P., Cook, B.: Fault-tolerant LOBPCG for nuclear CI calculations. In: International Conference on High Performance Computing in Asia-Pacific Region (HPC ASIA 2023), February 27–March 2, 2023, Singapore, Singapore, pp. 88–95. ACM, New York, NY, USA (2023). https://doi.org/10.1145/3578178.3578240

  25. Fadel, S., Ghoniemy, S., Abdallah, M., Sorra, H.A., Ashour, A., Ansary, A.: Investigating the effect of different kernel functions on the performance of SVM for recognizing Arabic characters. Int. J. Adv. Comput. Sci. Appl. 7(1) (2016). https://doi.org/10.14569/IJACSA.2016.070160

Download references

Acknowledgements

The authors thank Erin Carson for helpful discussions. Part of this work was performed when the second author was visiting EPF Lausanne in 2022.

Funding

Yuxin Ma is partially supported by the State Scholarship Fund of China Scholarship Council (CSC) under Grant No. 202106100093, National Key R &D Program of China under Grant No. 2021YFA1003305 and National Natural Science Foundation of China under Grant No. 71991471. Meiyue Shao is partially supported by the National Natural Science Foundation of China under grant No. 11971118.

Author information

Authors and Affiliations

Authors

Contributions

These authors contributed equally to this work.

Corresponding author

Correspondence to Yuxin Ma.

Ethics declarations

Ethics approval

Not applicable

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Daniel Kressner, Yuxin Ma and Meiyue Shao are contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kressner, D., Ma, Y. & Shao, M. A mixed precision LOBPCG algorithm. Numer Algor 94, 1653–1671 (2023). https://doi.org/10.1007/s11075-023-01550-9

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11075-023-01550-9

Keywords

Mathematics Subject Classification (2010)

Navigation