Resolving small random symmetric linear systems on graphics processing units

The Journal of Supercomputing

Abstract

This paper addresses the solution of a large number of small random symmetric linear systems and its parallel implementation in single precision on graphics processing units (GPUs). The computations required by each linear system are independent of the others, and the number of unknowns does not exceed 64. For this purpose, we adapt several widely used methods to our context: LDLt factorization, Householder reduction to a tridiagonal matrix, parallel cyclic reduction (PCR) for system sizes that are not a power of two, and the divide and conquer algorithm for tridiagonal eigenproblems. We detail the implementation and optimization of each method, and we compare the sustainability of each solution and its performance, which includes both parallel complexity and cache memory occupation. For solving a large number of small random linear systems on GPUs with no information about their conditioning, our research indicates that the best strategy is Householder tridiagonalization followed by PCR and then, if necessary, by a divide and conquer diagonalization.
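To make the PCR step concrete for a system size that is not a power of two, here is a minimal sequential sketch in Python. It is not the authors' GPU implementation: the function name `pcr_solve` and the array layout are assumptions chosen for illustration, it runs in double precision rather than single precision, and it uses no pivoting, so it is only reliable on well-behaved (for example, diagonally dominant) tridiagonal systems. Each pass of the inner loop corresponds to work that one GPU thread would do in parallel.

```python
def pcr_solve(a, b, c, d):
    """Solve a tridiagonal system by parallel cyclic reduction (PCR).

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].
    a[0] and c[n-1] are ignored. n can be any positive integer,
    not only a power of two. Hypothetical helper, for illustration only.
    """
    n = len(b)
    a = [0.0] + [float(v) for v in a[1:]]
    b = [float(v) for v in b]
    c = [float(v) for v in c[:-1]] + [0.0]
    d = [float(v) for v in d]

    stride = 1
    while stride < n:
        na, nc = [0.0] * n, [0.0] * n
        nb, nd = b[:], d[:]
        for i in range(n):  # on a GPU, one thread per row i
            lo, hi = i - stride, i + stride
            # A neighbour outside [0, n) is simply skipped; this is
            # what lets the reduction handle sizes that are not 2^k.
            if lo >= 0:
                alpha = -a[i] / b[lo]
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                gamma = -c[i] / b[hi]
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2

    # After ceil(log2(n)) steps every row is decoupled: b[i]*x[i] = d[i].
    return [d[i] / b[i] for i in range(n)]
```

For a 5-by-5 system, the loop runs three reduction steps (strides 1, 2 and 4), after which each unknown is obtained by a single division.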

Acknowledgments

This work was funded by the project ARRAND (ANR-15-CE39-0002-01) and partially supported by the project FastRelax (ANR-14-CE25-0018-01).

Author information

Corresponding author

Correspondence to L. A. Abbas-Turki.

About this article

Cite this article

Abbas-Turki, L.A., Graillat, S. Resolving small random symmetric linear systems on graphics processing units. J Supercomput 73, 1360–1386 (2017). https://doi.org/10.1007/s11227-016-1813-9

  • DOI: https://doi.org/10.1007/s11227-016-1813-9
