Parallel Implementation of the Sherman-Morrison Matrix Inverse Algorithm

  • Xin He
  • Marcus Holm
  • Maya Neytcheva
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7782)

Abstract

We present two parallel strategies for computing the inverse of a dense matrix, based on the so-called Sherman-Morrison algorithm, and demonstrate their efficiency in memory and runtime on multicore CPU and GPU-equipped computers. Our methods are shown to be much more efficient than the direct method for computing the inverse of a nonsingular dense matrix, running up to 12 times faster on the CPU.
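For orientation, the Sherman-Morrison formula gives the inverse of a rank-one update as (A + u vᵀ)⁻¹ = A⁻¹ − (A⁻¹ u vᵀ A⁻¹) / (1 + vᵀ A⁻¹ u). The following is a minimal serial sketch of how repeated rank-one updates can build a dense inverse starting from the identity. It is not the authors' parallel implementation, which the abstract does not detail. The function names are hypothetical, and the sketch assumes every leading principal minor of the matrix is nonzero, since it does no pivoting.

```python
def sherman_morrison_update(a_inv, u, v):
    """Return (A + u v^T)^{-1} given A^{-1}, using the Sherman-Morrison formula."""
    n = len(a_inv)
    # w = A^{-1} u
    w = [sum(a_inv[i][j] * u[j] for j in range(n)) for i in range(n)]
    # z^T = v^T A^{-1}
    z = [sum(v[i] * a_inv[i][j] for i in range(n)) for j in range(n)]
    # Scalar denominator 1 + v^T A^{-1} u
    denom = 1.0 + sum(v[i] * w[i] for i in range(n))
    if abs(denom) < 1e-12:
        raise ZeroDivisionError("rank-one update gives a (numerically) singular matrix")
    return [[a_inv[i][j] - w[i] * z[j] / denom for j in range(n)] for i in range(n)]


def inverse_by_rank_one_updates(a):
    """Invert a square matrix by n successive rank-one updates of the identity.

    Illustrative assumption: step k replaces column k of the current matrix
    (still the unit vector e_k) with column k of `a`, i.e. u = a_k - e_k, v = e_k.
    """
    n = len(a)
    inv = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        u = [a[i][k] - (1.0 if i == k else 0.0) for i in range(n)]
        v = [1.0 if i == k else 0.0 for i in range(n)]
        inv = sherman_morrison_update(inv, u, v)
    return inv
```

Each update costs O(n²) operations, so the full inverse costs O(n³) in this serial form. The matrix-vector and outer-product steps inside each update are the kind of work that lends itself to parallel execution.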

Keywords

  • Block Size
  • Parallel Implementation
  • Memory Consumption
  • Memory Footprint
  • Multicore Machine
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Xin He (1)
  • Marcus Holm (1)
  • Maya Neytcheva (1)

  1. Department of Information Technology, Uppsala University, Sweden
