PARA 2012: Applied Parallel and Scientific Computing, pp. 206–219
Parallel Implementation of the Sherman-Morrison Matrix Inverse Algorithm
Conference paper
Abstract
We present two parallel strategies for computing the inverse of a dense matrix, both based on the Sherman-Morrison algorithm, and demonstrate their efficiency in memory and runtime on multicore CPU and GPU-equipped computers. Our methods are shown to be much more efficient than the direct method for inverting a nonsingular dense matrix, running up to 12 times faster on the CPU.
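For orientation, the Sherman-Morrison formula updates a known inverse after a rank-one change: (B + u vᵀ)⁻¹ = B⁻¹ − (B⁻¹u)(vᵀB⁻¹) / (1 + vᵀB⁻¹u). The sketch below is a serial, textbook-style illustration of how this formula can build a full inverse by starting from the identity and swapping in one column of the matrix per step. It is not the parallel scheme of the paper, whose details are not given on this page. The names `sherman_morrison_inverse` and `matmul` are hypothetical, and the sketch assumes every leading principal minor of the matrix is nonzero, since it does no pivoting.

```python
def matmul(x, y):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def sherman_morrison_inverse(a, tol=1e-12):
    """Invert the square matrix `a` with n rank-one Sherman-Morrison updates.

    Illustrative sketch only (assumption: no pivoting, so each intermediate
    matrix must be nonsingular, i.e. all leading principal minors of `a`
    are nonzero).
    """
    n = len(a)
    # Start from B = I, whose inverse is also I.
    inv = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        # Replacing column k of B by column k of `a` is the rank-one update
        # B + u e_k^T with u = a[:, k] - e_k.
        u = [a[i][k] - (1.0 if i == k else 0.0) for i in range(n)]
        # w = B^{-1} u
        w = [sum(inv[i][j] * u[j] for j in range(n)) for i in range(n)]
        # e_k^T B^{-1} is simply row k of the current inverse.
        row_k = inv[k][:]
        # Denominator 1 + e_k^T B^{-1} u reduces to 1 + w[k].
        denom = 1.0 + w[k]
        if abs(denom) < tol:
            raise ZeroDivisionError(
                "update %d is singular; this sketch does not pivot" % k)
        # B^{-1} <- B^{-1} - w * row_k / denom
        for i in range(n):
            factor = w[i] / denom
            for j in range(n):
                inv[i][j] -= factor * row_k[j]
    return inv


if __name__ == "__main__":
    a = [[4.0, 1.0, 2.0],
         [1.0, 5.0, 1.0],
         [2.0, 1.0, 6.0]]
    inv = sherman_morrison_inverse(a)
    # The product should be close to the identity matrix.
    for row in matmul(a, inv):
        print(["%.6f" % value for value in row])
```

Each step costs O(n²) operations, so the whole loop is O(n³), comparable to a direct inversion. The inner update is a rank-one outer product over the full matrix, which is the kind of regular, data-parallel work that maps naturally onto multicore CPUs and GPUs.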
Keywords
Block Size, Parallel Implementation, Memory Consumption, Memory Footprint, Multicore Machine
These keywords were added by machine and not by the authors.
Copyright information
© Springer-Verlag Berlin Heidelberg 2013