Numerical analysis of parallel implementation of the reorthogonalized ABS methods
Solving systems of equations is a critical step in many computational tasks. We recently published a novel ABS-based reorthogonalization algorithm for computing the QR factorization. Experiments on the Matlab 2015a platform showed that this algorithm computed the rank of the coefficient matrix, the orthogonal bases and the QR factorization more accurately than the built-in Matlab functions rank and qr. However, the reorthogonalization process significantly increased the computational cost. We therefore tested a way to accelerate the algorithm by implementing it on different parallel platforms, using Matlab’s Parallel Computing Toolbox and the Accelerated Massive Parallelism (AMP) runtime library. We tested various matrices, including Pascal, Vandermonde and randomly generated dense matrices. The performance of the parallel algorithm was measured by the speed-up factor, defined as the fold reduction in execution time relative to the sequential algorithm. For comparison, we also tested a parallel implementation of the classical Gram–Schmidt algorithm incorporating a reorthogonalization step. The results show that the achieved speed-up is significant and that the performance of the parallel algorithm improves as the number of equations grows. They indicate that the reorthogonalized ABS algorithm is practical and efficient, which expands the practical usefulness of our algorithms.
Keywords: ABS methods, Rank-revealing QR, Reorthogonalization, Parallel computing
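To make the comparison baseline and the performance metric from the abstract concrete, the following is a minimal sequential sketch of classical Gram–Schmidt with one reorthogonalization pass per column, together with the speed-up factor computed as sequential time divided by parallel time. It is not the ABS-based algorithm and not the authors' implementation: the function names (`cgs2_qr`, `speed_up`), the relative tolerance `tol`, and the choice of Python instead of Matlab or AMP are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def cgs2_qr(a, tol=1e-10):
    """Classical Gram-Schmidt with one reorthogonalization pass (sketch).

    `a` is an m x n matrix given as a list of rows. Returns (q, r, rank):
    q is a list of orthonormal vectors (the columns of Q), r is a list of
    rows of R (one per accepted vector), and rank counts accepted columns.
    `tol` is an assumed relative threshold for declaring a column dependent.
    """
    m, n = len(a), len(a[0])
    q, r = [], []
    for j in range(n):
        v = [a[i][j] for i in range(m)]
        norm_in = math.sqrt(dot(v, v))
        coeffs = [0.0] * len(q)
        for _ in range(2):  # the second pass is the reorthogonalization
            # Classical variant: all projections use the same vector v.
            c = [dot(qk, v) for qk in q]
            for k, qk in enumerate(q):
                v = [vi - c[k] * qi for vi, qi in zip(v, qk)]
                coeffs[k] += c[k]
        for k in range(len(q)):
            r[k][j] = coeffs[k]
        norm = math.sqrt(dot(v, v))
        if norm > tol * norm_in:  # column adds a new direction
            q.append([vi / norm for vi in v])
            row = [0.0] * n
            row[j] = norm
            r.append(row)
    return q, r, len(q)


def speed_up(t_sequential, t_parallel):
    """Fold reduction of execution time relative to the sequential run."""
    return t_sequential / t_parallel


if __name__ == "__main__":
    # 4 x 4 Pascal matrix, one of the matrix families named in the abstract.
    pascal = [[1.0, 1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0, 4.0],
              [1.0, 3.0, 6.0, 10.0],
              [1.0, 4.0, 10.0, 20.0]]
    q, r, rank = cgs2_qr(pascal)
    print("estimated rank:", rank)
    print("speed-up for 10.0 s vs 2.5 s:", speed_up(10.0, 2.5))
```

In this sketch the inner products in each pass are independent of one another, which is the property that typically makes the classical variant attractive for parallel execution. The second pass is what adds the extra cost that motivates acceleration.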
The authors would like to thank József Abaffy and Attila Mócsai for helpful suggestions.