Performance analysis of asynchronous parallel Jacobi
The directed acyclic graph (DAG) associated with a parallel algorithm captures the partial order in which separate local computations are completed and how their outputs are subsequently used in further computations. Unlike in a synchronous parallel algorithm, the DAG associated with an asynchronous parallel algorithm is not predetermined. Instead, it is a product of the asynchronous timing dynamics of the machine and cannot be known in advance; as such, it is best thought of as a pseudorandom variable. In this paper, we present a formalism for analyzing the performance of asynchronous parallel Jacobi’s method in terms of its DAG. We use this approach to prove error bounds and bounds on the rate of convergence. The rate-of-convergence bound is based on the statistical properties of the DAG and is valid for systems with a non-negative iteration matrix. We support our theoretical results with a suite of numerical examples, in which we compare the performance of synchronous and asynchronous parallel Jacobi against certain statistical properties of the DAGs associated with the computations. We also present examples of small matrices with elements of mixed sign, which demonstrate that determining whether a system will converge under asynchronous iteration in this more general setting is a far harder problem.
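To make the distinction concrete, the following sketch contrasts synchronous Jacobi, where every component is updated from the same previous iterate, with a toy model of asynchronous Jacobi (chaotic relaxation), where a randomly chosen component is updated using whatever values the others currently hold. This is an illustrative simplification, not the paper's formalism: the test matrix, the single-component update schedule, and the function names are our own assumptions, chosen so that the off-diagonal signs yield a non-negative iteration matrix, the setting in which the paper's rate bound applies.

```python
import numpy as np

def jacobi_sync(A, b, x0, iters):
    """Synchronous Jacobi: all components are updated together
    from the same previous iterate x_k."""
    D = np.diag(A)
    R = A - np.diag(D)
    x = x0.copy()
    for _ in range(iters):
        x = (b - R @ x) / D
    return x

def jacobi_async_toy(A, b, x0, steps, rng):
    """Toy asynchronous Jacobi (chaotic relaxation): each step
    updates one randomly chosen component in place, using the
    current (possibly 'stale'-looking) values of the others."""
    D = np.diag(A)
    n = len(b)
    x = x0.copy()
    for _ in range(steps):
        i = rng.integers(n)
        x[i] = (b[i] - (A[i] @ x - A[i, i] * x[i])) / D[i]
    return x

# Strictly diagonally dominant matrix with non-positive off-diagonals,
# so the Jacobi iteration matrix M = D^{-1}(D - A) is non-negative.
A = np.array([[4.0, -1.0, -1.0],
              [-1.0, 5.0, -2.0],
              [-1.0, -2.0, 6.0]])
b = np.array([2.0, 2.0, 3.0])
x_true = np.linalg.solve(A, b)

x_s = jacobi_sync(A, b, np.zeros(3), 100)
x_a = jacobi_async_toy(A, b, np.zeros(3), 500, np.random.default_rng(0))
```

In this toy model the DAG of the asynchronous run is determined by the random update sequence: each update of component `i` depends on the most recent updates of the components coupled to it through `A`, so different random seeds produce different DAGs and different convergence histories.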
Keywords: Asynchronous parallel Jacobi’s method · Chaotic iterations · Parallel algorithm performance
The first author was supported by Engineering and Physical Sciences Research Council (EPSRC) grant EP/I005293 “Nonlinear Eigenvalue Problems: Theory and Numerics.” The second author was supported by EPSRC grant EP/I006702/1 “Novel Asynchronous Algorithms and Software for Large Sparse Systems.” This research made use of the Balena High Performance Computing (HPC) Service at the University of Bath.
We would like to thank our two referees, whose valuable comments and suggestions have considerably improved the original manuscript over several rounds of reviewing. We would also like to thank Dr Mark Muldoon (University of Manchester, UK) for many useful discussions.