Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems
In this paper, we analyze the potential of using weights for block-asynchronous relaxation methods on GPUs. For this purpose, we introduce different weighting techniques similar to those applied in block smoothers for multigrid methods. For test matrices taken from the University of Florida Matrix Collection, we report the convergence behavior and the total runtime of the different techniques. The results show that using weights may considerably accelerate the convergence of block-asynchronous iteration. While component-wise relaxation methods are seldom applied directly to systems of linear equations, they often make an important contribution to finite element solvers when used as smoothers in a multigrid framework. Since the parallelization potential of classical smoothers such as SOR and Gauss-Seidel is usually very limited, replacing them with weighted block-asynchronous smoothers may benefit overall multigrid performance. With the increasing heterogeneity of today’s architecture designs, the significance of and the need for highly parallel asynchronous smoothers are expected to grow.
Keywords: asynchronous relaxation, weighted block-asynchronous iteration methods, multigrid smoother, GPU
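To make the idea of a weighted block relaxation concrete, the following is a minimal sequential sketch in Python. It is not the GPU scheme studied in the paper. It assumes a single scalar damping weight `omega` applied to Jacobi-type updates, and it emulates asynchronous reads by letting each block see a snapshot of the other blocks' values taken at the start of a global iteration. The function names, the parameters (`block_size`, `local_sweeps`, `global_iters`), and the small test matrix are all hypothetical and chosen only for illustration.

```python
def weighted_block_relaxation(A, b, x, block_size, omega, local_sweeps, global_iters):
    """Sequential emulation of a weighted block relaxation (illustrative only).

    Each block performs `local_sweeps` damped Jacobi sweeps on its own
    unknowns. Values owned by other blocks are read from a snapshot taken at
    the start of the global iteration, a stand-in for possibly stale reads.
    """
    n = len(b)
    x = list(x)
    for _ in range(global_iters):
        snapshot = list(x)  # values the other blocks are assumed to expose
        for start in range(0, n, block_size):
            block = range(start, min(start + block_size, n))
            local = {i: snapshot[i] for i in block}
            for _ in range(local_sweeps):
                updated = {}
                for i in block:
                    # Off-diagonal sum: current values inside the block,
                    # snapshot values outside of it.
                    s = sum(A[i][j] * (local[j] if j in local else snapshot[j])
                            for j in range(n) if j != i)
                    jacobi = (b[i] - s) / A[i][i]
                    # Weighted update: blend the old value with the Jacobi value.
                    updated[i] = (1.0 - omega) * local[i] + omega * jacobi
                local = updated
            for i in block:
                x[i] = local[i]
    return x


def residual_norm(A, b, x):
    """Maximum norm of the residual b - A x."""
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(len(x))))
               for i in range(len(b)))


# Small strictly diagonally dominant test system (assumed for illustration).
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
b = [1.0, 2.0, 2.0, 1.0]
x = weighted_block_relaxation(A, b, [0.0] * 4, block_size=2,
                              omega=0.9, local_sweeps=3, global_iters=30)
```

Setting `omega=1.0` recovers the unweighted update in this sketch. On a real GPU the blocks would run concurrently without the explicit snapshot, so the values read from other blocks would depend on scheduling.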