Performance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures
In this paper we investigate the performance-energy balance of a variety of concurrent architectures representative of current technology, from general-purpose and digital signal multicore systems to graphics processors (GPUs). The analysis employs the conjugate gradient method, an important algorithm for the iterative solution of linear systems that consists essentially of the sparse matrix-vector product and other (minor) vector kernels. To allow a fair comparison, we use simple implementations of the numerical method and its underlying kernels, and rely only on the optimizations applied by each target compiler.
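To make the kernel structure described above concrete, the following is a minimal sketch of an unpreconditioned conjugate gradient iteration built on a sparse matrix-vector product. It is an illustration only, not the implementation evaluated in the paper: the CSR storage format, the function names (`spmv_csr`, `dot`, `axpy`, `conjugate_gradient`), the zero initial guess, and the stopping tolerance are all assumptions made for this example.

```python
# Illustrative sketch only: CSR storage, names and tolerance are assumptions,
# not details taken from the paper.

def spmv_csr(row_ptr, col_idx, values, x):
    """Sparse matrix-vector product y = A x, with A stored in CSR format."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def dot(u, v):
    """Inner product of two dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def axpy(alpha, x, y):
    """Return alpha * x + y for dense vectors x and y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def conjugate_gradient(row_ptr, col_idx, values, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (CSR), starting from x = 0.

    Returns the approximate solution and the number of iterations performed.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual r = b - A x, with x = 0
    p = list(r)            # initial search direction
    rs_old = dot(r, r)
    for it in range(max_iter):
        if rs_old ** 0.5 <= tol:
            return x, it
        q = spmv_csr(row_ptr, col_idx, values, p)   # the dominant kernel
        alpha = rs_old / dot(p, q)
        x = axpy(alpha, p, x)                       # x = x + alpha * p
        r = axpy(-alpha, q, r)                      # r = r - alpha * q
        rs_new = dot(r, r)
        p = axpy(rs_new / rs_old, p, r)             # p = r + beta * p
        rs_old = rs_new
    return x, max_iter


if __name__ == "__main__":
    # 3x3 SPD tridiagonal matrix [[4,-1,0],[-1,4,-1],[0,-1,4]] in CSR form.
    row_ptr = [0, 2, 5, 7]
    col_idx = [0, 1, 0, 1, 2, 1, 2]
    values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
    b = [2.0, 4.0, 10.0]   # equals A times [1, 2, 3]
    solution, iterations = conjugate_gradient(row_ptr, col_idx, values, b)
    print(solution, iterations)
```

Each iteration performs one sparse matrix-vector product plus a handful of dot products and vector updates, which is the split between the main kernel and the minor vector kernels that the abstract refers to.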
Keywords: Energy efficiency, High-performance computing, Sparse linear algebra, Multicore processors, Low-power processors, GPUs
This work was supported by the CICYT project TIN2011-23283 and FEDER, and by EU FET grant “EXA2GREEN” 318793.