The Impact of Voltage-Frequency Scaling for the Matrix-Vector Product on the IBM POWER8

  • Sandra Catalán
  • A. Cristiano I. Malossi
  • Costas Bekas
  • Enrique S. Quintana-Ortí
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9833)

Abstract

The physical limitations of CMOS miniaturization have made understanding the interplay between performance and energy a primary challenge. In this paper we contribute towards this goal by assessing the effect of voltage and frequency scaling (VFS) on the energy consumption of the dense and sparse matrix-vector products. Optimizing the sparse kernel, from the perspective of both performance and energy efficiency, is especially difficult due to its irregular memory access pattern, but the potential benefits are considerable because the kernel appears in a wide range of applications.
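To illustrate what "irregular memory access pattern" means for the sparse kernel, the following minimal sketch computes a sparse matrix-vector product in plain Python. It assumes the matrix is stored in the common compressed sparse row (CSR) layout; the abstract does not state which storage format the paper uses, and the names `csr_matvec`, `values`, `col_idx`, and `row_ptr` are illustrative only.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR form (assumed layout)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # col_idx[k] depends on the sparsity pattern, so the reads of x
            # jump around in memory instead of streaming through it.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Small 3x3 example: A = [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]
y = csr_matvec(values, col_idx, row_ptr, x)
```

In the dense product every row reads all of `x` in order, whereas here the indirection through `col_idx` decides which entries of `x` are touched, which is why the behaviour of the sparse kernel depends strongly on the structure of each matrix.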

Our experiments with a small synthetic training set show that it is possible to build a general classification of sparse matrices that determines the optimal VFS level from the point of view of energy efficiency. More importantly, this characterization can be leveraged to tune VFS for a major portion of the University of Florida Sparse Matrix Collection, when executed on the IBM POWER8, yielding significant gains with respect to a (power-hungry) configuration that simply favours performance.

Keywords

Energy efficiency; Voltage-frequency scaling; Performance prediction; Performance metrics; Matrix-vector product; IBM POWER8

Notes

Acknowledgements

This work was supported by project Exa2Green (under grant agreement no. 318793) of the Future and Emerging Technologies (FET) programme within the ICT theme of the Seventh Framework Programme for Research (FP7/2007–2013) of the European Commission. The researchers from Universidad Jaume I were supported by project TIN2014-53495-R of the MINECO and FEDER, and the FPU program of MECD.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Sandra Catalán (1)
  • A. Cristiano I. Malossi (2)
  • Costas Bekas (2)
  • Enrique S. Quintana-Ortí (1)

  1. Dpto. de Ingeniería y Ciencia de Computadores, Universidad Jaume I, Castellón, Spain
  2. IBM Research–Zurich, Foundations of Cognitive Solutions, Rüschlikon, Switzerland
