Formal Metrics for Large-Scale Parallel Performance
Performance measurement of parallel algorithms is well studied and well understood. However, a flaw in traditional performance metrics is that they rely on comparisons to serial performance with the same input. This comparison is convenient for theoretical complexity analysis but impossible to perform in large-scale empirical studies with data sizes far too large to run on a single serial computer. Consequently, scaling studies currently rely on ad hoc methods that, although effective, have no grounded mathematical models. In this position paper we advocate using a rate-based model that has a concrete meaning relative to speedup and efficiency and that can be used to unify strong and weak scaling studies.
Keywords: Processing Element, Problem Size, Parallel Algorithm, Parallel Performance, Serial Performance
This material is based in part upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program under Award Number 12-015215.
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND 2015-2890 C