Using Simulation to Evaluate and Tune the Performance of Dynamic Load Balancing of an Over-Decomposed Geophysics Application
Finite difference methods are commonplace in scientific computing. Despite their apparent regularity, they often exhibit load imbalance that damages their efficiency. We characterize the spatial and temporal load imbalance of Ondes3D, a seismic wave propagation simulator. We reveal that this imbalance originates from the nature of the input data and from low-level CPU optimizations. Such dynamic imbalance should therefore be quite common, and it cannot be addressed by any static approach or classical code reorganization. An effective solution, requiring few code modifications, combines domain over-decomposition and dynamic load balancing (e.g., with AMPI), migrating data and computation at the granularity of an MPI rank. This solution generally requires careful tuning of the over-decomposition level, the load balancing heuristic, and the load balancing frequency. These choices depend strongly on application and platform characteristics. In this paper, we propose a methodology that leverages the capabilities of the SimGrid framework to conduct such a study at low experimental cost. The methodology combines emulation, simulation, and application modeling; it requires minimal code modification yet captures both spatial and temporal load imbalance and faithfully predicts the application's overall performance. We compare simulation results with real executions and show how our strategy can be used to determine the best load balancing configuration for a given application/hardware configuration.
Keywords: Load balancing and over-decomposition, Performance prediction, Simulation, Geophysics, FDM application
We thank CAPES/Cofecub 764-13, FAPERGS/Inria ExaSE, FAPERGS Green-Cloud, CNPq 447311/2014-0, and CNRS/LICIA Intl. Lab for their support. This work also received support from the EU H2020 Programme and from MCTI/RNP-Brazil under the HPC4E Project, grant 689772. Some experiments were carried out at the Grid’5000 platform (https://www.grid5000.fr), with support from Inria, CNRS, RENATER and several other organizations.