The computation of chemical reaction rates is commonly the performance bottleneck in CFD simulations of turbulent combustion with detailed chemistry. Therefore, an optimization method is used in which C++ source code is automatically generated for arbitrary reaction mechanisms. The generated code is highly optimized for the chosen mechanism and contains all routines for computing chemical reaction rates. In this work, the serial performance of the automatically generated source code, which in an earlier work used only ISO C++, is further improved by utilizing two compiler extensions: restrict and __builtin_assume_aligned. Introducing these two extensions into the generated code reduces the time for computing chemical reaction rates by up to 50% for the investigated reaction mechanism and the total simulation time by up to 25%. Compared to OpenFOAM’s standard chemistry implementation, the new code is faster by a factor of 10. This work discusses the effect of the two compiler extensions on performance by examining two specific kernel functions from the automatically generated code and the assembly that the gcc and Intel compilers generate for them. The newly optimized code is then used to evaluate the performance gain in a large-scale parallel case, which simulates an experimentally investigated laboratory-scale turbulent flame on 14,400 CPU cores of the Hazel Hen cluster at HLRS. In the simulation, no combustion models are used and the flame is resolved down to the smallest length scales. With this approach, the simulation agrees very well with the measured data. Using the optimized code including the compiler extensions, the total simulation time decreases by 20% compared to the same code without compiler extensions. A comprehensive database of the simulation results has been assembled and will consist of 10 TB of 3D and 2D transient field variables.
- Node-level performance optimization
- Automated code generation
- Turbulent combustion
https://godbolt.org/g/sUKL2j: vectorized loop due to use of restrict.
https://godbolt.org/g/CWobvD: not vectorized without restrict.
https://godbolt.org/g/95smzt: smaller code size for gcc when alignment info is given.
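The effect described in these notes can be illustrated with a minimal sketch. It is not taken from the automatically generated chemistry code: the kernel fused_multiply_add, the helper alloc_aligned_doubles, and the 64-byte alignment are assumptions chosen for illustration. Since restrict is a C99 keyword and not part of ISO C++, the sketch uses the __restrict__ spelling accepted by gcc and compatible compilers. Whether a given compiler actually vectorizes the loop depends on its version and flags, so the assembly should be inspected, for example with Compiler Explorer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed alignment in bytes (one cache line on many x86 CPUs).
constexpr std::size_t kAlign = 64;

// Hypothetical helper: allocates n doubles on a kAlign-byte boundary.
// std::aligned_alloc (C++17) requires the size to be a multiple of the
// alignment, so the byte count is rounded up. Release with std::free.
double* alloc_aligned_doubles(std::size_t n)
{
    std::size_t bytes = n * sizeof(double);
    bytes = (bytes + kAlign - 1) / kAlign * kAlign;
    return static_cast<double*>(std::aligned_alloc(kAlign, bytes));
}

// Hypothetical kernel: out[i] = a[i] * b[i] + c[i].
// __restrict__ promises the compiler that the arrays do not overlap,
// so it does not have to assume that a store to out[i] may change a, b or c.
void fused_multiply_add(double* __restrict__ out,
                        const double* __restrict__ a,
                        const double* __restrict__ b,
                        const double* __restrict__ c,
                        std::size_t n)
{
    // The alignment promise below is only valid if it is actually true;
    // check it in debug builds.
    assert(reinterpret_cast<std::uintptr_t>(out) % kAlign == 0);
    assert(reinterpret_cast<std::uintptr_t>(a) % kAlign == 0);
    assert(reinterpret_cast<std::uintptr_t>(b) % kAlign == 0);
    assert(reinterpret_cast<std::uintptr_t>(c) % kAlign == 0);

    // __builtin_assume_aligned returns a pointer the compiler may treat as
    // kAlign-byte aligned, which can remove alignment peeling code.
    double* __restrict__ o =
        static_cast<double*>(__builtin_assume_aligned(out, kAlign));
    const double* __restrict__ x =
        static_cast<const double*>(__builtin_assume_aligned(a, kAlign));
    const double* __restrict__ y =
        static_cast<const double*>(__builtin_assume_aligned(b, kAlign));
    const double* __restrict__ z =
        static_cast<const double*>(__builtin_assume_aligned(c, kAlign));

    for (std::size_t i = 0; i < n; ++i)
        o[i] = x[i] * y[i] + z[i];
}
```

Both extensions are promises rather than checks: passing overlapping or misaligned arrays to such a function is undefined behavior, which is why the sketch verifies the alignment with assert before relying on it.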
This work was supported by the Helmholtz Association of German Research Centres (HGF) through the Research Unit EMR. This work was performed on the national supercomputer Cray XC40 Hazel Hen at the High Performance Computing Center Stuttgart (HLRS) and on the computational resource ForHLR II with the acronym Cnoise funded by the Ministry of Science, Research and the Arts Baden-Württemberg and DFG (“Deutsche Forschungsgemeinschaft”).
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zirwes, T., Zhang, F., Denev, J.A., Habisreuther, P., Bockhorn, H., Trimis, D. (2019). Improved Vectorization for Efficient Chemistry Computations in OpenFOAM for Large Scale Combustion Simulations. In: Nagel, W., Kröner, D., Resch, M. (eds) High Performance Computing in Science and Engineering ’18. Springer, Cham. https://doi.org/10.1007/978-3-030-13325-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13324-5
Online ISBN: 978-3-030-13325-2
eBook Packages: Mathematics and Statistics (R0)