OpenMP* SIMD Vectorization and Threading of the Elmer Finite Element Software
We describe the design and implementation of hierarchical high-order basis functions with OpenMP* SIMD constructs in the Elmer Finite Element software. We give the rationale for our design decisions and present some of the key challenges encountered during the implementation. Our numerical results on a platform supporting Intel® AVX2 show that the new basis function implementation is 3x to 4x faster than the same code without OpenMP SIMD in use, and 5x to 10x faster than the original Elmer implementation. Our numerical results also show similar speedups for the entire finite element assembly process.
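As a rough illustration of the kind of loop the abstract refers to, the following C sketch evaluates one-dimensional hierarchical bubble functions at a batch of points with an OpenMP `simd` construct on the loop over points. It is not Elmer's code (Elmer is written in Fortran). The function name `line_bubbles`, the basis-major output layout, and the choice of integrated Legendre functions, written here as the unnormalized differences P_j(x) - P_{j-2}(x), are all assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Elmer: evaluate the unnormalized
 * hierarchical bubble functions
 *     phi_j(x) = P_j(x) - P_{j-2}(x),   j = 2..p,
 * at n points x[i] in [-1, 1], where P_j is the Legendre polynomial of
 * degree j. The usual scaling factor 1/sqrt(2(2j-1)) is omitted here.
 * Output is stored basis-major: out[(j-2)*n + i], so consecutive points
 * of one basis function are contiguous in memory. */
void line_bubbles(size_t n, int p, const double *x, double *out)
{
    /* Vectorize across evaluation points; each SIMD lane runs the
     * short three-term recurrence for its own point. */
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        double pm2 = 1.0;   /* P_0(x) */
        double pm1 = x[i];  /* P_1(x) */
        for (int j = 2; j <= p; ++j) {
            /* Legendre recurrence: j P_j = (2j-1) x P_{j-1} - (j-1) P_{j-2} */
            double pj = ((2.0 * j - 1.0) * x[i] * pm1 - (j - 1.0) * pm2) / j;
            out[(size_t)(j - 2) * n + i] = pj - pm2;
            pm2 = pm1;
            pm1 = pj;
        }
    }
}
```

Calling `line_bubbles(n, p, x, out)` with `out` sized for `(p - 1) * n` doubles fills one contiguous row per basis function. Without OpenMP support enabled in the compiler, the pragma is ignored and the loop runs as ordinary scalar code, which is one way to obtain a "same code without OpenMP SIMD" baseline of the kind the abstract compares against.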
Keywords: Finite elements, Basis functions, Implementation, OpenMP SIMD
Thomas Zwinger was supported by the Nordic Centre of Excellence, eSTICC.
Intel and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other brands and names are the property of their respective owners.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.
Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.