Abstract
We present an approach to hybrid MPI/OpenMP parallelization in FETI-DP methods that uses OpenMP together with PETSc+MPI in the finite element assembly and the shared-memory parallel direct solver Pardiso in the FETI-DP solution phase. Our approach thus uses OpenMP parallelization within subdomains and MPI between subdomains. We investigate the efficiency of this approach for a benchmark problem from two-dimensional nonlinear hyperelasticity. We observe good scalability for up to four threads per MPI rank on a state-of-the-art Ivy Bridge architecture and incremental improvements for up to ten OpenMP threads per MPI rank.
Acknowledgements
This work was supported by the German Research Foundation (DFG) through the Priority Programme 1648 “Software for Exascale Computing” (SPPEXA) under KL 2094/4-1, RH 122/2-1, WE 5289/1-1.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Klawonn, A., Lanser, M., Rheinbach, O., Stengel, H., Wellein, G. (2015). Hybrid MPI/OpenMP Parallelization in FETI-DP Methods. In: Mehl, M., Bischoff, M., Schäfer, M. (eds) Recent Trends in Computational Engineering - CE2014. Lecture Notes in Computational Science and Engineering, vol 105. Springer, Cham. https://doi.org/10.1007/978-3-319-22997-3_4
DOI: https://doi.org/10.1007/978-3-319-22997-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22996-6
Online ISBN: 978-3-319-22997-3
eBook Packages: Mathematics and Statistics (R0)