Abstract

We present an approach to hybrid MPI/OpenMP parallelization in FETI-DP methods. It uses OpenMP together with PETSc+MPI in the finite element assembly and the shared-memory parallel direct solver Pardiso in the FETI-DP solution phase. The approach thus applies OpenMP parallelization within subdomains and MPI between subdomains. We investigate its efficiency for a benchmark problem from two-dimensional nonlinear hyperelasticity. On a state-of-the-art Ivy Bridge architecture, we observe good scalability for up to four OpenMP threads per MPI rank and incremental improvements for up to ten OpenMP threads per MPI rank.
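
As a rough illustration of the two levels of parallelism described above, the following sketch splits a fixed number of cores into MPI ranks and OpenMP threads per rank and evaluates thread speedup and efficiency using their standard definitions. It is not taken from the chapter: the names `HybridLayout`, `layout_for`, `thread_speedup`, and `thread_efficiency` are hypothetical, and the core count and timings in the usage lines are placeholders, not measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridLayout:
    """Hypothetical description of a hybrid run (not from the chapter)."""
    mpi_ranks: int          # MPI level: communication between subdomains
    threads_per_rank: int   # OpenMP level: threads working within a rank

    @property
    def cores(self) -> int:
        # Total cores occupied when every thread is pinned to its own core.
        return self.mpi_ranks * self.threads_per_rank


def layout_for(total_cores: int, threads_per_rank: int) -> HybridLayout:
    """Split a fixed core budget into MPI ranks and OpenMP threads."""
    if threads_per_rank < 1 or total_cores % threads_per_rank != 0:
        raise ValueError("threads_per_rank must be positive and divide total_cores")
    return HybridLayout(total_cores // threads_per_rank, threads_per_rank)


def thread_speedup(t_one_thread: float, t_n_threads: float) -> float:
    """Speedup of an n-thread run over the single-thread run of the same rank."""
    return t_one_thread / t_n_threads


def thread_efficiency(t_one_thread: float, t_n_threads: float, n_threads: int) -> float:
    """Parallel efficiency: speedup divided by the number of threads."""
    return thread_speedup(t_one_thread, t_n_threads) / n_threads


# Placeholder values for illustration only (assumed, not measured).
layout = layout_for(total_cores=20, threads_per_rank=10)
efficiency_4 = thread_efficiency(t_one_thread=8.0, t_n_threads=2.0, n_threads=4)
efficiency_10 = thread_efficiency(t_one_thread=8.0, t_n_threads=1.6, n_threads=10)
```

An efficiency close to 1 corresponds to the kind of good thread scalability the abstract reports for small thread counts, while values that drop as the thread count grows correspond to incremental improvements only.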

Acknowledgements

This work was supported by the German Research Foundation (DFG) through the Priority Programme 1648 “Software for Exascale Computing” (SPPEXA) under KL 2094/4-1, RH 122/2-1, WE 5289/1-1.

Author information

Corresponding author

Correspondence to Martin Lanser.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Klawonn, A., Lanser, M., Rheinbach, O., Stengel, H., Wellein, G. (2015). Hybrid MPI/OpenMP Parallelization in FETI-DP Methods. In: Mehl, M., Bischoff, M., Schäfer, M. (eds) Recent Trends in Computational Engineering - CE2014. Lecture Notes in Computational Science and Engineering, vol 105. Springer, Cham. https://doi.org/10.1007/978-3-319-22997-3_4
