Abstract
Parallel simulations at extreme scale require that the mesh be distributed across a large number of processors with equal work load and minimal inter-part communication. A number of algorithms have been developed to meet these goals, and graph/hypergraph-based methods are by far the most powerful. However, global implementations of current approaches can fail at very large core counts, and the vertex imbalance is not optimal when individual cores are lightly loaded. These issues are resolved by a combination of global and local partitioning and an iterative improvement algorithm, LIIPBMod, developed in a previous study (Zhou et al. in SIAM J. Sci. Comput. 32:3201–3227, 2010). In the current work, this combined partitioning strategy is applied to simulations at extreme scale, with up to O(10^10) elements and up to O(300K) cores. Strong scaling studies on IBM BlueGene/P and Cray XT5 systems demonstrate the effectiveness of the combined partitioning algorithm.
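To make the notion of vertex imbalance concrete, the following is a minimal sketch of how element and vertex imbalance might be measured for a given partition. It is not the LIIPBMod algorithm and is not taken from the paper. It assumes that imbalance is defined as the peak per-part count divided by the average per-part count, and that a vertex shared by several parts is counted once in each of them. The function name partition_imbalance and the toy mesh are hypothetical.

```python
from collections import defaultdict

def partition_imbalance(elements, part_of):
    """Return (element_imbalance, vertex_imbalance) for a partitioned mesh.

    elements: list of tuples of vertex ids, one tuple per mesh element.
    part_of:  list giving the part id assigned to each element.
    Assumption: imbalance = max per-part count / mean per-part count,
    taken over the parts that own at least one element.
    """
    elems_per_part = defaultdict(int)
    verts_per_part = defaultdict(set)
    for elem, part in zip(elements, part_of):
        elems_per_part[part] += 1
        # A vertex on a part boundary ends up in the set of every part touching it.
        verts_per_part[part].update(elem)

    def imbalance(counts):
        counts = list(counts)
        return max(counts) / (sum(counts) / len(counts))

    return (imbalance(elems_per_part.values()),
            imbalance(len(v) for v in verts_per_part.values()))

# Toy mesh of four triangles split over two parts (hypothetical data).
# Part 0 holds two triangles sharing an edge; part 1 holds two disjoint triangles.
elements = [(0, 1, 2), (0, 2, 3), (2, 3, 4), (5, 6, 7)]
part_of = [0, 0, 1, 1]
elem_imb, vert_imb = partition_imbalance(elements, part_of)
print(elem_imb, vert_imb)
```

In this toy case both parts hold two elements, so the element counts are perfectly balanced, yet part 1 touches six vertices against four for part 0. A partition that balances elements alone can therefore still leave the vertex load uneven, which is the situation the abstract describes for lightly loaded cores.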
References
Boman E, Devine K, Fisk LA, Heaphy R, Hendrickson B, Leung V, Vaughan C, Catalyurek U, Bozdag D, Mitchell W (1999) Zoltan home page. http://www.cs.sandia.gov/Zoltan
Bui T, Jones C (1993) A heuristic for reducing fill in sparse matrix factorization. In: Proceedings of the 6th SIAM conference on parallel processing for scientific computing. SIAM, Philadelphia, pp 445–452
Çatalyürek ÜV, Aykanat C (1999) PaToH: a multilevel hypergraph partitioning tool, version 3.0. Bilkent University, Department of Computer Engineering, Ankara, 06533 Turkey. PaToH is available at http://bmi.osu.edu/umit/software.htm
Devine KD, Boman EG, Heaphy RT, Bisseling RH, Catalyurek UV (2006) Parallel hypergraph partitioning for scientific computing. In: Proceedings of 20th international parallel and distributed processing symposium (IPDPS’06). IEEE, New York
Hendrickson B, Leland R (1995) A multilevel algorithm for partitioning graphs. In: Proceedings of supercomputing ’95, December 1995. ACM, New York
Jansen KE, Whiting CH, Hulbert GM (1999) A generalized-α method for integrating the filtered Navier–Stokes equations with a stabilized finite element method. Comput Methods Appl Mech Eng 190:305–319
Karypis G, Kumar V (1996) A parallel algorithm for multilevel graph partitioning and sparse matrix ordering. In: 10th international parallel processing symposium, pp 314–319
Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392
Karypis G, Kumar V (1998) Multilevel algorithms for multi-constraint graph partitioning. In: Proceedings of the 1998 ACM/IEEE conference on supercomputing, pp 1–13
Sahni O, Müller Y, Jansen KE, Shephard MS, Taylor CA (2006) Efficient anisotropic adaptive discretization of the cardiovascular system. Comput Methods Appl Mech Eng 195:5634–5655
Sahni O, Zhou M, Shephard MS, Jansen KE (2009) Scalable implicit finite element solver for massively parallel processing with demonstration to 160k cores. In: Proceedings of IEEE/ACM SC’09, Portland, OR, USA, November 2009. Finalist paper for the Gordon Bell prize
Schloegel K, Karypis G, Kumar V (2002) Parallel static and dynamic multiconstraint graph partitioning. Concurr Comput, Pract Exp 14(3):219–240
Teresco JD, Devine KD, Flaherty JE (2005) Partitioning and dynamic load balancing for the numerical solution of partial differential equations. In: Numerical solution of partial differential equations on parallel computers. Springer, Berlin
Trifunovic A, Knottenbelt WJ (2004) Parkway 2.0: a parallel multilevel hypergraph partitioning tool. In: Proceedings of 19th international symposium on computer and information sciences (ISCIS 2004). LNCS, vol 3280. Springer, Berlin, pp 789–800
Vignon-Clementel IE, Figueroa CA, Jansen KE, Taylor CA (2006) Outflow boundary conditions for three-dimensional finite element modeling of blood flow and pressure in arteries. Comput Methods Appl Mech Eng 195(29–32):3776–3796
Whiting CH, Jansen KE (2001) A stabilized finite element method for the incompressible Navier–Stokes equations using a hierarchical basis. Int J Numer Methods Fluids 35:93–116
Womersley J (1955) Method for the calculation of velocity, rate of flow and viscous drag in arteries when the pressure gradient is known. J Physiol 127:553–563
Zhou M (2009) Petascale adaptive computational fluid dynamics. PhD thesis, Rensselaer Polytechnic Institute, August 2009
Zhou M, Sahni O, Devine KD, Shephard MS, Jansen KE (2010) Controlling unstructured mesh partitions for massively parallel simulations. SIAM J Sci Comput 32:3201–3227
Zhou M, Sahni O, Shephard MS, Carothers CD, Jansen KE (2010) Adjacency-based data reordering algorithm for acceleration of finite element computations. Sci Program 18:107–123
Cite this article
Zhou, M., Sahni, O., Xie, T. et al. Unstructured mesh partition improvement for implicit finite element at extreme scale. J Supercomput 59, 1218–1228 (2012). https://doi.org/10.1007/s11227-010-0521-0
DOI: https://doi.org/10.1007/s11227-010-0521-0