Hierarchical Partitioning and Dynamic Load Balancing for Scientific Computation

  • James D. Teresco
  • Jamal Faik
  • Joseph E. Flaherty
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3732)

Abstract

Cluster and grid computing have made hierarchical and heterogeneous computing systems increasingly common as target environments for large-scale scientific computation. A cluster may consist of a network of multiprocessors. A grid computation may involve communication across slow interfaces. Modern supercomputers are often large clusters with hierarchical network structures. For maximum efficiency, software must adapt to the computing environment. We focus on partitioning and dynamic load balancing, in particular on hierarchical procedures implemented within the Zoltan Toolkit, guided by DRUM, the Dynamic Resource Utilization Model. Here, different balancing procedures are used in different parts of the computing environment. Preliminary results show that hierarchical partitionings are competitive with the best traditional methods on a small hierarchical cluster.

Keywords

Sandia National Laboratory; Surface Index; Williams College; Callback Function; Heterogeneous Computing System
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • James D. Teresco (1)
  • Jamal Faik (2)
  • Joseph E. Flaherty (2)
  1. Department of Computer Science, Williams College, Williamstown, USA
  2. Department of Computer Science, Rensselaer Polytechnic Institute, Troy, USA
