ParaPART: Parallel mesh partitioning tool for distributed systems

  • Jian Chen
  • Valerie E. Taylor
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1586)

Abstract

In this paper, we present ParaPART, a parallel version of a mesh partitioning tool, called PART, for distributed systems. PART takes into consideration the heterogeneities in processor performance, network performance, and application computational complexities to achieve a balanced estimate of execution time across the processors in the distributed system. Simulated annealing is used in PART to perform the backtracking search for desired partitions. ParaPART significantly improves the performance of PART by using the asynchronous multiple Markov chain approach of parallel simulated annealing. ParaPART is used to partition six irregular meshes into 8, 16, and 100 subdomains using up to 64 client processors on an IBM SP2 machine. The results show superlinear speedup in most cases and nearly perfect speedup for the rest. Using the partitions from ParaPART, we ran an explicit, 2-D finite element code on two geographically distributed IBM SP machines. The results indicate that ParaPART produces results consistent with those of PART. The execution time was reduced by 12% compared with partitions that consider only processor performance; this is significant given the theoretical upper bound of 15% reduction.
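The multiple Markov chain idea mentioned in the abstract can be illustrated with a small sketch. It is not the ParaPART implementation: the cost model (the slowest processor's load divided by its speed, ignoring network cost), the move rule, the cooling schedule, and all names such as `multi_chain_partition` are assumptions made for illustration, and the chains run one after another in a single process rather than on separate client processors. Each chain runs a short annealing segment and then, without waiting for the others, either publishes its solution as the shared best or adopts the shared best.

```python
import math
import random


def cost(assign, work, speed):
    """Estimated execution time: the largest load/speed ratio over all processors."""
    load = [0.0] * len(speed)
    for elem, proc in enumerate(assign):
        load[proc] += work[elem]
    return max(l / s for l, s in zip(load, speed))


def anneal_segment(assign, work, speed, temp, steps, rng):
    """Run one fixed-temperature Markov chain segment; return the final state and cost."""
    current = list(assign)
    cur_cost = cost(current, work, speed)
    for _ in range(steps):
        elem = rng.randrange(len(current))
        old = current[elem]
        current[elem] = rng.randrange(len(speed))  # move one element to a random part
        new_cost = cost(current, work, speed)
        delta = new_cost - cur_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
        else:
            current[elem] = old  # reject the move and restore the previous state
    return current, cur_cost


def multi_chain_partition(work, speed, chains=4, rounds=30, steps=200,
                          t0=1.0, alpha=0.9, seed=0):
    """Sequential emulation of several annealing chains sharing a global best."""
    rngs = [random.Random(seed + c) for c in range(chains)]
    start = [i % len(speed) for i in range(len(work))]  # round-robin initial partition
    states = [list(start) for _ in range(chains)]
    best, best_cost = list(start), cost(start, work, speed)
    temp = t0
    for _ in range(rounds):
        for c in range(chains):
            states[c], c_cost = anneal_segment(states[c], work, speed,
                                               temp, steps, rngs[c])
            # Exchange step: no barrier, each chain checks the shared best on its own.
            if c_cost < best_cost:
                best, best_cost = list(states[c]), c_cost
            else:
                states[c] = list(best)
        temp *= alpha  # geometric cooling (an assumed schedule)
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical input: 40 unit-work elements on four processors of unequal speed.
    partition, estimate = multi_chain_partition([1.0] * 40, [1.0, 1.0, 2.0, 4.0])
    print(partition, estimate)
```

In a truly asynchronous setting each chain would run on its own processor and access the shared best through message passing or a server process; the loop above only mimics that exchange pattern in a single thread.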

Copyright information

© Springer-Verlag 1999

Authors and Affiliations

  • Jian Chen (1)
  • Valerie E. Taylor (1)
  1. Department of Electrical and Computer Engineering, Northwestern University, Evanston
