An implementation of a tree-based N-body algorithm on message-passing architectures
In this paper, a “manager-worker” model for a parallel implementation of a hierarchical N-body algorithm is introduced. We describe a load-balanced, extremely simple algorithm for astrophysical N-body simulations using tree-based data structures on coarse-grained parallel architectures. This algorithm, based on the Barnes-Hut method, first assembles a tree data structure that represents the distribution of bodies, or particles, at all length scales. A domain decomposition, or adaptive load-balancing, technique is then used to assign bodies to processors and to ensure that processors are assigned equal amounts of work. The problem of load balancing for parallelized particle simulations on MIMD machines is thus addressed, and a simple dynamic load-balancing algorithm, called costzones, is employed. A number of measurements were carried out to reveal the behavior of the N-body application with respect to the partitioning technique and the load-imbalance overhead. We also show that, using the introduced manager-worker model and the costzones domain decomposition technique, the algorithm is load balanced and that the majority of its time is spent performing on-processor functions rather than inter-processor communication. We have conducted our study on several MIMD machines: the 256-processor Cray T3D and the 64-processor Intel Paragon at JPL/ESS-NASA, and the 32-processor Thinking Machines CM5 at UMC.
Keywords: Astrophysics Simulations, Barnes-Hut Algorithm, Dynamic Load Balancing, MIMD Machines, Performance Analysis
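The costzones idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the bodies are already listed in tree-traversal order and that each body carries a work estimate (for example, the number of interactions it needed in the previous time step). The function names `costzones` and `zone_loads` are hypothetical. Each body is assigned to the contiguous "zone" of cumulative cost that contains the midpoint of its own cost interval, so every processor receives roughly the same total cost.

```python
def costzones(costs, num_procs):
    """Assign bodies to processors by splitting cumulative cost into equal zones.

    costs[i] is the (assumed) work estimate for body i, with bodies listed in
    tree-traversal order. Returns owner[i], a processor index in range(num_procs).
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    total = sum(costs)
    if total <= 0:
        raise ValueError("total cost must be positive")

    owners = []
    running = 0.0  # cumulative cost of all bodies before the current one
    for c in costs:
        # Body i covers the interval [running, running + c) of cumulative cost;
        # it goes to the zone that contains the midpoint of that interval.
        midpoint = running + c / 2.0
        zone = int(midpoint * num_procs / total)
        owners.append(min(zone, num_procs - 1))
        running += c
    return owners


def zone_loads(costs, owners, num_procs):
    """Sum the cost assigned to each processor (used to check the balance)."""
    loads = [0] * num_procs
    for c, p in zip(costs, owners):
        loads[p] += c
    return loads


if __name__ == "__main__":
    costs = [1, 1, 1, 1, 4, 4]          # hypothetical per-body work estimates
    owners = costzones(costs, 3)
    print(owners)                        # processor index for each body
    print(zone_loads(costs, owners, 3))  # total cost per processor
```

Because the zones are contiguous in traversal order, bodies that are close in the tree tend to land on the same processor, which is what keeps most of the work on-processor in a scheme of this kind.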