Topology-Aware Communication in Wide-Area Message-Passing

  • Craig A. Lee
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2840)

Abstract

This position paper examines the use of topology-aware communication services to support message passing in wide-area distributed environments, i.e., grids. Grid computing promises great benefits in the flexible sharing of resources but poses equally great challenges for high-performance computing: how can large-scale computations be executed on a grid with reasonable utilization of the machines involved? For wide-area computations that use a message-passing paradigm, these challenges can be addressed with topology-aware communication, i.e., communication services that are aware of, and can exploit, the topology of the network connecting all relevant machines. Such services can include augmented communication semantics (e.g., filtering), collective operations, content-based and policy-based routing, and management of communication scope to maintain feasibility. While such services can be implemented and deployed in a variety of ways, we propose the use of a peer-to-peer middleware forwarding and routing layer. For a related application domain (time management in distributed simulations), we provide emulation results showing that such topology awareness can play a major role in performance and scalability. Beyond these benefits, such communication services raise a host of implementation and integration issues for operational deployment and use in grid environments. Hence, we discuss the need for proper APIs and high-level models.
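As a rough illustration of why topology awareness matters for collective operations, the following minimal sketch compares a flat broadcast with a site-aware one and counts the messages that cross site boundaries. It is not the paper's implementation: the function names, the `site_of` mapping, and the choice of the first listed process as a site coordinator are all assumptions made for this example.

```python
from collections import defaultdict


def flat_broadcast(root, site_of):
    """Topology-unaware broadcast: the root sends to every other process."""
    return [(root, proc) for proc in site_of if proc != root]


def topology_aware_broadcast(root, site_of):
    """Site-aware broadcast (illustrative assumption, not the paper's method).

    The root sends one message to a coordinator at each remote site, and
    each coordinator forwards the message to the other processes at its site.
    """
    members = defaultdict(list)
    for proc, site in site_of.items():
        members[site].append(proc)

    sends = []
    for site, procs in members.items():
        if site == site_of[root]:
            coordinator = root  # the root serves its own site directly
        else:
            coordinator = procs[0]  # hypothetical choice: first listed process
            sends.append((root, coordinator))  # one wide-area message per site
        sends.extend((coordinator, p) for p in procs if p != coordinator)
    return sends


def wide_area_sends(sends, site_of):
    """Count messages whose sender and receiver are at different sites."""
    return sum(1 for src, dst in sends if site_of[src] != site_of[dst])


# Hypothetical grid: eight processes spread over three sites.
site_of = {
    "a0": "A", "a1": "A", "a2": "A",
    "b0": "B", "b1": "B", "b2": "B",
    "c0": "C", "c1": "C",
}

flat = flat_broadcast("a0", site_of)
aware = topology_aware_broadcast("a0", site_of)
print(wide_area_sends(flat, site_of), wide_area_sends(aware, site_of))
```

Both variants send the same total number of messages, but in the site-aware variant only one message per remote site crosses the wide-area network; the remaining traffic stays local to each site.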

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Craig A. Lee (1)
  1. Computer Systems Research Department, The Aerospace Corporation, El Segundo, USA
