Topology-Aware Communication in Wide-Area Message-Passing
This position paper examines the use of topology-aware communication services to support message-passing in wide-area, distributed environments, i.e., grids. Grid computing promises great benefits through the flexible sharing of resources, but it poses equally great challenges for high-performance computing: how can large-scale computations be executed on a grid with reasonable utilization of the machines involved? For wide-area computations using a message-passing paradigm, these challenges can be addressed with topology-aware communication, i.e., communication services that are aware of, and can exploit, the topology of the network connecting all relevant machines. Such services can include augmented communication semantics (e.g., filtering), collective operations, content-based and policy-based routing, and management of communication scope to keep communication feasible. While such services can be implemented and deployed in a variety of ways, we propose the use of a peer-to-peer middleware forwarding and routing layer. For a related application domain, time management in distributed simulations, we provide emulation results showing that such topology awareness can play a major role in performance and scalability. Beyond these benefits, such communication services raise a host of implementation and integration issues for their operational deployment and use in grid environments. Hence, we discuss the need for proper APIs and high-level models.
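To make the idea of a topology-aware collective operation concrete, the following is a minimal sketch and not the paper's implementation. It compares a flat broadcast with a two-level, site-aware broadcast and counts how many messages cross site boundaries. The site layout, the choice of the first rank at each site as its leader, and all function names are assumptions introduced purely for illustration.

```python
from typing import Dict, List, Tuple

Message = Tuple[int, int]  # (sender rank, receiver rank)


def site_of(sites: Dict[str, List[int]], rank: int) -> str:
    """Return the name of the site that hosts the given rank."""
    for site, ranks in sites.items():
        if rank in ranks:
            return site
    raise ValueError(f"unknown rank {rank}")


def flat_broadcast(sites: Dict[str, List[int]], root: int) -> List[Message]:
    """Topology-unaware broadcast: the root sends directly to every other rank."""
    return [(root, r) for ranks in sites.values() for r in ranks if r != root]


def topology_aware_broadcast(sites: Dict[str, List[int]], root: int) -> List[Message]:
    """Two-level broadcast: one wide-area message per remote site, then local fan-out.

    Assumes every site has at least one rank; the first rank listed at a
    remote site acts as its leader (a hypothetical convention).
    """
    root_site = site_of(sites, root)
    messages: List[Message] = []
    for site, ranks in sites.items():
        leader = root if site == root_site else ranks[0]
        if leader != root:
            messages.append((root, leader))  # the only wide-area hop for this site
        messages.extend((leader, r) for r in ranks if r != leader)  # local hops
    return messages


def wide_area_count(sites: Dict[str, List[int]], messages: List[Message]) -> int:
    """Count the messages whose sender and receiver are at different sites."""
    return sum(1 for src, dst in messages if site_of(sites, src) != site_of(sites, dst))


# Hypothetical grid: three sites with four ranks each, broadcast rooted at rank 0.
example_sites = {"A": [0, 1, 2, 3], "B": [4, 5, 6, 7], "C": [8, 9, 10, 11]}
flat = flat_broadcast(example_sites, root=0)
aware = topology_aware_broadcast(example_sites, root=0)
print(wide_area_count(example_sites, flat), wide_area_count(example_sites, aware))
```

In this toy layout both variants send eleven messages in total, but the flat broadcast sends eight of them across the wide area while the site-aware variant sends only two. The sketch ignores latency, bandwidth, and deeper hierarchies, all of which a real service would need to consider.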