Network interface active messages for low overhead communication on SMP PC clusters
NICAM is a communication layer for SMP PC clusters connected via Myrinet, designed to reduce overhead and latency by directly using the microprocessor on the network interface. It adopts remote memory operations to avoid much of the overhead found in message passing. NICAM employs an Active Messages framework for flexible programming on the network interface, and this flexibility compensates for the large latency resulting from the relatively slow microprocessor. Running message handlers directly on the network interface reduces overhead by freeing the main processors from polling for incoming messages. The handlers also make synchronization faster by avoiding costly interactions between the main processors and the network interface. In addition, this implementation can completely hide the latency of barriers in data-parallel programs: because handlers run in the background of the main processors, a barrier can be repositioned to any place where its latency is not critical.
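The ideas in the abstract (handlers executed on the network interface, remote memory writes, and a barrier whose latency is overlapped with computation) can be illustrated with a toy simulation. The sketch below is not NICAM's API: the `Nic` class, the handler names `put` and `arrive`, and the all-to-all barrier scheme are assumptions made for illustration, and a Python thread stands in for the microprocessor on the network interface.

```python
import queue
import threading


class Nic:
    """Hypothetical stand-in for a network interface that runs handlers itself."""

    def __init__(self, node_id, num_nodes):
        self.node_id = node_id
        self.num_nodes = num_nodes
        self.memory = {}                      # host memory exposed to remote writes
        self.inbox = queue.Queue()            # FIFO of messages from the "network"
        self.arrivals = 0                     # barrier arrivals, counted by the NIC
        self.barrier_done = threading.Event() # flag the host checks at barrier end
        self.handlers = {"put": self._h_put, "arrive": self._h_arrive}
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        # NIC loop: dispatch each message to its handler; the host never polls.
        while True:
            msg = self.inbox.get()
            if msg is None:                   # shutdown marker
                break
            name, args = msg
            self.handlers[name](*args)

    def _h_put(self, addr, value):
        # Remote memory write, applied without involving the host processor.
        self.memory[addr] = value

    def _h_arrive(self):
        # Barrier bookkeeping done entirely in the background of the host.
        self.arrivals += 1
        if self.arrivals == self.num_nodes:
            self.barrier_done.set()


def host(nic, nics, results):
    me = nic.node_id
    right = nics[(me + 1) % len(nics)]
    right.inbox.put(("put", ("from_left", me)))  # remote write to the neighbour
    for peer in nics:                            # barrier start: notify every NIC
        peer.inbox.put(("arrive", ()))
    local = sum(range(1000))                     # independent work overlaps the barrier
    nic.barrier_done.wait()                      # barrier end: usually already satisfied
    results[me] = (nic.memory["from_left"], local)


def run(num_nodes=4):
    nics = [Nic(i, num_nodes) for i in range(num_nodes)]
    results = [None] * num_nodes
    hosts = [threading.Thread(target=host, args=(n, nics, results)) for n in nics]
    for t in hosts:
        t.start()
    for t in hosts:
        t.join()
    for n in nics:
        n.inbox.put(None)                        # stop the simulated NIC threads
    return results


if __name__ == "__main__":
    print(run(4))
```

Because each inbox is FIFO and every node sends its `put` before its `arrive`, a NIC that has counted all arrivals has also applied every earlier remote write addressed to it. The host therefore only checks a flag after finishing its independent work, which is the sense in which the barrier latency is hidden in this sketch.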