Abstract
Modern networks of workstations connected by Gigabit networks can run high-performance computing applications at reasonable performance, but at a significantly lower cost. The performance of these applications is usually dominated by the efficiency of the underlying communication mechanisms. Efficient communication, however, requires not only that the messages themselves are sent quickly, but also that notification of message arrival is fast. For example, a message that has arrived at its destination is worthless until the recipient is alerted to its arrival.
In this paper we describe a new operation, the remote-enqueue atomic operation, which can be used in multiprocessors and workstation clusters. This operation atomically inserts a data element into a queue that physically resides in a remote processor's memory. It can be used for fast notification of message arrival and for fast passing of small messages. Compared to other software and hardware queueing alternatives, remote-enqueue provides high speed at a low implementation cost without compromising protection in a general-purpose computing environment.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Markatos, E.P., Katevenis, M.G.H., Vatsolaki, P. (1998). The remote enqueue operation on networks of workstations. In: Panda, D.K., Stunkel, C.B. (eds) Network-Based Parallel Computing Communication, Architecture, and Applications. CANPC 1998. Lecture Notes in Computer Science, vol 1362. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052203
DOI: https://doi.org/10.1007/BFb0052203
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64140-7
Online ISBN: 978-3-540-69693-3
eBook Packages: Springer Book Archive