Achieving low cost synchronization in a multiprocessor system

  • Rajiv Gupta
  • Michael Epstein
Submitted Presentations
Part of the Lecture Notes in Computer Science book series (LNCS, volume 365)

Abstract

A barrier is a commonly used mechanism for synchronizing processors executing in parallel. A software implementation of the barrier mechanism using shared variables has two major drawbacks. First, the synchronization overhead is high; second, a processor that reaches the barrier must idle until all other processors reach it. In this paper, the fuzzy barrier, a mechanism that avoids both drawbacks, is presented. The first problem is avoided by implementing the mechanism in hardware. The second problem is solved by using software techniques to find useful instructions that a processor can execute while it awaits synchronization. The hardware implementation eliminates busy waiting at barriers, provides a mask that allows disjoint subsets of processors to synchronize simultaneously, and provides multiple barriers by associating a tag with each barrier. Compiler techniques are presented for constructing barrier regions, which consist of instructions that a processor can execute while it is waiting for other processors to reach the barrier. The larger the barrier region, the more likely it is that none of the processors will have to stall. Initial observations show that barrier regions can be large and that program transformations can significantly increase their size.
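The idea of a barrier region can be illustrated with a small software emulation. The sketch below is not the hardware mechanism described in the paper; it is a minimal split-phase barrier written with Python threads, and the names `FuzzyBarrier`, `arrive`, `wait`, and `run_demo` are hypothetical. A thread calls `arrive` when it enters its barrier region, continues with independent work, and calls `wait` only at the end of the region, so it stalls only if some other thread has not yet arrived. Masks and tags are omitted.

```python
import threading


class FuzzyBarrier:
    """Software emulation of a split-phase barrier (hypothetical names)."""

    def __init__(self, parties):
        self._parties = parties
        self._arrived = 0
        self._generation = 0
        self._cond = threading.Condition()

    def arrive(self):
        """Enter the barrier region without blocking; return a ticket."""
        with self._cond:
            ticket = self._generation
            self._arrived += 1
            if self._arrived == self._parties:
                # Last arrival: open the barrier for this generation.
                self._arrived = 0
                self._generation += 1
                self._cond.notify_all()
            return ticket

    def wait(self, ticket):
        """Leave the barrier region; block only if someone has not arrived."""
        with self._cond:
            while self._generation == ticket:
                self._cond.wait()


def run_demo(n_threads=4):
    barrier = FuzzyBarrier(n_threads)
    phase1 = [0] * n_threads       # written before the barrier, read after it
    region_work = [0] * n_threads  # independent work done inside the region
    totals = [0] * n_threads

    def worker(i):
        phase1[i] = i + 1
        ticket = barrier.arrive()  # start of the barrier region
        region_work[i] = i * i     # useful work that needs no synchronization
        barrier.wait(ticket)       # end of the barrier region
        totals[i] = sum(phase1)    # safe: every thread has written phase1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals, region_work
```

With four threads, `run_demo()` returns `([10, 10, 10, 10], [0, 1, 4, 9])`: every thread sees all four writes to `phase1` after the barrier, while the work placed between `arrive` and `wait` overlaps with the time spent waiting for slower threads. The more work that can be moved into that region, the less likely a call to `wait` is to block.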

Keywords

multiprocessor systems, barrier synchronization, parallelizing compilers

Copyright information

© Springer-Verlag Berlin Heidelberg 1989

Authors and Affiliations

  • Rajiv Gupta (1)
  • Michael Epstein (1)
  1. Philips Laboratories, North American Philips Corporation, Briarcliff Manor
