Two-Phase Barrier: A Synchronization Primitive for Improving the Processor Utilization

Published in: International Journal of Parallel Programming

Abstract

Barriers are widely used for synchronization in parallel programs. Because a process that arrives earlier than the others must wait at the barrier, total processor utilization decreases. In this paper, to find the sources of barrier waiting time, parallel programs are executed at various grain sizes through execution-driven simulations. The simulation studies show that even if approximately equal amounts of work are distributed to each processor, the processes may not all arrive at a barrier at the same time. The reason is that differing numbers of cache misses and instructions within the partitioned grains lead to differences in the arrival times of processors at the barrier. To reduce the blind waiting time of the traditional barrier scheme, this paper considers the two-phased barrier, which can be constructed simply by dividing the single synchronization stage into two stages. At each stage, a process decides whether or not to stall, depending on the current execution state of the grains running on the given processors. Simulation results show that the reduced barrier waiting times attributed to the two-phased barrier contribute to the performance improvement of parallel programs.
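The abstract does not give the algorithm, so the following is only a minimal sketch of the general idea of splitting one synchronization stage into two, not the authors' mechanism. It assumes a generic split-phase barrier in which a process first announces its arrival without blocking and later stalls only if some other process has not yet arrived. The names `SplitPhaseBarrier`, `arrive`, `wait`, and `run_demo` are hypothetical and chosen for illustration.

```python
import threading


class SplitPhaseBarrier:
    """Generic split-phase barrier (illustrative, not the paper's design)."""

    def __init__(self, parties):
        self._parties = parties
        self._count = 0
        self._generation = 0
        self._cond = threading.Condition()

    def arrive(self):
        """Stage 1: record arrival without blocking and return a ticket."""
        with self._cond:
            ticket = self._generation
            self._count += 1
            if self._count == self._parties:
                # Last arrival: open the barrier for this generation.
                self._count = 0
                self._generation += 1
                self._cond.notify_all()
            return ticket

    def wait(self, ticket):
        """Stage 2: stall only if some party has not arrived yet."""
        with self._cond:
            self._cond.wait_for(lambda: self._generation != ticket)


def run_demo(parties=4):
    """Run `parties` threads through one barrier episode and log the order."""
    barrier = SplitPhaseBarrier(parties)
    log = []
    log_lock = threading.Lock()

    def worker(i):
        with log_lock:
            log.append(("before", i))
        ticket = barrier.arrive()
        # Work that needs no data from other workers can overlap here
        # instead of being spent as blind waiting time.
        with log_lock:
            log.append(("overlap", i))
        barrier.wait(ticket)
        with log_lock:
            log.append(("after", i))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(parties)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In this sketch, a process that reaches the barrier early can keep doing useful work between `arrive` and `wait`, which is one way the waiting time of a traditional single-stage barrier could be reduced. Each thread is expected to call `arrive` once per barrier episode and to pass the returned ticket to `wait`.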

Author information

Correspondence to Inbum Jung.

About this article

Cite this article

Jung, I., Hyun, J., Lee, J. et al. Two-Phase Barrier: A Synchronization Primitive for Improving the Processor Utilization. International Journal of Parallel Programming 29, 607–627 (2001). https://doi.org/10.1023/A:1013153020460

  • DOI: https://doi.org/10.1023/A:1013153020460
