HyBar: high efficient barrier synchronization based on a hybrid packet-circuit switching Network-on-Chip

Wei, Zhenqi; Liu, Peilin; Sun, Rongdi

doi:10.1007/s11432-016-0306-y

Research Paper
Published: 09 February 2017

Volume 60, article number 062402, (2017)
Science China Information Sciences

Abstract

Realizing barrier synchronization with high efficiency in multi-/many-core processors becomes more and more challenging as the number of cores integrated on a single chip keeps growing. Quite a few barrier solutions have been proposed, but they provide limited improvement when synchronizing large numbers of cores or impose unfavorable restrictions on concurrent barriers. This paper presents HyBar, a hardware barrier based on a hybrid switching NoC that adopts packet switching and circuit switching in two separate sub-networks. Dedicated channels in the circuit-switching sub-network are dynamically built and removed as barrier requests traverse the packet-switching sub-network according to a modified dimension-order routing algorithm. The efficiency of inter-core communication for concurrent barriers is improved by merging barrier arrival requests and broadcasting release requests along the circuit channels. The execution times of synthetic cases, benchmark kernels and parallel applications using various barrier solutions are evaluated on an RTL-based simulation platform. Experimental results show that our proposal provides about 15%–50% performance improvement compared to previous solutions, while the hardware overhead is marginal under SMIC 40 nm technology. Moreover, HyBar incurs only a minor efficiency loss for concurrent barriers and places no limitation on the layout of their participating cores in the on-chip network.
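
To illustrate why merging arrival requests can reduce traffic, the following is a minimal sketch, not the HyBar design itself. It assumes a 2D mesh, plain XY dimension-order routing (the paper's modified algorithm is not described in the abstract), and a single hypothetical master node that collects arrivals. It counts link traversals only and ignores timing, router buffering and circuit setup. The names `xy_route` and `arrival_cost` are illustrative.

```python
from typing import Iterable, List, Set, Tuple

Node = Tuple[int, int]          # (x, y) coordinate of a router in the mesh
Link = Tuple[Node, Node]        # directed link between neighbouring routers


def xy_route(src: Node, dst: Node) -> List[Link]:
    """Links used by plain XY routing: move along x first, then along y."""
    links: List[Link] = []
    x, y = src
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def arrival_cost(cores: Iterable[Node], master: Node) -> Tuple[int, int]:
    """Return (unmerged, merged) link traversals for one barrier arrival phase.

    unmerged: every core's request travels its whole path independently.
    merged:   requests sharing a link are assumed to be combined into one,
              so each distinct link on the resulting tree is used once.
    """
    unmerged = 0
    tree: Set[Link] = set()
    for core in cores:
        path = xy_route(core, master)
        unmerged += len(path)
        tree.update(path)
    return unmerged, len(tree)


if __name__ == "__main__":
    mesh = [(x, y) for x in range(4) for y in range(4)]
    print(arrival_cost(mesh, (0, 0)))
```

For a 4×4 mesh with all 16 cores participating and the master in a corner, the sketch reports 48 link traversals without merging and 15 with merging. Under the same assumptions, a release broadcast back along the merged tree would also use 15 links.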

Acknowledgments

This work was partially supported by Equipment Pre-Research Foundation of China (Grant No. 9140A08010414JW03025).

Author information

Authors and Affiliations

School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
Zhenqi Wei, Peilin Liu & Rongdi Sun

Corresponding author

Correspondence to Zhenqi Wei.

About this article

Cite this article

Wei, Z., Liu, P. & Sun, R. HyBar: high efficient barrier synchronization based on a hybrid packet-circuit switching Network-on-Chip. Sci. China Inf. Sci. 60, 062402 (2017). https://doi.org/10.1007/s11432-016-0306-y

Received: 11 August 2016
Accepted: 06 November 2016
Published: 09 February 2017
DOI: https://doi.org/10.1007/s11432-016-0306-y
