Network Interface with Task Spawning Support for NoC-Based DSM Architectures

  • Aurang Zaib
  • Jan Heißwolf
  • Andreas Weichslgartner
  • Thomas Wild
  • Jürgen Teich
  • Jürgen Becker
  • Andreas Herkersdorf
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9017)

Abstract

Distributed Shared Memory (DSM) architectures are gaining popularity because they exploit hardware parallelism while letting application developers use both shared- and distributed-memory paradigms. At the same time, Networks on Chip (NoCs) have become established as a way to address communication bottlenecks in massively parallel tile-based processor architectures. In NoC-based DSM architectures, the synchronization overhead of spawning a task on a remote network node can incur high performance penalties. Reducing these synchronization delays makes the design of the Network Interface (NI) important. In this paper, we present a network interface architecture that supports task spawning between network nodes by employing efficient synchronization mechanisms. The proposed hardware support inside the NI offloads the handling of synchronization during remote task spawning from software, resulting in better overall performance. Simulation results show that the proposed hardware architecture improves performance by up to 42% compared to existing state-of-the-art approaches. An FPGA prototype is also used to demonstrate the benefits of the proposed approach for real-world applications. Implementation results show the low area footprint of the proposed hardware.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Aurang Zaib (1)
  • Jan Heißwolf (2)
  • Andreas Weichslgartner (3)
  • Thomas Wild (1)
  • Jürgen Teich (3)
  • Jürgen Becker (2)
  • Andreas Herkersdorf (1)

  1. Technical University Munich, Munich, Germany
  2. Karlsruhe Institute of Technology, Karlsruhe, Germany
  3. University of Erlangen-Nuremberg, Erlangen, Germany
