European Conference on Parallel Processing

Euro-Par 2011: Parallel Processing Workshops, pp 365–374

Study of Hierarchical N-Body Methods for Network-on-Chip Architectures

  • Thomas Canhao Xu,
  • Pasi Liljeberg &
  • Hannu Tenhunen

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7156)

Abstract

In this paper, we study two hierarchical N-Body methods on Network-on-Chip (NoC) architectures. Modern Chip Multiprocessor (CMP) designs are mainly based on a shared-bus communication architecture, which suffers from high communication delays as the number of cores increases. NoC-based architectures have been proposed to address this. The N-Body problem is the classical problem of approximating the motion of bodies that interact with one another. Two hierarchical methods, Barnes-Hut (Barnes) and the Fast Multipole Method (FMM), have been developed for fast simulation. Both algorithms have been implemented and studied on conventional computer systems and Graphics Processing Units (GPUs). However, the evaluation of N-Body methods on NoC platforms, a promising unconventional multicore architecture, has not been well addressed. We define a NoC model based on state-of-the-art systems and present evaluation results obtained with a cycle-accurate full-system simulator. Experiments show that Barnes scales better than FMM (speedups of 53.7x for Barnes and 36.6x for FMM with 64 processing elements) and requires less cache. However, we observe hot-spot traffic in Barnes. Our analysis and experimental results provide a guideline for studying N-Body methods on NoC platforms.
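
To put the reported scaling figures in perspective, the speedups can be converted to parallel efficiency (speedup divided by the number of processing elements). The sketch below is only an illustration of that arithmetic: the function and variable names are hypothetical, and the two speedup values are the only numbers taken from the abstract.

```python
def parallel_efficiency(speedup: float, num_pes: int) -> float:
    """Return speedup divided by the number of processing elements."""
    if num_pes <= 0:
        raise ValueError("num_pes must be positive")
    return speedup / num_pes


# Speedups reported in the abstract for 64 processing elements.
# The dictionary layout and names are assumptions made for this example.
REPORTED_SPEEDUP = {"Barnes": 53.7, "FMM": 36.6}
NUM_PES = 64

efficiency = {
    name: parallel_efficiency(speedup, NUM_PES)
    for name, speedup in REPORTED_SPEEDUP.items()
}

for name, eff in efficiency.items():
    print(f"{name}: {eff:.1%} parallel efficiency")
# prints:
# Barnes: 83.9% parallel efficiency
# FMM: 57.2% parallel efficiency
```

Under this definition, Barnes retains roughly 84% of ideal linear scaling at 64 processing elements, while FMM retains roughly 57%.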

Keywords

  • Graphic Processing Unit
  • Memory Controller
  • Cache Coherence
  • Fast Multipole Method
  • Cache Bank

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This work is supported by the Academy of Finland and the Nokia Foundation. The authors would like to thank the anonymous reviewers for their feedback and suggestions.

Author information

Authors and Affiliations

  1. Turku Center for Computer Science, Joukahaisenkatu 3-5 B, 20520, Turku, Finland

    Thomas Canhao Xu, Pasi Liljeberg & Hannu Tenhunen

  2. Department of Information Technology, University of Turku, 20014, Turku, Finland

    Thomas Canhao Xu, Pasi Liljeberg & Hannu Tenhunen

Editor information

Editors and Affiliations

  1. Scilytics, Koellnerhofgasse 3/15A, 1010, Vienna, Austria

    Michael Alexander

  2. ICAR-CNR, Via P. Castellino, 111, 80131, Napoli, Italy

    Pasqua D’Ambra

  3. University of Amsterdam, 1090, Amsterdam, Netherlands

    Adam Belloum

  4. Innovative Computing Laboratory, The University of Tennessee, US

    George Bosilca

  5. Department of Experimental Medicine and Clinic, University Magna Græcia, 88100, Catanzaro, Italy

    Mario Cannataro

  6. Computer Science Department, University of Pisa, Italy

    Marco Danelutto

  7. Second University of Naples, Italy

    Beniamino Di Martino

  8. TU München, Boltzmannstr. 3, 85748, Garching, Germany

    Michael Gerndt

  9. Equipe Runtime, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France

    Emmanuel Jeannot & Raymond Namyst

  10. Equipe HIEPACS, INRIA Bordeaux Sud-Ouest, 33405, Talence Cedex, France

    Jean Roman

  11. Computer Science and Mathematics Division, Oak Ridge National Laboratory, 37831-6164, Oak Ridge, TN, USA

    Stephen L. Scott

  12. Department of Scientific Computing, University of Vienna, Nordbergstr. 15/3C, 1090, Vienna, Austria

    Jesper Larsson Traff

  13. Computer Science and Mathematics Division, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA

    Geoffroy Vallée

  14. Technische Universität München, Germany

    Josef Weidendorfer

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, T.C., Liljeberg, P., Tenhunen, H. (2012). Study of Hierarchical N-Body Methods for Network-on-Chip Architectures. In: Alexander, M., et al. Euro-Par 2011: Parallel Processing Workshops. Euro-Par 2011. Lecture Notes in Computer Science, vol 7156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29740-3_41

  • DOI: https://doi.org/10.1007/978-3-642-29740-3_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29739-7

  • Online ISBN: 978-3-642-29740-3

  • eBook Packages: Computer Science, Computer Science (R0)
