Low Power High Performance Computing on Arm System-on-Chip in Astrophysics

Taffoni, Giuliano; Bertocco, Sara; Coretti, Igor; Goz, David; Ragagnin, Antonio; Tornatore, Luca

doi:10.1007/978-3-030-32520-6_33

Conference paper
First Online: 13 October 2019

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1069)

Abstract

In this paper, we quantitatively evaluate the impact of computation on energy consumption on Arm MPSoC platforms, exploiting both CPUs and embedded GPUs. Performance and energy measurements are made on a direct N-body code, a real scientific application from the astrophysical domain. The time-to-solution, energy-to-solution and energy delay product obtained with different software configurations are compared with those obtained on a general-purpose x86 desktop and a PCIe GPGPU. With this work, we investigate the possibility of using commodity single boards based on Arm MPSoC as an HPC computational resource for real astrophysical production runs. Our results show to what extent those boards can be used and which modifications to a production code are necessary to profit from them. A crucial finding of this work is the effect of emulated double precision on GPU performance, which allows embedded and gaming GPUs to be used as excellent HPC resources.
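
As an illustration of the three metrics named in the abstract, the Python sketch below computes them for two hypothetical runs. It assumes the commonly used definitions, which this page does not spell out: energy-to-solution is average power multiplied by time-to-solution, and the energy delay product is energy-to-solution multiplied by time-to-solution raised to a weight (1 for the plain product). The class name, labels and numbers are invented placeholders, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One measured run. All values used below are hypothetical placeholders."""
    label: str
    time_s: float       # time-to-solution (TTS), in seconds
    avg_power_w: float  # average power drawn during the run, in watts

    @property
    def energy_j(self) -> float:
        # Energy-to-solution (ETS), assumed here to be average power x TTS.
        return self.avg_power_w * self.time_s

    def edp(self, weight: int = 1) -> float:
        # Energy delay product, assumed here to be ETS x TTS**weight.
        # A larger weight penalises slow runs more heavily.
        return self.energy_j * self.time_s ** weight


runs = [
    Run("low-power board (hypothetical)", time_s=100.0, avg_power_w=5.0),
    Run("desktop (hypothetical)", time_s=10.0, avg_power_w=100.0),
]

for r in runs:
    print(f"{r.label}: TTS={r.time_s:.1f} s, "
          f"ETS={r.energy_j:.1f} J, EDP={r.edp():.1f} J*s")
```

With these made-up numbers the low-power board needs less energy (500 J against 1000 J) but has the larger energy delay product (50000 J·s against 10000 J·s), which is why the three metrics are worth reporting together.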

Notes

1. https://www.ibm.com/blogs/systems/ibm-nvidia-present-nvlink-server-youve-waiting/.
2. Roofline is a visually intuitive performance model used to bound the performance of various numerical methods and operations running on multicore, manycore, or accelerator processor architectures.
3. https://www.geeks3d.com/20140305/amd-radeon-and-nvidia-geforce-fp32-fp64-gflops-table-computing/.
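
The Roofline bound described in note 2 is usually written as the minimum of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The Python sketch below shows that formula only; the function name and the machine figures are hypothetical and are not taken from the paper.

```python
def roofline_gflops(peak_gflops: float,
                    bandwidth_gb_s: float,
                    intensity_flop_per_byte: float) -> float:
    """Upper bound on attainable performance in the basic Roofline model.

    A kernel is limited either by the compute peak or by how fast memory
    can feed it (bandwidth x arithmetic intensity), whichever is lower.
    """
    return min(peak_gflops, bandwidth_gb_s * intensity_flop_per_byte)


# Hypothetical machine: 100 GFLOP/s peak, 10 GB/s memory bandwidth.
PEAK, BANDWIDTH = 100.0, 10.0

for intensity in (2.0, 10.0, 50.0):
    bound = roofline_gflops(PEAK, BANDWIDTH, intensity)
    regime = "memory-bound" if bound < PEAK else "compute-bound"
    print(f"intensity={intensity:5.1f} flop/byte -> "
          f"bound={bound:6.1f} GFLOP/s ({regime})")
```

On this made-up machine the two limits meet at 10 flop/byte: kernels below that intensity are capped by memory bandwidth, and kernels at or above it are capped by the compute peak.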

Acknowledgments

This work was carried out within the ExaNeSt (FET-HPC) project (Grant no. 671553), the ASTERICS project (Grant no. 653477) and the EuroExa (FET-HPC) project (Grant no. 754337), funded by the European Union’s Horizon 2020 research and innovation programme.

Author information

Authors and Affiliations

National Institute of Astrophysics, Astronomical Observatory of Trieste, via G.B. Tiepolo 11, Trieste, Italy
Giuliano Taffoni, Sara Bertocco, Igor Coretti, David Goz, Antonio Ragagnin & Luca Tornatore

Corresponding author

Correspondence to Sara Bertocco.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Rahul Bhatia
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Supriya Kapoor

About this paper

Cite this paper

Taffoni, G., Bertocco, S., Coretti, I., Goz, D., Ragagnin, A., Tornatore, L. (2020). Low Power High Performance Computing on Arm System-on-Chip in Astrophysics. In: Arai, K., Bhatia, R., Kapoor, S. (eds) Proceedings of the Future Technologies Conference (FTC) 2019. FTC 2019. Advances in Intelligent Systems and Computing, vol 1069. Springer, Cham. https://doi.org/10.1007/978-3-030-32520-6_33

DOI: https://doi.org/10.1007/978-3-030-32520-6_33
Published: 13 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32519-0
Online ISBN: 978-3-030-32520-6
eBook Packages: Intelligent Technologies and Robotics (R0)
