Analyzing the Energy and Power Consumption of Remote Memory Accesses in the OpenSHMEM Model

  • Siddhartha Jana
  • Oscar Hernandez
  • Stephen Poole
  • Chung-Hsing Hsu
  • Barbara M. Chapman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8356)


PGAS models such as OpenSHMEM provide interfaces for explicitly initiating one-sided remote memory accesses among processes. The model also provides synchronizing barriers to ensure a consistent view of the distributed memory at different phases of an application. Incorrect use of these interfaces limits the scalability achievable with the programming model. This study aims at understanding the effects of these constructs on the energy and power consumption behavior of OpenSHMEM applications. Our experiments show that the cost incurred, in terms of total energy and power consumed, depends on multiple factors across the software and hardware stack. We conclude that the power consumed by the CPU and DRAM is significantly affected by several factors: the design of the data-transfer patterns within an application, the design of the communication protocols within the middleware, the architectural constraints imposed by the interconnect, and the levels of the memory hierarchy within a compute node. This work motivates treating energy and power consumption as first-class factors when designing compute solutions for current and future distributed systems.
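The two constructs under study (one-sided remote memory accesses and synchronizing barriers) can be illustrated with a minimal OpenSHMEM sketch. This is not code from the paper; it is a hedged example using the OpenSHMEM 1.0-era API (`start_pes`, `_my_pe`, `shmem_int_put`, `shmem_barrier_all`) and requires an installed OpenSHMEM implementation to build and run.

```c
/* Sketch: each PE writes its rank into a symmetric variable on its
   right-hand neighbor with a one-sided put, then a barrier completes
   the transfers and synchronizes all PEs. */
#include <shmem.h>
#include <stdio.h>

int main(void) {
    start_pes(0);                 /* initialize the OpenSHMEM runtime (1.0 API) */
    int me   = _my_pe();
    int npes = _num_pes();

    static int dest = -1;         /* symmetric: same address on every PE */
    int src = me;

    /* One-sided put: no matching receive is posted on the target PE. */
    shmem_int_put(&dest, &src, 1, (me + 1) % npes);

    /* Barrier: ensures all outstanding puts are complete, giving every
       PE a consistent view of the distributed memory. */
    shmem_barrier_all();

    /* After the barrier, dest holds the rank of the left-hand neighbor. */
    printf("PE %d: dest = %d\n", me, dest);
    return 0;
}
```

With the reference implementation's wrappers, such a program would typically be built and launched along the lines of `oshcc sketch.c -o sketch` and `oshrun -np 4 ./sketch` (wrapper names vary by implementation).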



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Siddhartha Jana (1)
  • Oscar Hernandez (2)
  • Stephen Poole (2)
  • Chung-Hsing Hsu (2)
  • Barbara M. Chapman (1)
  1. HPCTools, Computer Science Department, University of Houston, Houston, USA
  2. Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, USA
