The Impact of Parallel Programming Interfaces on Energy

Part of the SpringerBriefs in Computer Science book series (BRIEFSCOMPUTER)


Thread-level parallelism is widely exploited to make the best use of hardware resources and improve performance. However, as discussed in Chap. 1, energy consumption has become an important issue. The objective when designing parallel applications is therefore not simply to improve performance, but to do so with minimal impact on energy consumption. To speed up the development of parallel applications and make it as transparent as possible to the programmer, different parallel programming interfaces (PPIs) are used (e.g., OpenMP, PThreads, or MPI). However, each of them differs in how it handles thread/process management (i.e., creation and finalization), workload distribution, synchronization, and communication. Given this scenario, this chapter presents a comprehensive study of the most common parallel programming interfaces, covering those that exploit parallelism through shared variables (OpenMP and PThreads) and those that use message passing (MPI-1 and MPI-2). Fourteen applications, classified according to their communication demands, were parallelized and executed on different embedded and general-purpose processors. Several metrics were used to evaluate the parallel programming interfaces and multicore processors: performance, energy, energy-delay product (EDP), and the influence of the processor's static power on the total energy consumption. The remainder of this chapter is organized as follows. First, the methodology of the study is discussed, that is, the multicore architectures, the benchmark suite, the setup, and how the energy consumption was calculated. Then, the results are discussed. Finally, we conclude this study.
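The evaluation metrics named above can be made concrete with a small calculation. The sketch below is illustrative only: it assumes total energy is the sum of a dynamic part and a static part (static power multiplied by execution time) and uses the standard definition of the energy-delay product (EDP) as energy multiplied by execution time. The chapter's own energy model is not reproduced here, and the `Run` class, the function names, and every number are hypothetical placeholders, not measured results.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One execution of an application (all values hypothetical)."""
    label: str
    time_s: float            # wall-clock execution time, in seconds
    dynamic_energy_j: float  # energy spent on switching activity, in joules
    static_power_w: float    # constant static (leakage) power, in watts


def total_energy(run: Run) -> float:
    """Assumed model: dynamic energy plus static power integrated over time."""
    return run.dynamic_energy_j + run.static_power_w * run.time_s


def edp(run: Run) -> float:
    """Energy-delay product: total energy multiplied by execution time."""
    return total_energy(run) * run.time_s


def static_share(run: Run) -> float:
    """Fraction of the total energy that is due to static power."""
    return run.static_power_w * run.time_s / total_energy(run)


if __name__ == "__main__":
    # Placeholder runs, only to show how the metrics would be compared.
    runs = [
        Run("sequential", time_s=8.0, dynamic_energy_j=40.0, static_power_w=2.0),
        Run("4 threads", time_s=2.5, dynamic_energy_j=45.0, static_power_w=2.0),
    ]
    for r in runs:
        print(f"{r.label}: E={total_energy(r):.1f} J, "
              f"EDP={edp(r):.1f} J*s, static share={static_share(r):.0%}")
```

Under this model, a shorter execution time reduces the static contribution even when the dynamic energy rises, which is why EDP and the static share are reported alongside raw energy.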



Copyright information

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science, Federal University of Pampa (UNIPAMPA), Alegrete, Brazil
  2. Institute of Informatics, Campus do Vale, Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil
