Probabilistic-Based Selection of Alternate Implementations for Heterogeneous Platforms

  • Javier Fernández
  • Andrés Sánchez Cuadrado
  • David del Rio Astorga
  • Manuel F. Dolz
  • J. Daniel García
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10393)


In recent years, heterogeneous architectures have become a de facto approach for improving the performance of numerous scientific and industrial applications. However, developing for these architectures is not straightforward: each processor demands its own programming paradigm and, often, certain applications are well suited to run only on a particular processing unit. A major challenge therefore arises when programming for these platforms: selecting the most suitable device and routine implementation to solve a given problem. To deal with this issue, this paper proposes a novel probabilistic-based selector that uses the problem size to automatically choose the most appropriate version of the same kernel. To analyze this approach, we have implemented the selector within the OmpSs programming framework and evaluated its accuracy and performance gains when executing different implementations of the general matrix-matrix multiplication. Finally, we demonstrate that this solution delivers performance comparable to that of a state-of-the-art runtime approach.


Implementation selector · Heterogeneous platforms · Auto-tuning · Probabilistic modeling



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Javier Fernández (1)
  • Andrés Sánchez Cuadrado (1)
  • David del Rio Astorga (1)
  • Manuel F. Dolz (1)
  • J. Daniel García (1)
  1. Computer Science and Engineering Department, University Carlos III of Madrid, Leganés, Spain
