Challenging the abstraction penalty in parallel patterns libraries

Adding FastFlow support to GrPPI

Abstract

In recent years, pattern-based programming has been recognized as a good practice for efficiently exploiting parallel hardware resources. Following this approach, multiple libraries have been designed to provide such high-level abstractions and ease parallel programming. However, those libraries do not share a common interface. To pave the way toward one, GrPPI was designed to provide an intermediate abstraction layer between application developers and existing parallel programming frameworks such as OpenMP, Intel TBB or ISO C++ threads. FastFlow, in turn, has been adopted as an efficient object-based programming framework that may benefit from being supported as an additional GrPPI backend. However, the object-based approach poses major challenges for integration under GrPPI's type-safe functional programming style. In this paper, we present the integration of FastFlow as a new GrPPI backend to demonstrate that structured parallel programming frameworks perfectly fit the GrPPI design. Additionally, we demonstrate that GrPPI incurs no additional overhead by providing its abstraction layer, and we study programmability in terms of lines of code and cyclomatic complexity. In general, the presented work acts as a reciprocal validation of both FastFlow (as an efficient, native structured parallel programming framework) and GrPPI (as an efficient abstraction layer on top of existing parallel programming frameworks).

Notes

  1. This pattern is not Google’s MapReduce, which exploits key-value pairs to solve problems that can be parallelized by mapping a function over a given data set or stream of data and then combining the results.

  2. GrPPI assumes that all stream-parallel computations are pipelines whose first stage acts as a stream generator and whose last stage acts as a stream absorber. In both cases, the stage may still perform some computation on the generated or absorbed stream items.

  3. FastFlow provides two different farm patterns. The ordered farm preserves the input/output ordering of the data stream. The normal farm does not preserve it; it is intended for computations that, for example, store results directly in shared memory, so that the order of the writes or updates does not matter.
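
The stream structure described in Notes 2 and 3 can be illustrated with a minimal C++17 sketch that uses only the standard library. It is not GrPPI or FastFlow code: the names `ordered_farm`, `unordered_farm`, `make_range_generator`, `squares_in_order` and `sum_of_squares_unordered` are hypothetical, and spawning one asynchronous task per item is a simplification chosen for brevity, not how either framework schedules work. The generator signals the end of the stream with an empty `std::optional`, which is an assumption of this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper (not GrPPI or FastFlow API): a stream computation
// generator -> farm of workers -> absorber. Results reach the absorber
// in input order, mimicking an ordered farm.
template <typename Generator, typename Worker, typename Absorber>
void ordered_farm(Generator generate, Worker work, Absorber absorb) {
  using Item = typename decltype(generate())::value_type;
  using Result = decltype(work(std::declval<Item>()));
  std::vector<std::future<Result>> in_flight;
  // First stage: pull items until the generator returns an empty optional.
  while (auto item = generate()) {
    // Middle stage: each item is processed by its own asynchronous task.
    in_flight.push_back(std::async(std::launch::async, work, *item));
  }
  // Last stage: consume futures in submission order to preserve ordering.
  for (auto& result : in_flight) absorb(result.get());
}

// Hypothetical unordered variant: workers return void and update shared
// state directly, so the completion order does not matter.
template <typename Generator, typename Worker>
void unordered_farm(Generator generate, Worker work) {
  std::vector<std::future<void>> in_flight;
  while (auto item = generate()) {
    in_flight.push_back(std::async(std::launch::async, work, *item));
  }
  for (auto& task : in_flight) task.get();  // wait for every worker
}

// Generator for the stream 1..n; an empty optional marks end of stream.
auto make_range_generator(int n) {
  return [i = 0, n]() mutable -> std::optional<int> {
    if (i < n) return ++i;
    return std::nullopt;
  };
}

// Ordered farm: the absorber sees the squares in generation order.
std::vector<int> squares_in_order(int n) {
  std::vector<int> out;
  ordered_farm(make_range_generator(n),
               [](int x) { return x * x; },
               [&out](int y) { out.push_back(y); });
  return out;
}

// Unordered farm: workers accumulate into shared memory; addition is
// commutative, so the order of the updates is irrelevant.
long sum_of_squares_unordered(int n) {
  std::atomic<long> total{0};
  unordered_farm(make_range_generator(n), [&total](int x) {
    total.fetch_add(static_cast<long>(x) * x);
  });
  return total.load();
}
```

In this sketch, `squares_in_order(5)` yields 1, 4, 9, 16, 25 regardless of which task finishes first, because the absorber reads the futures in submission order. `sum_of_squares_unordered(5)` returns 55 without any ordering guarantee on the individual updates, which is the kind of computation the normal farm targets.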

Corresponding author

Correspondence to J. Daniel Garcia.

Additional information

This work has been partially supported by the European Commission EU H2020-ICT-2014-1 Project RePhrase (No. 644235) and by the Spanish Ministry of Economy and Competitiveness through TIN2016-79637-P “Towards Unification of HPC and Big Data Paradigms”.

Cite this article

Garcia, J.D., del Rio, D., Aldinucci, M. et al. Challenging the abstraction penalty in parallel patterns libraries. J Supercomput 76, 5139–5159 (2020). https://doi.org/10.1007/s11227-019-02826-5

Keywords

  • Parallel design patterns
  • Data-intensive computing
  • Stream computing
  • Algorithmic skeletons