Data stream processing via code annotations

  • Marco Danelutto
  • Tiziano De Matteis
  • Gabriele Mencagli
  • Massimo Torquati
Abstract

Time-to-solution is an important metric when parallelizing existing code. The REPARA approach provides a systematic way to instantiate stream and data parallel patterns by annotating the sequential source code with C++11 attributes. The annotations are automatically transformed into target parallel code that uses existing parallel programming libraries (e.g., FastFlow). In this paper, we apply this approach to the parallelization of a data stream processing application. The description shows that the approach makes it easy and quick to prototype several parallel variants of the sequential code while obtaining good overall performance in terms of both throughput and latency.
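
As a rough illustration of what annotating sequential code with C++11 attributes can look like, the sketch below marks a sequential sliding-window average over a stream. The attribute names `rpr::pipeline` and `rpr::kernel`, the function `windowed_average`, and the choice of computation are assumptions made for this example only; the actual REPARA attribute set and its semantics are defined by the project's specification and are not reproduced here. Mainstream compilers typically ignore unknown attributes with a warning, so the annotated code still builds and runs sequentially.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Sequential reference code: average of the last `window` items of a stream.
// Assumes window >= 1.
// NOTE: the [[rpr::...]] attributes are hypothetical placeholders showing the
// C++11 attribute syntax; they are not taken from the REPARA specification.
std::vector<double> windowed_average(const std::vector<double>& stream,
                                     std::size_t window) {
    std::vector<double> results;
    std::deque<double> buffer;  // items currently inside the window
    double sum = 0.0;           // running sum of the buffered items

    // Candidate streaming region: each loop iteration handles one stream item.
    [[rpr::pipeline]]
    for (double item : stream) {
        // Candidate stage 1: update the window state with the new item.
        [[rpr::kernel]] {
            buffer.push_back(item);
            sum += item;
            if (buffer.size() > window) {
                sum -= buffer.front();
                buffer.pop_front();
            }
        }
        // Candidate stage 2: emit a result once the window is full.
        [[rpr::kernel]] {
            if (buffer.size() == window) {
                results.push_back(sum / static_cast<double>(window));
            }
        }
    }
    return results;
}
```

Calling `windowed_average({1, 2, 3, 4, 5}, 3)` yields the three averages 2, 3 and 4. In an annotation-driven flow, a source-to-source tool would read such markers and generate the parallel version, while the annotated file remains valid sequential C++.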

Keywords

Code annotations, Parallel patterns, Data stream processing

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Marco Danelutto
    • 1
  • Tiziano De Matteis
    • 1
  • Gabriele Mencagli
    • 1
  • Massimo Torquati
    • 1
  1. Department of Computer Science, University of Pisa, Pisa, Italy