Parallel Stream Processing with MPI for Video Analytics and Data Visualization

  • Adriano Vogel
  • Cassiano Rista
  • Gabriel Justo
  • Endrius Ewald
  • Dalvan Griebler
  • Gabriele Mencagli
  • Luiz Gustavo Fernandes
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1171)

Abstract

The amount of data generated is increasing exponentially, yet processing it and producing results quickly remains a technological challenge. Parallel stream processing can be used to handle high-frequency, high-volume data flows. The MPI parallel programming model offers low-level and flexible mechanisms for dealing with distributed architectures such as clusters. This paper uses MPI to accelerate video analytics and data visualization applications so that insight can be obtained as soon as the data arrives. Experiments were conducted with a Domain-Specific Language for Geospatial Data Visualization and a Person Recognizer video application. We applied the same stream parallelism strategy and two task distribution strategies, static and dynamic. The dynamic task distribution achieved better performance than the static distribution in the HPC cluster. The data visualization application achieved lower throughput than the video analytics application because of its I/O-intensive operations. Overall, the MPI programming model shows promising performance for stream processing applications.
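
The difference between static and dynamic task distribution can be illustrated with a small simulation. The sketch below is not MPI code and is not the authors' implementation; the abstract does not define the two strategies, so it assumes that "static" means a round-robin assignment fixed before execution and "dynamic" means handing each task to whichever worker becomes idle first. The function names and the task costs are hypothetical and chosen only for illustration.

```python
import heapq


def static_makespan(costs, workers):
    """Assumed static strategy: task i is bound to worker i % workers up front."""
    busy = [0.0] * workers
    for i, cost in enumerate(costs):
        busy[i % workers] += cost
    # The run finishes when the most loaded worker finishes.
    return max(busy)


def dynamic_makespan(costs, workers):
    """Assumed dynamic strategy: each task goes to the worker that is idle first."""
    free_at = [0.0] * workers  # min-heap of the times at which workers become idle
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)         # earliest available worker
        heapq.heappush(free_at, start + cost)  # that worker is busy until start + cost
    return max(free_at)


# Hypothetical per-task costs: a few expensive items mixed with cheap ones.
costs = [8, 1, 1, 1, 8, 1, 1, 1]
print(static_makespan(costs, 2), dynamic_makespan(costs, 2))
```

With these made-up costs, the round-robin assignment places both expensive tasks on the same worker, while the on-demand assignment balances the load. Whether a real stream behaves this way depends on how much per-item cost varies and on the communication overhead of requesting work, which this sketch ignores.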

Keywords

Parallel programming, Stream parallelism, Distributed processing, Cluster

Notes

Acknowledgements

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, by the FAPERGS 01/2017-ARD project ParaElastic (No. 17/2551-0000871-5), and by the Universal MCTIC/CNPq Nº 28/2018 project called SParCloud (No. 437693/2018-0).

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. School of Technology, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
  2. Department of Computer Science, University of Pisa, Pisa, Italy
  3. Laboratory of Advanced Research on Cloud Computing (LARCC), Três de Maio Faculty (SETREM), Três de Maio, Brazil