A Multi-level Elasticity Framework for Distributed Data Stream Processing

  • Matteo Nardelli (email author)
  • Gabriele Russo Russo
  • Valeria Cardellini
  • Francesco Lo Presti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11339)

Abstract

Data Stream Processing (DSP) applications should be able to efficiently process high-velocity continuous data streams by elastically scaling the parallelism degree of their operators, so as to deal with high variability in the workload. Moreover, to use computing resources efficiently, modern DSP frameworks should seamlessly support infrastructure elasticity, which allows them to exploit resources available on demand in geo-distributed Cloud and Fog systems. In this paper we propose E2DF, a framework to autonomously control the multi-level elasticity of DSP applications and of the underlying computing infrastructure. E2DF revolves around a hierarchical approach, with two control layers that work at different granularities and time scales. At the lower level, fully decentralized Operator and Region managers control the reconfiguration of distributed DSP operators and resources. At the higher level, centralized managers oversee the overall application and infrastructure adaptation. We have integrated the proposed solution into Apache Storm, relying on a previous extension we developed, and conducted an experimental evaluation. The evaluation shows that, even with simple control policies, E2DF can improve resource utilization without degrading application performance.
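
The two-layer control scheme described in the abstract can be illustrated with a minimal sketch. It is not E2DF's implementation or API: the class names, the utilization thresholds, and the global replica budget are all hypothetical, chosen only to show how decentralized per-operator managers could propose scaling actions that a centralized manager then accepts or rejects.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ScalingRequest:
    operator: str
    delta: int  # +1 = scale out by one replica, -1 = scale in by one replica


class OperatorManager:
    """Lower level (hypothetical): local threshold policy for a single operator."""

    def __init__(self, name: str, replicas: int = 1,
                 high: float = 0.8, low: float = 0.3) -> None:
        self.name = name
        self.replicas = replicas
        self.high = high  # assumed scale-out utilization threshold
        self.low = low    # assumed scale-in utilization threshold

    def analyze(self, utilization: float) -> Optional[ScalingRequest]:
        # Propose a reconfiguration based only on locally observed utilization.
        if utilization > self.high:
            return ScalingRequest(self.name, +1)
        if utilization < self.low and self.replicas > 1:
            return ScalingRequest(self.name, -1)
        return None

    def apply(self, request: ScalingRequest) -> None:
        self.replicas += request.delta


class ApplicationManager:
    """Higher level (hypothetical): grants requests within a global replica budget."""

    def __init__(self, managers: Dict[str, OperatorManager],
                 max_total_replicas: int) -> None:
        self.managers = managers
        self.max_total_replicas = max_total_replicas

    def control_round(self, utilizations: Dict[str, float]) -> List[ScalingRequest]:
        granted = []
        total = sum(m.replicas for m in self.managers.values())
        for name, manager in self.managers.items():
            request = manager.analyze(utilizations[name])
            if request is None:
                continue
            # Accept the proposal only if the application stays within budget.
            if total + request.delta <= self.max_total_replicas:
                manager.apply(request)
                total += request.delta
                granted.append(request)
        return granted
```

In this sketch, an `ApplicationManager` built over two `OperatorManager` instances would be driven by calling `control_round` with the latest per-operator utilizations; a request that would exceed `max_total_replicas` is simply dropped for that round. An infrastructure-level counterpart (Region managers plus a centralized infrastructure manager) could follow the same proposal-and-approval shape.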

Keywords

Data Stream Processing · Elasticity · Hierarchical control

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Matteo Nardelli (1, email author)
  • Gabriele Russo Russo (1)
  • Valeria Cardellini (1)
  • Francesco Lo Presti (1)
  1. Department of Civil Engineering and Computer Science Engineering, University of Rome Tor Vergata, Rome, Italy
