Auto-Scaling in Data Stream Processing Applications: A Model-Based Reinforcement Learning Approach

  • Conference paper
New Frontiers in Quantitative Methods in Informatics (InfQ 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 825)

Abstract

By exploiting on-the-fly computation, Data Stream Processing (DSP) applications can process huge volumes of data in near real time. Adapting the application parallelism at run-time is critical to guarantee an adequate level of QoS in the face of varying workloads. In this paper, we consider Reinforcement Learning-based techniques to self-configure the number of parallel instances of a single DSP operator. Specifically, we propose two model-based approaches and compare them to the baseline Q-learning algorithm. Our numerical investigations show that the proposed solutions provide better performance and faster convergence than the baseline.
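
To make the baseline concrete, the following is a minimal sketch of tabular Q-learning applied to scaling a single operator. It is not the paper's model: the state (current instance count `k` plus a discretized load level), the three-action set, the cost weights, the random-walk workload, and every constant and function name are assumptions chosen only for illustration. The proposed model-based approaches are not shown.

```python
import random

# Every constant below is an illustrative assumption, not a value from the paper.
K_MAX = 5                  # maximum number of parallel instances
LOAD_LEVELS = 4            # discretized input-rate levels 0..3
ACTIONS = (-1, 0, 1)       # remove one instance, do nothing, add one instance
CAPACITY_PER_INSTANCE = 1  # load levels one instance can sustain


def valid_actions(k):
    """Actions that keep the instance count within [1, K_MAX]."""
    return [a for a in ACTIONS if 1 <= k + a <= K_MAX]


def cost(k, a, load):
    """Hypothetical per-step cost: reconfiguration + resources + overload."""
    instances = k + a  # the action is applied at the start of the period
    reconfiguration = 0.2 if a != 0 else 0.0
    resources = 0.1 * instances
    overload = 1.0 if load > instances * CAPACITY_PER_INSTANCE else 0.0
    return reconfiguration + resources + overload


def q_learning(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning that minimizes expected discounted cost."""
    rng = random.Random(seed)
    q = {(k, load): {a: 0.0 for a in valid_actions(k)}
         for k in range(1, K_MAX + 1) for load in range(LOAD_LEVELS)}
    k, load = 1, 0
    for _ in range(steps):
        state = (k, load)
        if rng.random() < epsilon:
            a = rng.choice(valid_actions(k))       # explore
        else:
            a = min(q[state], key=q[state].get)    # exploit: lowest Q-value
        c = cost(k, a, load)
        # Assumed workload dynamics: a bounded random walk over load levels.
        step = rng.choice((-1, 0, 1))
        next_load = min(LOAD_LEVELS - 1, max(0, load + step))
        next_state = (k + a, next_load)
        # Q-learning update, with min instead of max because we minimize cost.
        target = c + gamma * min(q[next_state].values())
        q[state][a] += alpha * (target - q[state][a])
        k, load = next_state
    return q


def greedy_policy(q):
    """Map each state to the action with the lowest learned Q-value."""
    return {state: min(acts, key=acts.get) for state, acts in q.items()}
```

Calling `greedy_policy(q_learning())` returns a mapping from each `(k, load)` state to a scaling action. A model-free learner like this must visit every state-action pair many times before its estimates settle, which is the kind of slow convergence that motivates using knowledge of the system model.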

Notes

  1. Since we assume the action to be executed at the beginning of a time period, the number of instances during an interval is \(k+a\).

  2. http://chriswhong.com/open-data/foil_nyc_taxi/.

Corresponding author

Correspondence to Valeria Cardellini.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Cardellini, V., Lo Presti, F., Nardelli, M., Russo Russo, G. (2018). Auto-Scaling in Data Stream Processing Applications: A Model-Based Reinforcement Learning Approach. In: Balsamo, S., Marin, A., Vicario, E. (eds) New Frontiers in Quantitative Methods in Informatics. InfQ 2017. Communications in Computer and Information Science, vol 825. Springer, Cham. https://doi.org/10.1007/978-3-319-91632-3_8

  • DOI: https://doi.org/10.1007/978-3-319-91632-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91631-6

  • Online ISBN: 978-3-319-91632-3

  • eBook Packages: Computer Science, Computer Science (R0)
