Dynamic Load Balancing Techniques for Distributed Complex Event Processing Systems

  • Nikos Zacheilas
  • Nikolas Zygouras
  • Nikolaos Panagiotou
  • Vana Kalogeraki
  • Dimitrios Gunopulos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9687)

Abstract

Applying real-time, cost-effective Complex Event Processing (CEP) in the cloud has been an important goal in recent years. Distributed Stream Processing Systems (DSPS) have been widely adopted by major computing companies such as Facebook and Twitter to perform scalable event processing on streaming data. However, dynamically balancing the load of a DSPS’s components can be particularly challenging due to the high volume of data, the components’ state management needs, and the low-latency processing requirements. Systems should be able to cope with these challenges and adapt to dynamic and unpredictable load changes in real time. Our approach makes the following contributions: (i) we formulate the load balancing problem in distributed CEP systems as an instance of the job-shop scheduling problem, and (ii) we present a novel framework that dynamically balances the load of CEP engines in real time and adapts to sudden changes in the volume of streaming data by exploiting two balancing policies. Our detailed experimental evaluation using data from the Twitter social network indicates the benefits of our approach for the system’s throughput.

Copyright information

© IFIP International Federation for Information Processing 2016

Authors and Affiliations

  • Nikos Zacheilas (1)
  • Nikolas Zygouras (2)
  • Nikolaos Panagiotou (2)
  • Vana Kalogeraki (1)
  • Dimitrios Gunopulos (2)

  1. Athens University of Economics and Business, Athens, Greece
  2. University of Athens, Athens, Greece
