Influence of Parallelism Property of Streaming Engines on Their Performance

  • Conference paper
New Trends in Databases and Information Systems (ADBIS 2016)

Abstract

Recent developments in Big Data increasingly focus on supporting computation in high-velocity data environments, including the processing of continuous data streams to discover valuable insights in real time. In this work we investigate the performance of streaming engines; specifically, we address the problem of identifying optimal parameters that may affect the throughput (messages processed per second) and the latency (time to process a message). These parameters are also a function of the parallelism property, i.e. the number of additional parallel tasks (threads) available to support parallel computation. In an experimental evaluation we identify optimal cluster performance by balancing the degree of parallelism against the number of nodes, which yields maximum throughput with minimum latency.
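To make the two metrics concrete, the following is a minimal sketch of how throughput and latency respond to the degree of parallelism in a toy queueing model. It is not the authors' experimental setup: the `Message` type, the `simulate` function, and the earliest-free-task scheduling rule are all assumptions introduced for illustration, and the model ignores the number of nodes, network transfer, and coordination overhead that a real streaming engine would incur.

```python
import heapq
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    arrival: float  # time the message enters the system, in seconds
    service: float  # time one task needs to process it, in seconds


def simulate(messages: List[Message], parallelism: int) -> Tuple[float, float]:
    """Process messages in arrival order on `parallelism` identical tasks.

    Hypothetical toy model: each message goes to the task that becomes
    free first. Returns (throughput in messages/second, mean latency in
    seconds), where latency is measured from arrival to completion.
    """
    if parallelism < 1 or not messages:
        raise ValueError("need at least one task and one message")

    # Min-heap of the times at which each task becomes free.
    free_at = [0.0] * parallelism
    heapq.heapify(free_at)

    latencies = []
    last_finish = 0.0
    for msg in sorted(messages, key=lambda m: m.arrival):
        start = max(msg.arrival, heapq.heappop(free_at))
        finish = start + msg.service
        heapq.heappush(free_at, finish)
        latencies.append(finish - msg.arrival)
        last_finish = max(last_finish, finish)

    first_arrival = min(m.arrival for m in messages)
    throughput = len(messages) / (last_finish - first_arrival)
    mean_latency = sum(latencies) / len(latencies)
    return throughput, mean_latency


if __name__ == "__main__":
    # Four messages arriving together, each needing one second of work.
    burst = [Message(arrival=0.0, service=1.0) for _ in range(4)]
    for p in (1, 2, 4):
        tput, lat = simulate(burst, p)
        print(f"parallelism={p}: throughput={tput:.1f} msg/s, mean latency={lat:.1f} s")
```

In this idealized model, adding tasks always helps until every message has its own task. The balance described in the abstract arises only once per-task and per-node overheads are taken into account, which this sketch deliberately leaves out.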

Corresponding author

Correspondence to Bela Stantic.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Franciscus, N., Milosevic, Z., Stantic, B. (2016). Influence of Parallelism Property of Streaming Engines on Their Performance. In: Ivanović, M., et al. New Trends in Databases and Information Systems. ADBIS 2016. Communications in Computer and Information Science, vol 637. Springer, Cham. https://doi.org/10.1007/978-3-319-44066-8_12

  • DOI: https://doi.org/10.1007/978-3-319-44066-8_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44065-1

  • Online ISBN: 978-3-319-44066-8

  • eBook Packages: Computer Science, Computer Science (R0)
