Benchmarking Distributed Stream Processing Platforms for IoT Applications

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10080)

Abstract

The Internet of Things (IoT) is a technology paradigm where millions of sensors monitor, and help inform or manage, physical, environmental and human systems in real time. The inherent closed-loop responsiveness and decision making of IoT applications make them ideal candidates for low-latency, scalable stream processing platforms. Distributed Stream Processing Systems (DSPS) are becoming essential components of any IoT stack, but the efficacy and performance of contemporary DSPS have not been rigorously studied for IoT data streams and applications. Here, we develop a benchmark suite and performance metrics to evaluate DSPS for streaming IoT applications. The benchmark includes 13 common IoT tasks, classified across functional categories and forming micro-benchmarks, and two IoT applications for statistical summarization and predictive analytics that leverage various dataflow patterns of DSPS. These are coupled with stream workloads from real IoT observations on smart cities. We validate the benchmark for the popular Apache Storm DSPS, and present the results.

Notes

  1. https://aws.amazon.com/iot/how-it-works/

  2. https://www.microsoft.com/en-in/server-cloud/internet-of-things/overview.aspx

  3. http://map.datacanvas.org

  4. http://www.debs2015.org/call-grand-challenge.html/

  5. https://github.com/dream-lab/bm-iot

  6. Application runtime \(= \frac{7\,\mathrm{days} \times 24\,\mathrm{h} \times 60\,\mathrm{min} \times 60\,\mathrm{s}}{1000 \times \mathit{scaling}}\,\mathrm{s} = 10.08\,\mathrm{min}\).
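
The arithmetic in the last note can be checked with a short sketch. The note does not define its terms here, so the reading below is an assumption: the constant 1000 is treated as a time-compression factor applied to a 7-day trace, and `scaling` as an additional multiplier on top of it. The function and parameter names are hypothetical and are not taken from the benchmark code. Under this reading, the stated 10.08 min corresponds to `scaling = 1`.

```python
def application_runtime_seconds(days: float = 7,
                                speedup: float = 1000,
                                scaling: float = 1.0) -> float:
    """Replay time in seconds for a trace spanning `days` days.

    Assumption: `speedup` is the constant 1000 from the note and
    `scaling` is the extra factor in its denominator.
    """
    trace_seconds = days * 24 * 60 * 60      # 7 days -> 604800 s
    return trace_seconds / (speedup * scaling)


runtime_s = application_runtime_seconds()
print(f"{runtime_s:.1f} s = {runtime_s / 60:.2f} min")  # 604.8 s = 10.08 min
```

Doubling `scaling` halves the replay time under the same assumption, for example `application_runtime_seconds(scaling=2)` gives 302.4 s.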

Acknowledgement

This work was supported by grants from the Robert Bosch Center for Cyber Physical Systems (RBCCPS) at IISc, DeitY and Microsoft Azure.

Author information

Corresponding author

Correspondence to Anshu Shukla.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Shukla, A., Simmhan, Y. (2017). Benchmarking Distributed Stream Processing Platforms for IoT Applications. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. Traditional - Big Data - Internet of Things. TPCTC 2016. Lecture Notes in Computer Science, vol 10080. Springer, Cham. https://doi.org/10.1007/978-3-319-54334-5_7

  • DOI: https://doi.org/10.1007/978-3-319-54334-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54333-8

  • Online ISBN: 978-3-319-54334-5

  • eBook Packages: Computer Science, Computer Science (R0)
