Abstract
Distributed stream processing engines (DSPEs) provide various stateless stream partitioning methods that select the receiver tasks for each message regardless of its data fields. A representative DSPE, Apache Storm, provides two polarized stateless stream partitioning methods: Shuffle grouping, which considers only fairness, and Local-or-Shuffle grouping, which considers only locality. The recently proposed Locality Aware grouping is a novel technique that resolves this polarization. However, it is difficult to select an appropriate stream partitioning method given the various configurations of distributed stream applications, network capacities, and data sizes. In this paper, we benchmark the stateless stream partitioning methods from the perspective of different network bandwidths. To vary the bandwidth, we experiment on both widely used Ethernet equipment and the more recent InfiniBand, a high-performance network technology. The benchmark results can serve as selection criteria for choosing an appropriate stream partitioning method according to the network bandwidth.
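The contrast between the two Storm groupings named above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Storm's implementation: the task layout `TASKS`, the worker names, and the helper `remote_fraction` are hypothetical, and Shuffle grouping is approximated by uniform random choice. Locality Aware grouping is not modeled, because the abstract does not describe its algorithm.

```python
import random

# Hypothetical placement of receiver tasks: (task_id, worker_id).
TASKS = [(0, "w1"), (1, "w1"), (2, "w2"), (3, "w3")]


def shuffle_grouping(tasks, rng):
    """Fairness only: pick any receiver task uniformly at random."""
    return rng.choice(tasks)


def local_or_shuffle_grouping(tasks, sender_worker, rng):
    """Locality only: use receiver tasks in the sender's worker if any exist,
    otherwise fall back to plain shuffle over all tasks."""
    local = [t for t in tasks if t[1] == sender_worker]
    return rng.choice(local if local else tasks)


def remote_fraction(choose, sender_worker, n=10_000, seed=0):
    """Fraction of n messages sent to a task outside the sender's worker,
    i.e. messages that would have to cross the network."""
    rng = random.Random(seed)
    remote = sum(1 for _ in range(n) if choose(rng)[1] != sender_worker)
    return remote / n


if __name__ == "__main__":
    sender = "w1"
    shuffle = remote_fraction(lambda r: shuffle_grouping(TASKS, r), sender)
    local = remote_fraction(
        lambda r: local_or_shuffle_grouping(TASKS, sender, r), sender
    )
    print(f"Shuffle remote fraction:          {shuffle:.3f}")
    print(f"Local-or-Shuffle remote fraction: {local:.3f}")
```

In this toy layout, Shuffle spreads load over all four tasks but sends about half of the messages over the network, while Local-or-Shuffle sends none and concentrates the load on the two local tasks. How much that network traffic costs depends on the available bandwidth, which is the dimension the benchmark varies.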
Acknowledgements
This research was partly supported by the Korea Electric Power Corporation (Grant number: R18XA05) and by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2019R1A2C1085311).
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Son, S., Moon, YS. (2021). A Benchmark Test for Stateless Stream Partitioning Over Distributed Network Environments. In: Park, J.J., Loia, V., Pan, Y., Sung, Y. (eds) Advanced Multimedia and Ubiquitous Engineering. Lecture Notes in Electrical Engineering, vol 716. Springer, Singapore. https://doi.org/10.1007/978-981-15-9309-3_9
DOI: https://doi.org/10.1007/978-981-15-9309-3_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9308-6
Online ISBN: 978-981-15-9309-3
eBook Packages: Engineering, Engineering (R0)