Abstract
Massive volumes of sensor, transactional, and web data are continuously produced as streams that must be analyzed online as they arrive. The arrival rate of a big data stream may vary over time, so scheduling plays a key role in streaming applications in a big data stream computing environment. In this paper, an optimal scheduling method is proposed for big data streams to handle incomplete and delayed information. The input is a big data stream that consists of a number of data streams, each of which consists of a number of tasks. First, the input big data stream is analyzed and tasks are selected by calculating features such as volatility, Hurst exponent, and distance. An enthalpy value is then computed from the extracted features for each data stream, and this value is used as a feedback ID. Finally, the krill herd optimization algorithm is used to schedule the tasks optimally based on the generated feedback ID. The results show that the proposed model outperforms popular scheduling algorithms in terms of computational time, schedule time, and throughput.
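The feature-extraction step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption: volatility is taken as the standard deviation of successive differences, the Hurst exponent is estimated by a basic rescaled-range (R/S) regression, and distance is the Euclidean distance between two equal-length streams. The function names are hypothetical, and the enthalpy computation and krill herd scheduling are not covered because the abstract does not define them.

```python
import math
import random
from statistics import mean, pstdev


def volatility(values):
    """Assumed definition: population std. dev. of successive differences.

    Needs at least two values, otherwise statistics raises StatisticsError.
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    return pstdev(diffs)


def rescaled_range(window):
    """R/S statistic of one window: range of cumulative deviations over std. dev."""
    m = mean(window)
    cumulative, total = [], 0.0
    for v in window:
        total += v - m
        cumulative.append(total)
    spread = max(cumulative) - min(cumulative)
    sd = pstdev(window)
    return spread / sd if sd > 0 else 0.0


def hurst_exponent(values, min_window=8):
    """Estimate H as the least-squares slope of log(R/S) against log(window size)."""
    points = []
    size = min_window
    while size <= len(values) // 2:
        # Split the series into non-overlapping full windows of this size.
        chunks = [values[i:i + size] for i in range(0, len(values) - size + 1, size)]
        rs = [r for r in map(rescaled_range, chunks) if r > 0]
        if rs:
            points.append((math.log(size), math.log(mean(rs))))
        size *= 2
    if len(points) < 2:
        raise ValueError("series too short for an R/S estimate")
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x in xs)


def stream_distance(a, b):
    """Assumed definition: Euclidean distance between two equal-length streams."""
    return math.dist(a, b)


def demo_stream(n, seed=0):
    """Synthetic Gaussian noise standing in for one data stream (illustration only)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Calling `hurst_exponent(demo_stream(1024))` on uncorrelated noise should give a value in the neighborhood of 0.5, although the plain R/S estimator is known to be biased upward on short series.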
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Surapaneni, R.K., Nimmagadda, S., Govada, R.R. (2020). Handling Incomplete and Delayed Information Using Optimal Scheduling of Big Data Stream. In: Singh Tomar, G., Chaudhari, N.S., Barbosa, J.L.V., Aghwariya, M.K. (eds) International Conference on Intelligent Computing and Smart Communication 2019. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-0633-8_15
DOI: https://doi.org/10.1007/978-981-15-0633-8_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0632-1
Online ISBN: 978-981-15-0633-8
eBook Packages: Intelligent Technologies and Robotics (R0)