Dynamic energy-efficient scheduling for streaming applications in storm

Abstract

With the rapid development of information technology, the volume of data generated on the Internet has exploded in recent years. This growth has sharply increased the energy consumed by data processing, especially in real-time big data processing frameworks. In this study, two energy-efficient scheduling algorithms are proposed to reduce energy consumption for streaming applications in Storm. First, an energy consumption model is designed for the Storm framework. This model is then introduced into Storm through an energy consumption monitoring module. The first proposed algorithm minimizes the energy consumption of processing tasks by consolidating them onto nodes with low energy consumption. The second proposed algorithm trades off load balance against the energy consumption of the Storm cluster by sorting the low-energy-consumption nodes by Slot utilization and assigning tasks preferentially to the nodes with low Slot utilization. Tested on the HiBench workload, the proposed algorithms reduce the total energy consumption of the Storm cluster by up to 32% compared with traditional scheduling algorithms. The results show that the proposed scheduling algorithms can effectively reduce the total energy consumption of the Storm cluster while satisfying the deadline constraints.
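The two placement ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, the `watts_per_slot` cost, the `max_watts_per_slot` threshold used to decide which nodes count as low energy, and the tie-breaking rule are all hypothetical, and the sketch ignores the paper's energy model, monitoring module, and deadline constraints.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical worker node; fields are illustrative, not from the paper."""
    name: str
    watts_per_slot: float  # assumed per-slot power cost
    total_slots: int
    used_slots: int = 0

    @property
    def free_slots(self) -> int:
        return self.total_slots - self.used_slots

    @property
    def utilization(self) -> float:
        return self.used_slots / self.total_slots


def _check_capacity(nodes, n_tasks):
    if sum(n.free_slots for n in nodes) < n_tasks:
        raise ValueError("not enough free slots for the requested tasks")


def consolidate(nodes, n_tasks):
    """Idea behind the first algorithm: fill the cheapest nodes first."""
    _check_capacity(nodes, n_tasks)
    placement = {}
    remaining = n_tasks
    for node in sorted(nodes, key=lambda n: n.watts_per_slot):
        take = min(node.free_slots, remaining)
        if take > 0:
            node.used_slots += take
            placement[node.name] = take
            remaining -= take
        if remaining == 0:
            break
    return placement


def balance(nodes, n_tasks, max_watts_per_slot):
    """Idea behind the second algorithm: among low-energy nodes,
    give each task to the node with the lowest slot utilization."""
    candidates = [n for n in nodes if n.watts_per_slot <= max_watts_per_slot]
    _check_capacity(candidates, n_tasks)
    placement = {}
    for _ in range(n_tasks):
        open_nodes = [n for n in candidates if n.free_slots > 0]
        # Ties on utilization are broken in favor of the cheaper node (assumption).
        target = min(open_nodes, key=lambda n: (n.utilization, n.watts_per_slot))
        target.used_slots += 1
        placement[target.name] = placement.get(target.name, 0) + 1
    return placement
```

For a toy cluster with nodes `a` (10 W per slot), `b` (12 W per slot), and `c` (30 W per slot), each with 4 free slots, `consolidate(nodes, 5)` packs 4 tasks onto `a` and 1 onto `b`, while `balance(nodes, 5, max_watts_per_slot=15)` on a fresh copy of the cluster places 3 tasks on `a` and 2 on `b`, leaving the high-energy node `c` idle in both cases.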

Acknowledgements

This work was supported by the Chongqing Science and Technology Commission Project (Grant Nos. cstc2017jcyjAX0142 and cstc2018jcyjAX0525) and the Key Research and Development Projects of the Sichuan Science and Technology Department (Grant No. 2019YFG0107).

Author information

Corresponding author

Correspondence to Hongjian Li.

About this article

Cite this article

Li, H., Dai, H., Liu, Z. et al. Dynamic energy-efficient scheduling for streaming applications in storm. Computing (2021). https://doi.org/10.1007/s00607-021-00961-7

Keywords

  • Streaming application
  • Apache Storm
  • Energy-efficient scheduling
  • Slot usage
  • Load balancing

Mathematics Subject Classification

  • 68Q85