Research of Task Scheduling Mechanism Based on Prediction of Memory Utilization

  • Juan Fang
  • Mengxuan Wang
  • Hao Sun
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 747)


With the arrival of the big data era, the distributed computing framework Hadoop has become a mainstream solution for processing big data. Performance is usually improved by adding computing nodes to the cluster, but as the cluster grows, a large amount of power is consumed when no reasonable management strategy is in place. How to make full use of the cluster's computing resources, improving overall system performance while reducing power consumption, has therefore become a major research direction in both academia and industry. To that end, this paper first tunes the configuration parameters provided by Hadoop; a comparison with Hadoop's default configuration shows that parameter tuning yields better performance. The paper then proposes a task scheduling mechanism based on memory usage prediction. The mechanism predicts the future memory usage of each computing node by analyzing its past usage, and relieves memory pressure by assigning fewer tasks to a node that is under memory pressure. A configurable memory usage threshold makes the mechanism more flexible. By predicting memory usage, the mechanism improves system performance through fuller use of the computing resources.
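The scheduling idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the prediction model or the reduction policy, so everything here is an assumption made for illustration: an ordinary least-squares line fitted to recent utilization samples stands in for the predictor, and halving the number of assigned tasks stands in for "reducing the allocation". The function names `predict_next_usage` and `tasks_to_assign`, the default threshold of 0.8, and the halving rule are all hypothetical and are not taken from the paper or from any Hadoop API.

```python
from typing import Sequence


def predict_next_usage(samples: Sequence[float]) -> float:
    """Extrapolate the next memory-utilization sample (a fraction in 0..1).

    Assumption: a straight-line least-squares fit over the recent samples
    is a good enough stand-in for the paper's (unspecified) predictor.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one utilization sample")
    if n == 1:
        return samples[0]

    # Samples are taken at equally spaced times x = 0, 1, ..., n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Predict the value at the next time step and clamp it to a valid fraction.
    predicted = intercept + slope * n
    return min(1.0, max(0.0, predicted))


def tasks_to_assign(samples: Sequence[float], free_slots: int,
                    threshold: float = 0.8) -> int:
    """Decide how many tasks to hand to a node on this scheduling round.

    Hypothetical policy: if predicted utilization reaches the threshold,
    the node is treated as under memory pressure and gets half its free
    slots; otherwise it gets all of them.
    """
    if predict_next_usage(samples) >= threshold:
        return free_slots // 2
    return free_slots
```

For example, a node whose utilization climbed through 0.5, 0.6 and 0.7 is predicted to reach about 0.8 on the next sample, so with a threshold of 0.75 it would receive only half of its free slots, while a node holding steady at 0.3 would receive all of them. Lowering the threshold makes the scheduler back off earlier, which is the flexibility the abstract attributes to the threshold setting.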


Big data, High performance, Prediction, Task scheduling



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Beijing University of Technology, Beijing, China
