Efficient Resource Scheduling for Big Data Processing in Cloud Platform

  • Mohammad Mehedi Hassan
  • Biao Song
  • M. Shamim Hossain
  • Atif Alamri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8729)


Big data processing in the cloud is becoming an inevitable trend, and it requires a specially designed cloud resource allocation approach. However, it is challenging to allocate resources dynamically and efficiently according to the QoS demands of Big data applications while also saving energy and cost by optimizing the number of servers in use. To address this problem, this paper establishes a general problem formulation. Under certain assumptions, we prove that reducing resource waste is directly related to cost minimization. Based on that result, we develop efficient heuristic algorithms with tuning parameters to find cost-minimized dynamic resource allocation solutions for this problem. We study and test the Big data workload by running a group of typical Big data jobs, namely video surveillance services, on Amazon EC2. We then create a large simulation scenario and compare our proposed method with other approaches.
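The link between resource waste and the number of servers in use can be illustrated with a generic packing heuristic. The sketch below is not the algorithm proposed in the paper, which the abstract does not specify. It is a plain first-fit decreasing heuristic for multidimensional bin packing, shown only as an assumption-laden example of how jobs with multi-resource demands might be consolidated onto few servers. The function name, the two-resource demand vectors, and the identical server capacities are all hypothetical.

```python
from typing import List, Sequence


def first_fit_decreasing(demands: Sequence[Sequence[float]],
                         capacity: Sequence[float]) -> List[List[int]]:
    """Greedily place jobs on identical servers; return job indices per server.

    Illustrative only: a textbook first-fit decreasing heuristic,
    not the paper's method. Capacities are assumed to be positive.
    """
    # Reject any job that could never fit on a single empty server.
    for d in demands:
        if any(x > c for x, c in zip(d, capacity)):
            raise ValueError("demand exceeds capacity of a single server")

    # Consider the jobs with the largest normalized demand first.
    order = sorted(
        range(len(demands)),
        key=lambda i: max(x / c for x, c in zip(demands[i], capacity)),
        reverse=True,
    )

    servers: List[List[int]] = []   # job indices assigned to each server
    free: List[List[float]] = []    # remaining capacity of each server
    eps = 1e-9                      # tolerance for floating-point round-off

    for i in order:
        d = demands[i]
        for s, rem in enumerate(free):
            # Place the job on the first open server with enough room
            # in every resource dimension.
            if all(x <= r + eps for x, r in zip(d, rem)):
                free[s] = [r - x for x, r in zip(d, rem)]
                servers[s].append(i)
                break
        else:
            # No open server fits, so switch on a new one.
            free.append([c - x for x, c in zip(d, capacity)])
            servers.append([i])
    return servers


# Hypothetical (CPU, memory) demands as fractions of one server.
jobs = [(0.5, 0.2), (0.5, 0.3), (0.4, 0.6), (0.6, 0.4)]
placement = first_fit_decreasing(jobs, capacity=(1.0, 1.0))
print(len(placement), "servers used:", placement)
```

In this sketch, unused capacity left on an open server is the "resource waste"; fewer open servers means less total waste, which is the intuition behind tying waste reduction to cost.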


Big data, resource allocation, cloud computing, optimization





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Mohammad Mehedi Hassan (1)
  • Biao Song (1)
  • M. Shamim Hossain (1)
  • Atif Alamri (1)
  1. College of Computer and Information Sciences, Pervasive and Mobile Computing, King Saud University, Riyadh, Saudi Arabia
