Dynamic Task Allocation for Data-Intensive Workflows in Cloud Environment

  • Xiping Liu
  • Liyang Zheng
  • Chen Junyu
  • Lei Shang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11434)


Cloud environments provide high-performance computing services for processing the massive data of data-intensive workflows. Because tasks differ in their functional requirements, the tasks of a workflow might be allocated to multiple cloud servers. The massive data exchanged among these tasks must then be transferred between servers, which greatly increases the execution cost. To reduce the amount of data transferred during workflow execution, this paper proposes a dynamic task allocation method based on data dependencies. A workflow with data dependencies and typical control logic, i.e., sequential, parallel, and exclusive choice, is described using process algebra. The data size associated with a data dependency can be obtained only after the task producing the data has executed. Each task is allocated to a server according to the relevant data sizes and the maximal data paths. A case study illustrates the feasibility and effect of the proposed method, and related work is discussed on the basis of the case study.
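The abstract does not spell out the allocation algorithm, so the following is only a minimal sketch of the general idea of data-dependency-aware allocation, not the authors' method. It assumes that each task has a set of candidate servers that satisfy its functional requirements, that tasks are visited in dependency order, and that the size of each data dependency is known once the producing task has finished. Each task is placed on the candidate server that already holds the largest share of its input data. The names `Task`, `allocate`, and `data_size` are hypothetical, and the sketch ignores maximal data paths and exclusive choice.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Task:
    name: str
    candidates: List[str]  # servers that meet the task's functional requirements (assumed given)
    inputs: List[str] = field(default_factory=list)  # names of predecessor tasks


def allocate(tasks: List[Task],
             data_size: Dict[Tuple[str, str], int]) -> Tuple[Dict[str, str], int]:
    """Greedy placement sketch (hypothetical, not the paper's algorithm).

    `tasks` must be in dependency order. `data_size[(pred, task)]` is the size
    of the data sent from `pred` to `task`; in a dynamic setting it would be
    filled in as each predecessor finishes executing.
    Returns the placement and the total amount of data moved between servers.
    """
    placement: Dict[str, str] = {}
    transferred = 0
    for task in tasks:
        # Amount of this task's input data already located on each candidate server.
        local = {server: 0 for server in task.candidates}
        total = 0
        for pred in task.inputs:
            size = data_size[(pred, task.name)]
            total += size
            pred_server = placement[pred]
            if pred_server in local:
                local[pred_server] += size
        # Pick the candidate holding the most input data; ties go to the first candidate.
        best = max(task.candidates, key=lambda server: local[server])
        placement[task.name] = best
        # Everything not already on the chosen server has to be transferred.
        transferred += total - local[best]
    return placement, transferred


workflow = [
    Task("A", ["s1"]),
    Task("B", ["s2"]),
    Task("C", ["s1", "s2"], inputs=["A", "B"]),
    Task("D", ["s1", "s2"], inputs=["C"]),
]
sizes = {("A", "C"): 10, ("B", "C"): 50, ("C", "D"): 5}
placement, transferred = allocate(workflow, sizes)
print(placement, transferred)
```

In this toy workflow, task C is placed with B because B sends it the larger input, so only A's output crosses servers.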


Dynamic task allocation, Data-intensive workflows, Cloud environment, Data dependency, Maximal data path



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Xiping Liu (1)
  • Liyang Zheng (1)
  • Chen Junyu (1)
  • Lei Shang (1)

  1. Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
