Fair and near-optimal coflow scheduling without prior knowledge of coflow size

The Journal of Supercomputing

Abstract

Achieving both the minimum average coflow completion time (CCT) and isolation guarantees for multiple tenants is a challenge in cloud environments, because the two targets conflict and cannot be achieved simultaneously. Prior solutions pursue a single target, either minimizing the average CCT or providing isolation guarantees. They are also limited to clairvoyant scheduling: they assume that complete knowledge of coflow sizes is available before communication starts. In this paper, we propose an efficient scheduling algorithm, smallest-height-first DRF (SHFDRF), that provides near-optimal scheduling and isolation guarantees without prior knowledge of coflow size. SHFDRF achieves long-term isolation guarantees and a near-minimum average CCT by combining a smallest-height-first strategy with a monopolistic dominant resource fairness (DRF) bandwidth allocation strategy. Together, these two strategies can also improve link utilization and system throughput. Trace-driven simulation shows that SHFDRF completes communication stages 1.28×, 2.27×, and 6.28× faster at the 95th percentile than DRF, NCDRF, and Per-Flow Fairness, respectively. Even compared with the minimum CCT, coflow completion time is only 13.9% slower at the 95th percentile. Overall, the performance of SHFDRF is acceptable, and it can be applied in real datacenters without requiring complete prior knowledge.
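
The tension between fairness and average CCT described in the abstract can be illustrated with a toy calculation. The sketch below is not SHFDRF and is not taken from the paper. It assumes a single shared link of unit capacity, treats each coflow as one aggregate transfer of a known size, and uses hypothetical function names. It only compares serving coflows smallest-first against splitting the link equally among unfinished coflows.

```python
def avg_cct_smallest_first(sizes, capacity=1.0):
    """Average CCT when coflows get the whole link one at a time, smallest first."""
    t = 0.0
    ccts = []
    for size in sorted(sizes):
        t += size / capacity      # this coflow finishes after all smaller ones
        ccts.append(t)
    return sum(ccts) / len(ccts)


def avg_cct_equal_share(sizes, capacity=1.0):
    """Average CCT when the link is split equally among unfinished coflows."""
    ordered = sorted(sizes)
    n = len(ordered)
    t = 0.0
    served = 0.0                  # data already delivered to every unfinished coflow
    ccts = []
    for i, size in enumerate(ordered):
        active = n - i            # coflows still sharing the link
        t += (size - served) * active / capacity
        served = size
        ccts.append(t)
    return sum(ccts) / n


# Illustrative sizes only (assumption, not data from the paper).
sizes = [1.0, 2.0, 3.0]
smallest_first = avg_cct_smallest_first(sizes)
equal_share = avg_cct_equal_share(sizes)
print(smallest_first, equal_share)
```

In this toy setting the equal split gives every coflow the same instantaneous share but a higher average CCT, while smallest-first lowers the average by making larger coflows wait. A scheduler that targets both goals has to balance the two effects.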

Acknowledgements

This study was supported by the National Natural Science Foundation of China (Grant No. 61772386).

Author information

Corresponding author

Correspondence to Huyin Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, C., Zhang, H., Ding, W. et al. Fair and near-optimal coflow scheduling without prior knowledge of coflow size. J Supercomput 77, 7690–7717 (2021). https://doi.org/10.1007/s11227-020-03614-2

  • DOI: https://doi.org/10.1007/s11227-020-03614-2
