M-DRL: Deep Reinforcement Learning Based Coflow Traffic Scheduler with MLFQ Threshold Adaption

Chen, Tianba; Li, Wei; Sun, YuKang; Li, Yunchun

doi:10.1007/978-3-030-79478-1_7

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12639)

Included in the following conference series:

IFIP International Conference on Network and Parallel Computing

Abstract

Coflow scheduling in data-parallel clusters can improve application-level communication performance. Existing coflow scheduling methods that operate without prior knowledge usually use a Multi-Level Feedback Queue (MLFQ) with fixed threshold parameters, which makes them insensitive to coflow traffic characteristics. Manually adjusting the threshold parameters for different application scenarios often requires a long optimization period and yields only coarse-grained optimization. We propose M-DRL, a deep reinforcement learning based coflow traffic scheduler that dynamically sets the MLFQ thresholds to adapt to coflow traffic characteristics and reduce the average coflow completion time. Trace-driven simulations on a public dataset show that coflow communication stages using M-DRL complete 2.08× (6.48×) and 1.36× (1.25×) faster in average (95th-percentile) coflow completion time than per-flow fairness and Aalo, respectively, and that M-DRL is comparable to SEBF, which relies on prior knowledge.
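
To make the role of the thresholds concrete, the following is a minimal sketch of how an MLFQ might map a coflow to a priority queue from the bytes it has sent so far. It is not the authors' implementation: the function names, the exponentially spaced "fixed" thresholds, and the "adapted" threshold values are all illustrative assumptions, and the learned policy that would choose the thresholds in M-DRL is not shown.

```python
from bisect import bisect_right

MB = 1024 ** 2
GB = 1024 ** 3


def exponential_thresholds(first_bytes, factor, num_queues):
    """Return the num_queues - 1 boundaries between adjacent queues.

    Hypothetical helper: exponential spacing is an assumption used
    here only to stand in for a fixed, hand-tuned configuration.
    """
    return [first_bytes * factor ** i for i in range(num_queues - 1)]


def queue_index(bytes_sent, thresholds):
    """Map a coflow to a queue; queue 0 has the highest priority.

    A coflow starts in queue 0 and is demoted each time the bytes it
    has sent reach the next threshold, so no prior knowledge of its
    total size is needed.
    """
    return bisect_right(thresholds, bytes_sent)


# Fixed thresholds (assumed values): 10 MB, 100 MB, 1000 MB.
fixed = exponential_thresholds(10 * MB, 10, 4)

# Thresholds a learned policy might emit instead (made-up values).
adapted = [1 * MB, 50 * MB, 2 * GB]

sent = 60 * MB
print("fixed thresholds   -> queue", queue_index(sent, fixed))
print("adapted thresholds -> queue", queue_index(sent, adapted))
```

The same coflow lands in a different queue under the two threshold sets, which is why the choice of thresholds affects which coflows are served first and, in turn, the coflow completion time.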

Acknowledgement

This work is supported by the National Key Research and Development Program of China (Grant No. 2016YFB1000304) and the National Natural Science Foundation of China (Grant No. 1636208).

Author information

Authors and Affiliations

Beijing Key Lab of Network Technology, School of Computer Science and Engineering, Beihang University, Beijing, China
Tianba Chen, Wei Li, YuKang Sun & Yunchun Li

Corresponding author

Correspondence to Wei Li.

Editor information

Editors and Affiliations

Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Xin He
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
En Shao
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Guangming Tan

Cite this paper

Chen, T., Li, W., Sun, Y., Li, Y. (2021). M-DRL: Deep Reinforcement Learning Based Coflow Traffic Scheduler with MLFQ Threshold Adaption. In: He, X., Shao, E., Tan, G. (eds) Network and Parallel Computing. NPC 2020. Lecture Notes in Computer Science, vol 12639. Springer, Cham. https://doi.org/10.1007/978-3-030-79478-1_7

DOI: https://doi.org/10.1007/978-3-030-79478-1_7
Published: 23 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79477-4
Online ISBN: 978-3-030-79478-1
eBook Packages: Computer Science (R0)
