Responsive Regulation of Dynamic UAV Communication Networks Based on Deep Reinforcement Learning

  • Chapter
Broadband Communications, Computing, and Control for Ubiquitous Intelligence

Part of the book series: Wireless Networks ((WN))

Abstract

In this chapter, the regulation of an unmanned aerial vehicle (UAV) communication network is investigated in the presence of dynamic changes in the UAV lineup and user distribution. We target an optimal UAV control policy that identifies an upcoming change in the UAV lineup (quit or join-in) or user distribution and proactively relocates the UAVs ahead of the change, rather than passively dispatching them afterwards. Specifically, a deep reinforcement learning (DRL)-based UAV control framework is developed to maximize the accumulated user satisfaction (US) score over a given time horizon while handling changes in both the UAV lineup and the user distribution. Through a deliberate state-transition design, the framework accommodates the different dimensions of the state-action space before and after a lineup change. In addition, to handle the continuous state and action spaces, the deep deterministic policy gradient (DDPG) algorithm, an actor-critic DRL method, is exploited. Furthermore, to promote exploration around the timing of the change, the original DDPG scheme is adapted into an asynchronous parallel computing (APC) structure, which yields better training performance in both the critic and actor networks. Finally, extensive simulations are conducted to validate the convergence of the proposed learning approach, and to demonstrate its capability in jointly handling the dynamics of the UAV lineup and user distribution as well as its superiority over a passive reaction method.
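The fixed-dimension state design described in the abstract can be sketched in Python. This is a minimal illustration, not the chapter's implementation: the slot count `MAX_UAVS`, the field layout, and the `build_state` helper are all hypothetical. It shows one common way to keep the DDPG input dimension constant across quit/join events, by zero-padding inactive UAV slots and appending an activity mask plus a countdown to the next lineup change so the policy can act proactively.

```python
import numpy as np

MAX_UAVS = 4  # hypothetical upper bound on the UAV lineup size


def build_state(uav_positions, active_mask, time_to_change):
    """Pack a variable UAV lineup into a fixed-length state vector.

    Layout (illustrative): [x1, y1, ..., xM, yM | activity mask | countdown].
    Inactive slots are zero-padded so the actor/critic networks see a
    constant input dimension before and after a quit or join-in event.
    """
    state = np.zeros(2 * MAX_UAVS + MAX_UAVS + 1)
    for i, (x, y) in enumerate(uav_positions):
        if active_mask[i]:
            state[2 * i : 2 * i + 2] = (x, y)
    state[2 * MAX_UAVS : 3 * MAX_UAVS] = active_mask
    state[-1] = time_to_change  # countdown lets the policy relocate UAVs ahead of the change
    return state


# Two active UAVs out of four slots; a lineup change is due in 5 time steps.
s = build_state(
    [(1.0, 2.0), (3.0, 4.0), (0.0, 0.0), (0.0, 0.0)],
    active_mask=np.array([1.0, 1.0, 0.0, 0.0]),
    time_to_change=5.0,
)
print(s.shape)  # (13,)
```

With this encoding, a UAV quitting simply flips its mask entry to zero, so the same actor and critic networks remain usable across the transition.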


Notes

  1. In such a scenario, the duration of one time step needs to be scaled up to the order of minutes.


Author information


Corresponding author

Correspondence to Ran Zhang.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Zhang, R., Nguyen, D.M., Wang, M., Cai, L.X., Shen, X. (2022). Responsive Regulation of Dynamic UAV Communication Networks Based on Deep Reinforcement Learning. In: Cai, L., Mark, B.L., Pan, J. (eds) Broadband Communications, Computing, and Control for Ubiquitous Intelligence. Wireless Networks. Springer, Cham. https://doi.org/10.1007/978-3-030-98064-1_3

  • DOI: https://doi.org/10.1007/978-3-030-98064-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-98063-4

  • Online ISBN: 978-3-030-98064-1

  • eBook Packages: Computer Science, Computer Science (R0)
