Abstract
To provide spectrum- and energy-efficient communication for unmanned aerial vehicle (UAV) assisted cellular networks, the problem of joint beamforming and power allocation (JBPA) in an aerial multicell scenario is addressed. A JBPA multi-objective optimization model that simultaneously maximizes the achievable spectrum and energy efficiency is first developed. To solve this multi-objective optimization problem, a centralized deep reinforcement learning (DRL) algorithm, namely an upper confidence bound based Dueling deep Q network (UCB DDQN) with the Mish activation function, is proposed, and this learning algorithm is used to design the JBPA strategy. Furthermore, a federated UCB DDQN learning based JBPA is proposed to tackle the excessive data exchange that centralized DRL would require. Simulation results validate that the UCB DDQN based JBPA converges faster and achieves a higher total weighted energy-spectrum efficiency (TWESE) than the conventional DQN based resource allocation approach, and also indicate that the federated UCB DDQN achieves better TWESE performance than the centralized UCB DDQN.
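Two ingredients named in the abstract, the Mish activation and upper-confidence-bound (UCB) action selection, can be sketched in a few lines. This is a minimal illustration under standard definitions (Mish: x·tanh(softplus(x)); UCB bonus: c·sqrt(ln t / N(a))), not the paper's implementation; the function names and the exploration constant `c` are illustrative.

```python
import numpy as np

def mish(x):
    """Mish activation: x * tanh(softplus(x)).

    softplus is computed as logaddexp(0, x) for numerical stability
    with large |x|.
    """
    return x * np.tanh(np.logaddexp(0.0, x))

def ucb_action(q_values, counts, t, c=1.0):
    """UCB-style action selection over Q-value estimates.

    Adds an exploration bonus c * sqrt(ln t / N(a)) to each action's
    Q-estimate; actions never tried before (N(a) == 0) receive an
    infinite bonus, so they are explored first.
    """
    q = np.asarray(q_values, dtype=float)
    n = np.asarray(counts, dtype=float)
    bonus = np.where(n > 0,
                     c * np.sqrt(np.log(t) / np.maximum(n, 1.0)),
                     np.inf)
    return int(np.argmax(q + bonus))
```

In a Dueling DQN, `mish` would replace ReLU in the hidden layers, while `ucb_action` would replace epsilon-greedy exploration when choosing the beamforming/power-allocation action at each step.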
Availability of Data and Material
Simulation results were obtained using Python; all supporting materials are listed in the references section.
Code Availability
Available on request.
Funding
This work was supported in part by the Program of the Aeronautical Science Foundation of China under Grant 2018ZC1503.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest related to this work, and no commercial or associative interest that represents a conflict of interest in connection with the submitted work.
About this article
Cite this article
Li, H., Lv, X. & Zhang, S. Multi-objective Deep Reinforcement Learning Based Joint Beamforming and Power Allocation in UAV Assisted Cellular Communication. Wireless Pers Commun 134, 809–829 (2024). https://doi.org/10.1007/s11277-024-10927-5