
Multi-objective Deep Reinforcement Learning Based Joint Beamforming and Power Allocation in UAV Assisted Cellular Communication

Published in Wireless Personal Communications

Abstract

To provide spectrum- and energy-efficient communication for unmanned aerial vehicle (UAV) assisted cellular networks, this paper addresses the problem of joint beamforming and power allocation (JBPA) in an aerial multicell scenario. A multi-objective JBPA optimization model that simultaneously maximizes the achievable spectrum and energy efficiency is first developed. To solve this multi-objective problem, a centralized deep reinforcement learning (DRL) algorithm, namely an upper confidence bound based Dueling deep Q network (UCB DDQN) with the Mish activation function, is proposed, and this learning algorithm is used to design the JBPA strategy. Furthermore, a federated UCB DDQN learning based JBPA is proposed to tackle the challenge that centralized DRL requires excessive data exchange. Simulation results validate that the UCB DDQN based JBPA converges faster and achieves a higher total weighted energy-spectrum efficiency (TWESE) than the conventional DQN based resource allocation approach, and also indicate that the federated UCB DDQN achieves better TWESE performance than the centralized UCB DDQN.
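As a rough illustration only (not the authors' implementation; all function names here are hypothetical), the three building blocks named in the abstract can be sketched in a few lines of Python: the Mish activation (x·tanh(softplus(x))), UCB-driven exploration over a discrete JBPA action set, and a dueling Q-value head.

```python
import numpy as np

def mish(x):
    # Mish activation: x * tanh(softplus(x)), per Misra (2019).
    return x * np.tanh(np.log1p(np.exp(x)))

def ucb_action(q_values, counts, t, c=2.0):
    # UCB exploration: pick argmax_a Q(a) + c * sqrt(ln t / N(a));
    # actions never tried (N(a) = 0) get infinite bonus and are tried first.
    bonus = np.where(counts > 0,
                     c * np.sqrt(np.log(max(t, 1)) / np.maximum(counts, 1)),
                     np.inf)
    return int(np.argmax(q_values + bonus))

def dueling_q(features, w_v, w_a):
    # Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a),
    # with hypothetical linear weights w_v (state value) and w_a (advantages).
    v = features @ w_v          # scalar state value V(s)
    a = features @ w_a          # advantage vector A(s, .)
    return v + a - a.mean()
```

In a DDQN-style agent these pieces would be combined: the network uses Mish between layers, its output passes through the dueling aggregation, and the behavior policy selects actions via the UCB rule instead of epsilon-greedy.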


Availability of Data and Material

Simulation results were obtained using Python; all materials used are listed in the reference section.

Code Availability

Available on request.

References

  1. Wang, H., Zhao, H., Zhang, J., Ma, D., Li, J., & Wei, J. (2020). Survey on unmanned aerial vehicle networks: A cyber physical system perspective. IEEE Communications Surveys and Tutorials, 22(2), 1027–1070.

  2. Dang, S., Amin, O., Shihada, B., & Alouini, M.-S. (2020). What should 6G be? Nature Electronics, 3(1), 20–29.

  3. Gupta, M., Vikash, & Varma, S. (2018). Configuration of aerial mesh networks with Internet of Things. In Proceedings of the 2018 international conference on wireless communications, signal processing and networking (WiSPNET) (pp. 1–3).

  4. Vikash, L. M., & Varma, S. (2020). Performance evaluation of real-time stream processing systems for Internet of Things applications. Future Generation Computer Systems, 113, 207–217.

  5. Vikash, L. M., & Varma, S. (2021). Middleware technologies for smart wireless sensor networks towards Internet of Things: A comparative review. Wireless Personal Communications, 116, 1539–1574.

  6. Yu, X., Teng, T., Dang, X., Leung, S.-H., & Xu, F. (2021). Joint power allocation and beamforming for energy-efficient design in multiuser distributed MIMO systems. IEEE Transactions on Communications, 69(6), 4128–4143.

  7. Shao, W., Zhang, S., Zhang, X., Ma, J., & Zhao, N. (2019). Suppressing interference and power allocation over the multicell MIMO-NOMA networks. IEEE Communications Letters, 23(8), 1397–1400.

  8. Fu, Y., Zhang, M., & Salaün, L. (2020). Zero-forcing oriented power minimization for multi-cell MISO-NOMA systems: A joint user grouping, beamforming, and power control perspective. IEEE Journal on Selected Areas in Communications, 38(8), 1925–1940.

  9. Mismar, F. B., Evans, B. L., & Alkhateeb, A. (2020). Deep reinforcement learning for 5G networks: Joint beamforming, power control, and interference coordination. IEEE Transactions on Communications, 68(3), 1581–1592.

  10. Liu, M., & Wang, R. (2020). Deep reinforcement learning based dynamic power and beamforming design for time-varying wireless downlink interference channel. arXiv preprint, https://arxiv.org/abs/2011.03780

  11. Chen, X., Wu, X., Han, S., & Xie, Z. (2019). Joint optimization of EE and SE considering interference threshold in ultra-dense networks. In 2019 15th international wireless communications and mobile computing conference (IWCMC) (pp. 1305–1310).

  12. Liu, Z., Han, Y., Fan, J., Zhang, L., & Lin, Y. (2020). Joint optimization of spectrum and energy efficiency considering the C-V2X security: A deep reinforcement learning approach. In Proceedings of the IEEE 18th international conference on industrial informatics (INDIN) (pp. 315–320).

  13. Guo, Y., Liu, Y., Wu, Q., Li, X., & Shi, Q. (2023). Joint beamforming and power allocation for RIS aided full-duplex integrated sensing and uplink communication system. IEEE Transactions on Wireless Communications.

  14. Muy, S., & Lee, J.-R. (2023). Joint optimization of trajectory, beamforming, and power allocation in UAV-enabled WPT networks using DRL combined with water-filling algorithm. Vehicular Communications, 43, 100632.

  15. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518, 529–533.

  16. Misra, D. (2019). Mish: A self regularized non-monotonic activation function. arXiv preprint, https://arxiv.org/abs/1908.08681

Funding

This work was supported in part by the Program of the Aeronautical Science Foundation of China under Grant 2018ZC1503.

Author information

Corresponding author

Correspondence to Haitao Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest, and no commercial or associative interest that represents a conflict of interest in connection with the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, H., Lv, X. & Zhang, S. Multi-objective Deep Reinforcement Learning Based Joint Beamforming and Power Allocation in UAV Assisted Cellular Communication. Wireless Pers Commun 134, 809–829 (2024). https://doi.org/10.1007/s11277-024-10927-5
