Deep reinforcement learning empowered joint mode selection and resource allocation for RIS-aided D2D communications

  • Review
  • Neural Computing and Applications

Abstract

Device-to-device (D2D) communication is regarded as a promising way to alleviate the explosive growth of mobile traffic because it can improve system data rate and resource utilization. This paper investigates a reconfigurable intelligent surface (RIS)-aided mobile D2D communication framework in which the RIS is deployed to improve communication quality. As the transmission distance of D2D pairs changes, both the mode selection for D2D pairs and the phase shift design for the RIS are essential in mobile scenarios. We therefore formulate a joint optimization problem of mode selection, channel assignment, power allocation, and discrete phase shift selection to maximize the average sum data rate of D2D pairs. The problem is constrained by the maximum transmit power and by the minimum data rate requirements of users, where the latter guarantees fairness among D2D pairs. To solve this challenging optimization, we first reformulate the original sequential decision-making problem as a Markov game (MG). We then propose a multi-agent deep reinforcement learning (MADRL) framework in which multiple agents cooperatively determine the joint mode selection and resource allocation strategy. The framework combines the multi-pass deep Q-network (MP-DQN) algorithm with the decaying DQN algorithm: the D2D pairs adopt MP-DQN to handle the hybrid discrete-continuous action space, and the RIS agent invokes decaying DQN to select discrete phase shifts. Simulation results demonstrate that the proposed algorithm converges in different cases. The proposed MADRL-based algorithm outperforms a combination of DQN and deep deterministic policy gradient (DDPG) in terms of system performance. Moreover, deploying the RIS significantly improves the average sum data rate of D2D pairs, and increasing the number of reflecting elements (REs) enhances it further.
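To make the quantities in the abstract concrete, the following is a minimal Python sketch of a discrete phase shift set and the data rate of a single RIS-aided link. It is an illustration under stated assumptions, not the paper's formulation: the uniform b-bit phase quantization, the narrowband channel model (direct path plus one reflected path per element), and all function and parameter names are hypothetical. The greedy phase alignment is only a simple baseline and is not the decaying DQN agent described above.

```python
import cmath
import math


def phase_set(bits):
    """Assumed uniform quantization: 2**bits phase shifts in [0, 2*pi)."""
    levels = 2 ** bits
    return [2 * math.pi * k / levels for k in range(levels)]


def effective_channel(h_direct, h_tx_ris, h_ris_rx, phases):
    """Direct link plus the sum of the paths reflected by each RIS element."""
    reflected = sum(
        a * cmath.exp(1j * theta) * b
        for a, theta, b in zip(h_tx_ris, phases, h_ris_rx)
    )
    return h_direct + reflected


def data_rate(power, channel, noise_power, interference=0.0):
    """Shannon rate in bit/s/Hz for a given transmit power and channel."""
    sinr = power * abs(channel) ** 2 / (noise_power + interference)
    return math.log2(1 + sinr)


def greedy_phases(h_direct, h_tx_ris, h_ris_rx, bits):
    """Baseline (not the paper's method): per element, pick the quantized
    phase that best aligns the reflected path with the direct path."""
    target = cmath.phase(h_direct)
    choices = phase_set(bits)
    selected = []
    for a, b in zip(h_tx_ris, h_ris_rx):
        cascaded = cmath.phase(a * b)
        # Angular distance to the direct path, wrapped to (-pi, pi].
        selected.append(min(
            choices,
            key=lambda th: abs(cmath.phase(cmath.exp(1j * (cascaded + th - target)))),
        ))
    return selected


# Illustrative values only: two reflecting elements, 2-bit phase shifts.
h_d, h_tr, h_rr = 1 + 0j, [1j, -1], [1, 1]
phases = greedy_phases(h_d, h_tr, h_rr, bits=2)
rate_without_ris = data_rate(1.0, h_d, 1.0)
rate_with_ris = data_rate(1.0, effective_channel(h_d, h_tr, h_rr, phases), 1.0)
```

In this toy setting each reflected path is rotated onto the direct path, so the link rate with the RIS is higher than without it. In the joint problem described in the abstract, mode selection, channel assignment, and power allocation would change the `power` and `interference` arguments for every D2D pair.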

Data availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants No. 61972079, 62172084, 62132004, in part by the Major Research Plan of National Natural Science Foundation of China under Grant No. 92167103, in part by the LiaoNing Revitalization Talents Program under Grant No. XLYC2007162, in part by the LiaoNing Key Research and Development Program under Grant No. 2023JH2/101300196, in part by the Central Government Guided Local Science and Technology Development Fund Project under Grant No. 2020ZY0003, in part by the Science and Technology Plan Project of Inner Mongolia Autonomous Region of China under Grant No. 2020GG0189, in part by the Fundamental Research Funds for the Central Universities under Grants No. N2224001-7, N2216009, N2216006, N2116004.

Author information

Corresponding author

Correspondence to Jie Jia.

Ethics declarations

Conflict of interest

We promise that this manuscript is the authors’ original work and has not been published nor has it been submitted simultaneously elsewhere. All authors have checked the manuscript and have agreed to the submission.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guo, L., Jia, J., Chen, J. et al. Deep reinforcement learning empowered joint mode selection and resource allocation for RIS-aided D2D communications. Neural Comput & Applic 35, 18231–18249 (2023). https://doi.org/10.1007/s00521-023-08745-0

  • DOI: https://doi.org/10.1007/s00521-023-08745-0
