Deep reinforcement learning empowered joint mode selection and resource allocation for RIS-aided D2D communications

Guo, Liang; Jia, Jie; Chen, Jian; Du, An; Wang, Xingwei

doi:10.1007/s00521-023-08745-0

Review
Published: 21 July 2023

Volume 35, pages 18231–18249, (2023)

Neural Computing and Applications

Liang Guo¹,
Jie Jia¹,² (ORCID: orcid.org/0000-0001-7296-5061),
Jian Chen¹,
An Du¹ &
Xingwei Wang¹,²

396 Accesses
2 Citations

Abstract

Device-to-device (D2D) communication is regarded as a promising way to alleviate the mobile traffic explosion because it can improve system data rate and resource utilization. This paper investigates a reconfigurable intelligent surface (RIS)-aided mobile D2D communication framework, in which the RIS is deployed to improve communication quality. Because the transmission distance of D2D pairs changes over time, mode selection for the D2D pairs and phase shift design for the RIS are essential in mobile scenarios. We therefore formulate a joint optimization problem of mode selection, channel assignment, power allocation, and discrete phase shift selection to maximize the average sum data rate of D2D pairs. The problem is constrained by the maximum transmit power and the minimum data rate requirements of users, where the latter guarantees fairness among D2D pairs. To solve this challenging problem, we first reformulate the original sequential decision-making problem as a Markov game (MG). We then propose a multi-agent deep reinforcement learning (MADRL) framework in which multiple agents cooperatively determine the joint mode selection and resource allocation strategy. The framework combines the multi-pass deep Q-networks (MP-DQN) algorithm and the decaying DQN algorithm. Specifically, the D2D pairs adopt the MP-DQN algorithm to handle the hybrid discrete-continuous action space, and the RIS agent invokes the decaying DQN algorithm to select discrete phase shifts. Simulation results demonstrate that the proposed algorithm converges under different cases and outperforms a combination of DQN and the deep deterministic policy gradient (DDPG) in terms of system performance. Moreover, the average sum data rate of D2D pairs can be significantly improved by deploying the RIS and further enhanced by increasing the number of reflecting elements (REs).
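To make the role of discrete phase shift selection concrete, the following is a minimal sketch of how a RIS with a finite phase set can raise the rate of a single D2D link. It is not the paper's system model or algorithm: the single-antenna, interference-free link, the unit-amplitude reflecting elements, the Gaussian channel draws, and all names (`discrete_phases`, `effective_channel`, `data_rate`, `align_phases`) are assumptions made for illustration. The per-element alignment heuristic stands in for the learned policy; the paper itself uses a decaying DQN agent for this choice.

```python
import cmath
import math
import random


def discrete_phases(bits):
    """Return the 2**bits uniformly spaced phase shifts in [0, 2*pi)."""
    levels = 2 ** bits
    return [2 * math.pi * k / levels for k in range(levels)]


def effective_channel(h_direct, h_tx_ris, h_ris_rx, phases):
    """Direct link plus the RIS-reflected paths (unit-amplitude elements assumed)."""
    reflected = sum(
        f * cmath.exp(1j * theta) * g
        for f, g, theta in zip(h_tx_ris, h_ris_rx, phases)
    )
    return h_direct + reflected


def data_rate(channel, tx_power, noise_power):
    """Achievable rate in bit/s/Hz for a single interference-free link."""
    return math.log2(1 + tx_power * abs(channel) ** 2 / noise_power)


def align_phases(h_direct, h_tx_ris, h_ris_rx, bits):
    """Heuristic (not the paper's method): per element, pick the discrete
    phase whose reflected path is best aligned with the direct link."""
    choices = discrete_phases(bits)
    direction = cmath.exp(-1j * cmath.phase(h_direct))
    return [
        max(choices, key=lambda th: (f * g * cmath.exp(1j * th) * direction).real)
        for f, g in zip(h_tx_ris, h_ris_rx)
    ]


def random_channel(rng, scale):
    """Hypothetical complex Gaussian channel coefficient."""
    return complex(rng.gauss(0, scale), rng.gauss(0, scale))


rng = random.Random(0)  # fixed seed so the demo is repeatable
num_elements = 16       # number of reflecting elements (assumed value)
h_d = random_channel(rng, 1.0)
h_tr = [random_channel(rng, 0.3) for _ in range(num_elements)]
h_rr = [random_channel(rng, 0.3) for _ in range(num_elements)]

phases = align_phases(h_d, h_tr, h_rr, bits=2)
rate_without = data_rate(h_d, tx_power=1.0, noise_power=1.0)
rate_with = data_rate(
    effective_channel(h_d, h_tr, h_rr, phases), tx_power=1.0, noise_power=1.0
)
print(f"rate without RIS: {rate_without:.3f} bit/s/Hz")
print(f"rate with RIS:    {rate_with:.3f} bit/s/Hz")
```

With at least one quantization bit, every reflected path chosen this way has a non-negative component along the direct link, so the rate with the RIS is never below the rate without it in this toy setting. In the joint problem described in the abstract, the phase shifts interact with mode selection, channel assignment, and power allocation across several D2D pairs, which is why a learned multi-agent policy is used rather than a per-link heuristic like this one.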

Data availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants No. 61972079, 62172084, 62132004, in part by the Major Research Plan of National Natural Science Foundation of China under Grant No. 92167103, in part by the LiaoNing Revitalization Talents Program under Grant No. XLYC2007162, in part by the LiaoNing Key Research and Development Program under Grant No. 2023JH2/101300196, in part by the Central Government Guided Local Science and Technology Development Fund Project under Grant No. 2020ZY0003, in part by the Science and Technology Plan Project of Inner Mongolia Autonomous Region of China under Grant No. 2020GG0189, in part by the Fundamental Research Funds for the Central Universities under Grants No. N2224001-7, N2216009, N2216006, N2116004.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Northeastern University, Shenyang, 110819, China
Liang Guo, Jie Jia, Jian Chen, An Du & Xingwei Wang
Engineering Research Center of Security Technology of Complex Network System, Ministry of Education, Shenyang, 110819, China
Jie Jia & Xingwei Wang

Corresponding author

Correspondence to Jie Jia.

Ethics declarations

Conflict of interest

We declare that this manuscript is the authors’ original work, has not been published, and has not been submitted simultaneously elsewhere. All authors have checked the manuscript and have agreed to the submission.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guo, L., Jia, J., Chen, J. et al. Deep reinforcement learning empowered joint mode selection and resource allocation for RIS-aided D2D communications. Neural Comput & Applic 35, 18231–18249 (2023). https://doi.org/10.1007/s00521-023-08745-0

Received: 26 October 2022
Accepted: 31 May 2023
Published: 21 July 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s00521-023-08745-0
