D2D Resource Allocation Based on Reinforcement Learning and QoS

Kuo, Fang-Chang; Wang, Hwang-Cheng; Tseng, Chih-Cheng; Wu, Jung-Shyr; Xu, Jia-Hao; Chang, Jieh-Ren

doi:10.1007/s11036-023-02145-3

Published: 11 July 2023

Volume 28, pages 1076–1095, (2023)

Mobile Networks and Applications

Fang-Chang Kuo¹,
Hwang-Cheng Wang¹,
Chih-Cheng Tseng²,
Jung-Shyr Wu³,
Jia-Hao Xu¹ &
Jieh-Ren Chang¹ (ORCID: orcid.org/0000-0002-1830-5094)

Abstract

Device-to-device (D2D) communication is designed to improve the overall performance of fifth-generation (5G) wireless networks, including latency, data rates, and system capacity. System capacity can be improved further by letting D2D user equipments (DUEs) reuse the resources of cellular user equipments (CUEs), provided that the reuse does not cause harmful interference to the CUEs. A D2D resource allocation scheme is expected to allow each CUE to be allocated a variable number of resource blocks (RBs) and to allow the RBs to be reused by more than one DUE. In this study, the Multi-Player Multi-Armed Bandit (MPMAB) reinforcement learning scheme is employed to model this problem by establishing a preference matrix that facilitates greedy resource allocation. A fair resource allocation scheme is then proposed and shown to achieve fairness, prevent waste of resources, and alleviate starvation. Moreover, this scheme performs better when there are not too many D2D pairs.
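
The idea of a preference matrix driving greedy allocation can be illustrated with a minimal Python sketch. It is not the authors' MPMAB_GRA scheme: the function names, the incremental update rule with a fixed step size, the per-RB reuse cap `max_reuse`, and the use of Jain's index as the fairness measure are all assumptions made for illustration only.

```python
def greedy_allocate(preference, max_reuse):
    """Give each D2D pair its best available RB, highest scores first.

    preference[d][r] is a learned score for D2D pair d on resource block r.
    max_reuse (hypothetical parameter) caps how many pairs may share one RB.
    Returns a dict mapping pair index -> RB index; pairs that lose out on
    every RB are simply absent from the result.
    """
    num_pairs = len(preference)
    num_rbs = len(preference[0])
    # Rank every (score, pair, rb) candidate from best to worst.
    candidates = sorted(
        ((preference[d][r], d, r)
         for d in range(num_pairs) for r in range(num_rbs)),
        key=lambda c: -c[0],
    )
    load = [0] * num_rbs
    assignment = {}
    for _score, d, r in candidates:
        if d not in assignment and load[r] < max_reuse:
            assignment[d] = r
            load[r] += 1
    return assignment


def update_preference(preference, d, r, reward, step=0.1):
    """Move the estimate for (pair d, RB r) toward the observed reward."""
    preference[d][r] += step * (reward - preference[d][r])


def jain_index(values):
    """Jain's fairness index: 1.0 when all values are equal, 1/n at worst."""
    total = sum(values)
    squares = sum(v * v for v in values)
    return (total * total) / (len(values) * squares) if squares > 0 else 0.0


if __name__ == "__main__":
    prefs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
    print(greedy_allocate(prefs, max_reuse=1))
    print(greedy_allocate(prefs, max_reuse=2))
```

With a reuse cap of 1, the second pair in this toy example receives no RB because its preferred RB is already taken, which is the kind of starvation a fairness-aware scheme aims to alleviate; raising the cap to 2 lets it share that RB.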

Data Availability

Data supporting this study are openly available from https://drive.google.com/drive/folders/1GNvIUmLMD1CsgaDsDdPvaic80I7IHNp4?usp=sharing.

Acknowledgements

This research was supported by the Ministry of Science and Technology of Taiwan under grant Nos. 108-2221-E-197-009 and 108-2221-E-197-011.

Author information

Authors and Affiliations

Department of Electronic Engineering, National Ilan University, Taiwan, Republic of China
Fang-Chang Kuo, Hwang-Cheng Wang, Jia-Hao Xu & Jieh-Ren Chang
Department of Electrical Engineering, National Ilan University, Taiwan, Republic of China
Chih-Cheng Tseng
Department of Communication Engineering, National Central University, Taiwan, Republic of China
Jung-Shyr Wu

Corresponding author

Correspondence to Jieh-Ren Chang.

Appendix Pseudocode of MPMAB_GRA scheme

About this article

Cite this article

Kuo, FC., Wang, HC., Tseng, CC. et al. D2D Resource Allocation Based on Reinforcement Learning and QoS. Mobile Netw Appl 28, 1076–1095 (2023). https://doi.org/10.1007/s11036-023-02145-3

Accepted: 15 November 2021
Published: 11 July 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s11036-023-02145-3
