Abstract
This work proposes an adaptive resource allocation algorithm based on reinforcement learning for multiuser, multicarrier communication systems over multipath millimeter-wave channels. A Markovian model that combines the queueing states of the buffers with the channel states is introduced to describe the Cyclic Prefix-Orthogonal Frequency Division Multiplexing (CP-OFDM) communication system, and novel utility functions for a Q-learning algorithm built on this model are proposed and evaluated. The performance of the resulting adaptive resource allocation scheme is verified via computational simulations driven by real traffic traces. The results show that the proposed scheduling algorithm improves overall system performance, increasing throughput and reducing packet loss with the proposed reward functions, and increasing energy efficiency with one of them, when compared to algorithms from the literature. The simulations confirm that the proposed reward functions, combined with the Markov model, make user scheduling and resource sharing in millimeter-wave CP-OFDM networks more efficient than traditional Q-learning-based algorithms.
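For context, the tabular Q-learning update that such schedulers build on can be sketched as follows. The state pairing (quantized queue level, finite-state channel level), the two-action set, the reward weights, and the toy dynamics below are all simplified illustrative assumptions, not the paper's actual formulation.

```python
import random

# Illustrative tabular Q-learning for per-TTI link scheduling.
# A state pairs a quantized buffer occupancy with a finite-state
# Markov channel level; the agent decides whether to use the
# resource block. All constants and dynamics here are assumptions.

N_QUEUE, N_CHANNEL = 4, 3        # quantized queue / channel levels
IDLE, TRANSMIT = 0, 1            # actions for the resource block
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Q-table over (queue level, channel level, action)
Q = {(q, c, a): 0.0
     for q in range(N_QUEUE)
     for c in range(N_CHANNEL)
     for a in (IDLE, TRANSMIT)}

def reward(state, action):
    """Hypothetical reward: served bits grow with channel quality,
    transmitting costs energy, and backlog (a packet-loss proxy)
    is penalized."""
    q, c = state
    if action == TRANSMIT:
        return (c + 1) * min(q, 1) - 0.5
    return -0.2 * q

def step(state, action, rng):
    """Toy dynamics: transmission drains one packet, arrivals are
    Bernoulli, and the channel level takes a bounded random walk
    (a stand-in for a finite-state Markov channel)."""
    q, c = state
    if action == TRANSMIT:
        q = max(q - 1, 0)
    if rng.random() < 0.5:
        q = min(q + 1, N_QUEUE - 1)
    c = min(max(c + rng.choice((-1, 0, 1)), 0), N_CHANNEL - 1)
    return (q, c)

def train(episodes=300, horizon=60, seed=1):
    """Epsilon-greedy Q-learning over the toy environment."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state = (rng.randrange(N_QUEUE), rng.randrange(N_CHANNEL))
        for _ in range(horizon):
            if rng.random() < EPSILON:
                action = rng.choice((IDLE, TRANSMIT))
            else:
                action = max((IDLE, TRANSMIT),
                             key=lambda a: Q[state + (a,)])
            r = reward(state, action)
            nxt = step(state, action, rng)
            target = r + GAMMA * max(Q[nxt + (a,)]
                                     for a in (IDLE, TRANSMIT))
            Q[state + (action,)] += ALPHA * (target - Q[state + (action,)])
            state = nxt

train()  # high-queue, good-channel states come to favor TRANSMIT
```

Under these assumptions the learned policy transmits when the buffer is backlogged and the channel is good, and idles otherwise; the paper's scheme extends this idea to multiple users with a Markovian model of the CP-OFDM system and its own reward designs.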
Abbreviations
- CP-OFDM: Cyclic Prefix-Orthogonal Frequency Division Multiplexing
- BER: Bit error rate
- TDL: Tapped delay line
- QoS: Quality of service
- TDM: Time division multiplexing
- IoT: Internet of Things
- TTI: Transmission time interval
- LTE: Long-Term Evolution
- RB: Resource block
- MDP: Markov decision process
- MIMO: Multiple-input multiple-output
- MLE: Maximum likelihood estimation
- RL: Reinforcement learning
- MBRL: Model-based reinforcement learning
- AWGN: Additive white Gaussian noise
- SNR: Signal-to-noise ratio
- QAM: Quadrature amplitude modulation
Acknowledgments
This work was carried out with the support of the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Finance Code 001.
Funding
The authors certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this manuscript. The traffic trace data used can be found at: https://mawi.wide.ad.jp/mawi/samplepoint-F/2019/.
Ethics declarations
Conflicts of interest
The authors have no competing interests relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Carneiro, D.P.Q., Cardoso, A.A. & Vieira, F.H.T. Adaptive resource allocation in 5G CP-OFDM systems using Markovian model-based reinforcement learning algorithm. Neural Comput & Applic 35, 9421–9435 (2023). https://doi.org/10.1007/s00521-023-08406-2