Abstract
Cognitive Radio (CR) is an intelligent device equipped with a Cognitive Engine (CE) capable of making decisions and finding the best policy for a dynamic network. Learning a superior decision-making policy requires extensive training time, so a suitable learning algorithm both shortens that time and boosts the CE's capabilities. The underlay CR model allows the Primary User (PU) and Secondary User (SU) to coexist in the same frequency band by restricting SU interference to below an acceptable level. This paper presents an underlay Resource Allocation (RA) model that employs Transfer Learning (TL) and Meta Reinforcement Learning (MRL) to solve a non-convex optimization problem. Resources are allocated by incorporating TL and MRL into the existing Deep Deterministic Policy Gradient (DDPG) method. Merging TL and MRL accelerates the network's learning process and allows it to adapt rapidly to changing environments. The proposed algorithms are compared with basic Q-learning, dueling Deep Q-Networks, and hybrid algorithms in terms of the Quality of Experience (QoE) metric, learning speed, congestion rate, and stability. The simulation findings indicate that the proposed approach outperforms the existing techniques. The network's adaptability is also tested by changing the channel from Additive White Gaussian Noise (AWGN) to Rayleigh fading. In addition, trade-offs between network scalability, congestion, and performance are evaluated.
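To make the abstract's idea concrete, the sketch below illustrates the general pattern of meta-learning an underlay power-allocation policy: an inner loop adapts a transmit-power parameter to one channel "task" under an interference cap, and a Reptile-style outer loop moves the initialization toward task-adapted solutions so that adaptation to a new channel is fast. This is a minimal toy model, not the paper's DDPG+TL+MRL algorithm; all function names (`su_reward`, `adapt`, `meta_train`) and the interference model are hypothetical assumptions for illustration.

```python
import numpy as np

def su_reward(power, gain, cap):
    """Underlay objective: SU throughput minus a steep penalty when the
    interference caused at the PU exceeds the acceptable cap (toy model)."""
    throughput = np.log1p(gain * power)      # log(1 + SNR)-style utility
    interference = 0.5 * power               # assumed PU interference model
    penalty = 10.0 * max(0.0, interference - cap)
    return throughput - penalty

def adapt(power, gain, cap, lr=0.05, steps=25):
    """Inner loop: finite-difference gradient ascent on a single task."""
    for _ in range(steps):
        eps = 1e-3
        grad = (su_reward(power + eps, gain, cap)
                - su_reward(power - eps, gain, cap)) / (2 * eps)
        power = max(0.0, power + lr * grad)
    return power

def meta_train(tasks, meta_lr=0.5, iterations=40, seed=0):
    """Outer loop (Reptile-style): nudge the shared initialization toward
    the parameters adapted on each sampled task."""
    rng = np.random.default_rng(seed)
    power0 = 0.1
    for _ in range(iterations):
        gain, cap = tasks[rng.integers(len(tasks))]
        power0 += meta_lr * (adapt(power0, gain, cap) - power0)
    return power0

# Usage: meta-train over a few (gain, cap) environments, then adapt to a new one.
tasks = [(1.0, 0.6), (1.5, 1.0), (2.0, 1.4)]
p0 = meta_train(tasks)
p_new = adapt(p0, gain=1.2, cap=0.8)  # fast adaptation from the meta-init
```

The same division of labor appears in the paper's framework: transfer/meta knowledge supplies a good starting policy, and a few environment-specific updates finish the adaptation, which is what shortens learning time when the channel changes (e.g., AWGN to Rayleigh).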
Data availability
The data will be made available upon reasonable request.
Author information
Contributions
All authors participated in the literature review, material collection, concept design, simulations, and analysis. Nikita Mishra drafted the manuscript, while the other two authors reviewed and approved it.
Ethics declarations
Competing interests
No funds, grants, or other support were received during the preparation of this manuscript. There are no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mishra, N., Srivastava, S. & Sharan, S.N. DDPG with Transfer Learning and Meta Learning Framework for Resource Allocation in Underlay Cognitive Radio Network. Wireless Pers Commun 130, 729–755 (2023). https://doi.org/10.1007/s11277-023-10307-5