Abstract
Mechanisms based on artificial intelligence techniques that learn dynamically have received much attention recently and have been applied to many problems. The convergence analysis of these mechanisms, however, does not always receive the same attention. This paper discusses the convergence of a mechanism that uses reinforcement learning to determine the channel detection sequence in a multi-channel, multi-user radio network and, through simulations, presents recommendations for choosing the learning parameter set so as to improve the overall reward. The mechanism, configured with the recommended parameters, is then compared with other intuitive channel-ordering mechanisms.
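As a rough illustration of the kind of mechanism the paper analyzes (a sketch, not the paper's own implementation), the Python snippet below shows a stateless Q-learning agent that keeps one Q-value per channel and derives its sensing order by sorting channels on those values. The learning rate ALPHA and exploration rate EPSILON stand in for the "learning parameter set" whose choice the paper studies; all names, values, and the toy occupancy model are assumptions.

import random

# Hypothetical parameters: ALPHA (learning rate) and EPSILON (exploration
# probability) are assumed values, not the paper's tuned parameter set.
N_CHANNELS = 5
ALPHA = 0.1
EPSILON = 0.05

q = [0.0] * N_CHANNELS  # estimated reward of sensing each channel

def sensing_order():
    """Return a channel sensing order: usually greedy on Q, sometimes random."""
    if random.random() < EPSILON:
        order = list(range(N_CHANNELS))
        random.shuffle(order)
        return order
    return sorted(range(N_CHANNELS), key=lambda c: q[c], reverse=True)

def update(channel, reward):
    """Stateless Q-learning update for the sensed channel."""
    q[channel] += ALPHA * (reward - q[channel])

# Toy usage: channel 2 is idle more often, so it should rise to the front.
for _ in range(1000):
    for c in sensing_order():
        free = random.random() < (0.8 if c == 2 else 0.3)  # toy occupancy model
        update(c, 1.0 if free else 0.0)
        if free:
            break  # stop at the first idle channel found

print(sensing_order())

Sorting on Q-values rather than picking a single channel reflects the sequential nature of the problem: the agent senses channels in descending order of estimated reward until it finds one that is idle.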
Acknowledgements
This work was conducted under the project "BIOMA – Bioeconomy integrated solutions for the mobilization of the Agri-food market" (POCI-01-0247-FEDER-046112) by the "BIOMA" Consortium, financed by the European Regional Development Fund (FEDER) through the Incentive System for Research and Technological Development, within the Portugal 2020 Competitiveness and Internationalization Operational Program.
This work was also supported by FCT - Fundação para a Ciência e Tecnologia within the project scope UIDB/05757/2020.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Mendes, A. (2021). Convergence of the Reinforcement Learning Mechanism Applied to the Channel Detection Sequence Problem. In: Pereira, A.I., et al. Optimization, Learning Algorithms and Applications. OL2A 2021. Communications in Computer and Information Science, vol 1488. Springer, Cham. https://doi.org/10.1007/978-3-030-91885-9_30
DOI: https://doi.org/10.1007/978-3-030-91885-9_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91884-2
Online ISBN: 978-3-030-91885-9
eBook Packages: Computer Science, Computer Science (R0)