
Reinforcement Learning-Based Spectrum Management for Cognitive Radio Networks: A Literature Review and Case Study

  • Marco Di Felice
  • Luca Bedogni
  • Luciano Bononi

Abstract

In cognitive radio (CR) networks, the cognition cycle, i.e., the ability of wireless transceivers to learn the optimal configuration that meets environmental and application requirements, is considered as important as the hardware components that enable dynamic spectrum access (DSA). To this end, several machine learning (ML) techniques have been applied to CR spectrum and network management problems, including spectrum sensing, spectrum selection, and routing. In this paper, we focus on reinforcement learning (RL), an online ML paradigm in which an agent discovers the optimal sequence of actions required to perform a task through trial-and-error interactions with the environment. Our study provides both a survey and a proof of concept of RL applications in CR networking. As a survey, we discuss the pros and cons of the RL framework compared to other ML techniques and provide an exhaustive review of the RL-CR literature from a twofold perspective, i.e., an application-driven taxonomy and a learning methodology-driven taxonomy. As a proof of concept, we investigate the application of RL techniques to joint spectrum sensing and decision problems, comparing different algorithms and learning strategies and further analyzing the impact of information sharing in purely cooperative and mixed cooperative/competitive tasks.
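To make the trial-and-error loop described above concrete, the sketch below shows a minimal single-agent, stateless Q-learning routine for spectrum selection. It is only an illustrative sketch under assumed conditions: the channel count, learning parameters, and the simulated per-channel idle probabilities are hypothetical values chosen for the example and are not taken from the chapter or its case study.

# Minimal Q-learning sketch for single-agent spectrum selection.
# All parameters and the environment model below are illustrative assumptions.
import random

NUM_CHANNELS = 5
ALPHA = 0.1      # learning rate
GAMMA = 0.9      # discount factor
EPSILON = 0.1    # exploration probability

# Hypothetical per-channel idle probabilities (unknown to the learning agent).
IDLE_PROB = [0.2, 0.5, 0.9, 0.4, 0.7]

# Stateless formulation: one Q-value per channel (action).
q = [0.0] * NUM_CHANNELS

def sense_and_transmit(channel):
    """Simulated environment: reward 1 if the channel is idle, -1 on collision."""
    return 1.0 if random.random() < IDLE_PROB[channel] else -1.0

for step in range(10_000):
    # Epsilon-greedy action selection over the available channels.
    if random.random() < EPSILON:
        action = random.randrange(NUM_CHANNELS)
    else:
        action = max(range(NUM_CHANNELS), key=lambda c: q[c])

    reward = sense_and_transmit(action)

    # Q-learning update; with a single state the bootstrap term is max_a Q(a).
    q[action] += ALPHA * (reward + GAMMA * max(q) - q[action])

print("Learned Q-values per channel:", [round(v, 2) for v in q])
print("Preferred channel:", max(range(NUM_CHANNELS), key=lambda c: q[c]))

With these assumptions, the learned Q-values concentrate on the channels that are most often idle, which is the basic single-agent behavior that the cooperative and competitive multi-agent variants discussed in the paper extend.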


Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Department of Computer Science and Engineering, University of Bologna, Bologna, Italy

Section editors and affiliations

  • Yue Gao
  1. School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
