
Application of reinforcement learning to wireless sensor networks: models and algorithms



A wireless sensor network (WSN) consists of a large number of sensor and sink nodes that monitor events or environmental parameters, such as movement, temperature, and humidity. Reinforcement learning (RL) has been applied to a wide range of schemes in WSNs, such as cooperative communication, routing, and rate control, so that sensor and sink nodes can observe their respective operating environments and carry out optimal actions that enhance network and application performance. This article provides an extensive review of the application of RL to WSNs, covering the main components and features of RL, such as state, action, and reward. It shows how most schemes in WSNs have been approached using both traditional and enhanced RL models and algorithms, presents the performance enhancements brought about by RL algorithms, and discusses open issues associated with applying RL in WSNs. The article aims to establish a foundation that sparks new research interest in this area. The discussion is presented in a tutorial manner so that it is comprehensive and accessible to readers outside the specialties of both RL and WSNs.
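Among the traditional RL algorithms the article surveys, Q-learning is the most widely applied in WSN schemes. The following is a minimal, self-contained sketch of the Q-learning update with a toy, hypothetical sensor-node scenario (the states, actions, and reward values are illustrative assumptions, not taken from the paper): a node learns whether to sleep or transmit depending on its packet queue.

```python
import random

# Hypothetical toy example of Q-learning for a sensor node (illustrative only):
# the node chooses between "sleep" and "transmit", and the reward favours
# transmitting when a packet is queued and sleeping (saving energy) otherwise.

ACTIONS = ["sleep", "transmit"]
STATES = ["queue_empty", "queue_full"]

ALPHA = 0.5   # learning rate
GAMMA = 0.9   # discount factor

def reward(state, action):
    # Toy reward: transmitting a queued packet earns +1, sleeping with an
    # empty queue saves energy (+0.5), anything else wastes energy (-0.5).
    if state == "queue_full" and action == "transmit":
        return 1.0
    if state == "queue_empty" and action == "sleep":
        return 0.5
    return -0.5

def train(episodes=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(episodes):
        # Epsilon-greedy action selection: explore occasionally,
        # otherwise exploit the current Q-value estimates.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        r = reward(state, action)
        next_state = rng.choice(STATES)  # toy random environment transition
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        q[(state, action)] += ALPHA * (r + GAMMA * best_next - q[(state, action)])
        state = next_state
    return q

q = train()
# After training, the greedy policy should prefer transmitting when the
# queue is full and sleeping when it is empty.
print(max(ACTIONS, key=lambda a: q[("queue_full", a)]))   # transmit
print(max(ACTIONS, key=lambda a: q[("queue_empty", a)]))  # sleep
```

The same state/action/reward structure underlies the WSN schemes the article reviews (MAC, routing, rate control); only the definitions of the three components change per scheme.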





Acknowledgements

This work was supported by the Malaysian Ministry of Education (MOE) under the Fundamental Research Grant Scheme (FRGS/1/2014/ICT03/SYUC/02/2).

Author information



Corresponding author

Correspondence to Kok-Lim Alvin Yau.


About this article


Cite this article

Yau, K.-L.A., Goh, H.G., Chieng, D. et al. Application of reinforcement learning to wireless sensor networks: models and algorithms. Computing 97, 1045–1075 (2015).


