Reinforcement Learning Methodologies for Controlling Occupant Comfort in Buildings

Han, Mengjie; May, Ross; Zhang, Xingxing

doi:10.1007/978-981-16-2778-1_9

Part of the book series: Sustainable Development Goals Series (SDGS)

Abstract

Classical building control systems are becoming vulnerable as contemporary built environments and energy systems grow more complex. As a result, reinforcement learning (RL) is becoming increasingly prominent and applicable in building control. This chapter therefore presents a comprehensive review of RL techniques applied to control systems for occupant comfort in indoor built environments. Empirical applications of RL-based control systems are presented according to their comfort objectives (thermal comfort, indoor air quality, and lighting), along with other objectives, which invariably include energy consumption. The classes of RL algorithms and their implementation details, namely how value functions are represented and how policies are improved, are also described. The review shows that few works have explored RL for controlling occupant comfort, especially for indoor air quality and lighting. Relatively few of the reviewed works incorporate occupancy patterns and/or occupant feedback into the control loop. Moreover, the chapter identifies a gap regarding the performance of cooperative multiagent RL (MARL) implementations. Based on these findings, current challenges and further opportunities are discussed. We expect to clarify the feasible theory and functions of RL for building control systems, which would promote their widespread application in built environments.

Author information

Authors and Affiliations

Micro Data Analysis, Dalarna University, 79188, Falun, Sweden
Mengjie Han & Ross May
Department of Energy and Community Buildings, Dalarna University, 79188, Falun, Sweden
Xingxing Zhang

Corresponding author

Correspondence to Mengjie Han.

Editor information

Editors and Affiliations

Dalarna University, Falun, Sweden
Xingxing Zhang

About this chapter

Cite this chapter

Han, M., May, R., Zhang, X. (2021). Reinforcement Learning Methodologies for Controlling Occupant Comfort in Buildings. In: Zhang, X. (eds) Data-driven Analytics for Sustainable Buildings and Cities. Sustainable Development Goals Series. Springer, Singapore. https://doi.org/10.1007/978-981-16-2778-1_9

DOI: https://doi.org/10.1007/978-981-16-2778-1_9
Published: 12 September 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2777-4
Online ISBN: 978-981-16-2778-1
eBook Packages: Earth and Environmental Science (R0)
