Abstract
An occupant’s window opening and closing behaviour can significantly influence comfort in the indoor environment. Such behaviour is, however, difficult to predict and control with conventional methods. This chapter therefore proposes a novel reinforcement learning (RL) method for the advanced control of window opening and closing. The RL controller aims to optimise the timing of window opening and closing by observing and learning from the environment. The theory of model-free RL control is developed with the objective of improving occupant comfort and is applied to historical field measurement data from an office building in Beijing. Preliminary testing of the RL control is conducted by evaluating the actions chosen by the control method. The results show that the RL control strategy improves thermal and indoor air quality by more than 90% when compared with the historically observed occupant data. This methodology establishes a prototype for optimally controlling window opening and closing behaviour. It can be extended by including more environmental parameters and further objectives such as energy consumption. Because the RL approach is model-free, it avoids the need for an environment model that may be inaccurate or overly complex, which gives it considerable potential for intelligent building control.
Notes
- 1. We use Q(S, A) to represent an approximate value function from the data and q(S, A) to represent the target of the approximation.
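To make the distinction in this note concrete, the following is a minimal sketch and not the chapter's implementation. It assumes a hypothetical three-state toy room (too warm, comfortable, too cold), a deterministic `step` function, tabular Q-learning with a purely exploratory behaviour policy, and illustrative values for `GAMMA` and `ALPHA`; none of these come from the chapter. The target q(S, A) is computed with full knowledge of the toy model, while the approximation Q(S, A) is learned only from sampled transitions.

```python
import random

# Hypothetical toy problem, not the chapter's Beijing office data set.
STATES = (0, 1, 2)       # 0 = too warm, 1 = comfortable, 2 = too cold
ACTIONS = (0, 1)         # 0 = close window, 1 = open window
GAMMA = 0.9              # discount factor (assumed value)
ALPHA = 0.5              # learning rate (assumed value)


def step(state, action):
    """Deterministic toy dynamics: opening cools the room, closing warms it."""
    next_state = min(state + 1, 2) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == 1 else 0.0   # reward for reaching comfort
    return next_state, reward


def target_q(sweeps=1000):
    """Target q(S, A), computed by value iteration using the known toy model."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(sweeps):
        new_q = {}
        for (s, a) in q:
            s2, r = step(s, a)
            new_q[(s, a)] = r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
        q = new_q
    return q


def learn_Q(n_steps=20000, seed=0):
    """Approximate Q(S, A), learned model-free from sampled transitions."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    s = 1
    for _ in range(n_steps):
        a = rng.choice(ACTIONS)            # purely exploratory behaviour policy
        s2, r = step(s, a)                 # the learner sees only this sample
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2
    return Q


q = target_q()
Q = learn_Q()
max_gap = max(abs(Q[k] - q[k]) for k in q)
greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
print(f"largest |Q - q| = {max_gap:.2e}")
print("greedy action when too warm:", greedy[0], "| when too cold:", greedy[2])
```

In this toy setting the learned table Q should end up very close to the target q, and the greedy policy opens the window when the room is too warm and closes it when the room is too cold. Real measurement data are noisy and continuous, so a function approximator would normally replace the table.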
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
May, R., Han, M., Zhang, X. (2021). A Novel Reinforcement Learning Method for Improving Occupant Comfort via Window Opening and Closing. In: Zhang, X. (eds) Data-driven Analytics for Sustainable Buildings and Cities. Sustainable Development Goals Series. Springer, Singapore. https://doi.org/10.1007/978-981-16-2778-1_10
DOI: https://doi.org/10.1007/978-981-16-2778-1_10
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2777-4
Online ISBN: 978-981-16-2778-1
eBook Packages: Earth and Environmental Science, Earth and Environmental Science (R0)