Abstract
Deep reinforcement learning (DRL) has emerged as a powerful tool for controlling complex systems by combining deep neural networks with reinforcement learning techniques. However, due to the black-box nature of these algorithms, the resulting control policies can be difficult to understand from a human perspective. This limitation is particularly relevant in real-world scenarios, where an understanding of the controller is required for reliability and safety reasons. In this paper, we investigate the application of DRL methods for controlling the heating, ventilation and air-conditioning (HVAC) system of a building, and we propose an Explainable Artificial Intelligence (XAI) approach to provide interpretability to these models. This is accomplished by combining several XAI methods, including surrogate models, Shapley values, and counterfactual examples. We report the results of the DRL-based controller in terms of energy consumption and thermal comfort, and use this XAI layer to provide insight into and explanations of the underlying control strategy.
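To illustrate the Shapley-value component of such an XAI layer, the sketch below computes exact Shapley values for a toy linear "policy" over three features. All feature names, coefficients, and values here are hypothetical stand-ins for the paper's actual controller inputs, chosen only to show the coalition-weighting mechanics:

```python
from itertools import combinations
from math import factorial

# Toy stand-in for the DRL controller's output as a function of three
# hypothetical features (names and coefficients are illustrative only).
def policy(out_temp, in_temp, occupancy):
    return 0.5 * out_temp - 1.2 * in_temp + 2.0 * occupancy

baseline = {"out_temp": 10.0, "in_temp": 21.0, "occupancy": 0.0}
instance = {"out_temp": -2.0, "in_temp": 19.0, "occupancy": 1.0}

def eval_coalition(S):
    # Features in coalition S take the instance value; the rest keep
    # the baseline value.
    args = {f: (instance[f] if f in S else baseline[f]) for f in baseline}
    return policy(**args)

features = list(baseline)
n = len(features)
shap = {}
for f in features:
    others = [g for g in features if g != f]
    phi = 0.0
    # Sum the weighted marginal contribution of f over all coalitions
    # of the remaining features.
    for r in range(n):
        for S in combinations(others, r):
            w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi += w * (eval_coalition(set(S) | {f}) - eval_coalition(S))
    shap[f] = phi
```

For a linear model, each Shapley value reduces to coefficient times (instance minus baseline), and the attributions sum exactly to the difference between the policy output at the instance and at the baseline, which is a useful sanity check for any Shapley implementation.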
Acknowledgements
This work has been partially funded by the European Union – NextGenerationEU (IA4TES project, MIA.2021.M04.0008), Junta de Andalucía (D3S project, 30.BG.29.03.01 – P21-00247) and the Spanish Ministry of Science (SPEEDY, TED2021-130454B-I00). A. Manjavacas is also funded by FEDER/Junta de Andalucía (IFMIF-DONES project, SE21_UGR_IFMIF-DONES).
Appendix
A. Observation space description
B. PPO hyperparameters
The trained PPO model uses the default architecture of the ActorCriticPolicy class of Stable-Baselines3. The same architecture is used for both the policy and value networks: a feature extractor followed by two fully connected hidden layers with 64 units each, using tanh activations.
The remaining hyperparameters are listed in Table 4.
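The architecture described above can be sketched shape-for-shape as follows. This is a plain NumPy illustration with randomly initialised weights and hypothetical observation and action dimensions, not the trained Stable-Baselines3 model itself:

```python
import numpy as np

def init_mlp(sizes, rng):
    # One (W, b) pair per fully connected layer.
    return [(rng.standard_normal((i, o)) * 0.1, np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    # tanh activation after every hidden layer, linear output layer
    # (matching the default ActorCriticPolicy activation function).
    for W, b in layers[:-1]:
        x = np.tanh(x @ W + b)
    W, b = layers[-1]
    return x @ W + b

rng = np.random.default_rng(42)
obs_dim, act_dim = 20, 2  # hypothetical dimensions for illustration

# Separate policy and value networks, each with two 64-unit hidden layers.
policy_net = init_mlp([obs_dim, 64, 64, act_dim], rng)
value_net = init_mlp([obs_dim, 64, 64, 1], rng)

obs = rng.standard_normal(obs_dim)
action_mean = forward(obs, policy_net)  # shape (act_dim,)
state_value = forward(obs, value_net)   # shape (1,)
```

In Stable-Baselines3, this structure corresponds to passing `policy_kwargs=dict(net_arch=dict(pi=[64, 64], vf=[64, 64]))` to the PPO constructor, which is also what the library applies by default for MLP policies.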
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Jiménez-Raboso, J., Manjavacas, A., Campoy-Nieves, A., Molina-Solana, M., Gómez-Romero, J. (2023). Explaining Deep Reinforcement Learning-Based Methods for Control of Building HVAC Systems. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1902. Springer, Cham. https://doi.org/10.1007/978-3-031-44067-0_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44066-3
Online ISBN: 978-3-031-44067-0