Abstract
Deep Reinforcement Learning (DRL)-based control outperforms Rule-Based Controllers (RBCs) in the management of integrated energy systems, but it still lacks scalability and generalisation because the training process requires models tailored to each building. Transfer Learning (TL) is a potential solution to this limitation. However, existing TL applications in building control have mostly been tested among buildings with similar features, and do not address the need to scale up advanced control to real-world scenarios with diverse energy systems. This paper assesses the performance of an online heterogeneous TL strategy, comparing it with RBC and with offline and online DRL controllers in a simulation setup based on EnergyPlus and Python. The study tests the transfer, in both transductive and inductive settings, of a DRL policy designed to manage a chiller coupled with a Thermal Energy Storage (TES). The control policy is pre-trained on a source building and transferred to various target buildings characterised by an integrated energy system including photovoltaic and battery energy storage systems, and by different building envelope features, occupancy schedules and boundary conditions (e.g., weather and price signals). The TL approach combines model slicing, imitation learning and fine-tuning to handle the different state spaces and reward functions of source and target buildings. Results show that the proposed methodology reduces electricity cost by 10% and the mean value of the daily average temperature violation rate by 10% to 40% compared with the RBC and online DRL controllers. Moreover, online TL improves self-sufficiency and self-consumption by 9% and 11% with respect to the RBC. Conversely, online TL performs worse than offline DRL in both transductive and inductive settings; however, offline DRL agents need at least 15 training episodes to reach the performance level of online TL. The proposed online TL methodology is therefore effective, completely model-free, and directly implementable in real buildings with satisfactory performance.
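The model-slicing step of the TL strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the network sizes, the `slice_transfer` helper and the choice of shared layers are hypothetical, and the imitation-learning warm-start and fine-tuning stages are only indicated in comments.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_policy(layer_sizes):
    """Random-initialised weight matrices for a small MLP policy."""
    return [rng.standard_normal((m, n)) * 0.1
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def slice_transfer(source, target, shared):
    """Model slicing: copy the hidden layers listed in `shared` from the
    source policy into the target policy. The target keeps its own input
    and output heads, so source and target may have heterogeneous state
    and action spaces. After slicing, the target would typically be
    warm-started via imitation learning on an RBC and then fine-tuned online."""
    for i in shared:
        target[i] = source[i].copy()
    return target

# Source building observes 10 state variables; a target building with PV and
# BESS observes 14 (hypothetical sizes). Both share a 32x32 hidden block.
source_policy = make_policy([10, 32, 32, 4])
target_policy = make_policy([14, 32, 32, 4])
target_policy = slice_transfer(source_policy, target_policy, shared=[1])

assert np.array_equal(target_policy[1], source_policy[1])  # hidden block transferred
assert target_policy[0].shape == (14, 32)                  # input head preserved
```

The point of slicing only the hidden block is that the input and output heads can be re-dimensioned for each target building while the transferred layers retain the control knowledge learned on the source building.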
Abbreviations
- α: Boltzmann temperature coefficient
- β: temperature term weight of reward function
- γ: discount factor
- δ: electricity cost term weight of reward function
- η_rte: round-trip efficiency of battery
- θ: peak term weight of reward function
- μ: learning rate
- χ_i: internal heat capacity [kJ/(m²·K)]
- C_BESS: nominal capacity of battery [kWh]
- c_el,sell: electricity price for selling [€/kWh]
- c_el: electricity buying price [€/kWh]
- D_S: source domain
- D_T: target domain
- E_BESS,b: total energy supplied to the building from BESS [kWh]
- E_ch,max: maximum battery charging energy [kWh]
- E_CHILLER: chiller energy consumption [kWh]
- E_cost,source: electricity cost for source building [€]
- E_cost: electricity cost for target buildings [€]
- E_dis,max: maximum battery discharging energy [kWh]
- E_LOAD: non-HVAC loads electrical consumption [kWh]
- E_PUMP: circulation pumps energy consumption [kWh]
- E_PV,b: total energy supplied to the building from PV [kWh]
- E_PV,tot: total energy produced by PV [kWh]
- E_PV: energy production from PV [kWh]
- E_TOT,b: total building electricity consumption [kWh]
- f(·): objective predictive function
- g: solar heat gain coefficient
- I_E: income from selling the excess energy produced by PV to the grid [€]
- P_BESS,ch,max: maximum battery charging power [kW]
- P_BESS,ch: battery charging power [kW]
- P_BESS,dis,max: maximum battery discharging power [kW]
- P_BESS,dis: battery discharging power [kW]
- Q_cap: capacity of chiller [kW]
- R_E: electricity cost term of reward function
- R_P: peak term of reward function
- R_T: temperature term of reward function
- r_t: reward at control time step t
- RBC_CF: rule-based controller part choosing whether to supply cooling energy to the building
- RBC_el: rule-based controller managing the operation of the electrical part of the energy system
- RBC_OM: rule-based controller part choosing the operation mode of the energy system
- RBC_th: rule-based controller managing the operation of the thermal part of the energy system
- SOC_BESS: state-of-charge of the BESS
- SOC_TES: state-of-charge of the water storage
- SP_INT: indoor air temperature setpoint [°C]
- \(\overline{{\Delta}T_{\rm{viol,daily}}}\): mean value of the daily average temperature violation rate
- T_ch: chiller supply temperature [°C]
- T_INT: indoor air temperature [°C]
- T_LOW: lower threshold limit of temperature comfort range [°C]
- T_s,max: storage temperature upper boundary [°C]
- T_s,min: storage temperature lower boundary [°C]
- T_S: source task
- T_T: target task
- T_UPP: upper threshold limit of temperature comfort range [°C]
- T_viol: temperature violation [°C]
- U_OP: thermal transmittance of the opaque envelope [W/(m²·K)]
- U_TR: thermal transmittance of the transparent envelope [W/(m²·K)]
- AC: alternating current
- AHUs: air handling units
- AI: artificial intelligence
- BESS: battery energy storage system
- BCVTB: building controls virtual test bed
- CF: cooling fraction
- COP: coefficient of performance
- DC: direct current
- DNNs: deep neural networks
- DoC: depth of charge
- DoD: depth of discharge
- DR: demand response
- DRL: deep reinforcement learning
- DTW: dynamic time warping
- FDD: fault detection and diagnosis
- HVAC: heating, ventilation and air conditioning
- IES: integrated energy systems
- IL: imitation learning
- IRL: inverse reinforcement learning
- KPIs: key performance indicators
- LfD: learning from demonstration
- MILP: mixed-integer linear programming
- ML: machine learning
- MPC: model predictive control
- OM: operation mode
- OTL: online transfer learning
- PID: proportional-integral-derivative
- PV: photovoltaic
- RBC: rule-based controller
- RES: renewable energy sources
- RL: reinforcement learning
- SAC: soft actor-critic
- SC: self-consumption
- SOC: state-of-charge
- SS: self-sufficiency
- TES: thermal energy storage
- TL: transfer learning
- TOU: time-of-use
- VAV: variable air volume
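Two of the KPIs named above, self-sufficiency (SS) and self-consumption (SC), can be computed from the listed energy terms. The sketch below assumes the common definitions — on-site PV energy consumed by the building, directly or via the battery, divided by total demand (SS) or by total PV production (SC); the exact formulation used in the paper may differ, and the numeric values are hypothetical.

```python
def self_sufficiency(e_pv_b, e_bess_b, e_tot_b):
    """Share of total building demand covered on site by PV, either
    directly (E_PV,b) or through the battery (E_BESS,b)."""
    return (e_pv_b + e_bess_b) / e_tot_b

def self_consumption(e_pv_b, e_bess_b, e_pv_tot):
    """Share of total PV production consumed on site rather than exported."""
    return (e_pv_b + e_bess_b) / e_pv_tot

# Hypothetical daily energy balance [kWh]
ss = self_sufficiency(e_pv_b=30.0, e_bess_b=10.0, e_tot_b=100.0)
sc = self_consumption(e_pv_b=30.0, e_bess_b=10.0, e_pv_tot=50.0)
assert abs(ss - 0.4) < 1e-9   # 40% of demand met by on-site PV
assert abs(sc - 0.8) < 1e-9   # 80% of PV production used on site
```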
Acknowledgements
The work of Silvio Brandi and Alfonso Capozzoli is funded by the project NODES which has received funding from the MUR — M4C2 1.5 of PNRR funded by the European Union — NextGenerationEU (Grant agreement no. ECS00000036).
Funding
Open access funding provided by Politecnico di Torino within the CRUI-CARE Agreement.
Author information
Contributions
Davide Coraci: conceptualization, methodology, software, investigation, formal analysis, data curation, writing — original draft, visualization; Silvio Brandi: conceptualization, methodology, investigation, writing — review & editing; Tianzhen Hong: methodology, validation, writing — review & editing; Alfonso Capozzoli: conceptualization, methodology, validation, writing — review & editing, supervision.
Ethics declarations
The authors have no competing interests to declare that are relevant to the content of this article. Tianzhen Hong and Alfonso Capozzoli are Editorial Board members of Building Simulation.
Rights and permissions
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
About this article
Cite this article
Coraci, D., Brandi, S., Hong, T. et al. An innovative heterogeneous transfer learning framework to enhance the scalability of deep reinforcement learning controllers in buildings with integrated energy systems. Build. Simul. 17, 739–770 (2024). https://doi.org/10.1007/s12273-024-1109-6