A novel deep generative modeling-based data augmentation strategy for improving short-term building energy predictions

Fan, Cheng; Chen, Meiling; Tang, Rui; Wang, Jiayuan

doi:10.1007/s12273-021-0807-6

Research Article
Published: 14 July 2021

Volume 15, pages 197–211 (2022)
Building Simulation

Cheng Fan^1,2,
Meiling Chen^1,2,
Rui Tang^3 &
…
Jiayuan Wang^1,2

Abstract

Short-term building energy prediction is one of the fundamental tasks in building operation management. While a large number of studies have explored the value of various supervised machine learning techniques in energy prediction, few have addressed the potential data shortage problem in developing data-driven models. One promising solution is data augmentation, which aims to enrich existing building data resources for reliable predictive modeling. This study proposes a deep generative modeling-based data augmentation strategy for improving short-term building energy predictions. Two types of conditional variational autoencoders have been designed for synthetic energy data generation, using fully connected and one-dimensional convolutional layers respectively. Data experiments have been designed to evaluate the value of data augmentation using actual measurements from 52 buildings. The results indicate that conditional variational autoencoders are capable of generating high-quality synthetic data samples, which in turn helps to enhance the accuracy of short-term building energy predictions. The average performance enhancement ratios in terms of CV-RMSE range between 12% and 18%. Practical guidelines have been obtained to ensure the validity and quality of synthetic building energy data. The research outcomes are valuable for enhancing the robustness and reliability of data-driven models for smart building operation management.
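
For orientation, the training objective usually associated with a conditional variational autoencoder can be written as below. This is the standard textbook form, not necessarily the exact loss used in the study, which is not shown on this page. The symbols are assumptions introduced here: x is an energy data sample, c is the conditioning information, z is the latent variable, and Q is the encoder's approximate posterior.

```latex
% Evidence lower bound (ELBO) for a conditional VAE; maximized during training
\log P(x \mid c) \;\ge\;
  \mathbb{E}_{Q(z \mid x, c)}\big[\log P(x \mid z, c)\big]
  \;-\; D_{\mathrm{KL}}\big(Q(z \mid x, c)\,\|\,P(z \mid c)\big)

% Common simplification (an assumption, not stated in the source):
% prior P(z | c) = N(0, I) and a diagonal Gaussian encoder with mean \mu_j
% and variance \sigma_j^2, which gives the closed-form KL term
D_{\mathrm{KL}} = \tfrac{1}{2} \sum_{j} \big(\mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1\big)
```

Under this reading, synthetic samples would be produced after training by drawing z from the prior and passing it, together with a chosen condition c, through the decoder.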

Abbreviations

CVAE: conditional variational autoencoder
CV-RMSE: coefficient of variation of root mean squared error
GAN: generative adversarial network
LSTM: long short-term memory
M₁, M₂, …, M₁₂: months from January to December
P(A,B): joint probability of A and B
P(A∣B): conditional probability of A given B
PER: performance enhancement ratio
RMSE: root mean squared error
T₁, T₂, …, Tₙ: time steps from 1 to n
VAE: variational autoencoder
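
The accuracy metrics named above can be illustrated with a short Python sketch. RMSE and CV-RMSE follow their usual definitions. The PER formula is an assumption (relative reduction in CV-RMSE from a baseline model to a model trained with augmented data), because this page does not state how the study computes it. All function names and the sample numbers are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def cv_rmse(actual, predicted):
    """CV-RMSE: RMSE divided by the mean of the actual values."""
    mean_actual = sum(actual) / len(actual)
    if mean_actual == 0:
        raise ValueError("mean of actual values must be non-zero")
    return rmse(actual, predicted) / mean_actual


def performance_enhancement_ratio(cv_baseline, cv_augmented):
    """Assumed PER: relative reduction in CV-RMSE after augmentation."""
    return (cv_baseline - cv_augmented) / cv_baseline


# Made-up numbers purely for illustration; not data from the study.
actual = [100.0, 120.0, 80.0, 100.0]
baseline_pred = [90.0, 130.0, 90.0, 110.0]    # model trained on original data
augmented_pred = [95.0, 125.0, 85.0, 105.0]   # model trained with synthetic data

cv_base = cv_rmse(actual, baseline_pred)
cv_aug = cv_rmse(actual, augmented_pred)
per = performance_enhancement_ratio(cv_base, cv_aug)
print(f"CV-RMSE baseline={cv_base:.2%}, augmented={cv_aug:.2%}, PER={per:.2%}")
```

Normalizing RMSE by the mean load makes errors comparable across buildings of different sizes, which is why a ratio such as PER can be averaged over many buildings.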

Acknowledgements

The authors gratefully acknowledge the support of this research by the National Natural Science Foundation of China (No. 51908365, No. 71772125) and the Philosophical and Social Science Program of Guangdong Province, China (GD18YGL07).

Author information

Authors and Affiliations

Key Laboratory for Resilient Infrastructures of Coastal Cities, Shenzhen University, Ministry of Education, Shenzhen, China
Cheng Fan, Meiling Chen & Jiayuan Wang
Sino-Australia Joint Research Center in BIM and Smart Construction, Shenzhen University, Shenzhen, China
Cheng Fan, Meiling Chen & Jiayuan Wang
Building Technology & Urban Systems Division, Lawrence Berkeley National Laboratory, Berkeley, USA
Rui Tang

Corresponding author

Correspondence to Rui Tang.

About this article

Cite this article

Fan, C., Chen, M., Tang, R. et al. A novel deep generative modeling-based data augmentation strategy for improving short-term building energy predictions. Build. Simul. 15, 197–211 (2022). https://doi.org/10.1007/s12273-021-0807-6

Received: 06 January 2021
Revised: 22 March 2021
Accepted: 08 April 2021
Published: 14 July 2021
Issue Date: February 2022
DOI: https://doi.org/10.1007/s12273-021-0807-6
