Introduction

An increasing number of distributed energy resources (DERs), e.g., electric vehicles, introduces challenges to the energy system, e.g., grid balancing (Billanes et al., 2017; Ma et al., 2019). These challenges are especially critical for distribution system operators (DSOs), who have to maintain control of the energy grid (Ma et al., 2016). Compared to large-scale batteries, demand response is a better solution (Ma & Jørgensen, 2018). However, demand response might create a large amount of activity in the grid due to the communication between the electricity markets and the demand side, which potentially affects the stability of the distribution grid (Christensen et al., 2021). Therefore, accurate predictions of the future loads in the distribution grid are necessary. Globally, households make up a large part of electricity consumption (Nejat et al., 2015). However, most distribution grids are not designed to accommodate a large number of DERs, especially when many of them charge at the same time (Ma et al., 2021). Thus, it is critical for DSOs to accurately forecast the energy load in residential areas to improve their operation quality and reduce maintenance costs.

In recent years, researchers have experimented with deep neural networks for accurate load forecasting. The introduction of smart meters, smart grids, and IoT devices has made it possible to gather more data than before, which researchers can use to build robust models (Amin et al., 2019). Many configurations of artificial neural networks have been applied to the load forecasting problem with excellent results. The advantages of deep neural networks for load forecasting have become more evident as the amount of data increases. Their inherent ability to automatically learn patterns and extract features from data with multiple input variables makes them well suited to this problem (Koprinska et al., 2018).

This paper investigates the state-of-the-art techniques using deep neural networks for load forecasting. The paper presents recommendations for accurate and efficient models based on the literature analysis. The remaining sections of this paper are organized as follows. The methodology section describes the research process for finding the relevant literature. The results section presents state-of-the-art deep learning models for load forecasting and literature analysis. Finally, the results of the review are discussed and concluded. This study’s limitations and future work are presented in the discussion and conclusion, respectively.

Methodology

This scoping review outlines and analyses state-of-the-art electric load forecasting methodologies using deep neural networks. The literature search was performed using three online databases relevant to the energy domain: IEEE Xplore, ACM Digital Library, and Elsevier (Scopus). The scoping review covers journal and research articles, conference papers, and books, without any limitation on the year of publication.

Search strings were formed by combining prioritized keywords with Boolean operators. The combined strings were used to collect literature from each selected database based on the contents of the title and abstract. When a search returned a large number of results, the keyword search was restricted to the title or abstract. Table 1 summarizes the search strings and the number of results for each database; the searches were conducted on 23 February 2021.
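
As a rough illustration of how such search strings can be assembled, the sketch below joins synonyms with OR and concept groups with AND. The keywords and the helper name build_search_string are hypothetical and serve only as an example; the search strings actually used are those listed in Table 1.

```python
def build_search_string(keyword_groups):
    """Join synonyms with OR and concept groups with AND."""
    clauses = []
    for group in keyword_groups:
        # Quote multi-word phrases so a database treats them as exact phrases.
        terms = [f'"{term}"' if " " in term else term for term in group]
        clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)


# Hypothetical keyword groups, for illustration only.
example_groups = [
    ["load forecasting", "load prediction"],
    ["deep learning", "neural network"],
]
query = build_search_string(example_groups)
print(query)
```

Each database has its own field syntax for limiting a query to the title or abstract, so a string like this would still need to be adapted per database.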

Table 1 Results of keyword search in databases

The 3560 publications were imported into reference management software and all duplicates were removed, resulting in 2030 unique publications. The remaining references went through three rounds of filtering to remove unrelated publications, based on 1) keywords, 2) unrelated abstracts, and 3) methods other than deep learning. Table 2 shows the breakdown of the filtering.
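
Reference managers typically remove duplicates automatically. The sketch below shows one simple way this could work, assuming duplicates are matched on a normalized title. The record structure and the function names are hypothetical and are not the procedure of any particular tool.

```python
import re


def normalize_title(title):
    """Lowercase and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each normalized title."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records exported from two databases.
records = [
    {"title": "Deep Learning for Load Forecasting", "source": "database A"},
    {"title": "Deep learning for load forecasting.", "source": "database B"},
    {"title": "Clustering of Household Load Profiles", "source": "database A"},
]
unique_records = deduplicate(records)
print(len(records), "->", len(unique_records))
```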

Table 2 Procedure for filtering out non-related references

A full-text search was performed for each of the remaining 99 publications. For 18 references, the full text could not be found, resulting in 81 review-related publications with full texts. Focus aspects based on keywords were created to organize the references, which further narrowed the publications to the manageable number analysed in this review. Table 3 shows the number of publications in each focus aspect in the reference management software.

Table 3 Number of references for each focus aspect

Results

The amount of research on predicting energy load using deep neural networks has increased substantially in recent years. With an increasing number of electric cars and appliances on the grid and a growing reliance on renewable energy sources, accurate load predictions are crucial. Researchers have experimented with various deep learning models and unique combinations of them to increase performance. This section provides an overview of the applications and methods used for energy load forecasting based on the conducted literature search.

The literature introduces two main areas of the energy domain in which load forecasting using deep learning is applied. The first is demand-side management, which can be defined as any utility activity that affects the demand side’s load profile (Clark & Kelly, 2016). Demand-side management involves several factors, such as HVAC systems, IoT devices, lighting, and occupancy. Researchers include these aspects in their prediction models to make more accurate forecasts of a building’s energy demand (Bedi et al., 2020; Ferlito et al., 2015; Timur et al., 2020).

The second main area introduced in the literature is grid control. This area focuses on understanding what happens in the grid on a short to long-term horizon from the supply side. Accurate forecasts are especially important here, as utility companies must react within a short time. In addition to energy consumption data, researchers add various weather-related data to their models to increase accuracy (Selvi & Mishra, 2018; Selvi & Mishra, 2020; Torabi & Hashemi, 2012; Eseye et al., 2019).

For each main area, the literature introduces several applications. Researchers base their models on data collected from single or multiple households, public or office buildings, larger regions such as states or countries, or grids. Applications to households are divided between demand-side management and grid control, depending on the use case. In demand-side management, the household’s role is essential to ensure an efficient smart grid. Based on household energy demand, the system can use demand-side management tactics to shift loads, shave peaks, or fill valleys (Rodrigues et al., 2017; Saatwong & Suwankawin, 2016). Further, the energy demand of a single household is much more volatile than the aggregated load of multiple households, meaning researchers need to consider additional external inputs such as occupancy behavior, building characteristics, and even income and employment status (Ramokone et al., 2020; Yuce et al., 2017). Regarding grid control, researchers apply clustering algorithms to households before training their models so that volatile patterns do not become an issue, which increases accuracy (Aurangzeb et al., 2021; Jarábek et al., 2018; Khan & Jayaweera, 2018).

Predicting a building’s energy load relates to the demand-side management aspect of the literature. Researchers use predictions to open more possibilities for demand response tactics in public or office buildings. These buildings have different external inputs than households, for instance, HVAC systems, automated lighting, and occupancy patterns (Katsatos & Moustris, 2019; Gao et al., 2018; Barzola-Monteses et al., 2020). In some buildings, a Supervisory Control and Data Acquisition (SCADA) system is installed to manage the energy consumption, supporting demand response (Vinagre et al., 2015; Vinagre et al., 2016).

In the literature, grids, smart grids, and microgrids relate to both grid control and demand-side management. DSOs can use grid load predictions to schedule energy more efficiently and to locate potentially volatile areas in their electricity network, thereby improving their operation quality (Chan et al., 2019; Shikulskaya et al., 2020). For demand-side management, smart grids can leverage demand response measures but require accurate predictions to do so efficiently (Krishnan et al., 2020; Kaur et al., 2019; Bruno et al., 2018).

The following outlines the state-of-the-art deep learning and machine learning methods used in recent years. The most successful methods using neural networks have been hybrid models combining two or more neural networks, sometimes including a conventional method such as ARIMA (Krishnan et al., 2020; Rosato et al., 2019; Qi et al., 2020; Pramono et al., 2019; Kim & Cho, 2019a; Rosato et al., 2020; Kim & Cho, 2019b). These models achieve very high accuracy, primarily because of the specific combination of convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. The CNN’s ability to automatically learn features (Koprinska et al., 2018) and the LSTM’s strength with sequential data are the basis for accurate and robust models (Rosato et al., 2019; Qi et al., 2020; Kim & Cho, 2019a). The hybrid models all require multivariate inputs to exploit the combined neural networks’ capabilities. Weather and calendar data are standard inputs among these methods.

In recent years, researchers have used LSTM models widely with excellent results. However, these models lack the ability to learn features, requiring extensive feature engineering beforehand (Al Khafaf et al., 2019). CNNs have also performed well in their predictions, though they have some difficulties with volatile consumption patterns (Aurangzeb & Alhussein, 2020). Thus, the two neural networks complement each other well in a hybrid model. Neural networks have also been applied successfully in other domains, such as weather forecasting (Wollsen, 2015), indicating that they can be used in other areas that rely on time series data.

Researchers have also taken unique approaches, such as ensembling conventional models with neural networks, boosting, decomposition, and clustering algorithms, with good performance (Rahman et al., 2019; Ves et al., 2019; Bot et al., 2020; Ai et al., 2020; Chenglei et al., 2015). Additionally, researchers often compare deep learning models to conventional methods such as the autoregressive integrated moving average (ARIMA). While the conventional methods perform well, neural networks outperform them when it comes to load forecasting (Panapongpakorn & Banjerdpongchai, 2019; Kuo & Huang, 2018). Some neural networks are also combined with conventional methods to improve performance further. Building hybrid models increases performance substantially compared to their non-hybrid components (Chan et al., 2019; Krishnan et al., 2020).

Table 4 presents a collection of the models researchers have experimented with. The data source is included to provide context for the applications and a foundation for comparing the experimental results. The evaluation criteria selected are the root mean square error (RMSE), root mean square percentage error (RMSPE), mean absolute percentage error (MAPE), and mean absolute error (MAE). The forecast horizon is also included to outline the relationship between performance and how far ahead the prediction is made. The table presents varying results for the many configurations of neural networks presented earlier in this section. In particular, it shows how well the hybrid CNN-LSTM performs when looking only at evaluation metrics. However, evaluation metrics should not be the only deciding factor when choosing a model. The data context and forecast horizon significantly affect the networks’ performance. The table only includes publications that were analyzed earlier in this section. Publications that did not use the evaluation metrics displayed in the table were not included.
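
For reference, the four evaluation criteria can be computed as in the sketch below. It uses common textbook definitions, with the two percentage metrics expressed in percent. Individual publications may normalize differently, so these functions are an assumption rather than the exact formulas behind the values in Table 4. The sample values are made up.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, in the unit of the load."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the unit of the load."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmspe(actual, predicted):
    """Root mean square percentage error; undefined if any actual value is zero."""
    return 100.0 * math.sqrt(
        sum(((a - p) / a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Made-up load values (e.g., kWh) for illustration.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

for name, metric in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("RMSPE", rmspe)]:
    print(name, round(metric(actual, predicted), 3))
```

RMSE and MAE depend on the scale of the load, so they are not directly comparable between, for example, a single household and a national grid. The percentage metrics are scale-free but become unstable when the actual load is close to zero.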

Table 4 Selected models including their performance and data source

The models used in the literature are not unique to any specific aspect of the energy domain. The main focus is to increase the models’ accuracy, for which researchers leverage many different methods, as described in the previous section. While non-hybrid models perform very well, the literature suggests that hybrid models are applied more often and are more successful. The main argument for hybrid models is that their components complement each other: each compensates for what the other lacks. LSTM networks and CNNs are the most common neural networks for load forecasting using deep learning, both in hybrid and non-hybrid models.

The forecasting horizon is an essential factor to keep in mind when comparing the models’ performance. Table 4 shows that the models’ errors range widely depending on the forecasting horizon and the data source. The use case should therefore be considered before a model is chosen. Comparing forecasting horizons alone is not enough, because the dataset can affect the model’s ability to predict validation data. For instance, the models from (Pramono et al., 2019; Kim & Cho, 2019a), which use the same hybrid architecture and a forecasting horizon of 1 h, show substantially different results. While many factors can affect this performance, the data inputs undoubtedly influence the error rate.

The models are applied to households, buildings, grids, and whole countries. The literature suggests that buildings are heavily related to demand-side management, whereas grid and household applications can be seen from both a grid and a demand-side perspective. The main challenge for households is the volatility of the load profile. Researchers circumvent this by clustering households into groups with similar profiles and aggregating them.
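
The clustering-and-aggregation step can be sketched as follows. This minimal example uses plain k-means seeded with the first k profiles, which is a simplification. The reviewed publications use a variety of clustering algorithms, and the household profiles here are made up.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=20):
    """Plain k-means; the first k profiles seed the centroids (a simplification)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assign each profile to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def aggregate_by_cluster(profiles, labels, k):
    """Sum the load of all households in each cluster, time step by time step."""
    totals = [[0.0] * len(profiles[0]) for _ in range(k)]
    for profile, label in zip(profiles, labels):
        totals[label] = [t + x for t, x in zip(totals[label], profile)]
    return totals


# Made-up household profiles (kWh per period: night, morning, midday, evening).
households = [
    [0.2, 1.0, 0.3, 1.5],  # morning and evening peaks
    [0.3, 0.4, 1.6, 0.5],  # midday peak
    [0.1, 1.2, 0.2, 1.4],  # morning and evening peaks
    [0.2, 0.5, 1.8, 0.4],  # midday peak
]
labels = kmeans(households, k=2)
cluster_loads = aggregate_by_cluster(households, labels, k=2)
print(labels)
print(cluster_loads)
```

A forecasting model would then be trained on each aggregated cluster load instead of on the individual, more volatile household series.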

For building applications, exogenous data such as HVAC systems or lighting is essential. These factors can leverage demand response for load shifting and increase the accuracy of the load demand for the building. Grid applications can be used to schedule the energy supply better, increasing the operation quality of DSOs. From a demand-side perspective, accurate forecasts make demand-response in smart grids more relevant as there is more certainty of the grid’s future state. Each application requires multivariate inputs for an accurate forecast based on the literature.

Researchers who used more relevant variables obtained well-performing models. However, the literature did not cover the limitations of applying a model with many parameters in a real-world scenario. Adding relevant input variables may increase accuracy, but the resulting models may not be efficient enough for DSOs to rely on their predictions on a short-term basis. Furthermore, increasing a model’s complexity by ensembling multiple different models or using other unique approaches is not necessarily a good idea in a real-world scenario, especially if a simpler model performs at the same level as the complex one.

The main areas introduced in the first section share the primary goal of making forecasts that are as accurate as possible. However, they differ in their real-world applications. Demand-side management uses accurate forecasts to leverage demand response measures or otherwise manage the demand in households, buildings, or smart grids. Grid control, on the other hand, uses forecasts to understand what will happen in the energy system on short to medium-term horizons and to react accordingly from the supply side. Two considerable challenges for both areas are volatile load profiles and uncertainties in consumption. The literature suggests increasing feature engineering and data inputs to understand the variations, seasonality, and load profile patterns.

The previous sections outlined the main areas for load forecasting using deep learning techniques and the state-of-the-art methods applied to households, buildings, and grids. Hybrid models were the most successful for all aspects of the energy domain, outperforming many conventional and single-type neural network models. The literature suggests that the dominant hybrid model is the CNN-LSTM, because the CNN learns features and the LSTM learns sequential dependencies. Researchers have also used non-hybrid CNNs and LSTM networks to forecast electricity consumption successfully, emphasizing the advantages of these two neural networks. While conventional methods perform well, most neural network models outperform them by a large margin, underlining neural networks’ potential for load forecasting. The use of multivariate data is an important factor in achieving accurate models. Depending on the application, such data could include the weather, calendar days, indoor climate, HVAC systems, and occupancy behavior.

Thus, based on the research carried out for this literature review, a convolutional-recurrent neural network hybrid (CNN-RNN) is recommended for the most accurate and efficient load forecasts. The hybrid model has advantages over non-hybrid models, such as improved sequence and pattern recognition in the data. However, it is impossible to decide on a single architecture for every use case because performance varies with the forecasting horizon, data sources, and application scenario. These factors must be analyzed and tested beforehand to find the most suitable model for the scenario. The literature suggests that a hybrid neural network should be trained on, as a minimum, day and time features, weather data, and historical consumption data. Finally, when working with household consumption, it is suggested to aggregate the load of all households or to create aggregated clusters of similar households.
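
To make the minimum input set concrete, the sketch below builds one-step-ahead training samples from historical consumption, a weather variable, and day and time features. The function name, the choice of three lagged values, the use of temperature as the weather input, and all data values are assumptions for illustration. The reviewed publications differ in how they construct their inputs.

```python
from datetime import datetime, timedelta


def make_samples(timestamps, load, temperature, n_lags=3):
    """Build (features, target) pairs for one-step-ahead forecasting."""
    samples = []
    for t in range(n_lags, len(load)):
        ts = timestamps[t]
        features = {
            "lags": load[t - n_lags:t],      # historical consumption
            "temperature": temperature[t],   # weather input for the target hour
            "hour": ts.hour,                 # time feature
            "weekday": ts.weekday(),         # day feature, Monday = 0
            "is_weekend": ts.weekday() >= 5,
        }
        samples.append((features, load[t]))
    return samples


# Made-up hourly data starting on a Saturday.
start = datetime(2021, 2, 20, 0, 0)
timestamps = [start + timedelta(hours=h) for h in range(6)]
load = [0.8, 0.7, 0.6, 0.9, 1.4, 1.8]
temperature = [2.0, 1.5, 1.0, 1.0, 2.5, 4.0]

samples = make_samples(timestamps, load, temperature)
for features, target in samples:
    print(features, "->", target)
```

In a CNN-LSTM setting, the lagged values would typically form the sequence passed through the convolutional and recurrent layers, while the calendar and weather features are supplied as additional input channels or as a separate input.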

Discussion

The review shows that many experiments using deep neural networks for load forecasting have been conducted in recent years, driven by the increasing amount of data available. The results indicate that hybrid models consisting of convolutional and recurrent neural networks are the state-of-the-art methodology for accurate load forecasting. Further, multivariate inputs are standard among deep learning applications. Compared to conventional methods, deep neural networks are in many ways more advantageous for load forecasting.

In the literature, hybrid models are compared to conventional methods and single neural networks with remarkable results (Chan et al., 2019; Krishnan et al., 2020; Dudek et al., 2020). Models must be accurate throughout the year because DSOs depend on the predictions for their grid operation. Thus, the configurations of neural networks must be improved continually to make these models applicable in real-world scenarios. Moreover, the literature divides applications between demand and supply, suggesting that deep neural networks are applied to load forecasting in very different ways within each perspective. Demand-side purposes often require very different input data than supply-side purposes. Demand-side management aims to affect the load without supplying more or less energy; therefore, systems must be in place for that control. DSOs, in contrast, can react to the load by down- or up-regulating energy on the supply side. This means that a low error margin is required, as many inaccurate predictions could result in power outages (Clark & Kelly, 2016).

The results of the scoping review fit well with the rising trend of using deep neural networks to solve problems. While previous research has focused on improving conventional methods such as ARIMA for load forecasting, this review demonstrates that recurrent and convolutional networks applied to time series data are outperforming them. The review confirms the strengths of recurrent neural networks for time series data, as a high number of publications used them for their ability to learn sequential data (Hrnjica & Mehr, 2020). The results also build on the existing evidence that convolutional neural networks are used extensively because of their ability to learn repeating patterns and features in the data automatically (Koprinska et al., 2018).

Based on the analyzed literature, a mapping of the focus aspects and their applications can be made, as shown in Fig. 1. The figure separates the demand-side applications from the supply-side applications. As outlined in the results, public and office buildings primarily relate to demand-side management when it comes to deep learning load forecasting. A slight overlap with the supply side is shown because this scoping review does not cover all publications. Further, smart grid applications primarily relate to the demand side because it is possible to leverage demand-response measures in these grids. Households are mainly placed on the supply side; the exceptions are households that use energy management systems related to demand response. The predictions made for grid applications are also placed on the supply side, where grid operators can react to any change in electric load by regulating energy. Finally, the most common input features for each application are displayed. The input data of the deep neural networks differ significantly depending on the focus aspect. Future research must emphasize the inputs and context of the data for results to be comparable.

Fig. 1 Mapping of the focus aspects and the applications

The limitations of this review lie in the contexts and data sources in which the many models are applied. The amount and type of data used to train the models may affect their performance, limiting generalizability. Different geographical regions may show different consumption patterns depending on the societal background. For instance, the performance in (Barzola-Monteses et al., 2020) on a public building is difficult to compare to the performance on the load of many households as in (Kim & Cho, 2019a). The significant findings from such a comparison are instead the model’s architecture and configuration. Furthermore, this review suggests that model performance differs depending on whether the consumption data is taken from a household, public building, office building, or grid. Therefore, a thorough analysis of each data source and its context is required, and controlled testing of the forecasting horizons with different datasets is needed. However, this is beyond the scope of this review.

Finally, this study narrowed the search to deep learning techniques, leaving out publications focusing heavily on conventional methods. Conventional machine learning methods are compared with deep neural networks only where the reviewed publications benchmark the models against each other, as seen in (Panapongpakorn & Banjerdpongchai, 2019). Consequently, this review does not consider research focusing solely on conventional methods.

Conclusion

This scoping review presents the state-of-the-art methodologies for load forecasting using deep neural networks in the energy domain. The publications identified in the literature search suggest two load forecasting perspectives within the energy domain: demand-side and supply-side. The demand-side perspective focuses on what accurate load forecasts can do to improve demand-response measures. In contrast, the supply-side perspective focuses on using the predictions to control grid operation better. For each perspective, the publications trained their models on several data sources. Households, office and public buildings, grids, smart grids, and countries were among the applications for the models.

The literature suggests that deep learning is emerging as the primary method for electric load forecasting because of its inherent ability to learn features and patterns in data. Based on the scoping review results, it is recommended to use a hybrid model consisting of a convolutional neural network and a recurrent neural network, specifically a long short-term memory network. A CNN-LSTM hybrid can capture repeating patterns in the CNN part and learn sequential features in the LSTM part of the network, making a robust model whose parts complement each other. Furthermore, the literature indicates that a model should use, as a minimum, historical consumption data, weather data, and day features as inputs. However, it must be noted that the forecasting horizon and the data inputs and sources play a substantial role in the models’ performance. Before deciding on a model architecture, benchmark tests on non-hybrid neural networks should be carried out to understand the specific dataset and the demands it places on a neural network.

Future research on electric load forecasting using deep learning should consider the data’s context and source. As this study shows, deep neural networks can be applied to load forecasting problems in various ways. The many applications make it hard to compare the models’ performance against each other, thereby limiting generalizability. It is suggested to carry out controlled tests of hybrid and non-hybrid models on several datasets to understand the difference that the dataset and forecasting horizon can make to the performance of a neural network.