1 Introduction

A smart grid is an electricity network that uses digital communications technology to enable a two-way flow of electricity and data. This makes monitoring, management, and automatic decision-making possible. In addition, the smart grid draws on a wide range of information technology resources to support new and existing guidelines for minimizing energy costs and reducing electricity waste. The motivation for proposing the Long Short-Term Memory (LSTM) model is twofold. First, power station failures behave unpredictably because of factors such as weather conditions, physical structure, control systems, and load behavior. Second, high accuracy in predicting power failures is essential for efficient grid management and for preventing potential disruptions. Because of the unpredictable nature of these failures, traditional methods may struggle to provide accurate short-term predictions. The LSTM model is therefore integrated into a larger framework that includes K-means clustering for pattern recognition and Monte Carlo simulation for accurate temporal prediction.

According to Ali et al. (2013), the smart grid is one of the largest and most complicated systems to design and build, although it is one of the easiest to use. It draws on all kinds of power plants (including hydro, solar, coal, nuclear, wind, and natural gas, among others), substations, transformers, and high-voltage transmission lines (Hasan et al. 2019). There is therefore a need for a demand-responsive electrical grid that uses energy with high efficiency. The traditional grid relies on limited one-way interaction, in which power flows from the power plant to consumers. In contrast, the smart grid introduces a two-way interchange of both information and electricity between consumers and power utilities. The growing network of computers, automation, control, and communications is instrumental in making the grid “greener”, more reliable, more secure, and more efficient (Hasan et al. 2019). The major issues with existing prediction methods are unpredictable characteristics, data complexity, temporal dependencies, the high accuracy requirement, data preprocessing challenges, and model selection and tuning.

The data gathered from the smart grid can be put to work across different aspects of the grid, such as integration with renewable energy sources, management of intermittent power supplies, real-time data responses, and energy pricing strategies, among others (Jakkula and Cook 2007). It is therefore necessary to develop suitable tools and methods that help conserve energy: data are gathered from the smart grid using sensors, patterns are recognized in the historical data, and forecasts are made from them.

Candidate algorithms for prediction include deep learning models such as Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN), and Gated Recurrent Unit (GRU). In this work, the predictor used is the most efficient of these in terms of accuracy and delay.

LSTM is an RNN variant designed specifically for time series data. It addresses the prediction problem while strengthening RNN algorithms with internal memory cells (Li et al. 2020; Bui et al. 2020).

RNNs are a form of neural network that uses feedback connections between nodes to remember the values of previous time steps. As such, they can capture the temporal behaviour of time series data (Tealab 2018).

GRU is a kind of gated RNN that is widely used to mitigate the vanishing gradient problem of RNNs through a gating mechanism, while having a simpler structure than the LSTM without sacrificing its effectiveness (Luo et al. 2021).

However, these prediction methods are based on regression techniques, which try to find a common pattern in the historical samples and use it to predict future values. In our application, the historical samples from the energy generators and the load of the smart city may not follow a constant pattern, owing to the stochastic behavior of the environment. Therefore, to convert this dynamic behavior into a static pattern, the K-means clustering algorithm is used in this work.

K-means is a simple unsupervised learning algorithm used to solve clustering problems. According to Xu et al. (2020), the algorithm follows a procedure that classifies a given data set into a number of clusters, denoted by “K”.

Our methodology in this work is to convert the stochastic behavior of the attributes into an accurate pattern using a clustering algorithm (i.e., K-means). This allows us to be able to identify their fitting curve and use a suitable regression-based algorithm (i.e., LSTM and GRU) for an accurate short-term prediction.

The main objective of this work is to propose a method that allows enhancing the accuracy of the short-term generated power prediction for the smart grid environment.

Although several works have tackled the prediction problem in the smart grid context, most of them focus only on long-term prediction. Long-term prediction supports long-term strategy and planning, and the methods currently used for that task provide accurate predictions; short-term prediction has received far less attention.

The methodology focuses on modelling of the environment.

While the proposed techniques for predicting power station failures in the smart grid offer several advantages, they also come with certain limitations: data dependency, computational complexity, overfitting, and limited interpretability and explainability.

The main contributions to the existing body of knowledge that this study will make include:

  1. Identifying the most useful factors that affect the accuracy of the smart grid short-term prediction process.

  2. Implementing a model (or a combination of existing models) for recognizing patterns of failure in the smart grid.

  3. Identifying the best deep learning algorithm to mine data from a synthetic testbed.

  4. Providing a solution that enhances the level of protection of the dynamic smart grid environment against failures.

To address the knowledge gap, this work converts the stochastic behavior of the attributes into an accurate pattern using a clustering algorithm (i.e., K-means). This makes it possible to identify their fitting curve and to use a suitable regression-based algorithm (i.e., LSTM and GRU) for accurate short-term prediction.

The paper is organized as follows: First, we provide an introduction to the topic and its significance. Next, we review the relevant literature to establish the context and background for our study. We then outline the methodology used in our research, followed by a description of the experiments conducted. The results of these experiments are presented and analyzed. Finally, we discuss our conclusions and suggest potential directions for further research. The paper concludes with a list of references.

2 Related works

Efficient delivery of energy resources to the smart grid requires balancing energy demand and supply through energy resource management strategies. However, significant fluctuations in energy demand and supply make these management schemes more challenging to develop. This problem has been tackled using different approaches, as follows.

For instance, Yu et al. (2015) developed several approaches to predict energy supply and demand effectively. The study develops machine learning-based methods for accurate forecasts of energy consumption and generation, and then uses the prediction results to establish upper and lower bounds on energy consumption, realizing optimal demand and anomaly detection.

Smart meters acquire large amounts of data through sophisticated signal processing algorithms. Tornai et al. (2016) first develop a new classification scheme that categorizes users based on their consumption patterns, and then test the proposed model against benchmark models. Because the available power consumption data are insufficient, the study also uses semi-Markov models to generate more extensive and more realistic test data.

Electrical load prediction is a fundamental factor in planning, operations, and resource management within the grid system. The numerous restructurings of the grid and the integration of new devices heighten the need for forecasting to better plan energy supply and demand. Chemetova et al. (2017) assess a prediction model’s performance and effectiveness against several metrics.

The harvesting of various forms of renewable energy has led to smart grids integrated with photovoltaic (PV) power. However, atmospheric conditions such as rain affect solar irradiance. Huang et al. (2019) propose combining the wavelet transform with an Elman Neural Network (WT-ENN) for short-term forecasting of solar energy production and irradiance. The study reconstructs solar irradiance using the prediction model and the new coefficients, and assesses the model’s performance on two real-world solar irradiance datasets.

Smart grid systems allow consumers to draw more energy from the grid or sell it back to the grid for other consumers, and smart homes with photovoltaic systems can establish their daily energy yield. Parvez et al. (2020) recommend multi-layer perceptron-based photovoltaic forecasting for rooftop PV systems. The study trains the model on historical data, conducts cross-validation, and tests the model using real-world PV data.

Smart grids offer better integration of power systems between energy producers and consumers. The bidirectional nature of these grids calls for energy consumption optimization measures to maintain the grid’s reliability and supply-demand balance. Petrican et al. (2018) evaluate the available short-term energy consumption prediction models to produce next-day forecasts at one-hour intervals, yielding a 24-point forecast. The study thoroughly assesses various machine learning algorithms adopted for forecasting and evaluates the resulting model instances.

Efficient energy delivery in the smart grid requires energy resource management strategies that balance energy supply and demand. Yu et al. (2014) propose several techniques that accurately model and predict energy production and demand over time. The study also recommends modeling analyses in which statistical models of energy consumption and machine learning approaches improve prediction accuracy.

Several countries continue to record increases in the solar power capacity connected to their distribution grids, and the adoption of the smart grid concept has contributed to this increase. Bessa (2014) suggests a new forecasting model that uses autoregressive models and gradient boosting algorithms. The proposed model overcomes information and communication technology (ICT) limitations to support solar energy forecasts at the secondary substation level, and combines the values obtained from various distributed sensors.

The many new appliances that consumers now use in their households overwhelm the existing smart grid infrastructure, which was not originally developed to support these devices. Lauer et al. (2019) propose various methods by which energy providers can improve their household energy consumption forecasts despite the variability in electrical appliance usage. The authors assess the existing prediction models and their significance, and then describe modeling techniques that cover statistical approaches and machine learning algorithms.

Internationally, photovoltaic systems are already among the most widely deployed kinds of renewable energy. Still, photovoltaic energy is highly unpredictable because it is subject to the elements, which presents challenges for network operators in integrating, controlling, and operating this form of energy. Renewable energy forecasting is therefore growing in importance as a strategy for managing the integration of renewable energy into the electrical grid. Souabi et al. (2023) compare available algorithms for analyzing energy data, applying three neural networks (LSTM, FFNN, and GRU) to a solar production database for smart houses. The study uses data from 300 randomly chosen solar customers in the Ausgrid electricity network area, collected from July 1, 2010, through June 30, 2013. The data cover both general electricity consumption (always supplied) and controlled-load consumption.

Eco-friendly generators, and wind generators in particular, have emerged as a solution to the electricity demand problem in light of the projected depletion of fossil fuels in the coming years and the detrimental impact of the exhaust gases they emit. Because the output power of wind turbines depends on the variable and intermittent behavior of wind speed, it is difficult to make maximum use of this renewable energy source. Most studies have focused on developing novel algorithms, but none has included a data pre-processing phase aimed at extracting as much useful information as possible from existing datasets. Rodríguez et al. (2023) explore the potential for improving the precision of wind speed forecasts over a 10-minute horizon by integrating time-frequency decomposition techniques with various machine learning approaches. The average wind speed and the monthly and hourly values from the database were used as inputs to simple models, and the results were then used in turn. After the forecasters are trained, their accuracy is measured against a validation dataset. The reported error metrics indicate that the wind speed forecasting model achieved an accuracy within 0.1% on the validation database approximately 62% of the time.

To increase customer satisfaction with bike sharing programs, managers need to predict trip demand accurately so that they can better manage the distribution and relocation of bikes. In recent years, many deep learning techniques have been put forward to improve the accuracy of bicycle utilization prediction. To capture spatial-temporal dependency in past trip demand, it is common to combine a convolutional neural network (CNN) with a recurrent neural network (RNN). The convolution in a regular CNN is usually carried out by a kernel that “walks” over a “matrix-format” city to extract features over geographically close neighborhoods. In the study by Li et al. (2023), one dockless bike sharing system in Singapore and four station-based systems, in cities including London, New York, and Washington D.C., are used to compare the proposed concept with a set of benchmark models. The study finds that IrConv+LSTM performs better than the benchmark models across all five cities.

Control centers make extensive use of short-term load forecasting to investigate shifting consumer load patterns and to anticipate the load at a future time. It is a crucial component in building a smart grid, and the load is influenced by a wide variety of factors. Chen et al. (2023) first reconstruct the data using several feature parameters and feed them into a ResNet network to extract features. The extracted feature vector is then fed to an LSTM to make near-term load predictions. Finally, the proposed combination is compared with other models, including MLR, CNN, LSTM, CNN-LSTM, and ResNet, on a real-world example, demonstrating the viability and superiority of the input parameter feature extraction. Each model makes predictions for three time periods: December 1, 2010; December 1 to December 2, 2010; and December 1 to December 7, 2010.

Precise and effective load prediction is highly important for the stable operation and scheduling of contemporary power systems. However, load data are typically nonlinear and non-stationary, which makes precise forecasting challenging. While certain serial hybrid models have proved effective at extracting spatiotemporal features from load data, sequential feature extraction is inefficient because it loses significant features. Hua et al. (2023) investigate a novel ensemble framework for short-term load forecasting that uses a parallel CNN and GRU together with an enhanced variant of ResNet (iResNet). First, the raw data are preprocessed to reconstruct the electrical characteristics. The CNN then extracts spatial features, whereas the GRU extracts temporal features. The two sets of extracted features are integrated through a dynamically operating attention mechanism, and the iResNet model is used to forecast power consumption accurately.

3 Materials and methods

To realize our proposed solution, we first need to identify the environment model, which in this case includes a photovoltaic grid. Next, the energy generation process, which depends on the physical structure and the weather conditions, and the generated signal need to be modelled and then simulated. The generated data are then clustered using the K-means clustering algorithm. Finally, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks are used to provide the short-term predictions.

3.1 Environment and PV Model

The Earth orbits the Sun once in approximately 8766 h (about 365.242 days). The Earth is closest to the Sun (147 million km) on January 2; this point is called perihelion.

The position of the Sun relative to a specific point on Earth is determined by two angles: the altitude angle (\(\alpha\)) and the azimuth angle (\(\theta_{s}\)).

The altitude angle is the angular height of the Sun measured from the horizontal. It can be given by (Duffie et al. 2013):

$$\sin \alpha = \sin L \sin \delta + \cos L \cos \delta \cos \omega$$
(1)

where \(L\): latitude of the location, \(\delta\): angle of declination, \(\omega\): hour angle.

The declination angle is the angle between the Earth-Sun vector and the equatorial plane. It is calculated in degrees, whereas the arguments of the trigonometric functions are expressed in radians (Duffie et al. 2013).

(2)

The hour angle \(\omega\) is the angular displacement of the Sun relative to the local point (the local meridian) and is given by (Iqbal 1983):

$$\omega = 15^\circ \left( {AST - 12h} \right)$$
(3)

Apparent solar time (AST) is based on the apparent daily motion of the true, observed Sun. It is built on the actual solar day, that is, the interval between two consecutive returns of the Sun to the local meridian. AST is given by (Duffie et al. 2013):

$$AST = LMT + E_{0} T \pm 4\left( {LSMT - LOD} \right)$$
(4)

\(LMT\): Local meridian time, \(LOD\): Longitude, \(LSMT\): Local standard meridian time, \(E_{0} T\): Equation of time, \(AST\): Apparent solar time, h: Hour.

The \(LSMT\) is the reference meridian used for a particular time zone, in the same way that the prime meridian is used for Greenwich Mean Time (Iqbal 1983).

\({\text{LSMT}}\) is given by:

$$LSMT = 15^{ \circ } T_{GMT}$$
(5)

The \(E_{0} T\) is the difference between apparent and mean solar times, both taken at a given longitude at the same real instant of time (Duffie et al. 2013).

\(E_{0} T\) is given by:

$$E_{0} T = 9.87\sin \left( {2B} \right) - 7.53 \cos B - 1.5\sin B$$
(6)

where, \(B\) can be given by;

$$B = \frac{2\pi }{{365}}\left( {N - 81} \right)$$
(7)

where, N: Day number defined as the number of days elapsed each year up to a particular date (Iqbal 1983).
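As a minimal sketch of how Equations (3), (6), and (7) can be evaluated, the following Python functions compute the equation of time and the hour angle. The function names are illustrative and not taken from the original implementation; the equation of time in this form is expressed in minutes.

```python
import math

def equation_of_time_minutes(day_number: int) -> float:
    """Equation of time from Eqs. (6)-(7), in minutes."""
    b = 2.0 * math.pi * (day_number - 81) / 365.0  # Eq. (7), in radians
    return 9.87 * math.sin(2.0 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

def hour_angle_deg(ast_hours: float) -> float:
    """Hour angle in degrees from apparent solar time in hours, Eq. (3)."""
    return 15.0 * (ast_hours - 12.0)

if __name__ == "__main__":
    # Hypothetical inputs: first day of the year, 14:30 apparent solar time.
    print(equation_of_time_minutes(1))
    print(hour_angle_deg(14.5))
```

The hour angle is zero at solar noon and changes by 15° per hour, negative in the morning and positive in the afternoon.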

The azimuth angle is the angular displacement of the Sun reference line from the source axis (Duffie et al. 2013). It can be given by:

$$\sin \theta_{s} = \frac{\cos \delta \sin \omega }{{\cos \alpha }}$$
(8)
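The altitude and azimuth relations in Equations (1) and (8) can be sketched in Python as below. Because the expression for the declination is not reproduced in the text, the sketch assumes Cooper's approximation, a common choice in the solar engineering literature; this is an assumption, not necessarily the form used by the authors. All function names are illustrative.

```python
import math

def declination_deg(day_number: int) -> float:
    # ASSUMPTION: Cooper's approximation stands in for the declination formula.
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_number) / 365.0))

def altitude_deg(lat_deg: float, decl_deg: float, hour_angle_deg: float) -> float:
    """Altitude angle from Eq. (1); all angles in degrees."""
    lat, decl, omega = (math.radians(v) for v in (lat_deg, decl_deg, hour_angle_deg))
    sin_alpha = (math.sin(lat) * math.sin(decl)
                 + math.cos(lat) * math.cos(decl) * math.cos(omega))
    return math.degrees(math.asin(sin_alpha))

def azimuth_deg(decl_deg: float, hour_angle_deg: float, alt_deg: float) -> float:
    """Azimuth angle from Eq. (8); all angles in degrees."""
    decl, omega, alpha = (math.radians(v) for v in (decl_deg, hour_angle_deg, alt_deg))
    sin_theta = math.cos(decl) * math.sin(omega) / math.cos(alpha)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # guard against rounding error
    return math.degrees(math.asin(sin_theta))

if __name__ == "__main__":
    # London latitude from the dataset description, solar noon on day 15.
    decl = declination_deg(15)
    alt = altitude_deg(51.5074, decl, 0.0)
    print(decl, alt, azimuth_deg(decl, 0.0, alt))
```

Since Equation (8) yields only the sine of the azimuth, the arcsine returns values in the range ±90°; resolving azimuths beyond that range would need an additional test that the text does not describe.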

The solar source model estimates the radiation emitted by the Sun. The radiant energy of an emitting object is described as a function of its temperature.

We associate radiating energy with the blackbody. A blackbody is defined as a perfect absorber and emitter: a perfect absorber absorbs all the received energy without any reflection (Planck 1914).

Planck’s law describes the wavelengths emitted by a blackbody at a specific temperature as follows:

$$E_{\lambda } = \frac{{3.74 \times 10^{8} }}{{\lambda^{5} \left[ {\exp \left( {\frac{14400}{{\lambda T}}} \right) - 1} \right]}}$$
(9)

\(E_{\lambda }\): Spectral emissive power of the blackbody per unit area (W/m\(^{2}\) µm), \(T\): Absolute temperature of the blackbody (K), \(\lambda\): Wavelength (µm).
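A minimal sketch of Equation (9) in Python follows. It reads the second radiation constant as 14400 µm K, which is an assumption about the intended value (the accepted constant is close to 14388 µm K). The temperature used in the example is a hypothetical value chosen for illustration only.

```python
import math

C1 = 3.74e8    # first constant of Eq. (9), W µm^4 / m^2
C2 = 14400.0   # ASSUMPTION: second constant of Eq. (9) read as 14400 µm K

def spectral_emissive_power(wavelength_um: float, temperature_k: float) -> float:
    """Blackbody spectral emissive power in W/(m^2 µm), following Eq. (9)."""
    exponent = C2 / (wavelength_um * temperature_k)
    return C1 / (wavelength_um ** 5 * (math.exp(exponent) - 1.0))

if __name__ == "__main__":
    # Hypothetical blackbody temperature of 5800 K.
    for wavelength in (0.3, 0.5, 1.0):
        print(wavelength, spectral_emissive_power(wavelength, 5800.0))
```

For a fixed temperature the curve rises to a single peak and then decays, so evaluating it at a few wavelengths is a quick sanity check on the constants.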

The solar radiation outside the atmosphere varies as the Earth orbits the Sun. The distance between the Sun and the Earth must therefore be considered when modeling extraterrestrial solar radiation (Duffie et al. 2013). The extraterrestrial solar radiation \(G_{{{\text{ex}}}}\) is given by:

$$G_{{{\text{ex}}}} = G_{0} \left[ {\frac{{R_{av} }}{R}} \right]^{2}$$
(10)

where, \(G_{{{\text{ex}}}}\): Extraterrestrial solar radiation. \(G_{0}\): Solar constant. \(R_{av}\) : Mean distance between the Sun and the Earth. \(R\) : Instantaneous distance between the Sun and the Earth depends on the day of the year or day number.

There are different approximations for the factor \((R_{av} /R)^{2}\) in the literature (Iqbal 1983). A recommended approximation can be given by:

$$\left[ {\frac{{R_{av} }}{R}} \right]^{2} = 1 + 0.0333 \cos \left[ {\frac{2\pi N}{{365}}} \right]$$
(11)

Substituting Equation (11) into Equation (10), the extraterrestrial solar radiation per unit time falling at a right angle on one square meter of surface can be given by:

$$G_{ex} = G_{0} \left( {1 + 0.0333 \cos \left[ {\frac{2\pi N}{{365}}} \right]} \right)$$
(12)

When the surface faces the Sun (normal to a central ray), the solar irradiance falling on it is \(G_{ex}\), the maximum solar radiation at that distance. If the surface is not normal to the Sun, the solar radiation falling on it is reduced by the cosine of the angle between the surface normal and a central ray from the Sun (Duffie et al. 2013).

Thus, the extraterrestrial solar radiation on a horizontal surface located in a specific location (\(G_{exH}\)) can be calculated by: (Iqbal 1983).

$$G_{exH} = G_{ex} \cos \varphi$$
(13)

where, φ: Solar zenith angle.

The solar zenith angle is the complement of the altitude angle, so \(\cos \varphi = \sin \alpha\) (Duffie et al. 2013). Using Equations (1) and (12), Equation (13) can be rewritten as follows:

$$G_{exH} = G_{0} \left[ {1 + 0.0333 \cos \left[ {\frac{2\pi N}{{365}}} \right]} \right] \left( {\sin L \sin \delta + \cos L \cos \delta \cos \omega } \right)$$
(14)

Finally, the total extraterrestrial solar energy \(E_{{{\text{ex}}}}\) (Wh/m2) is calculated as follows: (Duffie et al. 2013).

$$E_{ex} = \mathop \smallint \limits_{{T_{sr} }}^{{T_{ss} }} G_{exH} {\text{d}}t$$
(15)
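The chain from Equation (12) to Equation (15) can be sketched numerically as follows. The sketch takes the cosine of the zenith angle as the sine of the altitude angle from Equation (1), clamps night-time values to zero so that the sum effectively runs from sunrise to sunset, and integrates with a midpoint rule. The solar constant of 1367 W/m² is an assumed, commonly used value that the text does not state, and the function names are illustrative.

```python
import math

G0 = 1367.0  # ASSUMPTION: solar constant in W/m^2, not given in the text

def g_ex(day_number: int) -> float:
    """Extraterrestrial radiation on a surface normal to the Sun, Eq. (12)."""
    return G0 * (1.0 + 0.0333 * math.cos(2.0 * math.pi * day_number / 365.0))

def g_ex_horizontal(day_number: int, lat_deg: float, decl_deg: float,
                    hour_angle_deg: float) -> float:
    """Extraterrestrial radiation on a horizontal surface, Eqs. (13)-(14)."""
    lat, decl, omega = (math.radians(v) for v in (lat_deg, decl_deg, hour_angle_deg))
    cos_zenith = (math.sin(lat) * math.sin(decl)
                  + math.cos(lat) * math.cos(decl) * math.cos(omega))
    return max(0.0, g_ex(day_number) * cos_zenith)  # zero when the Sun is down

def daily_energy_wh(day_number: int, lat_deg: float, decl_deg: float,
                    step_minutes: int = 5) -> float:
    """Daily extraterrestrial energy in Wh/m^2, Eq. (15), by the midpoint rule."""
    dt_hours = step_minutes / 60.0
    steps = int(round(24.0 / dt_hours))
    total = 0.0
    for k in range(steps):
        solar_time = (k + 0.5) * dt_hours          # midpoint of the step
        omega_deg = 15.0 * (solar_time - 12.0)     # Eq. (3)
        total += g_ex_horizontal(day_number, lat_deg, decl_deg, omega_deg) * dt_hours
    return total

if __name__ == "__main__":
    # London latitude; the declination of -21 degrees is a hypothetical January value.
    print(daily_energy_wh(15, 51.5074, -21.0))
```

The five-minute default step mirrors the sampling interval of the load dataset, but any step that divides the day evenly works.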

In addition to the direct \((G_{B,\beta } )\) and diffuse \((G_{D,\beta } )\) components, the solar radiation on a tilted surface includes reflected solar radiation \((G_{R})\); together these form the global solar radiation incident on a tilted surface (Duffie et al. 2013).

$$G_{T,\beta } = G_{B,\beta } + G_{D,\beta } + G_{R}$$
(16)

In terms of the solar energy components on a horizontal surface, this can be written as follows (Iqbal 1983):

$$G_{T,\beta } = G_{B} R_{B} + G_{D} R_{D} + G_{T} \rho R_{R}$$
(17)

where \(R_{B}\), \(R_{D}\), and \(R_{R}\) are coefficients, and \(\rho\) is the ground albedo. \(R_{B}\): Ratio of direct solar energy on a tilted surface to direct solar energy on a horizontal surface. \(R_{D}\): Ratio of diffuse solar energy on a tilted surface to diffuse solar energy on a horizontal surface. \(R_{R}\): Factor of reflected solar energy on a tilted surface.

Finding the solar energy components on a tilted surface therefore comes down to estimating the coefficients \(R_{B}\), \(R_{D}\), and \(R_{R}\). The model used for calculating \(R_{B}\) is the Liu and Jordan model (Liu and Jordan 1963).

$$R_{B} = \frac{{\cos \left( {L - \beta } \right)\cos \delta \sin \omega_{ss} + \omega_{ss} \sin \left( {L - \beta } \right)\sin \delta }}{{\cos L \cos \delta \sin \omega_{ss} + \omega_{ss} \sin L \sin \delta }}$$
(18)

For surfaces in the southern hemisphere that slope toward the equator, \({R}_{B}\) is given as:

$$R_{B} = \frac{{\cos \left( {L + \beta } \right)\cos \delta \sin \omega_{ss} + \omega_{ss} \sin \left( {L + \beta } \right)\sin \delta }}{{\cos L \cos \delta \sin \omega_{ss} + \omega_{ss} \sin L \sin \delta }}$$
(19)

The most recommended formula for \({R}_{R}\) is:

$$R_{R} = \frac{1 - \cos \beta }{2}$$
(20)
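Equations (17), (18), and (20) can be combined in a short Python sketch. The function and parameter names are illustrative; \(R_{D}\) is passed in as a plain number because the text does not fix a particular diffuse model, and the sunset hour angle must be supplied by the caller.

```python
import math

def r_b(lat_deg: float, tilt_deg: float, decl_deg: float,
        sunset_hour_angle_deg: float) -> float:
    """Liu and Jordan coefficient R_B from Eq. (18); inputs in degrees."""
    lat, tilt, decl, w_ss = (math.radians(v) for v in
                             (lat_deg, tilt_deg, decl_deg, sunset_hour_angle_deg))
    numerator = (math.cos(lat - tilt) * math.cos(decl) * math.sin(w_ss)
                 + w_ss * math.sin(lat - tilt) * math.sin(decl))
    denominator = (math.cos(lat) * math.cos(decl) * math.sin(w_ss)
                   + w_ss * math.sin(lat) * math.sin(decl))
    return numerator / denominator

def r_r(tilt_deg: float) -> float:
    """Reflected-radiation factor R_R from Eq. (20)."""
    return (1.0 - math.cos(math.radians(tilt_deg))) / 2.0

def global_tilted(g_b: float, g_d: float, g_t: float,
                  rb: float, rd: float, rr: float, albedo: float) -> float:
    """Global radiation on a tilted surface from Eq. (17)."""
    return g_b * rb + g_d * rd + g_t * albedo * rr

if __name__ == "__main__":
    # Hypothetical inputs: 30 degree tilt in London, ground albedo of 0.2.
    rb = r_b(51.5074, 30.0, -21.0, 60.0)
    print(global_tilted(100.0, 50.0, 150.0, rb, 1.0, r_r(30.0), 0.2))
```

With zero tilt the sketch reduces to the horizontal case: \(R_{B}\) equals one and \(R_{R}\) equals zero, which is a convenient check on the implementation.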

Models for \(R_{D}\) have been classified into isotropic and anisotropic models.

Four statistical error measures are used:

  1. Mean absolute percentage error (\(MAPE\)).

  2. Mean bias error (\(MBE\)).

  3. Mean absolute error (\(MAE\)).

  4. Root mean square error (\(RMSE\)).

The general accuracy of a neural network can be highlighted by \(MAPE\) (Hyndman et al. 2006), which can be defined as follows:

$$MAPE = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {\frac{{M_{i} - P_{i} }}{{M_{i} }}} \right|$$
(21)

where \(M_{i}\): Measured data, \(P_{i}\): Predicted data.

The long-term performance of the neural network model can be evaluated by \(MBE\) (Willmott et al. 2005), which can be calculated as follows:

$$MBE = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} (P_{i} - M_{i} )$$
(22)

The mean absolute error (\(MAE\)) is a measure of the errors between paired observations expressing the same phenomenon (Willmott et al. 2005). Examples of Y versus X include comparisons of predicted versus observed, subsequent time versus initial time, and one technique of measurement versus an alternative technique of measurement. \(MAE\) is calculated as:

$$MAE = \mathop \sum \limits_{i = 1}^{n} \left( {\frac{{\left| {y_{i} - x_{i} } \right|}}{n}} \right)$$
(23)

where, \(y_{i}\) : Prediction, \(x_{i}\) : True value, n: Total number of data points.

The final statistical error measure is \(RMSE\), which represents the variation of the predicted data around the measured data (Chai et al. 2014).

The short-term performance of the model can be evaluated by \(RMSE\):

$$RMSE = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} (P_{i} - M_{i} )^{2} }$$
(24)
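The four error measures in Equations (21) to (24) can be sketched in plain Python as below. The function names and the sample values are illustrative; MAPE is returned as a fraction, matching Equation (21), and would need to be multiplied by 100 to express it as a percentage.

```python
import math
from typing import Sequence

def mape(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction, Eq. (21)."""
    return sum(abs((m - p) / m) for m, p in zip(measured, predicted)) / len(measured)

def mbe(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean bias error, Eq. (22); positive values indicate over-prediction."""
    return sum(p - m for m, p in zip(measured, predicted)) / len(measured)

def mae(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, Eq. (23)."""
    return sum(abs(p - m) for m, p in zip(measured, predicted)) / len(measured)

def rmse(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, Eq. (24)."""
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / len(measured))

if __name__ == "__main__":
    measured = [100.0, 200.0]    # hypothetical measured values
    predicted = [110.0, 190.0]   # hypothetical predictions
    print(mape(measured, predicted), mbe(measured, predicted),
          mae(measured, predicted), rmse(measured, predicted))
```

In this small example the over- and under-prediction cancel in the MBE while the MAE and RMSE remain non-zero, which illustrates why the bias measure is reported alongside the magnitude-based measures.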

3.2 Problem definition and proposed Model

Having identified the models for the environment, the operation, and the generated signals, now the problem of the accurate generated power prediction can be formulated as below.

$$F\left( {A^{*} } \right) = \left( {\mathop {\max }\limits_{0 \le \eta \le 1} P,\eta , \mathop {\min }\limits_{{0 \le Q_{S} \le 1}} Q_{S} } \right)$$

The above definition is a min-max optimization problem,

where \(P\): PV power output, \(\eta\): Conversion efficiency of the PV module, \(Q_{S}\): Thermal energy losses from the modules through radiation and convection heat transfer.

While the optimization variables can be defined as:

However, a problem with all these variables is NP-hard and cannot be solved using traditional optimization techniques. Therefore, to solve this problem, we follow the methodology below.

After modelling the environment, the operation, and the signals so as to generate patterns that resemble the real patterns of the physical instruments, these patterns are clustered using the K-means clustering algorithm.
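The clustering step can be illustrated with a minimal K-means sketch in plain Python. This is a generic textbook version (Lloyd's algorithm) with random initial centroids, not the authors' implementation; the function names, the seed, and the toy data are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])))

def kmeans(points: List[Point], k: int, iterations: int = 50,
           seed: int = 0) -> Tuple[List[Point], List[int]]:
    """Cluster the points into k groups and return (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as initial centroids
    for _ in range(iterations):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        updated = [tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                   if cluster else centroids[j]
                   for j, cluster in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]

if __name__ == "__main__":
    # Toy one-dimensional samples forming two well-separated groups.
    data = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
    print(kmeans(data, k=2))
```

Each sample is represented as a tuple, so the same code handles multi-attribute samples without change.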

Next, we implement a Monte Carlo simulation, identifying the bounds of all the remaining stochastic variables and of the optimization outputs, as mentioned in the above table. Within the iterations of this simulation, the Particle Swarm Optimization (PSO) algorithm, a stochastic metaheuristic, is applied to identify the optimal values of the selected optimization variables.
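To make the PSO step concrete, the following is a minimal one-dimensional sketch of a standard particle swarm optimizer with bound clamping. The inertia and acceleration coefficients, the swarm size, and the toy objective are assumptions chosen for illustration; the actual objective and variable bounds of the proposed method are not reproduced here.

```python
import random
from typing import Callable, Tuple

def pso_minimize(objective: Callable[[float], float], lower: float, upper: float,
                 particles: int = 20, iterations: int = 100, seed: int = 0,
                 inertia: float = 0.5, c1: float = 1.5, c2: float = 1.5
                 ) -> Tuple[float, float]:
    """Minimize a scalar objective on [lower, upper]; returns (best_x, best_value)."""
    rng = random.Random(seed)
    pos = [rng.uniform(lower, upper) for _ in range(particles)]
    vel = [0.0] * particles
    best_pos = pos[:]                              # personal best positions
    best_val = [objective(x) for x in pos]         # personal best values
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]        # global best
    for _ in range(iterations):
        for i in range(particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (inertia * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            pos[i] = min(upper, max(lower, pos[i] + vel[i]))  # keep within bounds
            value = objective(pos[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = pos[i], value
                if value < g_val:
                    g_pos, g_val = pos[i], value
    return g_pos, g_val

if __name__ == "__main__":
    # Hypothetical objective with its minimum at 0.3 on the interval [0, 1].
    print(pso_minimize(lambda x: (x - 0.3) ** 2, 0.0, 1.0))
```

In a Monte Carlo setting, a routine of this kind would be called once per sampled realization of the stochastic variables; to maximize a quantity, the objective can simply be negated.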

Once the optimal parameter values are found, these parameters and the output values are used, together with the current and previous loads, to produce the short-term prediction with, for example, the LSTM or GRU algorithm. Fig. 1 shows the block diagram summarizing the proposed method.

Fig. 1 Methodology block diagram

The main targets for improvement are as follows. Achieving high accuracy in predicting power failures is the primary objective: the accuracy of short-term predictions is improved by addressing the challenges posed by the unpredictable nature of power failures and the complex interaction of factors such as weather conditions and load behavior. The remaining targets are the optimal values of the future predicted \(P\), \({Q}_{S}\), and efficiency obtained with the LSTM and GRU algorithms, and the statistical errors. In the prediction phase, LSTM (see Fig. 2a) was selected first. Its training input is the output of the clustering phase, of size 1400 × 34, followed by 300 hidden layers and three output signals representing the next (future) temporal values of \(\eta\), \(P\), and \(Q_{S}\). In the LSTM we used look up so that only the most useful, most closely related samples are used in building the pattern. The GRU (see Fig. 2b) follows the same structure, so that both algorithms can be compared on the same benchmark to determine which provides the most accurate future temporal values and which has the fastest processing time.

Fig. 2 GRU and LSTM Structures
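For readers unfamiliar with the gating mechanism, the sketch below shows one time step of a single-unit LSTM cell with scalar input, using the standard forget, input, and output gates. It is a didactic illustration only: the parameter names are hypothetical, and the network used in this work is much larger than a single unit.

```python
import math
from typing import Dict, Tuple

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x: float, h_prev: float, c_prev: float,
              p: Dict[str, float]) -> Tuple[float, float]:
    """One time step of a single-unit LSTM; returns (hidden state, cell state)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate cell value
    c = f * c_prev + i * g       # keep part of the old memory, add new content
    h = o * math.tanh(c)         # expose a gated view of the memory
    return h, c

if __name__ == "__main__":
    # Hypothetical weights, all set to 0.1, applied to a short input sequence.
    names = ("wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg")
    params = {name: 0.1 for name in names}
    h, c = 0.0, 0.0
    for sample in (0.2, 0.4, 0.6):
        h, c = lstm_step(sample, h, c, params)
    print(h, c)
```

A GRU follows the same idea with two gates and no separate cell state, which is why it is structurally simpler.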

3.3 Datasets

Regarding the load, we acquired a dataset with short-term slots at a 5-min sampling interval, which is very useful for our short-term prediction application. This dataset (the dataset employed by this research can be retrieved from UK Smart Grid Industry 2021-2024) contains 371 samples, each separated by 5 min from the next, covering the period from January 1 to January 2, 2020. The dataset was generated from a real site located in London, UK, at coordinates 51.5074°N, 0.1278°W.

3.4 Results

Modelling the sun’s position for a specific location like London involves predicting where the sun will be in the sky at different times of the day and year. This process considers factors like the latitude and longitude of London, the date and time, and the Earth’s tilt and orbit. By understanding the sun's position, we can better analyze how sunlight interacts with the area, impacting variables such as temperature and solar radiation. This modelling helps in various applications, from urban planning to renewable energy development.

In Fig. 3, two plots depict aspects of the sun’s position in London throughout the year. The first plot illustrates the alpha angle, which represents the sun’s position from day 0 to day 350. This angle helps visualize how the sun’s position changes over the course of the year relative to London’s coordinates. The second plot shows Theta, another angle that describes the sun’s position over time at the same location. Both plots provide valuable insights into the sun’s movements and can aid in understanding factors like daylight duration and solar energy availability in London throughout the year.
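
The kind of calculation behind such plots can be sketched as follows. This is a minimal illustration, not the model used in this work: it assumes that the alpha angle is the solar altitude and that Theta is the zenith angle (the text does not define them), and it uses Cooper's approximation for the solar declination. The function names are hypothetical.

```python
import math

LATITUDE_DEG = 51.5074  # London latitude quoted in Section 3.3


def declination_deg(day_of_year):
    """Solar declination in degrees (Cooper's approximation, assumed here)."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def altitude_deg(day_of_year, solar_hour, latitude_deg=LATITUDE_DEG):
    """Solar altitude in degrees; negative means the sun is below the horizon."""
    phi = math.radians(latitude_deg)
    delta = math.radians(declination_deg(day_of_year))
    omega = math.radians(15.0 * (solar_hour - 12.0))  # hour angle, 15 deg per hour
    sin_alt = (math.sin(phi) * math.sin(delta)
               + math.cos(phi) * math.cos(delta) * math.cos(omega))
    return math.degrees(math.asin(sin_alt))


def zenith_deg(day_of_year, solar_hour, latitude_deg=LATITUDE_DEG):
    """Zenith angle in degrees, the complement of the altitude."""
    return 90.0 - altitude_deg(day_of_year, solar_hour, latitude_deg)


if __name__ == "__main__":
    for day in (1, 172, 350):
        print(day, round(altitude_deg(day, 12.0), 2), round(zenith_deg(day, 12.0), 2))
```

The hour argument is solar time, so converting from clock time (longitude and equation-of-time corrections) is left out of this sketch.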

Fig. 3

The alpha angle of the sun position

Figure 4 shows the hourly extraterrestrial solar radiation profile for 16 days of January, with one plot per date. The plots are similar in shape, but the peak value of each day increases with the day number: for example, the peak is around 350 on day 1, around 360 on day 2, and so on up to the largest value. The x-axis is \(\left(LMT\right)\) and the y-axis is \(\left(GextH\right)\). Fig. 5 shows the solar radiation in five-minute steps for a single day.
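
The day-to-day rise of the January peak can be reproduced with a textbook expression for extraterrestrial radiation on a horizontal surface. The sketch below is illustrative only and is not the model used here: the solar constant of 1367 W/m², Cooper's declination formula, and the eccentricity correction are assumptions, and the resulting values are not expected to match Fig. 4 exactly.

```python
import math

SOLAR_CONSTANT = 1367.0   # W/m^2, a commonly used value (assumption)
LATITUDE_DEG = 51.5074    # London latitude quoted in Section 3.3


def declination_rad(day):
    """Solar declination in radians (Cooper's approximation, assumed here)."""
    return math.radians(23.45 * math.sin(math.radians(360.0 * (284 + day) / 365.0)))


def g_ext_horizontal(day, solar_hour, latitude_deg=LATITUDE_DEG):
    """Extraterrestrial irradiance on a horizontal plane, clipped to 0 at night."""
    phi = math.radians(latitude_deg)
    delta = declination_rad(day)
    omega = math.radians(15.0 * (solar_hour - 12.0))  # hour angle
    cos_zenith = (math.sin(phi) * math.sin(delta)
                  + math.cos(phi) * math.cos(delta) * math.cos(omega))
    # Correction for the varying Earth-Sun distance over the year.
    eccentricity = 1.0 + 0.033 * math.cos(math.radians(360.0 * day / 365.0))
    return max(0.0, SOLAR_CONSTANT * eccentricity * cos_zenith)


def hourly_profile(day):
    """24 hourly values for one day, indexed by solar hour."""
    return [g_ext_horizontal(day, hour) for hour in range(24)]


# Daily peak for the first 16 days of January.
peaks = {day: max(hourly_profile(day)) for day in range(1, 17)}

if __name__ == "__main__":
    for day, peak in peaks.items():
        print(day, round(peak, 1))
```

Because the declination rises after the winter solstice, the computed peak grows a little each day through January, which is the same trend described for Fig. 4.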

Fig. 4

The hourly extraterrestrial solar radiation

Fig. 5

Five minutes step for only one day solar radiation

Figure 6a displays the solar radiation data recorded at five-minute intervals for a single day in January. This data provides insights into the intensity of sunlight received at the specified location in London during that time period. On the other hand, Fig. 6b showcases the diffuse solar radiation specifically for the same day in January and at the same five-minute intervals. Diffuse solar radiation refers to sunlight scattered by the atmosphere before reaching the Earth’s surface, and its measurement aids in understanding the distribution of sunlight in the area. These plots offer detailed information on solar radiation patterns, crucial for various applications such as solar energy planning and building design.

Fig. 6

The solar radiation for one day and diffuse solar radiation showing time from sunrise to sunset and from sunset to sunrise

Figure 7 presents samples of global solar radiation and diffuse solar radiation for specific days, including the first day, day 50, day 100, day 180, day 250, and day 360. Each sample day provides insight into the variation of solar radiation throughout that particular day. The data is based on 60-minute intervals of solar radiation measurements for each day. By observing these variations, we can gain a better understanding of how solar radiation fluctuates over the course of a day and how it may change from one day to another. This information is valuable for predicting future solar radiation patterns and can aid in various applications such as energy production forecasting and solar panel efficiency optimization.

Fig. 7

Global solar radiation and diffused solar radiation at different days in the year based on one day and 7h

The analysis of Figs. 1 to 7 reveals a notable observation: the position of the sun for the same day varies across different years. This variability is a fundamental characteristic of natural phenomena, including wind patterns, dust levels, shading effects, and more. As depicted in the figures, these natural factors exhibit stochastic behavior, meaning they follow random and unpredictable patterns. Since electricity generation from renewable resources is heavily influenced by these factors, predicting the amount of energy generated becomes a complex challenge. Recognizing this challenge, this study contributes by developing a methodology aimed at addressing the issue of short-term prediction for energy generation reliant on renewable resources and their associated factors. By devising techniques to forecast the short-term future energy output under the influence of these stochastic natural elements, this research aims to enhance our ability to effectively utilize renewable energy sources despite their inherent unpredictability. This contribution is crucial for improving energy planning and management in the context of renewable energy systems.

4 Discussion

Figure 8 displays the generated signals comprising 1000 normal and 1000 faulty signals, with each signal characterized by 34 features. These signals have been produced utilizing the stochastic features of Monte Carlo simulation, adding a layer of randomness and variability to the dataset. This dataset serves as a testing ground for evaluating the performance of the prediction model. By incorporating stochastic elements into the signal generation process, the dataset reflects real-world scenarios where various factors contribute to signal behavior unpredictability. Testing the prediction model with such a diverse and realistic dataset enables researchers to assess its robustness and effectiveness in accurately predicting outcomes despite inherent stochasticity.
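
A minimal sketch of this kind of Monte Carlo signal generation is given below. Only the counts (1000 signals per class, 34 features each) come from the text. The Gaussian distributions and their parameters are hypothetical placeholders, since the actual sampling distributions are not stated.

```python
import random

N_FEATURES = 34       # features per signal, from the text
N_PER_CLASS = 1000    # signals per class, from the text

# Hypothetical (mean, standard deviation) per class; not taken from the paper.
NORMAL_PARAMS = (1.0, 0.05)
FAULTY_PARAMS = (0.6, 0.20)


def draw_signal(rng, mean, std):
    """Draw one signal: N_FEATURES independent Gaussian samples."""
    return [rng.gauss(mean, std) for _ in range(N_FEATURES)]


def generate_signals(seed=42):
    """Return (normal, faulty) lists of randomly sampled signals."""
    rng = random.Random(seed)  # seeded so that the draw is reproducible
    normal = [draw_signal(rng, *NORMAL_PARAMS) for _ in range(N_PER_CLASS)]
    faulty = [draw_signal(rng, *FAULTY_PARAMS) for _ in range(N_PER_CLASS)]
    return normal, faulty


normal_signals, faulty_signals = generate_signals()

if __name__ == "__main__":
    print(len(normal_signals), len(faulty_signals), len(normal_signals[0]))
```

Fixing the seed keeps a stochastic dataset reproducible between runs, which helps when two predictors are compared on the same benchmark.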

Fig. 8

Sample traffic from the generated big-data signals

Figure 9 addresses the status of each pattern, that is, whether a signal is normal or faulty. In real life, the behavior of both normal and faulty traffic is somewhat stochastic and does not follow a stable pattern, so the exact features of the input pattern and its target cannot be identified in advance. The problem can therefore be defined as a clustering problem, which we solved with the k-means clustering algorithm. The previously generated sample traffic was passed to the k-means algorithm, which was trained on it and then gave a very clear recognition of the traffic status, as shown in Fig. 9. The given pattern appears in the first column of the table, next to the predicted pattern produced by the k-means algorithm; the two are identical, with no missing values. Accordingly, the calculated loss is close to 0, which means that the accuracy is almost 100%.
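
The clustering step can be illustrated with a compact k-means (Lloyd's algorithm) in plain Python. This is a sketch under stated assumptions rather than the implementation used here: the toy data are hypothetical, and the deterministic farthest-point initialization is a choice made for reproducibility, not something taken from the text.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid_of(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def farthest_point_init(points, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from the centroids chosen so far (deterministic)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k=2, iterations=20):
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = centroid_of(members)
    return labels, centroids


def agreement(labels, truth):
    """Fraction of matching labels, allowing the two cluster ids to be swapped."""
    same = sum(1 for a, b in zip(labels, truth) if a == b)
    return max(same, len(truth) - same) / len(truth)


# Hypothetical toy data: 50 "normal" and 50 "faulty" signals with 34 features.
rng = random.Random(1)
normal = [[rng.gauss(1.0, 0.05) for _ in range(34)] for _ in range(50)]
faulty = [[rng.gauss(0.6, 0.05) for _ in range(34)] for _ in range(50)]
points = normal + faulty
truth = [0] * 50 + [1] * 50

labels, centroids = kmeans(points, k=2)

if __name__ == "__main__":
    print("agreement with the given pattern:", agreement(labels, truth))
```

Cluster ids carry no inherent meaning, so the given and predicted patterns are compared up to a relabeling of the two clusters.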

Fig. 9

Pattern clustering with K-means

4.1 Prediction final results (for LSTM)

Figure 10 depicts the plots of input training and target training, as well as input testing and target testing for the prediction process, specifically using LSTM (Long Short-Term Memory) models. Additionally, it showcases the output of the prediction process. Remarkably, the output closely resembles the target testing data, indicating the effectiveness of the LSTM model in accurately predicting outcomes based on the input data. This alignment underscores the model's capability to capture and learn from patterns in the training data, allowing it to make accurate predictions for unseen testing data. Such performance validates the utility of LSTM models in forecasting tasks, particularly in scenarios where temporal dependencies and long-range dependencies are prevalent. The notations used and their ranges are defined in Table 1.

Fig. 10

The input training and target training input testing and targeted testing for the prediction process (LSTM)

Table 1 Definitions of the notations used and their ranges

Figure 11 illustrates the prediction process for forecasting future traffic based on recognized patterns. To achieve this, the identified parameters are fed into a prediction algorithm, specifically utilizing Long Short-Term Memory (LSTM), which has been determined to be the most effective recognition technique through trial and error, as indicated in the accompanying table. The training, validation, and testing phases of the prediction process are visualized in Fig. 11. This process involves training the LSTM model on historical traffic data, validating its performance, and testing its predictive capabilities on unseen data. LSTM models are well-suited for this task due to their ability to capture long-term dependencies in sequential data, making them a valuable tool for traffic prediction and forecasting applications.

Fig. 11

Prediction (LSTM model and parameters)

In Fig. 12, the prediction task is performed using a different approach: Gated Recurrent Unit (GRU), while utilizing the same input data. This comparison phase is conducted to evaluate the performance of GRU against LSTM. It’s evident from the results that GRU exhibits slightly faster computation compared to LSTM. However, this efficiency comes at the cost of a slightly higher error rate, approximately 1%. Despite the faster processing speed, the GRU model sacrifices a small degree of accuracy compared to LSTM. This trade-off highlights the importance of considering both speed and accuracy requirements when selecting the appropriate model for a given prediction task. Overall, the comparison between LSTM and GRU offers valuable insights into their respective strengths and weaknesses, aiding in the selection of the most suitable model for specific predictive analytics applications.
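
One plausible reason for the speed difference is model size: a standard LSTM layer has four gate-like weight blocks, whereas a GRU layer has three. The sketch below counts trainable parameters for a single recurrent layer under two assumptions: the "300 hidden layers" mentioned earlier are read as 300 hidden units, and each block has one bias vector. Some library implementations add further bias terms, so exact counts may differ slightly.

```python
def recurrent_params(n_inputs, n_hidden, n_blocks):
    """Weights and biases of one recurrent layer with n_blocks gate blocks.

    Each block holds an input matrix (n_hidden x n_inputs), a recurrent
    matrix (n_hidden x n_hidden), and one bias vector (n_hidden).
    """
    per_block = n_hidden * (n_inputs + n_hidden) + n_hidden
    return n_blocks * per_block


N_INPUTS = 34    # features per sample, from the text
N_HIDDEN = 300   # assumption: 300 hidden units

lstm_params = recurrent_params(N_INPUTS, N_HIDDEN, n_blocks=4)
gru_params = recurrent_params(N_INPUTS, N_HIDDEN, n_blocks=3)

if __name__ == "__main__":
    print("LSTM parameters:", lstm_params)
    print("GRU parameters: ", gru_params)
    print("GRU / LSTM ratio:", gru_params / lstm_params)
```

Under these assumptions the GRU layer carries three quarters of the LSTM layer's parameters, so it does less arithmetic per time step, in line with the slightly faster computation reported here.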

Fig. 12

Prediction (GRU model and parameters)

Table 2 compares Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) in order to identify the more efficient algorithm for short-term prediction. The evaluation is based on Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), Mean Bias Error (MBE), and Accuracy, with the goal of identifying the algorithm whose predictions are closest to the actual values. Evaluating both models across these metrics allows a comprehensive assessment of their predictive capabilities. The algorithm that minimizes the errors while maximizing accuracy is the one to select for reliable short-term predictions.
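
For reference, the four error metrics can be computed as in the sketch below, using their usual definitions. The sign convention for MBE (predicted minus actual) is an assumption, and the sample values are made up for illustration. Accuracy is left out because its definition is not given here.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / len(actual)


def mbe(actual, predicted):
    """Mean Bias Error; positive means over-prediction (assumed convention)."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


# Made-up values, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 420.0]

if __name__ == "__main__":
    print("RMSE:", round(rmse(actual, predicted), 4))
    print("MAE: ", mae(actual, predicted))
    print("MAPE:", round(mape(actual, predicted), 4))
    print("MBE: ", mbe(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, while MBE shows whether a model systematically over- or under-predicts, so the four numbers are best read together.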

Table 2 LSTM and GRU final comparative results

After implementing the optimization algorithm, we obtained the optimal values for the LSTM and GRU algorithms. Table 3 lists the exact optimal values of the future predicted P, \({Q}_{S}\), and efficiency.

Table 3 Exact optimal values of the future predicted P, \({Q}_{S}\), and efficiency using the LSTM and GRU algorithms

5 Conclusions

Accurately estimating faults in the electric supply, which is based on data exhibiting stochastic behavior, poses a significant challenge. Two crucial factors in this estimation are weather conditions and load behavior. These factors require a thorough analysis of active generated power (QS, which indicates the quality of the solar resource) and the active efficiency of the photovoltaic (PV) grid. Our proposed solution leverages K-means clustering to convert this stochastic data into a recognizable pattern of faults, which can then be used for short-term predictions using Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) algorithms. These algorithms provide quick and accurate results due to their capability to handle sequential data and remember important information over long periods.

To further enhance the application of fault estimation, we used the results from K-means clustering as inputs for Monte Carlo simulations. Monte Carlo simulations are statistical techniques that utilize repeated random sampling to compute the results, enabling accurate temporal predictions. This method validates our initial assumption regarding the stochastic nature of the data. The outcomes demonstrate the high accuracy of short-term predictions based on this stochastic approach. The dataset used for our research, obtained from the UK Smart Grid Industry (covering the period from 2021 to 2024), comprised 371 samples with short-term slots at 5-minute intervals. This dataset proved to be highly effective for our short-term prediction application.

6 Future work

Future work will need to consider dynamic environmental changes imposed by emerging non-zero carbon policies. These changes will introduce new parameters in weather conditions and sunlight availability, which are critical for solar power generation. As a result, new factors such as temperature, wind speed, and humidity will need to be incorporated into the predictive models. Although the mathematical models presented in this paper have proven to be accurate, these new environmental factors will necessitate updates to the LSTM and GRU models to ensure continued accuracy in predictions. Specifically, future research should focus on refining these models to account for the varying impacts of these additional environmental parameters on solar power generation and fault estimation.