1 Introduction

Electric energy is an indispensable part of modern human life. As the world population continues to grow and countries seek to improve the welfare of their people by producing and consuming more industrial goods, additional electric energy is demanded every year. However, the sources for energy production are scarce and mostly non-renewable [1]. In addition, unexpected uncertainties such as droughts and pandemic outbreaks have considerable impacts on clean energy production [2]. Therefore, state authorities and the related international organizations must be able to accurately predict future energy demands and the corresponding imported and local sources for energy production. To this end, this paper develops a hierarchical prediction model consisting of non-parametric source models and parametric imported-local source models that account for various uncertainties.

As in many fields, machine learning algorithms are frequently used in estimating energy production from different sources and in related problems. In the literature, while there are a number of studies on the estimation of electricity production from solar energy [3], few studies address electricity production from non-renewable or fuel-based sources [4, 5]. In this context, different methods have been recommended for global energy demand forecasting [6]. Additionally, there are further studies on power load prediction [7] and on improving the reliability of power systems using parametric and non-parametric machine learning techniques [8]. These studies contribute to affordable and clean energy, responsible consumption and production, sustainable cities and communities, climate action and life on land, as outlined in the United Nations Sustainable Development Goals (SDGs) [9].

Source prediction models are utilized for various purposes such as discovering the locations, quantities and qualities of minerals. Zang et al. reported that China initiated the national resources assessment project in 2006 and completed it in 2013 to reveal 25 crucial minerals. Within this project, 80 statistical and machine learning prediction models were developed and integrated hierarchically to perform regional and national exploration of the resources [10]. Ahmadi et al. applied fuzzy logic, a neural network and the imperialist competitive algorithm to remotely predict oil flow rates. To train these machine learning algorithms, they collected 1600 data points from 50 wells in Iran, where the temperature and pressure of the lines were the inputs and the oil flow rate was the output of the algorithm [11]. Qiao et al. constructed a wavelet basis function for a sparse autoencoder with a long short-term memory algorithm and estimated the daily natural gas production and consumption in the United States of America [12]. Li et al. implemented vector autoregressive, radial basis function network, genetic algorithm and particle swarm optimization algorithms to predict the medium- and long-term coal demand of China. The inputs of the algorithm were the gross domestic product, coal prices, coal production, industrial structure, energy structure, total population and domestic production data between 1987 and 2012 [13].

Energy consumption prediction models play an important role in energy planning, management and conservation. Wang et al. proposed a stacking model that combines the advantages of various prediction algorithms so that the final trained model can perform predictions from different spatial and structural angles. To show the superiority of the stacking model, it was compared with the random forest, gradient boosted decision tree, extreme gradient boosting, support vector machine and k-nearest neighbour algorithms [14]. Lei et al. incorporated rough set theory to weaken the data coupling factors and a deep learning algorithm to reduce the dimension of the energy consumption data. Short-term and medium-term energy consumption data were used for testing, and the results were compared with the Elman neural network and fuzzy logic algorithms [15]. Kim and Cho proposed a deep learning-based prediction model with an autoencoder to select the informative data for the training. The model optimizes a cost function which essentially formulates the optimum electric energy management [16].

Energy production prediction models can provide short-, medium- and long-term energy production estimates. Wasilewski and Baczynski constructed a multi-layer neural network that optimizes a multi-criteria cost function to predict the intraday and next-day wind power-based electric energy production. The inputs of the model were the historical weather forecasts, and the outputs were the energy production of two wind farms with different power capacities [17]. Similarly, Bugal et al. considered a radial basis function neural network to forecast the electric energy production from solar panels. The inputs were the number of sunny hours, length of the day, air pressure, maximum air temperature, daily insolation and cloudiness, and the output was the produced energy [18]. Monteiro et al. implemented a three-layer neural network to estimate hourly electric energy production from a hydro power plant. The first layer forecasted the daily average power production, the second layer modified and finalized it, and the third layer compared the estimate with the most recent available data to update the first layer estimates [19]. Piotrowski et al. implemented long short-term memory, multi-layer neural network, support vector machine and k-nearest neighbour machine learning algorithms to forecast the electric energy production two days ahead. With these algorithms, the wind power, solar power and hydropower-based electric energy productions were estimated and statistically compared [20].

All these models generally provide short-term energy estimates from a single source such as the wind or hydro. However, countries produce electric energy from various sources, a number of which have priority since they are locally available. In addition, these models are non-parametric: their parameter spaces do not correspond to the real parameter spaces, and they ignore the various uncertainties. Based on these gaps, the key contributions of this paper can be summarized as follows. This paper:

  1. Develops non-parametric source models which have higher-order polynomial bases with 16 inputs including the imported coal, natural gas, wind and solar powers.

  2. Modifies the iterative neural network and the batch type least squares machine learning algorithms to optimize the unknown polynomial parameters of the source models.

  3. Generates the model of the uncertainties occurring due to unexpected drought, watery years, building new hydropower dams and pandemic outbreaks.

  4. Constructs parametric imported-local source models and incorporates the modelled uncertainties into them.

  5. Modifies the particle swarm optimization meta-heuristic algorithm to learn the unknown imported-local source model parameters.

  6. Predicts 10 years ahead individual and total imported and local sources for the energy production in Türkiye.

  7. Performs experiments on the real data, analyses and compares the results extensively.

In the rest of the paper, Sect. 2 presents the proposed model architecture, Sect. 3 provides the input–output training data, Sect. 4 constructs the non-parametric source models, Sect. 5 introduces the parametric imported-local source models, Sect. 6 analyses the experimental results, and finally Sect. 7 summarizes the paper and outlines future work.

2 Proposed Model Architecture

Figure 1 illustrates the proposed hierarchical parametric and non-parametric forecasting source models. The non-parametric source models in Fig. 1 receive the current 16 source data \({x}_{k-1}\) (\(k\) denotes years) covering the sources from the imported coal to the solar power. These non-parametric models estimate the next-year sources \({x}_{k}\), a number of which are imported or local sources used for the electric energy production. These estimated sources \({x}_{k}\), the estimated imported sources \({I}_{k}^{t}\), the estimated local sources \({L}_{k}^{t}\), the imported model uncertainty \({I}_{k}^{u}\) and the local model uncertainty \({L}_{k}^{u}\) are fed to the parametric imported-local source models in Fig. 1. These parametric models estimate the total future imported sources \({I}_{k+1}^{t}\) and the total future local sources \({L}_{k+1}^{t}\). The following sections introduce each part of the proposed model architecture in Fig. 1.

Fig. 1 Proposed model architecture

3 Input–Output Training Data

Understanding the key characteristics of the input–output training data greatly helps researchers to build sophisticated, accurate and reliable parametric prediction models. Therefore, this section provides the imported and local sources utilized for the energy production in Türkiye between 2001 and 2021, which are essentially the input–output data of the prediction models.

The next subsection presents the imported sources utilized for the energy production.

3.1 Imported Sources for the Energy Production

Countries with insufficient sources have to meet their energy production demands from foreign sources. Dynamic economies, fast-growing populations and unexpected uncertainties such as droughts and pandemic outbreaks make it challenging to predict the imported sources for the energy production. Therefore, as shown in Fig. 2, alongside the clear patterns in the imported sources, abrupt rises and falls, referred to as uncertainties, also appear.

Fig. 2 The imported fuel oil, coal and natural gas sources for the energy production (GWh) in Türkiye

As can be clearly seen from Fig. 2, the use of fuel oil as an energy production source has moderately decreased over the years and has remained at its minimum between 2011 and 2021, which is probably due to its ongoing requirement for watering the plants in rural areas of the country. It is also noticeable from Fig. 2 that both the imported coal and the natural gas have grown rapidly, where the natural gas has larger fluctuations in some years than the coal. This is because the natural gas is mostly transported through pipelines while the coal is carried by ships. Hence, producing electric energy from the natural gas is easier and faster. It is important to note that the fluctuations in the imported sources mainly stem from unexpected uncertainties such as the drought and watery years, which are parametrized in Sects. 5.1.2 and 5.1.3. Together with these widely used sources, there are also seldom-used sources for the energy production, as shown in Fig. 3.

Fig. 3 The imported LNG, LPG, naphtha and motorin for the energy production (GWh) in Türkiye

Figure 3 clearly illustrates that even though the motorin and naphtha have larger amounts than the LPG and LNG, they still constitute a small share of the imported sources compared to the coal and natural gas. It is noticeable that although the LPG and naphtha were once used quite extensively for the energy production, they fell completely out of use in 2005 and 2010, respectively. Even though these sources have sharp increases and decreases in their amounts, their shares among the imported sources are small, so the impact of the uncertainties associated with them is also small.

The next subsection introduces the local sources for the energy production.

3.2 Local Sources for the Energy Production

Dependency on the foreign energy sources causes a current account deficit and also can be used as a political tool against the countries. Thus, most of the countries prioritize the local sources and also invest in new sources like the renewable energies for the energy production. Figure 4 presents the geothermal, wind, lignite, and hydro local sources for the energy production in Türkiye.

Fig. 4 The local geothermal, wind, lignite and hydro sources for the energy production (GWh) in Türkiye

As can be seen from Fig. 4, the energy production from the wind, which is a clean and renewable energy source, has steadily increased since 2009. Similarly, the energy production from the geothermal, being a clean and reliable energy source, has risen, although more slowly and less noticeably than the wind-based energy production. The energy production from the hydro was 24,000 GWh in 2001 and jumped to 89,000 GWh in 2019, which was a watery year. It is also noticeable that the energy production from the hydro has more than doubled between 2001 and 2021. This is due to the fact that Türkiye has recently invested heavily in building large dams for the energy production. In addition, even though the lignite-based energy production has a slight upward trend, it has fluctuated over the years with a large variance. The hydro-based energy production also changes between successive years, but these changes are mostly due to watery and drought years, which are generally unexpected. Figure 5 demonstrates the remaining local sources for the energy production in Türkiye.

Fig. 5 The local waste heat, solar, asphaltite, hard coal and biomass sources for the energy production (GWh) in Türkiye

As shown in Fig. 5, the biomass-based energy production has leaped from 230 GWh in 2001 to 6720 GWh in 2021. In addition, the asphaltite-based energy production has surged from 447.6 GWh in 2009 to 2373 GWh in 2021. A similar sharp rise has also been observed in the solar-based energy production, which has risen from 10 GWh in 2018 to 1550 GWh in 2021. Even though there is a slight upward trend in the hard coal-based energy production, it has varied considerably from year to year. To see the whole picture of the energy production sources, the next subsection provides a comparative analysis of the total imported and local sources for the energy production.

3.3 Total Imported-Local Sources for Energy Production

The developed parametric models predict the long-term future imported and local sources for the energy production, and Fig. 6 shows their output training data. As can be seen from Fig. 6, the imported and local sources for the energy production had equal shares in 2001 and 2002. After 2003, however, both the imported and local sources for the energy production have mostly increased steadily. In addition, it is clear that the amount of the sources has fluctuated slightly or sharply in a number of years due to drought years, watery years, the COVID-19 pandemic, and the building of new dams and solar panel fields for the energy production, which are specified in Table 1.

Fig. 6 Total imported and local sources-based energy production in Türkiye

Table 1 Extraordinary years, reasons, imported and local sources for the energy production

It is also useful to analyse the percentage of each source in terms of the total imported and local sources shown in Fig. 7. As can be understood from Fig. 7, the imported and local sources constitute 52.5% and 47.5% of the total sources for the energy production in Türkiye, respectively. Among the imported sources, the imported coal and the natural gas have the largest percentages, which are 23.3% and 73.4%, respectively. Among the local sources, the hydro and lignite sources have 48% and 33.8%, respectively. These four sources greatly dominate the energy production in Türkiye, but all the sources will be considered in the model development.

Fig. 7 Percentages of the total imported and local sources-based energy production (GWh) in Türkiye

Since the imported sources are the alternative to the local sources, they are highly coupled to each other as demonstrated in Table 1. The only exception to this statement is the year 2010, when the new dams contributed to the local sources but this was not reflected in the imported sources. This is probably due to the large overall energy demand just after the global economic crises in 2008 and 2009. All these expected and unexpected changes will be considered as uncertainties in this paper and will be modelled in Sects. 5.1.2 and 5.1.3.

The next section introduces the non-parametric source models.

4 Non-Parametric Source Models

Since the sources for the energy production such as the solar, wind and biomass are independent (de-coupled) from each other, their unknown parameter spaces can be expanded and modelled as non-parametric models. This section creates a higher-order polynomial basis for the source models and optimizes their unknown parameters with the neural network (NN) and batch least squares (BLS) machine learning approaches.

4.1 NN-Based Optimization of the Source Models

A polynomial basis consisting of bias, linear and quadratic terms is constructed to create a smooth parameter space with a first-order derivative for each source model. For example, the basis \({\mathbf{b}}_{k}^{{S}^{o}}\) for the solar source \({S}^{o}\) is built as

$${\mathbf{b}}_{k}^{{S^{o} }} = \left[ {\begin{array}{*{20}c} 1 & {S_{k}^{o} } & {\left( {S_{k}^{o} } \right)^{2} } \\ \end{array} } \right]$$
(1)

The corresponding unknown parameter vector \({\mathbf{w}}_{k}^{{S}^{o}}\) is

$${\mathbf{w}}_{k}^{{S^{o} }} = \left[ {\begin{array}{*{20}c} {w_{1}^{{S^{o} }} } & {w_{2}^{{S^{o} }} } & {w_{3}^{{S^{o} }} } \\ \end{array} } \right]$$
(2)

The estimated output \({\widehat{y}}_{k}^{{S}^{o}}\) is

$$\hat{y}_{k}^{{S^{o} }} = {\mathbf{w}}_{k}^{{S^{o} }} *{\mathbf{b}}_{k}^{{S^{oT} }}$$
(3)

The estimation error \({e}_{k}^{{S}^{o}}\) is

$$e_{k}^{{S^{o} }} = y_{k}^{{S^{o} }} - \hat{y}_{k}^{{S^{o} }}$$
(4)

The NN-based parameter update rule of \({\mathbf{w}}_{k}^{{S}^{o}}\) in Eq. (2) is

$${\mathbf{w}}_{k}^{{S^{o} }} = {\mathbf{w}}_{k - 1}^{{S^{o} }} - \eta^{{S^{o} }} \frac{{\partial \left( {e_{k}^{{S^{o} }} } \right)^{2} }}{{\partial {\mathbf{w}}_{k}^{{S^{o} }} }}$$
(5)

where \({\eta }^{{S}^{o}}\) is the learning rate. Evaluating the partial derivative in Eq. (5) by using Eqs. (3) and (4), and absorbing the resulting constant factor of 2 into the learning rate, yields

$${\mathbf{w}}_{k}^{{S^{o} }} = {\mathbf{w}}_{k - 1}^{{S^{o} }} + \eta^{{S^{o} }} e_{k}^{{S^{o} }} {\mathbf{b}}_{k}^{{S^{o} }}$$
(6)

Since the unknown parameter vector is updated iteratively, the learning problem can adapt itself to the changing source dynamics. The BLS-based optimization, introduced next, uses the whole available data to learn the unknown parameters.
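The iterative update of Eqs. (1)–(6) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `poly_basis` and `nn_fit`, the learning rate, the number of passes over the data and the synthetic normalized series are all hypothetical and are not the paper's data or settings. The pairing of the input \(S_{k-1}\) with the target \(S_{k}\) follows the one-step-ahead structure described in Sect. 2.

```python
import numpy as np

def poly_basis(s):
    """Basis of Eq. (1): bias, linear and quadratic terms."""
    return np.array([1.0, s, s * s])

def nn_fit(series, eta=0.3, epochs=200):
    """Sample-by-sample update of Eq. (6): w <- w + eta * e * b."""
    w = np.zeros(3)
    for _ in range(epochs):
        for s_prev, s_next in zip(series[:-1], series[1:]):
            b = poly_basis(s_prev)
            e = s_next - w @ b       # estimation error, Eq. (4)
            w = w + eta * e * b      # parameter update, Eq. (6)
    return w

# Synthetic, normalized 21-sample series (assumed, not the paper's data)
series = np.linspace(0.1, 1.0, 21)
w = nn_fit(series)
pred = np.array([w @ poly_basis(s) for s in series[:-1]])
rmse = np.sqrt(np.mean((series[1:] - pred) ** 2))
```

The data are normalized here because the update in Eq. (6) is only stable when the learning rate is small relative to the squared norm of the basis; raw GWh values would require a much smaller learning rate or prior scaling.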

4.2 BLS-Based Optimization of the Source Models

The BLS can reduce the effects of the random uncertainties because it performs the optimization by using the whole data. The basis in Eq. (1) is modified for the BLS-based optimization as

$${\mathbf{b}}^{{S^{o} }} = \left[ {\begin{array}{*{20}c} {1^{N \times 1} } & {{\mathbf{S}}^{o} } & {{\mathbf{S}}^{o} \cdot {\mathbf{S}}^{o} } \\ \end{array} } \right]$$
(7)

where \({\mathbf{S}}^{o}\in {R}^{N}\) has dimension \(N\) and \(\cdot\) denotes the element-wise product, so that the third column holds the squared data. The estimated output \({\widehat{y}}_{k}^{{S}^{o}}\) with the unknown parameter vector \({\mathbf{w}}^{{S}^{o}}\) is

$$\hat{y}_{k}^{{S^{o} }} = {\mathbf{w}}^{{S^{o} }} *{\mathbf{b}}^{{S^{oT} }}$$
(8)

The quadratic training error is

$$\begin{aligned} e^{2} & = \left( {y - {\mathbf{b}}^{{S^{o} }} {\mathbf{w}}^{{S^{oT} }} } \right)^{T} \left( {y - {\mathbf{b}}^{{S^{o} }} {\mathbf{w}}^{{S^{oT} }} } \right)\\ & = y^{T} y - 2{\mathbf{w}}^{{S^{o} }} {\mathbf{b}}^{{S^{oT} }} y + {\mathbf{w}}^{{S^{o} }} {\mathbf{b}}^{{S^{oT} }} {\mathbf{b}}^{{S^{o} }} {\mathbf{w}}^{{S^{oT} }}\end{aligned}$$
(9)

The slope of the quadratic batch error with respect to the unknown parameters \({\mathbf{w}}^{{S}^{oT}}\) is

$$\frac{{\partial e^{2} }}{{\partial {\mathbf{w}}^{{S^{oT} }} }} = - 2{\mathbf{b}}^{{S^{oT} }} y + 2{\mathbf{b}}^{{S^{oT} }} {\mathbf{b}}^{{S^{o} }} {\mathbf{w}}^{{S^{oT} }}$$
(10)

Applying the stationary condition on Eq. (10) yields the optimum unknown parameters given by

$${\mathbf{w}}^{{S^{oT} }} = \left( {{\mathbf{b}}^{{S^{oT} }} {\mathbf{b}}^{{S^{o} }} } \right)^{ - 1} {\mathbf{b}}^{{S^{oT} }} y$$
(11)
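As a hedged illustration of the closed-form solution in Eq. (11), the sketch below builds the batch basis of Eq. (7) and solves the least squares problem with NumPy. The function name `bls_fit` and the noise-free quadratic test data are assumptions made for the example. `np.linalg.lstsq` is used in place of the explicit inverse because it solves the same problem in a numerically safer way; the normal-equation form of Eq. (11) is computed alongside for comparison.

```python
import numpy as np

def bls_fit(s_in, y):
    """Least squares fit on the batch basis of Eq. (7)."""
    # Columns: ones, data, element-wise squared data
    B = np.column_stack([np.ones_like(s_in), s_in, s_in * s_in])
    # Same minimizer as Eq. (11), without forming the inverse explicitly
    w, *_ = np.linalg.lstsq(B, y, rcond=None)
    return w, B

# Assumed noise-free quadratic data, for illustration only
s = np.linspace(0.0, 1.0, 20)
y = 2.0 + 0.5 * s - 1.5 * s**2
w, B = bls_fit(s, y)

# Direct evaluation of Eq. (11) through the normal equations
w_normal = np.linalg.solve(B.T @ B, B.T @ y)
```

Both routes return the parameters as a one-dimensional array; on well-conditioned data they agree to numerical precision.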

The next section introduces the imported and local source models which are used to predict the future imported and local sources for the energy production.

5 Parametric Imported-Local Source Models

In contrast to the non-parametric source models introduced in Sect. 4, the developed imported and local source models in this section are parametric since there is an available exact parameter space. Moreover, contrary to the single dimensional source models in Sect. 4, the developed imported and local models are multi-dimensional due to high coupling among the imported and local sources. This section introduces the parametric model structures, their associated uncertainties and their PSO-based optimization.

5.1 Parametric Model Structures and Uncertainties

Specifying the model structures leads to parametric models whose parameter spaces correspond to the real parameter spaces. Hence, the modelling error, i.e. the structured uncertainty, is eliminated. This section constructs the parametric model structures and the unstructured uncertainties.

5.1.1 Parametric Imported-Local Source Prediction Models

The uncertain imported source model can be parameterized as

$$\hat{I}_{k + 1} = {\mathbf{w}}_{k}^{I} *{\mathbf{b}}_{k}^{{I^{T} }}$$
(12)

where \({\widehat{I}}_{k+1}\) is the imported source estimate, \(k\) is the sample index representing the years, and \({\mathbf{w}}_{k}^{I}\in {R}^{1\times 10}\) is the unknown parameter vector of the imported model, each of whose 10 entries is associated with the corresponding entry of the basis vector \({\mathbf{b}}_{k}^{I}\) given by

$${\mathbf{b}}_{k}^{I} = \left[ {I_{k}\;\; L_{k}\;\; F_{k}^{o}\;\; M_{k}^{o}\;\; I_{k}^{c}\;\; N_{k}^{g}\;\; L_{k}^{g}\;\; L_{k}^{p}\;\; N_{k}^{a}\;\; I_{k}^{u} } \right]$$
(13)

Table 2 introduces the imported model basis \({\mathbf{b}}_{k}^{I}\) data. Similarly, the uncertain local source model can be parameterized as

$$\hat{L}_{k + 1} = {\mathbf{w}}_{k}^{L} *{\mathbf{b}}_{k}^{{L^{T} }}$$
(14)

where \({\widehat{L}}_{k+1}\) is the local source estimate and \({\mathbf{w}}_{k}^{L}\in {R}^{1\times 12}\) is the unknown parameter vector, each of whose 12 entries is associated with the corresponding entry of the basis \({\mathbf{b}}_{k}^{L}\) given by

$${\mathbf{b}}_{k}^{L} = \left[ {L_{k}\;\;I_{k} \;\; H_{k} ^{c} \;\; L_{k} ^{i} \;\; A_{k} ^{s} \;\; B_{k}^{i} \;\; W_{k}^{h}\;\; H_{k} ^{y} \;\; W_{k} ^{i} \;\; G_{k} ^{e} \;\; S_{k} ^{o} \;\; L_{k} ^{u} } \right]$$
(15)
Table 2 Imported model basis data

Table 3 introduces the data in the local model basis \({\mathbf{b}}_{k}^{L}\).

Table 3 Local model basis data

The next subsection introduces the parametric uncertainties of the imported source model.

5.1.2 Parametric Imported Source Model Uncertainties

Table 1 specifies the observable uncertainties in the sources utilized for the energy production. As indicated in Table 1, 2014 was a drought year, and since the hydro-based energy production reduced significantly as shown in Fig. 4, this reduction was compensated by the natural gas-based energy as can be seen from Fig. 2. One can formulate the corresponding uncertainty as the natural gas-based energy production in 2014 minus the mean of the productions in 2013 and 2015, because the latter were attained under certain conditions. One can represent it as

$$I_{14}^{u} = N_{14}^{g} - \frac{1}{2}\left( {N_{13}^{g} + N_{15}^{g} } \right)$$
(16)

where \({I}_{14}^{u}\) is the amount of the imported model uncertainty, \({N}_{13}^{g}\), \({N}_{14}^{g}\), \({N}_{15}^{g}\) are the natural gas-based energy productions in 2013, 2014 and 2015, respectively.

In Table 1, 2015 is labelled as a watery year, which caused a sharp decrease in the natural gas-based energy production as exhibited in Fig. 2. The watery year was also accompanied by an increase in the biomass-based energy production, and the corresponding uncertainty is formulated as

$$I_{15}^{u} = \frac{1}{2}\left( {N_{15}^{g} + B_{15}^{i} } \right) - \frac{1}{4}\left[ {\left( {I_{14}^{u} + N_{14}^{g} + N_{16}^{g} } \right) + \left( {B_{14}^{i} + B_{16}^{i} } \right)} \right]$$
(17)

where \({I}_{15}^{u}\) is the amount of the imported model uncertainty in 2015, \({N}_{16}^{g}\) is the natural gas-based energy production in 2016, \({B}_{14}^{i}\), \({B}_{15}^{i}\), \({B}_{16}^{i}\) are the biomass-based energy productions in 2014, 2015 and 2016, respectively.

The imported model uncertainties \({I}_{17}^{u}\) and \({I}_{19}^{u}\) associated with the drought year in 2017 and the watery year in 2019 can be modelled similarly as in Eqs. (16) and (17), respectively. However, the drought year and COVID-19 pandemic in 2021 require a slightly different formulation since 2021 is the last year with available data, so no succeeding year can be used as a reference. In this year, the natural gas-based energy production increased whereas the imported coal-based energy production reduced. Therefore, one can formulate it as

$$I_{21}^{u} = \frac{1}{2}\left( {N_{21}^{g} + I_{21}^{c} } \right) - \frac{1}{4}\left[ {\left( {N_{19}^{g} + N_{20}^{g} } \right) + \left( {I_{19}^{c} + I_{20}^{c} } \right)} \right]$$
(18)

where \({I}_{21}^{u}\) is the amount of the imported model uncertainty, \({N}_{19}^{g}\), \({N}_{20}^{g}\), \({N}_{21}^{g}\) are the natural gas-based energy productions in 2019, 2020 and 2021, respectively, \({I}_{19}^{c}\), \({I}_{20}^{c}\), \({I}_{21}^{c}\) are the imported coal-based energy productions in 2019, 2020 and 2021, respectively. Thus, the imported model uncertainty is

$$I_{k}^{u} = \left[ {0^{1 \times 13}\;\;\; I_{14}^{u}\;\;\; I_{15}^{u}\;\;\; 0\;\;\; I_{17}^{u}\;\;\; 0\;\;\; I_{19}^{u}\;\;\; 0\;\;\; I_{21}^{u} } \right]$$
(19)

The orders of the zeros are determined based on the years of the uncertainties provided in Table 1 and the starting year of the data. The next subsection quantifies the local model uncertainties.
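A minimal sketch of how Eqs. (16) and (19) could be evaluated is given below. The helper names `neighbour_deviation` and `uncertainty_vector`, and the placeholder natural gas series with an artificial bump in 2014, are assumptions made for illustration; they are not the paper's data. Only the 2014 entry is filled in; the remaining extraordinary years would be inserted in the same way.

```python
import numpy as np

YEARS = np.arange(2001, 2022)            # 21 annual samples, 2001-2021

def neighbour_deviation(series, year):
    """Eq. (16): value in `year` minus the mean of the two adjacent years."""
    i = year - YEARS[0]
    return series[i] - 0.5 * (series[i - 1] + series[i + 1])

def uncertainty_vector(values_by_year):
    """Eq. (19): zeros everywhere except in the extraordinary years."""
    u = np.zeros(YEARS.size)
    for year, value in values_by_year.items():
        u[year - YEARS[0]] = value
    return u

# Placeholder natural gas series (assumed values, not the paper's data)
gas = np.linspace(50.0, 90.0, YEARS.size)
gas[2014 - 2001] += 12.0                 # artificial drought-year increase

I14 = neighbour_deviation(gas, 2014)
I_u = uncertainty_vector({2014: I14})
```

With data starting in 2001, the 2014 entry lands at index 13, which matches the 13 leading zeros in Eq. (19).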

5.1.3 Local Source Model Uncertainties

Uncertainties in the local sources are usually associated with the hydro-based energy production. The uncertainty in 2010 differs in character from the others since it stemmed from the newly built dam for the hydro-based energy production, and it led to a proportional reduction in the lignite-based energy production. Hence, even though this yielded a noticeable change in the local source, it is an expected uncertainty. One can formulate it as

$$L_{10}^{u} = \frac{1}{2}\left( {H_{10}^{y} + L_{10}^{i} } \right) - \frac{1}{4}\left[ {\left( {H_{9}^{y} + H_{11}^{y} } \right) + \left( {L_{9}^{i} + L_{11}^{i} } \right)} \right]$$
(20)

where \({L}_{10}^{u}\) is the amount of the local model uncertainty in 2010, \({H}_{9}^{y}\), \({H}_{10}^{y}\), \({H}_{11}^{y}\) are the hydro-based energy productions in 2009, 2010 and 2011, respectively, and \(L_{9}^{i}\), \(L_{10}^{i}\), \(L_{11}^{i}\) are the lignite-based energy productions in 2009, 2010 and 2011, respectively. The uncertainties \(L_{14}^{u}\), \(L_{15}^{u}\) and \(L_{19}^{u}\) occur due to the drought year in 2014 and the rainy years in 2015 and 2019, respectively. They can all be modelled as in Eq. (16) in terms of the hydro-based energy production. In 2016, the solar-based energy production increased significantly and the corresponding uncertainty \(L_{16}^{u}\) can be modelled similarly.

With respect to the drought year and the COVID-19 pandemic in 2021, its uncertainty \(L_{21}^{u}\) can be modelled as in Eq. (18) by considering the hydro and lignite sources in Fig. 4. Thus, the local model uncertainty in Eq. (15) is formed as

$$L_{k}^{u} = \left[ {0^{1 \times 9}\;\;\; L_{10}^{u}\;\;\; 0^{1 \times 3}\;\;\; L_{14}^{u}\;\;\; L_{15}^{u}\;\;\; L_{16}^{u}\;\;\; 0^{1 \times 2}\;\;\; L_{19}^{u}\;\;\; 0\;\;\; L_{21}^{u} } \right]$$
(21)
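The two-source forms in Eqs. (18) and (20) share the same arithmetic and differ only in which reference years are used. The sketch below makes this explicit; the function name `paired_deviation` and the constant placeholder hydro and lignite series are hypothetical and chosen only so that the results are easy to check by hand.

```python
import numpy as np

FIRST_YEAR = 2001

def paired_deviation(a, b, year, ref_years):
    """Mean of two sources in `year` minus their mean over `ref_years`.

    ref_years=(year - 1, year + 1) gives the form of Eq. (20);
    ref_years=(year - 2, year - 1) gives the end-of-record form of Eq. (18).
    """
    i = year - FIRST_YEAR
    refs = [r - FIRST_YEAR for r in ref_years]
    current = 0.5 * (a[i] + b[i])
    reference = 0.25 * (a[refs].sum() + b[refs].sum())
    return current - reference

# Placeholder series for 2001-2021 (assumed values, not the paper's data)
hydro = np.full(21, 40.0)
lignite = np.full(21, 30.0)
hydro[2010 - FIRST_YEAR] = 50.0          # new dam raises hydro in 2010
lignite[2010 - FIRST_YEAR] = 26.0        # lignite reduces in the same year
hydro[2021 - FIRST_YEAR] = 32.0          # drought year at the end of the record

L10 = paired_deviation(hydro, lignite, 2010, (2009, 2011))
L21 = paired_deviation(hydro, lignite, 2021, (2019, 2020))

# Partial assembly of Eq. (21) with only these two entries filled in
L_u = np.zeros(21)
L_u[2010 - FIRST_YEAR] = L10
L_u[2021 - FIRST_YEAR] = L21
```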

Since the imported and local source models are uncertain and highly coupled, their parameters are optimized by the PSO meta-heuristic algorithm introduced next.

5.2 PSO for the Imported-Local Source Models

The PSO is a gradient-free, search-based meta-heuristic algorithm that essentially mimics the motion of bird flocks and fish schools. It updates the imported source model unknown parameters \({\mathbf{w}}_{k}^{I}\) with a rate \({\mathbf{v}}_{k}\) expressed as

$${\mathbf{v}}_{k} = \eta^{v} {\mathbf{v}}_{k - 1} + r_{1} \eta^{p} \left( {{\mathbf{p}}_{k} - {\mathbf{w}}_{k}^{I} } \right) + r_{2} \eta^{g} \left( {{\mathbf{g}}_{k} - {\mathbf{w}}_{k}^{I} } \right)$$
(22)

where \({\mathbf{p}}_{k}\) and \({\mathbf{g}}_{k}\) are the local and global best unknown parameter solutions within the limits \({\mathbf{w}}_{k}^{min} < {\mathbf{w}}_{k}^{I} < {\mathbf{w}}_{k}^{max}\), respectively, \(\eta^{v}\), \(\eta^{p}\), and \(\eta^{g}\) are the update rates, and \(r_{1}\) and \(r_{2}\) are stochastic search variables. The update rule for the unknown parameter is given by

$${\mathbf{w}}_{k}^{I} = {\mathbf{w}}_{k - 1}^{I} + {\mathbf{v}}_{k}$$
(23)

Algorithm 1 summarizes the imported source model optimization with the PSO.
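For orientation, a generic PSO loop built on Eqs. (22) and (23) is sketched below. It is not the modified algorithm of this paper and does not reproduce Algorithm 1: the function name `pso_fit`, the swarm size, the update rates, the parameter limits, the per-particle random draws and the synthetic three-entry basis are all assumptions made for illustration. The cost is the summed squared one-step prediction error of a linear-in-parameters model of the form of Eq. (12).

```python
import numpy as np

def pso_fit(B, y, n_particles=30, n_iter=300, w_min=-2.0, w_max=2.0,
            eta_v=0.7, eta_p=1.5, eta_g=1.5, seed=0):
    """Generic PSO minimizing the squared error of y ~ w @ b (form of Eq. 12)."""
    rng = np.random.default_rng(seed)
    dim = B.shape[1]

    def cost(W):
        # One cost value per particle (row of W)
        return np.sum((y - W @ B.T) ** 2, axis=1)

    W = rng.uniform(w_min, w_max, size=(n_particles, dim))
    V = np.zeros_like(W)
    P, P_cost = W.copy(), cost(W)             # best position of each particle
    g = P[np.argmin(P_cost)].copy()           # best position of the swarm
    for _ in range(n_iter):
        r1 = rng.random((n_particles, 1))     # stochastic search variables
        r2 = rng.random((n_particles, 1))
        V = eta_v * V + r1 * eta_p * (P - W) + r2 * eta_g * (g - W)  # Eq. (22)
        W = np.clip(W + V, w_min, w_max)      # Eq. (23), kept within the limits
        c = cost(W)
        better = c < P_cost
        P[better], P_cost[better] = W[better], c[better]
        g = P[np.argmin(P_cost)].copy()
    return g, P_cost.min()

# Synthetic normalized basis data and targets (assumed, for illustration only)
rng = np.random.default_rng(1)
B = rng.uniform(0.0, 1.0, size=(20, 3))
w_true = np.array([0.8, -0.3, 0.5])
y = B @ w_true
g, best_cost = pso_fit(B, y)
```

Because no gradient is needed, the same loop applies unchanged to the 10-entry imported basis of Eq. (13) or the 12-entry local basis of Eq. (15) once their data are stacked row by row into `B`.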


The next section provides the results and analyses them extensively.

6 Results

This section initially presents the parameters of the machine learning algorithms. Then, this section analyses the optimization results and provides the predicted imported and local sources for the energy production in Türkiye until 2031.

6.1 Parameters of the Machine Learning Algorithms

Table 4 specifies the parameters of the machine learning algorithms.

Table 4 Parameters of the machine learning algorithms

The next subsection analyses the estimates with the non-parametric source models.

6.2 Estimates with the Non-Parametric Source Models

Each source model has been trained by using the iterative NN and the BLS, and Fig. 8 compares the corresponding estimates. As can be seen from Fig. 8, the NN-based estimates closely follow the real source data while the BLS produces estimates with noticeable biases. This is mainly because the NN is an iterative approach that can capture the time-varying character of the data, whereas the BLS learns the overall character of the data rather than their instantaneous characters. Since the character learned with the BLS is the dominant one, the overall estimates do not fit each sample of the real data.

Fig. 8 Source model estimates (GWh): (a) fuel oil, (b) motorin, (c) imported coal, (d) natural gas, (e) LPG, (f) naphtha. The dashed blue lines, dotted green lines and solid red lines represent the real data, the estimates with the NN and the estimates with the BLS, respectively

Similar comments can be made for the source model estimates in Fig. 9. The sharp changes, the periodic changes and the steadily varying characters are all learned by the machine learning algorithms. Distinctive biases with the BLS-based estimates disappear with the NN-based estimates due to its iterative nature.

Fig. 9 Source model estimates (GWh): (a) hard coal, (b) lignite, (c) asphaltite, (d) biomass, (e) waste heat, (f) hydro, (g) wind, (h) solar, (j) geothermal. The dashed blue lines, dotted green lines and solid red lines represent the real data, the estimates with the NN and the estimates with the BLS, respectively

The next subsection presents the estimates with the parametric imported-local source models.

6.3 Estimates with the Parametric Imported-Local Source Models

The imported and local source models are highly coupled to each other since the overall energy demand is met by either of them. Figure 10 presents the estimated imported and local sources with and without the uncertainty models. Figure 10a and c clearly show that the developed parametric models with the explicitly modelled uncertainties are able to accurately estimate the imported and local sources for the energy production, respectively. However, as can be seen from Fig. 10b and d, without the uncertainty models in Eqs. (19) and (21), the estimates diverge from the real data. This result shows the importance of explicit uncertainty modelling and of constructing parametric model structures which can represent the exact unknown parameter space rather than an approximate one.

Fig. 10 Estimates with the imported-local source models (GWh). a Imported source estimates with the modelled uncertainty, b imported source estimates without the uncertainty model, c local source estimates with the modelled uncertainty, d local source estimates without the uncertainty model
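The effect of including or omitting an uncertainty term in a parametric model can be sketched as follows. This is not the model of Eqs. (19) and (21), whose form is not reproduced here. As an assumption for illustration, the uncertainty is represented by a single indicator regressor that is 1 in hypothetical event years (for example, drought years) and 0 otherwise, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

years = np.arange(2000, 2022)
t = (years - years[0]).astype(float)

# Hypothetical event indicator; the years are chosen for illustration only.
event = np.isin(years, [2007, 2008, 2014, 2020]).astype(float)

# Synthetic series: linear trend with a drop in the event years.
y = 100.0 + 4.0 * t - 25.0 * event + rng.normal(0.0, 1.0, t.size)

def fit_rmse(X, y):
    """Least-squares fit; returns the parameters and the fit RMSE."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta, float(np.sqrt(np.mean((y - X @ theta) ** 2)))

# Model without the uncertainty term: intercept and trend only.
X_plain = np.column_stack([np.ones_like(t), t])
# Model with the explicitly modelled uncertainty: extra indicator regressor.
X_unc = np.column_stack([np.ones_like(t), t, event])

theta_plain, rmse_plain = fit_rmse(X_plain, y)
theta_unc, rmse_unc = fit_rmse(X_unc, y)
print(f"RMSE without uncertainty term: {rmse_plain:.2f}")
print(f"RMSE with uncertainty term:    {rmse_unc:.2f}")
```

Without the indicator, the trend parameters have to absorb the event-year drops, so the fit is biased in both ordinary and event years. With the indicator, the drop is assigned to its own parameter and the remaining residual is close to the noise level.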

The next subsection provides the future predictions of the sources used for the energy production.

6.4 Predicted Future Sources

Individual source predictions for the energy production are performed with the non-parametric source models introduced in Sect. 4. Figure 11 shows the predicted sources used for the energy production in Türkiye from 2022 to 2031. It is predicted that the fuel oil (Fig. 11a), the motorin (Fig. 11b) and the LPG (Fig. 11f) requirements will increase, but their maximum amounts will be smaller compared with the other sources. The imported coal (Fig. 11c) and the LNG (Fig. 11e) used for the energy production will increase, whereas the natural gas (Fig. 11d) and the hard coal (Fig. 11h) will decrease moderately. It is also predicted that the naphtha-based energy production (Fig. 11g) will fluctuate over the years.

Fig. 11 Future predictions of the sources for the energy production (GWh) in Türkiye. a Fuel oil, b motorin, c imported coal, d natural gas, e LNG, f LPG, g naphtha, h hard coal

The developed model predicts that the lignite (Fig. 12a), the asphaltite (Fig. 12b) and the waste heat (Fig. 12d) will decrease slightly and smoothly. It is also predicted that the use of the biomass (Fig. 12c), the hydro (Fig. 12e), the wind (Fig. 12f), the geothermal (Fig. 12g) and the solar (Fig. 12h) sources for the energy production will increase significantly over the next ten years.

Fig. 12 Future predictions of the sources for the energy production (GWh) in Türkiye. a Lignite, b asphaltite, c biomass, d waste heat, e hydro, f wind, g geothermal, h solar

6.5 Predicted Future Imported-Local Sources

Figure 13 presents the predicted future imported-local sources utilized for the energy production in Türkiye. As can be seen from Fig. 13, in 2022 and 2023 the requirement for the imported sources will remain constant, while the local sources used for the energy production will increase. Afterwards, both the imported and the local sources will rise steadily to meet the energy demand of the country.

Fig. 13 Predicted future total sources (GWh): a imported, b local

7 Discussion

Today, although the use of renewable energy sources is increasing, we are still dependent on non-renewable energy sources. While the use of these resources contributes to the economies of the countries, it is also obvious that foreign-dependent resources impose a burden on them. Therefore, for accurate planning, it is desirable that electrical energy production estimates are as close to reality as possible. This can be achieved with hierarchical models that take the existing uncertainties into account, such as the one proposed in this paper.

Flesca et al. focused on estimating the daily non-renewable energy production from fossil coal, fossil gas, hard coal, fossil oil, waste and further non-renewable sources with uncertain data collected for Italy. In this regard, they used twelve machine learning algorithms, among which the best performing were the gradient boosted regression tree (GBRT) and the extra tree regressor (ETR) [4]. Abu Al-Haija et al. presented energy demand forecasting for the next decade based on a nonlinear autoregressive (NAR) neural network. According to this study, global energy consumption is estimated to increase until 2032. However, no evaluation was made regarding the use of energy resources [6]. Pandey et al. forecasted non-renewable and renewable energy from multiple sources such as hydropower, solar, wind and bioenergy using the grey forecasting model DGM (1,1,α). According to this study, both non-renewable and renewable energy production are expected to increase [21]. Kamani and Ardehali made long-term predictions of electrical energy consumption for different scenarios, taking into account solar and wind energy sources, and optimized the ANN (artificial neural network) with the PSO and E-PSO (evolutionary particle swarm optimization) algorithms [22].

Research has also been conducted to make predictions about the different resources used for the energy production in Türkiye. Ediger et al. developed a decision support system for forecasting fossil fuel production up to 2038 by applying regression, autoregressive integrated moving average (ARIMA) and seasonal autoregressive integrated moving average (SARIMA) methods. They used the regression model for hard coal and lignite, the ARIMA model for asphaltite and natural gas, and the SARIMA model for oil. According to the results of this study, which covers Türkiye from 2004 to 2038, the production will end in 2019 for hard coal, in 2024 for natural gas, in 2028 for oil, in 2038 for asphaltite, and in 2038 for lignite [23]. Akpinar et al. predicted that hydroelectric energy will have a 13.6% share in Türkiye's electrical energy production in 2030 and that, if the current trend continues, the hydroelectric energy potential of 129.9 billion kWh can be used by 2102 [24]. Using the fractional nonlinear grey Bernoulli model (FANGBM(1,1)), Şahin predicted the shares of hydroelectric energy in Türkiye's total installed power and electricity production in 2030 as 38.2% and 23.8%, respectively [25]. Melikoğlu estimated Türkiye's natural gas demand in 2030 by developing two semi-empirical models based on econometric data, namely the gross domestic product (GDP) at purchasing power parity (PPP) per capita, and demographic data, namely the population change. Accordingly, the natural gas demand is calculated as 76.8 billion m³ with the linear model and 83.8 billion m³ with the logistic model [26].

Küçükvar et al. made predictions using the ARIMA model to investigate the financial footprints of national electricity production scenarios from different energy sources in Türkiye and the UK until 2050. They predicted that the electricity needs of both countries would largely depend on coal and natural gas [27]. Gulay et al. suggested different types of hybrid models that combine decomposition with both machine learning and statistical approaches for forecasting the electricity production from different energy sources in Türkiye. In their study, the next 14-month period was evaluated, and they proposed the STL-based (a seasonal-trend decomposition procedure based on loess) LSTM (long short-term memory) hybrid model approach, which showed superior model performance. However, no optimization was performed and no uncertainty was taken into account in these studies [28].

It is important to note that the system identification-based approaches such as NAR, ARIMA and SARIMA have been extensively implemented in the literature for forecasting the future energy demand and the energy sources required for the energy production. Even though they can be trained with less data, they require a large number of candidate model structures to be generated and statistical approaches to select the best forecasting model among them. Thus, they necessitate a time-consuming process to construct the various model structures. In addition, since they involve a model selection process, they introduce an extra uncertainty. Therefore, in this research, a hybrid and hierarchical model consisting of both parametric and non-parametric models is proposed. As these models are constructed using the available expert knowledge, this source of uncertainty and the time-consuming workload are lessened.
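The model generation and selection process described above can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the series is synthetic, only pure autoregressive (AR) candidates are generated, the candidates are fitted by least squares, and the Akaike information criterion (AIC) is assumed as the selection statistic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic AR(2) series (hypothetical, for illustration only).
n = 300
x = np.zeros(n)
for k in range(2, n):
    x[k] = 1.2 * x[k - 1] - 0.6 * x[k - 2] + rng.normal()

def ar_aic(x, p, p_max):
    """Fit AR(p) by least squares on a common sample and return its AIC."""
    target = x[p_max:]
    # Column j holds the series delayed by j samples, aligned with target.
    lags = np.column_stack([x[p_max - j:len(x) - j] for j in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
    rss = float(np.sum((target - lags @ coef) ** 2))
    m = target.size
    return m * np.log(rss / m) + 2 * p

# Generate every candidate structure, then select the one with the lowest AIC.
p_max = 5
aic = {p: ar_aic(x, p, p_max) for p in range(1, p_max + 1)}
best_order = min(aic, key=aic.get)
print("AIC per order:", {p: round(v, 1) for p, v in aic.items()})
print("selected order:", best_order)
```

Even in this small case, every candidate has to be fitted before one can be chosen, and the chosen order depends on the selection statistic and on the sample. With seasonal and moving-average terms, as in ARIMA and SARIMA, the number of candidate structures grows quickly.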

Moreover, LSTM-type optimization is avoided in this research because an LSTM has multiple layers, each with a number of neurons, all of which are adjusted manually. As this process poses challenges similar to those of the system identification-based approaches, it is not considered in this paper either.

In this research, parametric and non-parametric forecasting source models containing modelled uncertainties were designed for energy production forecasting. In this way, Türkiye's imported-local sources for electrical energy production were estimated 10 years ahead, which is only possible when stable and robust forecasting models are developed. Otherwise, the predictions become unstable and sensitive to changes in the input data. This is highly likely for the energy data since they are affected by both known and unknown uncertainty sources. Therefore, in this research, the uncertainty sources are explicitly recognized through the available expert knowledge and are reflected in the development of the background model structure.

Finally, it is important to highlight that, since the developed forecasting models are data-driven and flexible, they can be trained with the data collected from various sources. Hence, they can be utilized to make long-term forecasts of the electrical energy production demand, and the forecasts can be updated as the conditions change with time.

8 Conclusion and Future Works

This paper proposed a hierarchical forecasting model consisting of non-parametric source models and parametric imported-local source models. It also modelled the uncertainties stemming from the drought years, the watery years, the pandemic outbreak and the building of new dams for the electric energy production. To optimize the unknown parameters of these models, iterative neural network, batch least squares and particle swarm optimization algorithms were modified and trained with the data for Türkiye. The models predicted the 16 imported and local sources for electricity production until 2031. The results highlighted that even though Türkiye will produce more electric energy from the local sources, it will also require more imported sources to meet its future electric energy demand.

As future work, these models should be enriched with population growth, industrial policy and climate change.