Introduction

At present, solar energy is widely applied at many sites around the world to improve sustainability and mitigate major environmental problems, such as global warming and air pollution [1].

Brazil has great potential to exploit solar energy, with 700,000 residential consumers predicted to have roof-mounted solar panels by 2024, according to the Brazilian Electricity Regulatory Agency [2]. For environmental and economic reasons, tropical countries such as Brazil will need to use their solar energy potential, estimated at 1500 to 2200 kWh m−2/year, to diversify their energy matrix. Presently, Brazil has an installed capacity of 134 GW, of which 5% is solar or wind based [3]. Its geographical location affords very large annual solar irradiation, and interest in the development of solar power plants has been rising. For example, from 2014 to 2015, the photovoltaic sector grew 266% [4]. From 2012 to 2016, over 23 MWp of small-scale grid-connected photovoltaic systems were installed in the distributed generation modality, while 2091.7 MWp of centralized generation were approved during 2014 and 2015 [5]. Forecasting models play an important role in the national energy development plan [6], particularly for an energy matrix with several renewable and highly seasonal sources, such as Brazil's.

Among Brazilian regions, the Northeast shows high viability for solar energy exploitation [7] due to its proximity to the Equator and its climatology, with a large area lying in the semiarid zone. The region attains an average daily horizontal global solar radiation of 5.49 kWh/m2 and a normal component of beam radiation of 5.05 kWh/m2. Furthermore, the Brazilian Northeast has higher monthly global solar radiation averages than Portugal and Spain, with less monthly variability. The state of Ceará, located in the northernmost part of the Northeast Region, is one of the Brazilian states with the highest solar potential. The state is home to the biggest non-governmental solar power plant in the country, with a capacity of 3 MW, expandable to 5 MW [7].

Solar energy use becomes especially important given the near-future context of high fossil fuel prices and environmental destruction [8]. In this sense, the literature [1] indicates the availability of solar irradiation data as a fundamental factor for solar system experts to successfully simulate, operate and assess solar energy technology and its applications. Beyond energy generation, knowledge of the incident solar radiation at a particular site also plays an essential role in agricultural, hydrological and ecological applications. The best way to obtain global solar irradiation data is through remote measurements at a given location using specific devices. However, due to the high cost of calibration and maintenance of these devices, the acquisition of solar irradiation data is restricted to a limited set of meteorological stations around the world [9].

Difficulties and uncertainties in measuring global solar irradiation have led to the development of several models and algorithms to estimate it from a few meteorological properties recorded on a frequent basis: maximum, minimum and mean atmospheric temperature, relative humidity, cloudiness, etc. In recent years, a large number of models have been designed to assess the global solar irradiation over a horizontal surface. Notable among them are empirical [10, 11], satellite-data-based [12, 13], stochastic [14, 15], heuristic [6, 16, 17] and statistical [12, 18, 19] models.

Recently, artificial intelligence and computational intelligence techniques, especially artificial neural networks (ANNs), have been extensively used to solve real-world problems. These applications concern situations for which traditional methodologies do not seem suitable or further precision is required, such as the forecasting of meteorological variables like wind speed [20,21,22], precipitation [23] and land surface temperature [24]. The use of such approaches to estimate solar radiation has received particular attention in recent years [1, 25,26,27,28,29,30,31,32,33,34,35,36].

The present work, therefore, establishes a global solar radiation estimation model that estimates radiation from more easily measured meteorological variables obtained at the same instant as the desired estimate. Specifically, this work presents three ANNs adopting meteorological variables measured at Fortaleza as predictors. The local climate at Fortaleza, a coastal city in the semiarid zone of Brazil, is highly influenced by the occurrence of El Niño and La Niña, which affect the meteorological conditions of many regions of the world. Therefore, the study of meteorological data collected at this particular site contributes to the body of knowledge from an international perspective, as it allows further comprehension of globally relevant phenomena. Furthermore, few studies in the literature use such a long data set. Its influence on the performance of ANN meteorological prediction models may be a valuable subject for researchers working in this field. Finally, this work introduces another contribution: the application of the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm as an alternative to the traditional back propagation method.

Related works and research gaps

Since this work discusses artificial neural networks, it is relevant to briefly introduce the subject before the literature review, so that the aspects of artificial neural networks mentioned in this section are familiar by the time they are discussed.

Neural networks, or artificial neural networks (ANNs), represent a technology with origins in many disciplines: neuroscience, mathematics, statistics, physics, computer science and engineering. Neural networks have applications in fairly diverse fields such as classification [37,38,39], clustering [40,41,42], prediction [43,44,45] and pattern recognition [46,47,48,49]. The main reason is their ability to learn from input data with or without supervision [50, 51] through the use of an appropriate training method.

Climate and region

Climatological and regional characteristics such as topography, vegetation, proximity to the shore and water masses influence the solar radiation over a specific site.

The literature shows that these influences have been studied around the world, as the following works illustrate. A combined model coupling a linear autoregressive moving average model and a recurrent dynamic ANN was developed to estimate daily global solar radiation for two sites with different climates in Algeria [30].

The research conducted in Australia [23] used a multi-location study combining ANNs and satellite data to estimate monthly global solar radiation. For thirteen meteorological stations located across southern Quebec, Canada, three ANNs with different input combinations of temperature and relative humidity were compared to three geostatistical interpolation models with respect to their capacity to fill missing values of daily global solar radiation in data sets [14]. Similarly, a comparison of three different ANN methods to predict daily solar radiation for twelve locations in different climatic zones was performed in China [35].

Several prediction models were developed for sites in Iran [26, 32, 34]. Two different types of ANNs were used to predict daily global solar radiation at Dezful city [26]. The Iranian researchers in [32] coupled ANNs and a metaheuristic method to predict daily solar radiation at Mashhad. Data sets from four stations in the United States of America and two stations in Iran were used to compare the capacity of ANNs, gene expression programming, wavelet regression and five temperature-based empirical models to predict daily global solar radiation [34].

Daily global solar radiation prediction models based on two different ANNs, two adaptive neuro-fuzzy inference systems and two support vector machines were compared for six locations in Mexico [52], indicating how ANNs can compete in performance against several well-known approaches.

A multi-location ANN to predict monthly average daily global solar radiation over Italy was developed [25]. Instead of using different periods to train and test the model, different subsets of stations were used. A subset of 17 locations was used for training the ANN, while testing was carried out against data from the remaining 28 locations.

Six ANN-based models to estimate horizontal global radiation for Madinah, Saudi Arabia, were developed and compared to conventional regression models. A different set of variables was used for each ANN model, and the results show higher precision than empirical models based exclusively on insolation and air temperature. Hence, artificial neural networks may reasonably be regarded as an attractive alternative to several different types of existing models.

In another work [52], an investigation was conducted to estimate global solar radiation using neural networks for a mountainous region in southeastern Spain. The presented results show that artificial neural networks may be considered an easy and effective technique to estimate solar radiation in sloped terrain. Also, a hybrid ANN–metaheuristic method to predict daily global solar radiation for Murcia City, Spain, from novel meteorological variables was proposed [33].

The models proposed in [28, 36] were developed to predict global solar radiation in Turkey. In [28], data were taken from the Eastern Mediterranean Region of the country to develop three monthly global solar radiation ANN prediction models for the cities of Mersin, Adana, Kahramanmaras and Antakya. Another work targeted the same region [36], using data from Mersin, Adana, Kahramanmaras and Hatay. Its authors compared an ANN model to ten different empirical methods for predicting daily global solar radiation.

The authors in [31] developed two ANN models to estimate monthly global horizontal irradiance for Abu Dhabi, Dubai and Al-Ain. In a similar direction, the ANN approach was applied to predict global horizontal irradiation for twelve cities in Zimbabwe [29].

The literature review covers most continents of the world, including Africa, Asia, North America and Oceania. The present research work is a case study of a site in the Northeastern region of Brazil, located in the city of Fortaleza, in the state of Ceará. This region stands out for its high DNI levels during almost the whole year. In this sense, this work tries to fill a gap in South American studies and, mainly, in case studies of coastal semiarid climates.

Input parameters

There is no established standard set of input parameters for predicting global radiation. The choice of parameters varies according to the authors' approaches and the regions studied. Indeed, a parameter that is relevant to one case study may not play an important role in different circumstances. The choice of a parameter may also be constrained by the availability of measured data.

Typical input parameters are climatological, temporal or geographical. Climatological parameters include temperatures (maxima, minima, averages and amplitudes), wind speed, pressure, relative humidity, sunshine duration, extraterrestrial radiation and clearness index. Temporal parameters are typically the day of the year and the month of the year, while geographical parameters include latitude, longitude and altitude.

However, some authors innovate in the choice of parameters, as this may produce a better characterization of the site. For example, cumulated rainfall was included in [25]; the frequency of rainy days and heating degree days also appear among the chosen parameters. The work in [53] aimed to estimate solar radiation over a mountainous area. Therefore, the authors included the slope of the measurement sites among the relevant parameters.

Some studies [32] use not only air temperatures as inputs, but also the earth skin temperature. The complete parameter set contains the average air temperature, minimum air temperature, maximum air temperature, relative humidity, pressure and wind speed. Similarly, the land surface temperature was added in [23] as an input parameter. The data were obtained from satellite, together with altitude, latitude, longitude and month. Evaporation is also included in the set of parameters in [26]. The other predictive variables are day of the year, daily mean air temperature, relative humidity, sunshine hours and wind speed.

Some authors [33] approach the problem in a truly innovative way. According to this review, their choice of predictive variables had not yet been considered in other studies on radiation prediction. Besides the extraterrestrial solar radiation, the study also includes aerosol depth product, total ozone amount, total precipitable water and cloud amount.

In agreement with the literature review, this study applies the following parameters as predictive variables: day of the year, maximum temperature, minimum temperature, sunshine duration, precipitation, cloudiness, extraterrestrial radiation, relative humidity, evaporation and wind speed. A meteorological station, located at coordinates 3.745278 S, 38.582153 W, measured and recorded all these input parameters. Figure 1 depicts the location of the meteorological station.

Fig. 1 Meteorological station location

Data set size

The data set size plays a direct role in the quality of the predictions produced by an ANN, as it determines the weights and biases that minimize the cost function over a wider input range. A long data set allows the network to improve the prediction of long-term phenomena that may not be evident in short spans. However, the data set size is mostly dictated by data availability. Also, calibration and degradation issues of the instruments may invalidate some values, reducing the data set length.

Commonly, the literature [14, 23, 26, 27, 29, 30] refers to data sets shorter than 10 years. Some authors [28, 32, 36] present data sets longer than 10 years, but works using data sets longer than 30 years [34, 35] are rarer. Some multi-location studies [31, 52] apply sets of different sizes, depending on the station. The research work presented in [35] deserves special attention since it has the longest data set of all the considered studies, spanning 54 years.

This work used a 44-year-long data set as a starting point, which may be considered a relatively large database compared with previous works. However, as shown further ahead in Sect. 6, the effective period used was constrained to the longest consecutive interval with high data availability, which amounts to 14 years. Although this is significantly shorter than the original data set, it is still among the longest when compared to the literature review. Thus, this work has a data set shorter than those presented in references [32, 34, 35] and [52].

Training method

Training an ANN consists in minimizing a cost function, evaluated from the deviation between the expected results and the predicted outputs. By feeding the ANN with a particular set of inputs, called the training set, the method varies the weights and biases of the ANN to decrease the cost function.

Common approaches use the back propagation method [14, 25,26,27, 34] and the Levenberg–Marquardt algorithm [29, 30, 52]. Some authors use heuristic methods to minimize the cost function and hence train the ANN, as in [32, 33]. In the literature review, some works applied different methods or even compared their effects on the accuracy of the ANN [23, 28, 31, 36].

This work used the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm, a quasi-Newton optimization method. This method appears in [23, 31] for the training of ANN solar radiation prediction models.

Other uses of BFGS to train ANNs for estimating meteorological parameters are present for evapotranspiration index [23], wind speed [20] and land surface temperature [24].

In the prediction of meteorological parameters through methods not involving ANNs, BFGS algorithm can be found in the works which predicted solar radiation [54, 55] and wind speed [20].

Scope of present work

This work aims to fill some of the research gaps identified in the literature review presented above. The main characteristics of the referenced ANN studies are summarized in Table 1.

Table 1 Summary of literature review in ANNs prediction models

The literature review reveals a clear lack of ANN solar radiation prediction models in South America. Furthermore, the studied region shows interesting features such as a semiarid coastal climate and the strong influence of the El Niño and La Niña phenomena on the meteorological conditions.

Although the data set does not contain atypical or exclusive input parameters in comparison with the literature review, it spans 44 years and is longer than all but one of the data sets presented in the review, which range from 2 to 33 years. The exception is [35], which used a 54-year-long data set. Besides, the present work studies and compares the prediction of global solar radiation on three different temporal bases: daily, weekly and monthly averaged. The authors did not find a similar approach in the literature.

Finally, according to the authors' review, most works used the Levenberg–Marquardt algorithm and back propagation as the main training methods for ANN solar radiation prediction models. Only two references employed the BFGS algorithm, and it proved appropriate for training an ANN.

Artificial neural networks (ANNs)

The expression “neural networks” originates from early attempts to find a mathematical representation of how biological systems process information [51]. Accordingly, ANNs operate based on the way neurons work inside the human brain.

The main components of a neuron are the input signals, the synaptic weights, the applied bias, the additive junction and the activation function. Input signals represent the stimuli received by the neuron, which will be transformed into an output signal by different processes occurring along their course through the neuron. The synaptic weights represent the connection links between the input signals and the neuron. Each input signal is connected to the neuron through a different link. The role of the bias is to enhance or reduce the total information processed by the neuron to generate an answer to the initial stimulus.

The additive junction, usually a linear combiner, receives the information from the different input signals weighted by their synaptic weights, adds the bias, and generates the net information to be processed by the neuron. Finally, the activation function of a neuron is responsible for transforming the net information into output information, i.e., it produces an answer to the stimuli received by the neuron.
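The behavior of a single neuron described above can be sketched in a few lines. This is only an illustrative Python sketch (the study itself reports using R); the function names, the example inputs and weights, and the choice of a logistic activation are assumptions made here, not details taken from the study.

```python
import math

def logistic(z):
    """Logistic (sigmoid) activation; an assumed choice for illustration."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation=logistic):
    """Additive junction followed by the activation function.

    net = sum_i(w_i * x_i) + bias, output = activation(net)
    """
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net)

# Hypothetical example: two input signals, two synaptic weights, one bias.
example_output = neuron_output([0.5, -1.0], [0.8, 0.2], bias=0.1)
```

Here the net information is 0.8 × 0.5 + 0.2 × (−1.0) + 0.1 = 0.3, and the output is the logistic function evaluated at that value.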

Different ANN architectures are possible according to how the neurons are connected to each other. A Multilayer Perceptron (MLP) is a class of ANNs where the neurons are organized in an input layer, an output layer and one or more intermediate layers called hidden layers.

In the present study, an MLP was created using the R programming language to predict the value of global solar radiation on three different time bases.

The Broyden–Fletcher–Goldfarb–Shanno algorithm

In unconstrained optimization [56], quasi-Newton methods are widely applied algorithms for finding local minima of functions, such as the cost functions of ANNs. Quasi-Newton methods are based on Newton's method for finding a stationary point of a function, where its gradient is zero.

The main goal of the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm is to find a descent direction and a descent step which lead to faster learning. To this end, the algorithm uses information on the second-order derivatives of the cost function. This information is represented by an approximation of the Hessian matrix, B.

The algorithm can be summarized as a sequence of steps [57], which are described by Eqs. (1)–(5).

Step 0 Given x₁ ∈ ℛⁿ and a positive definite B₁ ∈ ℛⁿˣⁿ, calculate g₁ = ∇f(x₁). If g₁ = 0, stop; else, set k = 1.

Step 1 Choose dₖ = −Bₖ⁻¹ gₖ.

Step 2 Perform a line search along direction dₖ to obtain a value γₖ > 0, then set xₖ₊₁ = xₖ + γₖ dₖ and gₖ₊₁ = ∇f(xₖ₊₁). If gₖ₊₁ = 0, stop.

Step 3 Choose

$$B_{k + 1} = B_{k} - \frac{{B_{k} s_{k} s_{k}^{\text{T}} B_{k} }}{{s_{k}^{\text{T}} B_{k} s_{k} }} + \frac{{y_{k} y_{k}^{\text{T}} }}{{s_{k}^{\text{T}} y_{k} }}$$
(1)

where

$$s_{k} \, = \,\gamma_{k} d_{k}$$
(2)
$$y_{k} = g_{k + 1} - g_{k}$$
(3)

Step 4 Set k := k + 1 and go to Step 1.

Still according to [57], in the BFGS algorithm the step size γₖ must satisfy the Wolfe conditions, defined by the following equations with constants 0 < δ₁ < δ₂ < 1:

$$f(x_{k} + \gamma_{k} d_{k} ) - f(x_{k} ) \le \delta_{1} \gamma_{k} d_{k}^{\text{T}} g_{k}$$
(4)
$$d_{k}^{\text{T}} \nabla f\left( {x_{k} + \gamma_{k} d_{k} } \right) \ge \delta_{2} d_{k}^{\text{T}} g_{k}$$
(5)

In this algorithm, f is the function to be minimized, i.e., the cost function during the training of the ANNs. The xk parameter represents the weights and bias during iteration k.
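The steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' R implementation: the line search is a simple backtracking search that enforces only the sufficient-decrease condition of Eq. (4), the update of B is skipped when sₖᵀyₖ is not positive so that B stays positive definite, and the helper names (`solve`, `dot`, `matvec`, `bfgs`) and the quadratic test function are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def bfgs(f, grad, x, tol=1e-8, max_iter=200, delta1=1e-4):
    n = len(x)
    # Step 0: B_1 is the identity matrix (positive definite).
    B = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(x)
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        d = solve(B, [-gi for gi in g])            # Step 1: d_k = -B_k^{-1} g_k
        gamma = 1.0                                 # Step 2: backtracking, Eq. (4) only
        while (f([xi + gamma * di for xi, di in zip(x, d)]) - f(x)
               > delta1 * gamma * dot(d, g)):
            gamma *= 0.5
        s = [gamma * di for di in d]                # Eq. (2)
        x_new = [xi + si for xi, si in zip(x, s)]
        g_new = grad(x_new)
        y = [a - b for a, b in zip(g_new, g)]       # Eq. (3)
        sy = dot(s, y)
        if sy > 1e-12:                              # assumed safeguard, not in the source
            Bs = matvec(B, s)
            sBs = dot(s, Bs)
            B = [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / sy
                  for j in range(n)] for i in range(n)]   # Step 3, Eq. (1)
        x, g = x_new, g_new                         # Step 4
    return x

# Hypothetical test problem: a convex quadratic with minimum at (1, -2).
def f_quad(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

def grad_quad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

x_star = bfgs(f_quad, grad_quad, [0.0, 0.0])
```

In ANN training, `f` would be the cost function and `x` the vector of all weights and biases; the quadratic here only stands in for it so that the result is easy to check.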

The quality of the models developed from neural network algorithms may be evaluated by error parameters, such as the root-mean-square error (RMSE) and the mean absolute percentage error (MAPE), as well as by the coefficient of determination R2.

Several error metrics can be calculated. Throughout this work, the errors presented in Eqs. (6)–(9) are used to assess and compare the models.

$${\text{RMSE}} = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {\hat{y}_{i} - y_{i} } \right)^{2} } ,$$
(6)
$$n{\text{RMSE}} = \frac{\text{RMSE}}{{\frac{1}{N}\mathop \sum \nolimits_{i = 1}^{N} y_{i} }} ,$$
(7)
$${\text{MAPE}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left| {\frac{{\left( {\hat{y}_{i} - y_{i} } \right)}}{{y_{i} }}} \right| ,$$
(8)
$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {\hat{y}_{i} - y_{i} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left( {y_{i} - \bar{y}} \right)^{2} }} ,$$
(9)

where N is the number of data points in the sample, ŷi is the value predicted by the neural network, yi is the observed value and ȳ is the mean of the observed values.
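For illustration, the four metrics can be computed as in the following Python sketch. The function name and the sample values are hypothetical, MAPE is returned as a fraction (not multiplied by 100), and R² is computed with the usual definition, i.e., with the sum of squared deviations of the observed values from their mean in the denominator.

```python
import math

def error_metrics(y_pred, y_obs):
    """Return RMSE, nRMSE, MAPE (as a fraction) and R^2 for paired samples."""
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    sq_err = sum((p - o) ** 2 for p, o in zip(y_pred, y_obs))
    rmse = math.sqrt(sq_err / n)
    nrmse = rmse / mean_obs
    # MAPE is undefined if any observed value is zero.
    mape = sum(abs((p - o) / o) for p, o in zip(y_pred, y_obs)) / n
    r2 = 1.0 - sq_err / sum((o - mean_obs) ** 2 for o in y_obs)
    return {"RMSE": rmse, "nRMSE": nrmse, "MAPE": mape, "R2": r2}

# Hypothetical predicted and observed values.
metrics = error_metrics([2.0, 4.0, 6.0], [1.0, 4.0, 7.0])
```

With these sample values the squared errors sum to 2 and the observed mean is 4, so RMSE = √(2/3), nRMSE = RMSE/4 and R² = 1 − 2/18.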

ANNs setup, training and error assessment

To perform the case study of this work, a training algorithm was developed. The applied pseudocode is as follows:

  1. Choose the number of neurons in the hidden layer of the neural network;

  2. Choose the activation function of the neurons;

  3. Initialize the synaptic weights randomly;

  4. Initialize the Hessian matrix approximation as the identity matrix;

  5. Run the forward propagation algorithm;

  6. Calculate the cost function;

  7. Calculate the cost function gradient through the back propagation algorithm;

  8. Calculate the descent direction through the BFGS algorithm;

  9. Perform a line search along this direction to find the optimal step;

  10. Recalculate vectors s and y using Eqs. (2) and (3);

  11. Recalculate the Hessian approximation using Eq. (1);

  12. Go back to step 5 and loop until convergence is achieved.

The R language [58] was used to develop these algorithms. It is a free and open-source development tool.

Experimental data pre-processing

One of the objectives of the present work was to perform a case study using ANNs algorithms to obtain a regression model to predict solar radiation from easily obtainable meteorological variables.

Accordingly, experimental data containing maximum and minimum temperatures, wind speed, cloudiness, precipitation and solar radiation were obtained from a meteorological station located at the Federal University of Ceará. For this long-term measurement campaign, the instruments (glass thermometers, Campbell–Stokes pattern sunshine recorder, standard rain gauge, wet bulb thermometer, class A evaporation pan and cup anemometer) are sufficiently robust, accurate and reliable, even though the data logging was not automated. In particular, the actinograph is a Bimetallic Actinograph, Robitzsch–Fuess Type 58 dc, which presents errors smaller than 5%, according to the manufacturer. The data series covers the interval from 1969 to 2012.

In addition to the variables mentioned in Sect. 2.2, extraterrestrial radiation data have been evaluated and used. These data were calculated for the city of Fortaleza following Eq. (10), which can be found in [59].

$$G_{\text{on}} = G_{\text{sc}} \left[ {1.000110 + 0.034221 \times \cos B + 0.001280 \times \sin B + 0.000719 \times \cos 2B + 0.000077 \times \sin 2B} \right] ,$$
(10)

where B is given by B = (n − 1) × (360/365). The number “n” represents the number of the day.
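As a worked illustration of Eq. (10), the following Python sketch evaluates Gon for a given day of the year. The solar constant value Gsc = 1367 W/m² is an assumption made here (the text does not state the value used), and the function name is hypothetical.

```python
import math

G_SC = 1367.0  # solar constant in W/m^2; assumed value, not given in the text

def extraterrestrial_radiation(n, g_sc=G_SC):
    """Eq. (10): extraterrestrial radiation on a plane normal to the beam
    for day of the year n (1 = 1 January)."""
    b = math.radians((n - 1) * 360.0 / 365.0)  # B in degrees, converted to radians
    return g_sc * (1.000110
                   + 0.034221 * math.cos(b)
                   + 0.001280 * math.sin(b)
                   + 0.000719 * math.cos(2.0 * b)
                   + 0.000077 * math.sin(2.0 * b))

g_on_jan1 = extraterrestrial_radiation(1)
g_on_jul4 = extraterrestrial_radiation(185)
```

For n = 1, B = 0 and the bracket reduces to 1.000110 + 0.034221 + 0.000719 = 1.03505, so Gon is about 3.5% above Gsc; in early July the correction is negative, reflecting the larger Earth–Sun distance.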

From the daily raw data, three separate series were generated for comparison: daily, weekly and monthly averages of the meteorological variables. Then, the descriptive statistics of the series were analyzed. Furthermore, the total number of available records and the ratio of blank records were calculated to select the best data subset for the case study.

Finally, the selected data were normalized according to Eq. (11), as suggested in [25]:
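One possible way to derive averaged series from daily records is sketched below in Python. The grouping rules are assumptions: months follow the calendar, and a "week" is taken here as a block of seven consecutive days counted from 1 January, which may differ from the definition used in the study. Blank entries are represented by `None` and skipped, and all names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

def aggregate_mean(records, key):
    """Average daily values per group.

    records: iterable of (date, value) pairs, with value None when blank.
    key: function mapping a date to a group identifier.
    """
    groups = defaultdict(list)
    for day, value in records:
        if value is not None:  # skip blank entries
            groups[key(day)].append(value)
    return {k: sum(v) / len(v) for k, v in groups.items()}

def month_key(day):
    return (day.year, day.month)

def week_key(day):
    # Assumed convention: days 1-7 of the year form week 1, days 8-14 week 2, ...
    return (day.year, (day.timetuple().tm_yday - 1) // 7 + 1)

# Hypothetical daily records with one blank entry.
daily = [
    (date(1974, 1, 1), 4.0),
    (date(1974, 1, 2), None),
    (date(1974, 1, 3), 6.0),
    (date(1974, 2, 1), 5.0),
]
monthly = aggregate_mean(daily, month_key)
weekly = aggregate_mean(daily, week_key)
```

The same grouping can also return the count of non-blank values per group, which gives the availability ratio used to choose the data subset.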

$$X_{N} = 0.1 + 0.8 \times \frac{{X_{\text{R}} - X_{\text{Min}} }}{{X_{\text{Max}} - X_{\text{Min}} }},$$
(11)

where XN is the normalized variable, XR is its real value, XMax and XMin are, respectively, the maximum and minimum values of the variable in the data sample. The convergence of the algorithm becomes faster if the mean of each variable of the training set is close to zero [60].
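A minimal Python sketch of the normalization in Eq. (11), together with its inverse, is given below. The function names and sample values are hypothetical; the sketch assumes the variable is not constant, since XMax = XMin would cause a division by zero.

```python
def normalize(values):
    """Eq. (11): map values linearly onto the interval [0.1, 0.9]."""
    x_min, x_max = min(values), max(values)
    return [0.1 + 0.8 * (x - x_min) / (x_max - x_min) for x in values]

def denormalize(x_n, x_min, x_max):
    """Inverse of Eq. (11): recover the real value from a normalized one."""
    return x_min + (x_n - 0.1) * (x_max - x_min) / 0.8

# Hypothetical sample of one variable.
sample = [10.0, 20.0, 30.0]
normalized = normalize(sample)
restored = [denormalize(v, min(sample), max(sample)) for v in normalized]
```

The minimum maps to 0.1, the maximum to 0.9 and the midpoint to 0.5. The inverse is needed to express network outputs in physical units again.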

Case study

Three approaches were conducted with normalized data using, respectively, daily, weekly and monthly averaged series. The radiation prediction models have been developed by a training algorithm created for the purpose of this work.

The studied models consisted of an ANN structure with one hidden layer, the inputs of which are the variables presented in Sect. 2.2 and the calculated extraterrestrial radiation (Gon) for the corresponding time steps. The time step variable was chosen according to the averaging frequency: for daily data, day of the year was used; for weekly data, week of the year was used; and for monthly data, month of the year was used.

Thus, the studied ANNs had the structure shown in Fig. 2.

Fig. 2 Modeled ANNs structure

To check the different obtained models, data have been split into two groups: training data and test data. The training data set size for each case study corresponds to 70% of the total data set. The remaining 30% was used as a test data set. Training data were used to develop a neural network model and test data were used to assess its predictive capacity.
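The 70/30 split can be sketched as follows in Python. The text does not say whether the split was chronological or random; this sketch assumes a chronological split (the first 70% of the series for training), and the function name is hypothetical.

```python
def split_train_test(samples, train_fraction=0.7):
    """Split an ordered series into leading training and trailing test parts.

    Assumes a chronological split; the source does not specify the scheme.
    """
    cut = int(round(len(samples) * train_fraction))
    return samples[:cut], samples[cut:]

# Hypothetical series of ten ordered samples.
train, test = split_train_test(list(range(10)))
```

A chronological split keeps the test period strictly after the training period, which avoids leaking later observations into training when the series is autocorrelated.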

The model quality was evaluated by applying Eqs. (6)–(9). To find the number of neurons in the hidden layer that generates the least prediction error for each case study, charts were created linking the training and test errors to the number of neurons. The errors were also related to the amount of data supplied to the network, with the goal of finding out how much data are needed to perform solar radiation studies.

Finally, the resulting number of neurons obtained from previous analysis was used to train the ANNs again, and the results were evaluated using Eqs. (6)–(9) and summarized in charts and tables.

Results and discussion

The daily extraterrestrial solar radiation was calculated and included among the predictor variables of the model for the experimentally measured global solar radiation.

Table 2 presents the availability of data per year. The interval ranging from 1974 to 1988 composes the most complete continuous subset, and therefore it was chosen for analysis.

Table 2 Percentage of experimental data present in each year

The descriptive statistics of the data set on a daily, weekly and monthly basis can be found in Tables 3, 4 and 5, respectively.

Table 3 Statistics for the daily data set
Table 4 Statistics for the weekly data set
Table 5 Statistics for the monthly data set

From the tables presented, it can be noted that no significant changes occur in the mean values when the time basis is changed. Furthermore, increasing the interval size leads to decreasing standard deviations, i.e., the coefficient of variation decreases. This may lead to the conjecture that simpler prediction models, with fewer neurons, could be able to predict global solar radiation from data sets with longer sampling intervals. It could also be conjectured that training neural networks on data sets with smaller variability would demand a smaller training set to obtain an acceptable accuracy.

The results of several training sessions were evaluated graphically before defining the optimal number of neurons to compare the performance of the developed algorithm to the ones found in literature.

Training a neural network aims to minimize the cost function, i.e., to reduce the training error by feeding the network repeatedly with training samples and the expected results for those samples. This is essentially an optimization problem where the error plays the role of the cost function. As expected, a high number of training sessions results in increasingly smaller errors, i.e., the network becomes more able to predict the results associated with the training samples. However, this produces overfitting as a result of the bias–variance trade-off. As the bias of the ANN decreases and it comes closer to the expected results for the training data, its variance increases, i.e., its sensitivity to new (test) data. Figure 3 shows this behavior in training sessions done with the monthly data set.

Fig. 3 Behavior of training and test errors versus number of training sessions

Theoretical curves of error versus the number of iterations often decline smoothly. However, real graphs [61] depict very jagged lines, as seen in Fig. 3. Hence, the algorithm developed in this work behaves as expected, which confirms its ability to train neural networks from daily, weekly and monthly data samples.

The next step of the analysis consisted in determining the complexity of the neural network, i.e., the minimum number of neurons in the hidden layer that would lead to the least prediction error.

The number of neurons in the hidden layer is classically determined by trial and error [62]. Various structures are trained to assess their performance, and the best configuration is then chosen. The problem with this kind of approach is its high manual cost, even though computational algorithms may be written to build, train and test neural networks.

Figures 4, 5 and 6 show the results for the different case studies. Since all the variables have been normalized according to Eq. (11), RMSE is dimensionless.

Fig. 4 Error vs. complexity of network for the daily case study

Fig. 5 Error vs. complexity of network for the weekly case study

Fig. 6 Error vs. complexity of network for the monthly case study

As seen in these graphs, fewer iterations in models result in less flexibility, and therefore produces high errors in modeling for both training and test data sets. As the number of neurons increases, the model becomes more flexible and its ability to predict both data sets improves. However, a further increase in the number of neurons can produce an excessively flexible model. This provides small errors in the training data site, but results in an increase error in test data set prediction, as it projects patterns that do not exist in the test data set. Therefore, an overly flexible model is not reliable in predicting unknown points, and a compromise in the number of neurons, visible in the global minimum of the red curve, must be achieved.

Figure 6 shows that the minimum prediction error occurs at 30 neurons in the hidden layer. With more neurons, the network is able to describe the training data accurately but loses its capacity to predict new data. With fewer than 30 neurons, the network cannot accurately describe even the training data, since it is unable to capture the relationships between the variables needed to predict the monthly solar radiation. Analogously, the best predictive capacity for the daily and weekly data sets results from 40 and 50 neurons in the hidden layer, respectively.

To deepen the present analysis, the effect of the size of the data set used for training was assessed. The results are summarized in Figs. 7, 8 and 9.

Fig. 7 Training error vs. data set size used in the daily case study

Fig. 8 Training error vs. data set size used in the weekly case study

Fig. 9 Training error vs. data set size used in the monthly case study

These graphs clearly show the importance of using a sufficiently large data set when training neural networks. Increasing the amount of data generally leads to smaller prediction errors; in other words, the neural networks learn more and better with more data. The data set used in this work is very long, which allowed us to assess the effect of the data set size on the RMSE. This conclusion could not have been reached from the analysis of a short data set.

Finally, based on the numbers of neurons obtained previously, an ANN was created for each case study. The results can be observed in Figs. 10 and 11 and in Table 6.

Fig. 10 Training and test data of the monthly case study neural network

Fig. 11 Scattering of training and test data in the monthly case study

Table 6 Error metrics of the obtained models

No weekly analysis was found in the literature review. The following paragraphs therefore compare the error metrics of the present study with those reported in the literature for daily and monthly analyses.

The MAPE of the daily case study was 14.87%, which is higher than the value of 6.53% found in [26]. In [29], various algorithms were tested at multiple locations, resulting in errors ranging from 2.56 to 10.57%.

A comparatively high value of MAPE was found in the monthly case study as well. In a previous work [25], varying the number of model parameters resulted in MAPE values ranging from 1.67 to 4.25%. Another study [28] found MAPE values ranging from 2.8027 to 4.1627% for the same model across different locations. Therefore, the value of 20.83% found in this work is significantly higher and is probably caused by regional phenomena, such as El Niño and La Niña, which could not be captured by the set of predictors used in this study.

MAPE values smaller than 10% indicate excellent predictive capacity; values between 10 and 20% correspond to good predictive capacity; values between 20 and 50% represent average predictive capacity; and values above 50% denote inaccurate models [63]. Based on this criterion, it can be concluded that the predictive models generated with the aid of neural networks deliver good results, except for the monthly model, which falls in the average range, although only marginally (20.83%).
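The rating scale above can be expressed as a small function. The Python sketch below is illustrative only (the study itself used R). The function names are hypothetical, the MAPE formula is the usual textbook definition and is assumed to match the one used in this work, and the handling of values lying exactly on a class boundary is also an assumption, since the scale does not specify it.

```python
from typing import Sequence


def mape_percent(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (textbook definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each observed value must be non-zero, otherwise the ratio is undefined.
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rate_mape(value_percent: float) -> str:
    """Map a MAPE value to the qualitative classes of the scale in [63].

    Boundary handling (10, 20 and 50 exactly) is an assumption.
    """
    if value_percent < 10.0:
        return "excellent"
    if value_percent < 20.0:
        return "good"
    if value_percent <= 50.0:
        return "average"
    return "inaccurate"


# MAPE values reported in the text for the daily and monthly case studies.
daily_rating = rate_mape(14.87)
monthly_rating = rate_mape(20.83)
```

With the values quoted in the text, the daily model falls in the "good" class and the monthly model in the "average" class.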

A similar comparison was made with a different metric, namely the RMSE. For the daily case, six ANN architectures with various numbers of neurons and inputs were tested in [27], giving RMSE values from 0.044121 to 0.167655, and the model presented in [36] led to a value of 0.14. Herein, the daily case study produced an RMSE of 8.76, which may be considered good. For the monthly data, the value of 0.0858 found in this work is smaller than the RMSE of 1.23 reported in a previous paper [23]. Because the RMSE has the same units as the predicted variable, it is more informative to compare the nRMSE, which is relative and thus better expresses the impact of the error on the measurement. In that case, the results are very similar: the overall nRMSE for the multiple locations of the above-mentioned study was 5.07%, while our findings show 6.42%.
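The reason a relative metric is easier to compare across studies can be shown with a short calculation. The Python sketch below is illustrative only (the study itself used R). The function names are hypothetical, the numbers are invented, and the nRMSE is assumed here to be the RMSE divided by the mean of the observed values; other normalizations (for example, by the range) exist and may be the one used in the cited works.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the units of the predicted variable."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_percent(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """RMSE normalized by the mean observed value, in percent (assumption)."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * rmse(actual, predicted) / mean_actual


# Invented example values, not data from this study.
observed = [4.0, 5.0, 6.0]
predicted = [4.5, 5.0, 5.5]

# The same data expressed in a unit 1000 times smaller.
observed_scaled = [1000.0 * v for v in observed]
predicted_scaled = [1000.0 * v for v in predicted]

rmse_original = rmse(observed, predicted)
rmse_scaled = rmse(observed_scaled, predicted_scaled)
nrmse_original = nrmse_percent(observed, predicted)
nrmse_scaled = nrmse_percent(observed_scaled, predicted_scaled)
```

Changing the unit multiplies the RMSE by the same factor but leaves the nRMSE unchanged, which is why RMSE values from studies using different units or normalizations are not directly comparable.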

To broaden the scope of the conclusions presented here, and keeping in mind that several meteorological and geographical parameters may affect solar radiation prediction models, previous work on the same subject [64] can be used as a basis for comparison. The authors used measured monthly average solar radiation data for various meteorological locations in Bangladesh and performed a predictor selection to identify the parameters with the greatest impact on model performance. Three attribute evaluators of the Waikato Environment for Knowledge Analysis (WEKA), namely Classifier Subset Eval, CFS Subset Eval and Wrapper Subset Eval, were evaluated for selecting the most influential input parameters for the training, validation and testing of an ANN model. Twelve input parameters were used to develop the prediction model: month, local latitude, longitude and altitude, maximum, minimum and average temperature, bright sunshine, wind speed, relative humidity, rainfall and cloud coverage. The approach used herein, on the other hand, was to feed all the parameters into the model, so that the generated ANN could assign the necessary weights to the more important parameters. This apparently succeeded, since the value of R achieved by these authors [64] for the test data set was about 0.75596, while that obtained by our model, also for the monthly average case, was 0.8864 (R² = 0.7857). This indicates that the ANN, as configured in this work, was able to give greater weight to the most important parameters on its own.

Model quality can also be rated by nRMSE intervals [65]: excellent (nRMSE smaller than 10%), good (nRMSE between 10 and 20%), average (nRMSE between 20 and 30%) and inaccurate (nRMSE higher than 30%). According to this benchmark, the models developed in this work can be rated as excellent with respect to their predictive capacity.

The new model is more consistent for weekly and monthly sampling intervals. One possible explanation is the increase in the variability of the experimental data set when the model is shifted from monthly to weekly and, finally, to daily data. This increase, which appears as a higher coefficient of variation (CV), can impair the ability of the model to describe the more scattered data sets accurately, leading to smaller coefficients of determination.
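The effect of averaging on the coefficient of variation can be illustrated with a short calculation. The Python sketch below is illustrative only (the study itself used R). The function names are hypothetical, the daily series is synthetic rather than data from this work, and the CV is assumed to be the population standard deviation divided by the mean.

```python
import statistics
from typing import List, Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """CV in percent: population standard deviation over the mean (assumption)."""
    return 100.0 * statistics.pstdev(values) / statistics.mean(values)


def block_means(values: Sequence[float], block: int) -> List[float]:
    """Average consecutive blocks of `block` values; a trailing partial block is dropped."""
    return [
        statistics.mean(values[i:i + block])
        for i in range(0, len(values) - block + 1, block)
    ]


# Synthetic daily series (28 days): a slow weekly trend plus day-to-day scatter.
daily = [5.0 + 0.1 * (i // 7) + 0.5 * ((i * 3) % 7 - 3) for i in range(28)]
weekly = block_means(daily, 7)

cv_daily = coefficient_of_variation(daily)
cv_weekly = coefficient_of_variation(weekly)
```

In this synthetic case the weekly averages smooth out the day-to-day scatter, so `cv_weekly` is smaller than `cv_daily`, which mirrors the trend described above.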

Conclusion

The present work developed a neural network training algorithm applied to the prediction of global solar radiation from meteorological data. The training algorithm was developed in the R language and optimized through the BFGS algorithm. As the data source, a historical experimental meteorological data series ranging from 1974 to 1988 was used.

In this paper, the models developed were used to predict global radiation in Fortaleza. Located in the Northeast region of Brazil, this semiarid coastal city is strongly influenced by the El Niño and La Niña phenomena. The site therefore offers a set of climatological and geographical features that is unmatched in this type of work.

Prior standardization and pre-processing of the data provided the data sets used in the three case studies with different sampling frequencies. As a result, a neural network for global solar radiation prediction was created. Another unique feature of the study is the use of multiple temporal bases: daily (as measured), weekly averaged and monthly averaged data made it possible to evaluate the quality of the prediction over different sampling frequencies.

The analysis of the coefficient of determination for each case study leads to the conclusion that the monthly averaged prediction model was the most accurate. Furthermore, the 44-year-long data set was used not only to build and train the ANN models, but also to assess the influence of data set size on their performance. The models produced good results according to the MAPE benchmark and excellent results according to the nRMSE benchmark. It is therefore concluded that they are at the same quality level as those of similar studies.