Forecasting solar irradiation based on influencing factors determined by linear correlation and stepwise regression

  • Original Paper
Theoretical and Applied Climatology

Abstract

Determining the factors that influence solar irradiation plays an important role in the performance of prediction models based on machine learning techniques. This study compared the results obtained using linear correlation analysis and the stepwise regression method to identify the key meteorological, weather, and radiation factors that significantly affect the solar irradiation on the following day, as well as the difference between the solar irradiation on the current day and that on the following day. These factors were used to establish prediction models for the Qinghai–Tibet Plateau, which has a complex topography and capricious weather. The results indicated that the stepwise regression method screened out more effective influencing factors and that the corresponding predictive accuracy was acceptable even under the weather conditions of the plateau.

Abbreviations

  • \( w \): weather condition
  • \( E \): evaporation (mm)
  • \( ST_{avg} \): average surface temperature (°C)
  • \( ST_{max} \): daily maximum surface temperature (°C)
  • \( ST_{min} \): daily minimum surface temperature (°C)
  • \( P_{20\text{–}8} \): rainfall from 20:00 to 08:00 (mm)
  • \( P_{8\text{–}20} \): rainfall from 08:00 to 20:00 (mm)
  • \( P_{20\text{–}20} \): rainfall from 20:00 to 20:00 (mm)
  • \( P_{avg} \): daily average station pressure (hPa)
  • \( SP_{max} \): daily maximum station pressure (hPa)
  • \( SP_{min} \): daily minimum station pressure (hPa)
  • \( RH_{avg} \): average relative humidity (%)
  • \( RH_{min} \): minimum relative humidity (%)
  • \( S \): sunshine hours (h)
  • \( T_{avg} \): average temperature (°C)
  • \( T_{max} \): daily maximum temperature (°C)
  • \( T_{min} \): daily minimum temperature (°C)
  • \( W_{avg} \): average wind speed (m/s)
  • \( W_{max} \): maximum wind speed (m/s)
  • \( W_{g} \): great wind speed (m/s)
  • \( H \): daily solar irradiation (MJ m⁻² day⁻¹)
  • \( H_0 \): daily extra-terrestrial irradiation (MJ m⁻² day⁻¹)
  • \( H_{sc} \): solar constant (0.082 MJ m⁻² min⁻¹)
  • \( \Delta H \): daily solar irradiation difference (MJ m⁻² day⁻¹)
  • \( H_{t-1}, H_t \): current and next day's solar irradiation (MJ m⁻² day⁻¹)
  • \( d \): inverse relative Earth–Sun distance
  • \( \omega_s \): sunset hour angle (degrees)
  • \( \varphi \): site latitude (degrees)
  • \( \delta \): solar declination angle (degrees)
  • \( j \): day of the year, starting from January 1
  • \( N \): total number of data points
  • \( i \): index of the ith data point
  • \( t-1, t \): current day and next day
  • \( x_{t-1,i} \): input parameter of the current day
  • \( y_{t,i} \): daily solar irradiation for the next day (MJ m⁻² day⁻¹)
  • \( \overline{x} \): average of the input parameters
  • \( \overline{y} \): average of the daily irradiation (MJ m⁻² day⁻¹)
  • \( R \): correlation coefficient between an influencing factor and \( H \)
  • \( R' \): correlation coefficient between an influencing factor and \( \Delta H \)
  • \( y_{t,i}^o \): measured value of daily solar irradiation (MJ m⁻² day⁻¹)
  • \( y_{t,i}^f \): predicted value of daily solar irradiation (MJ m⁻² day⁻¹)
  • \( \overline{y^o} \): average of the measured values of daily solar irradiation (MJ m⁻² day⁻¹)
  • \( x_{t,ij} \): ith sample value of the jth factor in the regression equation
  • \( b_j \): coefficient of the regression equation
  • \( Q^m, Q^{m+1} \): sum of residual squares with m and m + 1 factors
  • Adj-\( R^2 \): adjusted coefficient of determination
  • \( x_k \): new factor of the multivariate regression equation
  • \( V_k \): variance contribution of factor \( x_k \)
  • \( R_{m-p} \): Pearson correlation coefficient between the measured and predicted values
  • NSE: Nash–Sutcliffe efficiency coefficient
  • RMSE: root-mean-square error
  • MAE: mean absolute error
  • MAPE: mean absolute percentage error

References

  • Albers SC, Jankov I (2011) Using the LAPS/WRF system to analyze and forecast solar radiation. Am Geophys Union

  • Al-Mostafa ZA, Maghrabi AH, Al-Shehri SM (2014) Sunshine-based global radiation models: a review and case study. Energy Convers Manag 84:209–216

  • Antonanzas-Torres F, Sanz-Garcia A, Martinez-De-Pison FJ, Perpinan-Lamigueiro O (2013) Evaluation and improvement of empirical models of global solar irradiation: case study northern Spain. Renew Energy 60:604–614

  • Badescu V (2002) A new kind of cloudy sky model to compute instantaneous values of diffuse and global solar irradiance. Theor Appl Climatol 72(1–2):127–136

  • Badescu V, Dumitrescu A (2016) Simple solar radiation modelling for different cloud types and climatologies. Theor Appl Climatol 124(1–2):141–160

  • Bao G, Zhang J, Zhou D, Ma S, Liu W (2017) Analysis of temporal and spatial variation characteristics of solar radiation intensity in Qinghai Province. J Glaciol Geocryol 39(3):563–571

  • Behrang MA, Assareh E, Ghanbarzadeh A, Noghrehabadi AR (2010) The potential of different artificial neural network (ANN) techniques in daily global solar radiation modeling based on meteorological data. Sol Energy 84(8):1468–1480

  • Benghanem M, Mellit A (2010) Radial basis function network-based prediction of global solar radiation data: application for sizing of a stand-alone photovoltaic system at Al-Madinah, Saudi Arabia. Energy 35:3751–3762

  • Benghanem M, Mellit A (2014) A simplified calibrated model for estimating daily global solar radiation in Madinah, Saudi Arabia. Theor Appl Climatol 115(1–2):197–205

  • Chen R, Kang E, Lu S, Yang J, Ji X, Zhang Z, Zhang J (2006) New methods to estimate global radiation based on meteorological data in China. Energy Convers Manag 47(18–19):2991–2998

  • Chen C, Duan S, Cai T, Liu B (2011) Online 24-h solar power forecasting based on weather type classification using artificial neural network. Sol Energy 85(11):2856–2870

  • Chukwujindu NS (2017) A comprehensive review of empirical models for estimating global solar radiation in Africa. Renew Sust Energ Rev 78:955–995

  • Fan S, Fan G, Dong Y, Zhou D (2011) Discussion on the four seasons division method of the Qinghai-Tibet plateau. Plateau Mountain Meteorol Res 31(02):1–11

  • Feng J, Wang W, Li, Liu W (2018) Simulation of solar radiation based on BP neural network and its spatio-temporal change analysis in East China. Remote Sens Technol Appl 33(5):881–889

  • Goh A (1996) Neural network modeling of CPT seismic liquefaction data. J Geotech Eng 122(1):70–73

  • Hasni A, Sehli A, Draoui B, Bassou A, Amieur B (2012) Estimating global solar radiation using artificial neural network and climate data in the South-Western region of Algeria. Energy Procedia 18(1):531–537

  • Kisi O, Parmar KS (2016) Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution. J Hydrol 534:104–112

  • Kumar R, Aggarwal RK, Sharma JD (2015) Comparison of regression and artificial neural network models for estimation of global solar radiations. Renew Sust Energ Rev 52:1294–1299

  • Lewis CD (1982) Industrial and business forecasting methods: a practical guide to exponential smoothing and curve fitting

  • Li J, Liu W, Sun Z, Wu G (2009) Effects of convection and cumulus processes on atmospheric radiant flux. J Meteorol 67(3):355–369

  • Li H, Cao F, Bu X, Zhao L (2015) Models for calculating daily global solar radiation from air temperature in humid regions: a case study. Environ Prog Sustain Energy 34(2):595–599

  • Liang H (2012) Simulation of atmospheric water vapor changes and effects on radiation over the Tibetan plateau. Chin Acad Meteorol Sci

  • Liu DL, Scott BJ (2000) Estimation of solar radiation in Australia from rainfall and temperature observations. Agric For Meteorol 106(1):41–59

  • Lockart N, Kavetski D, Franks SW (2015) A new stochastic model for simulating daily solar radiation from sunshine hours. Int J Climatol 35(6):1090–1106

  • Lu J, Wang J (n.d.) Road surface condition detection based on road surface temperature and solar radiation: International Conference On Computer, 2010

  • Maa A (2011) Artificial neural network estimation of global solar radiation using meteorological parameters in Gusau, Nigeria. Arch Appl Sci Res 3(2):586–595

  • Mahmood R, Hubbard KG (2002) Effect of time of temperature observation and estimation of daily solar radiation for the northern Great Plains, USA. Agron J 94(4):723–733

  • Mubiru J, Banda EJKB (2008) Estimation of monthly average daily global solar irradiation using artificial neural networks. Sol Energy 82(2):181–187

  • Ouammi A, Zejli D, Dagdougui H, Benchrifa R (2012) Artificial neural network analysis of Moroccan solar potential. Renew Sust Energ Rev 16(7):4876–4889

  • Quej VH, Almorox J, Ibrakhimov M, Saito L (2016) Empirical models for estimating daily global solar radiation in Yucatan peninsula, Mexico. Energy Convers Manag 110:448–456

  • Şenkal O, Şahin M, Peştemalci V (2010) The estimation of solar radiation for different time periods. Energy Sour A Recov Util Environ Effects 32(13):1176–1184

  • Sumithira TR, Kumar AN (2012) Prediction of monthly global solar radiation using adaptive neuro fuzzy inference system (ANFIS) technique over the state of Tamilnadu (India): a comparative study. Appl Solar Energy 48(2):140–145

  • Sun Z (2011) Improving transmission calculations for the Edwards–Slingo radiation scheme using a correlated-K distribution method. Q J R Meteorol Soc 137(661):2138–2148

  • Supit I, Van Kappel R (1998) A simple method to estimate global radiation. Sol Energy 63(3):147–160

  • Wang F, Ding Y (2005) An evaluation of cloud radiative feedback mechanisms in climate models. Advances in Earth Science 20(2):207–215 (in Chinese)

  • Wang K, Jiang W, Chen S (2001) Total cloud coverage over the Tibetan plateau: comparative analysis of ground observations, satellite inversion and assimilation data. Plateau Meteorol 20(3):252–257

  • Wang W, Li J, Zhang F (2014) Solar radiation prediction based on BP neural network: a case study of Lanzhou City. J Arid Land Resour Environ 28(2):185–189

  • Winslow JC, Hunt ER, Piper SC (2001) A globally applicable model of daily solar irradiance estimated from air temperature and precipitation data. Ecol Model 143(3):227–243

  • Wu G, Liu Y, Wang T (2007) Methods and strategy for modeling daily global solar radiation with measured meteorological data: a case study in Nanchang station, China. Energy Convers Manag 48(9):2447–2452

  • Wu Q, Wang Z, Cui Y (2010) Estimation of temporal and spatial distribution patterns of solar radiation in China in the past 20 years. J Appl Meteorol 21(3):343–351

  • Yao L, Li T, Yi J, Su Y, Hu W, Xiao D (2012) Transparency of neural network model and input variable reduction. Comput Therm Sci 39(9):247–251

  • Zhang J, Li Z, Shuai D, Xu W, Ying Z (2017) A critical review of the models used to estimate solar radiation. Renew Sust Energ Rev 70:314–329

  • Zhou B, Li F, Yan L, Zhang H, He Y (2011) Research on solar radiation estimation model in Qinghai Province. Chin J Agrometeorol 32(4)

Funding

This research was supported by the National Natural Science Foundation of China (Grant No. 91847302, 51879137, 51979276).

Author information

Corresponding author

Correspondence to Fang-Fang Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

I. Linear correlation analysis method

Linear correlation analysis (LCA) is a statistical method used to study the correlation between two or more random variables. In this study, LCA was used to determine the factors influencing the solar irradiation by calculating the Pearson correlation coefficient, as shown in Eq. (10). A correlation coefficient closer to +1 or −1 indicates a stronger correlation; conversely, a correlation coefficient closer to zero indicates a weaker correlation:

$$ R=\frac{\sum_{i=1}^N\left({x}_{t-1,i}-\overline{x}\right)\left({y}_{t,i}-\overline{y}\right)}{\sqrt{\sum_{i=1}^N{\left({x}_{t-1,i}-\overline{x}\right)}^2\sum_{i=1}^N{\left({y}_{t,i}-\overline{y}\right)}^2}}, $$
(10)

where \( N \) is the total number of data points, \( i \) indexes the ith data point, \( x_{t-1,i} \) is the input parameter on the current day, \( y_{t,i} \) is the daily solar irradiation on the following day, \( t-1 \) is the day before \( t \), \( \overline{x} \) is the average of the input parameters, and \( \overline{y} \) is the average of the daily irradiation.
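As an illustration of the lagged correlation described here, the following minimal Python sketch pairs each factor value on day t−1 with the irradiation on day t and applies the standard Pearson definition, with the square root of the product of the two sums of squares in the denominator. The function name `lagged_pearson` and the sample values are hypothetical and are not taken from the study.

```python
import numpy as np

def lagged_pearson(x, h):
    """Pearson R between a factor on day t-1 and irradiation on day t."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    x_prev = x[:-1]                 # x_{t-1,i}: factor on the current day
    y_next = h[1:]                  # y_{t,i}: irradiation on the next day
    dx = x_prev - x_prev.mean()
    dy = y_next - y_next.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Hypothetical sunshine hours (h) and daily irradiation (MJ m^-2 day^-1)
sunshine = [8.1, 6.4, 9.0, 2.3, 7.5, 8.8, 4.1, 9.4]
irradiation = [21.0, 18.2, 23.5, 9.8, 19.9, 22.7, 13.0, 24.1]
r = lagged_pearson(sunshine, irradiation)
print(f"R = {r:.3f}")
```

The same function could be applied to ΔH instead of H to obtain R′, by passing the day-to-day irradiation difference as the second argument.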

Table 8 Correspondence between correlation coefficient and correlation degree (http://zh.wikipedia.org/zh-cn)

II. Stepwise regression method

The basic concept of the stepwise regression method (SRM) is to introduce parameters into the model one at a time and to apply the F-test after each addition to verify the performance of the newly established model. In addition, the significance of all the parameters already in the model is re-examined each time by applying the t-test. When some of the previously introduced factors are no longer significant as a result of the introduction of a particular parameter, they are removed, which ensures that only significant parameters are in the regression equation before each new parameter is introduced. This process is repeated until no significant parameter remains to be introduced into the regression equation and no insignificant parameter remains to be eliminated from it.

Essentially, the SRM establishes the best multiple linear regression equation by introducing and eliminating factors one at a time based on their variance contributions. The variance contribution of a factor is the difference between the sums of the residual squares before and after the factor is introduced into the regression equation. The sum of the residual squares with m factors in the equation is shown in Eq. (11):

$$ {Q}^m={\sum}_{i=1}^N{\left({y}_{t,i}-{b}_0-{\sum}_{j=1}^m{b}_j{x}_{t, ij}\right)}^2, $$
(11)

where \( b_j \) is the coefficient of the regression equation, \( y_{t,i} \) is the observed value, and \( x_{t,ij} \) is the ith sample value of the jth factor.

Adding factor \( x_k \) (k > m) and using (m + 1) factors to establish a new multivariate regression equation, the sum of the residual squares can be obtained as shown in Eq. (12):

$$ {Q}^{m+1}={\sum}_{i=1}^N{\left({y}_{t,i}-{b}_0^{\ast }-{\sum}_{j=1}^m{b}_j^{\ast }{x}_{t, ij}-{b}_k^{\ast }{x}_{t, ik}\right)}^2 $$
(12)

The variance contribution of factor \( x_k \) is

$$ {V}_k={Q}^m-{Q}^{m+1} $$
(13)

The variance contribution is taken as the measurement of the importance of a factor to the equation. The main process of the SRM is illustrated below.

  1. Divide all of the factors into two sets: one composed of the factors not yet introduced into the regression equation, denoted as \( S \), and the other containing the factors already introduced into the equation, denoted as \( \overline{S} \). At the beginning, all of the factors are in set \( S \), and \( \overline{S} \) is empty.

  2. Calculate the variance contribution of each of the factors in \( S \) by Eqs. (11) to (13) and determine the factor with the greatest variance contribution. If it passes the F-test for introduction, it is brought into the regression equation.

  3. Calculate the variance contribution of each of the factors in \( \overline{S} \) by Eqs. (11) to (13) and determine the factor with the smallest variance contribution. If the F-test for elimination shows that it is no longer significant, it is removed from the regression equation.

  4. Repeat steps (2) and (3) until no factor can be introduced or eliminated.

Using the SRM, the effective influencing factors can be selected from the various factors affecting solar irradiation to establish a multiple regression equation.
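The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the text does not give the F statistic, so the sketch uses the common partial F form, \( F = V_k / (Q / \text{dof}) \), and compares it with fixed thresholds `f_in` and `f_out` (hypothetical values) instead of looking up critical values of the F distribution. The names `rss` and `stepwise` and the synthetic data are also hypothetical.

```python
import numpy as np

def rss(X, y, cols):
    """Sum of residual squares Q for a least-squares fit on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def stepwise(X, y, f_in=4.0, f_out=3.9, max_iter=100):
    """Return the indices of the factors kept in the regression equation."""
    n, p = X.shape
    selected = []                                   # factors already introduced
    for _ in range(max_iter):
        changed = False
        # Step (2): introduce the candidate with the greatest V_k if it passes.
        candidates = [c for c in range(p) if c not in selected]
        if candidates:
            q_m = rss(X, y, selected)
            q_new = {c: rss(X, y, selected + [c]) for c in candidates}
            best = min(q_new, key=q_new.get)        # smallest Q^{m+1} = largest V_k
            dof = n - len(selected) - 2
            f_stat = (q_m - q_new[best]) / max(q_new[best] / dof, 1e-12)
            if f_stat > f_in:
                selected.append(best)
                changed = True
        # Step (3): eliminate the introduced factor with the smallest V_k if it fails.
        if selected:
            q_full = rss(X, y, selected)
            q_drop = {c: rss(X, y, [s for s in selected if s != c]) for c in selected}
            worst = min(q_drop, key=q_drop.get)     # smallest increase in Q
            dof = n - len(selected) - 1
            f_stat = (q_drop[worst] - q_full) / max(q_full / dof, 1e-12)
            if f_stat < f_out:
                selected.remove(worst)
                changed = True
        if not changed:                             # Step (4): stop when stable
            break
    return selected

# Synthetic example: only columns 0 and 2 actually drive y.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.5, size=200)
chosen = stepwise(X_demo, y_demo)
print("selected factor indices:", chosen)
```

Choosing `f_in` slightly larger than `f_out` is a common safeguard: a factor that has just been introduced cannot be eliminated in the same pass, so the loop cannot cycle on a single factor.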

III. Back-propagation neural network

A back-propagation neural network (BPNN) is a multilayer feedforward network trained by the error back-propagation algorithm. It is one of the most widely used neural network models and can achieve high prediction accuracy. A BPNN can learn and store a large number of input–output mapping relationships without requiring a mathematical equation that describes the mapping beforehand. Its learning rule uses the steepest descent method to adjust the weights and thresholds of the network through back-propagation so as to minimise the sum of the squared errors of the network. The procedure of the BPNN consists mainly of the following two stages.

  1. In forward propagation, input samples are transferred from the input layer to the output layer after being processed by the hidden layers. If the actual output of the output layer does not match the expected value, the procedure moves to the back-propagation stage.

  2. In back-propagation, the output error is propagated backward layer by layer, from the output layer through the hidden layers to the input layer, and is allocated to all of the neurons in each layer so that an error signal is obtained for every neuron. These signals are used as the basis for correcting the weights of the neurons.

Like other artificial neural network (ANN) models, a BPNN is composed of an input layer, an output layer, and hidden layers. Although there is no theoretical limit on the number of hidden layers, it has been shown that a neural network with one hidden layer can realise the mapping of continuous functions with arbitrary precision (Goh 1996). Practice has shown that more hidden layers lead to a more complex network, which can impair the generalisation ability of the network. In this study, a three-layer BPNN was adopted with the tansig function in Eq. (14) from the input layer to the hidden layer and the logsig function in Eq. (15) from the hidden layer to the output layer.

$$ \mathrm{tansig}(x)=\frac{2}{1+{e}^{-2x}}-1 $$
(14)

$$ \mathrm{logsig}(x)=\frac{1}{1+{e}^{-x}} $$
(15)
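To make the two stages concrete, the sketch below implements Eqs. (14) and (15) and one possible three-layer network trained by full-batch steepest descent on half the mean squared error. It is an illustration only: the hidden-layer size, learning rate, number of epochs, weight initialisation, and toy data are all assumptions, and the target is assumed to be scaled to the interval (0, 1) because logsig outputs lie in that range.

```python
import numpy as np

def tansig(x):
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0    # Eq. (14), equal to tanh(x)

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))                # Eq. (15)

def train_bpnn(X, y, hidden=4, lr=0.5, epochs=500, seed=0):
    """Train an input -> tansig hidden -> logsig output network; y has shape (n, 1)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W1 = rng.normal(scale=0.5, size=(p, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward propagation: input layer -> hidden layer -> output layer.
        a1 = tansig(X @ W1 + b1)
        out = logsig(a1 @ W2 + b2)
        err = out - y
        losses.append(float(np.mean(err**2)))
        # Back-propagation: output error -> hidden-layer error signals.
        d_out = err * out * (1.0 - out)             # logsig'(z) = out * (1 - out)
        d_hid = (d_out @ W2.T) * (1.0 - a1**2)      # tansig'(z) = 1 - a1^2
        # Steepest-descent update of weights and thresholds (biases).
        W2 -= lr * (a1.T @ d_out) / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_hid) / n
        b1 -= lr * d_hid.mean(axis=0)
    return (W1, b1, W2, b2), losses

# Toy data: two inputs in [-1, 1] and a smooth target already scaled to (0, 1).
rng = np.random.default_rng(1)
X_toy = rng.uniform(-1.0, 1.0, size=(50, 2))
y_toy = logsig(X_toy[:, :1] - X_toy[:, 1:])
params, losses = train_bpnn(X_toy, y_toy)
print(f"MSE: first epoch {losses[0]:.4f}, last epoch {losses[-1]:.4f}")
```

In practice the inputs and the irradiation target would need to be normalised before training and the predictions mapped back to physical units afterwards; that scaling step is omitted here.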

Cite this article

Qiu, J., An, XJ., Wu, ZG. et al. Forecasting solar irradiation based on influencing factors determined by linear correlation and stepwise regression. Theor Appl Climatol 140, 253–269 (2020). https://doi.org/10.1007/s00704-019-03072-8

  • DOI: https://doi.org/10.1007/s00704-019-03072-8
