# Prediction of monthly mean daily global solar radiation using Artificial Neural Network


DOI: 10.1007/s12040-012-0235-1

- Cite this article as:
- SIVAMADHAVI, V. & SELVARAJ, R.S. J Earth Syst Sci (2012) 121: 1501. doi:10.1007/s12040-012-0235-1


In this study, a multilayer feed forward (MLFF) neural network based on the back propagation algorithm was developed, trained, and tested to predict monthly mean daily global radiation in Tamil Nadu, India. Various geographical, solar and meteorological parameters of three locations with diverse climatic conditions were used as inputs. Of the 565 available data points, 530 were used for training and the remaining 35 for testing the artificial neural network (ANN). A 3-layer and a 4-layer MLFF network were developed, and the performance of the models was evaluated using mean bias error, mean absolute percentage error, root mean squared error and Student’s *t*-test. The 3-layer MLFF network did not give uniform results for the three chosen locations. Hence, a 4-layer MLFF network was developed, for which the average mean absolute percentage error was 5.47%. Values of global radiation obtained using the model were in excellent agreement with measured values. The results show that the designed ANN model can be used to estimate monthly mean daily global radiation at any place in Tamil Nadu where measured global radiation data are not available.

### Keywords

Artificial neural network; back propagation; global radiation; multilayer perceptron network; Tamil Nadu

## 1 Introduction

The energy crisis is one of the most important challenges faced by many countries. In this scenario, solar energy plays a vital role as a renewable energy source because it is non-polluting and reliable in tropical countries like India. Tamil Nadu is a state in the southern peninsular region of India, lying in the sunny belt between 8.5° and 13.35°N. Its geographical location is advantageous for utilizing solar energy. For effective and efficient utilization of solar energy, it is necessary to have precise knowledge of the various components of solar energy available at the locations of interest. Global radiation is the most important component of solar radiation since it gives the total solar availability at a given place. Global radiation is measured at only a few locations because of the high cost of purchasing and maintaining the equipment. Mathematical models therefore become necessary for places where global radiation is not measured.

Various empirical models have been developed by many researchers using meteorological parameters such as sunshine hours, temperature, etc., to estimate monthly mean daily global radiation. Veeran and Kumar (1993) developed an Angstrom type regression model correlating global radiation and sunshine hours to estimate the monthly mean daily global radiation at two locations in India using the data of five years. Ahmad and Ulfat (2004) employed the regression technique and proposed Angstrom type empirical equation of first and second order for determining the monthly mean daily global radiation at Karachi, Pakistan. Sabbagh *et al.* (1977) developed a regression model correlating the monthly mean daily global radiation with sunshine hours, relative humidity, maximum temperature, latitude and altitude for various places in Egypt, Kuwait, Lebanon, Sudan and Saudi Arabia. Chandal *et al.* (2005) correlated the monthly mean daily global radiation with temperature, sunshine hours, latitude and altitude for six Indian locations using the data of 10–15 years. Augustine and Nnabuchi (2009) developed an Angstrom type correlation equation to predict monthly mean daily global solar radiation incident on a horizontal surface in Warri, Nigeria. Prieto *et al.* (2009) proposed an empirical model to estimate monthly mean daily global radiation using air temperature in Asturias, Spain. Ali (2008) modified three existing radiation models to estimate monthly mean daily global radiation in various cities in central arid desert of Iran by including the parameters such as altitude, Sun–Earth distance and number of dusty days. These models were developed only for specific locations and so they have limited applications.

Artificial neural networks offer a better way to predict various components of solar radiation using many meteorological and geographical parameters as input. The basic assumption underlying ANN models is that there exists a non-linear relationship between solar radiation and these parameters. ANNs have gained increasing popularity for applications where a mechanistic description of the dependency between dependent and independent variables is either unknown or very complex (Almeida 2002). ANN is one of the best tools for non-linear mapping. The ANN technique has previously been used to predict components of solar radiation such as hourly diffuse radiation (Soares *et al.* 2004; Alam *et al.* 2009), hourly direct radiation (Lopez *et al.* 2005), hourly global radiation (Krishnaiah *et al.* 2007), and daily global radiation (Mohandes *et al.* 1998; Reddy and Ranjan 2003; Fadare *et al.* 2010).

Global radiation is the most important component of solar radiation. Fadare *et al.* (2010) studied the feasibility of an ANN based model for the prediction of monthly mean daily global radiation at various locations in Africa. Mohandes *et al.* (1998) used a multilayer perceptron neural network for modelling monthly mean daily global radiation in the Kingdom of Saudi Arabia. So far, there has been only one study on the modelling of global radiation in India using ANN technique (Reddy and Ranjan 2003) where the maximum mean percentage error is 10.2% and 12.5% for the stations, Mangalore and New Delhi, respectively. Reddy and Ranjan (2003) developed a Multi Layer Feed Forward network for the estimation of monthly mean daily and hourly values of global radiation in India using various geographical and meteorological parameters. The ANN model employed in their study contains two hidden layers with eight and seven neurons respectively. The results of their study justify the application of artificial neural networks, the most sophisticated non-linear model for modelling global radiation.

The main objective of our study is to develop a computationally simpler ANN model to estimate global radiation with a mean absolute percentage error of less than 5% using commonly measured weather parameters for places in Tamil Nadu. Solar parameters such as solar declination, sunrise hour angle, extraterrestrial radiation and maximum possible sunshine duration (day length) have been used as inputs along with some geographical and meteorological parameters. The inputs used in our study are readily available at all meteorological stations. In the present study, a multilayer feed forward neural network based on the Levenberg–Marquardt back propagation learning algorithm was developed, trained and tested using the geographical, solar and meteorological parameters of three locations in Tamil Nadu with varying climatic conditions.

## 2 Materials and methods

### 2.1 ANN model

1. Month number
2. Latitude (deg)
3. Longitude (deg)
4. Altitude (m)
5. Solar declination (radian)
6. Sunrise hour angle (radian)
7. Day length (hour)
8. Extraterrestrial radiation on a horizontal surface (MJ/day/m^{2})
9. Sunshine duration (hour)
10. Maximum temperature (°C)
11. Minimum temperature (°C)
12. Relative Humidity at 8:30 hrs IST (%)
13. Relative Humidity at 17:30 hrs IST (%) and
14. Wind speed (km/hour).

Design and training parameters of the ANN model.

Sl. no. | Parameter | Selected value | Remarks |
---|---|---|---|
1 | No. of hidden layers | 1 and 2 | Simplest and next simplest models were selected |
2 | No. of hidden neurons | 4 (3-layer network); 5 and 3 (4-layer network) | Chosen to minimize the error and avoid overfitting |
3 | Activation function – hidden layers | “tansig” | \(F(x) = 2/\left(1+\exp(-2x)\right) - 1\) |
4 | Activation function – output layer | “purelin” | \(F(x) = x\) |
5 | Training function | trainlm | Levenberg–Marquardt back propagation algorithm |
6 | Learning function | learngdm | Gradient descent with momentum learning function to update the weights/biases |
7 | Learning rate | 0.001 | Adaptive learning rate algorithm was used |
8 | Performance function | msereg | Chosen to avoid overfitting |
9 | Goal | 0.0001 | |
10 | Epochs | 1000 | |

Monthly mean daily values of these parameters were given as inputs to our network. The single neuron in the output layer corresponds to the output parameter, monthly mean daily global radiation on a horizontal surface.

Back propagation network is a supervised learning network, i.e., a network that learns with a teacher. The network is trained with a training data that consists of an input vector set and a target vector set. While training the network, weights and biases are so adjusted that the error between the target and the predicted output vectors is minimized. In the first phase of the training, the input vectors are propagated in the forward direction from the input layer to the output layer (called the Forward phase) and in the second phase, the error is propagated in the backward direction (called the Backward phase) to update the weights minimizing the errors.

Here, **p** represents the input vector; **y**^{1} and **y**^{n} represent the output vectors of the first and *n*th layers respectively; **W**^{n} and **b**^{n} are the weight matrix and the bias vector of the *n*th layer respectively; *M* represents the total number of layers in the network; and f^{n} represents the activation function of the *n*th layer.
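The forward phase described by these symbols can be sketched as follows. This is a minimal illustration that assumes the usual layer recursion, \(\mathbf{y}^{n} = f^{n}(\mathbf{W}^{n}\mathbf{y}^{n-1} + \mathbf{b}^{n})\) with \(\mathbf{y}^{0} = \mathbf{p}\); it is not the authors' MATLAB code. The helper names (`layer_output`, `forward`) and the toy weights are hypothetical.

```python
import math

def tansig(x):
    """Hidden-layer activation from the design table: 2 / (1 + exp(-2x)) - 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def purelin(x):
    """Output-layer activation from the design table: F(x) = x."""
    return x

def layer_output(W, b, y_prev, f):
    """Assumed recursion for one layer: y^n = f^n(W^n y^(n-1) + b^n)."""
    return [f(sum(w * y for w, y in zip(row, y_prev)) + bias)
            for row, bias in zip(W, b)]

def forward(p, layers):
    """Propagate the input vector p through a list of (W, b, f) layers."""
    y = p
    for W, b, f in layers:
        y = layer_output(W, b, y, f)
    return y

# Toy network: 2 inputs -> 2 tansig hidden neurons -> 1 purelin output.
# The weights are illustrative placeholders, not values from the study.
layers = [
    ([[0.5, -0.3], [0.1, 0.8]], [0.0, -0.1], tansig),
    ([[1.0, -1.0]], [0.2], purelin),
]
output = forward([0.4, 0.7], layers)
```

The same `forward` function would cover the networks in the study by passing 14-element input vectors and layers sized 14-4-1 or 14-5-3-1.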

Here, *K* is the number of neurons in the output layer; *N* is the number of input–output sample pairs; *d*_{ij} and *a*_{ij} represent the desired output and the actual output of the *i*th neuron corresponding to the *j*th sample of the training data; and *e*_{ij} represents the error corresponding to the *i*th neuron and *j*th sample and is given by \(e_{ij} = d_{ij} - a_{ij}\).

- Step 3. The Jacobian matrix is constructed using the equation

where **X** = [x_{1} x_{2} ... x_{P}]^{T} is the vector of adjustable parameters, i.e., weights and biases.

- Step 4. The weights and biases are adjusted according to the equation

where **I** is the identity matrix and \(\upmu \) is the learning parameter.

- Step 5. The sum of squared errors is recomputed using the updated weights and biases. If the new value is smaller than the previous value of the sum of squared errors, then \(\upmu \) is divided by a factor of \(\upbeta \) and steps 1–5 are repeated. If the new value of the sum of squared errors is higher than its previous value, then \(\upmu \) is multiplied by a factor of \(\upbeta \) and steps 4–5 are repeated. We have used \(\upmu \) = 0.001 and \(\upbeta \) = 10.
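The damping logic of these steps can be illustrated on a toy problem. The sketch below assumes the standard Levenberg–Marquardt update \(\Delta\mathbf{X} = -(\mathbf{J}^{T}\mathbf{J} + \upmu\mathbf{I})^{-1}\mathbf{J}^{T}\mathbf{e}\), with \(\mathbf{J}\) holding the derivatives of the errors with respect to the parameters. A two-parameter curve stands in for the network so that the linear solve can be written by hand. The model, the data, the stopping values and all function names are hypothetical; only \(\upmu\) = 0.001 and \(\upbeta\) = 10 come from the text.

```python
import math

def model(params, t):
    """Hypothetical stand-in for the network: a * (1 - exp(-b * t))."""
    a, b = params
    return a * (1.0 - math.exp(-b * t))

def errors(params, ts, ds):
    """e_j = d_j - a_j for every training sample."""
    return [d - model(params, t) for t, d in zip(ts, ds)]

def jacobian(params, ts):
    """One row per sample: derivatives of e_j with respect to (a, b)."""
    a, b = params
    return [[-(1.0 - math.exp(-b * t)), -a * t * math.exp(-b * t)] for t in ts]

def lm_step(J, e, mu):
    """Solve (J^T J + mu I) dx = -J^T e for the two-parameter case."""
    h11 = sum(r[0] * r[0] for r in J) + mu
    h12 = sum(r[0] * r[1] for r in J)
    h22 = sum(r[1] * r[1] for r in J) + mu
    g1 = sum(r[0] * ej for r, ej in zip(J, e))
    g2 = sum(r[1] * ej for r, ej in zip(J, e))
    det = h11 * h22 - h12 * h12
    return [-(h22 * g1 - h12 * g2) / det, -(h11 * g2 - h12 * g1) / det]

def levenberg_marquardt(params, ts, ds, mu=0.001, beta=10.0,
                        epochs=100, goal=1e-12):
    sse = sum(ej * ej for ej in errors(params, ts, ds))
    for _ in range(epochs):
        if sse <= goal:
            break
        e = errors(params, ts, ds)
        J = jacobian(params, ts)
        improved = False
        while mu < 1e10:                      # retry the update with more damping
            dx = lm_step(J, e, mu)
            trial = [p + d for p, d in zip(params, dx)]
            new_sse = sum(ej * ej for ej in errors(trial, ts, ds))
            if new_sse < sse:                 # accept the step, relax the damping
                params, sse, mu = trial, new_sse, mu / beta
                improved = True
                break
            mu *= beta                        # reject the step, damp harder
        if not improved:
            break
    return params, sse

# Noise-free synthetic data generated with a = 2.0 and b = 0.7.
ts = [0.5, 1.0, 2.0, 3.0, 4.0]
ds = [model([2.0, 0.7], t) for t in ts]
fitted, final_sse = levenberg_marquardt([1.0, 1.0], ts, ds)
```

A small \(\upmu\) makes the update behave like a Gauss–Newton step, while a large \(\upmu\) shrinks it towards a short gradient-descent step, which is why a rejected trial only raises \(\upmu\) and retries.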

*et al.* 1996). In the MLFF network, the numbers of neurons in the input layer, output layer and hidden layers should satisfy inequalities (8) and (9) (Salai Selvam and Shelbagadevi 2010).

Here, *N*_{i}, *N*_{h1}, *N*_{h2} and *N*_{o} are the numbers of neurons in the input layer, hidden layer-1, hidden layer-2 and output layer respectively, and *N* is the number of data points used for training.

In the designed 3-layer MLFF network with four hidden neurons, the number of adjustable parameters is 65. In our 4-layer MLFF network, we chose *N*_{h1} and *N*_{h2} to be 5 and 3 respectively, which gives 97 adjustable parameters. The training data comprise 530 samples, well above the number of adjustable parameters in either network; hence the designed network can be expected to generalize. Another way to improve generalization is ‘regularization’, which involves changing the performance function of the network from ‘mse’ to ‘msereg’. This was also done to ensure that the network generalizes well. Here, \({\rm msereg}=\upalpha\; {\rm mse}+\left( 1-\upalpha \right){\rm msw}\), where mse and msw are the mean squared values of the errors and the weights respectively, and \(\upalpha \) is a parameter that determines the relative significance of mse and msw.
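The parameter counts quoted above can be checked with a few lines of arithmetic. The sketch assumes fully connected layers with one bias per neuron and 14 inputs, as listed in section 2.1. The function names are hypothetical, and the value of \(\upalpha\) is left as an argument because the text does not state it.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def msereg(errors, weights, alpha):
    """alpha * mse + (1 - alpha) * msw, as defined in the text."""
    mse = sum(e * e for e in errors) / len(errors)
    msw = sum(w * w for w in weights) / len(weights)
    return alpha * mse + (1.0 - alpha) * msw

three_layer = count_parameters([14, 4, 1])     # 14 inputs, 4 hidden, 1 output
four_layer = count_parameters([14, 5, 3, 1])   # 14 inputs, 5 and 3 hidden, 1 output
print(three_layer, four_layer)                 # 65 97
```

Both counts agree with the figures given in the paragraph above: 15 × 4 + 5 × 1 = 65 and 15 × 5 + 6 × 3 + 4 × 1 = 97.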

Here, *D* and *D*_{n} are the actual and the normalized data respectively; *D*_{max} and *D*_{min} are the maximum and minimum values of the actual data.

The ANN model was implemented in MATLAB. Training is terminated when the set goal value of the performance function is reached or the total number of epochs is completed. The MATLAB code for the ANN model used in the present study will be made available to interested readers by the authors on request.
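A min–max scaling consistent with the symbols *D*, *D*_{n}, *D*_{max} and *D*_{min} might look like the sketch below. The target range used in the study cannot be recovered from the text, so the bounds `lo` and `hi` are explicit assumptions (defaulting to 0 and 1), and the function names are hypothetical.

```python
def normalize(values, lo=0.0, hi=1.0):
    """Map values linearly so that D_min -> lo and D_max -> hi (assumed range)."""
    d_min, d_max = min(values), max(values)
    if d_max == d_min:
        raise ValueError("cannot normalize constant data")
    scale = (hi - lo) / (d_max - d_min)
    return [lo + (d - d_min) * scale for d in values]

def denormalize(normalized, d_min, d_max, lo=0.0, hi=1.0):
    """Invert normalize() to recover values in physical units."""
    scale = (d_max - d_min) / (hi - lo)
    return [d_min + (d_n - lo) * scale for d_n in normalized]

# Illustrative temperatures in degrees Celsius, not data from the study.
temperatures = [24.0, 30.0, 36.0]
scaled = normalize(temperatures)
restored = denormalize(scaled, min(temperatures), max(temperatures))
```

The inverse mapping matters because the network output is produced in normalized units and has to be converted back before it is compared with measured radiation.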

### 2.2 Data

Stations used for the study.

Sl. no. | Climatic zone | Station | Latitude (°) | Longitude (°) | Altitude (m) | No. of data points (months) |
---|---|---|---|---|---|---|
1 | Hill station | Kodaikanal | 10.2333 | 77.4667 | 2345 | 190 |
2 | Coastal area | Chennai | 13 | 80.1833 | 16 | 342 |
3 | Continental area | Coimbatore | 11 | 77 | 409 | 33 |

The above-mentioned meteorological parameters have been obtained for a period of 342 months (1980 to 2009) for Chennai, of which the data of 2009 (12 months) have been used for testing and the rest (330) for training the network. For Kodaikanal, the global radiation data were available for 190 months during the period from 1984 to 2009, with many data gaps, though the other meteorological parameters were available for the entire period. Of these, the data of 2009 (12 months) have been used for testing and the rest (178) for training the network. At Coimbatore, the measurement of global radiation was started by the India Meteorological Department in 2006. Hence, data were available for only 33 months during the period from 2006 to 2008, of which the data of 2008 (11 months) were used for testing and the rest (22) for training.
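As a quick consistency check, the per-station splits described above add up to the totals quoted in the abstract. The dictionary layout and variable names below are illustrative only; the counts are taken from the paragraph above.

```python
# (training months, testing months) per station, as stated in the text.
splits = {
    "Chennai": (330, 12),
    "Kodaikanal": (178, 12),
    "Coimbatore": (22, 11),
}

train_total = sum(train for train, _ in splits.values())
test_total = sum(test for _, test in splits.values())
print(train_total, test_total, train_total + test_total)   # 530 35 565
```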

Here, *H*_{d} is the daily extraterrestrial radiation on a horizontal surface on the *n*th day, I_{0} is the hourly extraterrestrial radiation, *n* is the day of the year starting from first January, \(\upomega \) is the sunrise hour angle on the *n*th day, \(\phi \) is the latitude of the location of interest, and \(\updelta \) is the solar declination on the *n*th day.
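The solar inputs defined by these symbols can be computed from latitude and day number alone. The sketch below uses widely quoted textbook relations (Cooper's declination formula and the standard daily extraterrestrial radiation expression). These are assumed to match the equations used in the study but may differ in detail. The solar constant of 1367 W/m^{2} and the function names are assumptions as well.

```python
import math

SOLAR_CONSTANT = 1367.0  # W/m^2; assumed value, not stated in the text

def declination(n):
    """Solar declination (radians) on day n of the year, Cooper's formula."""
    return math.radians(23.45) * math.sin(2.0 * math.pi * (284 + n) / 365.0)

def sunrise_hour_angle(lat_deg, n):
    """Sunrise hour angle (radians): arccos(-tan(phi) * tan(delta))."""
    phi = math.radians(lat_deg)
    return math.acos(-math.tan(phi) * math.tan(declination(n)))

def day_length(lat_deg, n):
    """Maximum possible sunshine duration (hours): (2/15) * omega in degrees."""
    return 2.0 * math.degrees(sunrise_hour_angle(lat_deg, n)) / 15.0

def extraterrestrial_radiation(lat_deg, n):
    """Daily extraterrestrial radiation on a horizontal surface (MJ/m^2/day)."""
    phi = math.radians(lat_deg)
    delta = declination(n)
    omega = sunrise_hour_angle(lat_deg, n)
    i0 = SOLAR_CONSTANT * 3600.0 / 1.0e6           # hourly value in MJ/m^2
    eccentricity = 1.0 + 0.033 * math.cos(2.0 * math.pi * n / 365.0)
    geometry = (math.cos(phi) * math.cos(delta) * math.sin(omega)
                + omega * math.sin(phi) * math.sin(delta))
    return (24.0 / math.pi) * i0 * eccentricity * geometry

# Example: Chennai (latitude 13 deg N) on day 172, near the June solstice.
h_d = extraterrestrial_radiation(13.0, 172)
hours = day_length(13.0, 172)
```

For monthly mean daily inputs, one common choice is to evaluate these functions at a representative day of each month; the text does not say which convention the study followed.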

### 2.3 Performance evaluation methods

- (i) Mean Bias Error (MBE) defined as:

where *K* is the total number of observations, and \(H^{i}_{\rm obs}\) and \(H^{i}_{\rm cal}\) are the *i*th observed value and *i*th calculated value of global radiation.

- (ii) Root Mean Square Error (RMSE) defined as:

- (iii) Mean Absolute Percentage Error (MAPE) defined as:

The *t*-test was carried out at the 95% confidence level. TS should lie within the interval defined by −*T*_{c} and +*T*_{c}, where *T*_{c}, the critical *TS* value, is obtained from the Student’s distribution at the desired confidence level with (*K* − 1) degrees of freedom (Daniel and Terrell 1992). The smaller the value of *TS*, the better the performance of the model.
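The four performance measures can be sketched as below. The definitions are the commonly used forms and are assumptions here: MBE is taken as calculated minus observed, and TS as \(\sqrt{(K-1)\,{\rm MBE}^{2}/({\rm RMSE}^{2}-{\rm MBE}^{2})}\). The sign convention and the exact TS expression used in the study may differ. The function names and sample values are illustrative.

```python
import math

def mbe(obs, cal):
    """Mean bias error; positive when the model overestimates (assumed sign)."""
    return sum(c - o for o, c in zip(obs, cal)) / len(obs)

def rmse(obs, cal):
    """Root mean square error."""
    return math.sqrt(sum((c - o) ** 2 for o, c in zip(obs, cal)) / len(obs))

def mape(obs, cal):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((o - c) / o) for o, c in zip(obs, cal)) / len(obs)

def t_statistic(obs, cal):
    """TS built from MBE and RMSE with K - 1 degrees of freedom (assumed form)."""
    k = len(obs)
    b, r = mbe(obs, cal), rmse(obs, cal)
    spread = r * r - b * b          # zero when every error is identical
    if spread <= 0.0:
        raise ValueError("TS is undefined when all errors are equal")
    return math.sqrt((k - 1) * b * b / spread)

# Illustrative values in MJ/m^2/day, not data from the study.
observed = [20.0, 22.0, 24.0, 18.0]
calculated = [21.0, 23.0, 24.0, 19.0]
ts = t_statistic(observed, calculated)
```

A model passes the test at the chosen confidence level when `ts` is below the critical value *T*_{c} for *K* − 1 degrees of freedom.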

## 3 Results and discussion

*K* − 1) degrees of freedom. *K* = 12 in the case of Chennai and Kodaikanal, *K* = 11 for Coimbatore, and *K* = 35 for the overall model. Estimated *TS* values for the testing phase are compared with the respective critical values in table 3. The *TS* values for Chennai and Coimbatore lie within the critical limit. However, the *TS* value of Kodaikanal lies outside the critical limit. Kodaikanal is a high altitude station. The greenhouse warming in such places is much lower than that in other places. The role of greenhouse gases in the surface radiation budget is not included in our study. We have not used the concentration of greenhouse gases as input in the ANN model. This could be the possible cause for the higher value of *TS* for Kodaikanal. Due to this, the *TS* value obtained for all the three stations together also lies outside the critical limit.

Comparison of TS values with *T*_{c}.

Location | *T*_{c} | TS (4-layer MLFF network) |
---|---|---|
Chennai | 2.2010 | 0.7867 |
Kodaikanal | 2.2010 | 3.3414 |
Coimbatore | 2.2282 | 0.2871 |
Overall | 2.0322 | 2.2362 |

## 4 Conclusion

The uniform values of MAPE indicate that the model proposed in this study has generalized the training data of all the three locations equally well. The locations taken for this study have different climatic conditions. The ANN model has been trained to generalize for any new locations with similar climatic conditions as these three stations. Hence, this model can be used to predict the monthly mean daily global radiation at any location in Tamil Nadu.

## Acknowledgements

The authors would like to convey their sincere gratitude to Dr K S Reddy, Indian Institute of Technology, Chennai, India, for valuable suggestions regarding the presented work. They also wish to express their sincere gratitude to the reviewers for the valuable suggestions which helped to improve the manuscript. They thank their colleagues in the Department of English who helped in improving the presentation.