Evaluation of diffuse fraction and diffusion coefficient using statistical analysis

In this study, eighty models are proposed to estimate the diffuse fraction and the diffusion coefficient, with sunshine ratio and clearness index as predictors. Monthly average global and diffuse solar radiation, together with sunshine duration data from Tamanrasset station for 1995 to 2017, were analyzed. The proposed models are compared and statistically analyzed to identify the best-fitting model. Nine statistical indicators and the Global Performance Indicator (GPI) are computed to evaluate the models. The cubic model with sunshine ratio and clearness index is found to be the most accurate for estimating diffuse solar radiation on a horizontal surface in the study area.


Introduction
Solar energy is a renewable energy resource and plays a major role among alternative energy sources. For any study of solar energy, information on solar radiation at a given geographical location is very important (Bakirci 2009, 2015). Several radiation parameters, such as global, diffuse and direct radiation, are used in solar energy applications (Hussain et al. 1999). Solar energy in Algeria is abundant throughout the year; the average sunshine duration is 3000 h/year, and the average energy received is 1700 kWh/m²/year in the North and 2650 kWh/m²/year in the South (Boudghene Stambouli 2011; Bouchouicha et al. 2015). Many studies around the world have estimated diffuse solar radiation from available data such as sunshine hours (Sabbagh et al. 1977; Iqbal 1979; Erbs et al. 1982; De Miguel et al. 2001; Paliatsos et al. 2003; Li et al. 2012).
Recent studies have used empirical models based on mathematical functions to estimate diffuse radiation from the clearness index and sunshine ratio; such models play an essential role where the required measuring equipment is not available. Many authors have used linear, nonlinear and polynomial regression models to correlate the diffusion coefficient or diffuse fraction with sunshine ratio and/or clearness index (Orgill and Hollands 1977; Spencer; Reindl et al. 1990; Lam and Li 1996; Hua et al. 2002; Soares et al. 2004). Kuo et al. (2014) studied two years of global and diffuse radiation data from Tainan, Taiwan, and compared their proposed models with fourteen models available in the literature; they concluded that the proposed piecewise linear models perform well in predicting the diffuse fraction. Liu et al. (2017) developed four models using global solar radiation and sunshine duration data in China; the statistical indices showed that the cubic models performed best in the radiation zone. For Algeria, Mecibah et al. (2014) proposed quadratic and cubic sunshine-based models. Bailek et al. (2017) reviewed thirty-five proposed correlations and compared them with irradiation measured in the Algerian Big South (Adrar region); they concluded that the second-order polynomial model of the diffuse fraction is able to estimate the monthly average daily diffuse irradiation on a horizontal surface.
The main objective of this study is to develop and compare different proposed empirical models for estimation of horizontal monthly mean diffuse solar radiation based on clearness index and sunshine ratio.

Solar radiation data
Data on horizontal global solar radiation, diffuse solar radiation and sunshine duration at Tamanrasset station for 1995 to 2017 were obtained from the National Meteorological Office of Algeria. The geographical information of the station is given in Table 1. Tamanrasset is located in the southeastern region of Algeria (Fig. 1). It has a hot desert climate (Köppen climate classification BWh), with very hot summers and mild winters. There is very little rain throughout the year, although occasional rain does fall in late summer from the northern extension of the Intertropical Convergence Zone.

Proposed models
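In the current study, regression analysis is used for the proposed models, where the predictand is the diffuse fraction (k_d) or the diffusion coefficient (K_D) and the predictors are the sunshine ratio (S_t) and the clearness index (K_t). Thus, three types of models, forty in total, can be defined for each of the diffuse fraction and the diffusion coefficient. The three respective types can be written as:

$$k_d = \frac{H_d}{H} = f(S_t), \qquad K_D = \frac{H_d}{H_0} = f(S_t)$$

$$k_d = \frac{H_d}{H} = f(K_t), \qquad K_D = \frac{H_d}{H_0} = f(K_t)$$

$$k_d = \frac{H_d}{H} = f(S_t, K_t), \qquad K_D = \frac{H_d}{H_0} = f(S_t, K_t)$$

where H_0, H and H_d are the monthly mean daily extraterrestrial solar radiation, global solar radiation and diffuse solar radiation on a horizontal surface, respectively. Mathematically, the sunshine ratio and clearness index are defined as

$$S_t = \frac{S}{S_0}, \qquad K_t = \frac{H}{H_0}$$

where S and S_0 are the sunshine duration and the maximum possible sunshine duration, respectively. The monthly average daily extraterrestrial solar radiation on a horizontal surface is calculated from the following equation (Klein 1977):

$$H_0 = \frac{24}{\pi} H_{sc} \left[1 + 0.033 \cos\left(\frac{360\,n}{365}\right)\right] \left[\cos\varphi \cos\delta \sin\omega_s + \frac{\pi\,\omega_s}{180} \sin\varphi \sin\delta\right]$$

where H_sc is the solar constant, n is the Julian day of the year, φ is the latitude of the location, δ is the declination angle, and ω_s is the sunset hour angle. δ and ω_s are mathematically defined as:

$$\delta = 23.45 \sin\left[\frac{360\,(284 + n)}{365}\right]$$

$$\omega_s = \cos^{-1}\left(-\tan\varphi \tan\delta\right)$$

The maximum possible sunshine duration (S_0) is calculated as:

$$S_0 = \frac{2}{15}\,\omega_s$$

The forty proposed models for each of the diffuse fraction and the diffusion coefficient are presented in Table 2.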
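The quantities defined in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the commonly used Klein (1977) form of H_0 with Cooper's declination formula, a solar constant of 1367 W/m² (the text gives no value), and an approximate latitude for the station. All function names and the input values are hypothetical.

```python
import math

H_SC = 1367.0  # solar constant in W/m^2 (assumed; the text does not give a value)


def declination_deg(n):
    """Solar declination in degrees for day of year n (Cooper's formula, assumed)."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + n) / 365.0))


def sunset_hour_angle_deg(n, lat_deg):
    """Sunset hour angle in degrees: arccos(-tan(phi) * tan(delta))."""
    phi = math.radians(lat_deg)
    delta = math.radians(declination_deg(n))
    cos_ws = -math.tan(phi) * math.tan(delta)
    cos_ws = max(-1.0, min(1.0, cos_ws))  # clamp so polar day/night does not raise
    return math.degrees(math.acos(cos_ws))


def max_sunshine_hours(n, lat_deg):
    """Maximum possible sunshine duration S_0 in hours: (2/15) * omega_s."""
    return 2.0 / 15.0 * sunset_hour_angle_deg(n, lat_deg)


def extraterrestrial_radiation(n, lat_deg):
    """Daily extraterrestrial radiation H_0 on a horizontal surface, in MJ/m^2/day."""
    phi = math.radians(lat_deg)
    delta = math.radians(declination_deg(n))
    ws = math.radians(sunset_hour_angle_deg(n, lat_deg))
    eccentricity = 1.0 + 0.033 * math.cos(math.radians(360.0 * n / 365.0))
    geometry = (math.cos(phi) * math.cos(delta) * math.sin(ws)
                + ws * math.sin(phi) * math.sin(delta))
    # 24 h expressed in seconds converts W/m^2 to J/m^2 per day.
    joules = (24.0 * 3600.0 / math.pi) * H_SC * eccentricity * geometry
    return joules / 1.0e6


def ratios(H, H_d, S, n, lat_deg):
    """Return k_d, K_D, S_t and K_t (H and H_d in MJ/m^2/day, S in hours)."""
    H0 = extraterrestrial_radiation(n, lat_deg)
    S0 = max_sunshine_hours(n, lat_deg)
    return {"k_d": H_d / H, "K_D": H_d / H0, "S_t": S / S0, "K_t": H / H0}


# Illustrative inputs only (not station data); the latitude is an approximation.
r = ratios(H=25.0, H_d=6.0, S=10.0, n=172, lat_deg=22.8)
print({name: round(value, 3) for name, value in r.items()})
```

For monthly means, one would typically evaluate these functions at a representative day of each month, or average the daily values over the month; which of the two the study used is not stated.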

Statistical evaluation
In this study, nine statistical indicators were used to evaluate the proposed models: mean bias error (MBE), mean absolute error (MAE), mean absolute relative error (MARE), mean absolute percentage error (MAPE), root mean squared error (RMSE), root mean squared relative error (RMSRE), relative root mean squared error (RRMSE), correlation coefficient (R²) and t-statistic (t-stat). Each indicator is computed from the paired estimated and observed values.
Descriptive statistics of the solar radiation data, including the minimum, maximum, mean, standard deviation and coefficient of variation, are shown in Table 3. The mean values of global and diffuse solar radiation are 2320 and 704.2 MJ/m² day, with standard deviations of 394.4 and 282.35 MJ/m² day, respectively. The coefficients of variation of H and H_d are 17% and 40%, respectively. The diffuse fraction and diffusion coefficient ranged from 0.105 to 0.577 and from 0.030 to 0.121, with means of 0.298 and 0.074, respectively. High values of k_d and K_D are observed from April to September (Fig. 2). The values of sunshine ratio and clearness index varied from 0.205 to 0.298 and from 7.105 to 16.401, with means of 0.251 and 12.826, respectively; high values of both predictors (S_t and K_t) are observed between January and December (Fig. 2). Fig. 3 shows significant negative correlations for k_d-S_t (−0.859), k_d-K_t (−0.722), K_D-S_t (−0.825) and K_D-K_t (−0.577).
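The nine indicators named above can be computed as in the following sketch. It uses common textbook definitions, which are an assumption here: the sign convention of MBE (estimated minus observed), the form of R² (one minus the ratio of residual to total sum of squares) and the percentage scaling of MAPE and RRMSE may differ from those used in the study. The function name and the sample values are hypothetical.

```python
import math


def indicators(observed, estimated):
    """Return nine error indicators for paired observed and estimated values."""
    n = len(observed)
    errors = [e - o for o, e in zip(observed, estimated)]
    # Relative errors require every observed value to be non-zero.
    relative = [(e - o) / o for o, e in zip(observed, estimated)]
    mean_obs = sum(observed) / n

    mbe = sum(errors) / n
    mae = sum(abs(d) for d in errors) / n
    mare = sum(abs(r) for r in relative) / n
    mape = 100.0 * mare
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    rmsre = math.sqrt(sum(r * r for r in relative) / n)
    rrmse = 100.0 * rmse / mean_obs

    ss_res = sum(d * d for d in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot

    # t-stat = sqrt((n - 1) * MBE^2 / (RMSE^2 - MBE^2)); guard the zero denominator.
    spread = rmse ** 2 - mbe ** 2
    if spread > 0:
        t_stat = math.sqrt((n - 1) * mbe ** 2 / spread)
    else:
        t_stat = 0.0 if mbe == 0 else math.inf

    return {"MBE": mbe, "MAE": mae, "MARE": mare, "MAPE": mape, "RMSE": rmse,
            "RMSRE": rmsre, "RRMSE": rrmse, "R2": r2, "t-stat": t_stat}


# Made-up diffuse-fraction values, for illustration only.
obs = [0.20, 0.40, 0.50, 0.25]
est = [0.22, 0.38, 0.50, 0.30]
stats = indicators(obs, est)
print({name: round(value, 4) for name, value in stats.items()})
```

Under these definitions, all indicators except R² are better when closer to zero, while R² is better when closer to 1.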
The statistical indicators of the different models are given in Table 4. For higher modeling accuracy, MBE, MAE, MARE, MAPE, RMSE, RMSRE, RRMSE and t-stat should be close to zero, whereas R² should approach 1. The correlation coefficient (R²) is observed to be the highest for M23 (k_d) and M63 (K_D) among all the proposed models, which means that the values estimated by the cubic equations are the closest to the observed data.
The results of the statistical indicators show that the diffuse fraction and diffusion coefficient values estimated by the different proposed models are close to each other. Since no single model is favored by all the statistical indicators, a combined indicator that yields a comparative ranking of the proposed models is needed. We therefore used the Global Performance Indicator (GPI), which combines all the statistical indicators used (Said and Dickey 1983; Despotovic et al. 2015; Jamil and Abid 2018):

$$GPI_i = \sum_{j=1}^{9} \alpha_j \left(\tilde{y}_j - y_{ij}\right)$$

where α_j is the weight factor, equal to 1 for all indicators except the correlation coefficient (R²), for which it equals −1; ỹ_j is the median of the scaled values of indicator j; and y_ij is the scaled value of indicator j for model i. A higher GPI indicates a more accurate model and hence better estimates.
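The GPI described above can be sketched as follows. The text only says that the indicator values are scaled; min-max scaling to the range 0 to 1 across models is an assumption made here, as are the function name, the table layout and the sample numbers. Signed indicators such as MBE would presumably be passed as absolute values, which is also an assumption.

```python
from statistics import median


def gpi(table, higher_is_better=("R2",)):
    """Return {model: GPI} from {model: {indicator: value}}.

    Each indicator is min-max scaled across models (assumed scaling), then
    GPI_i = sum_j alpha_j * (median_j - y_ij), with alpha_j = -1 for the
    indicators listed in higher_is_better and +1 for all others.
    """
    models = list(table)
    indicator_names = list(next(iter(table.values())))
    scores = {m: 0.0 for m in models}
    for name in indicator_names:
        column = [table[m][name] for m in models]
        low, high = min(column), max(column)
        span = high - low
        # If every model has the same value, the indicator contributes nothing.
        scaled = [(v - low) / span if span > 0 else 0.0 for v in column]
        med = median(scaled)
        alpha = -1.0 if name in higher_is_better else 1.0
        for m, y in zip(models, scaled):
            scores[m] += alpha * (med - y)
    return scores


# Made-up indicator values for three hypothetical models.
table = {
    "A": {"RMSE": 0.02, "MAE": 0.015, "R2": 0.95},
    "B": {"RMSE": 0.03, "MAE": 0.020, "R2": 0.90},
    "C": {"RMSE": 0.05, "MAE": 0.040, "R2": 0.80},
}
scores = gpi(table)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because each term is measured against the median model, a model that sits at the median on every indicator scores zero, and positive or negative GPI values indicate better or worse than median overall.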
The GPI values and the ranking of the models are shown in Table 5. The GPI of the proposed models ranged from −4.803 to 0.309 for k_d and from −5.70 to 0.23 for K_D. The highest GPI indicates the best-performing model. From Table 5, 33% and 43% of the k_d and K_D models, respectively, attain a positive GPI, while the remaining models have a negative GPI. The maximum GPI is observed for model 23 (GPI = 0.309) among the k_d models and for model 63 (GPI = 0.230) among the K_D models. Thus, it can be inferred that the cubic model best estimates the diffuse fraction and diffusion coefficient for the study area (Figs. 4, 5).
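To illustrate what fitting a cubic model involves, the sketch below fits k_d = c0 + c1·S_t + c2·S_t² + c3·S_t³ by ordinary least squares. It is a simplification: the selected models use both S_t and K_t, and their exact functional forms (Table 2) are not reproduced in the text, so a single-predictor cubic is assumed. The function names, coefficients and data are hypothetical and synthetic.

```python
def fit_polynomial(x, y, degree=3):
    """Least-squares coefficients c[0..degree] of y ~ sum(c[p] * x**p)."""
    m = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][p] = x_i ** p.
    ata = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    aty = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(m)]
    aug = [row + [b] for row, b in zip(ata, aty)]

    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= factor * aug[col][k]

    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        known = sum(aug[r][k] * coeffs[k] for k in range(r + 1, m))
        coeffs[r] = (aug[r][m] - known) / aug[r][r]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** p for p, c in enumerate(coeffs))


# Synthetic data generated from a made-up cubic, so the fit should reproduce it.
true_coeffs = [0.9, -1.2, 0.8, -0.3]
s_t = [0.30 + 0.05 * i for i in range(12)]
k_d = [predict(true_coeffs, s) for s in s_t]

fitted = fit_polynomial(s_t, k_d, degree=3)
worst = max(abs(predict(fitted, s) - k) for s, k in zip(s_t, k_d))
print(worst)
```

Solving the normal equations directly keeps the example dependency-free, but they become ill-conditioned for higher degrees or narrow predictor ranges; a QR-based least-squares routine from a numerical library would be the more robust choice in practice.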

Conclusion
In this study, solar radiation data were used to evaluate the diffuse fraction and diffusion coefficient using sunshine ratio and clearness index as predictors at Tamanrasset station, Algeria. The results show that high values of k_d and K_D are observed from April to September, whereas high values of S_t and K_t are observed between January and December. Significant negative correlations are found for k_d-S_t, k_d-K_t, K_D-S_t and K_D-K_t. Forty models were proposed for each of the diffuse fraction and the diffusion coefficient (eighty in total), using sunshine ratio and clearness index as predictors. Based on the values of the statistical indicators and the GPI, the best models for the diffuse fraction and the diffusion coefficient are models 23 and 63, respectively.