Introduction

Renewable energy sources have attracted increasing attention in recent years. Solar energy in particular could contribute substantially to meeting the rapid growth in energy demand. In the short term, sustainable system designs can hybridize solar energy with fossil fuels to conserve existing energy resources, while in the long term solar energy can entirely replace conventional energy sources to compensate for their shortage. Fossil fuel reserves (oil and natural gas) are projected to be depleted by approximately 2042, whereas coal is expected to last beyond 2042 [1].

A preliminary assessment of the solar energy potential at a specific site is essential for selecting and designing solar energy systems (e.g., photovoltaic systems and concentrating solar collectors). However, the substantial impact of uncertainty in the solar irradiance forecast (especially of direct normal irradiance) on the output of solar power plants and on their profitability over time should be addressed. Moreover, considerable attention should be paid to acquiring hour-ahead or day-ahead forecasts of solar irradiance [2]. Accordingly, most recent studies have emphasized attaining the best forecast accuracy from high-quality solar irradiance data in order to reduce the effect of the intermittent nature of solar energy on the uncertainty in the optimal design parameters and on the errors in modeling and measurements [3,4,5].

Solar radiation that travels through the atmosphere to the earth’s surface takes various forms: direct (beam), diffuse, and reflected (scattered) radiation, depending on the distance traveled through the atmosphere, the amount of cloud, the thickness of the ozone layer, the concentration of haze in the air (water vapor, dust particles, pollutants, etc.), and the type of ground surface [6]. The most relevant component of solar radiation for concentrated solar power technologies (including parabolic trough, central receiver, linear Fresnel reflector, and parabolic dish) is the direct normal irradiance (DNI). Thus, the performance of these technologies declines dramatically with growing cloud cover, whereas photovoltaics can still generate electric power from diffuse irradiation. Therefore, the long-term evaluation of the technical and economic performance of solar energy technologies depends on the availability and accuracy of solar radiation data. To move successfully from small to large-scale solar projects, accurate solar radiation data are essential because even a small uncertainty in the measured or estimated solar radiation may jeopardize the economic feasibility of proposed solar projects [7]. Solar radiation measuring instruments (e.g., pyranometers and pyrheliometers) are used to obtain reliable solar radiation data over various periods of time [6]. However, measured data may not be available or easily accessible because of the high cost of the instruments used in measuring stations and the technical difficulty of calibrating them, especially in developing countries.

The lack of measured DNI data at most solar project sites is a challenge for researchers and practitioners in the field of solar energy applications. Although global horizontal irradiance (GHI) and diffuse horizontal irradiance (DHI) data, from which DNI values can be derived, are often available, there is still a need to model the solar resource in most cases. Consequently, most researchers in this field have formulated various models, regression equations, and empirical correlations to predict solar radiation for different time scales (e.g., hourly, daily, monthly) and from meteorological and geographic parameters. These parameters include maximum and minimum temperature, relative humidity, sunshine duration, clearness index, cloud cover, geographical site, etc. [7]. The datasets estimated from these models, regression equations, and empirical correlations require careful validation against high-quality measured datasets. For large-scale solar projects, the relationship between lower uncertainty in solar radiation data, minimal financial risk, and profitability has been discussed in [5].

Existing models and methods

The significance of solar radiation modeling is reflected in the extensive literature that develops models, regression equations, and empirical correlations to estimate solar radiation. However, the abundance of models used to obtain solar radiation datasets calls for an assessment of their validity and performance. Therefore, this section provides a comprehensive overview of the existing models and methods reported in the literature. Two categories of solar radiation models, parametric and decomposition, are used to predict the beam (direct), diffuse, and global components of irradiance from other measured or calculated quantities. The parametric (broadband) models are formulated from astronomical, atmospheric, and geographic parameters to predict solar irradiance precisely. Additionally, these models are a better choice than decomposition models when meteorological data are not obtainable [6, 8,9,10]. The first models were formulated and tested to estimate clear-sky direct and diffuse solar radiation on a horizontal surface under various climate conditions [11,12,13]. The attenuating influence of a wide range of atmospheric constituents on the DNI has also been studied. That study demonstrated that the major attenuation was caused by atmospheric constituents, molecular scattering, and water vapor absorption, in that order, while the ozone layer and CO2 had a minor effect. The tested models showed reasonable agreement for small values of the zenith angle [14, 15]. The availability of the input parameters (aerosol optical depth or Linke turbidity) and simplicity of implementation were used as criteria to select a number of clear-sky solar irradiance models and to evaluate their accuracy.
Locally measured parameters were recommended over climatic datasets to avoid underestimating the direct and global irradiance [16]. Several simple clear-sky and cloudy-sky models of global solar irradiance that do not need meteorological data as inputs have been evaluated. These models can be used to predict the global irradiance for the next few hours or possibly for the next day. In addition, the clear-sky model can be used for partially cloudy days, and the estimated total cloud amount is crucial for the cloudy-sky model [17]. Three types of analysis have been used to assess the validity, limitations, and performance of many clear-sky solar irradiance models: a study of atmospheric effects (e.g., water vapor absorption, aerosol extinction), a statistical evaluation, and a comparison with a large number of calculated and measured data [18]. The accuracy of broadband models in predicting clear-sky direct normal irradiance (DNI) has been evaluated by comparison with high-quality measurements over a wide range of carefully selected conditions. Furthermore, the uncertainty in the predicted DNI values increases markedly with air mass, and the predictions are more sensitive to errors in turbidity and precipitable water, which are the two most important inputs of the parametric models [9, 19]. An evaluation procedure consisting of 42 stages has been created to test 54 parametric models through sensitivity analysis. These models can be used to compute global and diffuse irradiance on a horizontal surface. Their input data were taken from satellite measurements, including ground meteorological data and atmospheric column-integrated data [20].
A significant review of eighteen clear-sky models assessed their performance by comparing predicted and measured values under various climate conditions. The high-quality input data were collected from five locations. The selected models can be applied to build solar datasets and solar resource maps and in large-scale applications. All models were ranked by their accuracy as determined by four statistical indicators. It was also found that the prediction of DNI is complex, that the prediction of DHI is less accurate, and that the number of model inputs may not have an obvious influence on a model’s performance and precision [21]. To select a suitable site for a concentrating solar power plant, seventeen clear-sky models have been studied to determine which one predicts direct normal irradiance most precisely. The performance and accuracy of the models were tested by comparing their predictions with the irradiance measured at a specific site using statistical accuracy indicators. The parametric models were classified into two groups: simple models with fewer than three inputs (astronomical and geographical parameters), such as ASHRAE, Meinel, and HLJ, and complex models based on various parameters (air mass, ozone, aerosols, precipitable water, and the Linke turbidity factor), such as the Bird family of models. It is worth noting that simpler models can offer more accurate DNI data than complex models; in other words, increasing the number of model inputs (e.g., atmospheric parameters) does not necessarily enhance the accuracy and performance of a model [22].

As described above, the clear-sky (parametric) models were developed to estimate irradiance in the absence of clouds. Hence, they cannot be used to predict direct normal irradiance (DNI) under cloudy conditions. Decomposition models, in contrast, fit historical experimental data with empirical correlations and are typically used to calculate direct normal radiation and diffuse radiation on a horizontal surface from global solar radiation data [23]. The availability of solar radiation at the earth’s surface is strongly influenced by cloudy sky conditions. The direct normal irradiance is attenuated significantly with increasing cloud cover and may fall to zero. In contrast, once cloud cover reaches intermediate values, the diffuse solar irradiance (sky radiation) starts to grow, reaching a maximum at high values of cloud cover or fading to zero under an overcast sky [24]. For this reason, studying the state of the sky through temporal and spatial modeling of the cloud distribution is crucial for estimating the availability of all radiation types at a specific site [25]. Various concepts of cloud detection and classification have been discussed, and various cloud classification techniques have been developed based on the instruments (ground-based or satellite-borne) used to determine the state of the sky [26,27,28].

Numerous cloud-cover-based models have been developed to estimate hourly and daily solar radiation from cloud cover data [2, 29,30,31]. The cloud-cover radiation model (CRM) is widely used to forecast hourly global solar irradiance from the cloud cover, which is measured in oktas and ranges from zero oktas (an entirely clear sky) to eight oktas (an entirely overcast sky). The CRM was developed by Kasten and Czeplak using 10 years of hourly cloud amount data [32]. Many researchers have tested the Kasten–Czeplak model (CRM) with datasets from various sites around the world and, to improve the model’s accuracy, have determined locally fitted coefficients for each location by regression analysis [25, 27, 29, 30, 32,33,34,35,36].

To obtain average hourly solar radiation values from long-term daily values, global solar radiation decomposition models can be used to transform daily values into hourly values [37]. The existing models can be divided into three groups according to their parameters, physical significance, and construction method. The first group involves time-related factors such as solar time, day length, and solar hour angle; its most widely used members are the Whillier model [38], the Liu and Jordan model [39], and the Collares-Pereira and Rabl model [40, 41]. The second group is built on a Gaussian functional form and includes Jain model 1 [42], Jain model 2 [43], the Shazly model [44], and the Baig et al. model [45]. The Newell model [46], which is modified from the Collares-Pereira and Rabl model, is the best-known model of the third group [8, 36, 47].

Other empirical models have been developed to estimate global and diffuse solar irradiation by correlating the clearness index, the diffuse fraction, and meteorological parameters using measured data from selected sites. The meteorological parameters include sunshine duration, cloud cover, minimum and maximum temperature, relative humidity, and geographical location.

The clearness index is a random parameter that captures the stochastic meteorological effects (e.g., atmospheric aerosols, cloudiness, temperature) on solar radiation for a given time of day, season of the year, and geographical site [48]. It is sensitive both to short-term effects (atmospheric influences, which are described statistically) and to long-term effects (the Earth’s movement, which is described astronomically) [49]. In general, it is the ratio of the global solar irradiance on a terrestrial horizontal surface (a stochastic quantity) to the global solar irradiance on an extraterrestrial horizontal surface (a deterministic quantity) for the same time and site [6, 50]. In this context, either long-term solar radiation data (daily or monthly average daily) or short-term solar radiation data (hourly or monthly average hourly) can be used to estimate the clearness index [6]. The clearness index and the diffuse fraction are essential for evaluating the impact of clouds on extraterrestrial radiation. Therefore, both should be treated as random variables, and probability functions (PDF and CDF) should be constructed from the statistical distribution of their past occurrence to predict their future values within a precise range. On this basis, several investigators have used probability functions, which depend on local conditions, to model the clearness index in order to predict terrestrial solar radiation and to classify the level of sky clearness [10, 39, 49, 51,52,53,54,55].

Sunshine duration is another key indicator of sky conditions, along with the clearness index and cloud cover. The sunshine duration fraction is the ratio of the actual (bright) hours of sunshine (a stochastic value) to the average daylight hours (a deterministic value). When the sky is completely cloudless, the bright sunshine hours equal the average daylight hours, the ratio is 1, and most of the radiation received by solar energy systems is direct normal irradiance (DNI). In contrast, on a completely or partially cloudy day, the bright sunshine hours may fall to zero, so diffuse radiation dominates the operation of solar energy systems while scattered thin clouds spread across the sky [36]. The highest diffuse radiation values are typically obtained when the sunshine duration fraction is approximately 0.3–0.5 [23]. However, the uncertainty caused by scattered clouds and their movement remains a major obstacle to estimating the nature and quantity of the radiation received at the earth’s surface [56]. Estimating sunshine duration from cloud cover through an empirical correlation is useful for calculating global solar radiation on a horizontal surface [57]. In the same context, a simple theoretical model of the relationship between sunshine duration and cloud cover fraction has been presented; the predicted cloud cover fraction can then be used to calculate global solar radiation on a horizontal surface (GHI) under different sky conditions [56].

The Angstrom–Prescott correlation, the simple, linear, and pioneering relationship between the clearness index and relative sunshine duration, was established by Angstrom and later modified by Prescott [58, 59]. Over recent decades, considerable effort has been devoted to evaluating and interpreting the Angstrom–Prescott equation [60]. Many researchers have proposed new formulations (linear or non-linear) of the Angstrom–Prescott equation that relate the clearness index to sunshine fraction [8, 36, 47, 57, 61,62,63,64,65,66,67,68,69], ambient temperature [8, 47, 62, 67, 69, 70], relative humidity [8, 47, 69], precipitation [47, 62, 71, 72], cloud cover [47, 57, 61, 73], and multiple parameters [47, 60, 67, 69, 74].

The performance evaluation of solar energy systems (solar photovoltaics and solar thermal applications) and the selection of their optimal design depend on the availability of solar radiation data and its components. Diffuse radiation is a significant component, alongside direct normal irradiance, for assessing solar radiation quality. Hence, numerous empirical correlations have been developed to predict diffuse radiation or monthly average daily diffuse solar radiation from clearness index, relative sunshine duration, and cloud cover data [10]. The first correlation was developed by Liu and Jordan [39] to estimate hourly diffuse radiation on a horizontal surface from global solar radiation; based on the same concept, researchers have since modified many correlations using large amounts of data from different locations over periods of years [75,76,77,78,79]. Other models calculate monthly average diffuse solar radiation by using regression analysis to correlate the diffuse fraction with the clearness index and relative sunshine duration [39, 80,81,82]. To enhance the accuracy of models for estimating diffuse solar radiation or monthly average daily diffuse solar radiation, several researchers have demonstrated the importance of adding more variables such as ambient temperature, relative humidity, and cloud cover [83]. The prediction of hourly, daily, and monthly global solar radiation and its components on inclined surfaces was discussed in [48, 84, 85] because the maximum amount of incident solar radiation is received on inclined surfaces.

Despite the abundance of models and evaluation methods presented in the literature over the past few decades, the current literature rarely offers methodologies that researchers, engineers, and practitioners who design solar energy systems can easily follow to create solar radiation datasets and to evaluate solar energy availability under different sky conditions for the various radiation components. Consequently, the aim of this study is to develop two hierarchical calculation methodologies for estimating hourly solar irradiance using various models, empirical correlations, and regression equations. Specifically, hourly direct normal irradiance data are used for designing concentrating solar collectors. Additionally, a preliminary evaluation of the solar energy potential in the selected region is carried out through a comprehensive analysis of the solar irradiance data and the clearness index to support a sound decision on the feasibility of using solar energy technologies. The proposed approaches for estimating solar data are validated and evaluated by comparing their results with measured solar data using various statistical indicators.

Theoretical analysis

The design and operation of solar energy technologies such as photovoltaic systems and concentrated solar thermal energy systems require high-quality solar irradiance data for a specific site at any time of the day and year in order to evaluate the long-term techno-economic performance of these technologies. Thus, the existing models, empirical correlations, and regression equations discussed in detail in “Existing models and methods” are investigated in this work, along with some newly developed regression equations, to predict different solar radiation components based on the time period and on meteorological and geographic parameters.

Estimation of hourly direct normal irradiance

Parametric (broadband) models

A large number of parametric models are selected and then tested for goodness of fit using statistical indicators. These existing models, which are formulated from astronomical, atmospheric, and geographic parameters, are used to predict direct normal irradiance (\(I_{\text{DNI}}\)) under clear-sky conditions. The performance of 22 models is assessed by comparing their results with high-quality measured datasets through statistical indicators. These models are summarized in Table 1.

Table 1 Summary of selected parametric models

Based on the description above, the parametric models can be classified into a simple group and a complex group. The simple models, such as Meinel, Laue, and Kasten and Czeplak, use the zenith angle together with a few atmospheric parameters such as temperature, pressure, and relative humidity, whereas the complex models, such as Davies and Hay and Hoyt (Iqbal B), include various atmospheric input parameters such as aerosols, ozone, and precipitable water. Table 2 summarizes the astronomical and atmospheric parameters used to develop the models.

Table 2 Summary of astronomical and atmospheric parameters
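
As an illustration of the simple group, the following minimal sketch evaluates a Meinel-type clear-sky model driven only by the zenith angle and the day of the year. Table 1 is not reproduced in this text, so the exact form used in this work is an assumption: the sketch uses the commonly quoted expression \(I_{\text{DNI}} = I_{0} \cdot 0.7^{AM^{0.678}}\) with the Kasten–Young relative air mass. The function names and the solar constant of 1367 W/m² are illustrative choices, not values taken from the paper.

```python
import math

SOLAR_CONSTANT = 1367.0  # W/m^2, assumed value (not taken from the paper)


def extraterrestrial_normal_irradiance(day_of_year):
    """Extraterrestrial irradiance on a plane normal to the beam (W/m^2)."""
    eccentricity = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    return SOLAR_CONSTANT * eccentricity


def relative_air_mass(zenith_deg):
    """Kasten-Young relative optical air mass for a zenith angle in degrees."""
    cos_z = math.cos(math.radians(zenith_deg))
    return 1.0 / (cos_z + 0.50572 * (96.07995 - zenith_deg) ** -1.6364)


def meinel_dni(zenith_deg, day_of_year):
    """Clear-sky DNI (W/m^2) from a Meinel-type model: I0 * 0.7 ** (AM ** 0.678)."""
    if zenith_deg >= 90.0:
        return 0.0  # sun at or below the horizon
    i0 = extraterrestrial_normal_irradiance(day_of_year)
    return i0 * 0.7 ** (relative_air_mass(zenith_deg) ** 0.678)


if __name__ == "__main__":
    # Hypothetical example: clear-sky DNI on day 172 for three zenith angles.
    for zenith in (10.0, 40.0, 70.0):
        print(zenith, round(meinel_dni(zenith, 172), 1))
```

A model of this kind needs no atmospheric measurements, which is why it falls in the simple group; the complex models add turbidity, ozone, and precipitable water terms on top of the same air mass dependence.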

Cloud cover model (CRM)

To predict direct normal irradiance (DNI) under different sky conditions, the cloud-cover radiation model (CRM), a regression-type model described in detail in “Existing models and methods”, can be used. The performance of this model is evaluated against the dataset from the selected site. The first step in determining DNI with the Kasten–Czeplak model (CRM) is to estimate the hourly global solar radiation \((I_{{{\text{G}}_{\text{cs}} }} )\) on a horizontal surface under a cloudless sky. This value is used together with the cloud cover (measured in oktas) to find the hourly global radiation \((I_{{{\text{G}}_{\text{cc}} }} )\) on a horizontal surface under cloudy conditions. Several instruments (ground-based or satellite-borne) are used to determine the sky conditions. Next, the hourly diffuse radiation \((I_{\text{d}} )\) is determined, from which the hourly DNI \((I_{\text{DNI}} )\) under different sky conditions is obtained using the formulas summarized in Table 3.

Table 3 Cloud-cover radiation model (CRM)
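
The sequence of steps described above can be sketched as follows. Because Table 3 is not reproduced in this text, the coefficients below (910, 30, 0.75, 3.4, 0.3, 0.7) are the widely quoted values of the original Kasten–Czeplak fit and should be read as an assumption about its content; studies for other sites usually refit them locally. The function name and the return layout are illustrative.

```python
import math


def crm_hourly_dni(elevation_deg, cloud_okta):
    """Hedged sketch of the Kasten-Czeplak cloud-cover radiation model.

    elevation_deg: solar elevation angle in degrees.
    cloud_okta:    total cloud cover in oktas (0 = clear, 8 = overcast).
    Returns (global_clear, global_cloudy, diffuse, dni) in W/m^2.
    The coefficients are assumed original-fit values, not site-specific ones.
    """
    if not 0 <= cloud_okta <= 8:
        raise ValueError("cloud cover must be between 0 and 8 oktas")
    sin_alpha = math.sin(math.radians(elevation_deg))
    if sin_alpha <= 0.0:
        return 0.0, 0.0, 0.0, 0.0  # sun below the horizon
    n = cloud_okta / 8.0
    # Step 1: global irradiance on a horizontal surface under a cloudless sky.
    global_clear = max(910.0 * sin_alpha - 30.0, 0.0)
    # Step 2: global irradiance under the observed cloud cover.
    global_cloudy = global_clear * (1.0 - 0.75 * n ** 3.4)
    # Step 3: diffuse part of the cloudy-sky global irradiance.
    diffuse = global_cloudy * (0.3 + 0.7 * n ** 2)
    # Step 4: beam irradiance projected back onto the beam direction.
    dni = (global_cloudy - diffuse) / sin_alpha
    return global_clear, global_cloudy, diffuse, dni


if __name__ == "__main__":
    # Hypothetical example: sun 50 degrees above the horizon, 0 to 8 oktas.
    for okta in (0, 4, 8):
        print(okta, [round(v, 1) for v in crm_hourly_dni(50.0, okta)])
```

With these coefficients the diffuse fraction rises from 0.3 under a clear sky to 1 under an overcast sky, so the estimated DNI falls to zero at eight oktas, consistent with the behavior described for increasing cloud cover.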

A hierarchical calculation methodology

Accordingly, the hourly direct normal irradiance under various sky conditions and for different geographical locations can be estimated from the preceding equations, which helps compensate for the lack of solar datasets for a given site. The availability of a DNI dataset is essential to the design and operation of concentrated solar power technologies, including central receiver, linear Fresnel, dish Stirling, and parabolic trough collectors, particularly given that these technologies are expected to contribute about 50.34% of total renewable energy production by 2030 [22]. The hierarchical methodology used in this work to predict DNI values, which tests the goodness of fit of the selected models using statistical indicators and high-quality measured datasets, is summarized in Fig. 1.

Fig. 1 A hierarchical methodology of predicting DNI

Estimation of monthly average hourly direct solar irradiance from daily data

Daily global solar radiation (decomposition models)

Decomposition models can be used to transform daily values (long-term data) of solar radiation into hourly values (short-term data). Two frequently used correlations were chosen for this purpose. The Collares-Pereira and Rabl correlation gives the ratio of monthly average hourly global irradiance \((\bar{I}_{\text{G}} )\) to monthly average daily global irradiance \((\bar{H}_{\text{G}} )\), whereas the Liu and Jordan correlation gives the ratio of monthly average hourly diffuse irradiance \((\bar{I}_{\text{d}} )\) to monthly average daily diffuse irradiance \((\bar{H}_{\text{d}} )\) [6], as shown in Table 4.

Table 4 Two decomposition models
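
A minimal sketch of the two ratios is given below. Table 4 is not reproduced in this text, so the expressions are the standard textbook forms of the Collares-Pereira and Rabl and the Liu and Jordan correlations, assumed here to match it; all function names are illustrative. The hour angle is taken at the midpoint of each hour.

```python
import math


def sunset_hour_angle_deg(latitude_deg, declination_deg):
    """Sunset hour angle in degrees from latitude and solar declination."""
    x = -math.tan(math.radians(latitude_deg)) * math.tan(math.radians(declination_deg))
    return math.degrees(math.acos(max(-1.0, min(1.0, x))))


def _shape_term(hour_angle_deg, sunset_deg):
    """Term shared by both correlations (ratio of hourly to daily extraterrestrial)."""
    w, ws = math.radians(hour_angle_deg), math.radians(sunset_deg)
    return (math.pi / 24.0) * (math.cos(w) - math.cos(ws)) / (math.sin(ws) - ws * math.cos(ws))


def liu_jordan_rd(hour_angle_deg, sunset_deg):
    """Ratio of hourly to daily diffuse irradiation (Liu and Jordan form)."""
    if sunset_deg <= 0.0:
        return 0.0  # no daylight
    return max(_shape_term(hour_angle_deg, sunset_deg), 0.0)


def collares_pereira_rabl_rt(hour_angle_deg, sunset_deg):
    """Ratio of hourly to daily global irradiation (Collares-Pereira and Rabl form)."""
    if sunset_deg <= 0.0:
        return 0.0  # no daylight
    shift = math.radians(sunset_deg) - math.pi / 3.0
    a = 0.409 + 0.5016 * math.sin(shift)
    b = 0.6609 - 0.4767 * math.sin(shift)
    w = math.radians(hour_angle_deg)
    return max((a + b * math.cos(w)) * _shape_term(hour_angle_deg, sunset_deg), 0.0)


if __name__ == "__main__":
    # Hypothetical example: split a daily global value of 20 MJ/m^2 at the equinox.
    ws = sunset_hour_angle_deg(29.42, 0.0)
    for w in (-37.5, -7.5, 7.5, 37.5):
        print(w, round(20.0 * collares_pereira_rabl_rt(w, ws), 3))
```

Multiplying the daily global value by the first ratio and the daily diffuse value by the second gives the hourly global and hourly diffuse values that feed the final DNI step.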

Angstrom–Prescott correlation

Several formulations (linear and non-linear) of the Angstrom–Prescott correlation were selected to estimate the monthly average daily global solar radiation on a horizontal surface \((\bar{H}_{\text{G}} )\) by relating the clearness index \((\bar{K}_{\text{T}} )\) to sunshine fraction (\(\frac{{\bar{S}}}{{\bar{S}_{\text{o}} }}\)), ambient temperature (T), relative humidity (R), latitude (L), site elevation (H), cloud cover, and multiple parameters. Four regression equations developed by modifying the Angstrom–Prescott correlation are used, as given in Table 5.

Table 5 Regression equations of Angstrom–Prescott model
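
The linear form of the correlation, \(\bar{K}_{\text{T}} = a + b\,\bar{S}/\bar{S}_{\text{o}}\), can be sketched as below. The four regression equations of Table 5 are not reproduced in this text, so the default coefficients a = 0.25 and b = 0.50 are only generic placeholders that are often used when no local calibration exists; they are not the coefficients of this work. The fitting helper shows, under the same caveat, how locally fitted coefficients could be obtained by ordinary least squares. All names are illustrative.

```python
def angstrom_prescott_daily_global(h_extraterrestrial, sunshine_hours,
                                   day_length_hours, a=0.25, b=0.50):
    """Linear Angstrom-Prescott estimate of daily global irradiation.

    Returns (H_G, K_T), where H_G has the same units as h_extraterrestrial.
    The defaults for a and b are placeholders, not values from Table 5.
    """
    if day_length_hours <= 0.0:
        raise ValueError("day length must be positive")
    fraction = min(max(sunshine_hours / day_length_hours, 0.0), 1.0)
    clearness = a + b * fraction
    return clearness * h_extraterrestrial, clearness


def fit_angstrom_prescott(sunshine_fractions, clearness_indices):
    """Ordinary least-squares fit of K_T = a + b * (S / S_o); returns (a, b)."""
    n = len(sunshine_fractions)
    if n < 2 or n != len(clearness_indices):
        raise ValueError("need two equally long series with at least two points")
    mean_x = sum(sunshine_fractions) / n
    mean_y = sum(clearness_indices) / n
    sxx = sum((x - mean_x) ** 2 for x in sunshine_fractions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sunshine_fractions, clearness_indices))
    b = sxy / sxx
    return mean_y - b * mean_x, b


if __name__ == "__main__":
    # Hypothetical monthly values: H_o = 30 MJ/m^2, 6 bright hours, 12 h day length.
    print(angstrom_prescott_daily_global(30.0, 6.0, 12.0))
```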

Empirical models

Decomposition models were developed to estimate hourly global and diffuse irradiance, which play an essential role in solar energy engineering applications. Such models are formulated from correlations between the diffuse fraction \(\left( {\frac{{\bar{H}_{\text{d}} }}{{\bar{H}_{\text{G}} }}} \right)\), the clearness index \(\left( {\frac{{\bar{H}_{\text{G}} }}{{\bar{H}_{\text{o}} }}} \right)\), and the sunshine fraction \(\left( {\frac{{\bar{S}}}{{\bar{S}_{\text{o}} }}} \right)\). Four representative models, each expressed as the ratio of diffuse \((\bar{H}_{\text{d}} )\) to global irradiance \((\bar{H}_{\text{G}} )\) on a horizontal surface, were selected. They are described in Table 6.

Table 6 Summary of empirical models

A hierarchical calculation methodology

Calculating the monthly average hourly direct solar irradiance \((\bar{I}_{\text{DNI}} )\) from daily data requires a hierarchical calculation methodology consisting of multiple sequential steps, as described in Fig. 2. The first step in the proposed approach is to estimate the geographical and astronomical parameters \((L, \theta_{\delta } , \theta_{\text{hs}} , T,R,H)\) for the selected site and period of time using Eqs. (35, 39). To estimate the monthly average daily global irradiance on a horizontal surface (\(\bar{H}_{\text{G}}\)) from the equations of Table 5 and the monthly average daily diffuse irradiance on a horizontal surface (\(\bar{H}_{\text{d}}\)) from the equations of Table 6, the monthly average daily extraterrestrial irradiance \(H_{\text{o}}\) (from Eq. 48) and the maximum possible monthly average day length \(S_{\text{o}}\) (from Eq. 49) must first be determined. Next, the daily irradiance data are transformed into hourly irradiance data by using Eq. (39) to estimate the monthly average hourly global irradiance on a horizontal surface (\(\bar{I}_{\text{G}}\)) and Eq. (40) to estimate the monthly average hourly diffuse irradiance on a horizontal surface (\(\bar{I}_{\text{d}}\)). Once the values of \(\bar{I}_{\text{G}}\) and \(\bar{I}_{\text{d}}\) are obtained, the monthly average hourly direct solar irradiance \((\bar{I}_{\text{DNI}} )\) can be estimated from Eqs. (41) and (42). Finally, to demonstrate the capability of the proposed methodology and equations, statistical indicators are used to compare the estimated irradiance values with measured irradiance datasets.

Fig. 2 A hierarchical methodology of predicting monthly average hourly direct solar irradiance
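
The astronomical inputs and the last step of this methodology can be sketched as follows. Equations (41), (42), (48), and (49) are not reproduced in this text, so the sketch assumes their standard forms: Cooper's declination formula, the usual expressions for day length and daily extraterrestrial irradiation on a horizontal surface, and the beam component obtained as the difference between global and diffuse irradiance divided by the cosine of the zenith angle. The solar constant of 1367 W/m² and all function names are illustrative.

```python
import math

SOLAR_CONSTANT = 1367.0  # W/m^2, assumed value (not taken from the paper)


def declination_deg(day_of_year):
    """Solar declination in degrees (Cooper's approximation)."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def sunset_hour_angle_deg(latitude_deg, day_of_year):
    """Sunset hour angle in degrees for a given latitude and day."""
    x = -math.tan(math.radians(latitude_deg)) * math.tan(
        math.radians(declination_deg(day_of_year)))
    return math.degrees(math.acos(max(-1.0, min(1.0, x))))


def day_length_hours(latitude_deg, day_of_year):
    """Maximum possible sunshine duration S_o in hours."""
    return 2.0 * sunset_hour_angle_deg(latitude_deg, day_of_year) / 15.0


def daily_extraterrestrial_mj(latitude_deg, day_of_year):
    """Daily extraterrestrial irradiation H_o on a horizontal surface (MJ/m^2)."""
    phi = math.radians(latitude_deg)
    delta = math.radians(declination_deg(day_of_year))
    ws = math.radians(sunset_hour_angle_deg(latitude_deg, day_of_year))
    eccentricity = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    geometry = (math.cos(phi) * math.cos(delta) * math.sin(ws)
                + ws * math.sin(phi) * math.sin(delta))
    return (24.0 * 3600.0 / math.pi) * SOLAR_CONSTANT * eccentricity * geometry / 1e6


def direct_normal_from_components(i_global, i_diffuse, zenith_deg):
    """DNI from hourly global and diffuse horizontal irradiance (same units)."""
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z <= 0.0:
        return 0.0  # sun at or below the horizon
    return max(i_global - i_diffuse, 0.0) / cos_z


if __name__ == "__main__":
    # Site latitude from the case study; day 172 is an arbitrary example day.
    print(round(day_length_hours(29.42, 172), 2),
          round(daily_extraterrestrial_mj(29.42, 172), 2))
```

In the full chain, the daily values estimated from Tables 5 and 6 would sit between these two ends: H_o and S_o feed the regression equations, the decomposition ratios convert the daily results to hourly values, and the last function converts the hourly pair into DNI.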

Site description and data collection

To validate the proposed methodologies and selected models for estimating reliable, high-quality solar radiation data for different sites in Texas or elsewhere in the world, the city of San Antonio (29.42° N, 98.49° W) was chosen as a case study, as depicted in Fig. 3 [87]. San Antonio is one of the significant hotspots in the United States owing to various water–energy–food nexus activities [88] such as shale oil and gas production [89,90,91] and agricultural production [92, 93]. The solar data for San Antonio were obtained from the National Solar Radiation Data Base (NSRDB) for the period 1991–2010 and comprise hourly global solar irradiance, hourly direct solar irradiance, hourly diffuse solar irradiance, hourly solar incidence angle, hourly dry-bulb temperature, hourly wet-bulb temperature, and relative humidity.

Fig. 3 Location map of the case study in Texas [87]

Statistical methods of model evaluation

The performance of the proposed methodologies and selected models was tested by comparing their estimates with measured data using various statistical indicators. For this purpose, six statistical indicators were applied: mean bias error (MBE), root mean square error (RMSE), mean absolute percentage error (MAPE), coefficient of determination (R2), the t statistic (tstat), and the percentage error (e %), as given in Table 7.

Table 7 Statistical indicators
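
A minimal sketch of several of these indicators is given below. Table 7 is not reproduced in this text, so the definitions are the commonly used ones and are assumed to match it; in particular, the t statistic is taken as \(\sqrt{(n-1)\,\text{MBE}^2/(\text{RMSE}^2-\text{MBE}^2)}\), and R2 is computed as one minus the ratio of the residual to the total sum of squares. The function name and the returned dictionary keys are illustrative.

```python
import math


def evaluation_metrics(estimated, measured):
    """MBE, RMSE, MAPE (%), R2, and t statistic for two equally long series.

    Assumed common definitions; measured values must be non-zero for MAPE.
    """
    n = len(measured)
    if n < 2 or n != len(estimated):
        raise ValueError("need two equally long series with at least two points")
    errors = [e - m for e, m in zip(estimated, measured)]
    mbe = sum(errors) / n
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mape = 100.0 / n * sum(abs(d / m) for d, m in zip(errors, measured))
    mean_measured = sum(measured) / n
    ss_res = sum(d * d for d in errors)
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    spread = rmse ** 2 - mbe ** 2
    if spread > 0.0:
        t_stat = math.sqrt((n - 1) * mbe ** 2 / spread)
    else:
        # All errors are identical: no scatter around the bias.
        t_stat = 0.0 if mbe == 0.0 else math.inf
    return {"MBE": mbe, "RMSE": rmse, "MAPE": mape, "R2": r2, "t": t_stat}


if __name__ == "__main__":
    # Hypothetical hourly DNI values in W/m^2, not data from the paper.
    print(evaluation_metrics([110.0, 190.0, 310.0], [100.0, 200.0, 300.0]))
```

A smaller t statistic indicates better agreement; it is normally compared with the critical value of the Student t distribution for n - 1 degrees of freedom at the chosen confidence level.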

Results and discussion

In this study, the monthly average daily global irradiance on a horizontal surface measured in San Antonio, Texas, during 1991–2010 was analyzed to calculate the monthly average clearness index (\(\bar{K}_{\text{T}}\)). This index is the ratio of the monthly average daily total radiation on a terrestrial horizontal surface (\(\bar{H}\)) to the monthly average daily total radiation on an extraterrestrial horizontal surface (\(\bar{H}_{\text{o}}\)), as defined in Eq. (47). The values of \(\bar{K}_{\text{T}}\) calculated for 1991–2010 were compared with the values provided by the Solar Energy Information Data Bank (SEIDB) [94] for 1952–1975, and the two show reasonable agreement, as shown in Fig. 4. Similarly, the monthly average hourly clearness index (\(k_{\text{t}}\)), which is the ratio of the hourly global solar irradiance on a horizontal surface (I) to the hourly extraterrestrial solar irradiance on a horizontal surface (Io), as given in Eq. (60), was calculated and is reported in Table 8.

Fig. 4 Monthly average clearness index

Table 8 Monthly average hourly and daily values for the clearness index
$$k_{\text{t}} = \frac{I}{{I_{\text{o}} }}$$
(60)
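
Equation (60) can be evaluated with a short sketch such as the one below. It assumes that the hourly extraterrestrial irradiance on a horizontal surface is approximated by its mid-hour value, \(I_{\text{o}} \approx G_{\text{sc}} E_{0} \cos \theta_{z}\), with a solar constant of 1367 W/m² and the usual eccentricity correction; these choices and the function names are illustrative and are not taken from the paper.

```python
import math

SOLAR_CONSTANT = 1367.0  # W/m^2, assumed value (not taken from the paper)


def hourly_extraterrestrial_horizontal(day_of_year, zenith_deg):
    """Mid-hour extraterrestrial irradiance on a horizontal surface (W/m^2)."""
    eccentricity = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    return max(SOLAR_CONSTANT * eccentricity * math.cos(math.radians(zenith_deg)), 0.0)


def hourly_clearness_index(global_horizontal, day_of_year, zenith_deg):
    """k_t = I / I_o; returns None when the sun is at or below the horizon."""
    i_o = hourly_extraterrestrial_horizontal(day_of_year, zenith_deg)
    if i_o <= 0.0:
        return None
    return global_horizontal / i_o


if __name__ == "__main__":
    # Hypothetical reading: 400 W/m^2 on day 172 with a 60 degree zenith angle.
    print(hourly_clearness_index(400.0, 172, 60.0))
```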

The daily clearness index can be used to partition the days of the year according to the sky condition (sunny, partly cloudy, or cloudy) that governs the transmission of extraterrestrial irradiance to the earth’s surface at the chosen site, as shown in Fig. 5.

Fig. 5 Monthly average daily global radiation according to the sky condition

In addition, solar irradiance is subject to atmospheric attenuation (absorption and diffusion) as it passes through the earth’s atmosphere, owing to air pollution, cloudy conditions, and other influencing parameters. Therefore, the hourly clearness index (\(k_{\text{t}}\)), which is treated as a stochastic parameter because it depends on the time of year, season, climatic conditions, and geographic site, can be used to predict the influence of these parameters by calculating the average daily sunshine (bright) hours based on the following classification of the clearness index level:

$${\text{Cloudy}}: k_{\text{t}} < 0.3,$$
$${\text{Partly cloudy}}: 0.3 \le k_{\text{t}} \le 0.5,$$
$${\text{Sunny}}: k_{\text{t}} > 0.5.$$
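The classification above can be sketched in a few lines of Python. The thresholds are the ones stated in the text; the function names and the sample values are hypothetical and serve only to show how the shares of each sky condition could be tallied.

```python
from collections import Counter

SKY_LABELS = ("sunny", "partly cloudy", "cloudy")


def classify_sky(kt):
    """Map a clearness index value to a sky condition using the stated thresholds."""
    if kt < 0.3:
        return "cloudy"
    if kt <= 0.5:
        return "partly cloudy"
    return "sunny"


def sky_shares(kt_values):
    """Percentage of samples falling in each sky condition."""
    counts = Counter(classify_sky(kt) for kt in kt_values)
    total = len(kt_values)
    return {label: 100.0 * counts[label] / total for label in SKY_LABELS}


if __name__ == "__main__":
    sample = [0.2, 0.4, 0.6, 0.7]  # made-up values, not data from the study
    print(sky_shares(sample))
```

Applied to hourly values for the daytime hours of a month, the same tally would give monthly percentages of the kind discussed next.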

The analysis of the monthly average hourly clearness index using this classification shows that more than 80% of the days can be classified as either sunny or partly cloudy and less than 20% as cloudy. It was also noted that the monthly percentage of sunny daytime hours exceeds 40% from April through September, while the percentage of cloudy daytime hours does not exceed about 20%, as shown in Fig. 6.

Fig. 6 Monthly sky conditions of San Antonio, Texas, during daytime hours

It is apparent from the above analysis of the irradiance data and the clearness index that the selected region is characterized by a relatively high monthly average percentage of sunny and partly cloudy days, which can exceed 80% throughout the year. Furthermore, the monthly average percentage of sunny daytime hours exceeds 50% in the period June–October, along with a relatively high clearness index (\(k_{\text{t}}\) > 0.5). Consequently, the San Antonio region in Texas is unequivocally amenable to harnessing solar energy as the prime source of energy by utilizing concentrating and non-concentrating solar energy systems.

In addition to the measured solar irradiance data, implementing the proposed methodologies and models requires the average daily sunshine hours, the average day length, the ambient temperature, and the relative humidity, as given in Table 9.

Table 9 Ambient temperature, relative humidity and daily sunshine ratio for San Antonio region

The performance of the 22 selected parametric models was tested by comparing their estimates with measured data. The results of applying the clear-sky models to specific days in each of the 12 months are shown in Fig. 7a–l. The estimated values of hourly direct normal irradiance from most models are in favorable agreement with the measured values for all months of the year. However, evaluating the accuracy and quality of the models' performance requires statistical tests to select the most precise models under the San Antonio climate conditions.

Fig. 7 Measured and estimated DNI by 22 models for a January, b February, c March, d April, e May, f June, g July, h August, i September, j October, k November, l December

The results of testing the performance of the 22 parametric models with statistical indicators are tabulated in Appendix 1 (Tables 11–22). In addition to more complicated models that involve a large number of atmospheric parameters, such as the Davies–Hay and Hoyt (Iqbal B) models, some simpler models such as Meinel and Laue have shown good fit accuracy for all months of the year. The models can also be classified into two groups based on their performance during the summer and winter months. The first group, which includes simple models with few parameters (fewer than three geographic and astronomical parameters) such as Meinel, Laue, Haurwitz, Berger–Duffie, ABCG, Kasten–Czeplak, Robledo–Sole, ASHRAE, Kumer and HLJ, can provide relatively accurate DNI values. The second group, which comprises more sophisticated (complex) models such as Bird, Iqbal C, METSTAT, Modified Iqbal C, CSR, Atwater–Ball, ESRA, Hoyt (Iqbal B), Heliosat-1, Davies–Hay and Iqbal A, has shown greater accuracy in estimating DNI values during the winter months (October–March) than during the summer months (April–September). Thus, the precise DNI values that are essential for selecting a proper location for solar energy conversion systems and for calculating the amount of solar irradiance harvested at the earth's surface may be estimated using simpler parametric models.
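To make the kind of comparison behind such tables concrete, the sketch below computes three commonly used indicators (mean bias error, root mean square error, and the coefficient of determination) for a set of estimated and measured values. This section does not list the indicators actually used in the study, so the choice of these three, their exact definitions, and the function names are assumptions.

```python
import math


def mbe(estimated, measured):
    """Mean bias error: positive values indicate overestimation."""
    n = len(measured)
    return sum(e - m for e, m in zip(estimated, measured)) / n


def rmse(estimated, measured):
    """Root mean square error."""
    n = len(measured)
    return math.sqrt(sum((e - m) ** 2 for e, m in zip(estimated, measured)) / n)


def r_squared(estimated, measured):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for e, m in zip(estimated, measured))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up hourly DNI values in W/m^2, for demonstration only
    measured = [420.0, 610.0, 760.0, 810.0, 700.0]
    estimated = [400.0, 630.0, 750.0, 830.0, 690.0]
    print(mbe(estimated, measured), rmse(estimated, measured),
          r_squared(estimated, measured))
```

Other definitions of \(R^{2}\), such as the squared correlation coefficient, give different numbers for biased models, so reported values should be compared only when the definition is known.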

The impact of cloud amount on the estimation of solar irradiance in a specific month (November was chosen as an example) under the climate conditions of San Antonio, Texas, was studied using the cloud-cover radiation model (CRM). The cloud amount used in this model is expressed in oktas, ranging from 0 to 8, and the regression coefficients of the model were obtained from [85]. As shown in Fig. 8, cloud amount significantly reduces the intensity of global solar irradiance, and DNI in particular, whereas the diffuse share of the irradiance increases with cloud amount until the direct component reaches zero under an overcast sky.
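The dependence on cloud amount can be sketched with the Kasten–Czeplak form on which cloud-cover radiation models of this kind are commonly based. The coefficients below are the original Kasten–Czeplak values and are used purely for illustration; the study takes its coefficients from [85], which are not reproduced here, so the numbers produced by this sketch will not match Fig. 8. The function names are hypothetical.

```python
import math


def crm_global(solar_elevation_deg, oktas, a=910.0, b=30.0, c=0.75, d=3.4):
    """Global horizontal irradiance in W/m^2 for a cloud amount in oktas (0 to 8).

    Clear-sky value: a * sin(elevation) - b
    Cloud reduction: 1 - c * (N / 8) ** d
    Default coefficients are illustrative, not those of the study.
    """
    if not 0 <= oktas <= 8:
        raise ValueError("cloud amount must be between 0 and 8 oktas")
    clear_sky = max(a * math.sin(math.radians(solar_elevation_deg)) - b, 0.0)
    return clear_sky * (1.0 - c * (oktas / 8.0) ** d)


def crm_diffuse_fraction(oktas):
    """Diffuse share of global irradiance: 0.3 + 0.7 * (N / 8) ** 2."""
    if not 0 <= oktas <= 8:
        raise ValueError("cloud amount must be between 0 and 8 oktas")
    return 0.3 + 0.7 * (oktas / 8.0) ** 2


if __name__ == "__main__":
    for n in range(9):
        g = crm_global(45.0, n)
        beam_horizontal = g * (1.0 - crm_diffuse_fraction(n))
        print(n, round(g, 1), round(beam_horizontal, 1))
```

In this form the diffuse fraction reaches 1 at 8 oktas, so the direct component on the horizontal surface vanishes under an overcast sky.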

Fig. 8 Estimated global solar irradiance values (by cloud-cover radiation model) for San Antonio, Texas

To demonstrate the capability of the hierarchical calculation methodology proposed in “A hierarchical calculation methodology” to estimate DNI precisely, four formulations of the Angstrom–Prescott correlation were developed, with their coefficients determined through regression analysis, as shown in Table 10. The accuracy of the correlations was tested using statistical indicators by comparing the estimated values of the monthly average daily global solar radiation on a horizontal surface with measured data, namely the 30-year monthly average daily solar radiation for San Antonio, Texas, provided by [95, 96], the National Solar Radiation Data Base (NSRDB), and the Solar Energy Information Data Bank (SEIDB) [94]; the results are given in Table 10. Figures 9, 10, 11, 12 and 13 show that the values estimated by the correlations are in good agreement with the measured data from the different sources.
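A minimal sketch of the regression step for the linear formulation is given below. It assumes the usual linear Angstrom–Prescott form, in which the clearness index is regressed on the sunshine ratio as \(H/H_{\text{o}} = a + b\,(S/S_{\text{o}})\); the function names and the sample data are hypothetical, and the coefficients it returns are unrelated to those in Table 10.

```python
def fit_linear(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def estimate_global(h0, sunshine_ratio, a, b):
    """Estimated global radiation H from H0 and the sunshine ratio S/S0."""
    return h0 * (a + b * sunshine_ratio)


if __name__ == "__main__":
    # Made-up monthly sunshine ratios and clearness indices, for demonstration only
    s_ratio = [0.48, 0.52, 0.58, 0.61, 0.66, 0.72]
    k_t = [0.49, 0.51, 0.54, 0.55, 0.58, 0.61]
    a, b = fit_linear(s_ratio, k_t)
    print(round(a, 3), round(b, 3))
    print(round(estimate_global(35.0, 0.6, a, b), 2))  # H0 in MJ/m^2
```

The quadratic and multi-parameter formulations would follow the same pattern with additional regressors.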

Table 10 Regression coefficients and statistical indicators of correlations
Fig. 9 Comparison between estimated (by four models) and measured (from different sources) values of monthly average daily global solar irradiance for San Antonio, Texas

Fig. 10 Estimated (by linear model) and measured values of monthly average daily global solar irradiance for San Antonio, Texas

Fig. 11 Estimated (by quadratic model) and measured values of monthly average daily global solar irradiance for San Antonio, Texas

Fig. 12 Estimated (by multi-parameters model) and measured values of monthly average daily global solar irradiance for San Antonio, Texas

Fig. 13 Estimated (by Gopinathan’s model) and measured values of monthly average daily global solar irradiance for San Antonio, Texas

Besides the monthly average daily global solar irradiance, the monthly average daily diffuse solar irradiance is needed to calculate the monthly average hourly direct solar irradiance on a horizontal surface with the two decomposition models that transform daily solar irradiance data into hourly values. Therefore, four selected empirical models were validated by comparing their estimates of monthly average daily diffuse solar irradiance against the measured data. The values estimated by three models (Collares-Pereira and Rabl, Liu and Jordan, and Gopinathan) are in good agreement with the measured data [95], whereas the Iqbal model shows poorer agreement, as shown in Figs. 14, 15, 16 and 17.

Fig. 14 Estimated (by Collares-Pereira and Rabl model) and measured values of monthly average daily diffuse solar irradiance for San Antonio, Texas

Fig. 15 Estimated (by Liu and Jordan model) and measured values of monthly average daily diffuse solar irradiance for San Antonio, Texas

Fig. 16 Estimated (by Gopinathan model) and measured values of monthly average daily diffuse solar irradiance for San Antonio, Texas

Fig. 17 Estimated (by Iqbal model) and measured values of monthly average daily diffuse solar irradiance for San Antonio, Texas

Based on the previously estimated monthly average daily global (linear model) and diffuse (Liu and Jordan model) solar irradiance and the two decomposition models, the monthly average hourly direct solar irradiance on a horizontal surface was calculated and then converted to monthly average DNI values using the zenith angle. A scatter plot of the estimated values against the measured data (extracted from the National Solar Radiation Database (NSRDB) and [95]) is shown in Fig. 18. It exhibits only relative agreement because the original coefficients used in the Liu and Jordan model and the two decomposition models were not refitted locally, unlike the coefficients of the Angstrom–Prescott correlation (linear model). Therefore, the agreement between the estimated values and the measured data may be improved by obtaining locally fitted coefficients for the models used.
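The final conversion step, from direct irradiance on a horizontal surface to DNI, can be sketched as follows. It relies only on the geometric relation between the horizontal beam component and the zenith angle; the function names and the cut-off near the horizon are illustrative assumptions, and the decomposition models that produce the hourly global and diffuse values are not reproduced here.

```python
import math


def direct_horizontal(global_h, diffuse_h):
    """Direct (beam) irradiance on a horizontal surface: global minus diffuse."""
    return max(global_h - diffuse_h, 0.0)


def dni_from_horizontal(global_h, diffuse_h, zenith_deg, max_zenith_deg=85.0):
    """DNI = direct horizontal irradiance / cos(zenith angle).

    Returns 0 near the horizon (zenith >= max_zenith_deg, an assumed cut-off),
    where dividing by a very small cosine would amplify errors.
    """
    if zenith_deg >= max_zenith_deg:
        return 0.0
    return direct_horizontal(global_h, diffuse_h) / math.cos(math.radians(zenith_deg))


if __name__ == "__main__":
    # Made-up hourly values in W/m^2, for demonstration only
    print(round(dni_from_horizontal(600.0, 100.0, 60.0), 1))
```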

Fig. 18 Estimated and measured values of monthly average hourly direct normal solar irradiance for San Antonio, Texas

Conclusions

The significant challenge in exploiting and managing solar energy is the lack of solar radiation datasets and the high cost of measurement equipment in most locations around the world. Although the literature of the past few decades offers an abundance of models and evaluation methods, it rarely provides methodologies that researchers, engineers, and practitioners in the design and operation of solar energy systems can easily follow to obtain the required solar radiation datasets and to evaluate solar energy availability under different sky conditions for the various radiation components. In this study, two hierarchical calculation approaches were developed using various models, empirical correlations and regression equations to estimate hourly DNI and monthly average hourly DNI data under different sky conditions. The calculations can be performed with or without measured solar irradiance data. Additionally, a preliminary assessment of the solar energy potential was carried out to support the decision to install concentrated solar collectors at the selected site. A case study for the San Antonio region in Texas was carried out to demonstrate the accuracy of the proposed approaches for estimating hourly solar irradiance, which is used in designing solar concentrated collectors. The results of the study are as follows:

  • Based on the preliminary assessment of the solar energy potential of the selected location through a comprehensive analysis, the San Antonio region in Texas is unequivocally amenable to harnessing solar energy as the prime source of energy by utilizing concentrating and non-concentrating solar energy systems. The analysis of the monthly average hourly clearness index using the classification of the clearness index level shows that more than 80% of the days can be classified as either sunny (\(k_{\text{t}}\) > 0.5) or partly cloudy (0.3 ≤ \(k_{\text{t}}\) ≤ 0.5) and less than 20% as cloudy (\(k_{\text{t}}\) < 0.3).

  • Based on five statistical indicators, most of the hourly direct normal irradiance values estimated by the 22 parametric models are in favorable agreement with the measured values for all months of the year.

  • Some simple parametric models with few parameters (fewer than three geographic and astronomical parameters), such as Meinel and Laue, have shown good fit accuracy for most months of the year, with \(R^{2}\) values in the range of 0.93–0.99, although some values were not perfectly consistent with the measured data.

  • More sophisticated (complex) parametric models such as Bird, Iqbal C, METSTAT, Modified Iqbal C, CSR, Atwater–Ball, ESRA, Hoyt (Iqbal B), Heliosat-1, Davies–Hay and Iqbal A have shown greater accuracy in estimating DNI values during the winter months (October–March), with \(R^{2}\) values in the range of 0.87–0.99, than during the summer months (April–September), with \(R^{2}\) values in the range of 0.33–0.96.

  • The significant influence of cloud amount on reducing the intensity of global solar radiation, and DNI in particular, was studied using the cloud-cover radiation model (CRM) with the cloud amount expressed in oktas, ranging from 0 to 8. For illustration, the global solar radiation intensity was attenuated from 765 W/m² (0 oktas, clear sky) to 213 W/m² (8 oktas, overcast sky), while the diffuse share of the irradiance increases with cloud amount until the direct component reaches zero under an overcast sky.

  • The monthly average daily global solar radiation on a horizontal surface estimated by the four formulations of the Angstrom–Prescott correlation, whose local coefficients were determined through regression analysis, shows good agreement with measured data from different databases, with \(R^{2}\) values in the range of 0.98–0.99.

  • Four selected empirical models were validated by comparing their estimates of monthly average daily diffuse solar irradiance against the measured data. The values estimated by three models (Collares-Pereira and Rabl, Liu and Jordan, and Gopinathan) are in good agreement with the measured data, with \(R^{2}\) values ranging from 0.94 to 0.98, whereas the Iqbal model shows poorer agreement, with an \(R^{2}\) value of 0.65.

  • The monthly average hourly direct solar irradiance on a horizontal surface, which was calculated to obtain monthly average DNI values using the Angstrom–Prescott correlation (linear model), the empirical model (Liu and Jordan model), two decomposition models, and the zenith angle, showed only relative agreement (\(R^{2}\) = 0.82) with the measured data because some of the models used require locally fitted coefficients.

  • The proposed methodologies offer a reasonably good estimation of hourly solar radiation values, and they can be applied to other locations around the world by deriving new locally fitted coefficients for the empirical and regression correlations. However, it is worth noting that estimated solar data (from solar radiation modeling) can never substitute for measured solar data (from measurement equipment).