1 Introduction

Sunshine duration (SD) is an essential meteorological variable that has been routinely measured in meteorological practice for over 160 years (Sanchez-Lorenzo et al. 2013). Nowadays, sunshine duration measurements are available from numerous sites, e.g., the long-term monthly mean number of sunshine hours over the standard normal period 1961–1990 is reported for over a thousand sites at the data repository of the World Meteorological Organization (WMO; http://data.un.org/Data.aspx?d=CLINO&f=ElementCode:15). Although sunshine duration data have become less important for climate monitoring in recent years, due to the advent of comprehensive solar radiation monitoring networks such as the Baseline Surface Radiation Network (BSRN, Ohmura et al. 1998), they provide important information for our understanding of past and present climate variability, particularly regarding the decadal variation of global and direct solar radiation (e.g., Sanchez-Lorenzo et al. 2013; Sanchez-Lorenzo and Wild 2012; Stanhill 2003; Wild 2009; Wood and Harrison 2011).

As stated in the seminal paper by Curtis (1898), “the registration of bright sunshine was first introduced as an element of meteorological observation by the late Mr. J. F. Campbell, of Islay”. Campbell developed the first recorder in 1853, and the system’s description was published in Campbell’s own presentation in the British Meteorological Society’s Report of the Council in 1857 (Curtis 1898). Stanhill (2003) and Sanchez-Lorenzo et al. (2013) provide a comprehensive overview and history of what is now known as the Campbell-Stokes sunshine recorder (hereinafter referred to as CS) and outline that many of the subsequent adjustments and improvements of the original design made between 1853 and 1880 were described by Campbell himself in his “Thermograph” (Campbell 1883). In the late 1870s, the Meteorological Office tasked G.G. Stokes with improving modifications made to Campbell’s instrument at the Kew Observatory (e.g., Sanchez-Lorenzo et al. 2013), which are documented in Stokes (1880). There is ongoing debate about the individual contributions of Campbell and Stokes to what became known as the CS (see Sanchez-Lorenzo et al. 2013; Stanhill 2003). The instrument itself was first introduced as the “Campbell-Stokes recorder” by Ellis (1888), and named following Curtis (1898) “as the ‘Campbell Stokes’ by which justice is done to the two men whose joint creation it is.” It has been known within the community under this name ever since. For a brief description of the CS instrument, we refer the reader to section 2.1 of this manuscript, and for further details to Curtis (1898), Stanhill (2003), and Sanchez-Lorenzo et al. (2013).

The history of competing techniques for the measurement of sunshine duration dates back to the development of the first instruments. In the same year in which Ellis introduced the CS in his seminal paper (Ellis 1888), another sunshine recording device was introduced by J.B. Jordan (1888). This instrument, whose original design dates back to a plan for a “Heliograph” proposed by his father, the late T. B. Jordan, became known as the Jordan Recorder (hereinafter referred to as JR; Curtis 1898).

The CS and JR were the only instruments that came into general use in the meteorological practice of the nineteenth century. Even back then, a debate arose about important differences among these instruments and about how far results obtained with them could be inter-compared (Curtis 1898), a question that, although for other sensor types, also motivates the present manuscript. Based on a comprehensive investigation by the Council of the Royal Meteorological Society, which attested great advantages of the CS over the JR, the CS became the standard sunshine recorder. The number of meteorological sites operating it rose to more than 300 within 100 years (e.g., Stanhill 2003; WRDC 1992).

Since then, much progress has been made in instrumentation for SD monitoring; most important, the development of different types of automatic sensors. With the advent of automatic SD sensors, the number of meteorological sites operating such instruments has rapidly increased while the number of stations equipped with the traditional CS instrument has been diminishing (e.g., Sanchez-Lorenzo et al. 2013).

The WMO characterizes five measurement methods for sunshine duration in its Guide to Meteorological Instruments and Methods of Observation: (i) pyrheliometric method, (ii) pyranometric method, (iii) burn method, (iv) contrast method, and (v) scanning method (WMO 2014). A threshold of 120 W/m2 of direct solar irradiance was proposed by the WMO in 2003 to determine bright sunshine (WMO 2014).
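As a minimal sketch of how the 120 W/m2 criterion translates into an SD total, the following Python snippet counts the time during which sampled direct irradiance exceeds the threshold. The function name, the 1-min sampling interval, and the sample values are illustrative assumptions; they are not taken from the WMO guide or from the processing chain of this study.

```python
WMO_THRESHOLD_W_M2 = 120.0  # direct-irradiance threshold for bright sunshine (WMO 2014)


def sunshine_duration_hours(direct_irradiance, sample_seconds=60.0,
                            threshold=WMO_THRESHOLD_W_M2):
    """Sum the time (in hours) during which direct irradiance exceeds the threshold.

    direct_irradiance: sequence of direct solar irradiance samples in W/m2,
    taken at a fixed interval of `sample_seconds` (assumed, hypothetical setup).
    """
    sunny_samples = sum(1 for value in direct_irradiance if value > threshold)
    return sunny_samples * sample_seconds / 3600.0


# Hypothetical 1-min samples: 30 min of clear sun followed by 30 min of cloud.
samples = [650.0] * 30 + [40.0] * 30
print(sunshine_duration_hours(samples))  # 0.5
```

Samples exactly at the threshold are not counted here, which follows the ">120 W/m2" wording; an instrument-specific implementation may treat the boundary differently.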

The CS belongs to instruments utilizing the burn method, and became in 1962 the WMO’s Interim Reference Sunshine Recorder (IRSR; WMO (1962)). All methods have their individual advantages and disadvantages (WMO 2014) and the comparison of sunshine records obtained with different observing systems has recently received increasing attention (e.g., Kejna and Uscka-Kowalkowska 2006; Kerr and Tabony 2004; Legg 2014; Matuszko 2015; Pokorný and Vaníček 2007; Schöner and Mohnl 1999; Urban and Zając 2016).

The increased interest in such inter-comparisons stems from (i) the great variety of automated sensors (manufacturers Delta OHM, EKO Instruments, Haenni Solar, Kipp&Zonen, Thies Clima, Vaisala, …), (ii) their wide introduction in meteorological networks, and (iii) the resulting problems in data comparability, and hence homogeneity of long-term records, among measurement series obtained with them and heliographs. Furthermore, breaks in the continuity of SD time series, i.e., inhomogeneities, through the transition from CS systems to automated sensors can lead to wrong conclusions about the long-term variability of SD (e.g., Matuszko 2015). Furthermore, a significant influence of cloud type and cloud amount on SD measurements was documented for several locations with parallel SD and sky observations (e.g., Matuszko 2012a and references therein).

Various studies highlight the necessity of parallel measurements (and their comparison) by CS and automated sensors (e.g., Urban and Zając 2016 and references therein), and the WMO requires a verification period of new devices against old ones of at least 12 months (WMO 2014). Several recent papers present interesting results on differences in sunshine duration records obtained with a CS and automated systems (e.g., Matuszko 2015; Urban and Zając 2016). Unfortunately, such investigations are limited to a few locations that have long enough simultaneous records for individual sensor types. The frequent lack of overlapping measurements, due to the closure of manned stations simultaneously with the installation of automated sensors, is highlighted in Kerr and Tabony (2004), who, however, were able to compare measurements during overlapping periods, as was also the case with Legg (2014). For those stations where long enough records have been available and comparative analyses have been performed, the authors report significant differences (of both positive and negative signs) in sunshine duration recorded by automated sensors compared to CS (e.g., Kejna and Uscka-Kowalkowska 2006; Matuszko 2012b; Matuszko 2015; Pokorný and Vaníček 2007; Urban and Zając 2016).

The majority of these studies document comparisons of automated sensors from the manufacturer Kipp&Zonen with traditional CS systems. Much less is known about the degree of agreement among (i) CS systems and other automated sensors and (ii) different automated sensors. The present study aims to close this gap by evaluating 9 years of parallel measurements of sunshine duration by a CS and two automated systems (Kipp&Zonen CSD2 and Haenni Solar 111b) performed at the Kanzelhöhe Observatory for Solar and Environmental Research of the University of Graz, Austria (KSO, 46° 40′ 39″ N, 13° 54′ 06″ E, 1540 m a.s.l.). Furthermore, we highlight problems in homogenizing records among individual time series and in trend analysis applying reconstructed records of automated systems or sensor-type-equivalents.

2 Data

Simultaneous records of sunshine duration have been obtained between 2007 and 2016 with (i) a traditional CS system (operational at KSO since January 1928, Fig. 1a), (ii) a Kipp&Zonen CSD2 sunshine duration meter (hereinafter referred to as CSD, operational at KSO since December 2004, Fig. 1b), and (iii) a Haenni Solar 111b sunshine sensor (hereinafter referred to as HS, operational at KSO since September 2007, Fig. 1c). The joint record spanning from 1 October 2007 to 30 September 2016 used in this study comprises 3166 observational days (96.3% data coverage). Individual gaps in this period result from system maintenance and calibration, platform renovations, or failures in power supply and data acquisition systems. The longest gap (40 days) occurred in summer 2008 when, following a lightning strike, the data acquisition system had to be replaced. An overview of the locations of the individual SD recorders/sensors at the measurement platform of KSO is provided in Fig. 1d.

Fig. 1

Sunshine duration (SD) recording instruments at site Kanzelhöhe Observatory (KSO). a Traditional Campbell-Stokes sunshine recorder (CS). b Kipp&Zonen CSD2 sunshine duration meter (CSD). c Haenni Solar 111b sunshine duration sensor (HS). d Overview of sensor locations on the radiation monitoring platform at KSO

KSO has a long tradition in SD monitoring dating back to January 1928. Since then, SD has been almost continuously recorded, although platform locations varied slightly until 1968, when the CS instrument was moved to its current location at the observatory’s main building. As obstruction of the horizon at KSO is small throughout the year (see supplemental Fig. S1), the geographic location of the observatory provides a unique opportunity for measurements of sunshine duration and solar radiation in Southern Austria. A comparison of CS with CSD and HS as well as the inter-comparison of the latter two is of interest for SD monitoring in Austria, as the HS has been the standard instrument for SD monitoring operated by the Zentralanstalt für Meteorologie und Geodynamik, Austria (ZAMG), since 1981 within its network of automated weather stations (TAWES sites). The instrument specifics of the three SD recorders/sensors used in this study are briefly described below.

2.1 Campbell-Stokes sunshine recorder

A brief history of the CS sunshine recorder is presented in Sect. 1 of this paper; for more detailed historic accounts, we refer the interested reader to the seminal papers of Campbell (1857) and Stokes (1880) and the reviews of Stanhill (2003) and Sanchez-Lorenzo et al. (2013). As detailed in WMO (2014), the CS in the specification of an IRSR consists of a glass sphere (with a diameter of 10 cm, a focal length of 75 mm for sodium “D” light, and a refractive index of 1.52 ± 0.02), which is mounted concentrically in a section of a spherical gunmetal bowl of radius 7.3 cm. The “recorder” of the CS system is a dark, homogeneous card (with a thickness of 0.4 ± 0.05 mm following IRSR specifications), which is burned by the sharply focused rays of the sun. Through this technique, the CS system belongs to the category of burn method SD recorders. The recording card is held in grooves of the spherical bowl; a standard system comprises three overlapping pairs of grooves so that cards suitable for different seasons of the year can be used (WMO 2014).

The accuracy of sunshine duration measurements by CS can be influenced by astronomical, meteorological, and instrumental factors. Furthermore, the properties of the CS sphere’s glass (e.g., transparency, color, potential scratches), recording cards (e.g., paper type, color, quality of the printed scale), and training/competence of observers maintaining the sphere and exchanging and analyzing the cards can significantly alter the measurement quality (e.g., Brazdil et al. 1994; Matuszko 2012a).

Various sunshine thresholds for CS systems have been reported in the literature, ranging from 70 to 280 W/m2 (Baumgartner 1979; Kuczmarski 1990), the latter being significantly higher than the 120 W/m2 specified in WMO (2014) as the average of IRSR experiments performed in France. A threshold accuracy of 20% is accepted according to WMO guidelines (WMO 2014). Rigorous guidelines for measuring the burn marks on IRSR record cards are intended to guarantee uniform results from CS recorders. We note that alternative methods for determining SD from IRSR records were proposed in the recent literature (e.g., Fan and Zhang 2013). Horseman et al. (2013) proposed an innovative semi-automatic method for imaging CS recording cards. Furthermore, several studies documented the usefulness of CS records to derive information on direct solar irradiance (e.g., Sanchez-Romero et al. 2015 and references therein) or to determine additional sky properties (Wood and Harrison 2011 and references therein).

2.2 Kipp&Zonen CSD2 sunshine duration meter

The Kipp&Zonen CSD2 sunshine duration meter is one of the most frequently used automatic devices using the “contrast method” (WMO 2014) for SD observations. SD is measured with the CSD through a high-quality glass tube using three photodiodes with diffusers (e.g., Rösemann 2004). Furthermore, the CSD provides simultaneously a measurement of direct solar irradiance. Following manufacturer specifications, the sunshine signal corresponds to 1 ± 0.1 V for direct radiation >120 W/m2 and a measurement accuracy larger than 90% is achieved for direct signals on a clear day. As with many other automatic devices, the CSD comprises a two-level built-in heating system (1 W continuous heating or 10 W heating for frost, rime, snow removal if temperature is below 10 °C). Data from the CSD is recorded in seconds and is summed into hourly and daily totals for the present study, the latter being also the standard format of use at ZAMG.

Various generations of CSD instruments have been compared against CS systems (e.g., Kerr and Tabony 2004; Legg 2014; Matuszko 2015; Urban and Zając 2016). Given the higher sensitivity of CSD systems compared to that of CS, one would expect that CSD-derived SD records yield larger SD totals than their CS counterparts. This first guess is supported by problems of CS systems registering SD at a low position of the sun over the horizon (3–5°, e.g., Matuszko 2012b). Nevertheless, the majority of studies shows both positive and negative differences between SD measurements of CSD and CS instruments, and generally, no governing pattern could be identified (e.g., Matuszko 2012b; Urban and Zając 2016). Several authors (e.g., Kerr and Tabony 2004; Matuszko 2012b; Matuszko 2015; Urban and Zając 2016) highlight that the largest differences (of both positive and negative signs) occur in the presence of clouds of different layers and/or the presence of scattered clouds.

2.3 Haenni Solar 111b sunshine sensor

The Haenni Solar 111b sunshine sensor uses the contrast method (WMO 2014) for the determination of SD and is, in its current configuration (from manufacturer Kroneis (Austria)), an updated version of the original system from manufacturers Haenni (Switzerland) and Lufft (Germany). The HS sunshine signal corresponds to +5 V for direct radiation >120 W/m2 and −5 V otherwise. The HS comprises two heating options: continuous heating with 1 or (up to) 30 W for rain, snow, and frost removal at temperatures below 5 °C. Since 1981, the HS has been the standard sensor for SD measurements within the TAWES network of ZAMG in Austria. Comparisons of SD totals from CS and HS instruments (Aguilar et al. 2003; Dobesch 1992; Mohr 2012; Schöner and Mohnl 1999) at selected Austrian monitoring sites showed considerable differences among SD totals of these instruments (of both positive and negative signs) and an altitude dependency of the sign of the difference. To the knowledge of the authors, no systematic long-term comparison of HS records with those of other automated systems is available to date. The present study aims to close this gap.

3 Methods

3.1 Selection and aggregation of individual SD records

Joint observations of all three sensor types at KSO are available starting 7 September 2007, when the HS was installed at the observation platform. For the present study, we consider data for the joint 9-year record spanning from 1 October 2007 to 30 September 2016. Over the joint record, neither of the automated sensors nor the CS sphere has been replaced.

Observations from individual sensors are considered “valid” and thus included in the analysis if daily data coverage is ≥90%. This 90% requirement is applied to data in 1-min temporal resolution from automated sensors, i.e., a day is considered valid if a total of 90% or more of individual (1 min) SD recordings is available for a calendar day. We note that we use the full calendar day (between 00:01 and 24:00 UTC) instead of the astronomical day (i.e., sunrise to sunset, which differs in timing throughout the year) when applying this completeness criterion to avoid an unequal weighting for different seasons of the year. For CS records, the 90% criterion is considered as fulfilled, if the complete set of SD recording cards (two halves) comprising an observation day is available. We note that recording cards are halved at KSO to ensure continuous SD recordings. Recording cards are exchanged at KSO by observers at the time of the synoptic weather observations (7:00 and 14:00 mean solar times). The card collected at 14:00 mean solar time (morning half) and the card collected at 07:00 mean solar time on the next day (afternoon half) compose a daily set. The number of valid observation days over the study period is 3285 for CS, 3270 for HS, and 3178 for CSD. SD data for valid days of individual sensors have been aggregated to daily SD totals, which will be used hereinafter throughout most of this study. When SD is expressed as relative sunshine duration (SDrel), this refers to the ratio of measured and astronomically possible daily SD.
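The completeness criterion and the daily aggregation for the automated sensors can be sketched as follows. This is a simplified illustration in Python, not the processing code used in this study: the record layout, the function names, and the convention that timestamps mark the start of each minute in UTC are assumptions, and the handling of CS cards is not covered.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MINUTES_PER_DAY = 1440
MIN_COVERAGE = 0.90  # daily completeness criterion stated in the text


def valid_daily_totals(records):
    """Aggregate 1-min SD readings to daily totals (hours) for valid days.

    records: iterable of (timestamp, sd_fraction), where sd_fraction is the
    sunny fraction of that minute (0..1) or None if the reading is missing.
    Assumption: timestamps mark the start of each minute in UTC.
    """
    available = defaultdict(int)
    sunny_minutes = defaultdict(float)
    for timestamp, sd_fraction in records:
        if sd_fraction is None:
            continue  # missing readings do not count towards coverage
        day = timestamp.date()
        available[day] += 1
        sunny_minutes[day] += sd_fraction
    return {
        day: sunny_minutes[day] / 60.0
        for day, count in available.items()
        if count / MINUTES_PER_DAY >= MIN_COVERAGE
    }


def relative_sd(daily_total_hours, possible_hours):
    """SDrel: ratio of measured to astronomically possible daily SD."""
    return daily_total_hours / possible_hours


# Hypothetical day: 1300 of 1440 readings present, the first 600 of them sunny.
start = datetime(2010, 6, 1)
day_records = [
    (start + timedelta(minutes=i), (1.0 if i < 600 else 0.0) if i < 1300 else None)
    for i in range(1440)
]
totals = valid_daily_totals(day_records)
print(totals)
```

Coverage is counted against the full calendar day (1440 min), mirroring the choice described above of not using the astronomical day.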

3.2 Estimation of sensor-type-equivalent SD totals

Several authors highlight homogeneity problems when combining SD records derived by different sensor types. While sophisticated break-detection, bias-correction, and homogenization methods have been developed in recent years (e.g., Li-Juan and Zhong-Wei 2012), the application of such techniques to SD records is difficult and rare (e.g., Manara et al. 2015), partly because needed reference data are not available over long enough time periods and/or sufficiently close spatial proximity, partly because cloud coverage, cloud type, and cloud amount differ strongly on spatial scale and with altitude (and this information is rarely provided with the temporal and spatial resolution needed). One approach frequently used in practice to overcome these hurdles is to convert readings from one sensor to another. Such derivation of sensor-type-equivalent (STE) SD metrics, instead of the derivation of homogenized SD records, has been applied, e.g., by Kerr and Tabony (2004), Legg (2014), and Matuszko (2015).

Given the “quasi”-linear relationship between SD totals of different sensor types (see Fig. 3), we derive STE SD totals for each instrument from the other sensor records on daily/monthly time scales. To this aim, three different statistical models are applied/compared: (i) a one covariate linear regression model (LM, Eq. 1); (ii) an expanded one covariate linear regression model (ELM, Eq. 2); and (iii) a quadratic regression model (QM, Eq. 3), as illustrated below:

$$ {\mathrm{STE}}_{i, j}={\beta}_j\cdot {S}_j $$
(1)
$$ {\mathrm{STE}}_{i, j,{M}_b}={\beta}_{j,{M}_{a- c}}\cdot {S}_{j,{M}_b} $$
(2)
$$ {\mathrm{STE}}_{i, j}={\beta}_{1, j}\cdot {S}_j+{\beta}_{2, j}\cdot {S_j}^2 $$
(3)

where subscripts i and j denote the regressed and regressing sensor type (S) and β denotes the regression coefficient(s). The ELM is equivalent to the LM, except that data for three consecutive months (M_{a–c}) are used to derive the regression coefficient for calculating the STE for the middle month (M_b). In all three models (Eqs. 1–3), the regressions were forced through the origin (0, 0) as overcast conditions should be reflected in zero SD (Kerr and Tabony 2004) from both measurement techniques/sensors considered. Following Kerr and Tabony (2004), we also consider simple “point regression models”, specifically, the three-point and four-point models. The three-point model comprises the origin (0, 0), the astronomical maximum daily/monthly SD totals, and a “half-way” point representing the sensor-specific mean of the SD records on daily/monthly resolution. The four-point model comprises the same three points and additionally the sensor-specific maximum daily/monthly SD totals.
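To make the three regression models concrete, the following Python sketch fits Eqs. 1–3 by ordinary least squares without an intercept. It is an illustration under stated assumptions rather than the implementation used here: the function names and data layout are hypothetical, ordinary least squares is assumed as the fitting method, and the wrap-around from December to January in the ELM is our own choice. The point regression models are not covered.

```python
def fit_lm(s_j, s_i):
    """Slope beta_j of STE_i = beta_j * S_j, forced through the origin (Eq. 1)."""
    return sum(x * y for x, y in zip(s_j, s_i)) / sum(x * x for x in s_j)


def fit_elm(pairs_by_month, month):
    """Slope for `month` from data of the three months centred on it (Eq. 2).

    pairs_by_month: {1..12: [(s_j, s_i), ...]} (hypothetical layout).
    Assumption: months wrap around the turn of the year (Dec-Jan-Feb).
    """
    pooled = []
    for m in (month - 1, month, month + 1):
        pooled.extend(pairs_by_month.get((m - 1) % 12 + 1, []))
    xs, ys = zip(*pooled)
    return fit_lm(xs, ys)


def fit_qm(s_j, s_i):
    """Coefficients (beta_1, beta_2) of STE_i = beta_1*S_j + beta_2*S_j**2 (Eq. 3).

    Solves the 2x2 normal equations of the no-intercept quadratic fit.
    """
    sxx = sum(x ** 2 for x in s_j)
    sx3 = sum(x ** 3 for x in s_j)
    sx4 = sum(x ** 4 for x in s_j)
    sxy = sum(x * y for x, y in zip(s_j, s_i))
    sx2y = sum(x * x * y for x, y in zip(s_j, s_i))
    det = sxx * sx4 - sx3 * sx3
    beta_1 = (sxy * sx4 - sx3 * sx2y) / det
    beta_2 = (sxx * sx2y - sx3 * sxy) / det
    return beta_1, beta_2


# Hypothetical daily SD totals (h) of a regressing sensor j and a regressed sensor i.
s_j = [1.0, 2.0, 4.0, 8.0]
s_i = [0.9 * x for x in s_j]
beta = fit_lm(s_j, s_i)
ste = [beta * x for x in s_j]  # sensor-type-equivalent totals
```

Because no intercept is estimated, a zero SD total of the regressing sensor always maps to a zero STE, which is the property motivated above by overcast conditions.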

If the regression model(s) are suitable for the conversion of SD totals among instruments, the results for individual sensors should be good approximations of each other and thus fulfill A ∼ A_B ∼ A_C, where A, B, and C identify individual sensors and A_B and A_C are the regression model-derived equivalents of A (derived from the records of B and C, respectively).

4 Results

4.1 Differences among SD records on daily resolution

First, we turn the focus to the comparison of daily SD totals among individual sensors over their 9-year joint record. Figure 2(a–b) shows differences between CS and CSD (ΔCS,CSD) and HS (ΔCS,HS), respectively; while Fig. 2c gives the difference between the two automatic sensors (ΔCSD,HS). The difference in daily SD totals among pairs ranges from −5.4 to +2.0 h for CS/CSD, −5.3 to +2.3 h for CS/HS, and −1.8 to +2.3 h for CSD/HS.

Fig. 2

Differences in daily sunshine duration (SD) totals among instrument pairs. a CS/CSD. b CS/HS. c CSD/HS, with color-coding following the Urban and Zając (2016) classification scheme

Correlation between individual SD records ranges from 0.89 (CS/CSD) to 0.96 (CSD/HS), as illustrated in Fig. 3(a–c). This brief analysis shows that despite general structural agreement, quite large differences among daily SD totals of different sensor types can occur. In the following, we adopt the classification of Urban and Zając (2016) for the characterization of the agreement/disagreement among individual SD records. These authors proposed a four-group classification scheme for differences in daily SD totals with a gradation scale in units of hours: (i) insignificant, within the margin of error [−0.1 to +0.1]; (ii) small, within the norm [−0.5 to −0.1) or (+0.1 to +0.5]; (iii) medium [−1.5 to −0.5) or (+0.5 to +1.5]; and (iv) high <−1.5 or >+1.5. These classes ((i) to (iv)) are hereinafter referred to as C1 to C4. The largest proportion of observations falling within each category is found at KSO for the following instrument pairs: (C1 and C2) CSD/HS and (C3 and C4) CS/CSD. A complete overview of days per category and instrument pair is provided in Fig. 4(a–c) and Table S1. For completeness, we also provide another categorical analysis using bins of 0.5-h increments in Fig. 4(d–f). The latter shows that differences more negative than −3 h (although rare) occur when comparing CS with automated sensors. Lower SD totals for CS compared to those of CSD/HS are not surprising and can be mainly attributed to different sensitivity thresholds of CS systems, i.e., the direct solar threshold irradiance corresponding to the burning threshold of the CS instrument. While automated sensors operate at/close to the WMO-specified 120 W/m2 threshold, sensitivity thresholds ranging between 70 and 280 W/m2 have been reported for CS systems in the literature (Baumgartner 1979; Kuczmarski 1990). The occurrence of both positive and negative deviations has also been reported for several other measurement sites (Kejna and Uscka-Kowalkowska 2006; Matuszko 2015; Pokorný and Vaníček 2007; Urban and Zając 2016).
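The four-group scheme described above can be written compactly, since its class limits are symmetric around zero. The following Python sketch is an illustration only; the function names and the example differences are hypothetical, and the class limits are those quoted from Urban and Zając (2016).

```python
from collections import Counter


def classify_difference(delta_hours):
    """Assign a daily SD difference (h) to classes C1-C4.

    C1: [-0.1, +0.1]; C2: up to +/-0.5; C3: up to +/-1.5; C4: beyond +/-1.5.
    """
    magnitude = abs(delta_hours)
    if magnitude <= 0.1:
        return "C1"
    if magnitude <= 0.5:
        return "C2"
    if magnitude <= 1.5:
        return "C3"
    return "C4"


def class_fractions(deltas):
    """Fraction of days per class for a sequence of daily differences."""
    counts = Counter(classify_difference(d) for d in deltas)
    return {label: counts.get(label, 0) / len(deltas)
            for label in ("C1", "C2", "C3", "C4")}


# Hypothetical daily differences (h) between two sensors.
example = [0.05, -0.3, 0.7, -2.4]
print(class_fractions(example))  # each class holds one of the four values
```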

Fig. 3

Scatterplots of daily sunshine duration (SD) totals among instrument pairs. a CS/CSD. b CS/HS. c CSD/HS. Correlation coefficient (R) is provided in each panel

Fig. 4

Difference in daily sunshine duration (SD) totals based on the Urban and Zając (2016) classification scheme for SD sensor pairs. a CS/CSD. b CS/HS. c CSD/HS. d to f as a to c but as histograms with bin size 0.5 h

Nevertheless, it is important to note that SD totals of CS larger than those of CSD/HS also occur (6.9/10.4%; see Fig. 4), as CS is prone to “overburn” during periods of intermittent sunshine (e.g., Legg 2014). Although overburn, relative to CSD and HS records, occurs across the full range of relative sunshine duration (SDrel) (Fig. 5a–b), exceedance of both automats’ SD totals is most frequently found on days with SDrel between 40 and 95% (Fig. 5c). On a monthly basis, overburn is most frequently observed between May and August (Fig. 5d–f). Here, the CS records exceed those of CSD and HS on ∼10 and 20% of observation days, respectively. Conversely, overburn is least frequently observed during the height of winter (January and February), which can be attributed in part to radiation being below the CS sensitivity threshold. For the majority of days where CS SD totals are larger than those of automats, exceedances are ≤1 h (Fig. 5g–i). Nevertheless, CS SD totals can exceed those measured by automats by up to 2 h during summer. The largest exceedances of CSD and HS totals are found in the mid-range of SDrel (between 30 and 80%, Fig. 5k–l). On the majority of days where CS totals exceed those of automats, differences between CS and HS are larger than those between CS and CSD, on average by 0.2 h, but differences exceeding 1 h occasionally occur (Fig. 5m).

Fig. 5

Number of days with CS “overburn”, i.e., days where relative sunshine duration (SDrel) of CS (binned in 5% intervals) exceeds those measured by (a) CSD, (b) HS, and (c) both CSD and HS. Fraction of days per month, where SDrel of CS exceeds those of (d) CSD, (e) HS, and (f) both CSD and HS. g–i Disaggregated data from d–f showing daily differences in SDrel from CS to automated systems (color-coded per month) for days with CS overburn. k–l Data from g–h as function of SDrel from CS. m Differences in SDrel between CS and HS minus difference in SDrel between CS and CSD as function of SDrel from CS. The vertical black dashed lines in k–m mark 30 and 80% SDrel, i.e., the range in which the largest exceedances of CSD and HS SD totals by CS are observed

As previously stated, comparisons of automated sensors are scarce in the literature. To date, no comprehensive long-term comparative analysis has been performed for the two automated SD sensors used in this study. The results presented here show in general a good agreement of SD totals from CSD and HS: following the Urban and Zając (2016) classification scheme, differences for 40.7% of daily totals can be classified as small (within the norm) and 43.8% as insignificant (within the margin of error) (see Fig. 4c and Table S1). This result lends confidence in the comparability of TAWES SD records with those of other meteorological networks using different automated SD sensors.

4.2 Differences among SD records on monthly and annual resolution

Following the analysis of differences in daily SD totals, we focus on the analysis of SD totals aggregated on monthly and annual time scales. To ease the comparison among instruments, we restrict the analysis to days where data for all three systems are available and calculate monthly and annual SD totals from the joint record if at least 90% of daily data are available.
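The restriction to joint days and the 90% requirement for monthly totals can be sketched in Python as follows. The data layout and function name are hypothetical, and the interpretation that coverage is measured as the share of calendar days of a month with data from all sensors is an assumption made for this illustration.

```python
import calendar
from collections import defaultdict
from datetime import date


def monthly_totals(daily_by_sensor, min_fraction=0.90):
    """Monthly SD totals per sensor from days on which all sensors have data.

    daily_by_sensor: {sensor: {date: daily SD total in hours}} (hypothetical).
    A month is kept only if the joint days cover at least `min_fraction`
    of its calendar days.
    """
    joint_days = set.intersection(*(set(d) for d in daily_by_sensor.values()))
    days_per_month = defaultdict(list)
    for day in joint_days:
        days_per_month[(day.year, day.month)].append(day)

    totals = {}
    for (year, month), days in days_per_month.items():
        n_calendar_days = calendar.monthrange(year, month)[1]
        if len(days) / n_calendar_days >= min_fraction:
            totals[(year, month)] = {
                sensor: sum(daily[day] for day in days)
                for sensor, daily in daily_by_sensor.items()
            }
    return totals


# Hypothetical example: 28 joint days in June 2010 (30 calendar days).
june_days = [date(2010, 6, d) for d in range(1, 29)]
data = {
    "CS": {d: 5.0 for d in june_days},
    "CSD": {d: 6.0 for d in june_days},
    "HS": {d: 6.5 for d in june_days},
}
print(monthly_totals(data))
```

Annual totals could be formed analogously by grouping on the year only.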

Figure 6 provides the monthly time series of SD totals per year for CS, CSD, and HS. The overall large variability in SD totals is visible on monthly time scales driven by ambient meteorology (cloud cover, cloud type, and cloud amount). Comparing SD totals among different instruments (see Table S2) confirms in general (i) the good agreement among automated sensors; (ii) the significant differences between automated sensors and CS; (iii) the occurrence of differences of both positive and negative signs among all sensors; and (iv) the best agreement among sensors during the height of summer (July and August), the time period where SD thresholds are least critical as also shown in other studies (e.g., Matuszko 2015). Given the large month-to-month variability and limited joint record, no trend analysis of monthly SD totals was considered.

Fig. 6

Time series of monthly sunshine duration (SD) totals in 2007–2016 from CS, CSD, and HS sunshine recorders/sensors

To address sensor (dis-)agreement further, we group the SD totals, following amplitude, by extended seasons: JFMA, MJJA, and SOND. Figure 7(a–b) provides box-plot comparisons for SD differences among sensor pairs for these extended seasons using daily (panel a) and monthly SD totals (panel b). Although for the majority of days the differences in daily SD totals are within ±2 h (ranging between 94.3% for CS/HS and 99.9% for CSD/HS), interesting seasonal differences emerge. Differences in SD totals between CS and automated sensors are similar in JFMA and SOND; a slightly different picture emerges in MJJA with CS showing smaller differences from HS than from CSD. Correspondingly, differences among automated sensors are larger during MJJA than in other seasons, i.e., the interquartile range is about twice as large. For monthly SD totals aggregated on seasonal scale (Fig. 7b), a slightly different picture emerges. In all three seasons, larger differences are found for CS/CSD than for CS/HS. While differences among SD totals of automated sensors are largest in MJJA, the smallest spread among sensor differences is found in that season.
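The grouping into extended seasons and the interquartile range used to compare the spread of differences can be illustrated with the short Python sketch below. Function names and input layout are hypothetical, and the quartile convention (the default "exclusive" method of `statistics.quantiles`) is an assumption that may differ from the convention underlying the box plots in Fig. 7.

```python
from collections import defaultdict
from datetime import date
from statistics import median, quantiles

EXTENDED_SEASONS = {
    "JFMA": range(1, 5),
    "MJJA": range(5, 9),
    "SOND": range(9, 13),
}


def extended_season(month):
    """Map a calendar month (1-12) to its extended season."""
    for name, months in EXTENDED_SEASONS.items():
        if month in months:
            return name
    raise ValueError(f"invalid month: {month}")


def seasonal_spread(differences):
    """Median and interquartile range of SD differences per extended season.

    differences: iterable of (date, delta_hours) for one sensor pair;
    each season needs at least two values.
    """
    grouped = defaultdict(list)
    for day, delta in differences:
        grouped[extended_season(day.month)].append(delta)
    summary = {}
    for season, values in grouped.items():
        q1, _, q3 = quantiles(values, n=4)
        summary[season] = (median(values), q3 - q1)
    return summary


# Hypothetical differences (h) on seven January days.
january = [(date(2010, 1, d), float(d)) for d in range(1, 8)]
print(seasonal_spread(january))  # {'JFMA': (4.0, 4.0)}
```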

Fig. 7

Difference in (a) daily and (b) monthly sunshine duration (SD) totals for sensor pairs CS/CSD, CS/HS, and CSD/HS on extended seasonal basis (JFMA, MJJA, SOND), c time series of annual SD totals for CS, CSD, and HS in 2008–2015

A picture similar to that seen on a monthly basis emerges on the aggregated annual time scale (Fig. 7c). Over the study period, differences between CS and automated sensors in annual SD totals range between 7.7% (HS, 2015) and 23.2% (CSD, 2009), and differences between the automated sensors (CSD/HS) range between −3.5% (2015) and +1.3% (2011).

4.3 Comparison of sensor-type-equivalent SD totals and original SD records

As outlined in Section 3.2, we use five model types to derive equivalent SD totals for each sensor type. None of these models emerges as clear best candidate across all sensor types. We note in passing that this assessment does not change if the intercept is not omitted in LMs, ELMs, and QMs. Nevertheless, quadratic and (both types of) linear regression models compare favorably with the corresponding original sensor records (see Table 1) compared to three- or four-point models (not shown). Furthermore, bias-metrics (absolute standard error (ASE) and root mean square error (RMSE)) are strongly reduced when comparing STEs with the original records of their parent (regressor) system (see Table 2). For the latter, the QM-based STEs yield larger improvements than their linear and expanded linear counterparts. The same conclusions are drawn if the fraction of joint observations in classes C1 and C2 of the Urban and Zając (2016) classification is used instead of ASE and RMSE. A detailed comparison of STEs and original SD totals on daily time scale is presented in Table 1 and for illustrative purposes, we discuss only CS equivalent SD totals in the main body of the manuscript. Considering only the first two classes of the Urban and Zając (2016) classification scheme, i.e., data within marginal or small error, a strong increase in the proportion of observations falling in these categories is found when comparing CS-STEs with CSD/HS SD totals in contrast to the comparison of original records. The proportion of observations in C1 + C2 is 42.0% for ΔCS,CSD and 47.1% for ΔCS,HS, which increases to 56.3 and 61.7%, respectively. Even larger improvements are found when comparing CS totals with CSD/HS-STEs based on CS data (see Table 2).
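Two of the benchmark quantities used in this comparison, the RMSE and the fraction of days in classes C1 and C2, can be computed as in the following Python sketch. The function names and example values are hypothetical; the ASE is not reproduced here because its exact definition is not given in this section.

```python
from math import sqrt


def rmse(estimates, observations):
    """Root mean square error between STE totals and recorded SD totals (h)."""
    residuals = [e - o for e, o in zip(estimates, observations)]
    return sqrt(sum(r * r for r in residuals) / len(residuals))


def fraction_c1_c2(estimates, observations, limit=0.5):
    """Fraction of days whose difference lies within +/-0.5 h (classes C1 + C2)."""
    residuals = [e - o for e, o in zip(estimates, observations)]
    return sum(1 for r in residuals if abs(r) <= limit) / len(residuals)


# Hypothetical daily totals (h): STE estimates versus recorded values.
ste = [4.0, 7.5, 0.0, 10.0]
recorded = [4.2, 6.5, 0.0, 10.4]
print(rmse(ste, recorded), fraction_c1_c2(ste, recorded))
```

Applying both functions first to the original record pairs and then to the STE-based pairs gives the kind of before/after comparison summarized in Tables 1 and 2.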

Table 1 Comparison of daily sunshine duration (SD) totals from regression model-derived sensor-type-equivalents (STE) with original SD records by: (i) fraction of differences in daily SD totals following Urban and Zając (2016) and (ii) selected benchmark statistics
Table 2 As Table 1 but for comparison of regression model-derived STEs with parent SD records

STEs derived from simple point regression models generally compare less favorably with the corresponding original records (and with the original records of the other sensors; see Table S3) than STEs from QMs, LMs, and ELMs. Also, the three-point model performs slightly better than the four-point model in most cases.

Comparisons of STEs with original SD totals on monthly time scale yield similar results regarding error statistics (see Table S4). A classification-type analysis according to Urban and Zając (2016) was not performed, because differences in monthly SD totals exceed the thresholds defined for the daily time scale and because monthly averaged deviations are affected by the compensation of positive and negative differences in daily means.
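The compensation effect mentioned above can be made concrete with a small numerical sketch. The daily differences below are hypothetical values chosen for illustration, not measurements from this study.

```python
import numpy as np

# Hypothetical daily SD differences (hours) between two sensors.
daily_diff = np.array([+1.2, -1.0, +0.8, -0.9, +0.1, -0.2])

# Signed mean: positive and negative days cancel almost completely.
mean_signed = daily_diff.mean()

# Mean absolute difference: the typical size of the daily disagreement.
mean_absolute = np.abs(daily_diff).mean()
```

In this example the signed mean is essentially zero, while the mean absolute difference is 0.7 h, so an averaged deviation can mask substantial day-to-day disagreement.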

5 Discussion and conclusions

This study investigates the differences/similarities in daily, monthly, and annual sunshine duration (SD) totals across the 9-year joint record (1 October 2007 to 30 September 2016) of a Campbell-Stokes sunshine recorder (CS) and two automated sunshine sensors (Kipp & Zonen CSD2 and Haenni Solar 111b; CSD and HS, respectively) operated at the Kanzelhöhe Observatory (KSO). Furthermore, different methods for deriving sensor-type-equivalent (STE) SD totals on daily and monthly time scales are compared.

Differences in recorded daily SD totals among individual sensors are of both positive and negative signs, ranging from −5.4 to +2.3 h, with the smallest differences found for the automated sensors CSD and HS (−1.8 to +2.3 h). The larger differences of CS-derived SD totals from those measured by automated sensors can be attributed (largely) to different sensitivity thresholds of CS systems compared to those of automated sensors. Larger SD totals of CS (i.e., overburn) compared to those of automated sensors occur most frequently on days with SDrel between 40 and 95% and during spring and summer seasons. Applying the Urban and Zając (2016) classification scheme for differences in SD totals, we find that between 42.0 (CS/CSD) and 84.5% (CSD/HS) of daily SD totals can be classified as marginal or small. The latter number lends confidence in the records of automated sensors and thus in the comparability of ZAMG’s TAWES SD records (which are based on HS measurements) with SD records obtained by other automated sensors operated in other meteorological networks. Comparison of SD records on monthly time scale shows in general good agreement among automated sensors, while significant differences between automated sensors and CS are found. The closest agreement among all sensors is found during the height of summer (July and August), the time period when SD thresholds (particularly for CS) are least critical, as also shown in other studies. Furthermore, large inter-annual variability in SD totals is visible on monthly/seasonal time scales, driven by ambient meteorology (i.e., cloud cover, cloud type, and cloud amount) (e.g., Legg 2014; Matuszko 2015; Urban and Zając 2016). Several studies highlight a pronounced influence of cloud amount and cloud type on SD measurements, with the influence of different genera of clouds depending slightly on season and the sun’s position above the horizon (e.g., Matuszko 2012a and references therein). Depending on cloud type and/or amount, SD totals can be increased (reflection of radiation) or reduced (absorption of radiation) relative to clear sky values (e.g., Matuszko 2012a).

Comparing STEs with their original counterparts (and other sensor SD totals), we find that regression models including all available data on daily (or monthly) time scale generally perform better than simple three- (or four-) point regression models. Although none of the considered regression models (in linear or quadratic form) emerges as the optimal model across all sensor types, we find, using the Urban and Zając (2016) classification scheme, a strong increase in the proportion of observations falling in classes C1 and C2 when comparing STEs with original SD totals.

We note in closing that, although STEs prove useful for relating SD records of individual sensors on daily/monthly time scales, this does not ensure that STE records (or joint records) can be used for trend analysis. Additional information, particularly on cloud type and amount, would be needed to derive homogenized SD records.

The results presented in this manuscript highlight the similarities/differences among SD totals of individual sensors and STEs. In agreement with other studies (e.g., Kejna and Uscka-Kowalkowska 2006; Pokorný and Vaníček 2007; Matuszko 2015; Urban and Zając 2016), we find significant differences between SD totals of automated sensors and CS on daily, monthly, seasonal, and annual time scales, despite a general agreement and quasi-linear relationship between SD totals from individual sensor types. Records of SD totals from CS systems are not “perfect” due to intrinsic problems (particularly the burning threshold). Nevertheless, CS records are the only long-term SD records available and will remain important for our understanding of climate variability/change in the near- to mid-term future.