1 Introduction

All across the globe, lakes are important ecological resources and at the same time serve a range of socio-economic purposes (Encyclopædia Britannica 2018; Cech 2010). On the one hand, they provide habitats to a substantial variety of organisms (Strayer and Dudgeon 2010), which are largely determined by distributions of thermal states throughout the seasonal cycle. This relates to the composition of ecosystems, physiology and abundance of organisms as well as to food chain dynamics (Balian et al. 2008; Kalff 2002). The reach of lakes is not limited to their mere extent but generally spreads far beyond (Pareeth et al. 2016; Vadeboncoeur et al. 2011; Williamson et al. 2009). Hence, lakes shape their environments across a wider area, and not only in terms of flora and fauna (Sharma et al. 2015). On the other hand, humankind benefits from lakes in many ways, which include, amongst others, fishery and associated production chains, tourism, recreation resorts and related jobs, energy production as well as their usage as a trusted resource for water extraction (e.g. drinking water).

Lake surface temperature (LST) is amongst the most important system characteristics as it reflects temperature conditions within the photosynthetically active layer driving biological processes. This entails that LST has a high impact on water-ecological environments, on watershed ecosystem features as well as on the biodiversity levels in lake environments (Yang et al. 2018; Carey and Zimmerman 2014; Imberger 2004). LST summarizes the uptake of energy (solar energy, energy introduced by conduction, etc.) and oxygen by the epilimnion (the top-most layer close to the surface) and the influence of fierce wind conditions in spring and fall, breaking stratification and thus affecting the vertical mixing regime (Kirillin 2010; Imberger and Ivey 1993; Imberger and Spigel 1987). A significant fraction of the interaction between lakes and their environments occurs at the water-air interface and is determined by fluxes (Leavitt et al. 2009; Oswald and Rouse 2004). These comprise, for instance, the transfer of energy via radiation, latent and sensible heat and the transport of momentum, minerals and gases. The average amount of thermal energy contained within the epilimnion is approximated by LST, which is recorded at a depth of less than 1 m. Amongst cloud cover, wind speed, water vapour pressure and air-temperature, which all influence heat exchange processes of lakes (O’Reilly et al. 2015; Ragotzkie 1978; Edinger et al. 1968), air-temperature exerts most influence on LST developments (Livingstone and Dokulil 2001; Livingstone and Lotter 1998). Although wind mixing as well as radiative heat exchange processes can cause short-term distortions, particularly in spring and fall, the close linkage of LST to air-temperature is generally observed on both medium and long time scales (Livingstone et al. 2005; Livingstone and Dokulil 2001; Livingstone and Lotter 1998).

This study aims at providing long-term LST time series at twelve lakes within the complex orography of Austria back to 1880. They are to become part of HISTALP—the homogeneous, long-term, multi-element database for the European Alps. Thereby, HISTALP is extended from the atmosphere to the hydrosphere of the climate system. The augmented range of data is earmarked for the scientific community and freely downloadable from the HISTALP-website (HISTALP 2018).

In this paper we present the procedure that was employed to obtain these homogeneous, long-term time series of LST at selected lakes in Austria. Following an introduction of the data sources used and the methods applied for clustering, homogenization and reconstruction of LST time series, we discuss our findings and their implications for a broad range of application areas. For demonstration purposes, we put special focus on the description and interpretation of results within the summer half-year, since winter conditions are expected to be of lesser importance due to lower floral and faunal activity as well as lower light irradiance. Apart from that, the linkage between the atmosphere and LST is different in winter from that in summer. While this linkage is strong during the warm season, it is weaker in winter, especially when ice covers lakes. Therefore, the modelling of LST is conceptually different and findings, particularly those pertaining to performance, cannot be compared straightforwardly. So, for the biological reasons mentioned above as well as for the differing atmosphere-LST linkage, the focus in this paper is not on months close to winter. However, we will put more emphasis on winter conditions in the companion paper focusing on future changes of LST, since the impacts of progressing climate change on LST (and thus on related trophic dynamics) may imply potentially far-reaching consequences for ecology and society in wintertime as well (Carey and Zimmerman 2014; Sørensen et al. 2011).

2 Data and methods

2.1 Lake surface temperature

LST observations used in this study are assembled from two complementary sources. Firstly, recordings from hydrological yearbooks, dating back to 1950, were used. Secondly, we made use of time-series provided by eHYD (the digital hydrographic archive of Austria), which covers the period from 1976 to 2013. By digitizing LST observations for the period 1950–2001 from hydrological yearbooks and merging them with data obtained from eHYD, we obtained uninterrupted LST records from 1950 to 2013 for the twelve lakes considered in this study. In order to account for a possible bias in historic recordings, we analysed the overlapping period between hydrological yearbooks and eHYD data, which covers about two decades. As we did not detect any breakpoints or other discrepancies between the two data sources, the resulting time-series are regarded as consistent. It has to be noted that no records are available from eHYD for Lake Wörther See. Digitized LSTs for Wörther See (1950–2013) have been provided by the hydrographic service of Carinthia, Austria.
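The merging of the two sources can be sketched as follows. This is a minimal illustration with synthetic monthly values, not the processing chain used in the study; the function name `merge_sources`, the preference for eHYD values within the overlap and the use of a simple mean offset as a consistency check are assumptions made for the example.

```python
import pandas as pd

def merge_sources(yearbook, ehyd):
    """Merge two monthly LST series and report their mean offset in the overlap.

    Assumption for this sketch: where both sources report a value,
    the eHYD value is kept.
    """
    overlap = yearbook.index.intersection(ehyd.index)
    mean_offset = float((ehyd.loc[overlap] - yearbook.loc[overlap]).mean())
    both = pd.concat([ehyd, yearbook])
    # Keep the first occurrence of each month, i.e. the eHYD value if present.
    merged = both[~both.index.duplicated(keep="first")].sort_index()
    return merged, mean_offset

# Synthetic stand-in data (not real observations).
yb_index = pd.period_range("1950-01", "2001-12", freq="M")
eh_index = pd.period_range("1976-01", "2013-12", freq="M")
yearbook = pd.Series(10.0, index=yb_index)
ehyd = pd.Series(10.0, index=eh_index)

merged, offset = merge_sources(yearbook, ehyd)
print(len(merged), offset)
```

A mean offset close to zero in the overlap is only a first indication of consistency; a breakpoint analysis, as mentioned above, would still be needed.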

Aside from being characterized by complete LST data from 1950 to 2013, the lakes used in this study are spread out rather evenly over the Austrian territory (Fig. 1). This entails that the selected lakes exhibit a high diversity in terms of their properties (Table 1). This is not only reflected in the variability that prevails with respect to lake size and location (as characterized by lake surface area, altitude, depth and volume), but also in additional characteristics shaping aquatic ecology, such as renewal time, discharge and mixing regime. The latter distinguishes partial (meromictic) from total (holomictic) mixing as well as how often mixing takes place within a year: monomictic, dimictic and polymictic indicate once, twice and multiple times. Accordingly, selected lakes comprise e.g. Lake Bodensee (a large holomictic-monomictic lake in the alpine foreland), Lake Weissensee (a high-lying meromictic-dimictic glacial lake in the Southern Limestone Alps), and Lake Neusiedler See (a holomictic-polymictic steppe lake located in the Little Hungarian Plain).

Fig. 1 Location of lakes considered in this study

Table 1 Characteristics of selected lakes. Lake type refers to the mixing regime, characterized by a classification according to (1) the extent of intermixture into holomictic (H) or meromictic (M) and (2) the mixing frequency into monomictic (m), dimictic (d) or polymictic (p)

2.2 HISTALP

HISTALP (Auer et al. 2007) is a database created and maintained by an international effort to provide information on long-term climate evolutions to the scientific community.

Long-term data on climate elements are of outstanding importance e.g. for the description of natural variability, the assessment of non-stationary interrelations between climate drivers, the development of climate models and of course impact research.

However, long-term time-series are prone to sudden or creeping inhomogeneities, which are caused e.g. by station-relocations or vegetation changes in close vicinity of measuring stations, respectively. Since these changes introduced by inhomogeneities have to be removed, particular emphasis needs to be put into ensuring continuously high levels of data-quality. This has been achieved by the application of the homogenization procedure HOMER (Mestre et al. 2013).

HISTALP consists of monthly, homogenized, long-term, atmospheric time-series at about 150 stations across the European Alps for air-temperature and air-pressure (back to 1760), precipitation totals (from 1800 onwards) as well as sunshine and cloudiness (backwards to the 1840s and the 1880s), respectively (Fig. 2).

Fig. 2 Geographic overview of HISTALP stations providing records of atmospheric parameters within the complex topography of the European Alps

2.3 Cluster analysis

The applicability of climatological homogenization procedures relies on observations sharing similarity, and the quality of their output generally increases with the degree of similarity amongst observations and with their number. For reasons of consistency, HOMER is used for homogenization, since HOMER has been applied to HISTALP, of which the LST dataset is to become a part.

In order to provide optimal conditions for homogenization, LSTs are classified into groups showing high degrees of inner similarity and outer separation. Such classifications can be achieved by clustering techniques and Rotated Empirical Orthogonal Functions (REOFs), both of which have often been applied for that purpose (e.g. Gubler et al. 2017; Rebetez and Reinhard 2008; Auer et al. 2007; Matulla et al. 2003; Schmidli et al. 2002). Comprehensive descriptions of REOFs and the agglomerative hierarchical clustering method AGNES, which is used here, can be found in von Storch and Zwiers (1999) and Kaufman and Rousseeuw (2008), respectively.

Findings yielded by REOFs and AGNES using Ward’s method (Murtagh and Legendre 2014) show reasonable matches, which are most coherent during fall. However, the fact that at least five time-series are required for a reliable homogenization using HOMER, which has been used to homogenize HISTALP (see Sect. 2.2), constrains the choice of classification. With twelve lakes, classifications complying with this requirement necessarily consist of two groups of about equal size. The associated groups of lakes are almost identical for REOFs and AGNES when applied to monthly, seasonal or annual time-series.

Figure 3 illustrates the merging of groups produced by AGNES and compares its outcome to findings attained by REOFs. In the context of homogenization these groups are called ‘networks’. The ordinate measures dissimilarity: low values correspond to high similarity, which decreases with increasing values. At zero (maximum similarity) each lake forms its own network, since the slightest deviation causes separation. As the demanded similarity decreases, networks are successively joined into larger ones and their number declines, until at vanishing similarity (largest y-values) no distinction can be made and only one network remains.

Fig. 3 Dendrogram depicting results of the hierarchical agglomerative clustering (AGNES) of LST anomalies. The dashed line indicates the ‘cutoff’ height yielding optimal starting conditions for the homogenization procedure, which is carried out subsequently. REOF findings are signified by circles and squares at the bottom

The height of the ‘cutoff’ in Fig. 3, which depicts the separation of the whole data set into two clusters, is set so that the resulting networks represent optimal starting conditions for the homogenization procedure HOMER. The two groups of lakes identified by REOFs are signified by circles and squares at the bottom. Findings are in high accordance. Network-1 consists of (see names listed along the ordinate) Bodensee, Millstättersee, Mattsee, Wallersee, Weissensee and Lunzer See, and network-2 comprises Zellersee, Hallstätter See, Mondsee, Neusiedler See, Altausseer See and Wörther See.
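The cutting of a Ward dendrogram into two networks can be sketched with SciPy. This is an illustration on synthetic anomalies, not a reproduction of the AGNES run behind Fig. 3: `scipy.cluster.hierarchy.linkage` with `method="ward"` is used here as a stand-in for AGNES, and the data, the function name `cluster_lakes` and the choice of cutting by number of groups rather than by height are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_lakes(anomalies, n_groups=2):
    """Agglomerative clustering with Ward's criterion.

    anomalies: array of shape (n_lakes, n_times), one row per lake.
    Returns the linkage matrix and one group label (1..n_groups) per lake.
    """
    Z = linkage(anomalies, method="ward")  # Euclidean distances between rows
    labels = fcluster(Z, t=n_groups, criterion="maxclust")
    return Z, labels

# Synthetic stand-in data: two independent regional signals plus local noise.
rng = np.random.default_rng(0)
n_times = 240
regime_a = rng.normal(size=n_times)
regime_b = rng.normal(size=n_times)
lakes = np.vstack(
    [regime_a + 0.3 * rng.normal(size=n_times) for _ in range(6)]
    + [regime_b + 0.3 * rng.normal(size=n_times) for _ in range(6)]
)

Z, labels = cluster_lakes(lakes)
print(labels)
```

The linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a tree comparable in form to Fig. 3.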

Since other network selections would cause the homogenization process (see below) to yield different results we ensured their statistical robustness by the application of REOFs (see Fig. 3, bottom for pertaining results). Apart from robustness our classification corresponds to the climatology of the Alps. In addition the output produced by HOMER (Sect. 2.4) shows only a few adjustments, which coincide with station relocations and changes of instruments, documented by entries in logbooks (metadata) of LST measurement sites. Such results are achievable only when based on a proper classification.

2.4 Homogenization of lake surface temperature time-series

Records of lake surface temperatures are—just as other observation-based time-series—prone to inhomogeneities (Reeves et al. 2007; Aguilar et al. 2003; Vincent 1998; Alexandersson 1986). In order to provide robust and reliable results that comply with HISTALP’s strict quality criteria, all LST time-series have to be homogenized.

Here we apply a well-established and widely used homogenization procedure called HOMER (Mestre et al. 2013). Based on monthly records of variously paired LSTs, breaks in time-series at individual lakes within each of the two networks are identified using a maximum likelihood approach. Subsequently, the necessary adjustments are determined by means of an ANOVA.

This way, breaks within time-series at individual lakes can be singled out. Within network-1, HOMER detects breaks in LST-records at four lakes. Network-1 exhibits six breaks in total with the first in 1954. Two can be substantiated by so-called metadata associated with LST observing sites. Metadata contain information on modifications that possibly affect readings and should be available at every measuring location—no matter whether it is a weather station, a lake temperature observing site or any other locality at which measurements are carried out over a long period of time. The first break matches with a change in observation time from 9 am to 8 am at lake Weissensee in 1954 and the other one with a relocation of the observing site at Lunzersee in 2009.

In network-2, HOMER detects eight breaks, which are distributed over all LST time-series. Here, five of the eight breaks are attributable to events documented in metadata collections, all of which refer to changes in observation times. The considerable agreement between detected breaks and actually recorded changes (metadata) indicates the accuracy of the approach and of the achieved results. After the appropriate adjustments have been carried out, which is done by HOMER as well, homogeneous LST time-series are finally available at all twelve lakes.
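The idea of locating a break in a difference series between two lakes of one network can be illustrated with a deliberately simplified sketch. It is not HOMER, which combines pairwise comparison with joint segmentation and an ANOVA-based correction; the function `most_likely_break`, the single-break model and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def most_likely_break(diff):
    """Return (index, statistic) of the single mean shift that best splits `diff`.

    For Gaussian noise with common variance, maximising the likelihood of a
    one-break model is equivalent to minimising the residual sum of squares.
    The statistic is the log-likelihood ratio against the no-break model.
    """
    diff = np.asarray(diff, dtype=float)
    n = diff.size
    rss0 = np.sum((diff - diff.mean()) ** 2)
    best_k, best_rss = None, np.inf
    for k in range(2, n - 1):  # at least two values on each side
        left, right = diff[:k], diff[k:]
        rss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if rss < best_rss:
            best_k, best_rss = k, rss
    return best_k, n * np.log(rss0 / best_rss)

# Synthetic stand-in data: two lakes share a signal, one has an artificial break.
rng = np.random.default_rng(1)
n = 64
common = rng.normal(size=n)
reference = common + 0.1 * rng.normal(size=n)
candidate = common + 0.1 * rng.normal(size=n)
candidate[30:] += 1.0  # artificial shift starting at index 30

k, stat = most_likely_break(candidate - reference)
print(k, stat)
```

Taking the difference removes the shared climate signal, which is why similarity within a network matters for break detection.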

2.5 Derivation of transfer functions

In this study observations are used to establish transfer-functions depicting LSTs by means of atmospheric covariates via Multiple Linear Regression models (MLRs). This approach was already applied successfully decades ago (e.g. Matuszek and Shuter 1996; Shuter et al. 1983). Here it is used to extend homogenized LST time-series back to 1880. The objective of this section is to lay out the procedure, which may be split into two tasks: model selection and performance assessment.

In order to avoid obvious collinearities, we employ the constraint that no MLR model may contain more than one time-series of the same atmospheric variable. Hence, the MLR models considered comprise three HISTALP time-series at most and are, thus, defined by four or fewer non-zero coefficients (Eq. 1). This first step reduces the tremendous number of potential models to a still large but manageable number of about 1.5 million MLR-models for each lake and every month.

$$\begin{aligned}{\text {LST}}^{l,t,r,p,m}(y) &=\alpha _0^{l,t,r,p,m} \\&\quad +\, \alpha _1^{l,t,r,p,m} {\text {T}}^{t,r,p,m}(y) \\&\quad +\,\alpha _2^{l,t,r,p,m} {\text {R}}^{t,r,p,m}(y) \\&\quad +\, \alpha _3^{l,t,r,p,m} {\text {P}}^{t,r,p,m}(y) + \epsilon , \end{aligned} $$
(1)

m, y and l refer to month, year and the lake under consideration. t, r and p run through the corresponding HISTALP sites from which air-temperature (T), precipitation totals (R) and air-pressure (P) time-series are taken. \(\epsilon, \) drawn from normal distributions with zero mean and variances adapted to the respective cases, represents the fraction of LST variability not captured by this deterministic approach.
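For a single lake, month and station triple, Eq. 1 can be fitted by ordinary least squares. The following is a minimal sketch on synthetic data; the function names `fit_mlr` and `predict_mlr`, the coefficient values and the noise level are made up for illustration and do not come from the study.

```python
import numpy as np

def fit_mlr(lst, t, r, p):
    """Least-squares estimate of (alpha_0, alpha_1, alpha_2, alpha_3) in Eq. 1."""
    X = np.column_stack([np.ones(len(lst)), t, r, p])
    coef, *_ = np.linalg.lstsq(X, lst, rcond=None)
    return coef

def predict_mlr(coef, t, r, p):
    """Deterministic part of Eq. 1 (without the epsilon term)."""
    t, r, p = np.asarray(t), np.asarray(r), np.asarray(p)
    return coef[0] + coef[1] * t + coef[2] * r + coef[3] * p

# Synthetic stand-in data: one value per year for T, R and P.
rng = np.random.default_rng(42)
n_years = 64
t, r, p = rng.normal(size=(3, n_years))
true_alpha = np.array([0.0, 0.8, -0.2, 0.1])  # made-up coefficients
lst = predict_mlr(true_alpha, t, r, p) + 0.1 * rng.normal(size=n_years)

alpha_hat = fit_mlr(lst, t, r, p)
print(alpha_hat)
```

Models with fewer than three covariates correspond to dropping the respective columns from the design matrix.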

Climatological conditions imposed by the European Alps (Fig. 4) are to be reflected by MLR models as well. These refer, for instance, to (1) the functioning of the alpine ridge as a climate divide, (2) the need to stay below de-correlation distances for atmospheric elements, which are shortest for precipitation totals, and (3) the need to avoid temperature decoupling or error-prone measurements of precipitation totals by ensuring that lakes and corresponding predictor sites lie within the same altitudinal band. These conditions exclude physically inconsistent settings and thereby further reduce the number of potential MLR-models. Figure 4 illustrates the implementation of the physical conditions discussed above and shows the three finally chosen models for the example of Lake Millstättersee in January. Since Lake Millstättersee is located south of the alpine ridge, stations have to be situated there as well (i.e. south of the black lines). In summer, selected precipitation sites entering models have to be within the blue circle, which indicates the de-correlation distance.
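The three physical conditions can be expressed as a simple station filter. The sketch below is hypothetical throughout: the `Site` class, the coordinates, the distance threshold and the altitude band are illustrative values, not those used in the study, and in practice the distance threshold would be chosen per atmospheric variable and season.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

@dataclass
class Site:
    name: str
    lat: float
    lon: float
    altitude_m: float
    side: str  # "north" or "south" of the alpine ridge

def haversine_km(a, b):
    """Great-circle distance between two sites in kilometres."""
    phi1, phi2 = radians(a.lat), radians(b.lat)
    dphi = phi2 - phi1
    dlmb = radians(b.lon - a.lon)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def is_eligible(station, lake, max_km, max_alt_diff_m):
    """Same side of the ridge, within the distance limit, similar altitude."""
    return (
        station.side == lake.side
        and haversine_km(station, lake) <= max_km
        and abs(station.altitude_m - lake.altitude_m) <= max_alt_diff_m
    )

# Illustrative values only (not taken from Table 1 or from HISTALP).
lake = Site("lake", 46.8, 13.6, 600.0, "south")
stations = [
    Site("A", 46.6, 13.8, 500.0, "south"),   # near, same side, similar altitude
    Site("B", 47.8, 13.0, 450.0, "north"),   # other side of the ridge
    Site("C", 46.7, 13.5, 2100.0, "south"),  # outside the altitude band
    Site("D", 45.0, 10.0, 550.0, "south"),   # beyond the distance limit
]
eligible = [s.name for s in stations
            if is_eligible(s, lake, max_km=100.0, max_alt_diff_m=500.0)]
print(eligible)
```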

Fig. 4 Model settings for Lake Millstätter See (blue triangle). HISTALP stations from which air-temperature, precipitation totals and air-pressure time-series are taken are indicated by red circles, blue squares and yellow triangles (which in this case lie on top of each other), respectively

The final step of the selection process is neither related to statistical nor to physical consistency, but to model quality. The decision on how far back LST reconstructions should be extended is a matter of quality, which is treated here as a mandatory requirement and depends upon station coverage. Around 1880 (see Fig. 5), about 80% of the network available for MLR-model establishment is in operation. Ten years earlier, this percentage drops to 50%, and another 20 years earlier only about a quarter of the stations would be on hand for model building. The year 1880 is thus a trade-off between quality and the desire for long LST time-series. So, mathematical and physical considerations as well as quality assurance result in about 55,000 MLR-models that are to be established and tested within the performance assessment that is carried out next.

Fig. 5 Development of the number of HISTALP stations where homogenized and gap-filled records are available

Before that, however, linear trends contained in LST and HISTALP observations are removed. They would otherwise introduce spurious correlations, falsely enhancing the apparent skill of transfer-functions based on them. Accordingly, detrended and standardized time-series (e.g. von Storch and Zwiers 1999) are used in transfer-functions entering calibration and validation procedures (Zorita and von Storch 1999; Matulla et al. 2002; Landgraf et al. 2015). Predictive skill can be assessed by means of correlation coefficients and distances. Here Pearson’s \(\rho \), Kendall’s \(\tau \) and Spearman’s \(\sigma \) as well as root mean square error (RMSE), mean relative error (MRE) and mean absolute error (MAE) are used to determine how closely LST observations are reproduced by various transfer-functions (i.e. MLR models) and to rank them accordingly.
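Detrending, standardization and the six skill measures can be sketched as follows. The function names are hypothetical, and the definition of MRE as the mean of absolute errors divided by absolute observations is an assumption, since the text does not spell it out; with that definition MRE is only meaningful for temperatures in their original units, not for standardized anomalies.

```python
import numpy as np
from scipy import stats

def detrend_standardize(x):
    """Remove the least-squares linear trend, then scale to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    steps = np.arange(x.size)
    slope, intercept = np.polyfit(steps, x, 1)
    resid = x - (slope * steps + intercept)
    return (resid - resid.mean()) / resid.std(ddof=1)

def skill(obs, sim):
    """Correlation and distance measures between observed and simulated values."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = sim - obs
    return {
        "pearson": float(stats.pearsonr(obs, sim)[0]),
        "kendall": float(stats.kendalltau(obs, sim)[0]),
        "spearman": float(stats.spearmanr(obs, sim)[0]),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "mre": float(np.mean(np.abs(err) / np.abs(obs))),  # assumed definition
    }

# Made-up temperatures in degrees Celsius, for illustration only.
metrics = skill([10, 12, 14, 16], [11, 12, 13, 16])
print(metrics)
```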

For each lake and every month (January to December) two calibration and validation experiments (exp-1, exp-2) are conducted for all MLR models based on covariates identified by the selection procedure described above. In exp-1 model calibration is carried out within 1950–1985 and validation throughout 1986–2013. In exp-2 no distinction between calibration and validation periods is made; both span the entire period. Therefore, findings of exp-1 carry more weight in identifying the best performing combinations of atmospheric covariates, while MLR-coefficients derived in exp-2 are used for LST reconstruction. In this study, the actual implementation of this approach amounts to about 160 million experiments.
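The two experiments can be sketched for one candidate model as follows. The function `run_experiments`, the use of Pearson's correlation as the only validation score and the synthetic covariates are assumptions made to keep the example short.

```python
import numpy as np

def design(X):
    """Prepend an intercept column to the covariate matrix."""
    return np.column_stack([np.ones(len(X)), X])

def run_experiments(years, lst, X, split_year=1985):
    """exp-1: calibrate up to split_year, validate afterwards.
    exp-2: fit on the full period; these coefficients serve the reconstruction."""
    cal = years <= split_year
    val = ~cal
    coef_cal, *_ = np.linalg.lstsq(design(X[cal]), lst[cal], rcond=None)
    sim_val = design(X[val]) @ coef_cal
    rho_val = float(np.corrcoef(lst[val], sim_val)[0, 1])
    coef_full, *_ = np.linalg.lstsq(design(X), lst, rcond=None)
    return rho_val, coef_full

# Synthetic stand-in data for one lake and one month.
rng = np.random.default_rng(7)
years = np.arange(1950, 2014)
X = rng.normal(size=(years.size, 3))      # columns: T, R, P (standardized)
true_coef = np.array([0.8, -0.2, 0.1])    # made-up values
lst = X @ true_coef + 0.1 * rng.normal(size=years.size)

rho_val, coef_full = run_experiments(years, lst, X)
print(rho_val, coef_full)
```

Ranking candidate models by their exp-1 validation score and then taking the exp-2 coefficients of the selected models mirrors the division of roles described above.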

3 Results and discussion

Results in this section are presented in logical order: findings pertaining to transfer-functions, which are used to simulate LSTs from atmospheric quantities, are discussed first, and LST-reconstructions back to 1880 afterwards.

The model selection process is followed by an extensive assessment of model performance, which is carried out for each month and every lake. Achieved results show that from all eligible models only a very small fraction stands out by means of skill. Three MLR models, differing at least with respect to the HISTALP site providing air-temperature (for they carry most weight), are chosen from each fraction and used for LST reconstructions.

Fig. 6 Model skill quantified by Pearson’s correlation coefficient \(\rho \) (first row), mean relative error (MRE, middle row) and mean absolute error (MAE, bottom row) for three arrangements (geographic, depth and renewal time) depicted by six groups. The x-axis indicates months that are discussed in the text

Figure 6 depicts the performance of these models in terms of (top to bottom) Pearson’s correlation coefficient \(\rho \), MRE and MAE. Since Pearson’s \(\rho \), Kendall’s \(\tau \) and Spearman’s \(\sigma \) yield similar results, the latter two are omitted for the sake of brevity. The same applies to MAE and RMSE (which are associated with the median and the arithmetic mean, respectively). If the deviations between the two underlying vectors are normally distributed, RMSE and MAE coincide (up to a factor of \(\sqrt{\frac{\pi }{2}}\)). The similarity between RMSE and MAE indicates that differences between LST simulations and observations are approximately normally distributed, which is indicative of sound model quality. Groups of boxplots refer to three arrangements of lakes. One is based on geographical location (see Table 1) and distinguishes lakes situated north of the alpine ridge (‘north’, orange) from those to its south (‘south’, red). The second arrangement splits all lakes according to depth, separating lakes deeper than 35 m (‘deep’, dark blue) from the rest (‘shallow’, light blue). The third divides lakes with respect to renewal times of less than 4 years (‘fast’, dark green) or more (‘slow’, light green).
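As a brief worked step behind the factor quoted above, assume that the errors \(e\) between simulated and observed LST are normally distributed with zero mean and standard deviation \(\sigma \) (an idealization, not something shown here for every lake and month):

$$\begin{aligned} {\text {MAE}}&={\text {E}}|e| = \sigma \sqrt{\frac{2}{\pi }}, \\ {\text {RMSE}}&=\sqrt{{\text {E}}[e^2]} = \sigma , \\ \frac{{\text {RMSE}}}{{\text {MAE}}}&=\sqrt{\frac{\pi }{2}} \approx 1.25. \end{aligned} $$

A ratio of RMSE to MAE close to this value is therefore consistent with approximately normal errors, whereas heavy tails or outliers would push the ratio higher.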

Notable results illustrated by Fig. 6 include an overall satisfying performance, which is visible across all performance metrics, as well as the cycle of performance, which roughly increases from March to midsummer and decreases afterwards. From the latter, April and June seem to deviate slightly, showing performance values somewhat below expectations. These differences indicate a larger fraction of stochastic variation (i.e. \(\epsilon \) in Eq. 1), which cannot be captured by the applied deterministic approach.

In April, the lakes’ top layers have in general warmed up enough so that vertical water-temperature profiles do not vary much with depth. This entails that the whole water column is characterized by similar density. Surface water, reaching its maximum density at \(4\,^{\circ }{\hbox {C}}\), will start to sink to the bottom. In addition, strong winds inducing momentum on the surface may now trigger currents penetrating throughout the entire water body, accelerating the process known as ‘spring turnover’. The date of its onset, however, varies from year to year as it depends on the ice breakup time (if any) as well as on the course that weather has taken so far. Therefore, LST in April is more difficult to predict than in midsummer. This is most apparent when comparing ‘deep’ and ‘shallow’ lakes. LST-models at ‘deep’ lakes show a significantly better performance than those for ‘shallow’ lakes, expressed in pronounced differences concerning correlation, MRE and MAE. This is at least partly due to the comparably low LST-variance associated with ‘deep’ lakes and the absence of outliers.

In May, humid, unstable air masses normally trigger frequent and intense precipitation events. Together with melting snow-pack this drives water exchange processes, which particularly affect lakes exhibiting short renewal times. MAE values displayed in Fig. 6 reveal that from June on those associated with group ‘fast’ are significantly larger than values referring to group ‘slow’ even though rainfall-totals decrease along this period.

In June, a suddenly arising cold spell, which sometimes splits into two parts (‘Schafskälte’ in mid-June and ‘Siebenschläfer’ at its end), weakens the link between monthly atmospheric covariates and LST, too. These events are caused by rather cool and humid air masses originating from the northwest Atlantic and propagating east-southeastwards until the Alps are reached. This characteristic pattern-change causes the above-mentioned slight deviation from the expected cycle of performance. Beyond that, it strongly increases the difference between models associated with the group ‘north’ and those of group ‘south’. This is well recognizable in Fig. 6, particularly when taking into account MAE.

The overall high performance in midsummer is based on several effects. Lake stratification is fully developed and stable conditions inside prevail (Imberger and Ivey 1993). Temperature differences throughout the epilimnion are negligible, the atmosphere is well mixed, and persistent low air-pressure generally dominates Central European weather. The development of air-temperature is closely followed by LST. However, local and heavy downpours as well as high evaporation, which are also characteristic of this time of the year, can impact small water bodies. The resulting effects are best visible in MAE concerning group ‘fast’, which in this study coincides with small lakes (see Table 1).

Towards the end of August, first signs of fall associated with passages of cold fronts may be noticeable in northern regions. Weather south of the Alps, however, remains usually stable and warm until the end of the month. This causes the performance of group ‘south’ to reach its maximum skill level, while group ‘north’ already begins its decrease.

September and October are accompanied by meteorological and hydrological phenomena whose dates of occurrence vary from year to year. Amongst these, the so-called ‘Altweibersommer’ (sometimes referred to as ‘Indian summer’) is the most important. It stands for a period of persistent high pressure resting over Central Europe, which may last from mid-September to early October. In this case, warm daytime air-temperatures and cool nights predominate. Such variation in occurrence weakens the atmosphere-LST linkage even more. Another source of uncertainty is the onset and length of ‘fall turnover’. When surface waters begin to cool in late autumn, surface water sinks once it reaches its maximum density at \(4\,^{\circ }{\hbox {C}}\). As in spring, this process is strongly reinforced by the influence of winds.

Figures 7, 8 and 9 present the main outcome of this study, namely the creation of a high quality, monthly LST dataset for Austrian lakes that starts as early as 1880. These figures serve two purposes. First, they depict LST developments for selected months from 1880 to 2013 by combining reconstructions (1880–1949) with homogenized observations. Second, they compare model simulations amongst each other and with observations by showing their temporal evolution together with associated boxplots. As for the observation period, this display mode adds to the validation procedure discussed above by graphical means in terms of LST curves and associated summaries of the corresponding distributions. With regard to reconstructions, the use of three transfer-functions gives a rough estimate of uncertainty. This corridor is supplemented by the uncertainty induced by the transfer-functions themselves. Gray bands indicate the 99% confidence interval associated with simulated LSTs.

The form of presentation was chosen since it directly shows the quality of the achieved results in an unaltered way. Results could have been aggregated to seasonal levels or groups of lakes, which would decrease variance and perhaps convey an impression of even more credibility. However, the chosen mode of display in Figs. 7, 8 and 9 gives an unfiltered view that matches the purpose of this study more properly. The findings depicted below refer to characteristic and (from a modelling perspective) difficult months (see Fig. 6 and the discussion above) as well as representative lakes.

Fig. 7 LST of Lake Millstätter See in April. Slopes for reconstructed, modelled and observed time-series are indicated by k, \(k_{mod}\) and \(k_{obs}\), respectively. Trend lines within the second and third period refer to observations. The table presents performance statistics associated with models employed for reconstructions

Figure 7 shows the evolution of April-LST at lake Millstättersee from 1880 to 2013 and is divided into three major parts. Millstättersee belongs to the groups ‘deep’ and ‘slow’ and is situated south of the Alps. The left part (white background) depicts reconstructions. From 1950 to 2013, observations together with LST simulations are presented (yellow: 1950–1985 and red: 1986–2013), and the right part shows LST distributions as boxplots pertaining to the presented LST developments, whereby background colors indicate periods. LST reconstructions show a slight increase and variances that change on decadal scales. Until 1900, LSTs are characterized by high variances and fluctuations on a time-scale of less than 10 years. The following three decades show generally reduced variability (aside from a few pronounced deflections at the end of WWI), separating the cooler first half of this period from its warmer second half. While LST developments in the 1930s seem to resemble the outgoing nineteenth century at a somewhat lower temperature level, the 1940s are characterized by a rather steep LST increase and low variance values. Observations can be split according to their trends. During the first 35 years, overall decreasing LSTs can be seen, while the slope afterwards is about three times higher and positive. The associated variances decrease towards 1985 and show a recurring pattern with larger values thereafter. This description is summarized in boxplots in the right part. Concurring medians, averages, quartiles as well as variances demonstrate the good model performance, which may also be seen from temporal LST developments.

Fig. 8 LST of lake Mattsee in July. The composition of this figure is analogous to Fig. 7

Figure 8 refers to Mattsee (a ‘shallow’ and ‘slow’ lake north of the alpine ridge) and July. One obvious feature of the reconstructed LSTs is their variability. From 1880 to 1935, pronounced and periodic deviations dominate, which are negligible throughout the final 15 years. LST levels generally decrease from 1880 until 1913, when they reach their twentieth century minimum, which may have been caused by the eruption of the Alaskan Novarupta (visible across all lakes), the largest eruption of the twentieth century (Hildreth and Fierstein 2012). From then on, LSTs experience a very pronounced increase until 1935, at which level they remain until 1949. Observed LST trends share signs with those described above, but to a much smaller extent. The appearance of sudden and sharp deflections, reaching highest and lowest percentiles in immediate succession, is most prominent.

Fig. 9 LST of lake Wörthersee in October. The composition of this figure is analogous to Fig. 7

After the decline in the 1880s, LSTs at Lake Wörthersee (member of groups ‘deep’ and ‘slow’), situated south of the Alps, follow a linear increase until 1910, which is interrupted by a significant, one-time drop in temperature (Fig. 9). After that, a decade of cool temperatures is succeeded by stable LSTs at recovered levels until the mid-1960s. Then a large, positive anomaly is followed by steeply decreasing LSTs over a period of somewhat less than 10 years until the lowest observed temperatures occur. A general warming prevails for the rest of the period.

Clearly visible LST oscillations from 1880 to 1950 as well as slight increases as depicted in Figs. 7, 8 and 9 are commonly shared by the investigated lakes. Their extent, however, depends on the time of the year and varies from lake to lake. Apparent sequences of high and low LST-levels coincide with the occurrence of so-called ‘outstanding periods’ identified in Central European air-temperatures, whose prints are to be found in glacial advances and retreats as well (Matulla 2005).

The temporal development of measured LSTs from 1950 onwards adds to the proof of mankind’s fingerprint on climate based on air-temperatures (Hasselmann et al. 1995; Zwiers and Zhang 2003; Mitchell et al. 2001; Moss et al. 2008), since LST records are independent of air-temperature observations and originate from another climate sphere. The decline in LST until the mid-1980s and the pronounced increase afterwards can be observed most clearly from March to August and are visible in all lakes. The reasons for this development are the same as in the case of the atmosphere. Steadily rising amounts of industrial aerosols, which reflect incoming solar radiation back to space, caused LSTs to cool until the LRTAP (Convention on Long-Range Transboundary Air Pollution, 1979) and further agreements came into effect. Ever since, diminishing loads of aerosols have continuously unmasked the anthropogenic greenhouse effect, which has caused lakes to warm.

The right part of Figs. 7, 8 and 9 summarizes the three just mentioned periods (i.e. 1880–1949, 1950–1985 and 1986–2013) via their LST-distributions condensed in boxplots. Arithmetic mean and median almost always coincide closely, indicating symmetric distributions. More important, however, is the good agreement between observed and simulated distributions, which is true for all three models. This fact places confidence in the reconstructions and therefore in the LST changes between the considered periods too. Simulated inter-quartile ranges are somewhat smaller than observed ones. This is attributable to the linearity of the MLR approach and may be directly deduced from Eq. 1. The variability associated with \(\epsilon \) cannot be captured by the applied models. The share of variance captured by the models is given by the ratio of simulated variance to observed variance, which equals the square of Pearson’s \(\rho \) shown in Fig. 6; the share not captured is the complement, \(1-\rho ^{2}\). Hence, the seasonal cycle of the uncaptured variance, which attains its minimum (best case) in mid-summer, can be deduced from there.
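The link between the variance ratio and Pearson’s \(\rho \) can be checked numerically. The following is a minimal sketch, not the models of this study: it assumes a single covariate instead of the full MLR setup, and the arrays `air` and `lst` hold synthetic placeholder values rather than HISTALP or lake data. All function and variable names are illustrative.

```python
import random
import statistics as st

def ols_fit(x, y):
    """Least-squares intercept and slope for a single covariate."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = st.fmean(a), st.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    norm = (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return cov / norm

# Synthetic placeholder series (assumption): air temperature and a noisy LST.
rng = random.Random(0)
air = [rng.gauss(10.0, 2.0) for _ in range(200)]
lst = [4.0 + 0.8 * t + rng.gauss(0.0, 1.0) for t in air]

intercept, slope = ols_fit(air, lst)
sim = [intercept + slope * t for t in air]

var_ratio = st.pvariance(sim) / st.pvariance(lst)  # simulated / observed variance
rho = pearson(sim, lst)

print(f"variance ratio = {var_ratio:.3f}, rho^2 = {rho ** 2:.3f}")
print(f"share not captured = {1.0 - var_ratio:.3f}")
```

Within the calibration sample the two printed quantities agree, and the ratio stays below one, which is why simulated spreads are narrower than observed ones for a least-squares fit.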

Considering individual lake-month combinations, the mode of display used in Figs. 7, 8 and 9 is suitable for a detailed discussion of LST-evolutions from 1880 onwards. If a general overview is sought, however, this mode is not feasible. In order to complement the above presentation with an illustration granting an overview, we focus on Lake Bodensee and Lake Neusiedler See since they are opposing extremes in various respects. Hence, a figure based on LST-differences between these lakes can be used to represent all considered lakes.

Fig. 10

The conceptual framework of this figure matches that of Fig. 6 and, e.g., Fig. 7. Boxplots show monthly distributions of LST-differences (BOD-NEU) centered at the zero-line. Bold black lines present courses of monthly medians of LST-differences. Values above boxplots are LST averages (please see the text for more details)

The conceptual framework of Fig. 10 makes use of the conventions introduced in Figs. 6 and 7. Panels are assigned to the periods considered in this study and indicated by corresponding background colors. The abscissa depicts months while the ordinates refer to LSTs. The right-hand side refers to median LST-difference values (LST at Lake Bodensee minus LST at Lake Neusiedler See), which are connected with bold black lines. Their shapes are determined by lake characteristics (see Table 1).
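The quantities displayed in such a figure (monthly quartiles and medians of LST-differences between two lakes) can be sketched as follows. This is an illustration under assumptions: the input layout (a mapping from `(year, month)` to LST in °C), the function name `monthly_difference_stats` and the inclusive quantile method are choices made here, and the demo values are synthetic placeholders, not Bodensee or Neusiedler See records.

```python
import random
import statistics as st

def monthly_difference_stats(lst_a, lst_b):
    """Return {month: (q1, median, q3)} of the differences lst_a - lst_b.

    lst_a, lst_b: dicts mapping (year, month) -> LST in degrees Celsius.
    Only (year, month) keys present in both records are used.
    """
    by_month = {}
    for key in lst_a.keys() & lst_b.keys():
        by_month.setdefault(key[1], []).append(lst_a[key] - lst_b[key])
    stats = {}
    for month, diffs in sorted(by_month.items()):
        # Three cut points for n=4: first quartile, median, third quartile.
        q1, med, q3 = st.quantiles(diffs, n=4, method="inclusive")
        stats[month] = (q1, med, q3)
    return stats

# Synthetic placeholder records for one period (assumption, not observations).
rng = random.Random(1)
years = range(1950, 1986)
lake_a = {(y, m): 12.0 + rng.gauss(0.0, 1.0) for y in years for m in range(1, 13)}
lake_b = {(y, m): 13.0 + rng.gauss(0.0, 1.5) for y in years for m in range(1, 13)}

stats = monthly_difference_stats(lake_a, lake_b)
for month, (q1, med, q3) in stats.items():
    print(month, round(q1, 2), round(med, 2), round(q3, 2))
```

Calling the function once per period (1880–1949, 1950–1985, 1986–2013) would yield one set of monthly boxes and one median curve per panel.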

It appears reasonable to first focus on the period 1880–1949, as the evolution of median values experiences an increasing shift towards negative values, driven by LST increases that are less marked at Lake Bodensee than at Lake Neusiedler See. The comparably huge water-body of Lake Bodensee allows LSTs in March to clearly exceed those at Lake Neusiedler See. This, however, changes in April with the advent of spring turnover and rising air-temperatures. In May, median LST-differences reach minimum values, followed by a continued rise afterwards along with developing stratification until its completion in midsummer (Imberger 1985). This stable state strengthens the link between the atmosphere and the uppermost water-layer. The close linkage is reflected by the highest model skill (see Fig. 6), is largely independent of lake characteristics, and causes LST-differences to attain values close to zero (see Fig. 10). Afterwards this trend is supported by shorter day-lengths and reduced air-temperatures, which exert less effect on Lake Bodensee LSTs because the vast amount of stored heat prevents rapid surface cooling.

The general appearance of this course experiences a transition through the latter two periods (second and third row in Fig. 10), which is—apart from the above mentioned shift—most obvious in respect to temperature gradients. In this respect the weakening of the pronounced decline in spring, the switch in steepness of temperature steps from July to August and August to September as well as the noticeable flattening around May stand out.

Pronounced inter-quartile ranges such as those of May and October are due to uncertainties in the occurrence of LST-governing processes, triggered by random times of occurrence, random intensities and random duration (e.g. spring and fall turnover). Small inter-quartile distances like the ones in July and August point to a high degree of regularity. Fully developed lake stratification achieved in summer counteracts the spread of disturbances. At this time of the year these are seldom accompanied by magnitudes exerting effects visible on monthly scales anyway.

Identifiable differences of inter-quartile ranges between the periods are relatively small in general, and any interpretation of changes requires great caution. This is related to their low level of confidence compared to the above discussed medians, because first and third quartiles carry twice the error of the median and inter-quartile ranges (according to error propagation theory) are, hence, associated with nearly triple median errors.
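The factor of nearly three can be reproduced with a short calculation. The sketch below assumes that the errors of the first and third quartile are independent and add in quadrature; this is one reading consistent with the statement above, not a procedure spelled out in the text. The function name `iqr_error` is illustrative.

```python
import math

def iqr_error(median_error, quartile_factor=2.0):
    """Error of Q3 - Q1, assuming independent quartile errors (quadrature sum)."""
    quartile_error = quartile_factor * median_error
    return math.sqrt(quartile_error ** 2 + quartile_error ** 2)

# Inter-quartile-range error expressed in units of the median error.
ratio = iqr_error(1.0)
print(round(ratio, 2))  # 2.83, i.e. 2 * sqrt(2)
```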

These considerations, however, do not concern the representativity of Fig. 10 with regard to the desired comprehensive overview of LSTs and should not obscure strong LST increases that are most pronounced from May to August (see the monthly averages depicted above the boxplots).

Livingstone and Lotter (1998) showed that LSTs of individual lakes are highly correlated with local air-temperature and may be sufficiently well modelled from air-temperature alone. Results of this study substantiate this claim and confirm that the use of more remote sites providing air-temperature records entails no loss of quality as measured by skill (owing to the long de-correlation distance of monthly air-temperatures). Beyond that, the findings show that precipitation totals carry significance in some cases and that air-pressure occasionally does too. To this effect, the presented results may be considered expedient also with regard to statements on the spatial reach of lake provinces involving scales from several tens of km (Magnuson et al. 1990; George et al. 2000) to much larger geographical extension (Benson et al. 2010).

However, the close linkage between air-temperature and LSTs, the aforementioned extensive de-correlation lengths of monthly air-temperature, and the detection of homogeneous-LST groups of lakes may be combined to advantageous use. Combined use may disentangle problems whose resolutions are hampered by e.g. the large number of lakes or the vast extension of geographical regions or poor data-availability. The existence of homogeneous groups of lakes extending over significant geographical areas and the close link between LST and air-temperature (which is in line with large de-correlation distances associated with air-temperatures) show the opportunity to simulate hydrobiological states in lakes by atmospheric observations even at remote stations.

Homogenization procedures (Auer et al. 2007) enhancing the quality of data sets by considering the same parameter at several stations (e.g. air-temperature) may benefit from findings presented here (e.g. extending single-parameter, single-sphere methods to multi-parameter, multi-sphere techniques).

Climate change induced alterations act through changes in air-temperature, wind and precipitation, which influence the amount of nutrients in lakes, temperature profiles, stratification, periods of ice-cover as well as overall biological activities. Sustained warming trends of air-temperatures are expected to particularly affect the epilimnion, but also thermal profiles and hence the mixing processes in lakes (Kirillin 2010; Stefan et al. 1998). Borasi et al. (2013) list the following effects on lakes that are potentially caused by climate change: (1) the onset of water warming in spring at earlier stages than now (Gronskaya et al. 2001); (2) increasing temperatures throughout different lake levels (surface and lower lying levels; Endoh et al. 1999); (3) an extension of periods during seasonal cycles, when lake temperatures exceed temperature levels observed in summer so far (Jarvet 2000); and (4) shorter periods of ice cover and thinner ice layer thicknesses (Tood and Mackay 2003).

Climate change is expected to affect the dynamics of lakes throughout Europe, especially with respect to lake productivity, frequency and severity of algal blooms, water quality, as well as changes in water color (Dokulil and Teubner 2003). In addition, aquatic ecosystems are severely affected by changes in water temperature. Since the end of the nineteenth century, substantial changes in the aquatic biocoenosis of lakes, which are attributable to water temperature increases, have been observed. Since fish are poikilothermic organisms, they are particularly sensitive to temperature changes. As pointed out by Schmutz and Jungwirth (2003), even slight changes may therefore have huge effects on the distribution of species, since all relevant physiological processes (e.g. metabolism, food intake, growth rate), behavior, habitat selection, swimming ability and predator–prey interactions are determined by water temperature. Hence, several aquatic species that are already critically endangered due to various anthropogenic impacts on inland water bodies will be on the brink of extinction in case of additional detrimental climate effects.

Datasets reaching far back in time almost inevitably cover a substantial variety of climate states. The approach demonstrated in this paper may have multifarious applications both for localities with poor observations and for sites and time periods with no observations at all. The LST dataset provided here contains the following climate states: the span from shortly after the end of the Little Ice Age, when anthropogenic effects were small, to the final decades of the twentieth century, which are significantly impacted by mankind; the ‘hiatus’ (Stocker et al. 2013) in the early twenty-first century, which is characterized by record \(\hbox{CO}_{2}\) levels and a decade of very high, stable air-temperatures; conditions triggered by volcanic eruptions such as Krakatoa (1883), which was about seven times as strong as Pinatubo (1991), during periods not influenced much by mankind; and effects induced by industrial aerosols at times primarily characterized by anthropogenic climate change.

Long LST datasets of high quality are a prerequisite for a number of applications such as the calibration of paleolimnological inference models (Livingstone and Lotter 1998), the transformation of air-temperature changes to LSTs and the validation of lake thermal models (Gal et al. 2003). Another possible use is downscaling, where LST data can help establish linkages between regional and synoptic scales, which are required, for instance, to foretell changes in extreme events driven by potential future pathways of mankind (e.g. Representative Concentration Pathways; Moss et al. 2008). The same is applicable to reconstructions of LSTs back in time or to closing gaps in LST records. A somewhat different use rests on the wealth of different (hydrological and atmospheric) states contained in this long-term dataset. This wealth may be used to quickly derive first estimates of impacts due to altered conditions by drawing analogs.

4 Summary and outlook

The central goal of this study is the generation and provision of a long-term, high-quality LST dataset representative for Austria extending back to 1880. This has been achieved by a two-step approach: first, by subjecting about 6 decades of LST-observations (previously digitized from hydrological yearbooks) to a homogenization procedure conducted on the basis of lake-networks, which have been identified by Rotated Empirical Orthogonal Functions and hierarchical clustering techniques; and second, by reconstructing LSTs from homogeneous atmospheric covariates provided by HISTALP via transfer-functions that have been identified through a selection process. This process enforces mathematical, physical and quality auxiliary conditions. Model-setups based on eligible covariates are entered into a calibration-validation analysis. In total about 160 million experiments are carried out, which are followed by a performance assessment. The paper lays out these analysis steps and discusses the evaluation of model skill by means of various performance measures as well as the attained results.
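The idea of a calibration-validation analysis can be illustrated with a small split-sample experiment. The sketch below is a generic scheme under assumptions and does not reproduce the study’s setup: it uses one covariate, random half-splits, Pearson correlation as the only skill measure, and synthetic placeholder data. The names `split_sample_skill`, `n_experiments` and `calib_fraction` are hypothetical.

```python
import random
import statistics as st

def fit_line(x, y):
    """Least-squares intercept and slope for a single covariate."""
    mx, my = st.fmean(x), st.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = st.fmean(a), st.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5

def split_sample_skill(x, y, n_experiments=100, calib_fraction=0.5, seed=0):
    """Fit on a random calibration subset, score on the remaining validation years."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    n_cal = int(len(idx) * calib_fraction)
    skills = []
    for _ in range(n_experiments):
        rng.shuffle(idx)
        cal, val = idx[:n_cal], idx[n_cal:]
        intercept, slope = fit_line([x[i] for i in cal], [y[i] for i in cal])
        sim = [intercept + slope * x[i] for i in val]
        skills.append(pearson(sim, [y[i] for i in val]))
    return skills

# Synthetic placeholder series (assumption): covariate and noisy target.
rng = random.Random(42)
air = [rng.gauss(10.0, 2.0) for _ in range(200)]
lst = [4.0 + 0.8 * t + rng.gauss(0.0, 1.0) for t in air]

skills = split_sample_skill(air, lst)
print(f"median validation skill: {st.median(skills):.2f}")
```

Scoring only on data withheld from the fit keeps the skill estimate from being inflated by the calibration itself; repeating the split many times gives a distribution of skill rather than a single value.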

The intention of making this LST dataset freely available to the research community is driven by the underlying assumption that it is valuable for further investigations. The dataset covers a significant extent of time featuring various climate states and miscellaneous influences under varying anthropogenic impact. The methodology applied in this research has potential for resolving lake surface temperatures for lakes with incomplete or uncertain records. Datasets of this nature are essential to related fields such as lake thermal modelling, palaeoclimate research, prediction of future lake behaviour, etc. and may hence lead to new insights and contribute to the current body of knowledge. Amongst many potentially beneficial endeavors, the derivation of future LST developments driven by different pathways of mankind appears as a natural continuation of this work.