Introduction

Human-induced pressures on the Baltic Sea ecosystem have significantly accelerated during the twentieth century, especially in terms of anthropogenic nutrient loading (e.g., Savchuk et al. 2008 and references therein), leading to increased nutrient concentrations (e.g., Fonselius and Valderrama 2003), increased summer cyanobacteria blooms (e.g., Finni et al. 2001) and expanding hypoxia (e.g., Conley et al. 2009 and references therein). In addition to the direct anthropogenic impact on the Baltic Sea ecosystem, climatic conditions show strong and partly unprecedented changes in recent decades (e.g., Lehmann et al. 2011) which, combined with anthropogenic pressures, have been associated with an ecosystem-wide regime shift in higher trophic levels in the Baltic proper (e.g., Möllmann et al. 2009 and references therein).

The combined effects of changed trophic and climatic conditions have raised concerns about the Baltic Sea ecosystem being pushed towards its natural limits (BACC Author Team 2008). As these limits are not known and are difficult to estimate, ecosystem models can be used for control and sensitivity studies on the reaction of the ecosystem to different observed or expected changes in atmospheric, terrestrial, and biogeochemical forcing. A first step towards this is to simulate the changes from a less anthropogenically disturbed system to the present state.

There are, in addition, practical needs for a good description of the pre-industrial state of the Baltic Sea ecosystem, namely the implementation of recent European Union legislation (the Water Framework Directive and the Marine Strategy Framework Directive) and of the HELCOM Baltic Sea Action Plan. In these, water quality is quantified in relation to the so-called reference conditions that should represent the system undisturbed by human influence. However, little is known about the pre-industrial state of the Baltic Sea and one has to rely to a large extent on estimations from dynamic models (Schernewski and Neumann 2005; Savchuk et al. 2008).

In this study, we present a holistic transient time-dependent reconstruction of the development of hydrography and eutrophication of the Baltic Sea over the past 150 years. The reconstruction involves developing meteorological fields of high temporal and spatial resolution, river runoff and nutrient loads from land and atmosphere to drive a coupled physical–biogeochemical model. The present paper focuses on describing the inter-annual to multi-decadal general variations of the drivers and their impact on hydrographical and hydrochemical conditions, as well as consequences for the biogeochemical cycling in the Baltic Sea.

Materials and Methods

The BALTSEM Model

The major tool in this paper is the coupled physical–biogeochemical model BALTSEM (Savchuk 2002; Gustafsson 2003). BALTSEM is a coupled-basin model that resolves the Baltic Sea with 13 marine basins (Fig. S1, Electronic Supplementary Material). The exchanges between the horizontally homogeneous basins are forced by wind, varying sea level and density differences between the basins, and are controlled by frictional resistance and dynamical flow contraction due to Bernoulli and Coriolis effects (Gustafsson 2000, 2003). Vertical stratification is resolved by a variable number of water layers created by inflows and kept below a maximum number by fusion (Gustafsson 2000). Vertical mixing is described by a bulk mixed-layer model for the surface layer (Stigebrandt 1985) and a deep-water mixing parameterization where the turbulent vertical diffusion coefficient depends on stratification (Stigebrandt 1987) and mixing wind (Axell 1998). Mixing into dense gravity currents is modeled according to Stigebrandt (1987). BALTSEM also includes the sea-ice model by Björk (1997), with dynamics adapted to the Baltic Sea (Nohr et al. 2009). Heating/cooling and evaporation at the sea surface are calculated using bulk formulae (Björk 1997; Gustafsson 2003). In the northern Kattegat, open boundary conditions are implemented.

The biogeochemical component of BALTSEM closely follows Savchuk (2002). It describes the dynamics of nutrients and oxygen, including nitrate, ammonia, and phosphate, as well as particulate organic matter consisting of phytoplankton (autotrophs), dead organic matter (detritus), and zooplankton (heterotrophs). Three functional groups of phytoplankton are included: diatoms, flagellates and others, and cyanobacteria. Cyanobacteria cover part of their nitrogen demand by nitrogen fixation. Organic material sinks and enters the model sediment as benthic nitrogen and phosphorus. Mineralization of organic matter occurs due to heterotrophic excretion, during sinking of detritus, and in the bottom sediments. All mineralization pathways are temperature dependent and coupled to oxygen consumption. In the presence of oxygen, a fraction of the nitrate produced by mineralization and nitrification is released into the overlying water, while the remaining fraction is denitrified. Under anaerobic conditions all mineralized nitrogen is released as ammonium. Likewise, under aerobic conditions one part of the phosphate produced by mineralization is retained (sequestered) in sediments, mimicking adsorption and complexation. The remaining fraction is released into the overlying water. Under anaerobic conditions, no phosphorus retention occurs and even some of the previously sequestered phosphorus is released into the water column. Finally, the sediment nutrients are buried at a constant rate. Hydrogen sulfide concentrations are represented by “negative oxygen” equivalents (1 mol H2S = −2 mol O2). Nutrients are removed from biogeochemical turnover by export to the North Sea, burial in the bottom sediments, and, in the case of nitrogen, by denitrification.
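The “negative oxygen” convention can be illustrated with a minimal sketch. Only the factor of 2 comes from the text; the function names are hypothetical, and the assumption that oxygen and hydrogen sulfide never coexist in the same water parcel is made here purely for illustration.

```python
def oxygen_equivalent(o2: float, h2s: float) -> float:
    """Combine O2 and H2S into one signed value (same molar units as the inputs).

    Each mole of H2S counts as -2 mol O2, as stated in the text.
    """
    return o2 - 2.0 * h2s


def split_equivalent(signed_o2: float) -> tuple:
    """Recover (o2, h2s) from the signed value.

    Assumption (not from the source): O2 and H2S do not coexist,
    so a negative value is interpreted entirely as H2S.
    """
    if signed_o2 >= 0.0:
        return signed_o2, 0.0
    return 0.0, -signed_o2 / 2.0


# Hypothetical anoxic sample with no oxygen and 50 units of H2S.
signed = oxygen_equivalent(0.0, 50.0)
o2, h2s = split_equivalent(signed)
```

Carrying a single signed variable lets one state variable describe both oxic and sulfidic conditions.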

The model was recently evaluated against both observations and two three-dimensional models for a contemporary time period, 1970–2005 (Eilola et al. 2011). The comparison showed that BALTSEM performed as well as the three-dimensional models, but also that all models suffered from distinct and differing deficiencies.

The hydrodynamic part of the model is forced by 3-hourly weather parameters: winds, temperature, humidity, cloudiness, pressure, and precipitation, by daily average sea level in Kattegat, river runoff and stratification at the Skagerrak boundary. In addition, the biogeochemical part needs nutrient loads from land and atmosphere, and concentrations of nutrients in the Skagerrak.

The simulations presented below were initialized from an additional spin-up run, using physical driving forces for 1850–2006, but with nutrient loads kept constant at the 1850 level. This way of initialization made sure that the slowly responding nutrient pools were equilibrated to pre-industrial levels. However, it also caused an initial transition in, e.g., salinities since the physical conditions were different around 1850 as compared to around 2006 (see “Results”).

Reconstructed Forcing

Atmospheric Forcing

The multivariate HIgh RESolution Atmospheric Forcing Fields (HiResAFF) (Schenk and Zorita 2011, 2012) comprise consistent daily fields of sea-level pressure (SLP), 10 m zonal and meridional wind, surface air temperature at 2 m (T2m), precipitation, relative humidity, and total cloud cover with high spatial resolution (0.25° × 0.25°, ~25 km). The analog-method (AM, Zorita and von Storch 1999) was applied as statistical upscaling tool using long records of historical station data of daily sea-level pressure (Ansell et al. 2006) and monthly air temperature (Jones and Moberg 2003). The AM searches for the two times, t in the past since 1850 and t′ in the recent past (1957–2007) that are most similar according to observations. In the latter period, complete atmospheric fields for t′ are available from a regional climate simulation which then can be used as analog fields at time t. Hence, the fields at t since 1850 are reconstructed by their analogous fields from t′. The atmospheric fields (analogs) are taken from the Regional Climate Atmosphere Ocean (RCAO) simulation for 1958–2007 forced by ERA40 reanalysis at its boundaries (Meier et al. 2011) and RCA3 in case of monthly T2m (Samuelsson et al. 2011). Hence, the reconstruction inherits properties and biases of these simulations.
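The core idea of the analog search can be sketched as a nearest-neighbour lookup. This is an illustrative sketch, not the HiResAFF implementation: the Euclidean distance, the function names, and the toy station values are all assumptions, and the published method may measure similarity differently.

```python
import math


def find_analog(target, library_predictors):
    """Index of the library day t' whose predictor vector is closest to `target`."""
    best_index, best_distance = -1, math.inf
    for index, candidate in enumerate(library_predictors):
        distance = math.dist(target, candidate)  # Euclidean distance (assumed metric)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def reconstruct_fields(historical_predictors, library_predictors, library_fields):
    """For each historical day t, borrow the complete field of its analog day t'."""
    return [
        library_fields[find_analog(day, library_predictors)]
        for day in historical_predictors
    ]


# Hypothetical toy data: SLP (hPa) at three stations on three library days.
library_predictors = [
    (1012.0, 1008.0, 1003.0),
    (995.0, 1001.0, 1010.0),
    (1020.0, 1021.0, 1019.0),
]
library_fields = ["field_A", "field_B", "field_C"]  # stand-ins for gridded fields
historical_predictors = [(996.0, 1000.0, 1009.0), (1019.0, 1020.0, 1020.0)]

analogs = reconstruct_fields(historical_predictors, library_predictors, library_fields)
```

Because every reconstructed day is a complete simulated day, all variables of that day stay physically consistent with each other.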

Schenk and Zorita (2011, 2012) showed that HiResAFF successfully reproduces the frequency distribution, serial correlation and variance on daily and monthly scales for the different meteorological variables, partly with exceptions of humidity and cloud cover. Here, the long-term variability of HiResAFF is evaluated against NCEP/NCAR reanalysis and the RCAO simulation.

Compared to the previous reconstruction of forcing fields by Kauker and Meier (2003) for the period 1902–1998, HiResAFF aims to extend the period back to 1850. Kauker and Meier (2003) use daily station SLP and monthly mean gridded fields of surface air temperature and precipitation as predictors to reconstruct SLP and wind on daily and other variables on a monthly scale. Further, their method is based on independent regressions for each predictand, and therefore variability and co-variability of the reconstructed fields are not explicitly retained. In addition, final probability distributions may be distorted compared to the original distributions, in particular for non-normally distributed daily variables like wind. The AM explicitly targets the reconstruction of a realistic probability distribution and the co-variability between the reconstructed fields (Schenk and Zorita 2012).

River Runoff

Monthly mean river flows for the period 1850–2009 were calculated from five different data sources. The method follows Meier and Kauker (2003), but for this study the record was prolonged and updated.

Prior to 1900, no runoff data are available and therefore a reconstruction based on temperature and atmospheric circulation indices was used for three basins: the northern, southern, and eastern parts of the Baltic catchment area (Hansson et al. 2011).

For 1901–1920, monthly discharges for the basins were calculated from total discharge data to the entire Baltic catchment by Cyberski and Wroblewski (2000) using climatological mean ratios between the basins and the total discharge. After 1921, the reconstruction by Mikulski (1986) of basin-wise river discharge based on 17 rivers is used. The catchments of these rivers represent 63 % of the total drainage basin.
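Distributing a total discharge over basins with climatological mean ratios amounts to a simple proportional split. The sketch below is illustrative only: the basin names, ratio values, and discharge figure are hypothetical and are not taken from the data sets cited above.

```python
def split_total_discharge(total, ratios):
    """Distribute a total discharge over basins in proportion to `ratios`.

    The ratios are renormalized so that the basin discharges always
    sum to the total, even if the supplied ratios do not sum to one.
    """
    norm = sum(ratios.values())
    return {basin: total * ratio / norm for basin, ratio in ratios.items()}


# Hypothetical climatological mean ratios for one calendar month.
january_ratios = {"northern": 0.45, "southern": 0.30, "eastern": 0.25}

# Hypothetical total discharge for that month (m3 s-1).
basin_discharge = split_total_discharge(15000.0, january_ratios)
```

A split of this kind preserves the total but gives every basin the same relative anomaly in a given month.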

For 1950–2004, monthly river runoff data from Bergström and Carlsson (1994) were used, based on data from some 200 stations, representing 86 % of the catchment. Missing values were replaced by results of the HBV model, a large-scale hydrological model by Graham (1999). For 2005–2009, only HBV model data are available (Phil Graham, pers. comm.). The simulation of discharges to the Bothnian Sea and Bothnian Bay takes river regulation into account.

As the spatial resolution of the various data sets differs (total catchment, basins, individual rivers), the variability of the reconstructed runoff changes with time. For rivers entering Kattegat and Danish Straits, only a climatological mean ratio to the runoff entering the entire Baltic proper was used, except for 1970–2005 when observed runoff was used.

Sea-Level Forcing

A reconstruction of daily average sea level in the southern Kattegat was necessary because measurements are only available since 1887. Contrary to previous reconstructions that used a single variable, the air pressure gradient across the North Sea, to model the sea-level variations (Gustafsson and Andersson 2001), the full fields of SLP from HiResAFF are used. Further, to better resolve the non-linear response in the frequency domain, the analysis was done for three period bands separately, T < 10 days, 10 < T < 200 days, and T > 200 days. SLP and sea-level time-series were filtered using a time-domain low pass filter (Bloomfield 1976). Thereafter, EOF eigenvectors and eigenvalues were determined for each of the filtered time-series. A linear model was fitted using the time-series of the EOF modes and observed sea level. For consistency, the whole simulation period was run with reconstructed sea levels rather than changing to observed sea levels when available.
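The separation into three period bands can be illustrated with centered running means. This is a crude stand-in for the time-domain low-pass filter cited above, not the filter actually used; the window lengths, the edge treatment, and the function names are assumptions, and the EOF and regression steps are not shown.

```python
def running_mean(series, window):
    """Centered running mean; the window shrinks near the ends (assumed edge rule)."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def split_bands(series, short_window=11, long_window=201):
    """Split a daily series into fast, intermediate, and slow parts.

    The three parts sum exactly to the input series:
      slow = long running mean
      mid  = short running mean minus slow
      fast = series minus short running mean
    """
    slow = running_mean(series, long_window)
    smooth = running_mean(series, short_window)
    fast = [x - s for x, s in zip(series, smooth)]
    mid = [s - l for s, l in zip(smooth, slow)]
    return fast, mid, slow


# Hypothetical daily series of 400 values, used only to exercise the functions.
daily = [float(i % 7) + 0.01 * i for i in range(400)]
fast, mid, slow = split_bands(daily)
```

Treating each band separately allows the fitted relation between pressure and sea level to differ between time scales.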

Atmospheric Loads

For 1970–2006, monthly time-series of the atmospheric nitrogen deposition to each basin were compiled at the Baltic Nest Institute (BALTSEM data set; Savchuk et al. 2012) combining estimates based on measurements and modeling. The reconstruction of the time-series for 1850–1970 was made by a linear interpolation between a few selected “reference” years: 1850, 1900, 1925, 1950, 1960, and 1970, for which atmospheric deposition was estimated from available data. The data set and methods are described in detail in Ruoho-Airola et al. (2012). Deposition of organic nitrogen was assumed to be 20 % of wet deposition of inorganic nitrogen (Savchuk et al. 2012 and references therein).
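The interpolation between reference years can be sketched as follows. The reference years are those listed above; the deposition values are hypothetical placeholders in arbitrary units, and the function name is an assumption.

```python
from bisect import bisect_right

REFERENCE_YEARS = [1850, 1900, 1925, 1950, 1960, 1970]
# Hypothetical deposition values (arbitrary units), one per reference year.
REFERENCE_VALUES = [10.0, 20.0, 30.0, 50.0, 80.0, 120.0]


def interpolate_load(year, ref_years=REFERENCE_YEARS, ref_values=REFERENCE_VALUES):
    """Piecewise-linear interpolation between reference years."""
    if not ref_years[0] <= year <= ref_years[-1]:
        raise ValueError("year outside the range of reference years")
    if year == ref_years[-1]:
        return ref_values[-1]
    i = bisect_right(ref_years, year) - 1  # index of the segment's left end
    y0, y1 = ref_years[i], ref_years[i + 1]
    v0, v1 = ref_values[i], ref_values[i + 1]
    return v0 + (v1 - v0) * (year - y0) / (y1 - y0)


# Annual series for the whole reconstructed period.
annual = {year: interpolate_load(year) for year in range(1850, 1971)}
```

Interpolated series of this kind carry no inter-annual variability between the reference years, which should be kept in mind when interpreting year-to-year changes before 1970.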

The atmospheric phosphorus load was assumed based on published ranges to be 15 kg P km−2 year−1, invariably and evenly deposited over the entire Baltic Sea during 1970–2006 (Ruoho-Airola et al. 2012 and references therein). According to Mahowald et al. (2008), the historical increase of phosphorus deposition was less pronounced than the increase of nitrogen deposition, and therefore phosphorus deposition in 1850 was assumed to be only half of that in 1970.

Land Loads

For 1970–2006, nutrient inputs from rivers and direct point sources to each basin were compiled from information contained in the Baltic Environmental Database and the HELCOM database (e.g., HELCOM 2011 and references therein), as well as from a multilingual collection of old papers, technical reports, and working documents as described in Savchuk et al. (2012). Nutrient inputs for 1850–1970 were linearly interpolated between the years 1850, 1900, 1950, and 1970. Pre-industrial loads for 1900 are based on estimates by Savchuk et al. (2008), where direct point sources were re-evaluated from information on population in the large coastal cities and annual human nutrient excretion rates of 3.3 kg N and 0.4 kg P per capita (Meybeck et al. 1989).
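The per-capita estimate of direct point sources is a simple product of population and excretion rate. In the sketch below the two rates come from the text, while the city population is hypothetical, and the assumption that all excreted nutrients reach the sea untreated is made here only for illustration.

```python
N_EXCRETION_KG_PER_CAPITA = 3.3  # kg N per person per year (from the text)
P_EXCRETION_KG_PER_CAPITA = 0.4  # kg P per person per year (from the text)


def point_source_load(population):
    """Return (N, P) loads in tonnes per year for a coastal city.

    Assumption (not from the source): no treatment or retention,
    so everything excreted is discharged to the sea.
    """
    n_tonnes = population * N_EXCRETION_KG_PER_CAPITA / 1000.0
    p_tonnes = population * P_EXCRETION_KG_PER_CAPITA / 1000.0
    return n_tonnes, p_tonnes


# Hypothetical coastal city of 100 000 inhabitants.
n_load, p_load = point_source_load(100_000)
```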

The development during the 1950s and 1960s was remarkable both globally and within the Baltic Sea Region (e.g., Jansson and Dahlberg 1999; Clarke et al. 2006; de Wit et al. 2011). Intensified agriculture facilitated by both mechanization and greatly increased fertilizer application (e.g., Kremser and Schnug 2002; Eriksson et al. 2007; Eglin et al. 2010) led to a much larger increase in nutrient input from the southern agricultural landscapes of Denmark to Poland than from the vast forested areas of Northern Finland and Sweden. Progressing urbanization and, especially, growth of the large coastal cities (Anonymous 2006) were initially not accompanied by appropriate wastewater treatment (Conley et al. 2007). Wastewater treatment started earlier and developed faster at the northwestern shores of the Baltic Sea than at its southeastern shores. These inter-basin differences have been taken into account when prescribing nutrient inputs from riverine and direct point sources for the year 1950 as an intermediate breaking point between 1900 and 1970. The increase of loads between 1850 and 1900 was assumed to be proportional to the load increase between 1900 and 1950.

Boundary Conditions in the Skagerrak

Profiles of salinity, temperature, oxygen, and nutrients were obtained from a hydrographic cross-section between Torungen (Norway) and Hirtshals (Denmark) operated by the Norwegian Institute of Marine Research. This standard section has been operated since 1952 and from 1961 has been sampled at least monthly. Extrapolation to times prior to the 1950s was done using the climatological mean seasonal cycle. In addition, a linear decrease of nutrient concentrations back in time was added assuming that concentrations in 1900 amounted to 85 % of present day concentrations (Savchuk et al. 2008). Data were interpolated to daily resolution and larger gaps were filled with results from climatological mean seasonal cycles.

Marine Validation Data

All available marine observations of salinity, temperature, nutrients, and oxygen from selected stations were extracted through the Data Assimilation System. The data set is described in Gustafsson and Rodriguez-Medina (2011). In addition, surface temperature was compared to daily light vessel measurements obtained from SMHI (e.g., Lindkvist and Lindow 2006).

Results

Reconstructed Model Forcing

Figure 1 presents the long-term decadal variations of important HiResAFF variables together with validation data from NCEP and RCAO, over the Central Baltic Sea region (52.5°N–62.5°N, 17.5°E–22.5°E), covering the period 1850–2009 for winter (DJF), spring (MAM), summer (JJA), and autumn (SON). Objective measures to compare the three data sets, namely the correlation (cor), the deviation of the mean (bias), and the ratio of variance, are presented in Electronic Supplementary Material Table S2.1.
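The three comparison measures can be computed as in the following sketch. The exact definitions used for Table S2.1 are not given here, so the sign convention of the bias (reconstruction minus reference) and the orientation of the variance ratio (reconstruction over reference) are assumptions, as are the function name and the toy numbers.

```python
from math import sqrt
from statistics import mean, pvariance


def skill(reconstruction, reference):
    """Return (cor, bias, variance_ratio) for two equally long series."""
    m_rec, m_ref = mean(reconstruction), mean(reference)
    v_rec = pvariance(reconstruction, m_rec)
    v_ref = pvariance(reference, m_ref)
    covariance = mean(
        [(r - m_rec) * (o - m_ref) for r, o in zip(reconstruction, reference)]
    )
    cor = covariance / sqrt(v_rec * v_ref)  # Pearson correlation
    bias = m_rec - m_ref                    # assumed sign convention
    variance_ratio = v_rec / v_ref          # assumed orientation
    return cor, bias, variance_ratio


# Hypothetical seasonal anomalies from a reconstruction and a reference data set.
rec = [1.0, 2.0, 3.0, 4.0]
ref = [2.0, 4.0, 6.0, 8.0]
cor, bias, variance_ratio = skill(rec, ref)
```

In this toy case the correlation is perfect although the reconstruction has only a quarter of the reference variance, which is why the measures are reported together.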

Fig. 1 Long-term reconstructed seasonal atmospheric forcing over the central Baltic Sea for 1850–2009 (HiResAFF) compared to RCAO and NCEP. 11-year running seasonal SLP anomalies (a), T2m anomalies (b), sums of precipitation anomalies (c), frequency of W-SW wind directions (d), frequency of E-NE wind direction (e), and wind speed anomalies (f). Anomalies are calculated relative to the 1958–2007 mean, except for T2m anomalies that are calculated relative to 1961–2007

The long-term seasonal co-variability of SLP between HiResAFF, NCEP, and the RCAO simulation is very high. In fact, HiResAFF performs partly better than RCAO when both are compared to NCEP. The lowest agreement in SLP between RCAO and NCEP exists for JJA (cor = 0.87) and SON (cor = 0.93), where the corresponding correlations between HiResAFF and NCEP are 0.96 and 0.99, respectively. Also the reconstructed long-term seasonal T2m (Fig. 1b) shows very high agreement (cor > 0.7) with NCEP and RCAO with exception of the pre-satellite period, when NCEP winter temperatures are consistently warmer and summer temperatures colder. In contrast, the long-term variations in seasonal precipitation (Fig. 1c) are generally underestimated in HiResAFF. However, there is also little agreement between RCAO and NCEP in MAM and JJA. It should be noted here that NCEP provides only very coarse large-scale precipitation data compared to RCAO and HiResAFF.

All three datasets show a very high temporal co-variability for the decadal mean wind speed (Fig. 1f) with lowest seasonal correlation in JJA (cor > 0.6). The long-term decadal changes in the frequency of W–SW (225°–292.5°) and E–NE (22.5°–90°) wind directions per season are shown in Fig. 1d–e. Besides a systematic deviation caused by the model bias between RCAO and NCEP, the W–SW wind directions show high agreement for all seasons with exception of JJA. The E–NE wind directions show very good agreement in SON and DJF. In JJA, the multi-decadal variations agree well although NCEP shows generally higher frequency of the E–NE direction than RCAO with HiResAFF in between.

Seasonal skill similar to that estimated relative to NCEP and RCAO after 1948 can be assumed for the whole period since 1850, as the number of stations used for the reconstruction is fairly constant, especially for the Baltic proper region. Reanalysis data are not necessarily as consistent, and doubts remain about possibly spurious long-term trends in different variables in these data sets (cf. Dessler and Davis 2010 and references therein).

Jones and Moberg (2003) and BACC Author Team (2008) calculate seasonal T2m trends (per 100 years) in the period 1871–2004 for the northern/southern Baltic Sea region as +0.9 K/+1.0 K, +1.5 K/+1.1 K, +0.6 K/+0.3 K, and +0.8 K/+0.6 K for DJF, MAM, JJA, and SON, respectively. The reconstructed trends (per 100 years) of HiResAFF for the Baltic proper in the same period agree in DJF (+0.9 K), JJA (+0.3 K), and SON (+0.7 K) but are underestimated in MAM (+0.6 K). The trend for the full period 1850–2009 in HiResAFF is +1.1 K/100 years in DJF and exceeds those calculated for 1871–2004 owing to the very cold winters prior to 1870. The quality of HiResAFF is further corroborated by the comparison presented below between modeled and observed surface water temperatures.

The annual mean river runoff to the Baltic Sea (Fig. 2a) shows only little variability before 1900. This is probably due to the reconstruction, which is not based on observed runoff in 1850–1900. The beginning and end of the twentieth century were relatively wet periods while particularly dry periods occurred around 1930–1940, 1960–1980, and from 2003 and onwards.

Fig. 2 Time series of annual average total river runoff (Q), nitrogen (N), and phosphorus (P) loads from land and atmosphere to the whole Baltic Sea

Nutrient loads to the Baltic Sea (Fig. 2b, c) reached a peak level at about 1980, and since then decreased again. The decrease is primarily due to a significant reduction in coastal point sources due to sewage treatment, while riverine loads have dropped by less than 10 %. In a longer time perspective, the reconstruction indicates total loads of about 13 and 400 kton year−1 around 1850 for phosphorus and nitrogen compared to about 75 and 1250 kton year−1 around 1980, respectively. Thus, the relative increase in phosphorus loads has been much larger than that for nitrogen and consequently the nitrogen/phosphorus molar ratio in the nutrient input to the Baltic decreased from about 68 to 37 between 1850 and 1980. Improvements in sewage treatment cause the nitrogen/phosphorus ratio to gradually increase again (about 50 in 2006).
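The molar ratios quoted above follow from converting the mass loads to moles. A short check, using standard atomic masses (the only values not taken from the text) and a hypothetical function name:

```python
N_MOLAR_MASS = 14.007  # g mol-1, standard atomic mass of nitrogen
P_MOLAR_MASS = 30.974  # g mol-1, standard atomic mass of phosphorus


def np_molar_ratio(n_load, p_load):
    """Molar N:P ratio from mass loads given in the same unit (e.g., kton per year)."""
    return (n_load / N_MOLAR_MASS) / (p_load / P_MOLAR_MASS)


ratio_1850 = np_molar_ratio(400.0, 13.0)   # loads around 1850 from the text
ratio_1980 = np_molar_ratio(1250.0, 75.0)  # loads around 1980 from the text
```

With the rounded loads given in the text, the ratios evaluate to about 68 and 37, matching the stated values.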

Temporal Development in the Water Column

Salinity and Temperature

The annual average surface salinities and temperature in major basins and deep-water salinity and temperature in the Gotland Sea are drawn in Fig. 3. For comparison, observational data on salinity are also included in the figures and a statistical evaluation of inter-annual salinity and temperature variations is supplied in Electronic Supplementary Material Tables S3.1 and S4.1, S4.2, respectively. The long-term development of surface salinities of the Baltic Sea is highly dependent on river runoff, but also on long-term variations in westerly winds (Meier and Kauker 2003). Thus, salinities do not change much during 1850–1900 when the variability in the reconstructed river runoff is quite low (Fig. 2a). Surface salinities (Fig. 3a, c, e, g) remain on a generally low and stable level in 1900–1930. Thereafter salinities gradually increase in response to drier conditions and a lower frequency of westerly winds in the wintertime (Fig. 1d). Peaks in surface salinities occurred around 1950 and in the mid-1970s. Around 1980, both river runoff (Fig. 2a) and westerly winds increased in strength and frequency (Fig. 1d, f), causing the salinity to start decreasing; since the late 1980s, salinities have returned to a level similar to that at the beginning of the twentieth century. The major changes in modeled annual average surface salinities follow observations rather closely: RMSE values are between 0.15 and 0.35 and correlation coefficients 0.54–0.85 for the major basins inside the sills (Electronic Supplementary Material Table S3.1). At the Baltic entrance, short-term variability dominates over inter-annual variability, making the estimates of annual averaged surface salinities from observations quite uncertain. This causes the scatter seen in Fig. 3a and is reflected in lower performance statistics.

Fig. 3 Annual averages of surface salinity and temperature in Southern Kattegat (a, b), Gulf of Riga (c, d), Gulf of Finland (e, f), Gotland Sea surface (g, h), and 200 m depth (i, j). Lines are simulated and open circles are observed annual averages. Because of few observations, 5-year averaged salinities are computed from data before 1950. Trends in temperatures are shown by dashed lines

In the Gotland Sea deep water (Fig. 3i), several notable stagnation periods occurred. Particularly well reproduced is a stagnation period from the beginning of the 1920s to the beginning of 1930s. However, the model did not catch the record high salinities of the major inflow in December 1951.

There are considerable long-term variations in the annual average water temperatures (Fig. 3b, d, f, h, j). In the surface of all basins there is a significantly warmer period in the 1930s, with water temperatures about 1–1.5 K higher than in the 1850–1920 average. However, this warm anomaly did not propagate into the deep waters. On the contrary, periods of warm deep waters seem to be more or less out of phase with surface warming. Surface temperatures show increasing trends over the period, least in the Gulfs of Finland and Riga with about 0.4 K per 150 years, while the trends in the other basins are around 0.8 K per 150 years. In the Gotland Sea, deep-water temperature increased substantially more than surface temperatures.

Seasonally averaged surface temperatures were validated against seasonally averaged temperatures computed from high frequency observations (1–4 measurements per day) obtained from Swedish light vessels in 1860–1970. The full resulting statistics are presented in Tables S4.1 and S4.2 in Electronic Supplementary Material, both as average for all years with observations and divided into three periods (1860–1915, 1916–1941, and 1946–1970). In general, there is a negative bias (model underestimate) in Kattegat during wintertime, shifting to a positive bias in Baltic proper for all seasons. All biases are invariant between the time periods, indicating that they are inherited from the RCAO simulation. Correlations between modeled and observed inter-annual variability of surface temperature are somewhat lower than the corresponding skill of the T2m reconstruction, especially for wintertime.

Nutrients

Winter surface nutrient concentrations for a selection of basins together with deep-water nutrient and oxygen concentrations from the Gotland Sea are shown in Fig. 4. The long-term dynamics of both surface phosphate and nitrate concentrations are clearly dominated by a somewhat delayed response to the accelerated increases of the loads that began in the 1950s. Surface nitrate concentrations seem to level off after the peak in load passed in the early 1980s in all basins, while phosphate continues to increase, especially in the Gulfs of Riga and Finland. In the latter basins concentrations of both nutrients are also higher than in the Gotland Sea.

Fig. 4 Winter average surface nitrate and phosphate concentrations in southern Kattegat (a, b), Gulf of Riga (c, d), Gulf of Finland (e, f), and Gotland Sea (g, h). Annual average nitrate (i), phosphate (j), ammonia (k), and oxygen (l) concentrations at 200 m depth in Gotland Sea. Lines are modeled averages and open circles are averages made from observations. Oxygen concentrations before 1950 are averaged over 5-year periods because of few available data

As nutrient measurements are lacking for the period prior to 1960 and are relatively unreliable for a decade after that, it is difficult to validate how well the model reproduces the concentrations in the beginning of the reconstruction. A quantitative comparison between modeled and observed surface winter nutrient concentrations for the data available is given in Electronic Supplementary Material Tables S3.2 and S3.3. Notably, average deviations are not larger than for the simulations using best available forcing for contemporary conditions presented in Eilola et al. (2011). Linear correlation between model results and observations varies in the range 0.25–0.65 for nitrate and 0–0.52 for phosphate. However, in cases with low correlation, the standard deviation of the observations is very much higher than the standard deviation of the simulation results, indicating that there are too few observations to estimate the winter mean concentrations accurately. Deep-water ammonia concentrations are at similar levels as observations during anoxic periods (Fig. 4k), while deep-water phosphate is severely underestimated (Fig. 4j). This deficiency is a recognized weakness of BALTSEM, and of at least most other Baltic Sea models, which is elaborated on by, e.g., Eilola et al. (2011), and is not an effect of the forcing applied in the present model set-up. Consequences for the surface and total pools of phosphate are limited, because the underestimation is confined to the very deepest part. Simulated deep-water oxygen concentrations in the Gotland Sea are close to observations since the 1960s, but may be slightly overestimated in the first half of the twentieth century as indicated by a few measurements of the dissolved oxygen available from those times (Fig. 4l).

Development of Eutrophication Indicators

The decadal average bottom water oxygen distribution and summer (July–August) phytoplankton biomass in the surface layer are drawn in Fig. 5 for four periods. It is evident that oxygen starts to decline already between 1900 and 1950, not only in the Gotland deep shown in Fig. 4l, but also throughout the Baltic proper including Bornholm and Arkona basins, as well as in the southern Kattegat and the Danish Straits. The deterioration of oxygen conditions below the halocline continues throughout the 1980s into the 2000s and the largest extension of low oxygen zones occurred in the most recent period. This is not only the case for the well-known hypoxia of the Baltic proper and the Gulf of Finland, but areas of low average bottom oxygen concentrations also increased in the Kattegat and the Danish Straits, as well as in the Bothnian Sea.

Fig. 5 Decadal averaged bottom oxygen concentrations (top row) and July–August phytoplankton biomass in the upper 10 m of the water column (bottom row). The maps are produced by projecting the vertical oxygen profiles on a topographic grid (Seifert and Kayser 1995)

Up to the 1950s, the increase in summer phytoplankton biomass is largest in the Danish Straits, and especially in the Öresund. Phytoplankton biomass in this small basin culminates around 1980 and thereafter declines significantly, which is corroborated by observations of nutrients and phytoplankton biomass (ICES 2010). This is in contrast to the major Baltic basins where increases in summer phytoplankton biomasses occur later and do not culminate until the end of the simulation. Typically, summer phytoplankton biomass increases by a factor of 4–6 from 1900 to the maximum decadal average.

Changes in Nutrient Turnover

Feedbacks in the biogeochemical cycles cause a remarkable response to the load changes (Fig. 6). From 1850 to the peak around 1980, nitrogen and phosphorus loads increased by factors of about 3 and 5, respectively. In response to this, nitrogen and phosphorus uptake by phytoplankton increased 6.5 and 6 times, respectively, but with a significant delay. The uptake equilibrates, or at least increases significantly slower, approximately a decade after the peak in loads. The sediment nitrogen and phosphorus pools (and consequently burial, which is proportional to the pool) equilibrate at approximately the same time as uptake. Equilibration of the sediment pools indicates that the system approaches a balance between supply and loss.

Fig. 6 Nitrogen (a) and phosphorus (b) fluxes and pools integrated over the whole Baltic Sea and smoothed by decadal (11-year) running mean. The fluxes and pools are normalized with the 1850–1860 average
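The normalization and smoothing described in the caption can be sketched as below. Whether the reference period is inclusive of both end years and how the running mean is treated at the ends of the series are not stated, so both choices here are assumptions, as are the function names and the toy series.

```python
def normalize(years, values, ref_start=1850, ref_end=1860):
    """Divide a series by its mean over the reference period (inclusive, assumed)."""
    reference = [v for y, v in zip(years, values) if ref_start <= y <= ref_end]
    baseline = sum(reference) / len(reference)
    return [v / baseline for v in values]


def running_mean(series, window=11):
    """Centered running mean; the window shrinks near the ends (assumed edge rule)."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical annual flux that doubles after 1860.
years = list(range(1850, 1871))
flux = [2.0] * 11 + [4.0] * 10
relative_flux = running_mean(normalize(years, flux))
```

Because the normalization is a division by a constant, it can be applied before or after the smoothing with the same result.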

The pelagic nitrogen pool stopped increasing already in the 1960s, and it is clear that the huge rise in primary production afterwards must mostly be sustained by intensified internal cycling and nitrogen fixation. This is to a lesser extent also the case for the pelagic phosphorus pool that started to decline a few years before the primary production leveled off. The model results show a minimum in nitrogen fixation in the late 1960s to early 1970s, but thereafter nitrogen fixation increases continuously to the end of the simulation.

The nutrient cycles for a selection of decades are summarized in Fig. 7. Between 1850 and 2006, all internal fluxes of nitrogen and phosphorus increased, in particular the nutrient uptake by phytoplankton, which is paralleled by intensified pelagic nutrient regeneration. However, the major part of this increase took place only after the 1950s. Around 1900, phosphorus and nitrogen uptake were about 10 and 3.5 times larger than the load, but around 2000 these factors had increased to 23 and 8.5. A closer look at the different time periods shows that this is a recent phenomenon: around 1980, the uptake of nutrients still had the same proportionality to the loads as around 1900. Thus, as loads decreased after 1980, production continued to grow, which increased the ratios of internal fluxes to nutrient inputs.
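The change in the uptake-to-load ratios can be made explicit with a short calculation. The sketch below uses only the approximate ratios quoted in the text; the dictionary layout and the function name are illustrative choices, and the results inherit the rounding of the quoted values.

```python
# Approximate uptake-to-load ratios quoted in the text (uptake / load).
ratios = {
    "phosphorus": {"1900": 10.0, "2000": 23.0},
    "nitrogen": {"1900": 3.5, "2000": 8.5},
}


def ratio_growth(r):
    """Factor by which the uptake-to-load ratio grew from ~1900 to ~2000."""
    return r["2000"] / r["1900"]


for nutrient, r in ratios.items():
    print(f"{nutrient}: ratio grew by a factor of {ratio_growth(r):.1f}")
```

Under these numbers, both ratios grew by a factor of roughly 2.3 to 2.4, and since the ratios around 1980 were still similar to those around 1900, essentially all of this growth falls in the period after the loads peaked.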

Fig. 7

Decadal average cycles of nitrogen (left panel) and phosphorus (right panel). I import from and E export to the Skagerrak, L loads from land and the atmosphere, N nitrogen fixation, S sedimentation, O flux from the sediments, D denitrification, B burial, U uptake by primary production, R pelagic nutrient regeneration. P and T are the pools of nutrients in the pelagic and sediment, respectively. All units are in kton year−1 except for the pools, which are in kton

After 1980, external nutrient loads started to decrease, but the phosphorus and nitrogen outflux from the sediments continued to increase and compensated for the load reduction. The last decade of the simulation was also characterized by a marked increase in nitrogen fixation, which by itself exceeded the decline in nitrogen loads. Therefore, despite decreasing loads, the nutrient supply to the pelagic ecosystem continued to grow.

Sediment pools of nitrogen and phosphorus increased throughout the simulation, reaching a new equilibrium around the year 2000 (Fig. 6). The sediment nitrogen pool is about 10 times the annual load, except for the period around 1980 when it was somewhat smaller, about 6.5 times the load. Thus, the pool of nitrogen seems to adjust to the loads rather rapidly. The phosphorus pool, on the other hand, held about 40 times the annual load in 1900, 27 times by 1950, and slightly less than 20 times by 1980. Due to the recent decrease in loads and the continued increase in the sediment pool, the sediment phosphorus pool was approximately 35 times the load by 2000. The slow reaction to phosphorus load changes can also be seen as an imbalance between load and burial (net export through advection is fairly small).

Discussion

Climate Variations

The reconstruction of the long-term variability in T2m clearly shows the large change the Baltic Sea has gone through between 1850 and 2009. Periods with significant negative anomalies occurred in the 1850–1880s and 1980s, but the two most recent decades show the highest decadal temperatures. From 1850 to the middle of the 1930s, decadal mean temperatures in autumn were more than 0.5 K colder than the 1958–2007 average. The abrupt increase of 1 K at the end of the twentieth century is unprecedented in the record. The trends in sea surface temperature in the basins examined range from 0.4 to 0.8 K during the 157 years, i.e., 0.25–0.5 K/100 years (Fig. 3). This is significantly lower than the range of estimated change for air temperature. A closer examination of the seasonal sea surface temperatures in the Baltic proper indicates that the DJF temperatures increased on average by 0.46 K/100 years (not shown), compared with the T2m increase of 1.1 K/100 years. Part of the reason is the minimum temperature set by the freezing point of sea water, but the high temperatures of recent decades are also substantially lower in the surface water than in T2m. In fact, surface water temperatures in recent decades do not seem to be much higher than during the warm period in the 1930s (Fig. 3). The discrepancies between water and air temperature change clearly demonstrate that concurrent changes in other forcing variables, e.g., wind speed, are also important for the development in the water.
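The conversion from a total change over the simulation period to a centennial trend is simple arithmetic and can be checked as follows. The function name is illustrative, and the calculation assumes a linear trend over the full 157 years.

```python
def trend_per_century(total_change_k, n_years):
    """Convert a total temperature change (K) over n_years to K per 100 years."""
    return total_change_k / n_years * 100.0


# Sea surface temperature change over the 157-year period, as quoted in the text.
low = trend_per_century(0.4, 157)
high = trend_per_century(0.8, 157)
print(f"SST trend range: {low:.2f} to {high:.2f} K/100 years")

# Ratio of the winter (DJF) sea surface trend to the air temperature (T2m) trend.
djf_ratio = 0.46 / 1.1
print(f"DJF SST trend is about {djf_ratio:.0%} of the T2m trend")
```

The result agrees with the quoted range of 0.25–0.5 K/100 years after rounding, and it shows that the winter surface water warmed at well under half the rate of the winter air temperature.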

Wind speeds in combination with changes in wind direction are important for sea-level variations and inflow events (Gustafsson and Andersson 2001 and references therein; Meier and Kauker 2003). The period since the beginning of the 1980s with unprecedentedly few inflow events coincides with exceptionally strong winter winds with a high frequency of W–SW directions. Although the reconstructed forcing is not perfect, the BALTSEM results indicate that the simulation captures most of the inflow events (see e.g., variability in deep-water salinity and oxygen in Figs. 3g, 4l). However, a further evaluation regarding short-term events is beyond the scope of the present paper.

Hydrographic Variations

The model simulation showed modest long-term changes in salinity (Fig. 3), with the most prominent feature being the saline period of 1940–1990. Meier and Kauker (2003) obtained a more pronounced saline period around 1920 than the present simulation. Looking in detail at the salinity in the Gotland Sea (Fig. 3g), which dominates the average salinity, the present simulation does indeed somewhat underestimate salinities around 1925, followed by an overestimation in the late 1930s. For many years in the period 1950–1970, surface salinities also seem to be underestimated by the model. However, it has to be remembered that we compare a large horizontally homogeneous basin, extending from the Gdansk Deep to the northern Baltic proper, with measurements from a single station in the Gotland Deep. A comparison between model results and observations in the Bornholm basin (not shown) does not show the same bias.

The decoupling of surface and deep-water temperatures became quite obvious in the simulations. Naturally, the deep-water temperature depends on both the timing of the inflow and the variations of surface temperatures in the source region for inflowing water. Not all of the simulated inflow events compare well with the temperature observations at the monitoring station in the Gotland Deep, but the general tendency towards warmer deep waters is probably correct.

Oxygen Concentrations

Simulated oxygen concentrations in the deep basins seem to be overestimated for the first half of the twentieth century (Fig. 4l). Since Winkler titration methods have not changed, the early measurements should be fairly comparable to more recent data, indicating a true mismatch between model and data. With the exception of river runoff, the physical forcing of the circulation model is of consistent quality for the whole simulation period, and oxygen concentrations are reproduced well for the latter part of the simulation, implying plausible simulation of oxygen transports. Therefore, it is tempting to assume that biogeochemical oxygen consumption is underestimated in the early part of the simulation, either due to underestimation of reconstructed loads, simulated primary production, and sedimentation, or due to possible shortcomings in the parameterization of deep-water oxygen consumption in a more oligotrophic Baltic Sea. However, higher loads in the early part of the simulation would, according to sensitivity experiments (not shown), inevitably lead to excessively high nutrient concentrations by the end of the simulation because too many nutrients would accumulate in the sediments.

Nutrient Cascade in the Pelagic System

An interesting result of the model simulation is the increase in pelagic nutrient recycling in the last decade of the simulation (Fig. 7). The reason for this is the high phosphorus availability in the system due to the large sediment outflux and an increasing mismatch between external nitrogen and phosphorus sources. Up to the mid-1980s, the winter molar DIN/DIP ratio in the Gotland Sea was about 8 or even higher, but it then declined rapidly to 5–6. Thus, as nitrate concentrations limit the spring bloom in the Baltic proper, more phosphorus is available for summer production, including nitrogen fixation. During summer, pelagic nutrient recycling is considerably higher than during the spring bloom, due to the combined effect of lower sinking rates of the smaller summer species and larger heterotrophic nutrient excretion and detritus mineralization at high temperature.

A reasonable cause for the rapid change in the DIN/DIP ratio could be the increase in oxygen deficiency in the bottom waters. Increased oxygen deficiency has opposing effects: the pelagic nitrogen pool decreases due to denitrification, while the phosphorus pool increases due to outflux from the sediments. The sensitivity of the model was therefore explored with an experiment in which oxygen concentrations were artificially prevented from dropping below 5 mL L−1 throughout the simulation (not shown). In this experiment the DIN/DIP ratio was indeed more constant in the last decade, primarily because winter DIN concentrations were 50 % higher at the surface of the Baltic proper, but also because winter DIP concentrations were somewhat lower. This experiment also indicates that the slightly overestimated oxygen concentrations in the late 1960s to early 1970s (Fig. 4l) might have contributed to the winter surface DIN being somewhat overestimated in that period (Fig. 4g).

Time Scales of Eutrophication

This first realistic long-term time-dependent simulation of the Baltic Sea shows the remarkably long time scales of the biogeochemical cycles. The continued deterioration in water quality of the major Baltic Sea basins in recent years is clearly caused by the delayed response of, primarily, the phosphorus cycles. Coincidentally, the load reductions in recent years caused the phosphorus cycles to reach equilibrium with the loads in the last year of the simulation (2006), that is, when burial plus export equals loads (about 36 kton year−1 including atmospheric loads). This is, however, bad news for those who hoped that the massive reduction of phosphorus loads since the peak would be sufficient to improve water quality in the Baltic Sea. Instead, our simulation suggests that productivity will remain at a high level without further load reductions.

The total phosphorus loads in 2006 are almost identical to the phosphorus loads in 1956, i.e., for the past 50 years the phosphorus loads have been higher than today. The fact that the phosphorus cycles have not reached equilibrium at higher loads clearly demonstrates the exceptionally long time scales of phosphorus cycling in the Baltic. It also shows that returning to loads of the 1950s will not bring back the water quality of the 1950s.
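The lag described here can be illustrated with a deliberately simple one-box model. This sketch is not BALTSEM and is not taken from the paper: it assumes a single pool P with first-order loss, dP/dt = L(t) − P/τ, a hypothetical triangular load history in arbitrary units, and a time scale τ = 30 years chosen only because it lies within the range of pool-to-load ratios quoted above (roughly 20 to 40).

```python
def load(t, base=1.0, peak=3.0, t_peak=50.0, t_end=100.0):
    """Hypothetical triangular load history (arbitrary units per year)."""
    if t <= t_peak:
        return base + (peak - base) * t / t_peak
    return peak - (peak - base) * (t - t_peak) / (t_end - t_peak)


def simulate_pool(tau=30.0, n_years=100):
    """Forward-Euler integration of dP/dt = load(t) - P / tau, one-year steps."""
    pool = [tau * load(0.0)]  # start in equilibrium with the initial load
    for year in range(n_years):
        p = pool[-1]
        pool.append(p + load(float(year)) - p / tau)
    return pool


pool = simulate_pool()

# Loads are identical at years 25 and 75 (rising and falling limb),
# but the pool is much larger on the falling limb.
print(f"load(25) = {load(25.0):.1f}, load(75) = {load(75.0):.1f}")
print(f"pool at year 25 < pool at year 75: {pool[25] < pool[75]}")
```

In this toy setting the pool lags below its equilibrium value while loads rise and stays above it long after loads have fallen, so the same load corresponds to a much larger pool on the way down than on the way up. The pool briefly matches the equilibrium for the current load when the falling load "catches up" with it, which is qualitatively the kind of coincidental balance described for 2006.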

Conclusions

The reconstructed forcing fields of HiResAFF realistically replicate long-term seasonal changes in atmospheric circulation, wind statistics, precipitation, and near-surface temperatures (Fig. 1). The very good agreement of modeled and observed water temperatures gives high confidence in the reconstruction and the model simulation. Clearly decreased frequencies in E-NE wind directions in winter and spring in recent decades coincide with a period of exceptionally few inflow events, and BALTSEM results reproduce this anomaly. Starting from the cold conditions in 1850–1870, the unprecedented high decadal mean seasonal temperatures in all seasons in past decades highlight the large thermal transition the Baltic Sea has gone through in the last 160 years.

The good description of long- and short-term salinity and temperature variations in the Baltic Sea basins shows that river runoff (at least after about 1900) and sea-level reconstructions, together with the high-resolution atmospheric forcing (HiResAFF), are of sufficient quality to drive the physical processes in BALTSEM.

The BALTSEM simulation plausibly describes the evolution of nutrient concentrations in the early part of the simulation and reproduces very well the development in recent years. The simulated oxygen concentrations are probably slightly higher than indicated by measurements for the first half of the twentieth century, which might be due to changes in the sediment biogeochemistry not yet accounted for. Apart from this, the simulation gives a clear indication that the nutrient load reconstructions are reasonable.

It is shown that the time scales of eutrophication are exceptionally long, and that the phosphorus content in the Baltic has only recently come into equilibrium with loads. This has huge implications for management: if this is true, all efforts taken so far to reduce nutrient loads have basically only maintained the status quo, and additional action is needed to improve water quality.