Introduction

Diatoms provide ca. 40% of the global marine primary production, and thus they play a fundamental role in food webs and chemical cycles in aquatic ecosystems [1]. Changes in the diatom composition therefore affect entire ecosystems. The variation in diatom composition and species richness has been the subject of a large body of research in both marine and fresh waters [e.g. 2, 3]. Traditionally, diatoms are thought to respond merely to local environmental variables, such as water chemistry, and in many cases, water chemistry variables, such as nutrients (nitrogen, phosphorus and silicon), pH, salinity and conductivity have been found to be the most important explanatory variables for diatom composition [e.g. 1, 2, 4]. However, the roles of climate and dispersal limitation have recently also been emphasized in explaining diatom distributions [3]. Organisms have distinct regional pools of species, and both climate and local physicochemical factors act as filters that determine the species composition in local communities [3, 5]. The relative importance of climatic factors and local physicochemical factors varies with study scale: climatic factors may override the local physicochemical factors at continental scales, while at regional scales (100–3000 km) diatom communities are influenced by both climatic and local factors [5].

The Baltic Sea is one of the world’s three large brackish water seas. Its surface water salinity ranges from ca. 9 in the South-West to almost zero in the North-East [6]. Such a decrease in salinity is caused by the restricted exchange of water with the North Sea through the narrow Danish Straits. A number of rivers also discharge from the catchment to different parts of the Baltic Sea—mostly to the Gulf of Bothnia, the Gulf of Finland and the Gulf of Riga. The largest annual total river runoff flows into the Gulf of Bothnia and the Gulf of Finland [7], which explains their low salinities. The Gulf of Bothnia and the Gulf of Finland also receive most of the incoming nutrient loads and organic material due to river runoffs. Anthropogenic nutrient inputs result in serious eutrophication including internal loading and hypoxia [8].

The Baltic Sea is a young formation with a low biodiversity [9]. Present conditions have only prevailed for ca. 3000 years, which is too short a period for brackish water communities to develop, and thus only a few freshwater or marine species have been able to adapt to live in the Baltic Sea [10]. Consequently, most important marine animal and algae groups are missing. This situation makes the whole ecosystem sensitive to disturbances as the extinction of a keystone species may affect the entire ecosystem [9].

Although diatoms are highly useful in indicating various environmental changes and are used as standards in freshwater biomonitoring, diatom research has been somewhat scarce in the Baltic Sea. Nevertheless, there have been important studies of benthic diatoms by Leskinen and Hällfors in southernmost Finland [e.g. 11–13], Vilbaste et al. [14, 15] in the Gulf of Riga and the coastal area of Estonia, Sommer in the coastal area of Germany [16], and Snoeijs, Busse and Ulanova in the coastal area of Sweden [e.g. 17, 18]. However, sampling has usually covered only a limited spatial scale, with the exception of the study by Ulanova et al. [19]. Thus, there is a lack of broad-scale studies investigating the effects of climatic, spatial and water physicochemical variables on benthic diatoms in the Baltic Sea.

The aim of this study was to determine the most important climatic, spatial and physicochemical variables that explain variation in diatom species richness and composition in the Baltic Sea, using epilithic (i.e. growing on stones) diatom data collected in the littoral zone of the Gulf of Finland and the Gulf of Bothnia. Sampling covered the entire Finnish coastline, and explanatory variables were represented by climatic (July air temperature and precipitation), spatial (latitude, longitude and coast line exposition) and water physicochemical (total nitrogen, total phosphorus, pH, salinity, dissolved oxygen, silicon and water temperature) variables.

Materials and methods

Study area

The study comprised 37 sampling sites located on the southern and western coasts of Finland, in the Gulf of Finland and the Archipelago Sea (sampling sites 1–13), the Bothnian Sea (sampling sites 14–23) and the Bothnian Bay (sampling sites 24–37). The northernmost sampling site was in Kemi (65° 44′ N, 24° 34′ E), the southernmost in Hanko (59° 49′ N, 22° 58′ E), the westernmost in Kustavi (60° 33′ N, 21° 22′ E) and the easternmost in Virolahti (60° 35′ N, 27° 42′ E) (Fig. 1).

Fig. 1

Map of the study area, showing the 37 sampling sites, including the northernmost (Kemi), westernmost (Kustavi), southernmost (Hanko) and easternmost (Virolahti) sites

The Gulf of Finland is the easternmost part of the Baltic Sea and is bounded by Finland, Estonia and Russia. Unlike several other parts of the Baltic Sea, the western outline of the Gulf of Finland does not form a hydrographic threshold [7]. The Finnish archipelago in the Gulf of Finland is shallow, disintegrated and contains many islands. The Neva River at the eastern end of the Gulf of Finland produces the largest single freshwater inflow into the Baltic Sea [20]. The Bothnian Sea and the Bothnian Bay belong to the Gulf of Bothnia, which is a northern extension of the Baltic Sea. The Gulf of Bothnia is isolated from the rest of the Baltic Sea by an archipelago and a hydrographic threshold, which leads to substantially different environmental features [21]. River discharges lower the salinity in the Gulf of Bothnia (ca. 0 in the North and ca. 6 in the South) and the Gulf of Finland (<2 in the East and ca. 6 in the West) [20, 21] and bring most of the nutrient loads, thus increasing nutrient levels, especially in the littoral zone [22]. Both gulfs suffer from eutrophication and water quality deterioration, but the problem is more severe in the Gulf of Finland.

Biogeographically, the study area belongs to the boreal ecoregion. The climate represents an ecotone between continental and oceanic climate zones [23]. Annual mean temperatures range from 1.6 °C (Kemi) to 6.0 °C (Hanko), and annual precipitation varies between 580 mm and 768 mm [24].

Diatom sampling and laboratory analysis

The diatom samples were collected between the 1st and 15th of July 2013 in sheltered bays to minimize the effect of waves. The distance between sampling sites was ca. 30 km (Fig. 1), but large river mouths with ample nutrient loads were avoided by keeping sampling sites at least 1 km from them. The diatom sampling followed the SFS-EN 13946 standard as adapted by Eloranta et al. [25]. Five stones with a diameter of at least 10 cm were randomly collected from each site from depths of 20–50 cm, and diatoms were sampled with a toothbrush from a 25 cm² surface area of each stone. The samples were preserved with ethanol in the field.

The samples were processed according to the SFS-EN 13946 standard. Organic material was removed from the samples by boiling with hydrogen peroxide (30% H₂O₂). Cleaned diatoms were mounted on slides using Naphrax [25]. To ensure an adequate number of counted valves per sample, two slides were prepared for each sample. Approximately 300 valves per sample were counted and identified to species level (when possible) with a light microscope (magnification 1000×) according to Krammer and Lange-Bertalot [26]. The most recent names of the taxa were verified using AlgaeBase [27]. Counting 300 valves per sample has been documented to be adequate for a reliable diatom community analysis in most environments [28].

Water sampling and laboratory analyses

Water samples (0.5 l) were collected simultaneously with diatom sampling and were used for the analysis of total phosphorus in accordance with standard SFS-EN 1189. Water temperature and conductivity were measured in situ using a Mettler Toledo MX300 probe. Salinity was calculated from conductivity according to the UNESCO formula [29].

In order to include a higher number of potentially important explanatory variables, we also collected environmental data from existing databases. The values of total nitrogen, pH, dissolved oxygen and silicon for sampling sites were interpolated from the materials of the Finnish Environment Institute (2013, raster, spatial resolution 20 m). The coast line exposition was also interpolated from the materials of the Finnish Environment Institute (2004, raster, spatial resolution 20 m). July air temperature and July precipitation were calculated using data from the Finnish Meteorological Institute (1981–2010, grid, spatial resolution 10 m) [24] with the Spatial Analyst tools of ArcGIS 10.2.1, Arcmap-applications [30]. For a list of all the explanatory variables, see Additional file 1.

Data analyses

Intercorrelations among the explanatory variables were assessed using Spearman’s rank correlation coefficient (rs). Six pairs of variables were strongly correlated (rs < −0.7 or > 0.7, p < 0.001) [31] (see Additional file 2 for details), and thus silicon, latitude, and air temperature were excluded from species richness analyses.
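The screening step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names and the five-site example values are hypothetical, and the significance test (p < 0.001) is not reproduced, only the |rs| > 0.7 threshold.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rs is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def strongly_correlated_pairs(variables, threshold=0.7):
    """List variable pairs whose |rs| exceeds the threshold."""
    flagged = []
    for a, b in combinations(sorted(variables), 2):
        rs = spearman(variables[a], variables[b])
        if abs(rs) > threshold:
            flagged.append((a, b, rs))
    return flagged


# Hypothetical values for five sites (not data from the study).
example = {
    "latitude": [59.8, 60.6, 61.5, 63.1, 65.7],
    "air_temperature": [17.4, 17.1, 16.5, 16.0, 15.8],
    "total_phosphorus": [22.0, 35.0, 18.0, 40.0, 25.0],
}
flagged = strongly_correlated_pairs(example)
```

In this made-up example only the latitude and air temperature pair is flagged, which mirrors the kind of redundancy that would lead to one variable of a pair being dropped.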

Correlograms with Moran’s I [32] were constructed to examine the degree of spatial autocorrelation in the environmental variables and diatom species richness. The correlograms show correlation values in the range of [−1, 1], where values near −1 signify strong negative spatial autocorrelation, 0 complete randomness, and values near 1 strong positive spatial autocorrelation [33]. Autocorrelation coefficients were calculated for distance classes with 80 km intervals generating a spatial correlogram with 9 distance classes. The significance of the spatial autocorrelation in the distance classes was tested using the Bonferroni criterion α = 0.05/k, where k is the number of distance classes used [34]. Thus, the corrected level of significance was set at α = 0.05/9 = 0.00556.
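As a rough illustration of one point on such a correlogram, the sketch below computes Moran's I for a single distance class together with the Bonferroni-corrected significance level. It assumes binary weights and planar coordinates in km, and the four-site transect is hypothetical; the ncf package used in the study may differ in its details.

```python
def bonferroni_alpha(alpha, n_tests):
    """Significance level per test after Bonferroni correction."""
    return alpha / n_tests


def morans_i(values, coords, d_min, d_max):
    """Moran's I for site pairs whose distance d satisfies d_min < d <= d_max.

    Uses binary weights: 1 for pairs inside the distance class, 0 otherwise.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0       # sum of w_ij * dev_i * dev_j
    weight_sum = 0.0  # sum of w_ij
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            dist = (dx * dx + dy * dy) ** 0.5
            if d_min < dist <= d_max:
                cross += dev[i] * dev[j]
                weight_sum += 1.0
    if weight_sum == 0:
        raise ValueError("no site pairs in this distance class")
    return (n / weight_sum) * cross / sum(d * d for d in dev)


# Nine distance classes, as in the text.
alpha_corrected = bonferroni_alpha(0.05, 9)

# Hypothetical transect: four sites 80 km apart along a linear gradient.
coords = [(0.0, 0.0), (80.0, 0.0), (160.0, 0.0), (240.0, 0.0)]
values = [1.0, 2.0, 3.0, 4.0]
i_near = morans_i(values, coords, 0.0, 80.0)    # first distance class
i_far = morans_i(values, coords, 80.0, 160.0)   # second distance class
```

For this gradient the first class yields a positive coefficient and the second a negative one, which is the gradient-like pattern a correlogram is meant to reveal.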

Generalized Additive Models (GAM) were used to study the relationship between species richness and explanatory variables [35, 36]. GAMs are likelihood-based regression models that replace the linear function by an additive function [35]. Thus, they uncover nonlinear covariate effects by estimating the shape of a smooth curve directly from the data [37]. GAMs were fitted using the R package mgcv with maximum degrees of smoothing restricted to 3 [36]. The significance of the GAMs was analyzed using F-tests and chi-square tests [38].

Prior to statistical species composition analyses, water quality variables (except pH), climate variables and coast line exposition were log-transformed (log10(x)) to reduce their skewed distributions, and diatom species composition data were Hellinger-transformed [39], because this method produces more precise estimates of the percentage of variation explained by the predictor variables [40].
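The two transformations can be written out as follows. This is a minimal sketch: the function names and example counts are hypothetical. The Hellinger transformation takes the square root of each taxon's relative abundance within a site, so every transformed site vector has unit length.

```python
import math


def hellinger(counts):
    """Hellinger-transform a site-by-species table of counts (list of rows)."""
    transformed = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a site without counted valves cannot be transformed")
        # Square root of the relative abundance of each taxon at the site.
        transformed.append([math.sqrt(c / total) for c in row])
    return transformed


def log10_transform(values):
    """Base-10 logarithm of strictly positive environmental values."""
    return [math.log10(v) for v in values]


# Hypothetical counts for two sites and three taxa (300 valves per site).
counts = [[150, 90, 60], [0, 75, 225]]
h = hellinger(counts)
logged = log10_transform([1.0, 10.0, 100.0])
```

Note that log10(x) is undefined for zero or negative values, so a variable containing zeros would need different handling than shown here.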

As a constrained ordination, Redundancy Analysis (RDA) [41] was used to study the effects of climatic, spatial and water physicochemical variables on the diatom composition. Multicollinearity among the variables was assessed by determining variance inflation factors (VIF) [42], and values of VIF > 10 were considered to represent high collinearity and thus removed from the analysis [43]. Due to high VIF, latitude was omitted from the RDA analysis.
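For orientation, the VIF of a predictor is 1/(1 − R²), where R² comes from regressing that predictor on all the other predictors. The short sketch below (hypothetical helper names) shows what the VIF > 10 cut-off means in terms of R²; it does not perform the regressions themselves.

```python
def vif_from_r2(r_squared):
    """VIF of a predictor, given R² from regressing it on the other predictors."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R² must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def r2_at_vif(vif):
    """R² that corresponds to a given VIF value (inverse of vif_from_r2)."""
    return 1.0 - 1.0 / vif


# A VIF of 10 means the other predictors explain 90% of this predictor's variance.
threshold_r2 = r2_at_vif(10.0)
```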

Furthermore, variation partitioning was conducted to partition the variation in diatom community composition with respect to all climatic (July air temperature and July precipitation), spatial (exposition, latitude and longitude) and water physicochemical variables (total phosphorus, total nitrogen, pH, salinity, oxygen, silicon and water temperature) and their combined effects [44]. Variation partitioning, which is based on the eigenvalues of Redundancy Analysis, enables the determination of the individual and combined effects of local and regional explanatory variables and the proportion of unexplained variation. All statistical analyses were conducted in the R environment [45] using packages mgcv [46], ncf [47], rda [48] and vegan [49].
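The arithmetic behind a three-group partition can be illustrated as follows. The sketch assumes that the R² (or adjusted R²) of the seven possible models is already available, one for each non-empty combination of the climatic (C), spatial (S) and water physicochemical (W) groups. The function name, the dictionary keys and the numbers are hypothetical and are not results from the study.

```python
def partition_three(r2):
    """Split explained variation among three variable groups C, S and W.

    r2 maps "C", "S", "W", "CS", "CW", "SW" and "CSW" to the R² of the
    model that uses exactly those groups as predictors.
    """
    c, s, w = r2["C"], r2["S"], r2["W"]
    cs, cw, sw, csw = r2["CS"], r2["CW"], r2["SW"], r2["CSW"]
    return {
        # Individual effects: what is lost when one group is left out.
        "C only": csw - sw,
        "S only": csw - cw,
        "W only": csw - cs,
        # Effects shared by exactly two groups.
        "C and S": cw + sw - w - csw,
        "C and W": cs + sw - s - csw,
        "S and W": cs + cw - c - csw,
        # Effect shared by all three groups (inclusion-exclusion).
        "C, S and W": c + s + w - cs - cw - sw + csw,
        "unexplained": 1.0 - csw,
    }


# Hypothetical R² values for the seven models (not results from the study).
r2_example = {
    "C": 0.17, "S": 0.22, "W": 0.17,
    "CS": 0.25, "CW": 0.23, "SW": 0.24,
    "CSW": 0.26,
}
fractions = partition_three(r2_example)
```

The eight fractions always sum to 1, and shared fractions can come out slightly negative in practice when adjusted R² values are used.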

We then conducted Mantel tests to examine community turnover along geographical gradients [50]. The Mantel test is a method for modeling pairwise community dissimilarities as a function of pairwise spatial distances. Overall, the Mantel statistic r is a correlation between two dissimilarity or distance matrices.
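A permutation-based Mantel test can be sketched as below. This is an assumption-laden illustration rather than the implementation used in the study: the function names, the one-sided test, the number of permutations and the five-site example are all hypothetical choices.

```python
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """Return (r, p) for a one-sided Mantel test of positive association."""
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    r_obs = pearson(flat_a, upper_triangle(dist_b))
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        # Permute the sites (rows and columns together) of the second matrix.
        rng.shuffle(order)
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        # Small tolerance so floating-point noise does not hide exact ties.
        if pearson(flat_a, upper_triangle(permuted)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical example: five sites on a line, 30 km apart, with community
# dissimilarity increasing in proportion to geographical distance.
positions = [0.0, 30.0, 60.0, 90.0, 120.0]
geo = [[abs(a - b) for b in positions] for a in positions]
community = [[d / 120.0 for d in row] for row in geo]
r_stat, p_value = mantel(geo, community)
```

Permuting whole sites rather than individual matrix entries keeps the dependence among the pairwise distances intact, which is why an ordinary correlation test is not used here.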

Results

Spatial autocorrelation in explanatory variables

Moran’s correlograms for water temperature, pH, salinity, silicon and air temperature were significant in several distance classes according to the Bonferroni corrected level of significance (p < 0.00556) (Fig. 2). Explanatory variables typically showed gradient-like structure, where correlation values were positive at short distances, but negative at longer distances. Importantly, no significant spatial autocorrelation was observed in the species richness data.

Fig. 2

Spatial autocorrelation of environmental variables and species richness based on Moran’s I. Filled squares indicate autocorrelation coefficients, which remain significant after Bonferroni correction at p = 0.05/9 = 0.0056

Characteristics of diatom assemblages

The number of species at a site varied between 16 and 54, and the mean richness was 34.0. A total of 230 diatom taxa belonging to 65 genera were identified. The genera with the highest number of species in the data were Nitzschia (35), Navicula (30) and Achnanthes (14). The most abundant species in the data were Diatoma tenuis, Nitzschia frustulum and Fragilariforma virescens var. subsalina. In addition, we observed 18 common taxa that represented at least 1.0% of the identified diatom cells, and whose average abundance in sampling sites also exceeded 1.0% (Table 1; see Additional file 3: Supplementary material for the entire list of diatom species).

Table 1 Diatom taxa for which average abundance in sampling sites exceeded 1.0%. Mean and range of the species’ relative abundance (%) and the percentage of sampling sites are also shown

Some of the species were common (>1.0% of cells) only in a particular region of the study area. For example, Navicula incertata was abundant only in the Gulf of Finland and the Archipelago Sea (sampling sites 2–13), while Bacillaria paxillifera (sampling sites 1–20), Diatoma monoliformis (1–25), Fragilaria fasciculata (2–21), Navicula margalithii (1–19), Nitzschia frustulum (1–24), Nitzschia inconspicua (3–26) and Nitzschia liebetruthii (2–23) were abundant in all areas except the Bothnian Bay. Encyonema silesiacum (28–37) and Nitzschia fonticola (24–37) were abundant only in the Bothnian Bay.

Species richness

Our best GAM explained 35.2% of the total variation in species richness and indicated that diatom species richness had a statistically significant U-shaped relationship with pH (p < 0.01) and a statistically significant positive relationship with total phosphorus (p < 0.01) and total nitrogen (p < 0.05) (Table 2; Fig. 3). All other variables had non-significant relationships with species richness.

Table 2 Results of the best approximating Generalized Additive Model for explaining variation in diatom species richness
Fig. 3

Results of the best Generalized Additive Model explaining variation in diatom species richness. The dashed lines are the 95% confidence intervals. On the y-axis, penalized regression splines (s) of each variable are shown with their respective degrees of freedom

Redundancy analysis and variation partitioning

According to RDA, climatic, spatial and water physicochemical variables explained 43.3% of the variation in diatom community composition. The first RDA axis explained 17.7% of the variation and was mainly related to the climatic variable air temperature and to the water physicochemical variables water temperature, silicon, total phosphorus, salinity and pH. This axis separated sampling sites in the Gulf of Finland and the Archipelago Sea (dominant species Rhoicosphenia abbreviata, Nitzschia frustulum and Nitzschia liebetruthii) on the left side of the diagram from the sampling sites in the Bothnian Bay (dominant species Diatoma tenuis and different species of Fragilaria) on the right side of the diagram (Fig. 4). The sampling sites in the Bothnian Sea were located in the central part of the diagram. The second axis explained 7.5% of the variation and was mostly related to dissolved oxygen. No clear division among the study regions was observed on the second axis.

Fig. 4

Redundancy Analysis ordination diagram showing the relative contributions of explanatory variables as vectors and sampling sites as symbols. Sampling sites were classified into three groups depending on their geographic location

In variation partitioning of community composition, the combined effect of all explanatory variables (8.8%) and the combined effect of spatial and water physicochemical variables (5.0%) accounted for most of the explained variation (Fig. 5). The individual effects of climatic, spatial and water physicochemical variables, as well as the combined effect of climatic and water physicochemical variables and the combined effect of climatic and spatial variables, were low.

Fig. 5

Variation partitioning of diatom communities using water physicochemical variables, climate and geographic location as explanatory variables. a–c show the individual effects of variables; d–g show the combined effects of variables; h shows the amount of explained variation

The Mantel test indicated that the community dissimilarity of diatoms was significantly correlated with geographical distance (see Additional file 4 for details).

Discussion

Our analyses showed that diatom species richness was related to three water chemistry variables: pH, total phosphorus and total nitrogen. The importance of pH on the diatom species richness in the Baltic Sea is in accordance with several earlier studies [e.g. 51]. However, our result of a U-shaped relationship between species richness and pH disagrees with some other studies, which have shown a unimodal relationship [e.g. 52]. We speculate that our different result may be due to some unmeasured environmental variables, such as trace metals or organic solvents. We also expect that covering a wider pH range would have further emphasized the importance of pH in regulating the species richness.

Previous studies investigating the effect of nutrients on diatom and phytoplankton species richness have reached contradictory results and concluded that the role of nutrients varies between regions [53]. Marine environments are usually considered nitrogen limited [54], while freshwaters are often thought of as phosphorus limited [55]. Previous studies have found differences in nutrient limitation between the Baltic Sea basins: the Bothnian Bay in the North seems to be phosphorus limited throughout the year, and the Gulf of Finland seems to be nitrogen limited, with the exception of some high nitrogen peaks in autumn. The nutrient limitation of the Bothnian Sea varies within the year [56] (see also [57] about joint limitation by N and P). Thus, we assumed that both total phosphorus and total nitrogen would be among the most important explanatory variables for diatom richness in our study [e.g. 11, 17]. However, we found that the effect of phosphorus on richness was slightly stronger than the nitrogen effect, which agrees with the findings of Blomqvist et al. [58]. This finding may reflect the fact that salinity in our study area was closer to freshwater than marine. Some earlier studies have observed a unimodal relationship between nutrients and diatom species richness, stemming from P limitation in low nutrient levels leading to increased P competition, and high pH in high P values [53, 59]. The nutrient levels in our sampling sites remained moderate, thus leading to a positive, although leveling-off, relationship between total phosphorus and species richness, and between total nitrogen and species richness.

RDA showed that salinity was one of the key variables affecting diatom composition on the first axis, although its effect was somewhat weaker than that of air and water temperature, silicon and total phosphorus. The salinity gradient in our study area was 0.1–6.1, thus making the RDA result consistent with previous studies. Busse and Snoeijs [17], Ulanova and Snoeijs [18], and Ulanova et al. [19] studied the Swedish coast (salinity gradient approx. 0.4–11.4) and emphasized the role of salinity as one of the most important explanatory variables for diatoms. Clarke et al. [60] studied the Danish Straits (salinity gradient 2.7–31.1) and noted a similar strong impact of salinity on diatoms. In contrast, Weckström and Juggins [59] (salinity gradient 0.7–6.4), and Leskinen and Hällfors [12] (salinity ca. 6) studied the Gulf of Finland and reported salinity to have only a minor effect on diatom communities. Taken together, these studies suggest that both the extent of the salinity gradient and the study area affect the importance of salinity for diatoms.

RDA also showed that water and air temperatures were important variables for diatom compositions. Temperature may affect diatoms directly by influencing metabolism, growth and reproduction [61], or indirectly by changing the physical, chemical and biological characteristics of the water. Therefore, some earlier studies have noted air and water temperatures to be influential to diatoms [e.g. 62]. We also showed silicon to exert a strong effect on littoral diatom composition, which is inconsistent with several previous studies suggesting that silicon is a limiting nutrient only in the pelagic areas [12, 19].

Due to these driving variables, RDA clearly separated the southern and northern study areas based on their diatom communities. The southern sampling sites in the Gulf of Finland and the Archipelago Sea were characterized by high temperatures, salinity, pH and concentrations of total phosphorus, but low concentrations of silicon. Diatom samples collected from the southern area were dominated by, for example, Navicula incertata, which is a brackish water species tolerating somewhat high salinities, and Nitzschia frustulum, Nitzschia liebetruthii and Diatoma monoliformis, which thrive in eutrophic waters and indicate high nutrient levels. Samples from the Bothnian Bay in the North were dominated by the freshwater species Encyonema silesiacum and Nitzschia fonticola. Such regional segregation of species is typical for diatom studies [e.g. 63] due to climatic variation, restricted dispersal and regional differences in water chemistry.

According to the traditional view, unicellular organisms are ubiquitous and their community compositions are only regulated by local water chemistry variables [e.g. 64]. During the past decades, however, the traditional view has been questioned, and particularly the seminal approach of variation partitioning by Borcard et al. [44] has given rise to research concerning the role of local and regional variables affecting diatom community composition [e.g. 63, 65]. In particular, benthic algae have been documented to exhibit spatially more restricted distributions than previously believed, possibly because they are less exposed to wind and currents than planktonic algae [66]. In our variation partitioning, only 0.6% of the variation in community composition was explained solely by local water physicochemical variables, whereas spatial variables alone explained 1.7% of the variation and the combined effect of all explanatory variables was 8.8%. Such a spatial component was also evident in the Mantel test, in which community dissimilarity was correlated with geographical distance. This may reflect either dispersal limitation or spatially structured environmental variation. For example, Heino et al. [67] and Potapova and Charles [65] also reported that spatial variables had a stronger influence on communities than water chemistry variables in boreal and North American streams, respectively. The importance of spatial variables for diatom community composition has been criticized, however, because the spatial effect may also include the effects of any spatially structured, unmeasured environmental variables [67]. In our study, however, we included water chemistry variables that have previously been documented to be the most important drivers of diatom communities (e.g. pH, salinity, total nitrogen, total phosphorus) [e.g. 18]. Therefore, we assume that the importance of regional variables is not solely due to unmeasured water chemistry variables. Climatic variables alone explained 1.8% of the variation in our data, i.e. slightly more than spatial variables. Previous studies taking into account the effect of climate on diatom composition are rare, but they have suggested that climate may be influential for diatoms [68]. Our study supports the view that future diatom studies should consider climatic variables.

Our variation partitioning showed that 81.5% of variation remained unexplained. Such a high value is typical for corresponding studies [e.g. 67], and it may be due to the fact that we missed some abiotic (e.g. calcium and magnesium concentrations) or biotic (e.g. grazing pressure) factors that may have been influential to the community composition. While the individual effects of variable groups remained admittedly low here, the combined effects proved to be important. This result was to be expected, as spatial and climatic variables are strongly related, and both also affect water chemistry. In general, this shows that most of the variables act in concert to influence diatoms and any individual effect of a single variable is not easily disentangled.

Conclusions

The usefulness of diatoms in indicating the local environment has been recognized, and diatoms are increasingly utilized in monitoring the state of aquatic ecosystems. The Baltic Sea is a unique and vulnerable ecosystem with low salinity and biodiversity, and thus demands continuous research to support conservation efforts and monitoring programs. However, previous research on large-scale distribution patterns of diatoms in the Baltic Sea is scarce. Here, we showed that the most important variables affecting diatom species richness were pH, total phosphorus and total nitrogen. The most important variable groups affecting diatom composition were climatic and spatial variables, whereas the effect of water physicochemical variables was surprisingly weak. The combined effects of climatic, spatial and water physicochemical variables were, however, stronger. Therefore, we conclude that the explanatory variables affecting diatom species richness and composition are diverse, and that understanding the distribution of diatoms requires the inclusion of not only local water physicochemical variables, but also regional explanatory variables such as climatic and spatial variables.