Environmental determinants of lake macrophyte communities in Baikal Siberia

We investigated whether environmental filtering or dispersal-related factors mostly drive helophyte and hydrophyte species richness and community composition in 93 lakes situated in Baikal Siberia. Using partial linear regression and partial redundancy analysis, we asked (1) what the relative roles of environmental variables, dispersal variables, spatial processes and region identity (i.e., river basins) are in explaining variation in the species richness and species composition of helophytes and hydrophytes across 93 Siberian lakes, and (2) how the most important explanatory variables driving community variation differ between helophytes and hydrophytes. We found that, for both species richness and species composition, environmental variables clearly explained most variation for both plant groups, followed by region identity and dispersal-related variables. Spatial variables were significant only for the species composition of hydrophytes. Nutrient-salinity index, a proxy for habitat trophic-salinity status, was by far the most significant environmental determinant of helophytes and hydrophytes. Our results indicate that environmental factors explained the most variation in both species richness and species composition of helophytes and hydrophytes. Nevertheless, dispersal-related variables (i.e. spatial and dispersal) were also influential but less important than environmental factors. Furthermore, the dispersal-related variables were more important for hydrophytes than for helophytes. Most brackish permanent lakes were located in the steppe biomes of southern Transbaikalia. This characteristic, along with the oldest age, the largest distances to both rivers and settlements and the lowest temperatures in the study region, distinguished them from freshwater, drained and more nutrient-rich floodplain lakes.


Introduction
Understanding the environmental variables that structure biological communities is important for both fundamental and applied ecological research (Jackson et al. 2001;Ricklefs 2004). The need for accurate information on species richness-environment and species composition-environment relationships has further increased in the face of global change (McGill et al. 2015). It is highly important to better understand how different environmental variables drive species distributions and ecological communities and how changes in these variables further affect biota. Although species richness-environment and species composition-environment relationships have been intensively studied during the past decades (e.g., Low-Décarie et al. 2014), there are still gaps in our knowledge of patterns in species composition and species richness. One such gap is the profound geographical bias in the published investigations.
The majority of ecological studies published in English have so far been conducted in Western Europe, the Americas, Australia and East Asia, whereas much less is known about how the environment drives biological communities in Africa, as well as in Northern and Central Asia (Vilmi et al. 2017;Alahuhta et al. 2019;Nuñez et al. 2019). The importance of conducting ecological research in remote regions is further highlighted by context-dependency, which is the tendency for the same biological group to respond differently to environmental gradients among study regions (Heino et al. 2012;Duncan et al. 2015). Context-dependency has been found to be especially strong in freshwater ecosystems (Alahuhta and Heino 2013;Grönroos et al. 2013;Tonkin et al. 2016).
In addition to the context-dependency in the species richness-environment and species composition-environment relationships, different ecosystems and biological groups have been studied with varying intensity. Most of the research effort in studying how biological communities respond to the environment has traditionally been made in the terrestrial realm, but considerable progress has been made in freshwater ecosystems in the last two decades (Field et al. 2009;Alahuhta et al. 2019). Increased interest in freshwaters is highly important because these ecosystems cover only 0.01% of Earth's total surface area but support 10% of global biodiversity, and global change is particularly threatening to these small-scale biodiversity hotspots (Dudgeon et al. 2006;Vilmi et al. 2017;Harrison et al. 2018). However, studies investigating species richness-environment and species composition-environment relationships in freshwaters have mostly focused on relatively well-known and widely examined biological groups, such as diatoms (Verleyen et al. 2009), macroinvertebrates (Grönroos et al. 2013) and fish (Jackson et al. 2001). Yet, the patterns found in these biological groups may not be applicable to less-studied organism groups, such as aquatic macrophytes, which, for example, have a passive dispersal mode, are strongly dependent on water quality, and have unique responses to carbon limitation and extreme temperatures (Santamaría 2002;Lacoul and Freedman 2006). This limited taxonomic coverage also hinders our ability to generalize whether the same or similar environmental gradients are responsible for detected ecological patterns among biological groups. Thus, it is essential to study how environmental factors contribute to variation in biological communities in different ecosystems and using different organism groups as model groups.
One poorly-studied biological group is aquatic macrophytes occurring in lakes. Such plants strongly influence the structure and functioning of freshwater ecosystems by providing habitat and shelter, breeding areas and food resources for other aquatic and terrestrial species (Lacoul and Freedman 2006;Bornette and Puijalon 2009). In addition, macrophytes respond to decreased light availability, increased sedimentation and nutrient concentrations (Lacoul and Freedman 2006;Bornette and Puijalon 2011). Many of these plant species are also intolerant of high salinity concentrations and are, therefore, mainly confined to freshwater lakes (Nielsen et al. 2003;Bornette and Puijalon 2011). Although these local environmental variables are often important for aquatic macrophytes, it is still unclear whether environmental filtering or spatial processes mostly structure aquatic macrophytes in accordance with metacommunity dynamics (Leibold et al. 2004;Winegardner et al. 2012). In the metacommunity theory, the fundamental aim is to understand whether environmental factors or spatial processes (e.g., dispersal limitation) are key drivers of biological communities (Heino et al. 2015;Brown et al. 2017). Recently, Alahuhta et al. (2018) found that environmental filtering controlled macrophyte community variation in most of the studied metacommunities in a global analysis across 16 regions. However, spatial processes have been found to significantly structure macrophyte communities in some mountain ponds, wetlands and floodplain lakes (Hájek et al. 2011;Padial et al. 2014;Alahuhta et al. 2018). Thus, more research is needed to gain a better understanding of the metacommunity structuring in aquatic macrophytes across regions. Specifically, macrophyte metacommunities found in specific water body types, such as brackish lakes, are understudied.
Aquatic macrophyte communities of Baikal Siberia have been poorly studied. Most previous studies have so far focused on the largest freshwater lake in the region, Lake Baikal (Izhboldina et al. 2017a, b), or were mainly descriptive in relation to the aquatic and wetland flora of the region (Chepinoga and Rosbakh 2012;Chepinoga et al. 2013;Chepinoga 2015). However, very little is known about how species richness-environment and species composition-environment relationships are formed in smaller, often brackish lakes across Baikal Siberia. To fill this knowledge gap, we investigated 93 lakes situated in this remote area and tested whether environmental filtering or dispersal-related variables mostly drive the helophyte and hydrophyte communities. Specifically, we examined the relative roles of environmental variables, dispersal variables, spatial variables and regional identity in explaining variation in the species richness and species composition of helophytes and hydrophytes in the studied lakes. We expected that environmental filtering more strongly affects the communities of hydrophytes than those of helophytes (O'Hare et al. 2012;Alahuhta et al. 2014), due to the differences in their ecophysiology (Viana et al. 2014). For example, helophytes mostly use nutrients from sediment and carbon from the atmosphere, are more vulnerable to climate effects compared with hydrophytes, and respond strongly to changes in shoreline structure and water level fluctuations (Partanen et al. 2009;Alahuhta et al. 2014;Kolada 2016).

Study area
Baikal Siberia, our study area (Fig. 1), is located in the eastern part of Southern Siberia, congruent with the catchment areas of Lake Baikal and the River Angara that flows out of the lake. Baikal Siberia extends across almost 15° of latitude (49°09′-64°19′ N) and 26° of longitude (95°39′-122°08′ E), and it covers 1.6 million km² of southeastern Siberia, equaling approximately 15% of the size of Europe. Details of the study area can be found in Supporting Information (Appendix S1).

Macrophyte surveys
In order to cover the existing aquatic macrophyte diversity of Baikal Siberia's lakes, we selected seven local study regions arranged along a transect stretching for 2000 km from NW to SE of the region. The local study regions are confined to the plain parts of seven river basins of comparatively similar size arranged from west to east (Biryusa, Iya, Oka, Belaya, Khilok, Ingoda, Onon; Fig. 1; Appendix S1). In each local study region (Table S1), we studied from 10 to 23 lakes (floodplain or permanent lakes, ponds, quarries), which were surveyed for the presence of aquatic macrophytes. Vascular plants were surveyed at each lake by walking along the lakeshore (up to 10-20 m from the shoreline), by fording or by boat, with the assistance of rakes and hydroscopes. Great care was taken to avoid plant identification bias by collecting herbarium specimens from taxonomically 'problematic' groups (e.g. Carex, Batrachium, Potamogeton) for cross-checking by leading taxonomists of the corresponding groups in the region. The surveys were carried out between July and the beginning of September 2002-2008. For further statistical analysis, we used macrophyte species composition data (separately for helophytes and hydrophytes following the classification of Chepinoga 2015) expressed as a site-by-species presence-absence matrix. Altogether, data from 93 lakes with helophytes and 92 lakes with hydrophytes were used in the analysis.

Explanatory variables
Four sets of explanatory variables were used: (1) environmental variables, (2) dispersal variables, (3) spatial variables, and (4) region identity (Table 1). Environmental variables consisted of study site biome (dummy variables of taiga, forest-steppe or steppe), shoreline length (m), sediment type (dummy variables of gravel, sand, mud), water transparency (in meters), habitat trophic-salinity status (a variable ranging from 1 to 30), trophic status (dummy variables of oligotrophic, mesotrophic and eutrophic), growing degree days (GDD; > 5 °C), temperature of the coldest month (°C) and Thornthwaite aridity index (TAI; Thornthwaite 1948). The biome data were obtained from the vegetation map of the southern part of East Siberia (Belov 1973). The sediment type was determined visually at study sites with transparent waters and/or by driving the rake manually into the sediment in turbid waters. Shoreline length indicates horizontal habitat availability, as water bodies with longer shores offer more varied habitats for macrophyte establishment (Lacoul and Freedman 2006). The water transparency was measured with a Secchi disk at several locations in each lake at the time of the macrophyte surveys, and it reflects vertical habitat availability (Toivonen and Huttunen 1995). We used a community mean indicator value for nutrient-salt contents in soil/water (hereafter 'NS index') as a proxy for habitat trophic-salinity status (Diekmann 2003;Korolyuk 2006). Briefly, this indicator value adopted from Korolyuk (2006) indicates the position of a species along the nutrient-salinity gradient. The concept of the NS index was initially developed for grasslands of Southern Siberia (Tsatsenkin et al. 1974) and differs from the widely used European Ellenberg and Landolt indicator values.
The main difference of this indicator value is that it includes species ecological optima along salinity and nutrient (NPK) gradients, the two most important factors driving grassland species occurrence and frequency in the region (Korolyuk 2006). The rationale of merging these two environmental variables in one indicator is that high salinity levels in soils have a stronger, usually negative impact on plant growth than available nutrients even at high concentrations. The NS index has 30 grades, where the grades from 1 to 4, 5 to 8 and 9 to 12 indicate species with ecological optima in oligo-, meso- and eutrophic habitats, respectively, with very low concentrations of dissolved salts. The range from 13 to 16 is occupied by plants typical for habitats with a slight salinity. The grades from 18 to 30 are reserved for plants with different levels of salt tolerance, ranging from species tolerating moderate salinity to salt desert ('solonchak') species (Tsatsenkin et al. 1974). The calculations were based on the complete vascular plant data including not only the hydrophytes and helophytes, but also species in reed belts and fen meadows occurring around lakes. Although we acknowledge the potential problem of circular argument when using the NS index, only a small proportion of studied species were used to calculate the index. This concern was addressed by our finding that the NS index was positively correlated with both actual salinity values (Spearman's R = 0.71, p < 0.001) and trophic status (Spearman's R = 0.58, p < 0.001) in a subset of the study lakes (n = 63). The trophic status was estimated subjectively based on water transparency, sediment type and depth and presence-absence of indicator species (e.g. free-floating Lemna spp. or Spirodela polyrhiza as indicators of high nutrient content in water).
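As an illustration of how a community mean indicator value of this kind can be computed from presence-absence data, the following minimal Python sketch averages indicator grades over the species recorded at a site. The species names and grade values are hypothetical placeholders rather than values from Korolyuk (2006), and the unweighted mean is an assumption consistent with presence-absence data, not a confirmed detail of the calculation used in the study.

```python
from statistics import mean

# Hypothetical indicator grades on the 1-30 NS scale (illustration only;
# these are NOT the published values of Korolyuk 2006).
EXAMPLE_GRADES = {
    "species_a": 6,   # mesotrophic optimum (grades 5-8)
    "species_b": 11,  # eutrophic optimum (grades 9-12)
    "species_c": 15,  # slight salinity (grades 13-16)
    "species_d": 22,  # salt tolerant (grades 18-30)
}


def community_mean_ns(present_species, grades):
    """Return the unweighted mean grade of the species present at a site.

    Species without a known grade are ignored. Returns None when no
    recorded species has a grade, because the index is then undefined.
    """
    values = [grades[name] for name in present_species if name in grades]
    return mean(values) if values else None


# One hypothetical lake with two graded species and one ungraded species.
site = ["species_a", "species_c", "unlisted_species"]
print(community_mean_ns(site, EXAMPLE_GRADES))  # prints 10.5
```

Because the mean is taken over all vascular plants recorded around a lake, a site value can fall between the verbal grade classes described above.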
The GDD (i.e., annual sum of all the days with temperature > 5 °C) is a direct determinant of the growth season length and intensity, whereas January temperature is a proxy for harsh winter conditions affecting macrophytes through thick ice cover, ice erosion and freezing of sediments (Lind et al. 2014;Nilsson et al. 2015). The TAI reflects the balance between precipitation and evaporation in a given climate and is positively correlated with the probability of salinization of a water body (Korolyuk et al. 2017). The TAI values above zero indicate humid climates, whereas negative values indicate arid climates. GDD5 and temperature data were obtained from WorldClim 2 (Fick and Hijmans 2017) and TAI from the ENVIREM (Title and Bemmels 2018) database for the period of 1970-2000. The dispersal-related variables comprised distances (in meters) to the nearest river, the nearest lake and the nearest settlement, as well as minimum lake age. The distance to river (based on Euclidean distances) was included in the analysis because flowing waters act as an important dispersal vector (Johansson et al. 1996). The distance to the nearest lake can be used as a proxy for isolation: the larger the distance, the lower the probability of diaspore dispersal (Padial et al. 2014).
Humans have been shown to affect the dispersal processes of aquatic macrophytes (Cutway and Ehrenfeld 2010); therefore, distance to the nearest settlement was included as a proxy for the probability of human-mediated seed dispersal. We also assumed that minimum lake age could reflect dispersal processes, with older lakes having experienced more dispersal events than younger ones. The minimum lake age was identified by comparing topographic maps and satellite images issued in different years (from 1861 to 2016). This variable reflects the minimum age of each lake (i.e., the lakes are at least this old), but in some cases they can be older. ArcGIS 10 (ESRI, Redlands, CA, US) was used to measure shoreline length and to estimate the distances from each lake to the nearest settlement, river and lake. The measurements were based on recent satellite images (2006-2016).
Three types of biome (taiga, forest-steppe and steppe), substrate (gravel, sand and mud) and trophic status (oligotrophic, mesotrophic and eutrophic) were used as dummy variables in the models but shown here. Spatial variables derived from staggered distance-based Moran's Eigenvector Maps based on Euclidean distances among the lakes are deliberately not shown, because their values are not comparable per se with other explanatory variables. Region identity refers to different river basins (see Table S1 for details)
Spatial variables were obtained using staggered distance-based Moran's Eigenvector Maps (dbMEM; Declerck et al. 2011;Borcard et al. 2011) for lakes in each local study region situated in seven different river basins (see Fig. 1, Appendix S1). Whereas dispersal variables indicate the potential dispersal of aquatic macrophyte species from nearby water bodies and via humans acting as dispersal vectors in the landscape, spatial variables mirror geographical patterns among spatially clustered lakes. Thus, spatial variables can, for example, reflect potential dispersal of species and/or spatial structure in environmental variables among lakes (see below for details). Geographical distances are converted into dbMEMs that map neighborhood relationships at different scales onto orthogonal and linearly uncorrelated components. dbMEMs depend on the eigenfunctions of the matrix of truncated geographic distances among the study sites and can display spatial autocorrelation in the data. Staggered dbMEMs are specifically developed for situations where sites are not evenly distributed in the landscape but irregularly separated into smaller groups (Declerck et al. 2011), as in our study system comprising the seven river basins. Traditional dbMEMs cannot properly model spatial relationships among sites in this kind of uneven landscape (Borcard et al. 2011;Declerck et al. 2011).
Staggered dbMEMs describe spatial relationships among the sites of a focal group but carry no information about spatial relationships among groups. For this reason, during the calculation of canonical analysis, the sites of a focal group have, with each other, relationships defined by the dbMEMs of that group, whereas the sites outside the focal group have weights of 0 (Borcard et al. 2011). In our study, we used Cartesian coordinates of lake centroids to calculate Euclidean distances between lakes in each focal lake group within separate river basins, and only positive eigenvectors based on minimum truncation distances were employed. Staggered dbMEMs were estimated using the "adespatial" package in R (Dray et al. 2018).
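The first step of a dbMEM analysis, truncating the among-site distance matrix within one group of lakes, can be sketched as follows. This Python example assumes the commonly used convention in which the truncation threshold is the longest edge of the minimum spanning tree connecting the sites, and distances above the threshold are replaced by four times the threshold. The coordinates are hypothetical, and the eigen-decomposition that yields the dbMEM eigenvectors (carried out with the adespatial package in the study) is not reproduced here.

```python
from math import dist, inf


def mst_longest_edge(coords):
    """Longest edge of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(coords)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [inf] * n      # cheapest known link from the tree to each site
    best[0] = 0.0         # start the tree at the first site
    longest = 0.0
    for _ in range(n):
        # Pick the site outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        longest = max(longest, best[u])
        for v in range(n):
            if not in_tree[v]:
                d = dist(coords[u], coords[v])
                if d < best[v]:
                    best[v] = d
    return longest


def truncated_distances(coords, threshold):
    """Distance matrix with values above the threshold set to 4 * threshold."""
    n = len(coords)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = dist(coords[i], coords[j])
                result[i][j] = d if d <= threshold else 4.0 * threshold
    return result


# Hypothetical Cartesian coordinates of four lake centroids in one basin.
lakes = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
threshold = mst_longest_edge(lakes)
matrix = truncated_distances(lakes, threshold)
```

In a staggered design, this step would be repeated separately for the lakes of each river basin, so that no distances between basins enter the calculation.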
We used a dummy variable of "region identity" to indicate historical effects and biogeographical factors among the seven study regions (see Appendix S1 for details of different study regions). Each dummy variable takes a value of 0 or 1: lakes belonging to a given river basin are coded 1 for that basin, whereas lakes not belonging to it are coded 0.
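The coding described above can be sketched in a few lines of Python. The function name and the short list of lakes are hypothetical and serve only to show the 0/1 layout; the basin names are taken from the study regions.

```python
def region_dummies(basins):
    """Return (levels, rows): one 0/1 column per basin and one row per lake."""
    levels = sorted(set(basins))
    rows = [[1 if basin == level else 0 for level in levels] for basin in basins]
    return levels, rows


# Hypothetical basin membership of four lakes.
lake_basins = ["Oka", "Onon", "Oka", "Khilok"]
levels, rows = region_dummies(lake_basins)
print(levels)  # prints ['Khilok', 'Oka', 'Onon']
print(rows)    # prints [[0, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
```

In a regression model that includes an intercept, one of these columns is normally dropped, because the full set of columns sums to 1 for every lake.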

Statistical methods
We applied partial linear regression (pLR) and partial redundancy analysis (pRDA) to explain the relationships between variation in helophytes and hydrophytes and four different explanatory variable groups in 93 and 92 study lakes, respectively. pLR was employed for species richness data and pRDA for species composition. For species composition, species presence-absence data were Hellinger transformed prior to further analysis (Legendre and Gallagher 2001). The well-established protocol of Borcard et al. (1992) was followed for pLRs and pRDAs. We partitioned total variation in helophyte and hydrophyte species richness and species composition into 16 fractions: (a) pure effect of environment, (b) pure effect of dispersal, (c) pure effect of spatial processes, (d) pure effect of region identity, and the joint effects of all combinations of these four variable sets, as well as unexplained variation. Statistical significance of pure fractions (p < 0.05) was evaluated using the anova function, but joint fractions could not be tested (see Oksanen et al. 2015). The comprehensive protocol to estimate these fractions is explained in Borcard et al. (2011).
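For reference, the Hellinger transformation (Legendre and Gallagher 2001) divides each value by its site total and takes the square root of the result. A minimal Python sketch with a hypothetical presence-absence matrix follows; returning zeros for a site without any species is a choice made for this illustration only.

```python
from math import sqrt


def hellinger(matrix):
    """Hellinger-transform a site-by-species matrix given as a list of rows."""
    transformed = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            # The transformation is undefined for empty sites; keep zeros here.
            transformed.append([0.0 for _ in row])
        else:
            transformed.append([sqrt(value / total) for value in row])
    return transformed


# Hypothetical presence-absence data: three sites (rows), four species (columns).
presence_absence = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
transformed = hellinger(presence_absence)
```

With presence-absence data, every species present at a site receives the same value, the square root of one over the site's species richness, so the squared values of each non-empty row sum to 1.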
We assessed the variation explained by each variable group with adjusted R², which provides unbiased estimates of the explained variation (Peres-Neto et al. 2006). The use of adjusted R² values often decreases the percentage of variation explained, resulting in a substantial amount of unexplained variation (Alahuhta and Heino 2013;Tonkin et al. 2016;Alahuhta et al. 2018). We used a variable selection procedure, where the forward selection was carried out only if a global test using all explanatory variables in a variable set was significant (Blanchet et al. 2008). Forward selection using the Monte Carlo permutation test (100 permutations, p = 0.05) was then used to select significant variables for the analysis. The region identity dummy variables were forced into the models. In addition, RDAs were used to visualize which environmental and dispersal variables influenced species composition. All pLRs and pRDAs were performed separately for helophytes and hydrophytes in the R statistical environment with the ordiR2step and varpart functions implemented in the 'vegan' package (Oksanen et al. 2015).
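The arithmetic behind adjusted R² and variation partitioning can be sketched for the simpler case of two explanatory sets (the study itself partitions among four sets with the varpart function). The Python example below uses the standard adjustment for sample size and number of predictors and obtains the fractions by subtraction; the function names and all numeric inputs are hypothetical.

```python
def adjusted_r2(r2, n_sites, n_predictors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    denominator = n_sites - n_predictors - 1
    if denominator <= 0:
        raise ValueError("need more sites than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_sites - 1) / denominator


def partition_two_sets(adj_set1, adj_set2, adj_both):
    """Split explained variation into pure, shared and unexplained fractions.

    All three inputs are adjusted R2 values: one model per explanatory set
    and one model containing both sets together.
    """
    return {
        "pure_set1": adj_both - adj_set2,
        "shared": adj_set1 + adj_set2 - adj_both,
        "pure_set2": adj_both - adj_set1,
        "unexplained": 1.0 - adj_both,
    }


# Hypothetical adjusted R2 values for an environmental set, a dispersal set
# and the model with both sets combined.
fractions = partition_two_sets(0.30, 0.20, 0.40)
```

The four fractions always sum to 1, and a shared fraction can be negative when two sets explain more together than the sum of their separate contributions.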

Species richness
Altogether, we found 106 macrophyte species, divided into 53 helophytes and 53 hydrophytes (Appendix S2). pLRs accounted for 58.9% of the total variation in species richness for helophytes and 39.0% for hydrophytes (Table 2). Of the pure fractions, environmental variables clearly explained most variation for both plant groups (helophytes: 29.1%, and hydrophytes: 25.4%). Dispersal variables (4.6%) and region identity (7.0%) were also marginally statistically significant for helophytes (p = 0.054 and p = 0.075, respectively). Of the joint fractions, environment, dispersal and region identity contributed notably for helophytes (10.5%) and hydrophytes (4.5%). Joint effects of environment and region (5.0%) were important for helophytes, whereas joint effects of environment and dispersal had a relatively high influence on hydrophytes (7.8%). Spatial variables had no effect on the species richness of aquatic macrophytes.
Of the environmental variables, the nutrient-salinity index was the most important variable for both plant groups, followed by shoreline length, steppe biome and transparency for helophytes and eutrophic status and GDD5 for hydrophytes (Table 3). Distance to settlements had the highest effect on helophytes, whereas distance to river was the most important dispersal variable for hydrophytes but also for helophytes.

Species composition
Total explained variation for macrophyte species composition was 31.2% for helophytes and 19.3% for hydrophytes (Table 2). A pure fraction of environmental variables had the highest effect on both plant groups (helophytes: 9.1%, and hydrophytes: 5.1%). In addition, pure fractions of dispersal and region identity influenced helophytes (2.4% and 2.9%, respectively) and hydrophytes (1.9% for both fractions). A pure fraction of spatial variables (1.0%) was marginally statistically significant for hydrophytes (p = 0.063). Of the joint fractions, environmental variables and region identity had the highest contribution for helophytes (8.0%) and hydrophytes (3.3%). Joint contributions of (a) environment, dispersal and region (4.9% and 2.4%, respectively), and (b) environment and dispersal (3.9% and 1.2%, respectively) had some effect on both plant groups. For the hydrophytes, the joint fraction of environment, dispersal and spatial variables also contributed to species composition (1.2%). The variation partitioning results were similar when the nutrient-salinity index was removed for both species richness and species composition (Appendix S3). Of the environmental variables, habitat nutrient-salinity index was the most important variable for species composition in both functional plant groups. Many other environmental variables (temperature of the coldest month for both plant groups, GDD5, oligotrophic status, sand substrate and steppe biome for helophytes, and eutrophic status, taiga biome, mud substrate and shoreline length for hydrophytes) were also influential.
Of the dispersal variables, distance to river had the highest contribution to both plant groups, followed by minimum lake age for helophytes and distance to settlements for hydrophytes. Spatial variables contributed significantly only to hydrophytes.
Table 2 Variation partitioning (adjusted R² × 100) based on linear regression for species richness (SR) and pRDA for species composition (SC) divided among environmental variables, dispersal variables, spatial variables (based on staggered distance-based Moran's Eigenvector Maps using Euclidean distances among the lakes) and region identity (i.e., river basins)
The RDA ordination indicated that there was a distinct compositional pattern in our data set. The first two axes accounted for 27.0% and 14.8% of the total variance in species composition for helophytes and hydrophytes, respectively (Figs. 2, 3). Most of this variation was attributed to axis 1 in both functional groups (20.8% and 10.8%, respectively), which correlated (|r| > 0.5) positively with NS index, distance to either river (helophytes) or settlements (hydrophytes), minimum lake age, steppe biome, sandy substrate, mean temperature of the coldest month and oligotrophic status, and negatively with muddy substrate, taiga biome and TAI. Thus, axis 1 indicates primarily the trophic-salinity status, the most important habitat variable revealed by pRDA analysis (Table 2). Therefore, the right side of both ordinations contains relatively old, isolated permanent steppe lakes characterized by brackish water and sandy substrates and located in steppe as well as forest-steppe regions of Transbaikalia (Khilok, Onon and partly Ingoda river basins; Fig. 1). The left part of the ordinations is occupied by more diverse (relative to axis 2) and younger lakes (Appendix S4, S5). The ecological explanation of axis 2 is more complicated, but following the position of the ordinated species (Appendix S4, S5), we can suggest that the axis is partly connected with trophic status and bogging processes. Thus, in the upper part of the ordinations, we see Glyceria triflora, Schoenoplectus tabernaemontani and Eleocharis palustris, which prefer mesotrophic or oligotrophic conditions and can even create communities along river banks (Chepinoga 2015). In the bottom half of the ordinations, we see species such as Typha latifolia, Comarum palustre and Calla palustris, which can form carpet-like stands floating on the water surface in mesotrophic or dystrophic waters.
Hence, whereas Transbaikalian lakes from different river basins are ordinated mainly along axis 1, the mainly freshwater lakes from Cisbaikalia (Biryusa, Iya, Oka, and Belaya river basins) are arranged along axis 2. This regularity is evident in both functional groups, but it is more visible for helophytes than hydrophytes.

Discussion
Previous studies on lake macrophyte metacommunities have found that environmental filtering dominated over spatial structuring in explaining variation in lake macrophyte communities (Alahuhta et al. 2018), although opposite patterns have been reported in certain lake types (Hájek et al. 2011;Padial et al. 2014). Our results revealed that environmental filtering explained most variation for both species richness and species composition of helophytes and hydrophytes, but dispersal-related variables (i.e. spatial and dispersal) were also influential. In addition, we found the dispersal-related variables were stronger correlates of hydrophyte communities than helophyte communities. For all the groups studied, the nutrient-salinity index was the most important individual driver of species richness and species composition.
Fig. 2 Biplots of partial redundancy analysis (RDA) investigating the effect of 19 variables on helophytes surveyed in 93 Baikal Siberia lakes. Ordination space represents sites distributed based on Euclidean similarity as determined by species abundance. Arrow length is proportional to the strength of correlation between each variable and the RDA axes. RDA axes 1 and 2 explain 27.0% and 20.8% of observed variation, respectively. See Table S2 for lake numbers. bio6 temperature of the coldest month, GDD5 growing degree days, NS nutrient-salinity index, Driver distance to the nearest river

Environmental filtering prevails in lake macrophyte metacommunities
We found that environmental variables were the best correlates of species richness and species composition of lake macrophytes, which is in line with the idea of species sorting (Leibold et al. 2004). Environmental variables, incorporating local, landscape-level and climate features, had the strongest correlations with lake macrophyte communities. This outcome was especially notable for species richness. Moreover, the relatively high importance of region identity (solely or together with environmental variables) further emphasized the importance of environmental filtering on macrophytes. Region identity can indirectly mirror the influence of historical effects and climatic forcing on local community structure (Declerck et al. 2011;Heino et al. 2017). Thus, our correlative results lend support to previous investigations that have similarly found that environmental filtering controls variation in species richness and species composition of lentic macrophytes at the regional scale (Alahuhta et al. 2018;Viana et al. 2014).
Although spatial variables were important only for the species composition of hydrophytes, the pure fraction of dispersal variables was important for the species richness of helophytes, and species composition of both plant groups. This finding suggests that dispersal drives lake macrophyte communities to some extent. Similar results have been reported in various other regions (O'Hare et al. 2012;Viana et al. 2014). Lake macrophytes have traditionally been considered to have wide geographic distributions, due to efficient colonization capacity and good dispersal abilities of many species (Santamaría 2002). However, a global analysis on aquatic macrophytes reported that only 1% of all aquatic plants have global range sizes and most species are confined to certain continents and ecoregions (Murphy et al. 2019). Thus, it is likely that dispersal is not the controlling factor of lake macrophyte communities at a regional scale. Instead, the importance of dispersal (e.g. dispersal limitation) may increase at continental and global spatial scales for macrophytes (Heino et al. 2015;Brown et al. 2017).
See Table S2 for lake numbers. NS nutrient-salinity index, Driver distance to the nearest river
Concerning other specific differences between helophytes and hydrophytes, overall explained variation was higher for both species richness and species composition of helophytes than those of hydrophytes. However, environmental variables had the highest correlations with both helophytes and hydrophytes, suggesting dominance of environmental filtering on these plant groups. In contrast, dispersal-related variables (both spatial and dispersal) were slightly more important for the species composition of hydrophytes than that of helophytes. The joint fraction of environmental and dispersal variables was also considerably high for species richness of hydrophytes. One possible explanation for the patterns detected is that some helophytes (e.g., Phragmites, Carex, and Typha) can be effectively dispersed by wind and water (Barrat-Segretain 1996;Soomers et al. 2013), and by endozoochorous and epizoochorous dispersal by waterfowl and shorebirds (Soons et al. 2016;Lovas-Kiss et al. 2019). Thus, the distance to sources of diaspores (e.g. rivers, settlements, other lakes) should not be as crucial for them as for hydrophytes. In addition, many helophyte species are adapted to variable environmental conditions, possess high reproduction rates and are opportunistic competitors (Santamaría 2002). However, contradictory findings have been presented in previous investigations. In Scottish lakes, spatial processes explained more variation in submerged than emergent macrophyte communities (O'Hare et al. 2012), whereas no congruent differences were found between helophytes and hydrophytes in the lakes of Midwestern USA. Alahuhta et al. (2014) discovered that space contributed more to the species richness of hydrophytes than helophytes, but no similar trend was evident for species composition patterns.
In the Brazilian coastal wetlands, environmental variables solely explained the species composition of floating-leaved macrophytes and spatial variables were only important for submerged communities, whereas both environment and space were important for the species composition of emergent macrophytes (Trindade et al. 2018). Thus, variable patterns in environmental filtering vs. spatial processes between functional macrophyte groups seem to exist in different regions.
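As a rough illustration of how pure and joint fractions such as those discussed above can be derived for species richness, the following Python sketch partitions adjusted R² between two sets of predictors. It is a simplified two-set version (environment vs. dispersal), whereas the analysis discussed here involved four sets (environment, dispersal, space and region). The data are synthetic, and the function and variable names (adjusted_r2, partition_two_sets, env, disp, richness) are hypothetical and not taken from the authors' analysis.

```python
import numpy as np

def adjusted_r2(y, X):
    """Adjusted R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])          # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)  # least-squares coefficients
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    p = X1.shape[1] - 1                            # number of predictors
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def partition_two_sets(y, env, disp):
    """Split explained variation into pure, joint and unexplained fractions."""
    r2_env = adjusted_r2(y, env)
    r2_disp = adjusted_r2(y, disp)
    r2_both = adjusted_r2(y, np.column_stack([env, disp]))
    return {
        "pure_env": r2_both - r2_disp,         # environment after removing dispersal
        "pure_disp": r2_both - r2_env,         # dispersal after removing environment
        "joint": r2_env + r2_disp - r2_both,   # shared (can be negative)
        "unexplained": 1.0 - r2_both,
    }

# Synthetic data only (assumption): 93 "lakes", two environmental predictors
# and one dispersal predictor that is partly correlated with the environment.
rng = np.random.default_rng(0)
n = 93
env = rng.normal(size=(n, 2))
disp = 0.5 * env[:, :1] + rng.normal(size=(n, 1))
richness = 2.0 * env[:, 0] + 0.5 * disp[:, 0] + rng.normal(size=n)

fractions = partition_two_sets(richness, env, disp)
for name, value in fractions.items():
    print(f"{name}: {value:.3f}")
```

By construction the four fractions sum to one. For community composition, the same logic would be applied with redundancy analysis in place of linear regression.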

Ecological gradients related to species composition and species richness of lake macrophytes
The habitat nutrient-salinity index had the strongest influence on the species richness and species composition of helophytes and hydrophytes. We discovered that a particular group of lakes, predominantly occurring in steppe and forest-steppe biomes, was notably structured along a nutrient-salinity gradient, which was relatively short in our study, and species richness decreased strongly with increasing salinity. These clustered lakes were oligotrophic with brackish water and nutrient-salinity values over 13. Nutrients and salinity interact in such a way that, when salinity is too high, plants cannot take up nutrients from the water even if it is nutrient-rich (Nielsen et al. 2003; Bornette and Puijalon 2009). This is largely because high concentrations of dissolved salts are toxic to the majority of macrophyte species (Nielsen et al. 2003; Lacoul and Freedman 2006; Bornette and Puijalon 2011). In these lakes, species responding clearly positively to the nutrient-salinity gradient included Bolboschoenus planiculmis, Triglochin maritima, Phragmites australis and Stuckenia pectinata. These species either prefer brackish habitats (the first two species) or have wide environmental preferences (the latter two species; Tsatsenkin et al. 1974). Nevertheless, the majority of macrophytes in Baikal Siberia are freshwater species. The most striking example is Equisetum fluviatile, which responded strongly negatively to the nutrient-salinity gradient (Appendix S2). Although this helophyte species has a wide tolerance to water and substrate pH and nutrient levels, it rarely grows in brackish waters (Pearce and Corder 1988). Thus, our results confirm that a high concentration of salts is a stress factor (Melack 1988), which is why the floras of oligotrophic brackish lakes differ considerably from those of other freshwater habitats.
These clustered brackish, permanent lakes, mostly located in the steppe and forest-steppe biomes of southern Transbaikalia, were also the oldest, had the longest distances to both rivers and settlements, and were subjected to the coldest temperatures. This suggests that the temporal availability of colonization sources for macrophytes was highest in the oldest lakes, for which environmental conditions have had the most time to stabilize (e.g., Alahuhta et al. 2018). On the other hand, these lakes are also the most isolated. Although waterfowl may contribute to the dispersal of macrophyte species in these lakes (Soons et al. 2016), the possibilities for these lakes to receive propagules from nearby rivers and via human-mediated dispersal vectors are somewhat hindered. This pattern was also noticeable for species richness, which decreased linearly with increasing distances. However, even long-distance dispersal can be hampered by niche processes, especially in harsh environmental conditions (Viana et al. 2015) such as the high salinity levels in our study. Moreover, environmental filtering and dispersal were equally important in the early successional stage of plant communities in stream riparian habitats, although the importance of environmental filtering increased towards the wet end of the riparian gradient (Fraaije et al. 2015). Thus, it seems that landscape-level dispersal is important for macrophytes at least in these isolated lakes, but it is overshadowed by environmental filtering.
The correlations between climate and macrophyte communities varied considerably between species richness and species composition and between helophytes and hydrophytes. In general, GDD5 and mean temperature of the coldest month had the strongest relationships with the species richness and species composition of macrophytes, respectively. The temperature during the growing season strongly affects macrophytes through heat accumulation, which has a direct effect on metabolic processes and influences primary production (Brown et al. 2004). However, GDD5 was important only for the species richness of hydrophytes and the species composition of helophytes. Mean temperature of the coldest month mirrored harsh winter conditions (via ice erosion, freezing of sediments, and limitation of light penetration and air-water gas exchange due to thick ice and snow cover), which are known to limit macrophyte distributions in freshwaters (Lind et al. 2014; Nilsson et al. 2015). Milder winters favor the growth and survival of many hydrophyte species (Netten et al. 2011), but helophyte species composition was similarly influenced. The aridity index had a negligible role for aquatic macrophytes compared to the other climate variables. The reason for this could be the shortness of the humidity-aridity gradient covered by our data, as all lakes were located entirely in regions with a strongly continental or ultra-continental climate.
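To make the notion of heat accumulation concrete, the sketch below computes growing degree days above a 5 °C base from a series of daily mean temperatures. This is only one common way of defining such an index; how GDD5 was actually derived for the study lakes is not stated in this section, so the daily-mean input, the function name gdd5 and the example temperatures are all assumptions for illustration.

```python
def gdd5(daily_mean_temps_c, base_c=5.0):
    """Sum the daily mean temperature excess above base_c (degree-days).

    Days at or below the base temperature contribute nothing.
    """
    return sum(max(t - base_c, 0.0) for t in daily_mean_temps_c)

# Hypothetical five-day series of daily mean air temperatures (°C).
example_temps = [2.0, 5.0, 7.5, 12.0, 4.0]
example_gdd = gdd5(example_temps)  # only 7.5 and 12.0 exceed the 5 °C base
print(example_gdd)
```

Under this definition, a lake in a colder part of the region accumulates fewer degree-days over the same period, which is the sense in which GDD5 summarizes growing-season warmth.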
In summary, our analysis of lake macrophyte communities in Baikal Siberia documented both similarities and differences in the species richness-environment and species composition-environment relationships in comparison to previous studies. Environmental filtering has been found to primarily control macrophyte metacommunities in different parts of the world (Alahuhta et al. 2018), but spatial processes have been shown to drive macrophytes more than species sorting, for example, in wetlands and floodplain lakes (Hájek et al. 2011; Padial et al. 2014). Our correlative findings suggest that environmental filtering is the primary driver of macrophyte communities in the lakes of Baikal Siberia, but dispersal also plays some role for both helophytes and hydrophytes. However, overall explained variation was relatively low, suggesting that multiple mechanisms may interact to determine species richness or compositional differences among lakes. In addition, we discovered that brackish water was the main determinant for helophytes and hydrophytes, which has seldom been reported in freshwater systems.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.