Many endangered plant species are considered to suffer from the effects of ongoing habitat destruction, which has often led to increased isolation and fragmentation of their populations. As a consequence, populations of rare plants often lose genetic diversity because alleles are lost through genetic drift (Young et al. 1996; Frankham and Wilcken 2006; Yuan et al. 2012). Both genetic differentiation among populations and the loss of genetic diversity are expected to increase with time since fragmentation (Coates 1988; Gitzendanner and Soltis 2000; Zawko et al. 2001). In Europe, nutrient-poor wet grasslands are among the most threatened and fragmented habitat types (van Duren et al. 1997; Bakker and Berendse 1999; Oelmann et al. 2009). Many of these grasslands have been drained and fertilized to increase agricultural productivity or have been abandoned. These changes in land use practices have led to the decline and genetic impoverishment of many plant species (Bakker and Berendse 1999; Vergeer et al. 2003; Wesche et al. 2012). Among the plant species that have become rare through the destruction and fragmentation of wet grasslands is the sword-lily Gladiolus palustris GAUDIN, a Western and Central European species which is declining all across Europe and is considered to be threatened in many countries (Bilz et al. 2011; UICN France et al. 2012). G. palustris is legally protected in the European Union (Annex II Habitats Directive 92/43/EEC) and in Switzerland. The closely related Gladiolus imbricatus L., considered to be less threatened, has a larger distribution area extending from Central to Eastern Europe (Fig. 1). Morphological distinction of the two taxa in the field is difficult without examining the corm, which is often not possible for conservation reasons. At the western limit of its distribution area in France, G. palustris occurs in very isolated populations, and some populations have been identified morphologically as G. imbricatus but their taxonomic status is unclear (Tison and Girod 2014).

Fig. 1 Distribution map and study sites of G. palustris and G. imbricatus across Central and Western Europe. For population names, see Table 1; regions are given in brackets. Gray circles represent G. palustris populations and black circles represent G. imbricatus populations. Taxonomic identification was based on morphology. Populations with evidence for hybridization based on genetic analyses are marked with an asterisk. Map created with SimpleMappr, modified from Meusel et al. (1965) and Szczepaniak et al. (2016)

Closely related congeneric taxa are likely to hybridize as they often have incomplete reproductive isolation (Mallet 2005, 2008; Soltis and Soltis 2009; Soltis 2013). When the ecological niches of closely related species are similar, they may coexist in certain parts of their distribution area, and hybridization can occur in these contact zones, especially when both taxa have the same ploidy level (Chapman and Abbott 2010), as is the case for G. palustris and G. imbricatus (2n = 4x = 60; Cantor and Tolety 2011). Hybridization and polyploidy strongly influenced the evolution of European Gladiolus species (van Raamsdonk and de Vries 1989; Cantor and Tolety 2011; Tison and Girod 2014). Molecular and morphological analyses have recently confirmed that G. palustris and G. imbricatus naturally hybridize to form the nothospecies G. × sulistrovicus (Cieślak et al. 2014; Szczepaniak et al. 2016) in the eastern part of the distribution area of G. palustris, where the distributions of both taxa partially overlap. Molecular data showed that in the case of G. × sulistrovicus hybridization is unidirectional, with G. palustris as the maternal species and G. imbricatus as the pollen donor (Szczepaniak et al. 2016).

Although interspecific hybridization is considered to be an important force in plant evolution (Soltis and Soltis 2009), it can also have detrimental effects on endangered plant taxa through genetic swamping and hybrid backcrossing with one of the parental taxa (Allendorf et al. 2001; Todesco et al. 2016). The discrimination between cryptic hybrids (Gaskin and Kazmer 2009) and purebred taxa has important implications for the delineation of conservation units (Allendorf et al. 2001). Clarifying the taxonomic status of populations of endangered species is important for appropriate conservation measures (Smith and Waterway 2008; Nierbauer et al. 2017). Although hybrid populations may be of some conservation interest, protective laws often ignore hybrids. It has been argued that hybrid populations are not worth conserving because hybridization events may be due to anthropogenic influences such as the introduction of plants and animals, habitat fragmentation and habitat modification (Allendorf et al. 2001). The use of molecular methods can help to clarify the taxonomic status of populations of closely related rare plant species (Turchetto et al. 2015; Segatto et al. 2017).

We studied the population genetic structure of the endangered G. palustris in the western part of its distribution area in France and Switzerland using AFLP markers to assess the effects of habitat fragmentation. We wanted to investigate the genetic diversity and conservation status of extant populations as a basis for more appropriate management measures. We furthermore investigated some isolated populations with unclear taxonomic status at the western limit of the distribution area of G. palustris. To this end, we sequenced the ITS region of the nuclear ribosomal DNA as well as two regions of the chloroplast DNA for a large number of G. palustris populations in Western and Central Europe and compared them to populations of the closely related G. imbricatus occurring in the same area.

We asked the following questions: (i) does habitat fragmentation affect the genetic structure and diversity of G. palustris populations? (ii) are the populations of G. palustris at the western distribution limit genetically distinct from core populations in Central Europe and what is their taxonomic status?

Materials and methods

AFLP study of G. palustris

Thirteen populations of G. palustris located in the western part of the distribution area in France and Switzerland were used to conduct an AFLP study (Table 1). The studied populations represent a large part of the extant populations in the area. At each study site, leaf samples were collected from individual plants along a 20 m long transect. In most populations, leaves from 14 to 16 individuals were sampled (Table 1). The leaf material was placed immediately in separate paper bags and preserved in silica gel until DNA extraction. The minimum distance between sampled plants was one meter in order to avoid sampling clones. Population sizes were estimated as the number of flowering individuals (Table 1).

Table 1 Location of the 32 studied populations of Gladiolus palustris and G. imbricatus in Western and Central Europe

The AFLP reactions were performed according to the AFLP® Core Reagent Kit (Invitrogen™). Approximately 100 ng of genomic DNA was digested at 37 °C for 2 h with 0.8 µl of the restriction enzymes EcoRI and MseI (1.25 U/µl each) and 2 µl of 5 × Reaction Buffer in a final volume of 10 µl. Endonucleases were then inactivated at 70 °C for 15 min.

Adaptor ligation was achieved by adding 9.6 µl of Adapter/Ligation Solution and 0.4 µl of T4 DNA ligase (1 U/µl) to the previous reaction and by incubating the mix for 2 h at 20 °C. After diluting the ligation solution to 2:5, 2 µl were used to perform pre-amplification by adding it to 9.4 µl of AFLP® Pre-amp Primer Mix I (Invitrogen™), 1.2 µl of 10 × PCR buffer plus Mg and 0.2 µl of Taq DNA polymerase (5 U/µl, Thermo Scientific). Polymerase chain reaction was performed with 20 cycles at 94 °C for 30 s, 56 °C for 1 min and 72 °C for 1 min. The solution was then diluted with 115 µl of ddH2O.

Two out of twelve primer combinations with distinct polymorphic loci were selected for the selective amplification: E-AAG/M-CAC and E-AAC/M-CAC. Amplifications were performed using 5 µl of diluted pre-amplification reaction added to 0.4 µl of dNTPs (10 mM), 2 µl of 10 × PCR buffer with (NH4)2SO4, 1.2 µl of MgCl2 (25 mM), 0.16 µl of Taq polymerase (5 U/µl, Thermo Scientific), 1 µl of EcoRI primer (1 µM) and 1 µl of MseI primer (5 µM) in a total volume of 20 µl. Amplifications were programmed for 1 cycle at 94 °C for 2 min, followed by 10 cycles of 20 s at 94 °C, 30 s at 66 °C and 2 min at 72 °C, in which the annealing temperature was reduced by 1 °C each cycle. Amplification continued for 20 more cycles under the same conditions except for an annealing temperature of 56 °C. The final extension step was conducted at 60 °C for 30 min.
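The touchdown profile of the selective amplification can be written out as an explicit list of annealing temperatures. The following is a minimal sketch, assuming that the first of the ten touchdown cycles anneals at 66 °C and each later touchdown cycle is 1 °C cooler; the function name and its parameters are illustrative and do not correspond to any thermocycler software.

```python
def touchdown_schedule(start_temp=66, step=1, touchdown_cycles=10,
                       final_temp=56, final_cycles=20):
    """Return the annealing temperature (deg C) of every selective PCR cycle."""
    # Touchdown phase: the annealing temperature drops by `step` each cycle.
    touchdown = [start_temp - step * i for i in range(touchdown_cycles)]
    # Constant phase: remaining cycles at the final annealing temperature.
    plateau = [final_temp] * final_cycles
    return touchdown + plateau


schedule = touchdown_schedule()
print(len(schedule))    # total number of amplification cycles
print(schedule[:10])    # touchdown phase
print(schedule[10:13])  # first cycles of the constant phase
```

Under this reading, the last touchdown cycle anneals at 57 °C, so the step down to 56 °C for the remaining 20 cycles continues the same 1 °C decrement.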

Capillary electrophoresis of all samples was performed with 5 µl of a 1:10 dilution of the selective amplification products of AFLP and 1 µl of a 1:4 dilution of ET550-ROX Size Standard (GE Healthcare) on a MegaBACE 500 DNA Analysis System (GE Healthcare) after a denaturation step of 95 °C for 2 min.

Statistical analysis of AFLP data

The fragments amplified by AFLP primers were scored using MegaBACE Fragment profiler v1.2 (GE Healthcare) as either present (1) or absent (0). Fragments with lengths between 60 and 500 base pairs were included in the analysis. Due to their geographic proximity, data from the two populations from Perrignier were pooled together for further statistical analysis resulting in a total of twelve populations. To estimate the error rate of the AFLP analysis we made three replicates from four randomly chosen individuals from four populations. The mean error rate per sample was calculated as the number of errors divided by the total number of phenotypic comparisons within replicated samples (see Paun and Schönswetter 2012).
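The error rate defined above (mismatches divided by the number of phenotypic comparisons within replicated samples) can be illustrated with a short calculation. This is a sketch under the assumption that all replicates of an individual are compared pairwise, locus by locus; the exact procedure of Paun and Schönswetter (2012) may differ in detail, and the toy profiles are hypothetical.

```python
from itertools import combinations


def aflp_error_rate(replicate_sets):
    """Mismatch rate across replicated AFLP profiles.

    replicate_sets: list of replicate groups; each group is a list of
    equal-length 0/1 band profiles obtained from the same individual.
    """
    mismatches = 0
    comparisons = 0
    for profiles in replicate_sets:
        # Compare every pair of replicates of the same individual.
        for a, b in combinations(profiles, 2):
            if len(a) != len(b):
                raise ValueError("profiles must have the same number of loci")
            mismatches += sum(x != y for x, y in zip(a, b))
            comparisons += len(a)
    if comparisons == 0:
        raise ValueError("no replicate comparisons available")
    return mismatches / comparisons


# Hypothetical toy data: two individuals, three replicates each, five loci.
toy = [
    [[1, 0, 1, 1, 0], [1, 0, 1, 1, 0], [1, 0, 0, 1, 0]],
    [[0, 1, 1, 0, 0], [0, 1, 1, 0, 0], [0, 1, 1, 0, 0]],
]
print(aflp_error_rate(toy))
```

In the toy data, one replicate of the first individual differs from the other two at a single locus, giving two mismatches in 30 locus comparisons.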

Genetic diversity within populations was estimated as Nei’s expected gene diversity (He), which averages the expected heterozygosity of the marker loci within a population (Nei 1987), using the program GenoDive, which allows correction for the unknown dosage of alleles in polyploids (Meirmans and Van Tienderen 2004). The fact that G. palustris is a tetraploid may affect the estimates of genetic diversity, as polyploids often maintain greater genetic diversity (Meirmans et al. 2018) and the dosage of alleles is not known (Dufresne et al. 2014). However, due to the sheer abundance of anonymous nuclear markers scattered over the entire genome, dominant markers such as AFLP are considered to be a powerful source of information to study population genetic structure even in polyploids (see Dufresne et al. 2014). In GenoDive, dominant data such as AFLP are coded as haploid by setting the maximum ploidy level to one. To test for a relationship between population size and genetic diversity, we correlated log-transformed population size with Nei’s gene diversity.
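The haploid coding of dominant markers can be illustrated as follows. This is a minimal sketch that treats every AFLP locus as a haploid locus with two states (band present or absent) and averages 1 − Σp² over loci; it is not GenoDive's implementation, which may apply sample-size or other corrections. The function names and the toy values are hypothetical.

```python
import math


def gene_diversity(profiles):
    """Mean expected heterozygosity over loci for 0/1 band profiles."""
    n = len(profiles)
    n_loci = len(profiles[0])
    total = 0.0
    for locus in range(n_loci):
        p = sum(ind[locus] for ind in profiles) / n  # band frequency
        total += 1.0 - p ** 2 - (1.0 - p) ** 2
    return total / n_loci


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical population of four individuals scored at two loci.
toy_population = [[1, 0], [1, 1], [0, 1], [1, 1]]
print(gene_diversity(toy_population))

# Hypothetical population sizes and diversities, correlated after log-transformation.
sizes = [10, 100, 1000]
diversities = [0.18, 0.22, 0.26]
print(pearson_r([math.log10(s) for s in sizes], diversities))
```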

The genetic structure of G. palustris at the landscape level was studied using a Bayesian clustering method to infer population structure and assign individuals to population groups, as implemented in STRUCTURE version 2.3.4 (Pritchard et al. 2000), which allows the analysis of dominant data (Falush et al. 2007). We used a model of no population admixture and a model of admixture for the ancestry of the individuals in two different runs without prior information about the regional membership of the populations and assumed that the allele frequencies are correlated within populations. We conducted a series of ten independent runs for each value of K (the number of clusters) between one and twelve to quantify the amount of variation of the likelihood of each K. We found that a burn-in of 100,000 iterations followed by 500,000 Markov chain Monte Carlo (MCMC) iterations was sufficient; a longer burn-in or longer MCMC runs did not significantly change the results.

The model choice criterion implemented in STRUCTURE to detect the K most appropriate to describe the data is an estimate of the posterior probability of the data for a given K, Pr(X|K) (Pritchard et al. 2000). We used an ad hoc quantity based on the second-order rate of change of the likelihood function with respect to K (ΔK; Evanno et al. 2005), as implemented in STRUCTURE HARVESTER (Earl and vonHoldt 2012), as an indicator of the signal detected by STRUCTURE.
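The ΔK statistic of Evanno et al. (2005) can be sketched as the absolute second-order difference of the mean log-likelihood, |L(K+1) − 2L(K) + L(K−1)|, divided by the standard deviation of L(K) across replicate runs. The code below is an illustration under that reading and is not the STRUCTURE HARVESTER source; the function name and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K from replicate log-likelihoods.

    lnp_by_k: dict mapping K -> list of ln Pr(X|K) values from replicate runs.
    Returns a dict K -> Delta K for every K that has both neighbours K-1 and K+1.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            sd = stdev(lnp_by_k[k])  # sample standard deviation across runs
            result[k] = second_diff / sd if sd > 0 else float("inf")
    return result


# Hypothetical log-likelihoods, two replicate runs per K.
toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-800.0, -802.0],
    3: [-790.0, -794.0],
    4: [-785.0, -795.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic needs both neighbouring values of K, it cannot be evaluated for K = 1 or for the largest K tested.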

Finally, the ten runs of the STRUCTURE simulations using the model of no population admixture with the most appropriate K were aligned using the Greedy option in CLUMPP V1.1.2 (Jakobsson and Rosenberg 2007). Convergence of the ten replicate runs for the most appropriate K was high as they produced very similar clustering results as shown by the pairwise G’ (similarity function) values (> 0.99) for each pair of permuted runs in CLUMPP. The mean membership coefficients were represented as a bar graph using DISTRUCT 1.1 (Rosenberg 2004).

We also conducted a discriminant analysis of principal components (DAPC) with the R package Adegenet version 2.1.1 (Jombart 2008) as DAPC is not based on pre-defined population genetic models and makes no assumptions about Hardy–Weinberg equilibrium, which is potentially an advantage in the analysis of polyploids (see Turchetto et al. 2015; Meirmans et al. 2018). We first set an a priori group number of ten in the DAPC analysis and in a second step reduced the group number to four based on the results of the first analysis.

To represent the overall genetic relationships among individuals, a neighbor-net network (Bryant and Moulton 2004) was constructed using SplitsTree version 4.14.6 (Huson and Bryant 2006). We used the uncorrected p-distance method to calculate pairwise distances among individuals (Nei and Kumar 2000).
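For binary AFLP profiles, the uncorrected p-distance between two individuals is simply the proportion of loci at which their band states differ. The sketch below illustrates this calculation; it does not reproduce how SplitsTree handles missing or ambiguous data, and the function names and toy profiles are hypothetical.

```python
def p_distance(a, b):
    """Proportion of loci at which two 0/1 band profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of loci")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(profiles):
    """Symmetric matrix of pairwise uncorrected p-distances."""
    n = len(profiles)
    return [[p_distance(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical profiles of three individuals scored at four loci.
toy_profiles = [[1, 0, 1, 1], [1, 1, 1, 0], [1, 0, 1, 1]]
for row in distance_matrix(toy_profiles):
    print(row)
```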

A hierarchical analysis of molecular variance (AMOVA) was used to partition the genetic variability among geographic regions, populations within regions and individuals as implemented in GenAlEx (Excoffier et al. 1992; Peakall and Smouse 2006, 2012; Meirmans 2006). The variance components from the analysis were used to estimate Φ-statistics, which are similar to F-statistics (Excoffier et al. 1992). We tested the association between pairwise Φst genetic distances among all pairs of populations and geographic distances using a Mantel test, as implemented in GenAlEx (1000 permutations). We estimated the population-specific FST values using BAYESCAN 2.1 (Foll and Gaggiotti 2008) with the default settings and used a linear regression to test if there was a relationship between the population-specific FST values and measures of genetic diversity (He) of populations. We expect a strong relationship if genetic differentiation is affected by genetic drift (both current and historical), reflecting a migration–drift disequilibrium (Cox et al. 2011).
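The logic of the Mantel test can be sketched as a permutation test on the correlation between the off-diagonal elements of two distance matrices. The following is a minimal illustration in which population labels of the genetic matrix are shuffled and the one-sided P value is the proportion of permutations with a correlation at least as large as the observed one. It is not the GenAlEx implementation, and the function names, the seed and the toy matrices are assumptions made for the example.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle values after relabelling rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=1000, seed=1):
    """Return the observed matrix correlation and a one-sided permutation P value."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson_r(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of one matrix
        if pearson_r(upper_triangle(genetic, order), geo) >= r_obs:
            at_least_as_large += 1
    return r_obs, (at_least_as_large + 1) / (permutations + 1)


# Hypothetical example: four populations along a line, with genetic
# distance exactly proportional to geographic distance.
positions = [0.0, 1.0, 3.0, 7.0]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.1 * d for d in row] for row in geographic]
r, p = mantel_test(genetic, geographic)
print(round(r, 3), p)
```

Permuting rows and columns together, rather than shuffling individual matrix cells, preserves the dependence among distances that involve the same population.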

ITS and cpDNA study of G. palustris and G. imbricatus

We collected plant material from 23 populations of G. palustris and nine populations of G. imbricatus across France, Switzerland, Italy, Hungary, the Czech Republic and Germany (Fig. 1, Table 1). Taxonomic identification in the field was based on morphology and was done in collaboration with local botanists who also helped with the sampling of the plant material. At each study site, leaf samples were collected from two to 42 randomly selected individual plants. To include Gladiolus communis subsp. byzantinus as a phylogenetic outgroup, seeds from the Andalusian Seed Bank (Accession number BGVA 13452) originating from Mallorca were germinated and cultivated to the seedling stage. Leaf material of five seedlings was sampled and preserved in silica gel until DNA extraction. Genomic DNA was extracted using a DNeasy Plant Mini Kit (Qiagen) starting from approximately 100 mg of fresh leaf material or 10 mg of dried material.

ITS and cpDNA sequencing

Nuclear ITS region

PCR amplification and sequencing of the ITS region was implemented with ITS4 and ITS5 primers (White et al. 1990). The PCR protocol was conducted using approximately 60 ng of genomic DNA, 2.5 µl of deoxyribonucleotide triphosphates (dNTPs) mix (2 mM), 1 × Pfu buffer with MgSO4, 0.625 U of native Pfu DNA polymerase (Thermo Scientific®) and 1.75 µl of each primer (10 µM), in a total volume of 25 µl. Cycling conditions consisted of an initial denaturation at 95 °C for 3 min, followed by 35 cycles with denaturation at 95 °C for 1 min, an annealing at 50.8 °C for 1 min and 2 min of elongation at 72 °C, and a final elongation at 72 °C for 7 min. All PCR products were visualized by electrophoresis on 1.5% agarose gels stained with SYBR® Safe DNA Gel Stain (Invitrogen) to check for amplification. The amplified samples were purified using the QIAquick PCR Purification Kit Protocol (Qiagen).

Due to fungal contamination in the populations Nantua Nord and Les Rosses, an annealing temperature gradient was used to maximize PCR product yield to extract the desired PCR product after gel migration. To generate sequences for the samples from population Nantua Nord the genomic DNA was amplified with an annealing temperature of 52.1 °C in a 50 µl reaction, migrated on a 1.5% agarose gel, purified with a GeneJET™ Gel Extraction Kit (Thermo Fisher Scientific) and then amplified a second time using the same conditions. For population Les Rosses we were not able to generate a clean ITS sequence.

Purified PCR products were sequenced with the dideoxy chain termination method using the Big-Dye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) with 2 µl of purified PCR product, 0.5 µl of Ready Reaction Premix (2.5 ×), 2 µl of sequencing buffer (5 ×) and 0.5 µl of primer (5 µM) in a total of 10 µl. Cycling conditions consisted of 1 min at 96 °C and 25 cycles of 10 s at 96 °C, 5 s at 50 °C and 4 min at 60 °C. The samples were precipitated with ethanol and EDTA according to the same kit, resuspended in 10 µl of Hi-Di™ Formamide and run on a 3730xl DNA Analyzer (Applied Biosystems).

cpDNA markers trnQ(UUG)-rpS16x1 and trnV(UAC)x2-ndhC

PCR amplification and sequencing of the cpDNA was carried out using the two primers trnQ(UUG)-rpS16x1 and trnV(UAC)x2-ndhC (Shaw et al. 2007). The PCR protocol was identical to the one used for the ITS amplification except that we used an annealing temperature of 55 °C and a total volume of 50 µl. The amplified samples were purified using the QIAquick Gel Extraction Kit (Qiagen). The purified products were then sequenced using the DYEnamic ET Dye Terminator Cycle Sequencing Kit for MegaBACE DNA Analysis Systems (GE Healthcare). To approximately 20 ng of purified PCR product, we added 1 µl of primer (5 µM) and 4 µl of Sequencing Reagent Premix in a total of 10 µl. Cycling conditions were 25 cycles of 95 °C for 20 s, 50 °C for 15 s and 60 °C for 1 min. Precipitation of the cycle sequencing products was achieved with sodium acetate/EDTA buffer and ethanol according to the DYEnamic ET Dye Terminator Cycle Sequencing Kit for MegaBACE DNA Analysis Systems (GE Healthcare). The samples were then resuspended in 10 µl of MegaBACE loading solution (GE Healthcare) and run on a MegaBACE 500 DNA Analysis System (GE Healthcare). Due to technical problems with the MegaBACE sequencer during the project, part of the samples had to be sent to Macrogen (Republic of Korea) for sequencing on a 3730xl DNA Analyzer (ABI).

Sequence editing and alignment

Between two and five randomly chosen samples per population were sequenced for the ITS region and the cpDNA markers trnQ(UUG)-rpS16x1 and trnV(UAC)x2-ndhC to check for intrapopulational variation and to build robust consensus sequences. The raw ITS sequencing data were imported into Geneious 11.1.5 (Kearse et al. 2012) and assembled by population, using both forward and reverse sequences for the assembly. The alignment of the consensus sequences of the populations was performed with MUSCLE (Edgar 2004). Raw data from the two cpDNA regions were analyzed with CodonCode Aligner 4.0.2 (CodonCode Corporation). The data were assembled for each primer pair and each sample, then checked and edited if necessary. The consensus sequences obtained for each population with the two primer pairs were combined into a single cpDNA consensus sequence per population. These consensus sequences were imported into the Geneious software to be aligned with the MUSCLE alignment tool. Obvious alignment errors were manually edited.

Statistical analysis of sequence data

A maximum likelihood (ML) analysis was performed using the PHYML algorithm (Guindon et al. 2010) implemented in Geneious. The model of nucleotide substitution used was determined with jModeltest 2.1.5 (Darriba et al. 2012). We chose to use the substitution model GTR + G for the ITS data set and the GTR model for the cpDNA data set. The number of bootstraps was set to 1000 and the gamma distribution parameter for the ITS data set was fixed to 1.134 according to the jModeltest results.

We conducted Bayesian analyses of phylogeny in MrBayes 3.2.2 (Ronquist et al. 2012) for both the combined cpDNA and the ITS data sets after simple indel coding with SeqState 1.4.1 (Müller 2005). The models were the same as for the ML analysis. Four Monte Carlo Markov chains were run in two runs of 1,000,000 generations each. The chains were sampled every 1000 generations, and the first 25% of the trees were discarded as burn-in. A consensus tree was built after the average standard deviation of split frequencies was below 0.01, and posterior probabilities of the clades were computed. Dendroscope 3.5.9 (Huson and Scornavacca 2012) was used to produce tanglegrams.

To construct a phylogenetic network of the cpDNA haplotypes, we used PopART v1.7 (Leigh and Bryant 2015) with the minimum spanning network algorithm (Bandelt et al. 1999).


Results

AFLP analysis of Gladiolus palustris populations

Genetic diversity within populations

Using two primer combinations, 130 AFLP bands were scored with no private bands specific to a G. palustris population. All 212 individuals had a unique multilocus genotype. The estimated mean error rate per sample was 3.6%. Error rates for AFLP below 5% are considered acceptable (see Bonin et al. 2004). The expected heterozygosity, as measured by Nei’s gene diversity (He), was similar in all populations (range He: 0.18–0.27; Table 1). Mean genetic diversity varied among geographic regions (ANOVA, F3,8 = 5.728, P = 0.02) and was significantly lower in the Alsace region (0.19) in comparison to the Ain region (0.26; Post-hoc Tukey test). We found no significant relationship between genetic diversity as measured by Nei’s gene diversity and population size (r = 0.33, P = 0.30).

Population genetic structure

Using the modal value of ΔK rather than the maximum value of L(K) allowed us to identify with STRUCTURE several groups corresponding to the uppermost hierarchical level of partitioning among populations. In the model with no admixture, the highest modal value of ΔK was at K = 2, which separated the two Bas-Rhin populations from all other populations, indicating that the G. palustris populations of the Rhine Valley are genetically distinct. The second highest modal value of ΔK was at K = 4, corresponding to the number of geographic regions (Fig. 2, Appendix Fig. S1a). In the STRUCTURE model with admixture, the highest modal value of ΔK was also at K = 2, corresponding to the uppermost level of structure already detected by the model with no admixture (Appendix Fig. S1b). An examination of the standard deviation of L(K) showed that it increased strongly for K values higher than four, indicating that the most likely number of groups was four. In a third analysis, we ran STRUCTURE with the admixture model on a reduced dataset that excluded the group containing the two Bas-Rhin populations, in order to detect potential subdivisions of the second group containing the other populations. This analysis confirmed the subdivision of the second group into the same clusters as the analysis with no admixture (results not shown). The DAPC analysis confirmed that K = 4 corresponded to the most likely number of clusters (Fig. 3). Membership probabilities of individuals to the four groups in the DAPC analysis were nearly identical to those of the STRUCTURE analysis with no admixture (results not shown).

Fig. 2 Estimated population structure of G. palustris inferred by a Markov chain Monte Carlo Bayesian clustering method (STRUCTURE) of AFLP data using a model of no population admixture. Each individual is represented by a vertical line, which is partitioned in a maximum of K = 4 differently colored segments that represent the individual’s estimated membership fractions in four clusters. Vertical black lines separate the twelve populations. Runs of ten simulations were aligned using CLUMPP (see text for “Results”). For population names see Table 1. Geographical origin of individuals is indicated. (Color figure online)

Fig. 3 Estimated population structure of G. palustris inferred by discriminant analysis of principal components (DAPC) of AFLP data from twelve populations. Posterior group assignments and geographical origin of individuals: Cluster 1 = Jura and Switzerland; Cluster 2 = Haute-Savoie; Cluster 3 = Ain; Cluster 4 = Bas-Rhin. Eigenvalues of the three retained discriminant functions are indicated

There was a nearly complete correspondence between the four clusters identified by STRUCTURE and the four geographic regions (Table 1, Fig. 2). A first cluster corresponded to the two populations of the French Jura region and the only Swiss population from the Haute-Savoie region. The second cluster corresponded to the four French populations of the Haute-Savoie region and the third cluster to the three populations of the Ain region. The fourth cluster corresponded to the two Bas-Rhin populations situated in the Rhine Valley.

The neighbor-net network revealed a clustering pattern very similar to the clusters identified by STRUCTURE (Figs. 2, 4). The separation of the two Bas-Rhin populations was highly supported by bootstrapping (842 out of 1000 replicates). In contrast to the STRUCTURE groups, the two populations of the Jura region were clearly separated in the neighbor-net network (Fig. 4). Population Ranchette was more closely related to the Swiss and Haute-Savoie populations, whereas population Légna was more closely related to those of the Ain region. The Swiss population was placed at an intermediate position between the Haute-Savoie region and the Ranchette population of the Jura region.

Fig. 4 Neighbor-net network (Bryant and Moulton 2004) for AFLP data of 212 Gladiolus palustris individuals inferred using SplitsTree (Huson and Bryant 2006). Geographical origin of individuals is indicated

The estimate of overall FST obtained by the AMOVA was high (Φst = 0.46). There was a strong differentiation among regions (Φst = 0.33) and a lower differentiation among populations within regions (Φst = 0.14). The variation among individuals within populations accounted for another 54%. Nei’s gene diversity of a population decreased with its specific FST value (Fig. 5; r = −0.90, P < 0.001), indicating that populations were not in migration–drift equilibrium. Overall genetic differentiation between all twelve populations (pairwise Φst) was related to their geographic distance (Mantel test, r = 0.84, P < 0.001; Fig. 6), indicating isolation by distance (IBD).

Fig. 5 Relation between genetic diversity (He) and the population-specific FST values in Gladiolus palustris. Population Les Rosses, with only three samples, was excluded from the analysis. r = −0.90, P < 0.001

Fig. 6 Relation between genetic distances (pairwise Φst) and geographic distances for twelve sampled populations of Gladiolus palustris. Note log-scale for geographic distances. r = 0.84, P < 0.001 (Mantel test)

ITS and cpDNA study of G. palustris and G. imbricatus

Sequence and nucleotide variations

We obtained 32 assembled ITS consensus sequences (GenBank accession numbers MK005888–MK005919) and 33 combined consensus sequences from the two cpDNA markers (GenBank accession numbers MK014501–MK014566). We found no evidence of intrapopulational variation for the three markers, except in two populations (KraL and Roh) where some of the sampled individuals differed only by indels in their trnV(UAC)x2-ndhC region. In these cases, we randomly chose one of the two haplotypes as the population consensus sequence. Intragenomic polymorphic sites in ITS sequences were marked as ambiguities. The data matrix used for the phylogenetic analysis containing the aligned consensus sequences was 723 bp long for the ITS region and 1015 bp for the combined cpDNA regions. Without considering the outgroup, 89 bp sites were variable for the ITS region and 43 bp sites were variable for the combined cpDNA regions, adding up to 12.8% and 4.2% of variable sites, respectively.

Phylogenetic and phylogeographic analyses

Phylogenetic trees resulting from the ML and Bayesian analyses of cpDNA and ITS data are presented in a combined tanglegram. They showed generally the same branching, with only slight differences in topology and branch support values (Fig. 7). Both the cpDNA and the ITS analysis indicated that there were two main clades grouping the great majority of the Gladiolus palustris and G. imbricatus populations, respectively. The ITS trees allowed us to further subdivide the main G. palustris clade into two distinct subgroups (ML bootstrap = 99.6% and posterior probability PP > 0.99). The first subgroup contained the French populations of G. palustris from the Bas-Rhin region (Herbsheim, Osthouse), and G. palustris populations from Italy (TurinIII, ComoI, ComoII, ComoIII, Rovereto), from Hungary (Zákányszék) and Germany (Ried). The second subgroup included all French populations of G. palustris from the adjacent regions of Jura (Légna, Ranchette), Ain (Mont d’Ain, Nantua Nord, Oyonnax) and Haute-Savoie (Margencel, Perrignier, Perrignier petite, Reuland), the Swiss population (Faverges) and one population from Italy (Ivrea). The second main clade grouped G. imbricatus populations from Germany (Dauban, Dubraucke), Italy (TurinII) and the Czech Republic (Miroslav, Rohozná, Krahulci L). In the cpDNA tree the Czech G. imbricatus populations of Rohozná and Krahulci L were separated from the main G. imbricatus group (ML bootstrap = 89.3% and PP > 0.99) (Fig. 7).

Fig. 7 Tanglegram of the phylogenetic trees based on a cpDNA and b ITS markers. Colors represent: green = outgroup, blue = Gladiolus palustris clade, red = G. imbricatus clade, light blue = outliers. Numbers near the nodes represent the posterior probabilities of the MrBayes tree. The bootstrap values from the maximum likelihood (ML) analysis are given in brackets after the posterior probabilities. Asterisk: collapsed branch in the ML tree. GP G. palustris, GI G. imbricatus, for population names see Table 1. (Color figure online)

Some populations of both taxa were classified into different clades in the cpDNA and ITS trees. The Czech population of Krahulci R, determined morphologically as G. imbricatus, was placed in the G. palustris clade according to its cpDNA sequences and in the G. imbricatus clade in the ITS tree. The Czech G. palustris population of Hodonin was included in the G. palustris clade according to the cpDNA data, but the ITS sequence data revealed a position within the G. imbricatus clade. Population Lus-La-Croix-Haute, determined morphologically as G. imbricatus, was placed in the G. palustris clade in the ITS tree (Fig. 7) where it had an outlier position (ML bootstrap = 90.9%, PP = 0.91). In the cpDNA tree the same population was placed outside the two main G. palustris and G. imbricatus clades (ML bootstrap = 81.1%, PP > 0.99). A closer examination of the cpDNA sequences showed that the haplotype of population Lus-La-Croix-Haute differed from both the G. palustris and the G. imbricatus haplotypes by several indels (Table 2). The second putative French G. imbricatus population, Lezoux, grouped together with the G. imbricatus population from Lus-La-Croix-Haute in the ITS trees but was placed inside the G. imbricatus clade according to its cpDNA sequences. The G. palustris population Felon was placed within the G. palustris ITS clade but had a particular outgroup position in the cpDNA trees (Fig. 7). A comparison of the cpDNA sequences from Felon with those of the outgroup species G. communis showed 100% identity.

Table 2 Variable sites and haplotypes of the combined cpDNA consensus sequences of Gladiolus palustris (23 individuals) and G. imbricatus (nine individuals)

Haplotype network

The minimum spanning network calculated with the cpDNA sequences of all G. palustris and G. imbricatus populations revealed five different haplotypes (Fig. 8, Table 1). For G. imbricatus the minimum spanning haplotype network confirmed the phylogenetic subdivision of the G. imbricatus cpDNA tree, with the Czech populations of Rohozná and Krahulci L sharing the same haplotype. The position of the G. palustris population Felon and the G. imbricatus population Lus-La-Croix-Haute was intermediate between the G. palustris and G. imbricatus haplotype clades. The two populations were separated from these clades by three and five mutational steps, respectively. The haplotype of population Felon was identical to the haplotype of the outgroup species G. communis (not shown in Fig. 8), whereas the haplotype of Lus-La-Croix-Haute differed from all extant G. palustris and G. imbricatus haplotypes (Table 2).
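The notion of mutational steps between haplotypes can be illustrated with a minimal sketch. It assumes aligned sequences of equal length, counts every differing site (including a gap column) as one step, and builds a single minimum spanning tree with Prim's algorithm; a full minimum spanning network would also keep alternative links of equal length. The haplotype names and sequences are hypothetical toy data, not sequences from this study, and the sketch is not the software used for Fig. 8.

```python
def mutational_steps(a: str, b: str) -> int:
    """Count differing sites between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(haplotypes: dict[str, str]) -> list[tuple[str, str, int]]:
    """Prim's algorithm on the complete graph of pairwise step counts.

    Returns a list of (haplotype_in_tree, newly_added_haplotype, steps).
    """
    names = list(haplotypes)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest link from the current tree to any haplotype outside it;
        # ties are broken alphabetically by the tuple comparison.
        steps, u, v = min(
            (mutational_steps(haplotypes[u], haplotypes[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, steps))
        in_tree.add(v)
    return edges


# Hypothetical toy alignment (assumption, for illustration only).
toy_haplotypes = {
    "H1": "ACGTACGTAC",
    "H2": "ACGTACGTAT",
    "H3": "ACGAACGTTT",
}
network = minimum_spanning_tree(toy_haplotypes)
total_steps = sum(steps for _, _, steps in network)
```

In this toy example H3 is linked to H2 rather than to H1 because that link needs fewer steps, which is the same logic that places an intermediate haplotype between two haplotype groups in a network.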

Fig. 8

Minimum spanning network of cpDNA sequences of 23 Gladiolus palustris and nine G. imbricatus populations. The number of mutations between haplotypes is shown as hatch marks. GP = G. palustris, GI = G. imbricatus, for population names see Table 1


Genetic diversity and population structure of G. palustris

The overall genetic diversity found in the populations of G. palustris was comparable to that of other Gladiolus species like G. hybridus (Chaudhary et al. 2018) and other long-lived perennial plant species (Nybom 2004). The fact that G. palustris is polyploid could contribute to the conservation of genetic diversity even in fragmented and isolated populations (see Doyle and Sherman-Broyles 2017; Meirmans et al. 2018). Other rare Central European plants such as Saxifraga rosacea subsp. sponhemica (Walisch et al. 2014), Arnica montana (Maurice et al. 2016) and Iris pumila (Dembicz et al. 2018) are also known to have conserved high levels of genetic diversity despite strong habitat fragmentation. The longevity of these plants and the long-term stability of their habitats could explain the maintenance of genetic diversity (Young et al. 1996; Tang et al. 2010). Many studies have found a reduction of genetic diversity in small and fragmented populations through genetic drift (Leimu et al. 2006). In G. palustris we found no relationship between the genetic diversity of the populations and their size. The relationship between genetic diversity and population-specific FST-values indicated a migration-drift disequilibrium suggesting that the population structure of G. palustris has been modified by genetic drift (Whitlock and McCauley 1999; Cox et al. 2011; Walisch et al. 2014).

In G. palustris the strong correlation between genetic and geographic distances indicated an isolation-by-distance (IBD) pattern, in which gene flow is higher among geographically close populations. As the current distribution is highly disjunct and most extant populations are highly isolated, current gene flow among populations is probably very low or absent. The observed IBD pattern therefore probably reflects historical gene flow at a time when suitable habitats of G. palustris were much more common. The conservation of the historical pattern may be explained by the fact that the habitat fragmentation is quite recent and mainly due to the intensification of agricultural practices during the last century. Moreover, the fact that G. palustris is a long-lived geophyte may slow down the effects of genetic drift (Aguilar et al. 2008) and thus contribute to the conservation of the historical pattern.
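A correlation between genetic and geographic distance matrices is commonly assessed with a Mantel permutation test. The following is a minimal sketch of that general technique, assuming symmetric distance matrices given as nested lists; it is not necessarily the procedure or software used in this study. The function names, the four populations and all distance values are hypothetical.

```python
import random
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix: list[list[float]], order: list[int]) -> list[float]:
    """Pairwise distances above the diagonal, with populations taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(genetic, geographic, permutations: int = 999, seed: int = 0):
    """One-sided Mantel test: returns (observed r, permutation p-value)."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel populations in the genetic matrix
        if pearson(upper_triangle(genetic, order), geo) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Hypothetical example: four populations along a transect (positions in km),
# with genetic distance set exactly proportional to geographic distance.
positions = [0.0, 10.0, 25.0, 60.0]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.01 * d for d in row] for row in geographic]
r, p = mantel(genetic, geographic)
```

Because the toy genetic distances are an exact multiple of the geographic distances, the observed correlation is 1; with only four populations the p-value cannot become very small, which shows why such tests need an adequate number of populations.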

The results of the STRUCTURE analysis suggest that the genetic population structure of G. palustris in the studied area is hierarchical and consists of two main clusters substructured into regional groups of populations. The distinctiveness of the low-altitude Rhine Valley populations is remarkable and was confirmed by the neighbor-net network and the DAPC analysis. It is likely that these isolated populations are the remnants of a much larger population of the Alsatian Ried, which may be locally adapted to the warmer climatic conditions of the Rhine Valley. Common garden experiments and reciprocal transplant experiments would be necessary to confirm this hypothesis. The AMOVA revealed that a large part of the molecular variation of the G. palustris populations is due to differences among geographical regions, indicating that the regional clusters have been isolated for a long time, probably since the post-glacial period. An additional explanation for the strong differentiation among populations could be the breeding system of G. palustris (facultatively autogamous; Maurice, unpublished), as facultative selfing reduces gene flow and increases genetic differentiation among populations (Mable and Adam 2007).

Hybridization and phylogeography

Interspecific hybrids between G. palustris and G. imbricatus are difficult to detect in the field as they are either morphologically similar to G. imbricatus or morphologically intermediate between the two parental taxa (Szczepaniak et al. 2016). The results of the cpDNA and ITS analyses confirmed that G. palustris and G. imbricatus are genetically distinct and represent two different clades (see Szczepaniak et al. 2016). Our genetic analyses also showed that the three French populations Lezoux, Lus-La-Croix-Haute and Felon at the western limit of the distribution area of G. palustris, considered to be G. imbricatus (Tison and Girod 2014), are in fact of hybrid origin. Our cpDNA and ITS data suggest that population Lezoux is a hybrid taxon with G. imbricatus as the maternal species and G. palustris as the pollen donor. This indicates that hybridization between the two taxa is not strictly unidirectional, as Szczepaniak et al. (2016) reported G. palustris as the maternal species and G. imbricatus as the pollen donor for the nothospecies G. × sulistrovicus. In population Lus-La-Croix-Haute the cpDNA data suggest a hybridization event with an unknown Gladiolus lineage acting as the maternal species. The separation of both Lezoux and Lus-La-Croix-Haute from the two larger ITS subgroups of G. palustris suggests that these hybridization events may be ancient. The lineage of G. palustris involved in these hybridization events is likely to be extinct today, especially in the case of population Lezoux, which occurs beyond the western range limit of extant G. palustris populations. Moreover, as there are no extant populations of G. imbricatus in this region, recent hybridization events cannot have taken place. Our results suggest that in the post-glacial period the area of G. imbricatus may have extended further to the west and that population Lezoux may reflect an ancient hybridization event between G. imbricatus and G. palustris. This hybridization event at the western range limit of G. palustris in France may be described as genetic swamping, in which parental lineages are replaced by hybrids (see Todesco et al. 2016).
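The reasoning behind these parentage inferences can be summarized in a small sketch. It assumes that the plastid genome is maternally inherited, so that the cpDNA clade points to the putative maternal lineage and a differing nuclear ITS clade to the putative pollen donor. The clade placements are those reported in the text above; the class and function names are hypothetical, and discordance alone cannot exclude retained ancestral polymorphism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerCall:
    """Clade placement of one population in the cpDNA and ITS trees."""
    population: str
    cpdna_clade: str
    its_clade: str


def interpret(call: MarkerCall) -> str:
    """Flag cytonuclear discordance as a putative hybrid signal.

    Assumption: cpDNA is maternally inherited, ITS is biparentally inherited.
    """
    if call.cpdna_clade == call.its_clade:
        return f"{call.population}: concordant ({call.cpdna_clade})"
    return (
        f"{call.population}: discordant; putative maternal lineage "
        f"{call.cpdna_clade}, putative pollen donor {call.its_clade}"
    )


# Clade placements as described in the text (not a re-analysis of the data).
calls = [
    MarkerCall("Lezoux", "G. imbricatus", "G. palustris"),
    MarkerCall("Hodonin", "G. palustris", "G. imbricatus"),
    MarkerCall("Krahulci R", "G. palustris", "G. imbricatus"),
]
reports = [interpret(c) for c in calls]
```

Under these assumptions the French population Lezoux and the two Czech populations show opposite directions of discordance, which is the contrast the discussion draws between the western and eastern range limits.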

The isolated population Felon was morphologically less clearly related to G. imbricatus (see Tison and Girod 2014) and was determined as G. palustris by the field collector in our study (Table 1). Our cpDNA analysis suggests that population Felon may be an interspecific hybrid between G. communis as maternal species and G. palustris as a pollen donor. Population Felon is situated in the Burgundian gate, which is a flat saddle about 30 km wide at an altitude of about 400 m between the Vosges and the Jura mountains. It connects the Rhine Valley and the foothills of the Saône Valley. During the post-glacial period, numerous plant and animal species used this gateway on their way north (Wassmer et al. 1994; Sternberg 1998; Granoszewski et al. 2004; Scholler et al. 2012; Westrich and Bülles 2016). To our knowledge, there is currently no extant G. communis population in the Felon region, but a population determined as G. communis occurs some 35 km away near Mulhouse in France (Hoff 2018). Population Felon may thus reflect an ancient hybridization event during the post-glacial period between G. communis and G. palustris.

Our genetic analyses further revealed that two Czech populations are also of hybrid origin and may correspond to the recently described nothospecies G. × sulistrovicus (Szczepaniak et al. 2016). These two hybrid populations are situated at the eastern limit of the main distribution area of G. palustris in Central Europe, where G. palustris and G. imbricatus are partially sympatric. In this area, interspecific hybridization between the two closely related Gladiolus species is facilitated by their occurrence in the same habitat and their overlapping flowering periods (Szczepaniak et al. 2016).

These results suggest that the hybridization asymmetry differs between the western and the eastern range limit of G. palustris, in Western and Central Europe, respectively. At the western range limit in France, we found indications of nuclear introgression from the ‘common’ G. palustris into the ‘rare’ G. imbricatus, which became locally extinct, whereas at the eastern range limit in the Czech Republic there were signs of nuclear introgression from G. imbricatus into the locally ‘rare’ G. palustris. However, as the distribution area of G. imbricatus in the East was sampled with only a few populations, we cannot distinguish between gene flow and ancestral polymorphisms as the cause of the incongruence between the cpDNA and ITS datasets, especially as polyploids may retain ancestral polymorphisms much longer than diploids (see Slotte et al. 2008). Our findings are in line with expectations of unidirectional ‘pollen swamping’ of rare species by more abundant congeners (Ellstrand and Elam 1993; Levin et al. 1996; Beatty et al. 2010) and of extinction through genetic swamping (Todesco et al. 2016). Hybridization has been shown to present an extinction risk for other rare plant species such as Nuphar pumila (by the common N. lutea; Arrigo et al. 2016) and the endangered Castilleja levisecta (by the common C. hispida; Sandlin 2018, unpublished). However, there are also examples where gene flow is unidirectional from the rare species into the common congener, as in the Mediterranean Lotus fulgurans and L. dorycnium (Conesa et al. 2010).

Conservation implications

The analysis of the genetic population structure of G. palustris revealed a strong differentiation among geographical regions indicating that current gene flow is very low. This implies that the conservation of populations from each regional cluster is important to preserve genetic diversity of this legally protected taxon. Although most populations within regions retained a considerable amount of genetic variation, genetic diversity is likely to decrease in small populations of G. palustris due to genetic drift as the extant populations are not in a migration–drift equilibrium.

To counteract the future loss of genetic diversity through drift, artificial gene flow has been advocated (Keller and Waller 2002; Hamilton et al. 2017; Van Rossum and Raspé 2018) to increase population genetic diversity and evolutionary resilience. The moderate level of population differentiation within regions suggests that the largest populations per region could be used as a seed source to increase genetic diversity in genetically depauperate populations of the same region, especially in small populations. However, as we cannot exclude the risk of outbreeding depression, we suggest first conducting within- and between-population crossing experiments in a common garden and measuring the vigor and fitness of the offspring. Introducing seed material instead of young plants would allow for selection against maladapted genotypes before establishment and flowering (Willi et al. 2007; Maschinski et al. 2013; Van Rossum and Raspé 2018). As the isolated Bas-Rhin populations in the Alsatian Ried occur at a lower altitude, we suggest investigating in a further study whether these lowland populations are locally adapted to the warmer climatic conditions of the Rhine Valley. This would further enhance their conservation interest, especially in the context of climate change.

Our phylogenetic analysis allowed us to clarify the effects of ancient and recent hybridization events on the conservation of G. palustris. It showed that none of the presumed G. imbricatus populations in France corresponds to purebred G. palustris or G. imbricatus. Our results indicate that certain hybrid populations at the western range limit of G. palustris may be products of ancient hybridization events in which one of the parent species went regionally extinct. Because these populations seem to be self-sustaining without the presence of the parent taxa, we advocate that efforts should be made to conserve the French hybrid populations due to their biogeographic interest. In contrast, the recently discovered hybrid populations at the eastern range limit of G. palustris in Central Europe confirm that G. palustris and G. imbricatus are closely related and are still able to hybridize where they co-occur (Szczepaniak et al. 2016). It is, however, unclear whether this recent hybrid can persist without the parent species, as it has been shown to have a reduced seed set (see Szczepaniak et al. 2016). The nothospecies G. × sulistrovicus should therefore be monitored over several generations to evaluate its demographic and genetic dynamics.