Introduction

The range of most contemporary species of plants and animals in Europe has been shaped in the Pleistocene as a result of climate change and the glaciation cycle, during which most northern and central parts of the continent were covered by a glacier. Species retreated to the south, where small areas were prevalent for survival, the so-called glacial refuges. In Europe, these were Iberian, Apennine, Balkan-Peloponnese, and Ponto-Caspian refuges. In these areas, species were exposed to allele loss and reduction of genetic diversity. After the retreat of the ice sheet, in the post-glacial period, the species of that time began recolonizing Europe northwards (Hewitt 2000; Culling et al. 2006; Alexandri et al. 2012). The role of Carpathian refugia and the history of post-glacial colonization of Central and Northern Europe seems to be significant and is increasingly documented. This has been reported in molecular phylogeographic studies of populations of various animals, like beetles (Homburg et al. 2013), butterflies (Paučulová et al. 2017), amphibians (Wielstra et al. 2017), or mammals (Filipi et al. 2015). Such data on land gastropods (Harl et al. 2014) are scarce and deficient.

Faustina faustina (Rossmässler, 1835) (Gastropoda: Eupulmonata: Helicidae: Ariantinae) is a mountain species, that usually occurs in forests among vegetation, on shady rocks, and in debris. It can often be found scrawling on the herbs (Urbański 1957; Wiktor 2004). Terra typica of this species is Galizien (Rossmässler 1835: 4), a historical and geographic region now located in Poland and Ukraine. The distribution range of F. faustina includes the Carpathians, the Sudetes, and the Świętokrzyskie Mts (Riedel 1988). Its presence is confirmed in Poland, the Czech Republic, Slovakia, Ukraine, Romania, and Hungary (Riedel 1988; Wiktor 2004; Welter-Schultes 2012). Despite the fact that this species typically inhabits mountain areas, some isolated sites are also known from the lowlands in northern European areas, such as Romincka Forest near Polish-Russian border (NE Poland) or Kaunas (Lithuania) (Skujiene 2002; Marzec 2005). It was suggested that in these locations F. faustina was probably introduced from southern mountainous localities (Wiktor 2004).

Faustina faustina is characterized by the presence of several distinct morphological variants of the shell mostly refer to as different forms. The most common is the nominotypical form faustina which has a shell with slightly raised spire and a narrow to medium wide umbilicus (Subai and Neubert 2016). It is present in the Western Carpathians (Poland, the Czech Republic, Slovakia). The second form associata has the shell with a depressed spire, almost planispiral, and a very wide umbilicus. It inhabits northern and north-western Romania in the region limited to the Carpathians and widespread in the isolated mountain ranges in Transylvania. Two color morphs of F. faustina are recorded: unicoloured or with broad dark color band just above periphery (Welter-Schultes 2012; Subai and Neubert 2016). The knowledge about habitat preferences of F. faustina is residual, and so far any differences in this aspect between different forms or population have not been noted. Moreover, many other forms of F. faustina are known (e.g., wagneri, sarmizegethusae, cibiniensis, talmacensis, orba), but their occurrence is limited to smaller areas (mostly in Romania) and no clear geographic pattern of stable geographic subspecies could be detected (Subai and Neubert 2016).

The knowledge about F. faustina is extremely limited and so far only little information about its biology or ecology is available (Marzec 2013). For many years, its systematic position was unclear. This species was previously subsequently classified in different genera such as Helix L., 1758; Campylaea Beck, 1837; Helicigona Férussac, 1821; Chilostoma Fitzinger, 1833, and finally Faustina Kobelt, 1904 (Subai and Neubert 2016). Groenenberg et al. (2016) in their systematic study on Ariantinae presented phylogenetic relationships between F. faustina and other closely related species confirming its placement within the genus Faustina. This study was based on combining data of traditional shell morphology, genital anatomy, and DNA sequences (H3, CytB, 16S, COI markers). In the constructed phylogenetic tree based on H3-COI-CytB dataset, all individuals of F. faustina formed one, separate clade without any other Chilostoma, Helicigona, or Campylaea species (Groenenberg et al. 2016). According to this study, Faustina is the most closely related to the genus Kosicia Brusina, 1904 with the split between them estimated at 56–51.7 MYA (Groenenberg et al. 2016).

In the present study, we investigated phylogeographic patterns of terrestrial gastropod, F. faustina across its distribution range. To analyze the genetic variation, we applied one mitochondrial (COI) and two nuclear (ITS-2 and 28S rRNA) markers. Additionally, we performed species distribution modeling (SDM) in order to predict suitable glacial refugia and to study directions of species expansion. Molecular clock analysis was done to estimate time of splitting clades restricted to particular areas.

Material and methods

Taxon sampling and identification

A total of 181 individuals of Faustina faustina (Rossmässler, 1835) were collected manually during the summer periods within 2013–2018 from 22 localities from the area of its distribution (Table 1). In 2018, we tried three times to collect specimens from two populations in Świętokrzyski National Park previously reported by Barga-Więcławska (2011), but no specimens were found in these locations. The nomenclature was adopted from the study of Groenenberg et al. (2016). All specimens were preserved in 96% ethanol and stored in − 80 °C. A small piece of tissue was taken for DNA extraction. Specimens were identified based on anatomical studies and conchological features of shell using stereomicroscope (Olympus SZX10). Remaining soft body parts and shells were deposited in the Institute of Environmental Sciences, Jagiellonian University (Poland).

Table 1 Sampling localities of F. faustina (N - number of collected individuals from each population, PL - Poland, UA - Ukraine, ROM - Romania)

DNA extraction, amplification, and sequencing

DNA extraction was performed using the NucleoSpin Tissue Kit (Macherey-Nagel, Germany) from all collected individuals of F. faustina. PCR generated fragments of the cytochrome oxidase subunit I (COI, mtDNA), internal transcribed spacer 2 (ITS-2, nDNA), and 28S ribosomal RNA (28S rRNA) genes using the following primers, LCO1490 (5′-GGTCAACAAATCATAAAGATATTGG-3′), HCO2198 (5′-TAAACTTCAGGGTGACCAAAAAATCA-3′) (Folmer et al. 1994), LSU-1 (5′-CTAGCTGCGAGAATTAATGTGA-3′), LSU-3 (5′-ACTTTCCCTCACGGTACTTG-3′) and LSU-2 (5′ – GGGTTGTTTGGGAATGCAGC – 3′), LSU-5 (5′ – GTTAGACTCCTTGGTCCGTG – 3′), respectively (Wade et al. 2006). A PCR reaction of each sample was performed in a 20 μl reaction mixture, consisting of 3.2 μl of template DNA, 0.8 μl of each primer, 2 μl of 10x buffer B1 (HOT FIREPol®, Solis BioDyne, Estonia), 2 μl of 25 mM MgCl2 (HOT FIREPol®, Solis BioDyne, Estonia), 10.2 μl of ddH2O, 0.8 μl of 20 mM dNTP (ThermoFisher Scientific, USA), and 0.2 μl of Taq-Polymerase (HOT FIREPol®, Solis BioDyne, Estonia). PCR conditions for all markers consisted of 15-min initial denaturation at 95 °C; 1-min denaturation at 95 °C, followed by 90-s annealing at 45 °C, and 90-s elongation at 72 °C for 5 cycles; 1-min denaturation at 95 °C, followed by 90-s annealing at 50 °C, and 1-min elongation at 72 °C for 35 cycles; followed by a final elongation step for 5 min. A 3-μl sample of PCR product was run on a 1.5% agarose gel for 30 min at 100 V to check DNA quality. PCR products were cleaned up by using NucleoSpin Gel and PCR Clean-up (Macherey-Nagel, Germany). A sequencing reaction was performed in 10 μl reaction mixture, consisting of 2 μl of PCR product, 0.15 μl of primer, 1 μl of sequencing buffer (BrilliantDye Terminator Sequencing Kit, Nimagen, The Netherlands), 5.85 μl of ddH2O, and 1 μl of Terminator (BrilliantDye Terminator Sequencing Kit, Nimagen, The Netherlands). Sequencing program consisted of four steps: 1-min initial denaturation at 96 °C, followed by 10-s denaturation at 96 °C, 5-s annealing at 55 °C, 4-min elongation at 60 °C for 25 cycles. Sequencing products were cleaned up by using ExTerminator (A&A Biotechnology, Poland) and sequenced. The sequencing reactions were performed in Genomed company (Warsaw, Poland). A total of 156 COI sequences, 152 ITS-2 sequences, and 152 28S rRNA were obtained from tissue samples. Haplotype sequences were deposited in GenBank with the following accession numbers: COI: MT835291-MT835382, ITS-2: MT833344-MT833374, 28S rRNA: MT833304-MT833343.

Sequence alignment and comparative analyses

All sequences obtained in this study were aligned for each marker separately using Auto strategy implemented in MAFFT version 7 (Katoh et al. 2002; Katoh and Toh 2008). Aligned sequences were trimmed to the shortest available alignment, 559 bp length for ITS-2, 524 bp for 28S rRNA, and 519 bp for COI. All COI sequences were translated into protein sequences in MEGA version 7 (Kumar et al. 2016) to check against the eventual presence of pseudogenes. Uncorrected p-distances were calculated in MEGA version 7 (Kumar et al. 2016). Number of haplotypes and population genetic parameters (such as haplotype diversity - Hd, nucleotide diversity - Pi, number of variable sites - S) were calculated in DnaSP (Librado and Rozas 2009).

In order to check if the genetic structure of the F. faustina populations on current localities fits an isolation by distance model (IBD) (Slatkin 1993), straight-line geographic distances (in kilometers) between populations were estimated in QGIS (QGIS Development Team 2019). FST indices were calculated in Arlequin 3.1 by using 1.000 permutations to test for statistical significance (Schneider et al. 2000). A Mantel test (Mantel 1967) was performed in the R software (R Core Team 2018) using “ade4” package (Chessel et al. 2004).

Haplotype network for nuclear markers was prepared by using PopART with the implementation of the Median-Joining method (Bandelt et al. 1999). For this purpose, single sequences of each haplotype present in each population were used (N = 40 for ITS-2, N = 58 for 28S rRNA). Sequences were aligned as described above. Haplotypes were labeled by numbers with # mark, circles without a number indicated a hypothetical intermediate haplotype necessary to link observed haplotypes. Hatch marks in the network represented single mutations.

Unrooted phylogenetic tree was constructed for COI marker using single sequences of all distinct haplotypes (92 sequences/haplotypes). Sequences were aligned as described above. The obtained alignment (519 bp length) was divided into three data blocks, constituting three separate codon positions using PartitionFinder version 2.1.1 under the Akaike Information Criterion (AIC) (Lanfear et al. 2016). The best scheme of partitioning and substitution models were chosen for posterior phylogenetic analysis, and the analysis run to test all possible models implemented in the program. As best-fit partitioning scheme, PartitionFinder suggested retaining three predefined partitions separately. The best-fit models for these partitions were K81UF+G for the first codon position, TIMEF+G for the second codon position, and HKY+I for the third codon position. Models GTR, GTR+I, GTR+G, and GTR+I+G were also tested separately (Stamatakis 2014). The best-fit model for all partitions in this analysis was GTR+G.

For the COI marker, maximum-likelihood (ML) topology was constructed using RAxML v8.0.19 (Stamatakis 2014). The strength of support for internal nodes of ML construction was measured using 1000 rapid bootstrap replicates. Bayesian inference (BI) marginal posterior probabilities were calculated using MrBayes v3.2 (Huelsenbeck and Ronquist 2001; Huelsenbeck et al. 2001) with 1 cold and 3 heated Markov chains for 10,000,000 generations, and trees were sampled every 1000th generation. Each simulation was run twice. Convergence of Bayesian analyses was estimated using Tracer v. 1.3 (Rambaut et al. 2014). In this study, a posterior probability of > 0.90 and bootstrap value of > 70% were considered a significant nodal support (Douady et al. 2003). All trees sampled before the likelihood values stabilized were discarded as burn-in. All final consensus trees were viewed and visualized by FigTree v.1.4.3, available at http://tree.bio.ed.ac.uk/software/figtree.

Molecular clock analysis

The COI alignment (519 bp length) from the phylogenetic analysis was used to calculate the approximate time of divergence among clades of F. faustina with a relaxed molecular clock model implemented in the BEAST programme (Drummond et al. 2006, 2012). The fossil of Helicigona species was used as a single calibration point. Helicigona species is the oldest and only identified Ariantinae fossil dated to 17.5–16 MYA (Nordsieck 2014). The parameters for such calibration imposing a normal distribution prior were used as proposed by Groenenberg et al. (2016) (mean 16.75 MYA, stdev = 0.375). These values allow for establishment of soft minimum and maximum age boundaries.

Divergence time analyses were run with the lognormal relaxed clock under the Yule process for partition into codon positions as following: 2 partitions (1 + 2), 3. ESS was calculated in Tracer 1.3 (Rambaut et al. 2014). The MCMC analysis was run for 10,000,000 generations, sampling every 1000 generations. Trees were combined using TreeAnnotator v. 1.7.5 to produce maximum clade credibility tree (Rambaut and Drummond 2007). Phylogenetic tree was viewed and visualized by FigTree v.1.4.3, available at http://tree.bio.ed.ac.uk/software/figtree.

Species delimitation

Because of the observed high genetic variability in all markers of F. faustina populations, which can suggest the presence of cryptic species complex, we analyzed sequence data using three quantitative methods of species delimitation: Automatic Barcode Gap Discovery (ABGD) (Puillandre et al. 2012), Poisson tree process model (PTP) (Zhang et al. 2013), and Generalized Mixed Yule Coalescent (GMYC) analysis (Pons et al. 2006; Fontaneto et al. 2015). All of the analyses were based on COI dataset since it showed the highest genetic variability and it is also routinely used for single-locus molecular delimitation methods in many invertebrate taxa (Puillandre et al. 2012; Fontaneto et al. 2015; Schenk and Fontaneto 2019).

1) ABGD: This method for species delimitation is based on genetic distances of single barcoding genes which assumes that divergence among individuals of different species is higher than the divergence among individuals of the same species (Puillandre et al. 2012). To perform this analysis, cut alignment of 519 bp for COI was used. A dataset comprised only sequences of previously detected haplotypes. A number of clusters were calculated by using the ABGD tool available at website: http://wwwabi.snv.jussieu.fr/public/abgd/abgdweb.html. We used the default priors, with prior intraspecific divergence = 0.001–0.1, divided into 10 partitions, default barcode relative gap width of 1.5.

2) PTP: This method uses a non-ultrametric phylogenetic tree as the input data, based on which, the switch from speciation to coalescent processes is modelled and then used to delineate primary species hypotheses. PTP analysis was conducted by using online service available at https://species.h-its.org/ptp/. Unrooted phylogenetic tree was uploaded and analysis was done by using the following defaults: 500000 MCMC generations, 100 thinning, 0.1 burn-in.

3) GMYC: This method for species delimitation is based on sequence data and uses ultrametric tree and similarly to PTP tries to detect the transition when branching pattern switches from this attributed to speciation to that attributed to the intra-species coalescent processes (Pons et al. 2006; Michonneau 2015). Phylogenetic tree for COI marker prepared previously in BEAST was used for PTP analysis (for details see chapter Molecular clock analysis in Materials and methods). GMYC analysis was done by the splits package for R software (R Core Team 2018).

Species distribution modeling

We generated SDMs for F. faustina to identify regions of climate suitability across the three time periods, using the Maxent modeling algorithm, ver. 3.4.1 (Phillips et al. 2017). Models were based on bioclimatic variables obtained from WorldClim 1.4 database (http://www.worldclim.org/) (Hijmans et al. 2005) at resolution of 30 arc-seconds (~ 1 km) for current and mid-holocene (MH; ~ 6 ka) climatic scenario and at resolution of 2.5 arc-minutes (~ 5 km) for last glacial maximum (LGM; ~ 22 ka). For LGM and MH, we used three general circulation models (GCMs) since individual GCMs use different data and algorithms, and thus create different projections of the past climate (Guevara et al. 2019). The considered GCMs were (1) CCSM4, (2) MIROC-ESM, and (3) EPI-EM-P. Out of the set of 19 available bioclimatic layers, we excluded bio8 and bio9 (mean temperature of wettest quarter and mean temperature of driest quarter, respectively), because they displayed artificial discontinuities between adjacent grid cells in some areas of the analyzed region (Zhu and Peterson 2017; Raghavan et al. 2019). Retained bioclimatic layers were limited to the region only feasibly colonized by the species and clipped to the accessible area (M) (Barve et al. 2011). Range of the M region was set as two degrees (ca. 220 km) buffer around the minimum convex polygon (MCP) counting all gathered occurrence records. To avoid over-parametrization of the model, we eliminate highly correlated variables with Pearson’s correlation value ≥ |0.8| (Fig. S1, SM). The eight selected bioclimatic variables were bio1 (annual mean temperature), bio2 (mean diurnal range), bio3 (isothermality), bio4 (temperature seasonality), bio14 (precipitation of driest month), bio15 (precipitation seasonality), bio16 (precipitation of wettest quarter), and bio18 (precipitation of warmest quarter). Comparisons of correlation structures among these bioclimatic variables (Table S1, SM), performed with the Mantel test, showed that the model may be transferred among training and projected areas and time periods (Jiménez-Valverde et al. 2009). Additionally, in order to test the climatic similarity between calibration area (M) and projections for the past periods, we used the mobility-oriented parity (MOP) metrics (Owens et al. 2013), based on 10% sampling of the reference region (M). This method is used to assess the general novelty of conditions in the projected regions and to identify regions where strict extrapolation occurs (i.e., climatic values are outside the range of the present values from the calibration area).

Occurrence records for F. faustina were gathered from a variety of sources including published literature (Juřičková et al. 2006; Cameron et al. 2011; Kuznecova 2012; Subai and Neubert 2016), online databases (GBIF.org 2019; iNaturalist 2019), and unpublished field data (Table 1). To reduce the spatial autocorrelation due to the sample bias in occurrence records, only randomly selected locations separated from each other by a minimum distance of 5 km were used in modeling. Therefore, only 167 out of 296 occurrence records collected were used in species distribution modeling.

In the next step, we created 255 candidate models, using 70% of randomly selected occurrence records, 8 bioclimatic variables, 10,000 randomly selected background points (pseudoabsences), and combining 17 values of regularization multiplier (0.1–1.0 at intervals 0.1, 2–6, 8, and 10) as well as 15 possible combinations of 4 feature classes (linear = l, quadratic = q, product = p, and hinge = h). Threshold feature was excluded to improve model performance and make models smoother and simpler, and hence likely to be more realistic (Phillips et al. 2017). Then, all candidate models were assessed with using separate testing data subset (30% of all occurrence records used in SDM not included in training data). The first one was based on statistical significance using partial receiver-operator curve (partial ROC) approach (Peterson et al. 2008), the second one based on predictive power using omission rates (Anderson et al. 2003), and the third one based on model complexity using AICc (Warren and Seifert 2011). Statistically significant models (with 500 iterations and 50% of data for bootstrapping) with omission rates ≤ 10% and with delta AICc ≤ 2 were selected as best models describing predicted distribution of F. faustina. Parameter settings of these best models were used in the final models creation with performing 50 bootstrap replicates. MaxEnt complemental log-log (cloglog) output format, which appears to be most appropriate for estimating probability of presence (Phillips et al. 2017), was used as a relative habitat suitability index. Final maps of the climatic suitability for each period were generated by averaging of results of all bootstrap replications. Predictive accuracy of the final model was assessed using the area under the receiver-operator curve (AUC).

Based on the final models, we identified areas with long-term stable conditions where the species may have persisted across time from the LGM to the present day. Such areas were determined for each GCM by summing averaged layers of models for LGM, MH, and present day (Chan et al. 2011). Regions with the highest values were treated as areas of the highest predicted ecological stability and potential refugia of the species.

Model calibration, evaluation, and final models creation as well as MOP analyses were performed in the R environment (R Core Team 2018) with the kuenm package (Cobos et al. 2019), which uses Maxent as modeling algorithm. The “mantel.rtest” function in ade4 package (Chessel et al. 2004) was used for Mantel tests. Pearson’s correlation coefficients between bioclimatic variables were computed using ENMTools software (Warren et al. 2010). All manipulations with vector and raster layers and maps were carried out in QGIS 3.8 software (QGIS Development Team 2019).

Results

Species identification

Anatomical dissections confirmed that all studied snails represented F. faustina. A total of 152 individuals had conchological characteristics typical for the faustina form. These features were as follows: shell with a slightly raised spire and a narrow to medium wide umbilicus, aperture rounded and only laterally and basally flared. In the case of COI marker, verification with BLAST resulted in hitting our sequences with F. faustina faustina (JX157839) (Query cover: > 97%, E-value: 0.0, Perc. ident: > 99%), from the Czech Republic (NE Böhmen, Opocno, Schlosspark), which also indicates that the most European populations can be regarded as nominotypical faustina form.

Additionally, four individuals of associata form were identified from a single Romanian locality in Județul Harghita (central Romania). This form is known from the northern and north-western area of Romania and main characteristic features of the shell are depressed spire, almost planispiral with low teleoconch whorls, very wide umbilicus, and aperture broadened. Our sequences of individuals classified as associata form also resulted in hitting F. faustina associata (JX156837) from Romania (Komitat Harghita, Mt Hasmas, Bicaz Canyon) (Query cover: > 97%, E-value: 0.0, Perc. ident: > 99%).

Variation in mtDNA and geographic distribution of F. faustina clades

The analysis of the partial mt COI gene (519 bp length of alignment) revealed 92 different haplotypes (Table S2, SM). The range of p-genetic distances between haplotypes was from 0.2 to 18.1% (9.4% on average). Constructed Bayesian and ML phylogenetic tree for this dataset resulted in similar topology (Fig. 1). All individuals of F. faustina clustered into seven strongly supported clades (1.00 posterior probabilities). Clades 1 and 2 were formed by individuals belonging to Polish populations, i.e., clade 1 was composed of snails from the Western Carpathians (the Gorce Mts and Żywiec), whereas clade 2 comprised individuals from three distinct areas including the Sudetes (SW Poland), the Bieszczady Mts (SE Poland, the Eastern Carpathians), and Romincka Forest (NE Poland). Clades 3 and 5 were formed by individuals from the Ukrainian localities (the Eastern Carpathians) but both of them were mixed with snails from the same populations. Clades 6 and 7 comprised gastropods from the Romanian populations. Clade 4 was formed by Romanian individuals represented by associata form. The associata clade (clade 4) is the most closely related to Ukrainian clade of faustina form (clade 3). Genetic p-distances between all distinguished clades (Table 2) varied from 6.3% (between clades 1 and 2) to 16.6% (between clades 3 and 7) (12.7% on average). The highest haplotype diversity defined as a number of different haplotypes occurred in a particular population was observed in the Polish populations from the Western Carpathians (Ochotnica Górna - 11 haplotypes, Żywiec - 8 haplotypes), the Sudetes (Bardo - 8 haplotypes), and in Ukrainian Lazeschyna (8 haplotypes). Population genetic parameters calculated for all obtained sequences of F. faustina were S = 216, Pi = 0.08062, Hd = 0.976.

Fig. 1
figure 1

A phylogenetic tree of mitochondrial COI sequences of F. faustina associated with the map of the haplotype distribution in the studied area. Only topology of Bayesian tree is shown. Numbers at nodes of each clade indicate Bayesian posterior probability and bootstrap values separated by “/” mark. Values less than 0.90 for BI and less than 70 for ML are not shown. The scale bar represents 0.02 substitutions per nucleotide position

Table 2 Genetic p-distances (%) between clades of F. faustina distinguished based on COI sequences

The overall relation between population pairwise FST and straight-line geographic distance is not high (Mantel’s r = 0.44), but statistically significant (P = 0.000999). Obtained results indicate that F. faustina is a species isolated by distance.

Nuclear haplotype networks

The analysis of the partial ITS-2 gene (alignment length 559 bp) amplified from 126 individuals revealed 31 different haplotypes (Fig. 2, Table S3, SM) and a pattern of haplotype distribution. Four haplogroups can be distinguished: Ukrainian, Polish, and two Romanian (Fig. 3). The first Romanian haplogroup represented gastropods of faustina form, the second one of associata form. Haplotype #18 from Ukraine occurred close to Polish haplogroup, which was separated from haplotype #10 (from Romincka Forest) by only single mutational step. The highest haplotype diversity defined as a number of different haplotypes occurred in a particular population was observed in the Polish population from the Western Carpathians (Żywiec, 5 haplotypes) and in Yamna population from the Ukrainian Eastern Carpathians (6 haplotypes). Genetic p-distances between haplotypes range from 0.2 to 4.2% (1.3% on average).

Fig. 2
figure 2

Haplotype distribution for nuclear markers: ITS-2 (left) and 28S rRNA (right)

Fig. 3
figure 3

Haplotype Median-Joining network of nuclear ITS-2 marker for F. faustina. Haplotypes are represented by black circles. The size of circles is proportional to the number of populations in which a particular haplotype is present. Populations are listed in Table S2 (SM). Haplotypes are marked from #1 to #31, gray circles without a number indicate a hypothetical intermediate haplotype linking observed haplotypes of F. faustina. Hatch marks in the network represent single mutations

The analysis of the partial 28S rRNA gene (alignment length 524 bp) amplified from 146 individuals revealed 40 different haplotypes (Fig. 2, Table S4, SM). In the case of this marker, a pattern of haplotype distribution was also detected, similar to that observed in ITS-2 gene. Three haplogroups can be distinguished comprising Polish, Ukrainian, and Romanian haplotypes, respectively (Fig. 4). Romanian haplogroup was composed of individuals represented by both faustina and associata forms. The highest genetic variability defined as a number of detected distinct haplotypes in a particular populations was observed in the northernmost population in Romincka Forest (8 haplotypes) and in the Western Carpathians (Żywiec, 7 haplotypes). A high number of haplotypes was also detected in two Ukrainian populations, i.e., Yamna (7 haplotypes) and Zavoyelya near Worochta (6 haplotypes). Genetic p-distances between haplotypes ranged from 0.2 to 2.5% (0.9% on average). Population genetic parameters calculated for all obtained sequences of F. faustina were S = 53, Pi = 0.00817, and Hd = 0.8867 for ITS-2 marker and S = 40, Pi = 0.00451, and Hd = 0.817 for 28S rRNA.

Fig. 4
figure 4

Haplotype Median-Joining network of nuclear 28S rRNA marker for F. faustina. Haplotypes are represented by black circles. The size of circles is proportional to the number of populations in which a particular haplotype is present. Populations are listed in Table S3 (SM). Haplotypes are marked from #1 to #40, gray circles without number indicate a hypothetical intermediate haplotype linking observed haplotypes of F. faustina. Hatch marks in the network represent single mutations

Molecular species delimitation

Based on the COI datasets, ABGD analysis showed that depending on prior intraspecific divergence, it comprises from one (0.0599 prior intraspecific divergence) to ten (0.0010 prior intraspecific divergence) species defined in program as groups. We assumed that in the case of F. faustina individuals, which differ even with 6.0% of intraspecific divergence, constitute a single species, similarly to previous observations in other terrestrial gastropods (see Discussion chapter). This is confirmed by the ABGD analyses if 0.0599 is chosen as prior intraspecific divergence in COI dataset.

The PTP analysis, which infers putative species boundaries on a given phylogenetic input tree, showed that based on COI marker, an estimated number of species ranged from 8 to 54. Both Maximum Likelihood and Bayesian solutions resulted in 15 species; however, only single one was strongly supported (support value 0.982).

GMYC analysis, similar to ABGD and PTP, revealed that the COI dataset was comprised of more than one species. The number of estimated species ranged from one to six, but only one was strongly supported (support value 0.990). Each obtained species corresponded to particular clades on the COI-based phylogenetic tree (Fig. 1).

Molecular clock analysis

The F. faustina clades were probably split in the Miocene, where associata form was split from the faustina form approx. 12.41 MYA (Fig. 5). The associata clade (clade 4), formed by a Romanian population, is the most closely related to the Ukrainian clade (clade 3). Considering nominotypical form (faustina), the Polish populations (clades 1 and 2) are the most closely related to the Romanian ones (clades 6 and 7). The split between them was also in Miocene, about 12.75 MYA. A split between the Polish populations clustered into two separate clades was estimated at approx. 7.91 MYA.

Fig. 5
figure 5

BEAST phylogenetic tree based on the COI sequences. Node values indicate divergence estimated in MYA

Species distribution modeling

In all, we assessed 255 candidate models with parameters reflecting all combinations of 17 regularization multiplier settings, 15 feature class combinations, and 1 set of environmental predictors (Table S5, SM). All assessed models were statistically significant and 111 (43%) met the omission criterion 10%. Only one best model was selected among them based on the AICc value (Table S6, SM). It was a model with a regularization multiplier of 3 and product and hinge features (Table S7, SM). The average training AUC value was 0.894 (SD = 0.008) for final model created with these settings.

The present species distribution model shows a high congruence between climatic suitability predictions and location of records of F. faustina within its native range in the Carpathians and the Sudetes (Fig. 6a). However, according to the model, there is a high probability that the species is present also in the Alps, outside of the species’ known range. However, in the northeast Poland and Lithuania, where the species is considered introduced, the climatic suitability was significantly lower compared with the native range.

Fig. 6
figure 6

Maps of predicted climatic suitability for F. faustina under present (a) and paleo periods (b–g). Warmer colors indicate higher climatic suitability. Black dots indicate present records used in species distribution modeling. For LGM (maps b–d) and Mid-Holocene (maps e–g) projections for three GCMs were performed (CCSM4, MIRC-ESM, and MPI-ESM-P)

Final models for the LGM scenarios (Fig. 6b–d) differed from that for present climatic conditions. However, they indicate that the species may have survived the glacial period in situ, within its current distribution range. The areas with more suitable conditions and the higher occurrence probability were mainly located in the Southern Carpathians. Additionally, according to CCSM4 and MPI-ESM scenarios, they occurred locally, at least in small patches of favorable microclimate also in the Western and Eastern Carpathians. It is consistent with the results of the MOP analysis showing that, during the LGM, large parts of these last two regions were outside the range of current climatic conditions in the calibration area M (areas of strict extrapolation; Fig. S2, SM). Climatic suitability of the Western Carpathians increased during the Holocene (Fig. 6e–g). According to the CCSM4 scenario, it was the area with the highest probability of occurrence of F. faustina during this period, with decreasing suitability of the Southern and Eastern Carpathians. In contrast, the MIROC-ESM and MPI-ESM-P models predict that the climatic suitability of the last two parts of the Carpathians increased between LGM and Mid-Holocene. All GCMs also predict an increase in the probability of occurrence of the species in the Sudetes during MH compared with LGM (Fig. 6f, g).

Based on the final SDMs, we identified areas of long-term stability of climatic conditions suitable for F. faustina that could serve as potential refugia of the species during climatic oscillations during the last approx. 22,000 years. The resulting maps obtained for three GCMs point to few potential refugia in the Carpathians, with the most climatically stable and largest areas in the Southern Carpathians (Fig. 7). Predicted areas of climatic stability in the Western and Eastern Carpathians as well as in the Apuseni Mts differed among GCMs. According to some climate scenarios, only smaller stable areas acting as putative microrefugia were predicted in these regions.

Fig. 7
figure 7

Areas of climatic stability over time periods from the LGM through the present, based on summed climatic suitability models for the LGM, mid-Holocene, and present day for three differed GCMs. Stability increase from red to yellow color. White-filled areas show the ice-sheets during the LGM, based on data from Ehlers et al. (2011). Background hillshade layer was obtained from the EEA website (www.eea.europa.eu)

Discussion

Our study summarizes the genetic structure of F. faustina populations across its distribution range in Europe. Our results ascertain the presence of multiple refugia in the Carpathians. Moreover, the data suggest that despite the high genetic variability within COI marker, F. faustina is one single species which diverged on the studied area.

Phylogeography and taxonomic considerations

F. faustina is a montane species, but it has also been recorded in isolated localities in lowlands, situated north of the main range (Romincka Forest, NE Poland or Kaunas, Lithuania) (Skujiene 2002; Marzec 2005). The presence of F. faustina in these areas was hypothesized to be a human introduction (Wiktor 2004). Our genetic analysis of the snails from Romincka Forest showed that haplotype #16 of COI marker, present in this population, also occurs in other Polish populations from the Bieszczady Mts and the Sudetes. It might suggest that the snails in Romincka Forest were introduced from southern populations of F. faustina. On the other hand, this population also exhibits some unique haplotypes which were not detected in other localities (Table S2, SM). This may be the first signal of genetic differentiation of this isolated population.

F. faustina was also detected in isolated localities in the Świętokrzyskie Mts in the past (Urbański 1957; Piechocki 1981; Wiktor 2004; Barga-Więcławska 2011). However, field sampling conducted several times in this area by the first author revealed that F. faustina is absent in two sites previously reported by Barga-Więcławska (2011). The species most likely should be considered extinct in Lithuania since these records could not be confirmed in the last decades (Subai and Neubert 2016).

In our study, high genetic distances between haplotypes of COI marker were detected. The highest genetic distance between haplotypes was 18.1%. Similar results were obtained by Groenenberg et al. (2016), who showed up to 10.7% COI sequence divergence between subspecies of F. faustina faustina and F. faustina associata (as they were regarded here). In our study, high genetic distances were also detected between clades reaching 16.6%. Distances between associata clade and faustina clades varied from 12% (between clades 4 and 5) to 16.4% (between clades 4 and 7) (Table 2). Similarly such high divergence in COI was also observed in other mountain pulmonates classified in different families (Haase et al. 2013; Harl et al. 2014). Similarly, Orcula dolium (Draparnaud, 1801) (Orculidae), which occurs in the Alps and the Western Carpathians, showed maximum intraspecific divergence of 18.4% (Harl et al. 2014). This may be one of the largest intraspecific distance measured for mitochondrial genes among pulmonate gastropods (Harl et al. 2014) apart from up to 18.9% mean distance between the clades of Trochulus hispidus complex (Hygromiidae) (Kruckenhauser et al. 2014). High genetic distances, reaching more than 12.5%, were also previously detected in Arianta arbustorum (L., 1758) (Helicidae) from the Alps (Haase et al. 2013). Such high divergence within one species was usually explained as a confirmation of the presence of refugia in studied areas. Our results support the interpretation that gastropods might have survived in several geographically separated refuges (Haase et al. 2013; Harl et al. 2014).

Pinceel et al. (2005) also reported high levels of intraspecific mtDNA differences in terrestrial slug Arion subfuscus (Draparnaud, 1805) (Arionidae) from its native distribution range in Western Europe (Belgium, France, Germany, Ireland, The Netherlands, UK). Authors stated that two hypotheses proposed by Thomaz et al. (1996) may have a profound impact on the taxonomic interpretation of strongly divergent mtDNA groups within classical, well-known pulmonate species. Thomaz et al. (1996) proposed that (1) mtDNA evolution in pulmonates is exceptionally fast and (2) haplotypes differentiated in isolated refugies and subsequently came together, which may explain the extreme intraspecific mtDNA divergence among pulmonates.

Not only the Carpathians but also other mountain ranges like the Alps, the Pyrenees, or the Dinarides are chains that comprise glacial refugia in Europe. There are several cases of glacial refugia documented in the Alps and the Dinaric mountains. For instance, Carychium minimum Müller, 1774 and C. tridentatum (Risso, 1826) demonstrated a genetic structure composed of several different lineages of haplotypes, most likely resulting from allopatric differentiation in isolated shelters during the Pleistocene glacial periods (Weigand et al. 2012). Additionally, Alpine refuges were suggested for several other terrestrial gastropods, like helicid A. arbustorum (Haase and Bisenberger 2003; Gittenberger et al. 2004; Haase and Misof 2009; Haase et al. 2013) or hygromiids Trochulus oreinos (Wagner, 1914) (Duda et al. 2011) and T. villosus (Draparnaud, 1805) (Dépraz et al. 2008). In the Pyrenees, refuges were noted for the representative of Geomitridae Candidula coudensis (Holyoak and Holyoak 2010; Madeira et al. 2017).

Defining the species depends on the problem which is about to be solved. Several species concepts (e.g., biological, ecological, evolutionary, and phylogenetic) and their definitions were listed by Mayden (1997). Each of them can lead to different conclusions in the context of number and boundaries of species (De Queiroz 2007). Many species concepts have consequences in species delimitation due to the problem how to define a species (De Queiroz 2007). Application of integrative taxonomy methods that combine anatomical, conchological, and genetic data allow to identify species and to delineate them with much greater confidence (e.g., Carmona et al. 2011; Neusser et al. 2011; Gladstone et al. 2019).

F. faustina is conchologically very diverse and several morphological forms were distinguished (Subai and Neubert 2016). We analyzed widely distributed nominotypical faustina and Romanian associata forms. We demonstrated that based on the sequences of nuclear markers, which are the same for faustina and associata forms, they should be regarded as one species. Also, despite the large genetic variability in COI, our results of molecular species delimitation analyses suggest that both these forms constitute a single species, similar to previously recorded scenarios for montane snail species (e.g., Harl et al. 2014). In many molluscan species, remarkable morphological divergence is demonstrated as the result of interactions between environment/climate and phenotypic plasticity (e.g., Chiba 2004; Pascoal et al. 2012; Inoue et al. 2013; Proćków et al. 2017b, 2018). F. faustina may represent a similar case in which populations differ in shell morphology. One step further would be cross-breeding experiments on the mating behavior and reproductive success. This approach could help to establish species boundaries and the extent of reproductive isolation between different morphological forms of F. faustina (e.g., faustina and associata) and/or populations with high genetic divergence in the COI gene. Producing less fertile offspring with a reduced fitness could indicate the formation of initial reproductive barriers between these entities. Also geographical distribution and habitat preferences may have a considerable significance for defining species boundaries. For instance, a successful delimitation of cryptic taxa in sea slugs was demonstrated using geographical data combined with molecular markers and morpho-anatomy (Carmona et al. 2011). On the other hand, a unique combination of morphological and genetic studies with cross-breeding effects of different morphological species, as well as an assessment of the extent of the environmental impact on species and populations, revealed that the land snails Trochulus hispidus (Linnaeus, 1758) and T. sericeus (Draparnaud, 1801) are ecophenotypes and do not represent biological species (Proćków et al. 2017a; Proćków et al. 2018). There are no data on specific habitat preferences for distinct Faustina forms. This, however, does not exclude a hypothesis that different forms or haplogroups might differ in terms of habitat preferences. This requires further studies that could profitably focus on more extended sampling, number of populations/specimens, and different forms as well as molecular markers. All these data may be linked with the detailed information about the habitat in which a particular specimen/form/population was found.

Groenenberg et al. (2016) showed high sequence divergence within the Faustina genus. The authors discovered that Faustina is the most closely related to Kosicia and the split between them is estimated about 56.0–51.7 MYA. Additionally, they showed that the most recent common ancestor of particular forms of F. faustina, namely faustina, associata, and orba, which were studied therein, is approximately 13.4–11.3 MYA. These results are in accordance with our findings, as the split between faustina and associata forms was estimated at 12.41 MYA (Miocene).

Potential pleistocene refugia and post-glacial expansion

The results of the species distribution modeling confirm the possibility of occurrence of multiple refugia of the species in the Carpathians, revealed by our genetic data. Glacial in situ survival of strictly forest snails, including F. faustina, was suggested earlier based on fossil records in the Western Carpathians (Ložek 2006; Juřičková et al. 2014). The finding of these woodland species in the layers dated to LGM may indicate that cryptic glacial refugia of broadleaf forests existed in this region, on small islands in sheltered sites of southward orientation (Juřičková et al. 2014). The results of our modeling seem to partly support this conclusion, despite some differences between different GCM scenarios. Model projections for the LGM indicate that the Western Carpathians was an area with relatively low climatic suitability with a higher predicted probability of the presence of F. faustina only in a small, local patches (Fig. 6b–d).

Predictions from SDM analysis indicate the presence of long-term climatic stability areas, providing potential refugia of the species, also in the Eastern and Southern Carpathians as well as in the Apuseni Mts. Unfortunately, there are no fossil data from these parts of the Carpathians that can confirm. However, it is worth mentioning the presence of individuals determined as Chilostoma cf. faustinum in Quaternary sediments in NE Serbia, in the foothills of the Southern Carpathians (Mitrović and Jovanović 2000). The precise age of the studied sediments is unknown, although the authors suggest that sediments may have originated during the latest glaciation. In the Eastern Carpathians, the oldest layers containing F. faustina are known from its westernmost part, in Slovakia radiocarbon-dated to approx. 15,000 cal year BP (Juřičková et al. 2019a). Our models indicate the possibility of occurrence of the species during the LGM primarily in the more southern locations in the Romanian Eastern Carpathians, probably limited (as in the Western Carpathians) to small, isolated patches with higher climatic suitability. Indirectly, the possible presence of the glacial refugia of F. faustina in this region may be confirmed by pollen records which suggest that some woody cover was maintained in the Carpathians also during the LGM, between 26.5 and 19 ka cal BP (Feurdean et al. 2014). At the end of LGM, more compact temperate deciduous populations were confined to latitudes south of 46° N, i.e., in the Southern Carpathians, while in the rest of the area open coniferous forests, with small populations of temperate deciduous trees occurred (Feurdean et al. 2014). This is in agreement with our models, predicting that the Southern Carpathians were the most suitable region for F. faustina during the LGM according to all GCMs (Fig. 6b–d).

Our models predict a significant increase in climatic suitability during the Holocene in almost all Carpathian massifs as well as in areas more westward, in Moravia and the Eastern Sudetes. This change in environmental conditions has also been confirmed by fossil data from the Western Carpathians, where a number of forest snails sharply increased during the first half of this period and were predominating during the entire Holocene (Horsák et al. 2019). In addition, F. faustina dispersed westwards from the Carpathians to the Moravian Karst and the Bohemian Sudetes foothills during the Holocene (Juřičková et al. 2013, 2019b). Late-glacial or Early-Holocene records of this species from the Slovak Eastern Carpathians (Juřičková et al. 2019a) suggest that the West-Carpathian population may also have expanded its range towards the east. However, our models did not show a considerable increase in the environmental suitability in this region between LGM and Mid-Holocene. Nevertheless, identical haplotypes of mtDNA shared between populations of clade 2 from the Eastern Sudetes and the northern part of the Eastern Carpathians suggest a post-glacial colonization of these areas from a single refugium in the Western Carpathians. On the other hand, according to the results of our modeling, a glacial refugium of clade 1 may have also existed here, which seems to confirm the hypothesis about the presence of multiple island-like glacial refugia in the Western Carpathians. The presence of multiple isolated stable areas in the Eastern Carpathians, predicted by SDM, is also in line with genetic differentiation of the species revealed by genetic analyses of local populations.