Introduction

A general trend in latitudinal studies of genetic variation is the loss of genetic variability towards the poles (e.g. Adams and Hadly 2012; Miraldo et al. 2016; Smith et al. 2017). In the northern hemisphere, populations of many species become smaller and more fragmented with increasing latitude (e.g. Wan et al. 2018). This observation is explained by harsher environmental conditions and increasing seasonality with longer and harder winters towards the north (Vellend 2005; Vellend and Geber 2005). Furthermore, climatic conditions have oscillated during recent evolutionary times. During the Quaternary, large parts of the northern hemisphere were completely covered by glaciers and thus uninhabitable for any form of life (Hewitt 1996, 1999). Since the retreat of the glaciers after the last glacial maximum (LGM) approximately 20.000 years ago, northern species have expanded across the formerly uninhabitable areas. While doing so populations have lost variation during the recolonization process (Hewitt 2000, 2004).

This loss of genetic diversity towards the poles has consequences for the conservation of contemporary populations and species. Since adaptation to changing conditions requires genetic variation, small and fragmented populations are predicted to be more vulnerable to local extinction (McMahon et al. 2014). This prediction is borne out in comparative studies of threatened and non-threatened species (Spielman et al. 2004; Li et al. 2016).

Genetic diversity can be assayed in various ways (Holderegger et al. 2006; Rödin-Mörch et al. 2019). On the one hand, molecular surveys of natural populations may consider putatively neutral loci, i.e. loci evolving mainly through the effects of the stochastic process of genetic drift, such as the mitochondrial control region (d-loop) and nuclear short tandem repeats (referred to as microsatellites). On the other hand, direct assessments of the evolutionary potential of species should rely on loci also subjected to the effects of natural selection, such as the vertebrate major histocompatibility complex (MHC). The MHC encompasses a series of genes encoding cellular proteins involved in immune response, and features highly polymorphic loci, in particular the class II B genes, that serve to assist in the presentation of pathogens. As such, they may be recognized by T cells and trigger the production of antibodies (Sommer 2005; Piertney and Oliver 2006). With higher immunogenetic variation, populations and individuals are consequently better protected against diseases and parasites. For example, resistance to the fungal pathogen Batrachochytrium dendrobatidis has been found to be associated with amino acid residue positions that are located within MHC class II (Fu and Waldman 2017).

The European green toad (Bufotes viridis) is a widespread anuran amphibian distributed in central and Eastern Europe, reaching southern Sweden and Estonia in its northern fringe (Sillero et al. 2014). In Sweden the species has been listed as one of the most vulnerable amphibians in the country (https://artfakta.se/naturvard/taxon/100022, accessed 1/9 2020). The Swedish distribution is confined to the southernmost provinces, where only a few fragmented populations remain, geographically isolated from the European mainland. The potentially reduced gene pool of these northern isolates creates a significant problem for the conservation of B. viridis in Sweden, as lack of adaptive potential and lost parasite resistance could prevent survival and reproduction (Höglund 2009).

In the framework of contemporary conservation efforts, the aim of this study was to understand the geographic distribution of putatively neutral vs adaptive variation in B. viridis, notably to test whether the fragmented, isolated and small Sweden populations are genetically impoverished compared to the more abundant mainland populations, as expected under the biogeographic paradigms. To this aim, we specifically targeted putatively neutral markers (mitochondrial control region and nuclear microsatellites) and loci that show signals of being subjected to local environmental selection (MHC Class IIB loci).

Methods

Mitochondrial barcoding and microsatellite genotyping

One mitochondrial and seven microsatellite markers were genotyped from samples taken all over the range.

For mtDNA, we amplified ~ 900 bp of the hypervariable mitochondrial control region (CR, also named d-loop in previous work) in 340 new individuals (69 localities, Table S1). PCRs were carried out in 25 µL reactions containing 3 µL of template DNA, 12.5 µL of nanopure water, 7.5 µL of multiplex master mix (Qiagen, containing buffer, dNTPs and hot-start polymerase), 1 µL of primer CytbA-L (5′–GAATYGGRGGWCAACCAGTAGAAGACCC–3′; 10 µM), 1 µL of primer ControlB-H (5′–GTCCATTGGAGGTTAAGATC TACCA–3′; 10 µM). The following thermocycling conditions were applied: 95 °C for 15′, 35 cycles of 94 °C for 30″, 53 °C for 45″ and 72 °C for 1′; 72 °C for 5′. Amplicons were Sanger-sequenced in a single direction (CytbA-L) and sequences were assigned to one of the two main mitochondrial lineages found in B. viridis: the viridis lineage and the sitibundus lineage, the latter having spread in Europe through past mtDNA introgression by the Anatolian taxon B. sitibundus (Dufresnes et al. 2019, 2021). To visualize lineage distributions, our new data was combined to the mtDNA dataset recently compiled for all major Bufotes lineages by Dufresnes et al. (2019), and which encompasses 162 localities where the viridis and/or sitibundus mtDNA occur.

The seven microsatellite markers were D106, Bcalµ10, C203, C218, C205, C223, D105, all previously analyzed in southeastern Europe (Dufresnes et al. 2018), and genotyped here in a similar fashion. Briefly, this included amplification in multiplexed PCRs, genotyping on an ABI3130 genetic analyzer and allele scoring with Genemapper 4.0 (Applied Biosystems); details are available in the supplementary information of Dufresnes et al. (2018). A total of 336 individuals (65 localities, Table S1) were genotyped with the same panels, bins and operator (CD) as in Dufresnes et al. (2018). For downstream analyses, we could thus merge our new data with their genotype matrix of the SE-European populations of B. viridis (their loc. 20–106), for a total of 776 individuals (134 localities).

Population genetic analyses of mitochondrial and microsatellite data

Departure from neutrality, possible linkage among loci, genetic structure and diversity were inferred from the combined microsatellite data (7 loci, 776 individuals) by individual and population-based analyses. Neutrality was tested for by examining departures from Hardy–Weinberg assumptions and linkage was examined by the probability of random associations among alleles at the different loci using the software Fstat 2.9.3 (Goudet 1995). For genetic structure, we first performed a Principal Component Analysis (PCA) on individual allele frequencies with the R package adegenet (Jombart 2008). Second, we applied the Bayesian clustering algorithm of STRUCTURE (Pritchard et al. 2000) to assign individuals to genetic groups. The admixture model with correlated allele frequencies was used, with runs of 100,000 MCMC iterations after a burn in of 10,000. We tested K from 1 to 10 groups, each with ten replicates. Likelihood estimates between runs and between K models (Ln(P|K) and ΔK statistics) were summarized and visualized in STRUCTURE Harvester (Earl and von Holdt 2012), runs were combined with CLUMPP (Jakobsson and Rosenberg 2007), and graphical results were produced with DISTRUCT (Rosenberg 2004). Third, for localities with n ≥ 5, we computed population differentiation (pairwise Fst) and within-population diversity indices—observed (Ho) and expected heterozygosity (He)—using the R package hierfstat (Goudet 2005).

MHC Class II genotyping

A total of 430 samples were independently collected from Belarus, Bulgaria, Denmark, Germany, Greece, Latvia, Moldova, Poland, Romania, Russia, Sweden, and Ukraine (Table S2). Within Sweden, we obtained 272 samples. DNA was extracted with a modified high salt extraction process (Paxton et al. 1996). DNA quality and concentration were then measured for each sample using electrophoresis with 1.5% agarose and a Thermo Scientific Nanodrop 2000 spectrophotometer.

We used the forward primer 2F347F (GTGACCCTCTGCTCTCCATT) and reverse primer 2R307R (ATAATTCAGTATATACAGGGTCTCACC) to target and amplify sequences of 279-282 base pairs of the MHC Class II B gene as described in Zeisset and Beebee (2014). Samples were randomly placed into pool groups that determined the forward and reverse primer tag combinations which attached to the primers (e.g. NNN-ACTAGATC) for each sample, so that individual samples could be identified during analysis. The PCR mixtures were made using 13.9 µl double distilled water, 4 µl Phusion High Fidelity Buffer, 0.4 µl dNTPS, 0.5 µl each of the required forward and reverse primers, 0.2 µl Phusion High Fidelity DNA Taq polymerase, and 0.5 µl of the extracted DNA. The PCRs were run, and the results of each sample were measured on a 1.5% agarose gel after electrophoresis. We included 20% of replicated samples. Samples were grouped by the strength of the resulting bands, separating the pools into strong and weak clusters. Bands within the pre-pools were purified by using a Qiagen® MinElute gel extraction procedure (Qiagen, Sollentuna), following the manufacturer’s protocol. The pre-pools were measured for DNA concentration with the Qbit and consecutively pooled in equimolar amounts. Genetic libraries from every pre-pool (A, B, C and D) were prepared using ThruPLEX DNA-seq Kit (Rubicon Genomics) following the manufactures protocol. Sequencing was completed using Illumina MiSeq at the National Genomic Infrastructure, SciLifeLab in Uppsala.

MHC data analyses

For MHC variation, final number of individuals used for data analysis was 330, sourced from 12 countries where toads were sampled. Sequencing data were extracted from the raw data and combined into single forward reads using FLASH (Magoč and Salzberg 2011). Output files from FLASH were converted into fasta file format with the software NextGen Sequence Workbench (Heracle BioSoft). The pools were dimultiplexed and sequences were paired by the attached primer tags to the corresponding individuals through the jMHC software (Stuglik et al. 2011). The output files were aligned with the MUSCLE alignment package and primer tags were trimmed from the sequences for easier analysis (Edgar 2004). An Excel macro (Lighten et al. 2014) was used to estimate allele variants per individual based on Degree of Change (DOC) calculations. The DOC criterion calculates cumulative amplicon sequencing depth among the variants in the amplicon. Amplicons with total sequence reads ≤ 300 and with DOC ai counts ≥ 5 and higher were excluded for consecutively analyses for quality reasons. All valid allele sequences selected for further confirmation using similar criteria to Cortázar-Chinarro et al. (2017). The allele sequences were then aligned and analysed by eye in MEGA 7 to ensure no stop codons were present to note insertions and deletions (Kumar et al. 2018). Abnormal sequences were examined with a BLAST search and removed to filter out contamination (Altschul et al. 1990). The final valid allele aligned sequences were named using structure proposed by Klein (1975), e.g. BuvaDAB01. The sequences were then constructed into a phylogenetic tree with the neighbor-joining method (Saitou and Nei 1987). An unrooted phylogenetic network tree of the sequences was also created with the software SplitsTree4 (Huson and Bryant 2006). The dominant sequences for each sample were assigned based on locus and the DOC ai values. For each country, nucleotide diversity (π), observed (Ho) and expected (He) heterozygosity were computed with DnaSP v6 (Rozas et al. 2017) and Arlequin 3.5 (Excoffier and Lischer 2010), respectively. We calculated heterozygosities for MHC separately and for the putative loci combined. When comparing populations in different countries we used the combined estimate (see Rozsa et al. 2016).

To test for signals of balancing (positive) selection we calculated dN/dS for the entire amplified exon 2 sequences and also for the peptide binding region (putative antigen binding sites according to Bondinas et al. 2007, PBR) and non-PBR sites using the software MEGA X (Kumar et al. 2018).

Results

Distribution of mitochondrial lineages

Combining our new data (340 new individuals from 69 European localities) with the dataset of Dufresnes et al. (2019), we mapped the occurrence of the original (viridis) and foreign (sitibundus) mtDNA lineages across the range of the European taxon B. viridis (Fig. 1). These show some latitudinal associations, with viridis mtDNA spread on the western side of the distribution (from Albania – Hungary – Slovakia – Czechia – Germany), as well as in far-eastern Europe (Russia – E-Ukraine), while sitibundus mitochondria extend in between, from the Balkan Peninsula (Greece, Serbia, Romania) to the Baltic coast (Poland, Latvia, Denmark, Sweden).

Fig. 1
figure 1

Occurrence of the viridis (green dots) and sitibundus (blue dots) mitochondrial lineages in Central and Eastern Europe, and surrounding regions. The actual range of our target species B. viridis is shown in light green background (data from Dufresnes et al. 2019)

Genetic structure and diversity in the microsatellite dataset

The microsatellite data (776 individuals from 134 localities) showed some departures from Hardy–Weinberg expectations and linkage when analysed by country (Table S3). However, most of the significant departures and associations disappeared when analysed by population (Table S3). Moreover, inheritance analyses never mapped these loci on the same linkage groups (e.g. Betto-Colliard et al. 2015; Gerchen et al. 2018). We thus assume the microsatellite loci to be neutral and unlinked, and that instances of departures from neutrality and linkage disequilibrium can be ascribed to population structure within some countries (e.g. Poland).

Accordingly, in the PCA, the first component highlighted differentiation between southern and eastern European ranges compared to the northern ranges (Fig. 2a). The geographic pattern was confirmed by STRUCTURE analyses (Fig. 2b), where these populations are sorted under K = 2 (the preferred solution of the ΔK statistic, Fig. S1). Increasing the number of STRUCTURE groups highlighted some additional yet widely admixing clusters, in the northern ranges (K = 3) and the Balkans (K = 4). Interestingly, the isolated populations from W-Germany stand out as a separate cluster by K = 5 (which also shows a ΔK peak, Fig. S1). Further increase of K did not improve the likelihood nor the ΔK (Fig. S1), and did not reveal additional structure with a meaningful geographic association. These instances of genetic structure were confirmed by inspections of pairwise Fst (Fig. S2).

Fig. 2
figure 2

a PCA on individual allele frequencies for 776 microsatellite genotypes of B. viridis sampled across Europe. Colors distinguish the main regions associated with genetic structure:Balkans/E-Europe (green: Albania, Greece, N-Macedonia, Romania, Russia, Serbia, Ukraine, Belarus, Latvia), northern ranges spanning around the Baltic Sea (purple Denmark, N-Germany, Poland, Sweden) and the isolated population from W-Germany (orange). b Bayesian clustering analyses of the microsatellite genotypes into 2 to 5 groups, with the population average ancestries for 5 groups displayed on the map

Both within-population diversity indices, namely the observed and expected heterozygosity Ho and He, recovered similar patterns (Fig. 3a), i.e. higher estimates in southern populations compared to the rest of range, with the lowest values found around the Baltic Sea. Accordingly, there was a significant negative relationship between genetic variation and latitude (P < 0.001, df = 41; linear regressions, with R2 of 0.40 and 0.45 for Ho and He, respectively, Fig. 3b).

Fig. 3
figure 3

a Within-population genetic diversity, as estimated from the microsatellite dataset for localities of at least 5 genotyped toads. The variation in Ho and He estimates are illustrated by color gradients. b Relationships between within-population genetic diversity and latitude for these populations. Both are significant (see “Results”)

MHC Class II variability

Of the obtained sequences, 32 were 279 bp in length while 5 (BuvaDAB22–26) were 282 bp due to insertions/deletions identified after alignment (Fig. S3). Across the sequences, 82 sites were found to be polymorphic. All sequences were present in individuals collected from mainland Europe, while 14 of the sequences were not found in Swedish toads (Fig. 4a).

Fig. 4
figure 4

a neighbor-joining tree (numbers above branches indicate bootstrap support values, when above 50%) and b unrooted phylogenetic network of MHC Class II B sequences; sequences found in Swedish samples are indicated in bold, with asterisks; c geographic occurrence of these sequences (pie size is proportional to the number of samples)

We interpret the two main basal branches of the neighbor-joining tree and the corresponding sequences on the left of the network as belonging to two different loci: locus 1 containing sequences 1 through 21 and 27 through 32, and locus 2 containing sequences 22 through 26 (Fig. 4b). No further divisions of loci could be determined with certainty. Out of all individuals, 29 contain sequences from both loci while the remaining 311 individuals only expressed sequences contained within locus 1, indicating copy number variation. Ratios of dN/dS showed that the putative PBR harboured more non-synonymous substitutions than the non-PBR both when analysed overall and as separate loci (Table S4). This indicates that the amplified loci have been under positive adaptive selection.

Comparing diversity estimates among the Swedish and other European populations reveal low diversity in the former, especially with respect to Ho, which was lower than expected (Table 1; Fig. 5). Furthermore, there was a trend that within populations, MHC- allele sequence diversity tended to be dominated by a single allele sequence towards the northern limits of the distribution (Fig. 4), thus explaining the lower Ho.

Table 1 MHC sequence variation among all populations
Fig. 5
figure 5

Comparison of a observed (Ho) and b expected (He) heterozygosity indices from microsatellite (grey) and MHC Class II B sequences (white) between countries where both data were available for at least 5 toads

Discussion

The main purpose of this study was to investigate the genetic diversity of Bufotes viridis green toad populations throughout its vast range, with a particular focus on the isolated and red-listed Swedish populations. The species is one of only 11 anuran taxa found in the country, and like most of them, its occurrence is confined to the southernmost provinces (here Blekinge, Skåne and Öland). Considerable conservation effort has been directed towards protecting B. viridis during the last decades, including captive breeding and support releases (Wirén 2006). In 2015, it was listed as vulnerable on the national red list, after previously being classified as critically endangered (https://artfakta.se/naturvard/taxon/100022). Historically, the species has been reported from more than 50 localities, but at present, breeding is confirmed in only about ten of these (Kauri 1948; Andrén and Nilson 1993).

Moreover, it is worth mentioning that Swedish green toads have been affected by taxonomic and even nomenclatural changes in recent years (Dufresnes et al. 2019). Bufotes systematics are complicated by far reaching introgression of B. sitibundus (previously “B. variabilis”) mt-DNA into the range of B. viridis (Stöck et al. 2006; Dufresnes et al. 2019, 2021, this study). As a consequence, Swedish populations used to be called “B. variabilis” (as still listed on the Swedish red list, following Stöck 2005), even though they actually correspond to B. viridis, the only species present in continental Europe – expect in Ukraine-Russia (B. sitibundus), Italy and Spain (B. balearicus) (Dufresnes et al. 2019). In addition, the interspecific status of these taxa remains controversial, and some authors rather consider them as subspecies (Speybroeck et al. 2020), in which case Swedish green toads corresponds to B. viridis viridis. Regardless of how the peripheral Swedish populations are called or classified, they are of great interest from a conservation genetic perspective, notably to compare their diversity with central populations of the range.

In this respect, we show that Swedish green toads harbor among the lowest levels of genetic variation documented in the species across Europe, both at putatively neutral microsatellites and MHC loci. This is also true for mtDNA, where a single haplotype of the B. sitibundus lineage (SIT12) has become almost fixed from Anatolia to northern ranges (Dufresnes et al. 2019). Note that the independent origin of the samples analyzed for each marker set prohibits direct statistical comparisons (see Methods). For MHC, our analyzes of dN/dS ratios at PBR and non-PBR sites suggest that the sequences analyzed are subjected to selection (see also e.g. Cortazar-Chinarro et al. 2017, for evidence of selection on MHC Class II in Swedish Rana arvalis). The same gradients of diversity seem to apply to several amphibian species, at least in anurans, where northern populations show lesser immunogenetic variation than southern populations, e.g. natterjack toad Epidalea calamita (Zeisset and Beebee 2014; Höglund et al. 2015), common frog Rana temporaria (Palo et al. 2004), moor frog R. arvalis (Cortazar-Chinarro et al. 2017) and common toad Bufo bufo (Meurling 2019). In fact, lower variability in northern ranges was also retrieved from neutral markers e.g. microsatellites in R. arvalis (Cortazar-Chinarro et al. 2017), and ddRAD-derived SNPs in R. arvalis (Rödin-Mörch et al. 2019) and B. bufo (Thörn et al. 2021).

These general latitudinal trends can be explained by two non-mutually exclusive hypotheses. First, the central-marginal hypothesis posits that peripheral populations harbor less variation than central core populations (Eckert et al. 2008). Second, populations at northern latitudes are predicted to be less variable because of the recolonization history since the last glaciation (Hewitt 1996, 1999). In essence, both hypotheses rest on the idea that populations at the range edge have been subjected to demographic histories signified by smaller and more fragmented populations, which would result into losses of genetic diversity through the increased effects of genetic drift (Kimura 1983). The patterns observed here for green toads are in line with the recolonization hypothesis, where the southern populations bear the highest levels of genetic diversity. Accordingly, previous phylogeographic studies inferred post-glacial expansions from southern refugia putatively located in the Balkan Peninsula (Stöck et al. 2006; Dufresnes et al. 2019). In addition, the presence of discrete genetic clusters in the northern ranges, corresponding to areas with the most impoverished diversity, could be another effect of drift, as genetic variants distinct from the core populations become fixed (e.g. a few MHC alleles are in much higher proportions in Sweden and Germany than elsewhere). However, the comparatively high genetic diversity in southern Europe may have been inflated by the diffusion of alleles from the closely related B. sitibundus. This taxon forms a wide hybrid zone with B. viridis across Western Anatolia, which could have been previously located in the Balkans (Dufresnes et al. 2021). The occurrence of foreign alleles through past introgression, as seen at least from mtDNA, would obviously obscure the comparisons.

Northern green toad populations are predicted to be more threatened and less adaptable than southern populations, as shown for most widespread European amphibians (Dufresnes and Perrin 2015). Given the fragmented distribution and low population sizes of Swedish green toads, it is questionable whether it is at all possible for these populations to adapt to local conditions, since the strength of genetic drift is inversely related to effective population size (Kimura 1983). The current captive breeding program has mainly selected individuals originating from a single locality in Sweden, and there is some evidence of adaptive variation in a few larval life history traits (Rogell et al. 2011). Should the breeding program extend to additional populations, it may be advisable to consider the origin of the sources. This is especially important since MHC-variation may be locally adapted to the parasite fauna (Savage and Zamudio 2011, 2016). Furthermore, the unusual pattern of mtDNA diversity, namely mitochondrial capture of many B. viridis populations, is worthy of consideration. Studies on insects have shown clear evidence of cytoplasmic-nuclear incompatibilities (e.g. Dowling et al. 2006, 2007). The far-reaching spread of mtDNA of B. sitibundus origin could thus affect the ability of B. viridis to persist under local conditions in northern Europe.

To conclude, our study illustrates the strong heterogeneity in the level of genetic diversity across the range of a widespread species that was strongly affected by past climate changes. Importantly, the loss of genetic variants associated with the higher drift experienced by populations that have experienced strong founder effects during post-glacial expansion from refugia and that remain small and fragmented because they occupy more challenging environments on the range periphery, seem to have lost both neutral (microsatellites) and adaptive (MHC) genetic diversity.